Machinae V1.4.8 - Security Intelligence Collector


Machinae is a tool for collecting intelligence from public sites/feeds about various security-related pieces of data: IP addresses, domain names, URLs, email addresses, file hashes, and SSL fingerprints. It was inspired by Automater, another excellent tool for collecting information. The Machinae project was born from wishing to improve Automater in four areas:
  1. Codebase - Bring Automater to python3 compatibility while making the code more pythonic
  2. Configuration - Use a more human readable configuration format (YAML)
  3. Inputs - Support JSON parsing out-of-the-box without the need to write regular expressions, but still support regex scraping when needed
  4. Outputs - Support additional output types, including JSON, while making extraneous output optional

Installation
Machinae can be installed using pip3:
pip3 install machinae
Or, if you're feeling adventurous, it can be installed directly from github:
pip3 install git+https://github.com/HurricaneLabs/machinae.git
You will need whatever dependencies are required on your system for compiling Python modules (on Debian based systems, python3-dev), as well as the libyaml development package (on Debian based systems, libyaml-dev).
You'll also want to grab the latest configuration file and place it in /etc/machinae.yml.

Configuration File
Machinae supports a simple configuration merging system that lets you make adjustments to the configuration without modifying the machinae.yml we provide you, making configuration updates a snap. This is done by finding a system-wide default configuration (default /etc/machinae.yml), merging into that a system-wide local configuration (/etc/machinae.local.yml) and finally a per-user local configuration (~/.machinae.yml). The system-wide configuration can also be located in the current working directory, can be set using the MACHINAE_CONFIG environment variable, or of course by using the -c or --config command line options. Configuration merging can be disabled by passing the --nomerge option, which will cause Machinae to load only the default system-wide configuration (or the one passed on the command line).
As an example of this, say you'd like to enable the Fortinet Category site, which is disabled by default. You could modify /etc/machinae.yml, but these changes would be overwritten by an update. Instead, you can put the following in either /etc/machinae.local.yml or ~/.machinae.yml:
fortinet_classify:
  default: true
Or, conversely, to disable a site, such as Virus Total pDNS:
vt_ip:
  default: false
vt_domain:
  default: false
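The layering described above can be pictured as a recursive dictionary merge. The following is a minimal sketch, not Machinae's actual implementation: the function name merge_config and the sample dictionaries are hypothetical stand-ins for the parsed contents of the three YAML files.

```python
from copy import deepcopy


def merge_config(base, override):
    """Return a new dict with `override` merged recursively into `base`."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars, lists, and new keys simply replace the old value.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical stand-ins for the parsed YAML of each configuration layer.
system_default = {
    "fortinet_classify": {"name": "Fortinet Category", "default": False},
    "vt_ip": {"name": "VirusTotal pDNS", "default": True},
}
system_local = {"fortinet_classify": {"default": True}}
user_local = {"vt_ip": {"default": False}}

# Later layers win: system default, then system local, then per-user local.
config = system_default
for layer in (system_local, user_local):
    config = merge_config(config, layer)

print(config["fortinet_classify"])
print(config["vt_ip"])
```

Under this assumption, a local file only needs to state the keys it changes; everything else is inherited from the system-wide default.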

Usage
Machinae usage is very similar to Automater:
usage: machinae [-h] [-c CONFIG] [--nomerge] [-d DELAY] [-f FILE] [-i INFILE] [-v]
                [-o {D,J,N,S}] [-O {ipv4,ipv6,fqdn,email,sslfp,hash,url}] [-q]
                [-s SITES] [-a AUTH] [-H HTTP_PROXY]
                [--dump-config | --detect-otype]
                ...
  • See above for details on the -c/--config and --nomerge options.
  • Machinae supports a -d/--delay option, like Automater. However, Machinae uses 0 by default.
  • Machinae output is controlled by two arguments:
    • -o controls the output format, and can be followed by a single character to indicate the desired type of output:
      • N is the default output ("Normal")
      • D is the default output, but dot characters are replaced
      • J is JSON output
      • S is short output (yes/no/error for each site)
    • -f/--file specifies the file where output should be written. The default is "-" for stdout.
  • Machinae will attempt to auto-detect the type of target passed in (Machinae refers to targets as "observables" and the type as "otype"). This detection can be overridden with the -O/--otype option. The choices are listed in the usage.
  • By default, Machinae operates in verbose mode. In this mode, it will output status information about the services it is querying on the console as they are queried. This output will always be written to stdout, regardless of the output setting. To disable verbose mode, use -q.
  • By default, Machinae will run through all services in the configuration that apply to each target's otype and are not marked as "default: false". To modify this behavior, you can:
    • Pass a comma separated list of sites to run (use the top level key from the configuration).
    • Pass the special keyword all to run through all services including those marked as "default: false"
    Note that in both cases, otype validation is still applied.
  • Machinae supports passing an HTTP proxy on the command line using the -H/--http-proxy argument. If no proxy is specified, machinae will search the standard HTTP_PROXY and HTTPS_PROXY environment variables, as well as the less standard http_proxy and https_proxy environment variables.
  • Lastly, a list of targets should be passed. All arguments other than the options listed above will be interpreted as targets.
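To make the otype auto-detection mentioned above more concrete, here is a rough sketch of how a target string could be classified into the otype names listed in the usage. This is an illustration only, not Machinae's own detection logic: the function guess_otype, the regular expressions, and the assumption that an SSL fingerprint is a colon-separated SHA-1 digest are all hypothetical.

```python
import ipaddress
import re

# Hex digest lengths for MD5, SHA-1, and SHA-256 (assumed hash types).
HASH_LENGTHS = {32, 40, 64}
HEX_RE = re.compile(r"^[0-9a-f]+$", re.I)
# Assumption: SSL fingerprints are 20 colon-separated hex bytes (SHA-1).
SSLFP_RE = re.compile(r"^([0-9a-f]{2}:){19}[0-9a-f]{2}$", re.I)
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
FQDN_RE = re.compile(
    r"^(?=.{1,253}$)([a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$", re.I
)


def guess_otype(target):
    """Return a best-guess otype name for `target`, or None if unknown."""
    try:
        version = ipaddress.ip_address(target).version
        return "ipv4" if version == 4 else "ipv6"
    except ValueError:
        pass  # Not an IP address; keep trying the other types.
    if target.lower().startswith(("http://", "https://")):
        return "url"
    if EMAIL_RE.match(target):
        return "email"
    if SSLFP_RE.match(target):
        return "sslfp"
    if HEX_RE.match(target) and len(target) in HASH_LENGTHS:
        return "hash"
    if FQDN_RE.match(target):
        return "fqdn"
    return None


for sample in ("8.8.8.8", "2001:db8::1", "example.com", "user@example.com"):
    print(sample, guess_otype(sample))
```

When a guess like this would be wrong or ambiguous for a given target, the -O/--otype option is the way to state the type explicitly.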

Out-of-the-Box Data Sources
Machinae comes with out-of-the-box support for the following data sources:
  • IPVoid
  • URLVoid
  • URL Unshortener (http://www.toolsvoid.com/unshorten-url)
  • Malc0de
  • SANS
  • FreeGeoIP (freegeoip.io)
  • Fortinet Category
  • VirusTotal pDNS (via web scrape - commented out)
  • VirusTotal pDNS (via JSON API)
  • VirusTotal URL Report (via JSON API)
  • VirusTotal File Report (via JSON API)
  • Reputation Authority
  • ThreatExpert
  • VxVault
  • ProjectHoneypot
  • McAfee Threat Intelligence
  • StopForumSpam
  • Cymru MHR
  • ICSI Certificate Notary
  • TotalHash (disabled by default)
  • DomainTools Parsed Whois (Requires API key)
  • DomainTools Reverse Whois (Requires API key)
  • DomainTools Reputation
  • IP WHOIS (Using RIR REST interfaces)
  • Hacked IP
  • Metadefender Cloud (Requires API key)
  • GreyNoise (Requires API key)
  • IBM XForce (Requires API key)
With additional data sources on the way.

HTTP Basic Authentication and Configuration
Machinae supports HTTP Basic Auth for sites that require it through the --auth/-a flag. You will need to create a YAML file with your credentials, which will include a key for the site that requires the credentials and a list of two items: username and password or API key. For example, for the included PassiveTotal site this might look like:
passivetotal: ['myemail@example.com', 'my_api_key']
Inside the site configuration under request you will see a key such as:
json:
  request:
    url: '...'
    auth: passivetotal
The auth: passivetotal points to the key inside the authentication config passed via the command line.
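As an illustration of how the auth key ties the two files together, the sketch below looks up the credentials named by a site's request block and builds a standard HTTP Basic Authorization header from them. The function basic_auth_header and the in-memory dictionaries are hypothetical; they stand in for the parsed YAML and are not Machinae's internal API.

```python
import base64


def basic_auth_header(site_request, credentials):
    """Build an HTTP Basic auth header for a site's request block, if any."""
    auth_key = site_request.get("auth")
    if auth_key is None:
        return {}  # This site does not ask for credentials.
    username, password = credentials[auth_key]
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Stand-in for the credentials file passed with -a/--auth.
credentials = {"passivetotal": ["myemail@example.com", "my_api_key"]}
# Stand-in for the site's request block; the URL is elided as in the text.
site_request = {"url": "...", "auth": "passivetotal"}

headers = basic_auth_header(site_request, credentials)
print(headers)
```

Keeping credentials in a separate file, referenced only by key, means the shared site configuration never has to contain secrets.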

Disabled by default
The following sites are disabled by default:
  • Fortinet Category (fortinet_classify)
  • Telize Geo IP (telize)
  • TotalHash (totalhash_ip)
  • DomainTools Parsed Whois (domaintools_parsed_whois)
  • DomainTools Reverse Whois (domaintools_reverse_whois)
  • DomainTools Reputation (domaintools_reputation)
  • PassiveTotal Passive DNS (passivetotal_pdns)
  • PassiveTotal Whois (passivetotal_whois)
  • PassiveTotal SSL Certificate History (passivetotal_sslcert)
  • PassiveTotal Host Attribute Components (passivetotal_components)
  • PassiveTotal Host Attribute Trackers (passivetotal_trackers)
  • MaxMind GeoIP2 Passive Insight (maxmind)
  • FraudGuard (fraudguard)
  • Shodan (shodan)
  • Hacked IP
  • Metadefender Cloud (Requires API key)
  • GreyNoise (Requires API key)
  • IBM XForce (Requires API key)

Output Formats
Machinae comes with a limited set of output formats: normal, normal with dot escaping, and JSON. We plan to add additional output formats in the future.
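For readers unfamiliar with dot escaping, the idea is to rewrite the dots in observables so that addresses and domains are not accidentally clickable or resolvable when pasted elsewhere. The sketch below assumes "[.]" as the replacement; that choice and the function escape_dots are illustrative and may not match what Machinae itself emits.

```python
def escape_dots(text, replacement="[.]"):
    """Replace every dot in `text` so the value is no longer a live link."""
    return text.replace(".", replacement)


print(escape_dots("8.8.8.8"))
print(escape_dots("example.com"))
```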

Adding additional sites
*** COMING SOON ***

Known Issues
  • Some ISPs on IPvoid contain double-encoded HTML entities, which are not double-decoded

Upcoming Features
  • Add IDS rule search functionality (VRT/ET)
  • Add "More info" link for sites
  • Add "dedup" option to parser settings
  • Add option for per-otype request settings
  • Add custom per-site output for error codes

Version History

Version 1.4.1 (2018-08-31)
  • New Features
    • Automatically Defangs output
    • MISP Support (example added to machinae.yml)

Version 1.4.0 (2016-04-20)
  • New features
    • "-a"/"--auth" option for passing an auth config file
      • Thanks johannestaas for the submission
    • "-H"/"--http-proxy" option, and environment support, for HTTP proxies
  • New sites
    • Passivetotal (various forms, thanks johannestaas)
    • MaxMind
    • FraudGuard
    • Shodan
  • Updated sites
    • FreeGeoIP (replaced freegeoip.net with freegeoip.io)

Version 1.3.4 (2016-04-01)
  • Bug fixes
    • Convert exceptions to str when outputting to JSON
      • Should really close #14

Version 1.3.3 (2016-03-28)
  • Bug fixes
    • Correctly handle error results when outputting to JSON
      • Closes #14
      • Thanks Den1al for the bug report

Version 1.3.2 (2016-03-10)
  • New features
    • "Short" output mode - just output yes/no/error for each site
    • "-i"/"--infile" option for passing a file with a list of targets

Version 1.3.1 (2016-03-08)
  • New features
    • Prepend "http://" to URL targets when not starting with http:// or https://

Version 1.3.0 (2016-03-07)
  • New sites
    • Cymon.io - Threat intel aggregator/tracker by eSentire
  • New features
    • Support simple paginated responses
    • Support url encoding 'target' in request URL
    • Support url decoding values in results

Version 1.2.0 (2016-02-16)
  • New features
    • Support for sites returning multiple JSON documents
    • Ability to specify time format for relative time parameters
    • Ability to parse Unix timestamps in results and display in ISO-8601 format
    • Ability to specify status codes to ignore per-API
  • New sites
    • DNSDB - FarSight Security Passive DNS Database (premium)

Version 1.1.2 (2015-11-26)
  • New sites
    • Telize (premium) - GeoIP site (premium)
    • Freegeoip - GeoIP site (free)
    • CIF - CIFv2 API support, from csirtgadgets.org
  • New features
    • Ability to specify labels for single-line multimatch JSON outputs
    • Ability to specify relative time parameters using relatime library

Version 1.0.1 (2015-10-13)
  • Fixed a false-positive bug with Spamhaus (Github#10)

Version 1.0.0 (2015-07-02)
  • Initial release