Web Spidering Framework - Malspider

Malspider is a web spidering framework that inspects websites for characteristics of compromise. Malspider has three purposes:
  • Website Integrity Monitoring: monitor your organization’s website (or your personal website) for potentially malicious changes.
  • Generate Threat Intelligence: keep an eye on previously compromised sites, currently compromised sites, or sites that may be targeted by various threat actors.
  • Validate Web Compromises: Is this website still compromised?

What can Malspider detect?

Malspider has built-in detection for characteristics of compromise such as hidden iframes, reconnaissance frameworks, VBScript injection, and email address disclosure.
As we find new indicators, we will continue to add classifications to this tool, and we hope you will do the same. Malspider will be a much better tool if CIRT teams and security practitioners around the world contribute to the project.
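To make one of these characteristics concrete, here is a minimal sketch of how a hidden iframe might be flagged. It is not Malspider's actual classifier: the heuristics (zero or one pixel dimensions, a style that hides the element) and the names HiddenIframeFinder and find_hidden_iframes are assumptions chosen for illustration.

```python
import re

try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2.7

# Hypothetical heuristics for illustration; not Malspider's real rules.
HIDDEN_STYLE = re.compile(r"display\s*:\s*none|visibility\s*:\s*hidden", re.I)
TINY_SIZES = {"0", "1", "0px", "1px"}


class HiddenIframeFinder(HTMLParser):
    """Collects the src of iframes that look deliberately invisible."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.suspicious = []

    def handle_starttag(self, tag, attrs):
        if tag != "iframe":
            return
        attr = dict(attrs)
        width = (attr.get("width") or "").strip().lower()
        height = (attr.get("height") or "").strip().lower()
        style = attr.get("style") or ""
        tiny = width in TINY_SIZES or height in TINY_SIZES
        hidden = bool(HIDDEN_STYLE.search(style))
        if tiny or hidden:
            self.suspicious.append(attr.get("src") or "")


def find_hidden_iframes(html):
    """Return the src values of iframes that match the heuristics above."""
    finder = HiddenIframeFinder()
    finder.feed(html)
    finder.close()
    return finder.suspicious
```

A heuristic this simple will produce false positives (some legitimate widgets use invisible iframes), which is one reason the dashboard's false positive tuning matters.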


Prerequisites

Please make sure these technologies are installed before continuing:
  • Python 2.7.6
  • Updated version of pip
  • MySQL
Note: If your server already has specific versions of these components installed, you can use a virtualenv to create an isolated Python environment.
Tested and working on minimal installations of:
  • Ubuntu 14
  • CentOS 6
  • CentOS 7

Installation

Start the installation process by running “./quick_install” from the command line. Please read the prompts carefully!
Malspider comes with a quick_install script found in the root directory. This script attempts to make the installation process as painless as possible by completing the following steps:
  1. Install Database: creates a database titled ‘malspider’, creates a new MySQL user, and applies the database schema.
  2. Install Dependencies: installs ALL dependencies and modules required by Malspider.
  3. Django Migrations: applies Django migrations to the database (necessary for the web app).
  4. Create Web Admin User: creates an administrative user for the web application.
  5. Add Access Control: creates iptables rules to block port 6802 (used by the daemon) and open port 8080 (web app).
  6. Add Cronjobs: creates crontab entries to schedule jobs, analyze data, and purge the database after a period of time.
Note: The quick_install script uses scripts found under the install/ directory. If any of the above steps fail you can attempt to complete them manually using those scripts.

Start

Start Malspider by running “./quick_start” from the command line. Malspider comes with a quick_start script found in the root directory. This script attempts to start the daemon and the web application. Once it is running, Malspider can be accessed from your browser on port 8080 at http://0.0.0.0:8080.
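If the dashboard does not load, a quick way to see whether the web application is listening is to attempt a TCP connection to port 8080. The sketch below uses only the Python standard library; the helper name is_port_open is hypothetical and not part of Malspider, and 127.0.0.1 assumes the check runs on the Malspider server itself.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Connection refused, filtered, or timed out.
        return False
    conn.close()
    return True
```

For example, is_port_open("127.0.0.1", 8080) should return True once quick_start has brought up the web app. Running the same check against port 6802 from another machine would be expected to return False if the iptables rule created during installation is in place.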
Interaction with Malspider happens via an easy-to-use dashboard accessible through your web browser. The dashboard enables you to view alerts, inspect injected code, add websites to monitor, and tune false positives. You can add the websites you want to crawl by navigating to the administrative panel at http://0.0.0.0:8080/admin (or by clicking on the admin link from the dashboard). Click on “Organizations” and add a new organization. You’ll be prompted for the:
  • website name (e.g. “Cisco Systems”)
  • domain (e.g. cisco.com)
  • industry/org category (e.g. Energy, Political, Education)
By default, Malspider crawls 20 pages per domain. This can be changed. You can crawl as many pages as you like (per domain) or you can crawl only the homepage of each site.

Malspider randomly selects a user agent string from a list found at malspider/resources/useragents.txt. If you would like to add more user agents to the list, simply edit that text file.
Malspider also has built-in capabilities for taking screenshots of every page it crawls. Screenshots can be useful in a variety of situations, but they can cause a drastic increase in disk space utilization. For that reason, screenshots are turned off by default. For the same reason, email address detection is also off by default.
Malspider crawls websites and stores information about those sites in a database. The data in the database is post-processed and analyzed for potentially malicious characteristics. You can view results from the analyzer by opening the dashboard and clicking on “View Alerts”.
Your database can grow large very quickly. It is recommended that, for performance reasons, you delete data from the ‘pages’ table and the ‘elements’ table once per month.
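The random user agent selection mentioned in this section can be illustrated with a short sketch. It assumes the list file holds one user agent string per line, which is an assumption about the format of useragents.txt, and the function names are hypothetical rather than taken from Malspider's code.

```python
import random


def parse_user_agents(lines):
    """Return stripped, non-empty lines (assumed: one user agent per line)."""
    return [line.strip() for line in lines if line.strip()]


def load_user_agents(path):
    """Read a user agent list file such as malspider/resources/useragents.txt."""
    with open(path) as handle:
        return parse_user_agents(handle)


def pick_user_agent(agents, rng=random):
    """Choose one user agent uniformly at random from the list."""
    return rng.choice(agents)
```

Passing a seeded random.Random instance as rng makes the choice repeatable, which can help when reproducing a crawl that triggered an alert.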