Moloch In-Depth
I've been using AOL Moloch (yes, that's America Online; the developers work for AOL) for well over a month, and I have to say it's the best open source packet capture product I've ever used.
The first system I ever worked with was the now-famous SHADOW (Secondary Heuristics Analysis for Defensive Online Warfare), written by Stephen Northcutt many years ago and then developed and maintained by Bill Ralph of the Naval Surface Warfare Center until they retired it.
After that, George Bakos extended SHADOW into IDABench when he was with the Dartmouth ISTS (Institute for Security, Technology and Society), adding a framework of Perl scripts that allowed you to examine the packets and run them through ngrep, p0f, Ethereal (yes, it's been a while) and others.
IDABench was a great tool, but it wasn't scalable. The console server would pull each hour's worth of packets from each capture box, which was feasible when George wrote it but quickly became impractical as networks got faster and faster.
After IDABench, I tried OpenFPC. The system had great promise, but it was buggy and inconsistently maintained, and I eventually had to abandon it. After that, our shop used daemonlogger to capture packets for quite a while. I used bash loops to extract the packets, mergecap to write them all back to one capture file, and tools like ngrep, tcpdump and Wireshark to inspect them.
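For anyone who hasn't lived through that workflow, here is a rough sketch of what those loops amounted to, written in Python rather than bash. The function names, file names and output layout are made up for illustration; the only real pieces are tcpdump's -r/-w options (read a capture, write the packets matching a filter) and mergecap's -w option (write the merged result). The sketch only builds the command lines, it doesn't run them.

```python
from typing import List, Sequence


def build_extract_commands(pcaps: Sequence[str], bpf_filter: str, out_dir: str) -> List[List[str]]:
    """Build one tcpdump command per capture file.

    Each command reads a capture (-r), keeps only the packets that match
    the BPF filter, and writes them to a numbered part file (-w).
    """
    commands = []
    for index, pcap in enumerate(pcaps):
        part = f"{out_dir}/part-{index:04d}.pcap"  # hypothetical naming scheme
        commands.append(["tcpdump", "-n", "-r", pcap, "-w", part, bpf_filter])
    return commands


def build_merge_command(parts: Sequence[str], merged: str) -> List[str]:
    """Build the mergecap command that joins the part files into one capture."""
    return ["mergecap", "-w", merged, *parts]


if __name__ == "__main__":
    # Illustrative inputs only; these files and addresses are placeholders.
    extract = build_extract_commands(
        ["hour-00.pcap", "hour-01.pcap"], "host 192.0.2.10 and port 80", "/tmp/extract"
    )
    parts = [cmd[5] for cmd in extract]  # the -w argument of each tcpdump command
    for cmd in extract + [build_merge_command(parts, "/tmp/extract/merged.pcap")]:
        print(" ".join(cmd))
```

To actually run the commands you could hand each list to subprocess.run(cmd, check=True). Multiply that by every sensor and every hour you need to look at and you can see why this got old.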
Then my boss came to me one day with a system he had come across: Moloch. He did a single install on a VM as a test platform and we really liked what we saw, so he rebuilt a rather beefy, recently retired server for the Elasticsearch engine and installed the capture and viewer components on our existing packet capture boxes.
Moloch works in this way: Each capture box writes out the packets it sniffs to its own drive array, but the metadata about the packets is sent off to the Elasticsearch box and indexed. You can run a viewer, the web front end, off of each capture box or off of the Elasticsearch box; it doesn't matter. When you do a search on any Moloch machine, you're searching the indexes on the ES server. If you're searching on Box 1, you'll see the capture information from Box 1 as well as the metadata from all the other capture boxes. When you expand a record to look at the underlying packet data, that data is transferred to the box you're working on unless it was captured on that sensor.
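If it helps to see that split spelled out, here is a toy model of it in Python. To be clear, this is not Moloch code and it doesn't talk to Elasticsearch; the Sensor and CentralIndex classes and the capture and open_session functions are names I invented for illustration. It only shows the idea: packets stay on the sensor that captured them, metadata goes to one central index, and opening a record either reads local data or pulls it from the sensor that owns it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Sensor:
    """A capture box: raw packet data stays on its own storage."""
    name: str
    packets: Dict[str, bytes] = field(default_factory=dict)  # session id -> raw bytes


@dataclass
class CentralIndex:
    """Stand-in for the central search box: holds metadata only, never payloads."""
    records: List[dict] = field(default_factory=list)

    def add(self, record: dict) -> None:
        self.records.append(record)

    def search(self, **criteria) -> List[dict]:
        # Return every record whose fields equal all of the given criteria.
        return [r for r in self.records if all(r.get(k) == v for k, v in criteria.items())]


def capture(sensor: Sensor, index: CentralIndex, session_id: str,
            src: str, dst: str, payload: bytes) -> None:
    """Store the payload locally and send only metadata to the central index."""
    sensor.packets[session_id] = payload
    index.add({"session": session_id, "sensor": sensor.name,
               "src": src, "dst": dst, "bytes": len(payload)})


def open_session(viewer_on: str, sensors: Dict[str, Sensor], record: dict) -> Tuple[bytes, str]:
    """Expand a search hit: fetch the payload from whichever sensor captured it."""
    owner = sensors[record["sensor"]]
    payload = owner.packets[record["session"]]
    how = "local disk" if owner.name == viewer_on else f"transferred from {owner.name}"
    return payload, how


if __name__ == "__main__":
    index = CentralIndex()
    sensors = {"box1": Sensor("box1"), "box2": Sensor("box2")}
    capture(sensors["box1"], index, "s1", "10.0.0.1", "10.0.0.9", b"GET / HTTP/1.1")
    capture(sensors["box2"], index, "s2", "10.0.0.2", "10.0.0.9", b"EHLO example")
    # Searching from box1 returns metadata from both sensors.
    for hit in index.search(dst="10.0.0.9"):
        print(hit["session"], open_session("box1", sensors, hit)[1])
```

The point of the design is that the expensive part, the packets themselves, never moves unless someone actually opens a session.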
When you expand the packet you'll see a column with both sides of the conversation, color coded in red and blue, with the source's packets on the left and the destination's on the right. If you're familiar with Wireshark, it looks almost exactly like what you get when you right-click a packet and choose "Follow TCP Stream".
If the data is gzipped, there's a check box under the header data to uncompress and redisplay it. Another check box gives you the "Show Images and Files" option. Images are rendered in the flow of the conversation, under their corresponding packet data. Files are reconstructed and a link is provided for you to download them. For example, if you're looking at an email conversation and there is a PDF file attached to the email, you're provided a hyperlink to the file. You can also download a pcap file of the session or save the raw data to a file from the source, the destination or both.
Moloch can save you enormous amounts of time investigating alerts from your IDS/IPS. I run my IDS in one screen and Moloch in another, and as I go through my alerts, I search for packets and reconstruct the session. In this way I can quickly see what triggered the alert, beyond the one or two packets my IDS captured.
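As a sketch of that alert-to-search step, the snippet below turns the addresses and ports from an alert into a search string and a time window. Treat all of it as an assumption: the Alert class is hypothetical, the five-minute padding is arbitrary, and the field names (ip.src, port.src, ip.dst, port.dst) are how I'd expect the search expression to look, so check them against the field list in your own install before relying on them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Tuple


@dataclass(frozen=True)
class Alert:
    """Hypothetical minimal view of an IDS alert: when it fired and between whom."""
    timestamp: datetime
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def session_query(alert: Alert) -> str:
    """Build a search expression for the session behind the alert.

    The field names used here are assumptions, not verified syntax.
    """
    return (
        f"ip.src == {alert.src_ip} && port.src == {alert.src_port} && "
        f"ip.dst == {alert.dst_ip} && port.dst == {alert.dst_port}"
    )


def time_window(alert: Alert,
                before: timedelta = timedelta(minutes=5),
                after: timedelta = timedelta(minutes=5)) -> Tuple[datetime, datetime]:
    """Pad the alert time on both sides so the whole session falls in the search range."""
    return alert.timestamp - before, alert.timestamp + after


if __name__ == "__main__":
    # Placeholder alert using documentation address ranges.
    alert = Alert(datetime(2014, 1, 15, 9, 30), "192.0.2.10", 49152, "198.51.100.7", 80)
    start, stop = time_window(alert)
    print(session_query(alert))
    print(start.isoformat(), "to", stop.isoformat())
```

Padding the window matters because the alert fires on one or two packets somewhere in the middle of a session, and you want the packets before and after them too.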
Moloch is a great product. For the functionality it provides, the developers could easily have made it into a commercial product and charged copious amounts of money for it. If you're an intrusion analyst or work on a network team and need near-instantaneous insight into your packet captures, this might be the product for you.