Basic Skillz, pt II

Following my initial post on this topic, and to dove-tail off of Brett's recent post, I wanted to provide something of a consolidated view based on the comments received.

For the most part, I think Brett's first comment was very much on point:

Should be easy enough to determine what would constitute basic skills, starting with collecting the common skills needed across every specialty (the 'basic things'). Things like, seizing evidence, imaging, hashing, etc..

Okay, so that's a really good start.  Figure out what is common across all specialties, and come up with a core set of skills that are independent of OS, platform, etc., in order to determine what constitutes a "Basic DF Practitioner". These skills will need to be able to be tested and verified; some will likely be "you took a test and achieved a score", while other skills be pass/fail, or verification of the fact that you were able to demonstrate the skill to some degree.  Yes, this will be more subjective that a written test, but there are some skills (often referred to as "soft skills") that while important, one may not be able to put their finger on to the point of having a written test to verify that skill.

Brigs had some great thoughts as far as a break down of skill sets goes, although when I read his comment, I have to admit that in my head, I read it in my Napolean Dynamite voice.  ;-)  Taking this a step further, however, I wanted to address @mattnotmax's comments, as I think they provide a really good means to walk through the thought process.

1. collect the evidence properly 

What constitutes "properly"?  The terms "forensics" and "evidence" bring a legal perspective to the forefront in discussions on this topic, and while I fully believe that there should be one standard to which we all strive to operate, the simple fact is that business processes and requirements very often prevent us from relying on one single standard.  While it would be great to be able to cleanly shut a system down and extract the hard drive(s) for acquisition, there are plenty of times we cannot do so. I've seen systems with RAID configurations shut down and the individual drives acquired, but the order of the drives and the RAID configuration itself was never documented; as such, we had all those disk images that were useless.  On the other hand, I've acquired images from live systems with USB 1.0 connections by mapping a drive (an ext HDD) to another system on the network that had USB 2.0 connections.

I think we can all agree that we won't always have the perfect, isolated, "clean room-ish" setting for acquiring data or 'evidence'.  Yes, it would be nice to have hard drives removed from systems, and be able to have one verified/validated method for imaging that data, but that's not always going to be the case.

Live, bare-metal systems do not have a "hard drive" for memory, and memory acquisition inherently requires the addition of software, which modifies the contents of memory itself.

I have never done mobile forensics but I'm sure that there are instances, or even just specific handsets, where an analyst cannot simply shut the handset down and acquire a complete image of the device.

I would suggest that rather than simply "collect the evidence properly", we lean toward understanding how evidence can be collected (that one size does not fit all), and that the collection process must be thoroughly documented.

2. image the hard drive 

Great point with respect to collection...but what if "the hard drive" isn't the issue?  What if it's memory?  Or a SIM card?  See my thoughts on #1.

3. verify the tool that did the imaging, and then verify the image taken 

I get that the point here is the integrity of the imaging process itself, as well as maintaining and verifying the integrity of the acquired image.  However, if your only option for collecting data is to acquire it from a live system, and you cannot acquire a complete copy of the data, can we agree that what is important here is (a) documentation, and (b) understanding image integrity as it applies to the process being used (and documented)?

For items 1 thru 3, can we combine them into understanding how evidence or data can be collected, techniques for doing so, and that all processes must be thoroughly documented?

4. know what sort of analysis is required even if they don't know how to do it (i.e. can form a hypothesis) 

Knowing what sort of analysis is required is predicated by understanding the goals of the acquisition and analysis process.  What you are attempting to achieve predicates and informs your acquisition process (i.e., what data/evidence will you seek to acquire)

5. document all their process, analysis and findings 

Documentation is the key to all of this, and as such, I am of the opinion that it needs to be addressed very early in the process, as well as throughout the process.

6. can write a report and communicate that report to a technical and non-technical audience.

If you've followed the #DFIR industry for any period of time, you'll see that there are varying opinions as to how reporting should be done.  I've included my thoughts as to report writing both here in this blog, as well as in one of my books (i.e., ch 9 of WFA 4/e).  While the concepts and techniques for writing DFIR reports may remain fairly consistent across the industry, I know that a lot of folks have asked for templates, and those may vary based on personal preference, etc.

All of that being said, I'm in agreement with Brett, with respect to determining a basic skill set that can be used to identify a "Basic DF Practitioner".  From there, one would branch off to different specialties (OS- or platform-specific), likely with different levels (i.e., MacOSX practitioner level 1, MacOSX analyst level 1, etc.)

As such, my thoughts on identifying and developing basic skills in practitioners include:

1. Basic Concepts

Some of the basic concepts for the industry (IMHO) include documentation, writing from an analytic standpoint (exercises), reviewing other's work and having your work reviewed, etc.

For a training/educational program, I'd highly recommend exercises that follow a building block approach.  For example, start by having students document something that they did over the weekend; say, attending an event or going to a restaurant or movie.  Have them document what they did, then share it, giving them the opportunity to begin speaking in public.  Then have them trade their documentation with someone else in the class, and have that person attempt to complete the same task, based on the documentation.  Then, that person reviews the "work product", providing feedback.

Another approach is to give the students a goal, or set of goals, and have them develop a plan for achieving the goals.  Have them implement the plan, or trade plans such that someone else has to implement the plan.  Then conduct a "lessons learned" review; what went well, what could have gone better, and what did we learn from this that we can use in the future?

This is where the building blocks start.  From here, provide reading materials for with the students provide reviews, and instead of having the instructor/teacher read them all, have the students share the reviews with other students.  This may be a good way to begin building the necessary foundation for the industry.

2. Understanding File Systems and Structures

This area is intended to develop an understanding of how data is maintained on storage systems, and is intended to cover the most common formats, from a high level.  For example (and this is just an example):

MacOSX - HPFS, HFS+, file structures such as plists
Linux - ext3/4
Windows - NTFS, perhaps some basic file structures (OLE, Registry)

Depending on the amount of information and the depth into which the instructor/teacher can go, the above list might be trimmed down, or include Android, network packets, common database formats (i.e., SQLite), etc.

Students can then get much more technically in-depth as they progress into their areas of specialization, or into a further level as "practitioner", before they specialize.

Just a note on "specialization" - this doesn't mean that anyone is pigeon-holed into one area; rather, it refers to the training.  This means that skill sets are identified, training is provided, and skills are achieved and measured such that they can be documented.  In this way, someone that achieves "MacOSX analyst level 2" is known to have completed training and passed testing for a specific set of skills that they can then demonstrate.  The same would true with other specialized areas.

3. Data Acquisition and Integrity

The next phase might be one in which basic techniques for data acquisition are understood.  I can see this as being a fantastic area for "fam fires"; that is, opportunities for the students to get hands-on time with various techniques.  Some of these, such as using write blockers, etc., should be done in the classroom, particularly at the early stages.

In this class, you could also get into memory acquisition techniques, with homework assignments to collect memory from systems using various techniques, documenting the entire process.  Then students will provide their "reports" to other students to review.  This provides other opportunities for evaluation, as well; for example, have a student with, say, a Mac system provide their documentation to another student with a Mac, and see if the process returns similar results.

We want to be sure that some other very important topics are not skipped, such a acquiring logs, network captures (full packet captures vs. netflow), etc.  Again, this should be a high-level understanding, with familiarization exercises, and full/complete documentation.

4. Techniques of Analysis

I think that beginning this topic as part of the basic skill set is not only important, but a good segue into areas of specialization. This is a great place to reiterate the foundational concepts; determine goals, develop a plan, document throughout, and conduct a review (i.e., "lessons learned").  With some basic labs and skills development exercises, an instructor can begin including things such as how those "lessons learned" might be implemented.  For example, a Yara rule, or a grep statement for parsing logs or packet captures.  But again, this is high-level, so detailed/expert knowledge of writing a Yara rule or grep expression isn't required; the fact that one can learn from experiences, and share that knowledge with others should be the point.

Again, this is mostly high-level, and a great way to maximize the time might be to have students get into groups and pick or be assigned a project.  The delivery of the project should include a presentation of the goals, conduct of the project, lessons learned, and a review from the other groups.

What needs to be common throughout the courses is the building block approach, with foundations being built upon and skills developed over time.

As far as skill development goes, somethings I've learned over time include:

We all learn different ways.  Some learn through auditory means, others visually, and others by doing.  Yes, at a young age, I sat in a classroom and heard how to put on MOPP NBC protective gear.  However, I really learned by going out to the field and doing it, and I learned even more about the equipment by having to move through thick bush, wearing all of equipment, in Quantico, in July.

I once worked for a CIO who said that our analysts needed to be able to pick up a basic skill through reading books, etc., as we just could not afford to send everyone to intro-level training for everything.  I thought that made perfect sense.  When I got to a larger team, there were analysts who came right out and said that they could not learn something new unless they were sitting in a classroom and someone was teaching it to them.  At first, I was aghast...but then I realized that what they were saying was that, during the normal work day, there were too many other things going on...booking travel, submitting expenses, performing analysis and report writing...such that they didn't feel that they had the time to learn anything.  Being in a room with an instructor took them out of the day-to-day chaos, allowed them to focus on that topic, to understand, and ask questions.  Well, that's the theory, anyway.  ;-)

We begin learning a new skill by developing a foundational understanding, and then practicing the skill based on repeating a "recipe".  Initial learning begins with imitation.  In this way, we learn to follow a process, and as our understanding develops, we begin to move into asking questions.  This helps us develop a further understanding of the process, from which we can then begin making decisions what new situations arise.  However, developing new skills doesn't mean we relinquish old ones, so when a new situation arises, we still have to document our justification for deviation from the process.

Addendum
Some additional thoughts that I had after clicking "publish"...

First, the above "courses" could be part of an overall curriculum, and include other courses, such as programming, etc.

Second, something else that needs to be considered from the very beginning of the program is specificity of language.  Things are called specific names, and this provides as means by which we can clearly communicate with other analysts, as well as non-technical people.  For example, I've read malware write-ups from vendors, including MS, that state that malware will create a Registry "entry"; well, what kind of entry?  A key or a value?  Some folks I've worked with in the past have told me that I'm pedantic for saying this, but it makes a difference; a key is not a value, nor vice versa.  They each have different structures and properties, and as such, should be referred to as what they are, correctly.

Third, to Brett's point, vendor-specific training has its place, but should not be considered foundational. In 1999, I attended EnCase v3 Intro training; during the course, I was the only person in the room who did not have a gun and a badge.  The course was both taught and attended by sworn law enforcement officers.  At one point during the training, the instructor briefly mentioned MD5 hashes, and then proceeded on with the material.  I asked if he could go back and say a few words about what a hash was and why it was important, and in response, he offered me the honor and opportunity of doing so.  My point is the same as Brett's...it's not incumbent upon a vendor to provide foundational training, but that training (and the subsequent knowledge and skills) is, indeed, foundational (or should be) to the industry.

Here is a DFRWS paper that describes a cyber forensics ontology; this is worth consideration when discussing this topic.