Fnord - Pattern Extractor for Obfuscated Code

Fnord is a pattern extractor for obfuscated code.

Fnord has two main functions:
  1. Extract byte sequences and create some statistics
  2. Use these statistics, combining length, number of occurrences, similarity and keywords, to create a YARA rule

1. Statistics
Fnord processes the file with a sliding window of varying size to extract all sequences from a minimum length -m X (default: 4) up to a maximum length -x X (default: 40). For each length, Fnord presents the most frequently occurring sequences -t X (default: 3) in a table.
Each line in the table contains:
  • Length
  • Number of occurrences
  • Sequence (string)
  • Formatted (ascii/wide/hex)
  • Hex encoded form
  • Entropy
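
The extraction step can be sketched roughly as follows. This is a minimal illustration and not Fnord's actual code: the function names `shannon_entropy` and `top_sequences` are hypothetical, entropy is assumed to be Shannon entropy in bits per byte, and the ascii/wide/hex formatting column is left out.

```python
import math
from collections import Counter


def shannon_entropy(seq: bytes) -> float:
    """Shannon entropy of a byte sequence in bits per byte (assumed metric)."""
    if not seq:
        return 0.0
    total = len(seq)
    counts = Counter(seq)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def top_sequences(data: bytes, min_len: int = 4, max_len: int = 40, top: int = 3):
    """Yield (length, count, sequence, hex, entropy) for the most
    frequent sequences of every length between min_len and max_len."""
    for length in range(min_len, max_len + 1):
        if length > len(data):
            break
        # Slide a window of the current size over the data, one byte at a time.
        windows = Counter(
            data[i:i + length] for i in range(len(data) - length + 1)
        )
        for seq, count in windows.most_common(top):
            yield length, count, seq, seq.hex(), shannon_entropy(seq)


if __name__ == "__main__":
    sample = b"ABCDABCDABCDxyz"
    for row in top_sequences(sample, min_len=4, max_len=6, top=1):
        print(row)
```

The defaults mirror the documented -m, -x and -t values. Counting every window of every length is quadratic in the worst case, which is acceptable for a carved-out section of a sample but slow on large files.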

2. YARA Rule Creation
Fnord also generates an experimental YARA rule. During YARA rule creation it calculates a score based on the length of the sequence and the number of occurrences (length * occurrences). It then processes each sequence by removing all non-letter characters and comparing the result with a list of keywords (case-insensitive) to detect sequences that are more interesting than others. Before writing each string to the rule, Fnord calculates a Levenshtein distance and skips sequences that are too similar to sequences that have already been integrated in the rule.
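
A rough sketch of this scoring and de-duplication logic is shown below. It is an illustration under assumptions, not Fnord's implementation: the keyword list, the `min_distance` threshold and all function names are hypothetical, and how the -s similarity value maps to an edit distance is not specified here. The count cap follows the description of the -c option.

```python
import re

KEYWORDS = ["eval", "char", "replace"]  # hypothetical keyword list


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def score(seq: str, count: int, keyword_multiplier: float = 2.0,
          count_limit: int = 20) -> float:
    """length * occurrences, with the count capped and a keyword bonus."""
    base = float(len(seq) * min(count, count_limit))
    # Strip all non-letter characters, then compare case-insensitively.
    letters_only = re.sub(r"[^A-Za-z]", "", seq).lower()
    if any(keyword in letters_only for keyword in KEYWORDS):
        base *= keyword_multiplier
    return base


def select_strings(candidates, max_strings: int = 20, min_distance: int = 3):
    """Pick the highest-scoring sequences, skipping near-duplicates.

    candidates is an iterable of (sequence, count) pairs.
    """
    selected = []
    ranked = sorted(candidates, key=lambda c: score(c[0], c[1]), reverse=True)
    for seq, _count in ranked:
        # Skip sequences too close to one that is already in the rule.
        if any(levenshtein(seq, kept) < min_distance for kept in selected):
            continue
        selected.append(seq)
        if len(selected) >= max_strings:
            break
    return selected
```

Removing non-letter characters before the keyword check is what lets a sequence such as `e.v.a.l(` still match the keyword `eval`.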

[Experimental] Fnord was created a few days ago and I have tested it with only a handful of samples. My guess is that I'll adjust the defaults in the coming weeks and add some more keywords, filters and scoring options.

Improve the Results
If you've found obfuscated code in a sample, use a hex editor to extract the obfuscated section of the sample and save it to a new file. Use that new file for the analysis.
Play with the flags -s, -k, -r, --yara-strings, -m and -e.
Please send me samples that produce weak YARA rules that could be better.
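
If the offsets of the obfuscated section are already known, the extraction can also be scripted instead of done by hand in a hex editor. The helper below is a hypothetical convenience, not part of Fnord; the name `carve` and the file names in the usage note are assumptions.

```python
from pathlib import Path


def carve(src: str, dst: str, start: int, end: int) -> int:
    """Copy bytes [start, end) of src into dst and return the number written."""
    data = Path(src).read_bytes()
    if not 0 <= start < end <= len(data):
        raise ValueError("offsets out of range")
    chunk = data[start:end]
    Path(dst).write_bytes(chunk)
    return len(chunk)
```

For example, `carve("sample.bin", "section.bin", 0x400, 0x1200)` writes the selected range to `section.bin`, which can then be analyzed with `python3 fnord.py -f ./section.bin`.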

         ____                 __
        / __/__  ___  _______/ /
       / _// _ \/ _ \/ __/ _  /
      /_/ /_//_/\___/_/  \_,_/ Pattern Extractor for Obfuscated Code
      v0.6, Florian Roth

    usage: fnord.py [-h] [-f file] [-m min] [-x max] [-t top] [-n min-occ]
                    [-e min-entropy] [--strings] [--include-padding] [--debug]
                    [--noyara] [-s similarity] [-k keywords-multiplier]
                    [-r structure-multiplier] [-c count-limiter] [--yara-exact]
                    [--yara-strings max] [--show-score] [--show-count]
                    [--author author]

    Fnord - Pattern Extractor for Obfuscated Code

    optional arguments:
      -h, --help            show this help message and exit
      -f file               File to process
      -m min                Minimum sequence length
      -x max                Maximum sequence length
      -t top                Number of items in the Top x list
      -n min-occ            Minimum number of occurrences to show
      -e min-entropy        Minimum entropy
      --strings             Show strings only
      --include-padding     Include 0x00 and 0x20 in the extracted strings
      --debug               Debug output

    YARA Rule Creation:
      --noyara              Do not generate an experimental YARA rule
      -s similarity         Allowed similarity (use values between 0.1=low and
                            10=high, default=1.5)
      -k keywords-multiplier
                            Keywords multiplier (multiplies score of sequences if
                            keyword is found) (best use values between 1 and 5,
                            default=2.0)
      -r structure-multiplier
                            Structure multiplier (multiplies score of sequences if
                            it is identified as code structure and not payload)
                            (best use values between 1 and 5, default=2.0)
      -c count-limiter      Count limiter (limits the impact of the count by
                            capping it at a certain amount) (best use values
                            between 5 and 100, default=20)
      --yara-exact          Add magic header and magic footer limitations to the
                            rule
      --yara-strings max    Maximum sequence length
      --show-score          Show score in comments of YARA rules
      --show-count          Show count in sample in comments of YARA rules
      --author author       YARA rule author

Getting Started
  1. git clone https://github.com/Neo23x0/Fnord.git and cd Fnord
  2. pip3 install -r ./requirements.txt
  3. python3 ./fnord.py --help

python3 fnord.py -f ./test/wraeop.sct --yara-strings 10
python3 fnord.py -f ./test/vbs.txt --show-score --show-count -t 1 -x 20
python3 fnord.py -f ./test/inv-obf.txt --show-score --show-count -t 1 --yara-strings 4 --yara-exact



Why didn't you integrate Fnord in yarGen?
yarGen uses a white-listing approach to filter the strings that are best for the creation of a YARA rule. yarGen applies some regular expressions to adjust the scores of strings before creating the YARA rules. But its approach is very different from the method used by Fnord, which calculates the score of the byte sequences based on statistics.
While yarGen is best used for un-obfuscated code, Fnord is for obfuscated code only and should produce much better results than yarGen on such code.

Follow on Twitter for updates @cyb3rops