CeWL - Custom WordList Generator Tool for Password Cracking

CeWL is a ruby app which spiders a given url to a specified depth, optionally following external links, and returns a list of words which can then be used for password crackers such as John the Ripper.

CeWL also has an associated command line app, FAB (Files Already Bagged) which uses the same meta data extraction techniques to create author/creator lists from already downloaded.

Usage
cewl [OPTION] ... URL
--help, -h
Show help
--depth x, -d x
The depth to spider to, default 2
--min_word_length, -m
The minimum word length, this strips out all words under the specified length, default 3
--offsite, -o
By default, the spider will only visit the site specified. With this option it will also visit external sites
--write, -w file
Write the ouput to the file rather than to stdout
--ua, -u user-agent
Change the user agent
-v
Verbose, show debug and extra output
--no-words, -n
Don't output the wordlist
--meta, -a file
Include meta data, optional output file
--email, -e file
Include email addresses, optional output file
--meta_file file
Filename for metadata output
--email_file file
Filename for email output
--meta-temp-dir directory
The directory used used by exiftool when parsing files, the default is /tmp
--count, -c:
Show the count for each of the words found
--auth_type
Digest or basic
--auth_user
Authentication username
--auth_pass
Authentication password
--proxy_host
Proxy host
--proxy_port
Proxy port, default 8080
--proxy_username
Username for proxy, if required
--proxy_password
Password for proxy, if required
--verbose, -v
Verbose
URL
The site to spider.


Change Log
Keeping track of history.
  • Version 4.3 - Various spider bug fixes and the introduction of the sorting the results by count
  • Version 4.2 - Fixed the Spider gem by overriding the function, also handling #name links correctly
  • Version 4.1 - Small bug fixes and added new parameter to set filenames for email and metadata output
  • Version 4 - Runs with Ruby 1.9.x and grabs text out of alt and title tags
  • Version 3 - Now spiders pages referenced in JavaScript location commands
  • Version 2.2 - Data from email addresses and meta data can be written to their own files
  • Version 2.1 - Fixed a bug some people were having while using the email option
  • Version 2 - Added meta data support
  • Version 1 - released