
Installing bulk_extractor

Mark Richer edited this page Jul 25, 2014 · 100 revisions

Overview

bulk_extractor can be used on Windows, Linux, and Macintosh OS X platforms.

This page contains instructions for downloading, building and installing bulk_extractor on Linux and OS X, and for downloading and installing the bulk_extractor binary on Windows. For building a Windows binary, a Linux system must be used; see Cross-compiling for Windows.

For additional information on bulk_extractor, see the Forensics Wiki entry: http://www.forensicswiki.org/wiki/Bulk_extractor

Windows Users

  1. Download the latest bulk_extractor Windows installer from here.
  2. Install bulk_extractor by running the downloaded Windows installer.
Note: Temporarily turn off your virus checker if it refuses to download and/or install bulk_extractor.

Linux and OS X Users

Install the build environment, then download, build, and install the latest version of bulk_extractor from the command line.

Before compiling bulk_extractor for your platform, you may need to install packages that bulk_extractor requires to compile cleanly. See the instructions below for installing packages on specific Linux or OS X systems. Some general notes follow:

Either the TRE or the libgnurx (libgnurx-static) regular expression library is required. TRE is preferred because experiments indicate that it is about 10X faster.

To read E01 files, you must install the LIBEWF package for your system (see below).

To read AFF files, you must install the AFFLIB package for your system (see below).

If you want to build bulk_extractor from the current development tree, then also read the section Developers below.
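
Before installing packages, it can help to see which build tools are already present. The following is a minimal sketch, not part of bulk_extractor: the function name `missing_tools` is hypothetical, and the tool list is only an assumption drawn from the package lists on this page.

```shell
#!/bin/sh
# Report which of the named commands are missing from PATH.
# Prints one missing name per line; prints nothing if all are found.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tool names are an assumption based on the package lists on this page.
missing=$(missing_tools gcc g++ make flex autoconf automake)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing build tools:" $missing
else
    echo "All listed build tools were found."
fi
```

This only checks for executables; it does not detect missing development headers such as zlib-devel or openssl-devel.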

Fedora Linux Users

First, please install the build environment with the following commands:

 sudo yum update
 sudo yum groupinstall development-tools
 sudo yum install flex zlib-devel
 sudo yum install libxml2-devel openssl-devel tre-devel boost-devel
 sudo yum install gcc-c++

Note: the following specific packages may be installed instead of the development-tools group:

  git
  gcc
  gcc-c++
  autoconf
  automake
  libtool
  openssl-devel

If you want to read EWF (E01) files, please install the following package:

    sudo yum install libewf-devel

If you want to read AFF files, please install the following package:

    sudo yum install afflib-devel

Note: To read AFF files, both openssl-devel and afflib-devel must be installed. Please note, however, that AFF is in the process of being deprecated.

If the Bulk Extractor Viewer (BEViewer) is required, also install a Java JDK Version 6 or newer.

Next, to download, build and install a bulk_extractor release version: download and unzip the latest .tar.gz bulk_extractor distribution tarball available here. For example, type:

 wget http://digitalcorpora.org/downloads/bulk_extractor/bulk_extractor-1.5.0.tar.gz
 tar -xvf bulk_extractor-1.5.0.tar.gz

To install globally using sudo, please type:

 cd bulk_extractor-1.5.0
 ./configure
 make
 sudo make install

To install in your local space without sudo, please type:

 cd bulk_extractor-1.5.0
 ./configure --prefix=$HOME/local/ --exec-prefix=$HOME/local CPPFLAGS=-I$HOME/local/include/ LDFLAGS=-L$HOME/local/lib
 make
 make install
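
With the prefix used above, the standard autoconf layout places the executable in $HOME/local/bin, which may not be on your PATH. As a hedged sketch (the helper name `add_to_path` is hypothetical), the following could go in a shell startup file:

```shell
# Prepend a directory to PATH unless it is already listed.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
}

# Assumes the --prefix=$HOME/local install shown above.
add_to_path "$HOME/local/bin"
export PATH
```

Checking before prepending keeps PATH from growing each time the startup file is sourced.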

CentOS / RHEL Users

Install the build environment:

 sudo yum update
 sudo yum groupinstall "Development Tools"
 sudo yum install flex zlib-devel expat-devel
 sudo yum install libxml2-devel openssl-devel tre-devel

bulk_extractor requires Boost v1.53 or newer. CentOS / RHEL 7 includes the Boost v1.53 package.

 sudo yum install boost-devel
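
To check whether an installed Boost meets the v1.53 requirement, one option is to read the BOOST_LIB_VERSION macro from Boost's version.hpp header. This is a sketch under assumptions: the helper names are hypothetical, and /usr/include/boost/version.hpp is only the usual location for a system package.

```shell
# Print the Boost version found in a version.hpp header as "major.minor".
# BOOST_LIB_VERSION is defined there as a string such as "1_53".
boost_version() {
    sed -n 's/^#define BOOST_LIB_VERSION "\([0-9]*\)_\([0-9]*\).*/\1.\2/p' "$1"
}

# Succeed if version $1 (major.minor) is at least version $2.
version_at_least() {
    have_major=${1%%.*}; have_minor=${1#*.}
    need_major=${2%%.*}; need_minor=${2#*.}
    [ "$have_major" -gt "$need_major" ] ||
        { [ "$have_major" -eq "$need_major" ] && [ "$have_minor" -ge "$need_minor" ]; }
}

header=/usr/include/boost/version.hpp   # assumed location of the system package
if [ -f "$header" ] && version_at_least "$(boost_version "$header")" 1.53; then
    echo "System Boost is new enough."
else
    echo "Boost 1.53+ not found in $header; a manual build may be needed."
fi
```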

CentOS / RHEL 6.5 does not currently have a package for Boost v1.53+, so please build the latest Boost manually. A complete Boost build requires bzip2-devel.

Install bzip2-devel:

 sudo yum install bzip2-devel

Install Boost:

 mkdir $HOME/local
 mkdir $HOME/local/src
 cd $HOME/local/src
 wget http://sourceforge.net/projects/boost/files/boost/1.55.0/boost_1_55_0.tar.bz2
 bunzip2 boost_1_55_0.tar.bz2
 tar xvf boost_1_55_0.tar
 cd boost_1_55_0
 ./bootstrap.sh --prefix=$HOME/local
 ./b2 link=static install

If you want to run bulk_extractor on images in E01 (EWF) format, you must install libewf.

Installing libewf: The following page on code.google.com tries to explain how to install libewf on CentOS / RHEL, but the instructions are not up to date or accurate:

Using RedHat package tools (RPM)

If you want to run bulk_extractor on images in AFF format, you must install afflib.

Installing afflib: A procedure for building with AFF support needs to be added here.

To build and install a released version of bulk_extractor: download and unzip the latest .tar.gz bulk_extractor distribution tarball available here. For example, type:

 wget http://digitalcorpora.org/downloads/bulk_extractor/bulk_extractor-1.5.0.tar.gz
 tar -xvf bulk_extractor-1.5.0.tar.gz

To install globally using sudo, please type:

 cd bulk_extractor-1.5.0
 ./configure --with-boost=$HOME/local
 make
 sudo make install

To install in your local space without sudo, please type:

 cd bulk_extractor-1.5.0
 ./configure --with-boost=$HOME/local --prefix=$HOME/local/ --exec-prefix=$HOME/local CPPFLAGS=-I$HOME/local/include/ LDFLAGS=-L$HOME/local/lib
 make
 make install

Debian and Ubuntu Users

On Debian (wheezy) and Ubuntu 12.04, this was sufficient:

  $ sudo apt-get -y install gcc g++ flex libewf-dev 

Mac OS X Users

The install process for Mac users is similar to that for Fedora users. We recommend using MacPorts:

 sudo port install flex autoconf automake pkgconfig

The following might be helpful, but development code might be required. The -devel ports might not be available for OS X, but you can try to install these ports anyhow (as they will be updated eventually):

 sudo port install libewf openssl tre

At present, the libewf port is too old to provide the support needed to process E01 files, and libewf-devel is not available in ports for OS X. Therefore, please download the libewf source, then type:

  ./configure && make && sudo make install

This page is helpful: https://code.google.com/p/libewf/wiki/Building

Developers

The bulk_extractor source code is available on GitHub. First, please install the git source code management system. For example, on CentOS, RHEL, and Fedora, use yum:

 sudo yum install git

Download bulk_extractor and its submodules, then run bootstrap.sh:

 git clone --recursive https://github.com/simsong/bulk_extractor.git
 cd bulk_extractor
 sh bootstrap.sh

The bootstrap script generates the configure script from the configure.ac file. Then run ./configure to create the makefile.

 ./configure

Finally, compile with make and install the binary.

  make && make install

(Note: To install globally in a system directory, you can use sudo make install.)

If you are developing with GitHub, after a checkout, you may wish to do this:

 make gitfixup                               # brings every submodule to master
 CXXFLAGS="-fsanitize=address" ./configure   # runs with ASan (requires clang & libasan to be installed)

- Run -E with all of the scanners one-by-one with ASan to find scanner-specific bugs. Currently there seems to be a bug in email in the histogram generation process and in scan_hex.

To keep bulk_extractor and its submodules current with the latest code on GitHub, type:

 cd bulk_extractor   # or wherever you cloned the repository
 make pull

To change your repository to make it use a new master branch of a submodule:

 cd SUBMODULE_DIR                      # the submodule to update
 git pull origin master
 cd BULK_EXTRACTOR_DIR                 # back to the top of the bulk_extractor tree
 git add SUBMODULE_DIR
 git commit -m "Use latest master of submodule"
 git push                              # publish the bulk_extractor change

Compiling Notes

1. bulk_extractor builds with the GNU auto tools.

2. We recommend compiling bulk_extractor with -O3, and that is the
   default. You can disable all optimization flags by specifying the
   configure option --with-noopt.

3. Building with a different glibc: In creating bulk_extractor.so, it may be necessary to build with an older glibc. We're not sure how to do it, but one of these links may help:

4. The following directories will NOT be installed with the commands provided:
    python/   - bulk_extractor python tools. These tools are experimental.
                Copy them where you wish and run them directly.
    plugins/  - This is for C/C++ developers only. You can develop your own
                bulk_extractor plugins, which will then be run at run-time
                if the .so or .dll files are in the same directory as the
                bulk_extractor executable.
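
Since plugins are picked up from the directory that holds the executable, installing one amounts to copying its shared library there. A minimal sketch, assuming your plugin build leaves .so files in a single directory; the function name `install_plugins` and both paths in the usage comment are hypothetical:

```shell
# Copy plugin shared libraries (*.so) from a build directory into the
# directory that holds the bulk_extractor executable.
install_plugins() {
    plugin_dir=$1
    exe_dir=$2
    for lib in "$plugin_dir"/*.so; do
        [ -e "$lib" ] || continue      # the glob matched nothing
        cp "$lib" "$exe_dir"/
    done
}

# Example usage (paths are assumptions; a system directory may need sudo):
#   install_plugins plugins "$(dirname "$(command -v bulk_extractor)")"
```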

Cross-compiling for Windows

The Windows configuration of bulk_extractor can be cross-compiled on a Fedora 20 or newer system using MinGW. A script is provided in the src_win directory for configuring a Fedora virtual machine to cross-compile for Windows. Some users have also reported success cross-compiling on Ubuntu, but it is harder.

If you downloaded bulk_extractor using git (rather than downloading the .tar.gz file), run bootstrap.sh:

 sh bootstrap.sh

If you have previously run configure for a native build, please clean up:

 make distclean
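
If you are unsure whether the tree was configured before, one hedged way to tell is to look for config.status, which configure writes into the directory where it runs. The helper name `was_configured` is hypothetical:

```shell
# Succeed if the given source tree still holds output from an earlier
# ./configure run (configure writes config.status into its build directory).
was_configured() {
    [ -f "$1/config.status" ]
}

# Run from the top of the bulk_extractor source tree.
if was_configured .; then
    echo "Previous configure output found; run: make distclean"
else
    echo "No earlier configure run detected."
fi
```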

Install MinGW and the libraries required for cross-compilation. This will take some time and will require the root password:

 cd src_win
 ./CONFIGURE_F20.bash

Cross-compile bulk_extractor and build the Windows installer:

 make

Done. Please install the generated bulk_extractor Windows installer .exe file onto your Windows system.
