Sherloq - An Open-Source Digital Image Forensic Toolset

An open-source image forensic toolset

Introduction
"Forensic Image Analysis is the application of image science and domain expertise to interpret the content of an image and/or the image itself in legal matters. Major subdisciplines of Forensic Image Analysis with law enforcement applications include: Photogrammetry, Photographic Comparison, Content Analysis, and Image Authentication." (Scientific Working Group on Imaging Technologies)
Sherloq is a personal research project about implementing a fully integrated environment for digital image forensics. It is not meant as an automatic tool that decides whether an image is forged (that tool will probably never exist...), but as a companion for putting various algorithms to work to discover potential image inconsistencies.
While many commercial solutions are prohibitively expensive and reserved for law enforcement and government agencies, this toolset aims to be a powerful and extensible framework that provides a starting point for anyone interested in testing or developing state-of-the-art forensic algorithms.
I strongly believe that security-by-obscurity is the wrong way to offer any kind of security service (e.g. "Using this proprietary software I guarantee you that this photo is pristine... and you have to trust me!"). Instead, following the open-source mentality, everyone should be able to personally experiment with various techniques, gain more knowledge and share it with the community... even better if they propose code improvements! :)

Features
A Qt-based GUI provides highly responsive widgets for panning, zooming and inspecting images, while all image processing routines are handled by OpenCV for best efficiency. The software is based on a multi-document interface that can use a floating or tabbed view for subwindows, and tool outputs can be exported in various textual and graphical formats.
These are the currently planned functions [(***) = fully implemented, (**) = partially implemented, (*) = not yet implemented]:

General

Original Image: display the unaltered reference image for visual inspection (***)
Image Digest: compute byte and perceptual hashes together with extension ballistics (**)
Similarity Search: use reverse search services for finding similar images on the web (*)
Automatic Tagging: exploit deep learning algorithms for automatic picture tagging (*)

File

Metadata Dump: gather all metadata information and display security warnings (**)
EXIF Structure: dump the physical EXIF structure and display an interactive view (***)
Thumbnail Analysis: if present, extract embedded thumbnail and highlight discrepancies (***)
Geolocation Data: if present, get geographic data and locate them on a world map view (***)

Inspection

Enhancing Magnifier: apply local visual enhancements for better identifying forgeries (***)
Image Adjustments: apply standard adjustments (contrast, brightness, hue, saturation, ...) (***)
Tonal Range Sweep: interactive tonality range compression for easier artifact detection (***)
Reference Comparison: synchronized double view to compare reference and evidence images (***)

JPEG

Quality Estimation: extract quantization tables and estimate last saved JPEG quality (***)
Compression Ghosts: use error residuals to detect multiple compressions at different levels (**)
Double Compression: exploit First Digit Statistics to discover potential double compression (**)
Error Level Analysis: identify areas with different compression levels against a fixed quality (***)
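
The First Digit Statistics idea behind the Double Compression entry can be sketched as follows. This is a minimal illustration, not Sherloq's actual code: it assumes the quantized DCT coefficients have already been extracted into a plain vector (that step is not shown), and the function names firstDigitHistogram, benford and benfordDivergence are hypothetical. It histograms the leading decimal digit of every nonzero coefficient and measures how far the result is from Benford's law, log10(1 + 1/d). Practical detectors usually go further, for example by fitting a generalized form of the law and feeding the features to a classifier.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Normalized histogram of the leading decimal digit (1..9) of every nonzero
// coefficient. Index 0 holds digit 1, index 8 holds digit 9.
// The input vector is an assumed stand-in for quantized DCT coefficients.
std::array<double, 9> firstDigitHistogram(const std::vector<int>& coeffs) {
    std::array<double, 9> hist{};  // zero-initialized
    std::size_t count = 0;
    for (int c : coeffs) {
        long long v = std::llabs(static_cast<long long>(c));
        if (v == 0) continue;          // zero has no leading digit
        while (v >= 10) v /= 10;       // strip everything but the first digit
        hist[static_cast<std::size_t>(v - 1)] += 1.0;
        ++count;
    }
    if (count > 0) {
        for (double& h : hist) h /= static_cast<double>(count);
    }
    return hist;
}

// Probability of leading digit d under Benford's law.
double benford(int d) {
    assert(d >= 1 && d <= 9);
    return std::log10(1.0 + 1.0 / d);
}

// Chi-square-style distance between an observed first-digit histogram and
// Benford's law. Larger values mean a stronger deviation.
double benfordDivergence(const std::array<double, 9>& hist) {
    double sum = 0.0;
    for (int d = 1; d <= 9; ++d) {
        const double expected = benford(d);
        const double diff = hist[static_cast<std::size_t>(d - 1)] - expected;
        sum += diff * diff / expected;
    }
    return sum;
}
```

A large divergence is only a hint that the coefficient statistics were disturbed. Any decision threshold would have to be calibrated on real images.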

Colors

RGB/HSV 3D Plots: display interactive 2D and 3D plots of RGB and HSV pixel data (*)
Color Space Conversion: convert image into RGB/HSV/YCbCr/Lab/CMYK color spaces (***)
Principal Component Analysis: use PCA to project RGB values onto a different vector space (***)
RGB Pixel Statistics: compute minimum/maximum/average RGB values for every pixel (***)
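
As an illustration of the RGB Pixel Statistics entry, here is a minimal standard-library sketch that computes the per-pixel minimum, maximum and rounded average of the three channels. The interleaved 8-bit buffer layout, the PixelStats struct and the rgbPixelStats name are assumptions made for this example. In the real tool the data would presumably come from an OpenCV matrix.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One value per pixel for each statistic (hypothetical container).
struct PixelStats {
    std::vector<std::uint8_t> minimum;
    std::vector<std::uint8_t> maximum;
    std::vector<std::uint8_t> average;
};

// `rgb` holds interleaved 8-bit triplets, one triplet per pixel.
// Channel order does not matter for these three statistics.
PixelStats rgbPixelStats(const std::vector<std::uint8_t>& rgb) {
    assert(rgb.size() % 3 == 0);
    const std::size_t pixels = rgb.size() / 3;
    PixelStats stats;
    stats.minimum.resize(pixels);
    stats.maximum.resize(pixels);
    stats.average.resize(pixels);
    for (std::size_t i = 0; i < pixels; ++i) {
        const int r = rgb[3 * i];
        const int g = rgb[3 * i + 1];
        const int b = rgb[3 * i + 2];
        stats.minimum[i] = static_cast<std::uint8_t>(std::min({r, g, b}));
        stats.maximum[i] = static_cast<std::uint8_t>(std::max({r, g, b}));
        // Adding 1 before the integer division rounds to the nearest value.
        stats.average[i] = static_cast<std::uint8_t>((r + g + b + 1) / 3);
    }
    return stats;
}
```

Each output vector can be displayed as a grayscale image of the same size as the input.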

Luminance

Luminance Gradient: analyze brightness variations along X/Y axes of the image (***)
Frequency Separation: extract the finest details of the luminance channel (*)
Echo Edge Filter: use 2D Laplacian filter to reveal artificially blurred zones (***)
Wavelet Reconstruction: re-synthesize the image while varying wavelet coefficient thresholds (*)
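
The 2D Laplacian mentioned for the Echo Edge Filter can be sketched with plain loops. This is a minimal illustration under stated assumptions: a row-major grayscale buffer of int values, the common 4-neighbour kernel, and a hypothetical function name laplacianMagnitude. The source does not say which kernel or post-processing Sherloq uses, and an OpenCV-based implementation would more likely call cv::Laplacian than loop by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Absolute response of the 4-neighbour Laplacian kernel
//    0  1  0
//    1 -4  1
//    0  1  0
// on a row-major grayscale image. Border pixels are left at zero.
std::vector<int> laplacianMagnitude(const std::vector<int>& gray,
                                    std::size_t width, std::size_t height) {
    assert(gray.size() == width * height);
    std::vector<int> out(gray.size(), 0);
    for (std::size_t y = 1; y + 1 < height; ++y) {
        for (std::size_t x = 1; x + 1 < width; ++x) {
            const std::size_t i = y * width + x;
            const int response = gray[i - width] + gray[i + width] +
                                 gray[i - 1] + gray[i + 1] - 4 * gray[i];
            out[i] = std::abs(response);
        }
    }
    return out;
}
```

Sharp detail gives a strong response, while smooth or blurred regions stay close to zero, so a region that is unusually flat compared with its surroundings may deserve a closer look.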

Noise

Noise Extraction: estimate and separate the natural noise component of the image (***)
Min/Max Deviation: highlight pixels deviating from block-based min/max statistics (***)
SNR Consistency: evaluate uniformity of signal-to-noise ratio across the image (***)
Noise Segmentation: cluster uniform noise areas for easier tampering detection (*)
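
One simple way to approximate the Noise Extraction step is to subtract a denoised copy of the image from the original. The sketch below uses a 3x3 median filter as the denoiser, which is an assumption chosen for brevity: the source does not state which estimator Sherloq applies. The medianResidual name and the row-major int buffer are likewise hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Noise residual: original pixel minus the median of its 3x3 neighbourhood.
// Border pixels are left at zero.
std::vector<int> medianResidual(const std::vector<int>& gray,
                                std::size_t width, std::size_t height) {
    assert(gray.size() == width * height);
    std::vector<int> residual(gray.size(), 0);
    for (std::size_t y = 1; y + 1 < height; ++y) {
        for (std::size_t x = 1; x + 1 < width; ++x) {
            // Collect the nine values of the neighbourhood.
            std::array<int, 9> window;
            std::size_t k = 0;
            for (std::size_t wy = y - 1; wy <= y + 1; ++wy) {
                for (std::size_t wx = x - 1; wx <= x + 1; ++wx) {
                    window[k++] = gray[wy * width + wx];
                }
            }
            // After nth_element the middle slot holds the median.
            std::nth_element(window.begin(), window.begin() + 4, window.end());
            residual[y * width + x] = gray[y * width + x] - window[4];
        }
    }
    return residual;
}
```

The residual can then be amplified for display, or its local variance compared across regions, since pasted content often carries a different noise level from the rest of the image.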

Tampering

Contrast Enhancement: analyze histogram inconsistencies caused by enhancements (***)
Clone Detection: use invariant feature descriptors for copy/rotate clone area detection (**)
Resampling Detection: analyze 2D pixel interpolation for detecting resampling traces (**)
Splicing Detection: use DCT coefficient statistics for automatic splicing zone detection (*)
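
The histogram inconsistencies mentioned under Contrast Enhancement can be illustrated with a small sketch. Stretching the contrast of an 8-bit image typically leaves empty bins ("gaps") between occupied intensity levels, so counting those gaps gives a crude indicator. This is only one possible cue and not necessarily the analysis Sherloq performs. The names grayHistogram and countHistogramGaps are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// 256-bin histogram of an 8-bit grayscale image.
std::array<std::size_t, 256> grayHistogram(const std::vector<std::uint8_t>& gray) {
    std::array<std::size_t, 256> hist{};  // zero-initialized
    for (std::uint8_t v : gray) ++hist[v];
    return hist;
}

// Number of empty bins strictly between the lowest and highest occupied
// intensity levels. Returns 0 for an empty histogram.
std::size_t countHistogramGaps(const std::array<std::size_t, 256>& hist) {
    std::size_t lo = 0;
    while (lo < 256 && hist[lo] == 0) ++lo;
    if (lo == 256) return 0;  // no pixels at all
    std::size_t hi = 255;
    while (hist[hi] == 0) --hi;  // stops at lo at the latest
    assert(hi >= lo);
    std::size_t gaps = 0;
    for (std::size_t v = lo + 1; v < hi; ++v) {
        if (hist[v] == 0) ++gaps;
    }
    return gaps;
}
```

For example, the levels 10, 11, 12, 13 produce no gaps, while the same levels doubled to 20, 22, 24, 26 produce three. Natural images can also have sparse histograms, so the count should be read as a hint and not as proof of enhancement.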

Setup
The software is written in C++11 using the Qt framework for a platform-independent GUI and the OpenCV library for efficient image processing. Other external dependencies are ExifTool for metadata extraction, LIBSVM for forgery detection and AlgLib for histogram manipulation.
Although the project objective is clear, the software is currently an early prototype, so some functionality is still missing (see the list above) and it can be run only from Qt Creator under Linux. I put it on GitHub to track my development progress even during the alpha stage, so expect issues, bugs and installation headaches. However, if you want to take a look around, feel free to contact me if you have problems getting it to run.

Screenshots