I’ve been referencing video evidence by name alone for years, and only recently learned what I was leaving out.

If a video showed an F-250 running a red light and t-boning a Smart car at 55 mph, and it was called “Dunkin-Surv-01.mp4,” I’d simply reference it by the filename. However, after learning more about video (hat tip to Mark Crouch), I now reference both the name and the “hash value.”

Think of the hash value as the digital fingerprint of the file. If even a single bit of the file changes, the hash value changes too, so you can detect any alteration and confirm you're working with the unaltered original.

There are many different hashing algorithms, but they all aim to do the same things:

  • Take a digital file of any size and chop/mix it (hence the term hash) mathematically.

  • Output a fixed-length hash value, no matter how large the input is.

  • Ensure tiny input changes yield wildly different outputs (called the avalanche effect).

  • Act as a one-way process (you can’t practically reconstruct the input from the hash).

  • Make it extremely difficult to find two different inputs that produce the same hash value (dubbed collision resistance).
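To see a couple of those properties in action, here is a minimal Python sketch using the standard library's hashlib module. The helper names (sha256_hex, differing_bits) and the sample byte strings are my own, made up for illustration; they aren't taken from any forensic tool.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hash of some bytes as 64 hexadecimal characters."""
    return hashlib.sha256(data).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count how many bits differ between two hex digests of equal length."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# Fixed-length output: one byte in, or a million bytes in, same size out.
short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 1_000_000)
print(len(short_digest), len(long_digest))

# Avalanche effect: change only the last character of the input
# (hypothetical sample data) and compare the two digests.
original_digest = sha256_hex(b"Dunkin-Surv-01 sample data")
altered_digest = sha256_hex(b"Dunkin-Surv-01 sample datb")
print(original_digest)
print(altered_digest)
print(differing_bits(original_digest, altered_digest), "of 256 bits differ")
```

With a well-behaved hash, a one-character change flips about half of the output bits on average, which is why the two digests look completely unrelated.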

The idea of one-way functions was introduced in 1976 and the first practical hash proposal was presented in 1978. In his 1979 thesis, Ralph Merkle described what later became known as the “Merkle-Damgård construction” for building hash functions (Ivan Damgård arrived at it independently), which laid the groundwork for modern hashing methods.

The NSA developed SHA-0 in 1993, then SHA-1 in 1995. Both produce 160-bit hashes (40 hexadecimal digits). An amped-up research team from Google and CWI Amsterdam broke SHA-1's collision resistance in 2017, creating two different PDFs with the same SHA-1 hash value (the PDFs were published online if you want to mess with them). However, that effort required roughly 6,500 CPU-years of computation, so it’s no easy feat.

Even before that happened, the NSA had introduced the SHA-2 family in 2002, including SHA-256, which produces a 256-bit output (64 hexadecimal characters). SHA-256 is used in everything from Bitcoin to TLS certificates and file integrity checks, which is real handy in forensics. To date, SHA-256 hasn’t been cracked, though quantum computers are coming!

So, how do you generate a hash value? Amped FIVE and Axon Investigate have the algorithms built in, as does the freeware hex editor HxD. For the latter, drop the file in, click Analysis, then Checksums, select your weapon of choice, and the hash value will be displayed.
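If you'd rather script it, the same thing can be sketched in a few lines of Python with the standard hashlib module. This is only an illustration, not how any of the tools above work internally; the function names (hash_file, matches_recorded_hash) and the 1 MB chunk size are my own assumptions.

```python
import hashlib
import sys


def hash_file(path: str, algorithm: str = "sha256",
              chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large videos never have to fit in memory.

    Returns the digest as uppercase hexadecimal characters.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read up to chunk_size bytes at a time until read() returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def matches_recorded_hash(path: str, recorded_hex: str,
                          algorithm: str = "sha256") -> bool:
    """Check a file against a hash value recorded earlier (case-insensitive)."""
    return hash_file(path, algorithm) == recorded_hex.strip().upper()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python hash_video.py Dunkin-Surv-01.mp4
    video_path = sys.argv[1]
    print("SHA-256:", hash_file(video_path))
    print("SHA-1:  ", hash_file(video_path, "sha1"))
```

Record the algorithm along with the value: a 64-character SHA-256 hash and a 40-character SHA-1 hash of the same file look nothing alike, so whoever checks your work needs to know which one you used.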

So, instead of only referencing “Dunkin-Surv-01.mp4,” maybe toss an A71EB337AB0BA1C9B066C1A5D141C768CBBE7272 in there too!

Thanks for reading, keep learning!

Lou Peck
Lightpoint | JS Forensics
