Looking for a good PDF tool to identify the author of a PDF file and any modifications made to an e-filing acknowledgement over a span of a few days …
I find PDF Stream Dumper by David Zimmer and PDF CanOpener quite useful in PDF analysis.
A powerful text editor is also a must. 010 Editor has a binary template for the PDF format by Didier Stevens and Christian Mehlmauer, which makes it a good option for reviewing PDF structures.
Didier Stevens also has a few Python scripts that target PDF documents here:
I hope other examiners in the community have some additional recommendations for you.
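On the question of spotting modifications made over a few days: one quick check, before opening any of the tools above, is to count the %%EOF markers in the raw file, since an incremental save normally appends a new cross-reference section and trailer ending in its own %%EOF. Below is a minimal Python sketch of that idea. The function names are my own, and the count is only a hint: linearized files can contain more than one marker without any edits, and a tool that rewrites the whole file on save leaves only one.

```python
import re

EOF_MARKER = re.compile(rb"%%EOF")


def revision_offsets(pdf_bytes: bytes) -> list[int]:
    """Return the byte offset just past each %%EOF marker in the file."""
    return [m.end() for m in EOF_MARKER.finditer(pdf_bytes)]


def count_revisions(pdf_bytes: bytes) -> int:
    """Count %%EOF markers; more than one suggests incremental updates."""
    return len(revision_offsets(pdf_bytes))


if __name__ == "__main__":
    import sys

    # Usage: python count_eof.py suspect.pdf
    with open(sys.argv[1], "rb") as fh:
        data = fh.read()
    print("EOF markers:", count_revisions(data))
    print("Offsets:", revision_offsets(data))
```

Cutting the file at each reported offset gives a candidate earlier revision that can be opened and compared with the final version.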
I second PDF Stream Dumper and Didier Stevens's work. Stream Dumper doesn’t show or deal with image masks. I’ve tried JPedal without result. Has anyone been able to extract an original JPEG image from multiple image masks in a PDF?
I haven’t had to do this sort of forensics outside a CTF. What’s the real-world use case here? What are you seeing?
Because many PDFs come in via email, I start with Metaspike FEI. It lets me see attachments (often other emails) and extract them from the EMLs, looking for PDFs.
Then I use PdfWalker, which comes in Lenny Zeltser’s REMnux (REMnux® | SANS Institute), to pull PDFs apart and save embedded JFIFs.
Phil Harvey’s ExifTool gives useful info on the various extracted files and gives a good view of producing software, dates, etc. (https://exiftool.org/). For PDFs, the most stable dates are CreateDate and ModifyDate. A CSV output with rows and columns transposed in Excel gives a human-readable summary of metadata.
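If you would rather skip the Excel step, the transpose can be scripted. The sketch below assumes the CSV came from something like `exiftool -csv -CreateDate -ModifyDate -Producer *.pdf > meta.csv`, which writes one row per file. The function name and file names are only illustrative.

```python
import csv
import io


def transpose_csv(text: str) -> str:
    """Swap rows and columns so each file becomes a column, each tag a row."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ""
    # Pad short rows so no values are dropped when zipping.
    width = max(len(r) for r in rows)
    padded = [r + [""] * (width - len(r)) for r in rows]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(zip(*padded))
    return out.getvalue()


if __name__ == "__main__":
    import sys

    # Usage: python transpose_meta.py meta.csv > meta_transposed.csv
    with open(sys.argv[1], newline="", encoding="utf-8") as fh:
        sys.stdout.write(transpose_csv(fh.read()))
```

With one column per file, differing Producer or ModifyDate values across copies of the same document line up side by side.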
Hi. Thanks. I’ve requested a wet copy, so the need has likely gone, but out of interest: I’m seeing different color spaces in PDFs of the same ID document. I can only extract parts of the whole image via Stream Dumper. I am seeing artifacts in the pieces but think that is due to the image mask compression.
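For comparing color spaces and mask usage across several copies of the same document, a rough first pass is to scan the raw bytes for the relevant dictionary keys. This is only a sketch under some assumptions: it sees just the objects stored uncompressed in the file body (anything packed into object streams is missed), it skips color spaces given as indirect references, and /SMask can also appear in graphics state dictionaries, so treat the counts as leads rather than findings. The function name is my own.

```python
import re
from collections import Counter

# Matches "/ColorSpace /DeviceRGB" and "/ColorSpace [/ICCBased 9 0 R]".
COLORSPACE_RE = re.compile(rb"/ColorSpace\s*\[?\s*/([A-Za-z0-9]+)")


def summarize_images(pdf_bytes: bytes) -> dict:
    """Tally directly named color spaces and mask-related keys."""
    spaces = Counter(
        m.group(1).decode("ascii") for m in COLORSPACE_RE.finditer(pdf_bytes)
    )
    return {
        "color_spaces": dict(spaces),
        "smask_refs": len(re.findall(rb"/SMask\b", pdf_bytes)),
        "image_masks": len(re.findall(rb"/ImageMask\s+true", pdf_bytes)),
    }


if __name__ == "__main__":
    import sys

    # Usage: python image_summary.py copy1.pdf copy2.pdf
    for path in sys.argv[1:]:
        with open(path, "rb") as fh:
            print(path, summarize_images(fh.read()))
```

Running it over each copy and comparing the output shows quickly whether one file uses, say, DeviceRGB where another uses ICCBased, or whether only one of them carries soft masks.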