XPDF is a cross-platform suite of command-line PDF programs that lets you view PDF files and includes several tools for converting PDFs into text and extracting images.
- pdfdetach - Extracts files embedded in PDF files.
- pdffonts - Gives information about the fonts used in the PDF file.
- pdfinfo - Gives detailed information about the specified PDF file.
- pdftohtml - Converts the PDF into an HTML document.
- pdftopng - Renders the PDF into a series of PNG images.
- pdftoppm - Renders the PDF into a series of PPM images.
- pdftops - Converts the PDF into PS (PostScript) format.
- pdftotext - Extracts the text from the PDF document. Includes several features for extracting text while maintaining tabular and line printer layouts.
Converting to PNG
PDFtoPNG.exe will convert each page of the PDF into a PNG image with a DPI you specify. Example:
pdftopng.exe -r 72 mydocument.pdf x
This will convert every page of the PDF into a series of PNG images at 72 DPI. The files are named after the page number: x-000001.png, x-000002.png, etc.
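If you want to drive this conversion from a script, the command above can be wrapped in a few lines of Python. This is only a sketch: the function names (pdftopng_command, rendered_pages, convert) are made up for illustration, and it assumes the Xpdf tools are on your PATH under the name pdftopng. It only uses the -r option shown above.

```python
import subprocess
from pathlib import Path


def pdftopng_command(pdf_path, root, dpi=72, exe="pdftopng"):
    """Build the argument list for: pdftopng -r <dpi> <pdf> <root>."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return [exe, "-r", str(dpi), str(pdf_path), str(root)]


def rendered_pages(root):
    """Return the PNG files written for this root, sorted by name.

    The numeric part of each name is zero-padded, so sorting by
    name also sorts by page.
    """
    root = Path(root)
    return sorted(root.parent.glob(root.name + "-*.png"))


def convert(pdf_path, root, dpi=72):
    """Run the conversion (needs pdftopng installed) and list the results."""
    subprocess.run(pdftopng_command(pdf_path, root, dpi), check=True)
    return rendered_pages(root)
```

Calling convert("mydocument.pdf", "x") would run the same command as the example and return the list of PNG files it produced.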
Converting to Text
PDFtoText.exe is the best program I've seen for converting a PDF document into a plain text file, because it has several features to help retain the original layout of the document. Example:
pdftotext.exe -table mydocument.pdf mytext.txt
This will convert the PDF document into a text file while trying to retain any tabular layouts.
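For a whole folder of PDFs, the same command can be repeated for each file. The sketch below is one way to do that in Python; pdftotext_jobs and convert_folder are hypothetical helper names, and it assumes pdftotext is on your PATH. It writes each text file next to its PDF with a .txt extension.

```python
import subprocess
from pathlib import Path


def pdftotext_jobs(folder, exe="pdftotext"):
    """Yield (command, output_path) for each PDF directly inside folder."""
    for pdf in sorted(Path(folder).glob("*.pdf")):
        out = pdf.with_suffix(".txt")
        # Same options as the example above: -table, input PDF, output text file.
        yield [exe, "-table", str(pdf), str(out)], out


def convert_folder(folder):
    """Run every job (needs pdftotext installed) and return the text files."""
    outputs = []
    for cmd, out in pdftotext_jobs(folder):
        subprocess.run(cmd, check=True)
        outputs.append(out)
    return outputs
```

Building the commands separately from running them makes it easy to print them first and check what will happen before touching any files.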
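Extracting Images
PDFImages.exe will extract the images embedded in a PDF. Example:
pdfimages.exe -j mydocument.pdf x
This will extract all of the images from the PDF into a series of files named x-0000.ppm, x-0001.ppm, etc. With the -j option, images that are stored as JPEGs in the PDF are saved as .jpg files instead of being converted.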
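After an extraction it can be handy to see how many images came out as JPEGs and how many were written in another format. The following Python sketch just counts the output files by extension; count_extracted is a made-up helper name, and it assumes the files were written using the root name you passed to pdfimages.

```python
from collections import Counter
from pathlib import Path


def count_extracted(root):
    """Count the files named <root>-*, grouped by lowercase extension."""
    root = Path(root)
    files = root.parent.glob(root.name + "-*")
    return Counter(f.suffix.lower() for f in files if f.is_file())
```

For the example above you would call count_extracted("x"). Note that this matches every file starting with x-, so use a different root name for pdfimages than for pdftopng if both write to the same folder.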
- foolabs.com/xpdf/home.html - Official.