Difference between revisions of "XPDF"
==Links==
{{Link|Wikipedia|https://en.wikipedia.org/wiki/Xpdf}}
{{Link|Official|http://www.foolabs.com/xpdf/home.html}}
[[Category: Graphic Software]]
[[Category: Useful Software]]
[[Category: Linux Software]]
[[Category: Macintosh Software]]
[[Category: Windows Software]]
[[Category: Free Software]]
[[Category: Open Source Software]]
Revision as of 14:00, 18 March 2019
XPDF is a free, open-source, cross-platform suite of command-line PDF programs that lets you view PDF files and includes several tools for converting PDFs into text and extracting images. XPDF was originally written for Linux, but has been ported to Windows and Macintosh.
Programs
PDFDetach
Extracts files embedded in PDF files.
PDFFonts
Gives information about the fonts used in the PDF file.
PDFImages
Extracts images from PDF files. Images are saved in PPM format by default, but an option lets you keep embedded JPEG images in their original format.
PDFInfo
Gives detailed information about the specified PDF file.
PDFToHTML
Converts the PDF into an HTML document.
PDFToPNG
Renders the PDF into a series of PNG images.
PDFToPPM
Renders the PDF into a series of PPM images.
PDFToPS
Converts the PDF into PS (PostScript) format.
PDFToText
Extracts the text from the PDF document. It includes several options for extracting text while maintaining tabular and line-printer layouts.
Examples
Converting to PNG
PDFtoPNG.exe will convert each page of the PDF into a PNG image with a DPI you specify. Example:
pdftopng.exe -r 72 mydocument.pdf x
This will convert every page of the PDF into a series of PNG images at 72 DPI. Each file is named after its page number (x-000001.png, x-000002.png, etc.).
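If you need to run this conversion from a script, the call can be wrapped in a few lines of Python. This is only a minimal sketch: the function names pdftopng_command and render_pages are made up for illustration, and it assumes the pdftopng program is on your PATH (otherwise pass the full path to pdftopng.exe as exe).

```python
import glob
import subprocess


def pdftopng_command(pdf_path, root, dpi=72, exe="pdftopng"):
    """Build the argument list for the pdftopng call shown above."""
    # -r sets the resolution in DPI; root is the prefix of the output files.
    return [exe, "-r", str(dpi), pdf_path, root]


def render_pages(pdf_path, root, dpi=72, exe="pdftopng"):
    """Run pdftopng and return the PNG files it wrote, sorted by name."""
    # check=True raises CalledProcessError if the tool reports a failure.
    subprocess.run(pdftopng_command(pdf_path, root, dpi, exe), check=True)
    # Collect whatever files the tool produced instead of guessing their names.
    return sorted(glob.glob(glob.escape(root) + "-*.png"))
```

Calling render_pages("mydocument.pdf", "x") runs the same command as the example above and returns the list of PNG files it finds afterwards.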
Converting to Text
PDFtoText.exe is the best program I've seen for converting a PDF document into a plain text file, because it has several features that help retain the original layout of the document. Example:
pdftotext.exe -table mydocument.pdf mytext.txt
This will convert the PDF document into a text file while trying to retain any tabular layouts.
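To convert a whole folder of PDFs the same way, something like the following Python sketch could work. The helper names (text_output_path, pdftotext_command, convert_folder) are hypothetical, and it assumes pdftotext is on your PATH and that you want each text file saved next to its PDF.

```python
import subprocess
from pathlib import Path


def text_output_path(pdf_path):
    """Return the .txt path that sits next to the given PDF."""
    return Path(pdf_path).with_suffix(".txt")


def pdftotext_command(pdf_path, exe="pdftotext"):
    """Build the pdftotext call with the -table option used above."""
    return [exe, "-table", str(pdf_path), str(text_output_path(pdf_path))]


def convert_folder(folder, exe="pdftotext"):
    """Convert every .pdf file in a folder and return the text file paths."""
    written = []
    for pdf in sorted(Path(folder).glob("*.pdf")):
        # check=True stops the loop with an error if one conversion fails.
        subprocess.run(pdftotext_command(pdf, exe), check=True)
        written.append(text_output_path(pdf))
    return written
```

convert_folder(".") would convert every PDF in the current folder, one text file per document.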
Extracting Images
PDFImages.exe can extract all of the images from a PDF document. Unless you tell it to keep JPEGs in their original format, all images will be exported as PPM. Example:
pdfimages.exe -j mydocument.pdf x
This will extract all of the images from the PDF into a series of files named x-0000.ppm, x-0001.ppm, etc. Because of the -j option, any JPEG images are saved as .jpg files instead of being converted to PPM.
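Since one run can produce a mix of file types, it can be handy to sort the results by extension afterwards. Here is a rough Python sketch; pdfimages_command, group_by_extension and extract_images are illustrative names, not part of XPDF, and the code assumes pdfimages is on your PATH.

```python
import glob
import subprocess
from pathlib import Path


def pdfimages_command(pdf_path, root, exe="pdfimages"):
    """Build the pdfimages call with the -j option used above."""
    return [exe, "-j", pdf_path, root]


def group_by_extension(filenames):
    """Group file names by their lowercased extension, e.g. '.ppm'."""
    groups = {}
    for name in filenames:
        groups.setdefault(Path(name).suffix.lower(), []).append(name)
    return groups


def extract_images(pdf_path, root, exe="pdfimages"):
    """Run pdfimages, then return the files it wrote grouped by type."""
    subprocess.run(pdfimages_command(pdf_path, root, exe), check=True)
    # Pick up every file that starts with the chosen root prefix.
    return group_by_extension(sorted(glob.glob(glob.escape(root) + "-*")))
```

extract_images("mydocument.pdf", "x") returns a dictionary keyed by extension, so you can see at a glance which images stayed as JPEGs.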