PDF Manipulations and Conversions from the Linux Command Prompt

If PDF is electronic paper, then pdftk is an electronic staple-remover, hole-punch, binder, secret-decoder-ring, and X-Ray-glasses. Pdftk is a simple tool for doing everyday things with PDF documents. Pdftk allows you to manipulate PDF easily and freely. It does not require Acrobat, and it runs on Linux, Windows, Mac OS X, FreeBSD and Solaris.

In Debian/Ubuntu you can install pdftk via apt:

$ sudo aptitude install pdftk

Examples:

Merge Two or More PDFs into a New Document

$ pdftk 1.pdf 2.pdf 3.pdf cat output 123.pdf

or (Using Handles):

$ pdftk A=1.pdf B=2.pdf cat A B output 12.pdf

or (Using Wildcards):

$ pdftk *.pdf cat output combined.pdf
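One thing to watch with the wildcard form: the shell expands *.pdf in lexicographic order, so 10.pdf is merged before 2.pdf. The sketch below lists the PDFs of a directory in natural (numeric-aware) order instead. The function name list_pdfs_natural is made up for this illustration, and it assumes GNU sort (for -V), a find that supports -maxdepth, and file names without newlines.

```shell
# List the PDF files directly inside a directory in natural order,
# one path per line (1.pdf, 2.pdf, 10.pdf rather than 1.pdf, 10.pdf, 2.pdf).
# Assumption: GNU sort -V and find -maxdepth are available.
list_pdfs_natural() {
    dir=${1:-.}
    find "$dir" -maxdepth 1 -type f -name '*.pdf' | sort -V
}

# Possible usage (needs pdftk and GNU xargs; the output is written outside
# the input directory so it is not picked up as an input on a later run):
#   list_pdfs_natural . | xargs -d '\n' sh -c 'pdftk "$@" cat output /tmp/combined.pdf' sh
```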

Collate Selected Pages from Multiple PDFs into a New Document

$ pdftk A=one.pdf B=two.pdf cat A1-7 B1-5 A8 output combined.pdf
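Page ranges such as 1-7 can also be generated rather than typed by hand, for example to cut a long document into fixed-size parts. The following is a minimal sketch; page_ranges is a hypothetical helper name, and the page total is passed in by the caller rather than read from the PDF.

```shell
# Print page ranges that cover pages 1..total in chunks of `size` pages,
# one range per line, in the START-END form that pdftk's cat accepts.
page_ranges() {
    total=$1
    size=$2
    start=1
    while [ "$start" -le "$total" ]; do
        end=$((start + size - 1))
        if [ "$end" -gt "$total" ]; then
            end=$total
        fi
        echo "$start-$end"
        start=$((end + 1))
    done
}

# Possible usage (needs pdftk; in.pdf is assumed to have 25 pages):
#   for r in $(page_ranges 25 10); do
#       pdftk in.pdf cat "$r" output "part-$r.pdf"
#   done
```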

Rotate the First Page of a PDF to 90 Degrees Clockwise

$ pdftk in.pdf cat 1E 2-end output out.pdf

Rotate an Entire PDF Document’s Pages to 180 Degrees

$ pdftk in.pdf cat 1-endS output out.pdf

Encrypt a PDF using 128-Bit Strength (the Default) and Withhold All Permissions (the Default)

$ pdftk mydoc.pdf output mydoc.128.pdf owner_pw foopass

Same as Above, Except a Password is Required to Open the PDF

$ pdftk mydoc.pdf output mydoc.128.pdf owner_pw foo user_pw baz

Same as Above, Except Printing is Allowed (after the PDF is Open)

$ pdftk mydoc.pdf output mydoc.128.pdf owner_pw foo user_pw baz allow printing
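Passwords typed on the command line end up in the shell history and are visible in the process list while pdftk runs. The pdftk manual describes a PROMPT keyword that asks for the password interactively instead. A small wrapper might look like this; encrypt_pdf is a made-up name, and the permission set (printing only) simply mirrors the example above.

```shell
# Encrypt a PDF, letting pdftk prompt for both passwords so they never
# appear on the command line. Arguments: input file, output file.
encrypt_pdf() {
    pdftk "$1" output "$2" owner_pw PROMPT user_pw PROMPT allow printing
}

# Possible usage (needs pdftk and an interactive terminal):
#   encrypt_pdf mydoc.pdf mydoc.128.pdf
```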

Decrypt a PDF

$ pdftk secured.pdf input_pw foopass output unsecured.pdf

Join Two Files, One of Which is Encrypted (the Output is Not Encrypted)

$ pdftk A=secured.pdf mydoc.pdf input_pw A=foopass cat output combined.pdf

Uncompress PDF Page Streams for Editing the PDF Code in a Text Editor

$ pdftk mydoc.pdf output mydoc.clear.pdf uncompress

Repair a PDF’s Corrupted XREF Table and Stream Lengths (If Possible)

$ pdftk broken.pdf output fixed.pdf

Burst a Single PDF Document into Single Pages and Report its Data to doc_data.txt

$ pdftk mydoc.pdf burst
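By default burst writes the pages into the current directory as pg_0001.pdf, pg_0002.pdf and so on. According to the pdftk manual, the output option accepts a printf-style pattern to change that naming. The sketch below uses it to collect the pages in a separate directory; burst_pages and the default directory name pages are assumptions made for this example.

```shell
# Burst a PDF into single pages inside a separate directory.
# Arguments: input file, output directory (default: pages).
burst_pages() {
    in=$1
    outdir=${2:-pages}
    mkdir -p "$outdir" || return 1
    # %03d is filled in by pdftk with the page number (001, 002, ...).
    pdftk "$in" burst output "$outdir/page_%03d.pdf"
}

# Possible usage (needs pdftk):
#   burst_pages mydoc.pdf mydoc-pages
```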

Report on PDF Document Metadata, Bookmarks and Page Labels

$ pdftk mydoc.pdf dump_data output report.txt
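The report is plain text made of "Key: value" lines, which makes it easy to pick out single fields in a script. As an illustration, the helper below reads such a report on standard input and prints the page count from its NumberOfPages line. The name page_count is invented here.

```shell
# Read pdftk dump_data output on stdin and print the value of the
# NumberOfPages field (prints nothing if the field is missing).
page_count() {
    awk -F': ' '$1 == "NumberOfPages" { print $2; exit }'
}

# Possible usage (needs pdftk):
#   pdftk mydoc.pdf dump_data | page_count
#   page_count < report.txt
```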

Poppler is a PDF rendering library based on the xpdf-3.0 code base. The poppler-utils package contains pdftops (PDF to PostScript converter), pdfinfo (PDF document information extractor), pdfimages (PDF image extractor), pdftohtml (PDF to HTML converter), pdftotext (PDF to text converter), and pdffonts (PDF font analyzer).

Debian/Ubuntu users can install poppler-utils via apt:

$ sudo aptitude install poppler-utils

Convert PDF to TEXT

Pdftotext converts Portable Document Format (PDF) files to plain text.

$ pdftotext example.pdf example.txt

If the text file is not specified, pdftotext converts file.pdf to file.txt. If the text file is '-', the text is sent to stdout.
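Writing to stdout makes pdftotext usable in pipelines. As a rough sketch, the function below prints the names of the PDFs whose extracted text matches a pattern. pdf_grep is a hypothetical name, and the match is a plain grep on the extracted text, so words broken across lines in the PDF will be missed.

```shell
# Print the name of every given PDF whose text contains the pattern.
# Arguments: grep pattern, then one or more PDF files.
pdf_grep() {
    pattern=$1
    shift
    for f in "$@"; do
        # '-' sends the extracted text to stdout instead of a .txt file.
        if pdftotext "$f" - 2>/dev/null | grep -q -- "$pattern"; then
            printf '%s\n' "$f"
        fi
    done
}

# Possible usage (needs poppler-utils):
#   pdf_grep 'invoice' *.pdf
```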

To convert pages 3 to 7 (inclusive), use:

$ pdftotext -f 3 -l 7 example.pdf example.txt

To extract only the third page:

$ pdftotext -f 3 -l 3 example.pdf example.txt

$ pdftotext -layout example.pdf example.txt

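The -layout option used above maintains (as best as possible) the original physical layout of the text. Without it, pdftotext undoes the physical layout (columns, hyphenation, etc.) and outputs the text in reading order.

Set the -nopgbrk option if you don't want page breaks (form feed characters) inserted between pages.

Use the -opw (owner password) or -upw (user password) option if the PDF file is password protected.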

Extract Images From PDF

Pdfimages saves images from a Portable Document Format (PDF) file as Portable Pixmap (PPM), Portable Bitmap (PBM), or JPEG files.

Pdfimages reads the PDF file, scans one or more pages, and writes one PPM, PBM, or JPEG file for each image, image-root-nnn.xxx, where nnn is the image number and xxx is the image type (.ppm, .pbm, .jpg).

Pdfimages extracts the raw image data from the PDF file, without performing any additional transforms. Any rotation, clipping, color inversion, etc. done by the PDF content stream is ignored.

$ pdfimages example.pdf exampleimage

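The above command will extract all images from example.pdf. The images will be saved in PPM format (PBM for monochrome images).

Use the -j option to save images that are stored in DCT (JPEG) format as JPEG files; other images are still written as PPM or PBM: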

$ pdfimages -j example.pdf exampleimage

Use the -f and -l options to specify the first and last page to scan. To scan pages 3 to 7 (inclusive), use:

$ pdfimages -f 3 -l 7 example.pdf exampleimage

To scan only one specific page, use:

$ pdfimages -f 3 -l 3 example.pdf exampleimage

If the PDF file is password protected, use the -opw or -upw option:

-opw Owner password

-upw User password
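When several PDFs are processed at once, giving each one its own image root keeps the extracted files apart. The following sketch does that by deriving the root from the PDF's file name. extract_all_images is an illustrative name, and the choice of -j is only a default taken from the example above.

```shell
# Extract the images of several PDFs into one directory, using each PDF's
# base name as the image root (report.pdf -> OUTDIR/report-000.ppm, ...).
# Arguments: output directory, then one or more PDF files.
extract_all_images() {
    outdir=$1
    shift
    mkdir -p "$outdir" || return 1
    for f in "$@"; do
        base=$(basename "$f" .pdf)
        pdfimages -j "$f" "$outdir/$base" || echo "failed: $f" >&2
    done
}

# Possible usage (needs poppler-utils):
#   extract_all_images images example.pdf other.pdf
```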

Convert PDF to HTML

pdftohtml is a program that converts pdf documents into html. It generates its output in the current working directory.

Usage:

$ pdftohtml file.pdf file.html

If you want to see graphics, you’ll need to use the -c (as in “complex”) option:

$ pdftohtml -c file.pdf file.html

Convert PDF to Image

First you need to have ImageMagick installed on your machine.

To install ImageMagick in Debian/Ubuntu run the following command:

$ sudo aptitude install imagemagick

To convert a PDF file to an image, use the 'convert' command:

$ convert doc.pdf doc.jpeg

Convert to TIFF:

$ convert doc.pdf doc.tiff
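Two details are worth knowing here: a multi-page PDF produces one image per page for formats such as JPEG or PNG, and the default rendering resolution is fairly low. The sketch below sets the resolution with -density and numbers the output files explicitly. pdf_to_images, the 150 dpi default and the PNG output are choices made for this example; it assumes the ImageMagick 6 style convert command, which also needs Ghostscript to read PDFs.

```shell
# Render each page of a PDF to a numbered PNG (doc-000.png, doc-001.png, ...).
# Arguments: input PDF, resolution in dpi (default: 150).
pdf_to_images() {
    in=$1
    dpi=${2:-150}
    root=${in%.pdf}
    # -density must come before the input file to affect PDF rasterization.
    convert -density "$dpi" "$in" "$root-%03d.png"
}

# Possible usage (needs ImageMagick and Ghostscript):
#   pdf_to_images doc.pdf 300
```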

Source: http://www.linuxeden.com/html/softuse/20100717/103836.html

http://segfault.in/2010/07/pdf-manipulations-and-conversions-from-linux-command-prompt/
