FREQUENTLY ASKED QUESTIONS (last updated 28 Dec 2016)
Why does the Mac OSX version not run on my Mac?
If you are running OSX 10.5.x or earlier, k2pdfopt may not run on your system.
See the first paragraph in my Mac install notes.
How do I increase the text size?
See the help page on increasing the magnification.
The output file size (in bytes) is large. Can I make it smaller?
With the default conversion, which allows text re-flow, every converted page is a bitmap, so
the file size of the converted file is often larger than the original; however, many e-readers
can process PDF files made up of bitmaps faster and with less memory overhead than the original
PDF file, so you might still prefer this type of conversion. If you still want a smaller
output file size, see my
help page on output file size for options that reduce the output file size,
mostly at the expense of the output quality. If you don't need text re-flow, you might
try using a mode which converts using native PDF output.
I just want k2pdfopt to remove the excess borders on my PDF file. Can it do that?
Absolutely. As of v1.60, the shorthand option for this is -mode fw (fw = fit width),
which is equivalent to -n -wrap- -col 1 -vb -2 -t -ls.
If you still want to rasterize the output, use -mode fw -n-. If you don't want to
turn the document on its side, use -mode fw -ls-. You can select the mode from the
user menu by typing "mo" at the prompt. Here are some examples of other k2pdfopt modes.
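As a rough sketch, the three variants mentioned above can be written out as follows. The file name myfile.pdf is a placeholder, and the snippet only prints the commands unless k2pdfopt and the file are both present.

```shell
# Placeholder input file (an assumption); substitute your own PDF.
INPUT="myfile.pdf"

# Fit-width mode with its defaults (native output, landscape).
CMD_FW="k2pdfopt -mode fw $INPUT"
# Same mode, but rasterized: -n- turns native output off.
CMD_FW_BITMAP="k2pdfopt -mode fw -n- $INPUT"
# Same mode, but without turning the document on its side.
CMD_FW_PORTRAIT="k2pdfopt -mode fw -ls- $INPUT"

# Print all three so they can be compared before running anything.
printf '%s\n' "$CMD_FW" "$CMD_FW_BITMAP" "$CMD_FW_PORTRAIT"

# Run the first variant only if k2pdfopt is installed and the file exists.
# $CMD_FW is left unquoted on purpose so the shell splits it into arguments.
if command -v k2pdfopt >/dev/null 2>&1 && [ -f "$INPUT" ]; then
    $CMD_FW
fi
```

Note that -n- and -ls- come after -mode fw, so that they override the settings the mode applies.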
I want to use k2pdfopt like Briss--select a crop region for the document and put only that region in the output PDF.
This is easy when you use the MS-Windows GUI. Make one of the "Crop Areas" active (check box); type in the applicable page range for the crop box (e.g. 2-99), then click the blue Select button and choose your crop region. For the conversion mode, select Crop (command-line: -mode crop).
I just want k2pdfopt to OCR my document. Can it do that?
Yes. Try using -mode copy -ocr as command-line options.
See the OCR help page.
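A minimal sketch of that command line, assuming a placeholder input file named myfile.pdf:

```shell
# Placeholder input file (an assumption); substitute your own PDF.
INPUT="myfile.pdf"

# Copy mode with OCR turned on.
CMD="k2pdfopt -mode copy -ocr $INPUT"
echo "$CMD"

# Run only if k2pdfopt is installed and the file exists.
if command -v k2pdfopt >/dev/null 2>&1 && [ -f "$INPUT" ]; then
    $CMD
fi
```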
How do I extract the text from a PDF file to an ASCII/UTF-8 file?
k2pdfopt -ocrout textout.txt -mode copy myfile.pdf
The output file has poor resolution on my device. Can I improve it?
Definitely. The default k2pdfopt settings are for a Kindle 2, and your device may have
better or slightly different resolution. You can change the device by using
-dev (interactive menu option "d"). Or see my
page on setting k2pdfopt for any custom device resolution.
You can also just use, for example, -dr 2 (new option in v1.60), which increases
the display resolution by a factor of two. The drawback is that your converted files will
be significantly larger and may take longer to render on your device, so you may want
to experiment to find the right value (you can
use fractions, e.g. -dr 1.5). You can type this option directly into the
user menu prompt, e.g. "-dr 2" (without the double quotes).
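One way to experiment with the -dr value from the command line is sketched below. The file name and the DR variable are placeholders for illustration only.

```shell
# Placeholder input file (an assumption); substitute your own PDF.
INPUT="myfile.pdf"
# Display-resolution factor: 2 doubles the resolution, and fractions such as 1.5 are allowed.
DR="1.5"

CMD="k2pdfopt -dr $DR $INPUT"
echo "$CMD"

# Run only if k2pdfopt is installed and the file exists.
if command -v k2pdfopt >/dev/null 2>&1 && [ -f "$INPUT" ]; then
    $CMD
fi
```

Starting with a smaller factor and increasing it only if the text still looks coarse keeps the output file from growing more than necessary.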
When I convert with native PDF output, my Kindle has problems reading the output file (runs out of memory / very slow / crashes). Why?
There are likely too many cropped-and-scaled regions in your output file. Try using a specific
conversion mode instead. Modes are shorthand for setting a collection of options that
are best suited for a specific type of optimization.
See the native PDF help page. Another option
is to use a mode like -mode 2col or -mode fitwidth which defaults to native output and then turn
off the native output by unchecking the "Native Output" box or specifying -n- (after
the -mode ... option) on the
command line. The output will look the same, and it will still have searchable and
highlightable text, but it will be bitmapped. For some devices and/or documents, bitmaps
are faster and require less memory overhead to render.
If the bitmapped text is too grainy,
you can use -dr to improve it, e.g. -dr 2 will double the resolution of the
output bitmaps.
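The bitmapped alternative described above might look like this on the command line. This is a sketch with a placeholder file name; -mode 2col is just one of the two modes mentioned.

```shell
# Placeholder input file (an assumption); substitute your own PDF.
INPUT="myfile.pdf"

# -n- follows -mode so that it overrides the mode's native-output default;
# -dr 2 doubles the resolution of the output bitmaps.
CMD="k2pdfopt -mode 2col -n- -dr 2 $INPUT"
echo "$CMD"

# Run only if k2pdfopt is installed and the file exists.
if command -v k2pdfopt >/dev/null 2>&1 && [ -f "$INPUT" ]; then
    $CMD
fi
```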
I'm having trouble selecting text in the converted PDF file with native PDF output.
If there is more than one cropped/scaled region on an output page, most PDF reading applications
will get confused and allow selection of "invisible" text which is outside cropped regions and
which overlaps with displayed text.
As of k2pdfopt version 2.31, if you have Ghostscript installed, you can use -ppgs to
post-process your PDF with Ghostscript (in the MS Windows GUI, check the
"Post-process w/Ghostscript" box), which very nicely eliminates this
issue (thank you to Andrea Lazzarotto for suggesting this modification).
You can also use -bp m (in the MS Windows GUI, check the "Avoid Text Select Overlap"
box) to force only one cropped region per output page, but this may
result in a lot of blank space in your converted file.
Or, as in the previous answer, you can use a mode like -mode 2col or -mode fitwidth which defaults to native output and then turn
off the native output by unchecking the "Native Output" box or specifying -n- (after
the -mode ... option) on the
command line. The output will look the same, and it will still have searchable and
highlightable text, but it will be bitmapped and will not have any overlapping "invisible"
text outside the crop regions. For some devices and/or documents, bitmaps
are faster and require less overhead to render.
If the bitmapped text is too grainy,
you can use -dr to improve it, e.g. -dr 2 will double the resolution of the
output bitmaps.
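The first two workarounds can be sketched as command lines like these. The file name is a placeholder, and pairing the options with -mode 2col is an assumption made only to have a native-output mode in the example.

```shell
# Placeholder input file (an assumption); substitute your own PDF.
INPUT="myfile.pdf"

# Post-process the native output with Ghostscript (needs v2.31+ and Ghostscript installed).
CMD_PPGS="k2pdfopt -mode 2col -ppgs $INPUT"
# Force a single cropped region per output page (may leave blank space).
CMD_BPM="k2pdfopt -mode 2col -bp m $INPUT"

printf '%s\n' "$CMD_PPGS" "$CMD_BPM"

# Run the Ghostscript variant only if k2pdfopt is installed and the file exists.
if command -v k2pdfopt >/dev/null 2>&1 && [ -f "$INPUT" ]; then
    $CMD_PPGS
fi
```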
I don't understand how k2pdfopt is interpreting my PDF file.
Try using the -sm command-line option ("sm" from the interactive menu), which will
write out a PDF file that shows the regions found by k2pdfopt.
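For example, a sketch with a placeholder file name:

```shell
# Placeholder input file (an assumption); substitute your own PDF.
INPUT="myfile.pdf"

# -sm asks k2pdfopt to write out a PDF showing the regions it found.
CMD="k2pdfopt -sm $INPUT"
echo "$CMD"

# Run only if k2pdfopt is installed and the file exists.
if command -v k2pdfopt >/dev/null 2>&1 && [ -f "$INPUT" ]; then
    $CMD
fi
```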
I want to use text re-flow, but my tables / equations / figures get mangled.
Try protecting those regions by drawing boxes around them.
Is there a k2pdfopt GUI (graphical user interface)?
There is now an integrated MS-Windows GUI (as of v2.x), and there are also a number
of user-contributed front ends for k2pdfopt.
Can k2pdfopt run directly on my Kindle?
Yes. See the information on my third-party contributions page.
How do I prevent images / figures from being split across pages?
Use -f2p -1, or select "bp" from the interactive menu and enter -1 for the
"fit-to-page" value.
The columns in my document are not detected correctly.
See the column detection help page.
Some of the text is much larger than the rest. How can I avoid that?
If your document does not have multiple columns, try turning off multiple column
detection with command-line option -col 1 (interactive
menu option "co"). See the page
on column detection and also the page on
showing markings so that you can see how k2pdfopt is converting
your document.
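A sketch of both suggestions, with a placeholder file name. Combining -col 1 with -sm in one run is an assumption made for illustration.

```shell
# Placeholder input file (an assumption); substitute your own PDF.
INPUT="myfile.pdf"

# Turn off multiple-column detection.
CMD="k2pdfopt -col 1 $INPUT"
# Same, plus -sm to show how k2pdfopt is interpreting the pages.
CMD_MARKED="k2pdfopt -col 1 -sm $INPUT"

printf '%s\n' "$CMD" "$CMD_MARKED"

# Run the first variant only if k2pdfopt is installed and the file exists.
if command -v k2pdfopt >/dev/null 2>&1 && [ -f "$INPUT" ]; then
    $CMD
fi
```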
How can I get rid of the document headers, footers, page numbers and/or other marks near the edges of the source pages?
You can tell k2pdfopt to ignore an arbitrarily sized border around your document.
See Ignoring Borders/Headers/Footers.
Sometimes I get multiple rows of text at smaller magnification than the rest of the document. Why?
This generally happens when there is not a clear gap between rows of text and k2pdfopt thus views the
region as a graphical block (figure) rather than as rows of text. If you haven't updated to v1.65, you should do so. K2pdfopt v1.65 is smarter about breaking rows--if it detects a double- or triple-height text row amidst other single-spaced text rows, it will usually fix this. See this post (and the response) for tips on how to adjust your k2pdfopt settings.
Is there any way to search / highlight the text in the converted PDF file?
Yes, as of v1.50, k2pdfopt has OCR capability, and as of
v1.60, k2pdfopt has options for native PDF output, much like
Cut2Col,
SoPDF,
and the latest version of PaperCrop.
In fact, as of v2.00, if the text in your source PDF document can be searched or highlighted
(e.g. if it is computer generated or scanned with an OCR layer),
the default output from k2pdfopt should have the same functionality using the
new virtual OCR feature (see "major new features" under v2.00 in the version history for more details). In these cases (when the source PDF
is computer generated or has an OCR layer), it is not necessary to use the Tesseract OCR
engine in k2pdfopt.
Note that PDF highlighting is not possible on some e-readers, such as early Kindles (Kindle 1/2).
My PDF file has a lot of pages. Can I convert only certain pages?
Yes. In the Windows GUI, look for the "Pages to Convert" box and enter a page range, e.g. 1-100.
Or see the Windows Getting Started page and scroll down to
"2. Enter Page Range" (or use the -p command-line option).
Some of the text is truncated / clipped. Can I fix that?
In versions before v1.65, k2pdfopt ignores (crops) a 0.25-inch border around your document
by default. Turn
this off by using command-line option -m 0 (which is no
longer necessary in v1.65--the default is now -m 0). See Ignoring Borders/Headers/Footers.
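For versions before v1.65, a sketch of the command line (the file name is a placeholder):

```shell
# Placeholder input file (an assumption); substitute your own PDF.
INPUT="myfile.pdf"

# -m 0 sets the ignored border to zero so nothing near the page edges is cropped.
CMD="k2pdfopt -m 0 $INPUT"
echo "$CMD"

# Run only if k2pdfopt is installed and the file exists.
if command -v k2pdfopt >/dev/null 2>&1 && [ -f "$INPUT" ]; then
    $CMD
fi
```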