E-book Production and Conversion
E-book layout vs print PDF
While e-book design has come a long way, it’s important
to remember that your e-book will not look exactly like your print
book (except for fixed layout e-books; see below). This is because of the way e-readers display content, as well as
the options users have when reading your e-book. A print book PDF is
actually very different from an e-book.
We simply are not able to control the exact font
size and style of your e-book as with a print book, since readers have
the option to change these types of settings.
Readers can even control the line spacing, margins, and whether or not text appears justified on the screen.
Footnotes will appear at the end of chapters, rather than at the bottom of the page.
Page breaks and line breaks will “reflow” according
to changes in settings or device and may, depending on the reader
settings, appear at inappropriate places.
The e-book format may appear more simplistic than
the printed book, but this is okay! People who read e-books expect this.
The important part is that all of the text within your e-book is
smooth, continuous, and best suited for e-reader display.
Unlike a PDF or a print book, the two major e-book formats, MOBI and EPUB, are designed to allow this level of flexibility.
For your e-book to take full advantage of an
e-reader’s capabilities (and for it to be sold through the major e-book
retailers) it must be available in one or both of these formats.
Differences between e-book and print book
Here is a quick primer on the differences in formatting
for a print book and e-book. First of all, I should say that there is
not much difference, but the differences that do exist are important.
The following points are dealt with when producing your e-book files,
both MOBI (for Kindle) and EPUB (for all other e-book readers).
Page breaks in a print layout will not work the same
way in an e-book. For an e-book, they must be associated with paragraph
styles (such as a new chapter style).
Spacing between paragraphs and special text items needs to be dealt with differently for e-books.
Character attributes (such as bold and italic) must be dealt with a certain way for them to be carried through to e-books.
Tables of contents must be set up in such a way that they will properly
function as links in an e-book. They do not contain page numbers.
Since font sizes, margins, etc. can be changed by the
viewer, thereby changing the number of pages, page numbers are
irrelevant and not used in e-books.
E-books should include metadata that is not required in a print book.
Photo and other image alignment is dealt with very differently for e-books.
Print books require high-resolution front, spine and back covers. E-books require only a lower-resolution front cover.
Separate files must be produced for the print book (PDF), for the Kindle (MOBI) and for all other e-book readers (EPUB).
E-books do not handle tables very well, so in some circumstances, tables need to be formatted specially for e-books.
Books with several heading levels, lists, bullets, indents, etc. also need to be handled differently for e-books.
It will take about the same amount of time to format a book only for print as it would to format it only for an e-book.
The most efficient way of producing both print books
and e-books is to first format the book for print, then for the two e-book formats.
Once a print version is formatted, an additional 20% or so of that time would be required to produce the two e-book files.
Fixed Layout e-book
When it comes to books that rely heavily on design
elements or large illustrations/photos (cookbooks, children’s books,
comics, etc.), fixed layout may be the better solution if you want to
preserve the qualities of the printed page. To put it plainly, the
pages of a fixed layout e-book are… fixed! Content (images, text, etc.)
will not “flow” across the page if you change your settings, though
most devices will allow the reader to zoom in and out.
Fixed layout is like the digital version of
typesetting; you can embed fonts, choose the exact placement of visual
elements, etc. The benefit of fixed layout is that you (or your book
designer) are in complete control of the experience. The drawback is
that readers are NOT! With fixed layout, readers lose the ability to
resize text, change margins, change spacing, and change fonts.
Keep in mind that a fixed layout e-book is different
from a PDF file. While the content is not re-flowable, a fixed layout
e-book can make use of enhanced interactive features. Apple, Kobo,
Barnes & Noble, and Amazon all support derivations of fixed layout.
If it is only an e-book that you want to produce and you
think you might like to do it yourself, here is an article that may help.
* * * *
Creating an e-book: Tips on formatting and converting your document
By Serdar Yegulalp (Computerworld)
After years of marginal acceptance, e-books have finally
started to eclipse their printed-and-bound ancestors. Casual and
sophisticated readers alike are growing much more accustomed to reading
from a device -- an e-reader, a smartphone, a tablet or a laptop.
They're also catching on for business and technical audiences -- for
example, HR departments can distribute employee manuals digitally,
while IT staffers can carry around 800-page reference manuals for their
favorite programming languages or operating systems without having to
dislocate a shoulder.
One of the most attractive features about this process
is that you don't have to be a professional publisher to produce a
useful and well-formatted e-book. Almost anyone can take an existing
manuscript -- a technical manual, a corporate white paper or even a
personal biography -- and turn it into an e-book.
Applications such as Calibre help you convert your document to a variety of e-book formats.
But you need more than just your document. You also need
the right software and the know-how -- because producing an e-book is a
little more complicated than it ought to be. The breadth of e-book
formats out there, and the quirks of converting your source document
into one of those target formats, can make the conversion process
anything but straightforward.
In the following article, I've attempted to unravel that
particular knot by looking at the e-book creation process from
beginning to end -- from formatting the source document to reading the
finished product. I'll discuss what formats you need to start with and
convert to, detail some of the issues you may encounter along the way,
and suggest some software applications that can help.
E-book creation tips
Creating an e-book can be a rocky process, often with no
preset path from the original document to the finished product. It's
difficult to tell in advance what you might need to do, or not do, to
make sure a given project renders correctly. However, before you begin
the conversion process, there are ways to make things go more smoothly.
Start with the cleanest possible input document.
There should be no stylization, formatting or elements present that you
don't want in the final product. If something can't be supported in the
destination format, it may well get stripped out automatically, but
sometimes it might just be translated into something you don't want.
You might have no choice but to clean up the original by hand, but it
may well be possible to script the cleanup process, depending on what
you're using to compose your originals.
Consider using HTML as an intermediate target format in all cases.
Since the majority of e-book formats revolve around some variant of
HTML, it might be a good idea to standardize on HTML as the format to
export to first from whatever program you used to edit the document.
This minimizes the amount of processing that has to be done by the
e-book converter itself. What's more, if you need to perform any manual
editing on the file to get it to process correctly, HTML is a
convenient format to do that: You have direct access to the source code
via nothing more than a plain-text editor.
Test the results on multiple devices.
Get your hands on as many reading devices as possible -- or, failing
that, get in touch with people who have a number of different reading
devices and get feedback from them. The desktop Kindle application, for
instance, has quirks that the actual device does not (e.g., how each
handles non-Western characters), so it helps to know when problems like
this are relevant.
Be prepared to repeat as necessary.
You will almost certainly have to make multiple passes across an e-book
to make sure everything translated correctly. Odds are it won't -- at
least not the first time -- and you'll have to go back and tweak many
different things by hand. In a way, this is another argument for using
HTML as an intermediate format, because many of the tweaks that might
need to be carried out could be partly automated. Keep notes of what
breaks each time so you don't have to repeat your mistakes.
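As an illustration of the scripted cleanup and partly automated tweaks mentioned in the tips above, here is a minimal Python sketch. It assumes the manuscript has already been exported to HTML, and it strips inline `style` attributes and `<font>` tags, two common carriers of unwanted formatting. The function name `clean_html` and the choice of what to strip are assumptions made for illustration; regex-based cleanup like this is only suitable for simple, well-formed markup.

```python
import re

def clean_html(html: str) -> str:
    """Remove inline styling that should not reach the e-book converter."""
    # Drop style="..." attributes (a hypothetical cleanup policy).
    html = re.sub(r'\s+style\s*=\s*"[^"]*"', "", html, flags=re.IGNORECASE)
    # Remove <font ...> and </font> tags but keep the text inside them.
    html = re.sub(r"</?font\b[^>]*>", "", html, flags=re.IGNORECASE)
    # Unwrap <span> elements that are left with no attributes.
    html = re.sub(r"<span>(.*?)</span>", r"\1", html, flags=re.IGNORECASE | re.DOTALL)
    return html

sample = '<p style="margin:0"><font face="Arial"><span style="color:red">Hello</span></font></p>'
print(clean_html(sample))  # <p>Hello</p>
```

Keeping a script like this alongside your notes means each pass over the book repeats the same fixes instead of relying on memory.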
The creation of any e-book starts with a source
document: a manuscript that you have written or that someone else has
provided to you. Right there, the problems begin, since even a "clean"
document can pose conversion difficulties. Your goal is to ensure that
the document's formatting will be preserved intact.
Odds are most documents used as a source for an e-book
will have to go through at least two conversions: first, into a format
that the conversion software can use, and then into the actual e-book
format -- or formats. Sometimes this can be cut down to one stage, but
it's best for the time being to assume you'll need two steps.
Here's a rundown of the most likely formats you'll start with:
I already mentioned this in the previous section, but it
bears repeating: If you're looking for a standard, HTML is more or less
it. For one, it's ubiquitous; almost every text-processing program can
generate or read HTML. It also supports many features e-books will use:
hyperlinks, font control, section headings, images and so on.
The tricky part is if you weren't working with HTML in
the first place. If you're collating posts from a blog or a wiki and
assembling them into an e-book, you won't have to put up with quite as
much drudgery. But if you're starting with a Microsoft Word (DOC or
DOCX) or Open Document Format (OpenDocument or ODF) document, your best
bet is to export it directly from the source application into HTML.
(Word users should do a "Save as..." using the "Web Page, Filtered
(HTML)" option, which strips out most of Word's generated cruft.)
Exporting to HTML from your source program helps
preserve the most crucial formatting and typically also preserves
sections and chapters: outline headers are turned into h1/h2/h3 tags,
which most conversion programs correctly recognize. Some are even able
to auto-generate tables of contents from those tags. That said, I've
had good results using Word to generate TOCs before I send the document
to the e-book program, since Word typically gives you a broader range
of formatting options.
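To show how heading tags can drive a table of contents, here is a small Python sketch that collects h1/h2/h3 headings from an HTML export and prints them as an indented outline. The class and function names are hypothetical, and real conversion programs will have their own, more thorough logic; this only illustrates the idea.

```python
from html.parser import HTMLParser

class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for h1-h3 headings."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None   # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._text).strip()))
            self._level = None

def build_toc(html: str) -> str:
    collector = HeadingCollector()
    collector.feed(html)
    # Indent each entry by two spaces per heading level below h1.
    return "\n".join("  " * (level - 1) + text for level, text in collector.headings)

doc = "<h1>Part One</h1><p>Intro</p><h2>Chapter 1</h2><p>Text</p><h2>Chapter 2</h2>"
print(build_toc(doc))
```

If the outline printed here looks wrong, the converter's auto-generated table of contents will probably look wrong too, so it is a cheap check to run before conversion.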
Microsoft Word (DOC or DOCX)
If you're dealing with an original manuscript, odds are
it's probably going to be in Microsoft Word format. Proprietary as Word
may be, almost every device on the face of the Earth can read or write
Word documents. And the format has native support for most everything
you could think of: formulas, chaptering, footnotes, indexes -- in
other words, anything that might show up in an e-book.
That said, Word documents are best seen as a starting
point for an intermediate conversion format, most likely HTML, rather
than a format that can be converted directly into an e-book. In fact,
most e-book conversion programs don't accept Word natively as a source
document type. They may accept Word's sibling format, RTF, but that is
already at least one stage of conversion away from the original and
increases the chance that certain features might not make it through
the conversion process. For example, RTF does support features like
sections and footnotes, but the Calibre e-book creation suite, for one,
didn't process them correctly when I tested it for this article.
OpenDocument, or ODF, is the format used by
OpenOffice.org. (Microsoft Word also supports ODF, although it isn't
the default format for Word -- it's just one of the formats it reads
and writes.) Third-party extensions for OpenOffice let you
export directly to the ePub format; there are also a number of standalone
applications, such as ODFToEPub, that will do the same. If you're
already in the habit of creating your documents in ODF, your path to
creating a finished e-book may be slightly shortened because of this.
Adobe's PDF format is used so consistently as an e-book
format that it would be foolish not to mention it. Many programs (such
as Word and OpenOffice.org) export directly to PDF, and the files can
be opened and read in many applications. In fact, before dedicated
e-reader devices made significant inroads into the market, most e-books
were just PDF distillations of their print counterparts.
However, it's generally not a good idea to try to use
PDF as a source format. Because it's designed to precisely reproduce
printed pages, a PDF document needs to be taken apart and put back
together if it's being used as a source format for a non-PDF e-book. As
a result, PDF should only be used as a source for other e-book formats
if you have no choice.
Odds are you won't have just one destination format for
your e-book, but several. If your target readers are using a variety of
devices -- a Nook, a Kindle, an iPad -- it helps to support as many of
those devices as possible. The Kindle, for instance, is notorious for
not supporting ePub format files.
These are the most common e-book destination formats and their quirks.
An open, non-proprietary format that uses XHTML as the
basis for its document format, ePub is widely supported as an output
format by various e-book production applications -- iTunes, for
instance, only accepts ePub as a source format. In fact, it couldn't
hurt to render a copy of your product as ePub no matter what other
formats you're also planning to output to.
EPub has a few downsides. Its formatting methodology
assumes that the text will be reflowed to fit the target device, so
books that require PDF-style page fidelity won't work well in ePub.
Also, there's no support for equations, apart from inserting them as
images -- TeX or MathML, two commonly used languages for representing
math, aren't supported. And ePub doesn't have a standard way to
interpret or share annotations, which might be another drawback for
people publishing digital textbooks.
To that end, it's best for "straight" text, or for documents where reflowed formatting won't be an issue.
Mobi and Kindle
A variant of an earlier version of ePub, Mobi -- or
Mobipocket -- was developed by the company of the same name as a format
to be used with its e-book reader software, designed originally for
PDAs and later smartphones. After Amazon bought the company, it made
Mobi into the basis for the Kindle reader's own e-book format. Mobi
supports digital rights management, but unencrypted Mobi documents can
be read on the Kindle without issues.
PDFs can be read as-is in the majority of e-book
readers, including the Kindle. Exporting to PDF is best when you want
to maintain absolute fidelity to page layout, including images and typefaces.
Ironically, this is the very feature that can make PDFs
a problem in some scenarios, which I hinted at before. Other e-book
formats are designed to work independently of any particular device
resolution, so pages reflow automatically for each device. This is one
of the reasons the Kindle didn't make use of page numbers at first,
since the page numbering for a particular book could vary depending on
what device or screen size you were reading it with.
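A rough bit of arithmetic shows why page numbers shift with the device. The sketch below estimates a page count from the length of the text and the amount of text a screen can show. All of the figures (a 500,000-character book, the characters per line and lines per page for each screen) are made-up assumptions, not measurements of any real device.

```python
import math

def page_count(total_chars: int, chars_per_line: int, lines_per_page: int) -> int:
    """Estimate pages as text length divided by screen capacity, rounded up."""
    return math.ceil(total_chars / (chars_per_line * lines_per_page))

book_length = 500_000  # hypothetical manuscript length in characters

print(page_count(book_length, 60, 30))  # larger screen: 278
print(page_count(book_length, 35, 20))  # smaller screen: 715
```

The same text yields very different "page" totals, which is why a reflowable format cannot promise that page 100 means the same place on every device.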
PDFs, on the other hand, reproduce as closely as
possible the formatting of the original page, no matter what the size
of the destination device. A PDF formatted for an 8.5-by-11-in. page
may be quite readable on a large display, but looks cramped on a Kindle
or Nook. Some PDF readers, such as Adobe's own Acrobat Reader
application, are able to reflow a PDF to fit an arbitrary screen size
-- but this isn't a universally available function, and you shouldn't
count on it being present.
If you're committed to using PDFs, you may want to
consider exporting your document with different page sizes as a
courtesy for people using e-readers with small screens. This may
require some research to figure out what page sizes render best with
popular e-book readers.
Elements to include
When you're building a book, elements that you've
included in the original document may need a little extra work to
translate properly into the finished product. In addition, some
elements that didn't seem important for a print publication may be more
useful in an e-book.
Tables of contents
An e-book that isn't properly chaptered is difficult to
navigate -- doubly so with devices where going to an arbitrary point in
a book is not as easy as it should be. The Kindle, for instance, has no
touch screen, so jumping around in a book without a table of contents
is a chore.
Font choice is most important if you want to set certain
elements apart from the rest of the text -- such as examples of code in
a monospaced font. This isn't so much a formatting issue as it is a
conversion issue, since font choices can sometimes get stripped out
entirely during the conversion process, or not be supported at all on
some target devices.
Be sure to try out at least two different font types in
your documents -- a standard body-text font and a monospaced font -- to
see how they render on different devices and in different book formats.
Sometimes font declarations don't work at all: With the Kindle, for
instance, you need to use the HTML <pre> tag in e-books to
reliably show text in a monospaced font.
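As a small illustration of the `<pre>` approach, the following Python sketch wraps a code sample in a `<pre>` element and escapes the characters that HTML would otherwise interpret. The helper name `to_pre_block` is hypothetical; how a given device then renders the block still needs to be checked on the device itself.

```python
import html

def to_pre_block(code: str) -> str:
    """Wrap source code in <pre>, escaping &, < and > so they display literally."""
    return "<pre>" + html.escape(code, quote=False) + "</pre>"

snippet = 'if a < b:\n    print("a & b")'
print(to_pre_block(snippet))
```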
Image handling can be a crucial issue for some books. You need to
make sure any illustrations convert correctly depending on the system
you're using. Exporting to HTML as an intermediate step helps here,
since image references in HTML are honored pretty consistently
throughout the conversion process.
Footnotes are typically translated into hyperlinks in
e-books, but they also run the risk of disappearing if the conversion
process doesn't know how to honor them correctly. This is another
reason why exporting to HTML as a first step is a good idea: If
footnotes and endnotes render as properly hyperlinked elements in that
step, they should remain accessible in the finished product, too.
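One way to check the HTML step is to confirm that every internal link has a matching target. The Python sketch below is a minimal version of that check, assuming footnotes are exported as `href="#..."` links pointing at elements with a matching `id` or `name` attribute. That export convention and the function name are assumptions; inspect your own HTML to see what your word processor actually produces.

```python
import re

def dangling_footnote_links(html: str) -> list[str]:
    """Return internal link targets (href="#...") that have no matching anchor."""
    targets = re.findall(r'href\s*=\s*"#([^"]+)"', html)
    anchors = set(re.findall(r'\b(?:id|name)\s*=\s*"([^"]+)"', html))
    return [t for t in targets if t not in anchors]

page = (
    '<p>Claim<a href="#fn1">[1]</a> and another<a href="#fn2">[2]</a>.</p>'
    '<p id="fn1">1. Source for the first claim.</p>'
)
print(dangling_footnote_links(page))  # ['fn2']
```

An empty list suggests the notes are at least linked consistently before the file goes to the converter.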
Some languages -- Japanese, for instance -- use what is
called "ruby markup" -- annotations that appear next to the text -- to
indicate how certain things are pronounced. HTML supports ruby markup,
but that doesn't mean it'll always render correctly in the converted e-book.
There are a number of other curious issues that can
arise. For instance, if you have a document where outline headings
(which typically indicate chapters) are auto-numbered, the numbering
doesn't always survive the conversion process. One document I had
automatically added "Chapter __:" to the beginning of each chapter, but
once converted into an e-book, the auto-numbering vanished.
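A possible workaround for vanishing auto-numbering is to write the numbers into the headings as literal text before conversion. The Python sketch below does this for `<h1>` headings in an HTML export. It assumes each chapter is marked with an `<h1>` and that a "Chapter N: " prefix is wanted; both are assumptions for illustration, and the function name is hypothetical.

```python
import re
from itertools import count

def number_chapters(html: str) -> str:
    """Prefix each <h1> heading with literal 'Chapter N: ' text."""
    counter = count(1)

    def add_number(match: re.Match) -> str:
        # Keep the opening tag and append the hard-coded chapter label.
        return f"{match.group(1)}Chapter {next(counter)}: "

    return re.sub(r"(<h1\b[^>]*>)", add_number, html, flags=re.IGNORECASE)

doc = "<h1>Beginnings</h1><p>...</p><h1>Endings</h1>"
print(number_chapters(doc))
```

Because the label is now ordinary text, it no longer depends on the converter understanding the word processor's numbering feature.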
Content-creation programs, such as word processors or
publishing suites, are only starting to add e-book formats to their
lists of possible exports. Most of the time, you'll need to use some
kind of standalone application to perform the final conversion.
Some of the tools you might encounter are designed for
extremely specific jobs and are not general conversion utilities. Those
producing e-books for the Kindle, for instance, need to use Amazon's
own e-book tool, called KindleGen, to produce a Kindle-compatible file
from HTML or ePub input.
These are only four of the better-known conversion
applications; there are a lot more out there. In contrasting their
behaviors and capabilities, it's clear we're still a ways from having a
single end-to-end suite that fits the majority of users' needs.
Adobe InDesign CS5.5
InDesign is normally thought of as a full-blown book
design and page layout package, but in its last couple of incarnations
it's been positioned more as a platform for generating output in many formats.
The program now includes export options for the ePub
format. InDesign accepts a broad range of document formats for import
and can even map style information from the source document to whatever
style definitions you have set up in InDesign. A plug-in from Amazon
also lets you export directly from InDesign to the Kindle format.
E-book preparation is a little different from that of
print books in that special attention needs to be given to the table of
contents, line and page breaks, character attribute handling, heading
style assignment and image placement.
Also, metadata should be added to the InDesign source document.
Depending on the complexity of the document, a book prepared for print
can be modified for e-book production in two to five hours.
InDesign has two big downsides. The first is the scope
and scale of the program. Because it's a full-blown publishing
solution, it requires a lot more experience to generate a finished
product than a simple conversion utility. Second is the price tag: It
starts at $699. That puts it out of reach for users not prepared to
invest that much money, although the 30-day trial version should give
you an idea of whether it's worth the money or is overkill.
Calibre, a free and open-source application, is marketed
more as a personal e-book management solution than a production suite.
That said, it can be used as an e-book conversion utility, and a
remarkably powerful one -- provided you understand the full range of
options. For that reason, it may well be the best place to start,
especially if you're distilling output for multiple e-book formats.
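Calibre also ships a command-line converter, `ebook-convert`, which takes an input file and an output file and infers the formats from the file extensions. The Python sketch below only builds one command per target format and prints it, so it runs without Calibre installed. The file name `manual.html` and the helper name are hypothetical, and any extra options you need should be checked against Calibre's own documentation.

```python
import shlex

def conversion_commands(source: str, formats: list[str]) -> list[list[str]]:
    """Build one ebook-convert command per target format (does not run them)."""
    stem = source.rsplit(".", 1)[0]
    return [["ebook-convert", source, f"{stem}.{fmt}"] for fmt in formats]

commands = conversion_commands("manual.html", ["epub", "mobi"])
for cmd in commands:
    print(shlex.join(cmd))
```

With Calibre installed and on your path, each command list could be passed to `subprocess.run(cmd, check=True)` to perform the actual conversion.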
The best thing about Calibre is its support for a broad
range of input document types: The program can accept ODF, RTF, ePub,
Mobi, PDF and HTML. Calibre can also reformat documents according to
various heuristic rules (unwrapping plain text that has too many line
breaks, for instance) or insert chapter breaks by looking for certain
text structures (such as a line break followed by the word "Chapter").
However, Calibre doesn't support DOC or DOCX documents,
so anything coming from Word will have to be saved in another format
first. Saving in either ODF or HTML from Word seemed to do the best job
of preserving formatting and features, including things like monospaced
formatting for code examples. The program can convert books in bulk as
well as individually.
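To make the chapter-detection idea mentioned above more concrete, here is a minimal Python sketch that splits plain text wherever a line begins with the word "Chapter" followed by a number. This is not Calibre's implementation, only an illustration of the kind of heuristic involved; the exact pattern and the function name are assumptions.

```python
import re

# A line that starts with "Chapter" and a number, e.g. "Chapter 12" (assumed pattern).
CHAPTER_LINE = re.compile(r"^[ \t]*Chapter[ \t]+\d+\b.*$", re.IGNORECASE | re.MULTILINE)

def split_chapters(text: str) -> list[str]:
    """Split text into front matter (if any) plus one chunk per detected chapter."""
    starts = [m.start() for m in CHAPTER_LINE.finditer(text)]
    if not starts:
        return [text]
    bounds = starts + [len(text)]
    chunks = [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]
    front = text[:starts[0]].strip()
    return ([front] if front else []) + chunks

manuscript = "Preface\n\nChapter 1\nIt began.\n\nChapter 2\nIt ended.\n"
for part in split_chapters(manuscript):
    print(repr(part))
```

A heuristic like this will misfire on sentences that happen to start a line with "Chapter 3", which is one reason the converted book always needs a manual check.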
OpenOffice.org is itself not an e-book system, of
course: It's a free open-source productivity suite. That said, a number
of people have authored add-ons for OpenOffice.org for exporting to
e-book formats from within the program.
Writer2ePub, for instance, exports directly from within
OpenOffice into ePub format; ODFToEPub can perform standalone
conversion of ODF files or work as an OpenOffice add-in.
OpenOffice.org also has a powerful native PDF export
function, one with a greater range of options than the native exporter
in Microsoft Word. That's useful as long as you don't mind using PDF as
a target document type.
A more modest example of an e-book production
application, Sigil is both free and open source. It's a lot closer to
an editor that exports to e-books (it sports a built-in document
editor) than a conversion suite for existing documents, but it also
includes various tools for collating and assembling a finished e-book
(such as a table-of-contents editor).
Sigil's main drawback is how it handles importing. It
only accepts HTML, plain text or existing ePub files as input
documents, so it will most likely work best if you are able to export
your original document to HTML in a way that preserves all of the most
important formatting. A similar program, Jutoh, accepts ODT files and
has slightly more robust editing options; it costs $39.