Scripts to add bookmarks to PDF from table of contents

armaab/pdfmark

pdfmark

Scripts to add bookmarks to a PDF file according to a table of contents (referred to below as the toc). This post shows that it's easy to add bookmarks to a PDF using Ghostscript. However, some PDFs have no bookmarks at all, and writing by hand the pdfmarks file that gs requires is generally tedious. With this script, bookmarks can be added to a PDF from a toc file (described below).

Requirements

Toc file

We can add bookmarks to a PDF file by feeding the PDF file and a toc file to this program. A toc file looks like the table of contents of a book, with a short prefix at the beginning of each line, as shown below:

!Contents 1
!0. Introduction 2
!1. First section 3
*1!1.1 Subsection 3
**!1.1.1 Subsubsection 3
**!1.1.2 Another subsubsection 5
*!1.2 Another subsection 7
!2. Second section 8
*!2.1 Subsection 3
*!2.2 Subsection 3
!3. Third section 8

As you can see, each line begins with zero or more *s, followed by an optional 1 and then a !. The number of *s gives the level of the entry in the bookmark tree, starting at 0, so top-level entries always start with ! or 1!. If there is a 1 before the !, the entry is open (expanded) by default; otherwise it is closed. The exclamation mark marks the end of the asterisks and the beginning of the bookmark title. The title is followed by one or more spaces (' ') and then the page number; the regular expression below also accepts a leading minus sign on the page number. In summary, each line of the toc file should match the regular expression (^\**)(1?)!(.+?)\s+(-?[0-9]+)\s*$.
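To make the line format concrete, here is a minimal Python sketch that applies the regular expression above to a single toc line. The names `TocEntry` and `parse_toc_line` are hypothetical and chosen for illustration; this is not necessarily how the script itself parses the file.

```python
import re
from typing import NamedTuple

# The regular expression quoted in the text above.
TOC_LINE = re.compile(r'(^\**)(1?)!(.+?)\s+(-?[0-9]+)\s*$')


class TocEntry(NamedTuple):
    """One parsed toc line (hypothetical helper type)."""
    level: int      # number of leading asterisks, 0 for top-level entries
    is_open: bool   # True when a 1 precedes the exclamation mark
    title: str      # bookmark title
    page: int       # page number as written in the toc file


def parse_toc_line(line: str) -> TocEntry:
    """Parse a single toc line, raising ValueError if it does not match."""
    match = TOC_LINE.match(line)
    if match is None:
        raise ValueError(f"not a valid toc line: {line!r}")
    stars, open_flag, title, page = match.groups()
    return TocEntry(len(stars), open_flag == "1", title, int(page))


if __name__ == "__main__":
    for sample in ("!Contents 1", "*1!1.1 Subsection 3", "**!1.1.1 Subsubsection 3"):
        print(parse_toc_line(sample))
```

Because the title group is non-greedy and the page group is anchored at the end of the line, only the trailing number is taken as the page, so titles that themselves contain digits and spaces (such as `1.1 Subsection`) are kept whole.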

Usage

$ pdfmark --in <input> --toc <toc-file> --out <output> [--offset <offset>]

Here <input> and <output> are the input and output PDF files, and <toc-file> is the toc file described above. The --offset option is optional: <offset> is the value added to each page number in the toc file to obtain the actual page number in the PDF file.
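As an illustration of how the offset and the open/closed flag might translate into the pdfmarks input that gs reads, the sketch below turns already-parsed entries into `/OUT pdfmark` lines. It is an assumption-laden sketch, not the script's actual implementation: the function names are hypothetical, entries are plain `(level, is_open, title, page)` tuples, and only ASCII titles are handled. It relies on the standard pdfmark convention that `/Count` gives the number of direct children and is negative when the entry is closed.

```python
from typing import List, Tuple

# (level, is_open, title, page) as described in the "Toc file" section.
Entry = Tuple[int, bool, str, int]


def count_children(entries: List[Entry], index: int) -> int:
    """Count the direct children of entries[index]."""
    level = entries[index][0]
    count = 0
    for child_level, *_ in entries[index + 1:]:
        if child_level <= level:
            break  # reached a sibling or an ancestor's sibling
        if child_level == level + 1:
            count += 1
    return count


def escape_ps(text: str) -> str:
    """Escape backslashes and parentheses for a PostScript string literal."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def to_pdfmarks(entries: List[Entry], offset: int = 0) -> List[str]:
    """Build one pdfmark line per entry, adding offset to each page number."""
    lines = []
    for i, (level, is_open, title, page) in enumerate(entries):
        n = count_children(entries, i)
        # /Count is omitted for leaf entries; a negative value means "closed".
        count = f"/Count {n if is_open else -n} " if n else ""
        lines.append(
            f"[/Title ({escape_ps(title)}) /Page {page + offset} {count}/OUT pdfmark"
        )
    return lines


if __name__ == "__main__":
    toc = [
        (0, False, "Contents", 1),
        (0, True, "1. First section", 3),
        (1, False, "1.1 Subsection", 3),
        (2, False, "1.1.1 Subsubsection", 3),
        (1, False, "1.2 Another subsection", 7),
    ]
    print("\n".join(to_pdfmarks(toc, offset=12)))
```

In this example the offset is 12, which would suit a book whose printed page 1 is the 13th page of the PDF file, so the toc entry for page 3 points at PDF page 15.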