In this Python tutorial, we will discuss what PyPDF2 is and walk through the methods of the PdfFileMerger class, with an example for each.
PdfFileMerger is the class from the PyPDF2 module that is widely used to merge multiple PDF files into one in Python.
We will also cover:
- How to add a bookmark to PDF in Python
- How to add metadata to pdf in Python
- How to append PDF files in Python
- How to merge PDF files in Python using PdfFileMerger
- How to set page layout in PDF in Python
- How to set page mode in PDF in Python
- Write file object using PdfFileMerger in Python
PyPDF2 Python Library
- Python is used for a wide variety of purposes and comes with libraries and classes for all kinds of activities. One of these purposes is to read text from a PDF in Python.
- PyPDF2 offers classes that help us read, merge, and write PDF files.
- PdfFileReader is used to perform all the operations related to reading a file.
- PdfFileMerger is used to merge multiple PDF files together.
- PdfFileWriter is used to perform write operations on a PDF.
- Each of these classes has various methods that let a programmer control and perform operations on a PDF.
- PyPDF2 stopped receiving regular updates around the time of Python 3.5, but it is still used to work with PDFs. In this tutorial, we will cover everything about the PdfFileMerger class and point out which functions are deprecated or broken.
Install PyPDF2 in python
To use the PyPDF2 library in Python, we first need to install it. Run the command below to install the PyPDF2 module on your system.
pip install PyPDF2
After reading this tutorial, you will have complete knowledge of each function in PdfFileMerger class. Also, we will be demonstrating the examples for each function in PdfFileMerger class.
PdfFileMerger in Python
PdfFileMerger in Python offers methods for merging multiple PDFs into one, along with functions that control how the resulting PDF is put together.
- Creating an instance initializes a PdfFileMerger object that can merge two or more PDF files into one.
- It can concatenate, slice, and insert PDF files.
- The first step is to import the PyPDF2 module.
import PyPDF2
- The next step is to initialize the class from PyPDF2 module in Python. So this time we will initialize PdfFileMerger.
PyPDF2.PdfFileMerger(strict=True)
Here, strict determines whether users should be warned of all the problems. By default it is True.
Here is the implementation of the above code. If it does not show an error, everything is working fine with PdfFileMerger in Python.
So this is how we can use the PdfFileMerger in Python. Moving forward we will learn about all the available methods in this class.
PdfFileMerger Python Example
In this section, we will learn about the available functions in PdfFileMerger class. Also, we will demonstrate the use of each function with the help of an example.
Note that the PyPDF2 module in Python has not been updated regularly since around Python 3.5, so by the time you read this blog a few functions may not work.
Add Bookmark using PdfFileMerger in Python
Let us see how to add a bookmark using PdfFileMerger in Python.
PdfFileMerger provides the method addBookmark(title, pagenum, parent=None), which adds a bookmark to a PDF file in Python.
- Add a bookmark to the PDF file in Python.
- Parameters:
- title (str): title or name to use for this bookmark.
- pagenum (int): The page number this bookmark will point to.
- parent: A reference to a parent bookmark to create nested bookmarks.
- Here is the example of addBookmark() function of PdfFileMerger class in Python
Code Snippet:
In this code, we add a bookmark named ‘new bookmark’ that points to page 2 using PdfFileMerger in Python.
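import PyPDF2
# Create the merger object that will hold the bookmark
merger = PyPDF2.PdfFileMerger(strict=True)
# addBookmark() returns the bookmark entry it created
bookmark = merger.addBookmark(
    title="new bookmark",
    pagenum=2,
)
print(bookmark)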
Output:
In this output, you can see that the terminal displays a dictionary of bookmark information.
This is how we can add a bookmark in PDF in Python.
Add Metadata in PDF using PdfFileMerger in Python
Now, let us see how to add metadata to PDF files in Python using PdfFileMerger.
PdfFileMerger provides the method addMetadata(infos), which adds metadata to the PDF file in Python.
- This function adds custom metadata to the output.
- Parameters:
- infos (dict): a Python dictionary where each key is a field and each value is your new metadata. Example:
{u'/Title': u'My title'}
- Here is the example of adding metadata using PdfFileMerger in Python
Code Snippet:
In this code, we have followed the procedure of entering the metadata using PdfFileMerger in Python. Please note that all the keys and values must be strings.
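import PyPDF2
merger = PyPDF2.PdfFileMerger(strict=True)
# Keys are field names that start with "/"; values are plain strings
data = {
    u'/Author': u'PythonGuides',
    u'/version': u'1.0'
}
# addMetadata() stores the fields on the merger and returns None
result = merger.addMetadata(data)
print(result)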
Output:
We expected a dictionary to appear in the terminal, but the code prints None. This is because addMetadata() does not return a value; it stores the metadata on the merger so that it is included when the file is written. If you find a way to print the stored metadata directly, please leave it in the comment section.
This is how we can add metadata to PDF files in Python using PdfFileMerger.
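The rule that every key and value must be a string can be checked before calling addMetadata(). The sketch below is only an illustration: normalize_metadata is a hypothetical helper that is not part of PyPDF2, and adding the "/" prefix to keys is an assumption based on the {u'/Title': u'My title'} example above.

```python
def normalize_metadata(fields):
    """Return a copy of `fields` with '/'-prefixed keys and string values.

    Hypothetical helper for illustration; it is not part of PyPDF2.
    """
    normalized = {}
    for key, value in fields.items():
        key = str(key)
        # Assumption: metadata field names are written with a leading "/"
        if not key.startswith("/"):
            key = "/" + key
        normalized[key] = str(value)
    return normalized


info = normalize_metadata({"Author": "PythonGuides", "/version": 1.0})
print(info)  # {'/Author': 'PythonGuides', '/version': '1.0'}
```

The resulting dictionary could then be passed to addMetadata() in place of a hand-written one.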
Add Named Destination to PDF files using PdfFileMerger in Python
PdfFileMerger provides the method addNamedDestination(title, pagenum), which adds a named destination to the PDF file in Python.
- Add a destination to the output.
- Parameters:
- title (str) : title to use
- pagenum (int) : Page number this destination points at.
- Here is the example of addNamedDestination() using PdfFileMerger in Python
Code Snippet:
In this code, we add a named destination titled ‘test’ that points to page 4 using PdfFileMerger in Python.
import PyPDF2
merger = PyPDF2.PdfFileMerger(strict=True)
pdfFile = merger.addNamedDestination(
    title="test",
    pagenum=4
)
print(pdfFile)
Output:
The output of the above code is None, because addNamedDestination() does not return a value. A similar function exists in PdfFileWriter in Python. This function appears to be deprecated and may not work anymore.
This is how to add a named destination to PDF files using PdfFileMerger in Python.
Append PDF using PdfFileMerger in Python
PdfFileMerger provides a method append(fileobj, bookmark=None, pages=None, import_bookmarks=True) using which PDF pages are concatenated.
- Identical to the merge() method, but assumes you want to concatenate all pages onto the end of the file instead of specifying a position.
- Parameters:
- fileobj – A File Object or an object that supports the standard read and seek methods similar to a File Object. Could also be a string representing a path to a PDF file.
- bookmark (str) – Optionally, you may specify a bookmark to be applied at the beginning of the included file by supplying the text of the bookmark.
- pages – can be a Page Range or a (start, stop[, step]) tuple to merge only the specified range of pages from the source document into the output document.
- import_bookmarks (bool) – You may prevent the source document’s bookmarks from being imported by specifying this as False.
- Here is the example of append() using PdfFileMerger in Python.
Code Snippet:
In this code, we have appended three PDF files one after another and written the result to new.pdf.
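from PyPDF2 import PdfFileReader, PdfFileMerger
# Open each source PDF in binary mode and wrap it in a reader
f1 = PdfFileReader(open('Marksheet-1998.pdf', 'rb'))
f2 = PdfFileReader(open('Marksheet-1999.pdf', 'rb'))
f3 = PdfFileReader(open('Marksheet-2000.pdf', 'rb'))
merger = PdfFileMerger(strict=True)
# append() adds every page of each file to the end of the output
merger.append(f1)
merger.append(f2)
merger.append(f3)
# Write the combined pages to a new file
merger.write('new.pdf')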
Output:
In this output, three PDF files have been appended together and new.pdf is generated.
This is how we can append PDF files using PdfFileMerger in Python.
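The pages parameter described above accepts a (start, stop[, step]) tuple. The sketch below shows which pages such a tuple would pick, assuming it follows the same rules as Python's built-in range(): zero-based indexes with the stop value excluded. The function selected_pages is a hypothetical helper written for this illustration and is not part of PyPDF2.

```python
def selected_pages(num_pages, pages=None):
    """List the zero-based page indexes a (start, stop[, step]) tuple selects.

    Assumption: the tuple behaves like the arguments of range().
    """
    if pages is None:
        # No range given: take every page of the source document
        pages = (0, num_pages)
    if not isinstance(pages, tuple):
        raise TypeError('"pages" must be a tuple of (start, stop[, step])')
    return list(range(*pages))


print(selected_pages(5))             # [0, 1, 2, 3, 4]
print(selected_pages(5, (0, 2)))     # [0, 1]
print(selected_pages(6, (0, 6, 2)))  # [0, 2, 4]
```

Under this assumption, pages=(0, 2) would bring in only the first two pages of the source file.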
Close operation in PdfFileMerger in Python
PdfFileMerger provides the method close(), which ends the operation on the PDF in Python. It shuts all file descriptors (input and output) and clears all memory usage.
Merge PDF files using PdfFileMerger in Python
PdfFileMerger provides a method merge(position, fileobj, bookmark=None, pages=None, import_bookmarks=True) using which multiple files can be merged together in Python.
- Merges the pages from the given file into the output file at the specified page number.
- Parameters:
- position (int) – The page number to insert this file. File will be inserted after the given number.
- fileobj – A File Object or an object that supports the standard read and seek methods similar to a File Object. Could also be a string representing a path to a PDF file.
- bookmark (str) – Optionally, you may specify a bookmark to be applied at the beginning of the included file by supplying the text of the bookmark.
- pages – can be a Page Range or a (start, stop[, step]) tuple to merge only the specified range of pages from the source document into the output document.
- import_bookmarks (bool) – You may prevent the source document’s bookmarks from being imported by specifying this as False.
- merge() and append() work in a similar way. Both combine two or more PDFs, but merge() inserts the pages at a given position while append() always adds them to the end.
- Here is the example of merge method in PyPDF2 Python.
Code Snippet:
In this code, we have merged three PDFs into one PDF. It is a dummy academic report card of a student for three years.
from PyPDF2 import PdfFileReader, PdfFileMerger
# Open each source PDF in binary mode and wrap it in a reader
f1 = PdfFileReader(open('Marksheet-1998.pdf', 'rb'))
f2 = PdfFileReader(open('Marksheet-1999.pdf', 'rb'))
f3 = PdfFileReader(open('Marksheet-2000.pdf', 'rb'))
merger = PdfFileMerger(strict=True)
# Insert each file at the given position and label it with a bookmark
merger.merge(
    position=0,
    fileobj=f1,
    bookmark='1998'
)
merger.merge(
    position=1,
    fileobj=f2,
    bookmark='1999'
)
merger.merge(
    position=2,
    fileobj=f3,
    bookmark='2000'
)
# Write the merged pages to a new file
merger.write('new.pdf')
Output:
In this output, the PDFs are merged and each one is marked with its bookmark.
This is how we can merge multiple PDF files using PdfFileMerger in Python.
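To make the difference between merge() and append() concrete, the sketch below models a PDF as a plain Python list of page labels. It does not use PyPDF2 at all; merge_pages and append_pages are hypothetical stand-ins, written under the assumption that position counts how many existing pages come before the inserted ones.

```python
def merge_pages(pages, position, new_pages):
    """Return a new list with new_pages inserted at index `position`."""
    result = list(pages)
    # Slice assignment inserts without replacing any existing page
    result[position:position] = new_pages
    return result


def append_pages(pages, new_pages):
    """Appending is the same as merging at the very end."""
    return merge_pages(pages, len(pages), new_pages)


doc = []
doc = append_pages(doc, ["1998-p1", "1998-p2"])
doc = append_pages(doc, ["2000-p1"])
# Slot the 1999 report in between the two files added so far
doc = merge_pages(doc, 2, ["1999-p1"])
print(doc)  # ['1998-p1', '1998-p2', '1999-p1', '2000-p1']
```

In this model, appending never needs a position, while merging lets a file be placed between pages that were added earlier.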
Set Page Layout using PdfFileMerger in Python
PdfFileMerger provides a method setPageLayout(layout) using which page layout can be set in the PDF file in Python.
- Set the page layout
- Parameters:
- layout (str) – The page layout to be used
Layout | Explanation |
---|---|
/NoLayout | Layout explicitly not specified |
/SinglePage | Show one page at a time |
/OneColumn | Show one column at a time |
/TwoColumnLeft | Show pages in two columns, odd-numbered pages on the left |
/TwoColumnRight | Show pages in two columns, odd-numbered pages on the right |
/TwoPageLeft | Show two pages at a time, odd-numbered pages on the left |
/TwoPageRight | Show two pages at a time, odd-numbered pages on the right |
Here is an example of setPageLayout() method in PdfFileMerger using PyPDF2 in Python.
Code Snippet:
In this Python code, we have set the page layout to ‘SinglePage’, which means the PDF will be displayed one page at a time.
from PyPDF2 import PdfFileReader, PdfFileMerger
# Open each source PDF in binary mode and wrap it in a reader
f1 = PdfFileReader(open('Marksheet-1998.pdf', 'rb'))
f2 = PdfFileReader(open('Marksheet-1999.pdf', 'rb'))
f3 = PdfFileReader(open('Marksheet-2000.pdf', 'rb'))
merger = PdfFileMerger(strict=True)
merger.merge(
    position=0,
    fileobj=f1,
    bookmark='1998'
)
merger.merge(
    position=1,
    fileobj=f2,
    bookmark='1999'
)
merger.merge(
    position=2,
    fileobj=f3,
    bookmark='2000'
)
# Ask PDF viewers to show one page at a time
merger.setPageLayout(layout='/SinglePage')
merger.write('new.pdf')
Output:
In this output, new.pdf is created with the SinglePage layout. The options for the page layout are displayed on the terminal.
Set Page Mode using PdfFileMerger in Python
PdfFileMerger provides a method setPageMode(mode) using which the page mode can be set in the PDF file in Python.
- Set the page mode.
- Parameters:
- mode (str) – The page mode to use.
Modes | Explanation |
---|---|
/UseNone | Do not show outlines or thumbnails panels |
/UseOutlines | Show outlines (aka bookmarks) panel |
/UseThumbs | Show page thumbnails panel |
/FullScreen | Fullscreen view |
/UseOC | Show Optional Content Group (OCG) panel |
/UseAttachments | Show attachments panel |
Here is the example of setPageMode() method in PdfFileMerger using PyPDF2 in Python.
Code Snippet:
In this code, we have set the page mode to FullScreen.
from PyPDF2 import PdfFileReader, PdfFileMerger
# Open each source PDF in binary mode and wrap it in a reader
f1 = PdfFileReader(open('Marksheet-1998.pdf', 'rb'))
f2 = PdfFileReader(open('Marksheet-1999.pdf', 'rb'))
f3 = PdfFileReader(open('Marksheet-2000.pdf', 'rb'))
merger = PdfFileMerger(strict=True)
merger.merge(
    position=0,
    fileobj=f1,
    bookmark='1998'
)
merger.merge(
    position=1,
    fileobj=f2,
    bookmark='1999'
)
merger.merge(
    position=2,
    fileobj=f3,
    bookmark='2000'
)
# Ask PDF viewers to open the file in fullscreen
merger.setPageMode(mode='/FullScreen')
merger.write('new.pdf')
Output:
In this output, out of the many modes we have chosen FullScreen mode. The generated PDF file opens in fullscreen when it is opened in a viewer that supports this setting.
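Both setPageLayout() and setPageMode() take their option as a string, so a small typo such as a missing "/" is easy to make. The sketch below checks a value against the options listed in the layout and mode tables above before it is used. The names PAGE_LAYOUTS, PAGE_MODES and check_name are hypothetical and chosen for this illustration; they are not part of PyPDF2.

```python
# Options copied from the layout and mode tables in this tutorial
PAGE_LAYOUTS = (
    "/NoLayout", "/SinglePage", "/OneColumn", "/TwoColumnLeft",
    "/TwoColumnRight", "/TwoPageLeft", "/TwoPageRight",
)
PAGE_MODES = (
    "/UseNone", "/UseOutlines", "/UseThumbs",
    "/FullScreen", "/UseOC", "/UseAttachments",
)


def check_name(value, allowed):
    """Return `value` unchanged if it is a listed option, else raise ValueError."""
    if value not in allowed:
        raise ValueError(
            "%r is not one of: %s" % (value, ", ".join(allowed))
        )
    return value


layout = check_name("/SinglePage", PAGE_LAYOUTS)
mode = check_name("/FullScreen", PAGE_MODES)
print(layout, mode)  # /SinglePage /FullScreen
```

A checked value could then be passed on, for example as setPageMode(mode=check_name('/FullScreen', PAGE_MODES)).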
Write file object using PdfFileMerger in Python
PdfFileMerger provides the method write(fileobj) using which the merged data can be written to a new PDF file in Python.
- Writes all data that has been merged to the given output file.
- Parameters:
- fileobj – Output file. Can be a filename or any kind of file-like object.
- The write() method is used in the append and merge examples above, usually as the last line of the code.
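The idea that an output target can be either a filename or a file-like object can be sketched with the standard library alone. In the example below, write_output is a hypothetical function that only imitates this behaviour, and the payload is placeholder bytes rather than a real PDF; it is not how PyPDF2 itself is implemented.

```python
import io
import os
import tempfile


def write_output(data, fileobj):
    """Write bytes to a path or to an already open binary file-like object."""
    if isinstance(fileobj, str):
        # A string is treated as a path: open it, write, and close it again
        with open(fileobj, "wb") as handle:
            handle.write(data)
    else:
        # Anything else is assumed to provide a write() method
        fileobj.write(data)


payload = b"%PDF-1.4 placeholder bytes"

# Target 1: an in-memory buffer, useful when no file on disk is wanted
buffer = io.BytesIO()
write_output(payload, buffer)

# Target 2: a filename inside a temporary directory
path = os.path.join(tempfile.mkdtemp(), "new.pdf")
write_output(payload, path)
```

Accepting both kinds of target lets the same call serve scripts that save to disk and programs that keep the result in memory.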
With this, we have covered all the methods of PdfFileMerger in Python. We saw how to work with PDF files using PdfFileMerger in Python, covering the points below:
- How to add a bookmark to PDF in Python
- How to add metadata to pdf in Python
- How to append PDF files in Python
- How to merge PDF files in Python using PdfFileMerger
- How to set page layout in PDF in Python
- How to set page mode in PDF in Python
- Write file object using PdfFileMerger in Python