How to Unzip a File in Python?

I faced an issue recently when I needed to extract a large dataset from a zip file as a part of my project for one of my clients. In this tutorial, I will explain how to unzip a file in Python. I researched this topic and I will share my findings in this post along with suitable examples and screenshots.

Start with the zipfile Module

The zipfile module is part of the Python Standard Library, so you don’t need to install any additional packages. Let’s start by importing the module:

import zipfile

Read How to Get the File Size in MB using Python?

Unzip a File in Python

To unzip a file in Python, you need to open the zip file and extract its contents. Let’s assume you have a zip file named data_usa.zip that you want to extract.

Example:

  1. Open the Zip File: Use the zipfile.ZipFile class to open the zip file.
  2. Extract All Files: Use the extractall method to extract all files to a specified directory.

Here’s a complete example:

import zipfile
import os

# Define the path to the zip file and the extraction directory
zip_file_path = 'data_usa.zip'
extraction_dir = 'extracted_data'

# Create the extraction directory if it doesn't exist
if not os.path.exists(extraction_dir):
    os.makedirs(extraction_dir)

# Open the zip file
with zipfile.ZipFile(zip_file_path, 'r') as zip_ref:
    # Extract all the contents into the specified directory
    zip_ref.extractall(extraction_dir)

print(f"Files extracted to {extraction_dir}")

Output:

Files extracted to extracted_data

You can look at the output in the screenshot below.

Unzip a File in Python

In this example, the zip file data_usa.zip is extracted to the extracted_data directory. The os.makedirs function ensures that the extraction directory exists.

Check out How to Check If a File Exists and Create It If Not in Python?

Extract Specific Files

Sometimes, you may not want to extract all files from the zip archive. Instead, you might need to extract only specific files. You can use the Python extract method for this purpose.

Example:

Let’s say you only want to extract report_2024.csv from the zip file:

import zipfile
import os

# Define the path to the zip file and the extraction directory
zip_file_path = 'data_usa.zip'
extraction_dir = 'extracted_data'
specific_file = 'report_2024.csv'

# Create the extraction directory if it doesn't exist
if not os.path.exists(extraction_dir):
    os.makedirs(extraction_dir)

# Open the zip file
with zipfile.ZipFile(zip_file_path, 'r') as zip_ref:
    # Extract the specific file
    zip_ref.extract(specific_file, extraction_dir)

print(f"{specific_file} extracted to {extraction_dir}")

Output:

report_2024.csv extracted to extracted_data

You can look at the output in the screenshot below.

How to Unzip a File in Python

In this example, only report_2024.csv is extracted to the extracted_data directory.

Read How to Copy File and Rename in Python

Handle Large Files

When dealing with large zip files in Python, you might want to avoid loading the entire file into memory. Instead, you can process the file in chunks. This approach is particularly useful when working with large datasets or log files.

Example: Process Large Files

Here’s how you can process a large zip file without extracting it entirely:

import zipfile

# Define the path to the zip file
zip_file_path = 'large_data_usa.zip'

# Open the zip file
with zipfile.ZipFile(zip_file_path, 'r') as zip_ref:
    # List all the file names in the zip archive
    file_names = zip_ref.namelist()

    for file_name in file_names:
        # Open each file in the zip archive
        with zip_ref.open(file_name) as file:
            # Process the file (e.g., read its contents)
            content = file.read()
            print(f"Processed file: {file_name}")

print("All files processed")

Output:

Processed file: report_2024.csv
All files processed

You can look at the output in the screenshot below.

Unzip a File in Python handle large files

In this example, we list all the file names in the zip archive and then process each file individually.

Check out How to Import a Class from a File in Python

Unzip Files from a URL

In some cases, you may need to download a zip file from a URL and then extract its contents. You can achieve this using the Python requests library along with zipfile.

Example:

First, install the requests library if you haven’t already:

pip install requests

Then, use the following code to download and unzip a file from a URL:

import requests
import zipfile
import io

# Define the URL of the zip file
url = 'https://example.com/data_usa.zip'

# Download the zip file
response = requests.get(url)
zip_file = zipfile.ZipFile(io.BytesIO(response.content))

# Extract all the contents
extraction_dir = 'downloaded_data'
zip_file.extractall(extraction_dir)

print(f"Files downloaded and extracted to {extraction_dir}")

In this example, the zip file is downloaded from the specified URL and extracted to the downloaded_data directory.

Read Python file Does Not Exist Exception

Error Handling

When working with zip files in Python, it’s important to handle potential errors gracefully. Common errors include file not found, invalid zip files, and permission issues.

Example:

Here’s how you can add error handling to your unzipping code:

import zipfile
import os

# Define the path to the zip file and the extraction directory
zip_file_path = 'data_usa.zip'
extraction_dir = 'extracted_data'

# Create the extraction directory if it doesn't exist
if not os.path.exists(extraction_dir):
    os.makedirs(extraction_dir)

try:
    # Open the zip file
    with zipfile.ZipFile(zip_file_path, 'r') as zip_ref:
        # Extract all the contents into the specified directory
        zip_ref.extractall(extraction_dir)
    print(f"Files extracted to {extraction_dir}")
except FileNotFoundError:
    print(f"Error: The file {zip_file_path} does not exist.")
except zipfile.BadZipFile:
    print(f"Error: The file {zip_file_path} is not a valid zip file.")
except PermissionError:
    print(f"Error: Permission denied. Unable to extract files to {extraction_dir}.")

In this example, we handle FileNotFoundError, zipfile.BadZipFile, and PermissionError exceptions.

Check out Python File methods

Conclusion

In this tutorial, I have explained how to unzip a file in Python. I discussed the zipfile module, unzipping a file, extracting specific files, handling large files, unzipping files from a URL, and Error handling.

You may also like to read:

51 Python Programs

51 PYTHON PROGRAMS PDF FREE

Download a FREE PDF (112 Pages) Containing 51 Useful Python Programs.

pyython developer roadmap

Aspiring to be a Python developer?

Download a FREE PDF on how to become a Python developer.

Let’s be friends

Be the first to know about sales and special discounts.