Python Pandas CSV Tutorial

In this Python tutorial, we will discuss working with CSV files in Python pandas. We will learn how to read a CSV file and how to save a CSV file in pandas, and we will cover the topics below:

  • What is a CSV File in Pandas
  • How to Read CSV File in pandas
  • How to Save CSV File in pandas
  • How to Read CSV File in Pandas without Header
  • How to Read CSV File in Pandas with a Header
  • How to use Append CSV file in pandas
  • What is CSV pandas nan
  • Pandas CSV to DataFrame
  • Pandas CSV to JSON
  • Pandas CSV to excel
  • Pandas CSV to the dictionary

CSV file in Pandas Python

  • In this section, we will learn how to read CSV files using pandas & how to export CSV files using pandas.
  • A CSV (comma-separated values) file is a plain text file in which each line is one row of a table and the values in a row are separated by commas.
  • CSV files work well with pandas because the format is simple and easy to read and write.
  • There are two major tasks that we perform while working with CSV:
    • Read CSV or Import CSV
    • Write CSV or Export CSV
  • Both are discussed in detail in the upcoming sections.

Read CSV File in Pandas

  • In this section, we will learn how to read CSV files in pandas. Reading CSV file also means importing CSV file in Pandas.
  • Before we process the dataset that is in the CSV format we need to import that CSV.

Syntax:

import pandas as pd

pd.read_csv('file_path_name.csv')

Implementation:
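
As a minimal sketch of the syntax above: the CSV text, its column names, and its values are made up for illustration, and io.StringIO stands in for a file on disk so the example runs without any external file. With a real file, the path string goes in its place.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file on disk.
csv_text = """name,age,city
Asha,31,Pune
Ben,27,Leeds
Chen,45,Taipei
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())   # first rows of the table
print(df.shape)    # (number of rows, number of columns)
```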

Save CSV File in Pandas

  • In this section, we will learn how to save CSV files in pandas or how to export CSV files in pandas.
  • to_csv() method is used to export files in CSV format.

Syntax:

import pandas as pd

df = pd.DataFrame({'Key1':['val1', 'val2'], 'Key2':['val1', 'val2']})

df.to_csv('export_file.csv')

Implementation:
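
The sketch below exports a small DataFrame and reads the file back to confirm what was written. The data and the use of a temporary directory are assumptions made so the example cleans up after itself; in practice any writable path works.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data to export.
df = pd.DataFrame({"Key1": ["val1", "val2"], "Key2": ["val3", "val4"]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "export_file.csv")

    # By default to_csv also writes the row index as the first column.
    df.to_csv(path)

    # Look at the raw text that was written.
    with open(path) as f:
        written_text = f.read()

    # Read it back, telling pandas that the first column is the index.
    reloaded = pd.read_csv(path, index_col=0)

print(written_text)
print(reloaded)
```

Note the empty first field in the header line of the file: that is the unnamed index column.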

Read CSV File in Pandas Without Header

  • By default, read_csv() treats the first row of the file as the header, that is, as the column names.
  • When no header is read, pandas labels the columns with integers starting from 0 for the first column. When column names are provided, these integer labels are replaced with the provided names.
  • In this section, we will learn how to read CSV files without a header. If the file already has a header row, it will be read as an ordinary data row.
  • header=None is the argument we need to pass while reading the file.
  • To make the difference clear, we have displayed the data both with and without a header.

Syntax:

import pandas as pd

pd.read_csv('file_name.csv', header=None)

Implementation:

In this example, we are reading the Iris species dataset, downloaded from Kaggle. You will notice that when the data is read without a header, the columns are labelled with the integers 0, 1, 2, ... 5. When the same file is read with a header, the column names are shown instead.
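
The comparison can be sketched as follows. The three-column sample below is a made-up stand-in for the Iris file (the real Kaggle file has more columns and different names), held in memory so the example is self-contained.

```python
import io

import pandas as pd

# Hypothetical, shortened stand-in for the Iris CSV file.
csv_text = """sepal_length,sepal_width,species
5.1,3.5,setosa
4.9,3.0,setosa
"""

# Default behaviour: the first row becomes the column names.
with_header = pd.read_csv(io.StringIO(csv_text))

# header=None: columns get integer labels and the header row becomes data.
without_header = pd.read_csv(io.StringIO(csv_text), header=None)

print(list(with_header.columns))
print(list(without_header.columns))
print(without_header.iloc[0].tolist())  # the former header, now row 0
```

Because the header text lands in the data when header=None is used on a file that has a header, every column is then read as text rather than as numbers.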

Read CSV File in Pandas With a Header

  • By default, read_csv() treats the first row of the file as the header, that is, as the column names.
  • Without a header, the columns are labelled with integers starting from 0; with a header, these labels are replaced with the names given in the file.
  • In this section, we will learn how to read CSV files with a header. To do so, simply read the file normally without passing the header argument.
  • To make the difference clear, we have displayed the data both with and without a header.

Implementation:
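
A minimal sketch of reading with a header, assuming a small made-up sample in place of the Iris file. It also shows that passing header=0 explicitly gives the same result as the default.

```python
import io

import pandas as pd

# Hypothetical sample with a header row.
csv_text = """sepal_length,sepal_width,species
5.1,3.5,setosa
4.9,3.0,setosa
"""

# The default already uses the first row as the header.
default_read = pd.read_csv(io.StringIO(csv_text))

# header=0 says the same thing explicitly: row 0 holds the column names.
explicit_read = pd.read_csv(io.StringIO(csv_text), header=0)

print(default_read.columns.tolist())
print(default_read.dtypes)  # numeric columns are parsed as numbers

# Columns can now be selected by name.
mean_sepal_length = default_read["sepal_length"].mean()
print(mean_sepal_length)
```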

Append CSV File in Pandas

  • append() adds rows from another object to the end of a DataFrame or Series; it does not write to a CSV file by itself.
  • append could be applied to both Series & DataFrame, but it was deprecated in pandas 1.4 and removed in pandas 2.0. pd.concat() is the replacement.
  • Options in the append function are:
    • other: data to append
    • ignore_index: accepts boolean (True/False)
    • verify_integrity: accepts boolean (True/False)
    • sort: accepts boolean (True/False)
  • Here is a demonstration of appending data in pandas.

Implementation:
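
Since DataFrame.append is no longer available in pandas 2.0 and later, the sketch below swaps in two techniques that are: pd.concat to append rows in memory, and to_csv with mode='a' to append rows to an existing CSV file. The data, file name, and temporary directory are assumptions for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical existing data and new rows to append.
existing = pd.DataFrame({"name": ["Asha", "Ben"], "age": [31, 27]})
new_rows = pd.DataFrame({"name": ["Chen"], "age": [45]})

# In memory: concat stacks the rows; ignore_index renumbers them 0..n-1.
combined = pd.concat([existing, new_rows], ignore_index=True)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "people.csv")

    # Write the original file with its header.
    existing.to_csv(path, index=False)

    # On disk: mode='a' appends; header=False avoids a second header line.
    new_rows.to_csv(path, mode="a", header=False, index=False)

    from_disk = pd.read_csv(path)

print(combined)
print(from_disk)
```

Appending with mode='a' assumes the new rows have the same columns in the same order as the file, since no header is written to check against.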

CSV Pandas NaN

  • NaN (Not a Number) marks a missing value. When a CSV file with empty fields is opened in Excel, those fields show as blank cells. pandas denotes each of these blanks with NaN.
  • The sample CSV file used here has five missing values, and when the file is read with pandas each of them is represented as NaN.
  • Now that you know the meaning of NaN in a CSV file, let’s look at how to deal with missing values.
  • A missing value can be handled by filling it with a replacement value or by dropping the row or column that contains it. This is an important step in data cleaning.
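
The idea can be sketched with a small made-up CSV that has two empty fields. The column names, the choice of the column mean as a fill value, and the 'unknown' placeholder are all assumptions for illustration, not a recommendation for any particular dataset.

```python
import io

import pandas as pd

# Hypothetical CSV with two empty fields (Ben's age, Chen's city).
csv_text = """name,age,city
Asha,31,Pune
Ben,,Leeds
Chen,45,
"""

df = pd.read_csv(io.StringIO(csv_text))

# Count the missing values in each column.
missing_per_column = df.isna().sum()

# Option 1: fill each column with a replacement value.
filled = df.fillna({"age": df["age"].mean(), "city": "unknown"})

# Option 2: drop every row that contains at least one missing value.
dropped = df.dropna()

print(df)
print(missing_per_column)
print(filled)
print(dropped)
```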

Write CSV file in Pandas Python

  • In this section, we will learn how to create, write, or export CSV files using pandas in Python.
  • to_csv() is used to export the file. The name provided as an argument will be the name of the CSV file.
  • There are several options that we can pass while writing CSV files. The most popular one is index=False, which leaves the row index out of the file.
  • Here is a demonstration of writing a CSV file in pandas.

Implementation:
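
To show what index=False changes, the sketch below calls to_csv() without a file name, in which case pandas returns the CSV text as a string instead of writing a file. The data is made up for illustration.

```python
import pandas as pd

# Hypothetical data to write.
df = pd.DataFrame({"name": ["Asha", "Ben"], "age": [31, 27]})

# With no path, to_csv returns the CSV content as a string.
with_index = df.to_csv()
without_index = df.to_csv(index=False)

print(with_index)      # has an extra, unnamed first column for the index
print(without_index)   # only the data columns
```

Leaving the index out is usually what you want when the index is just the default 0, 1, 2, ... numbering, because otherwise it comes back as an extra unnamed column when the file is read again.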

Pandas CSV to DataFrame

  • In this section, we will learn how to import a CSV file into a DataFrame.
  • There was a method pandas.DataFrame.from_csv(file), but it was deprecated and has since been removed. Instead, we can simply use pd.read_csv(file).
  • read_csv() returns a DataFrame whether the file has one column or many, so the data is available as a DataFrame as soon as it is read.
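
A short sketch that checks the returned type, using a made-up single-column CSV held in memory to show that even one column comes back as a DataFrame.

```python
import io

import pandas as pd

# Hypothetical CSV with a single column.
single_column_csv = "species\nsetosa\nvirginica\n"

df = pd.read_csv(io.StringIO(single_column_csv))

# The result is a DataFrame even though there is only one column.
print(type(df).__name__)

# Selecting one column by name gives a Series.
print(type(df["species"]).__name__)
```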

Pandas CSV to JSON

  • JSON stands for JavaScript Object Notation. JSON is a lightweight format for storing and transporting data. It is often used when data is sent from a server to a web page.
  • In this section, we will learn how to read & write JSON format files & strings.

Write JSON

We are using iris.csv, which we have downloaded from Kaggle. Using the content of this file, we will read & write JSON.

DataFrame.to_json() is used to write JSON format.

Syntax

Here is the syntax to write JSON format

import pandas as pd

df = pd.read_csv('workspace-requirements.csv')

df.to_json()

Read JSON

Make sure the file you are about to read is a JSON file.

Syntax

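import pandas as pd
from io import StringIO

df = pd.read_csv('workspace-requirements.csv')

data = df.to_json()

# Wrap the JSON string in StringIO; newer pandas versions deprecate passing a literal JSON string
pd.read_json(StringIO(data))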

Implementation:
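
The round trip can be sketched end to end as below. The sample data is made up and held in memory, and orient='records' is one choice among several layouts that to_json supports; it is used here because it produces one JSON object per row.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the file.
csv_text = """name,age
Asha,31
Ben,27
"""

df = pd.read_csv(io.StringIO(csv_text))

# Write: with no path, to_json returns the JSON text as a string.
json_text = df.to_json(orient="records")
print(json_text)

# Read: wrap the string in StringIO so read_json treats it as a file.
restored = pd.read_json(io.StringIO(json_text), orient="records")
print(restored)
```

Passing a file name to to_json, for example df.to_json('out.json'), writes the same text to disk instead of returning it.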

Pandas CSV to excel

  • In this section, we will learn how to export CSV files to Excel files.
  • First, we have to read the CSV file, and then we can export it using the to_excel() method.
  • We need to install the openpyxl module. In a Jupyter notebook, run %pip install openpyxl in a cell (or run pip install openpyxl in a terminal). This may take some time. Once it completes, you can proceed with the export.
  • Here is a demonstration of exporting the CSV file to Excel.

Implementation:
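
A sketch of the export, under a few assumptions: csv_to_excel is a hypothetical helper written for this example, the data and file names are made up, and the export is skipped with a message when openpyxl is not installed so the example still runs.

```python
import importlib.util
import os
import tempfile

import pandas as pd


def csv_to_excel(csv_path, xlsx_path):
    """Hypothetical helper: read a CSV file and write it as an .xlsx file."""
    df = pd.read_csv(csv_path)
    # index=False keeps the row numbers out of the spreadsheet.
    df.to_excel(xlsx_path, index=False)
    return df


# to_excel needs the openpyxl package to write .xlsx files.
has_openpyxl = importlib.util.find_spec("openpyxl") is not None

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "sample.csv")
    xlsx_path = os.path.join(tmp_dir, "sample.xlsx")

    # Create a small CSV file to convert.
    pd.DataFrame({"name": ["Asha", "Ben"], "age": [31, 27]}).to_csv(
        csv_path, index=False
    )

    if has_openpyxl:
        csv_to_excel(csv_path, xlsx_path)
        exported = os.path.exists(xlsx_path)
    else:
        exported = False
        print("openpyxl is not installed; skipping the Excel export")

print(exported)
```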

Pandas CSV to the dictionary

  • A Python dictionary is a collection of key-value pairs.
  • To export a CSV file to dictionary form, we first read the CSV file and then convert the resulting DataFrame using to_dict().
  • This section could also be named Pandas DataFrame to a dictionary.

Implementation:
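
A minimal sketch with made-up data, showing the default layout of to_dict() and two other layouts selected with the orient parameter.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the file.
csv_text = """name,age
Asha,31
Ben,27
"""

df = pd.read_csv(io.StringIO(csv_text))

# Default layout: {column name: {row index: value}}
by_column = df.to_dict()

# One dictionary per row.
by_row = df.to_dict(orient="records")

# {column name: list of values}
as_lists = df.to_dict(orient="list")

print(by_column)
print(by_row)
print(as_lists)
```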

Python Pandas CSV to HTML

  • In this section, we will learn how to convert CSV files to HTML. This conversion can also be described as converting a DataFrame to HTML.
  • The first step is to read the CSV file and store it in a variable. In our case, we have used df as the variable name.
  • to_html() is used to convert the DataFrame into an HTML table.

Syntax:

import pandas as pd

df = pd.read_csv('filepath-name.csv')

df.to_html('filepath-newname.html')

Implementation:

Here is the implementation of code on jupyter notebook. Here we have used the Iris dataset that is downloaded from Kaggle.
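
For a version that runs without the downloaded file, the sketch below uses a small made-up sample in place of the Iris data and a temporary directory for the output. It shows both forms of to_html: returning the markup as a string and writing it to a file.

```python
import io
import os
import tempfile

import pandas as pd

# Hypothetical, shortened stand-in for the Iris CSV file.
csv_text = """sepal_length,sepal_width,species
5.1,3.5,setosa
4.9,3.0,setosa
"""

df = pd.read_csv(io.StringIO(csv_text))

# With no path, to_html returns the HTML table as a string.
# head(10) limits the table to the first 10 rows.
html_text = df.head(10).to_html(index=False)
print(html_text)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "table.html")

    # With a path, the same markup is written to the file.
    df.to_html(path)
    file_exists = os.path.exists(path)

print(file_exists)
```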

Here is the output of the generated HTML file. There is more data, but the screenshot shows only the first 10 rows.



In this tutorial, we have learned to use CSV files in Python pandas. We have covered these topics:

  • What is a CSV File in Pandas
  • How to Read CSV File in pandas
  • How to Save CSV File in pandas
  • How to Read CSV File in Pandas without Header
  • How to Read CSV File in Pandas with a Header
  • How to use Append CSV file in pandas
  • What is CSV pandas nan
  • Pandas CSV to DataFrame
  • Pandas CSV to JSON
  • Pandas CSV to excel
  • Pandas CSV to the dictionary