In this Python tutorial, we will discuss working with CSV files in Python pandas. We will learn how to read a CSV file in pandas and how to save a CSV file in pandas, and will cover the topics below:
- What is a CSV File in Pandas
- How to Read CSV File in pandas
- How to Save CSV File in pandas
- How to Read CSV File in Pandas without Header
- How to Read CSV File in Pandas with a Header
- How to use Append CSV file in pandas
- What is CSV pandas nan
- Pandas CSV to DataFrame
- Pandas CSV to JSON
- Pandas CSV to excel
- Pandas CSV to the dictionary
CSV file in Pandas Python
- In this section, we will learn how to read CSV files using pandas and how to export CSV files using pandas.
- A CSV (comma-separated values) file is a plain-text file that stores tabular data: each line is a row, and the values in a row are separated by commas.
- CSV files work well with pandas because the format is simple and easy to read and write.
- There are two major tasks that we perform while working with CSV files:
- Read CSV or Import CSV
- Write CSV or Export CSV
- We discuss both in detail in the upcoming sections.
Read CSV File in Pandas
- In this section, we will learn how to read CSV files in pandas. Reading a CSV file also means importing it into pandas.
- Before we can process a dataset stored in CSV format, we need to import that CSV file.
Syntax:
import pandas as pd
pd.read_csv('file_path_name.csv')
Implementation:
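As a minimal sketch of the syntax above, the snippet below reads a small CSV into a DataFrame. The sample data and the column names (name, age) are made up for illustration, and io.StringIO stands in for a file on disk so that the example runs without any external file.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the contents of a CSV file.
csv_text = "name,age\nAsha,31\nBen,27\n"

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df)
print(df.shape)  # (number of rows, number of columns)
```

With a real file, the StringIO object would simply be replaced by the file path.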
Save CSV File in Pandas
- In this section, we will learn how to save, or export, CSV files in pandas.
- The to_csv() method is used to export a DataFrame in CSV format.
Syntax:
import pandas as pd
df = pd.DataFrame({'Key1':['val1', 'val2'], 'Key2':['val1', 'val2']})
df.to_csv('export_file.csv')
Implementation:
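The following sketch writes the DataFrame from the syntax above to a CSV file and reads it back to confirm the round trip. The temporary directory is an assumption made so that the example cleans up after itself; in practice any writable path works.

```python
import tempfile
from pathlib import Path

import pandas as pd

df = pd.DataFrame({'Key1': ['val1', 'val2'], 'Key2': ['val1', 'val2']})

# A temporary directory keeps the example from leaving files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = Path(tmp_dir) / 'export_file.csv'
    df.to_csv(out_path)  # the row index is written as the first column
    file_exists = out_path.exists()
    # index_col=0 tells read_csv to use that first column as the index again.
    restored = pd.read_csv(out_path, index_col=0)

print(file_exists)
print(restored)
```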
Read CSV File in Pandas Without Header
- By default, read_csv() treats the first row of the file as the header and uses its values as column names.
- When a file is read without a header, pandas labels the columns with index numbers instead, starting from 0 for the first column. Supplying column names replaces these index values with the names provided.
- In this section, we will learn how to read CSV files without a header. If the file already has a header, that header row will turn into an ordinary data row.
- header=None is the argument we need to add while reading the file.
- To make the difference clear, we have displayed the data both with and without a header.
Syntax:
import pandas as pd
pd.read_csv('file_name.csv', header=None)
Implementation:
In this example, we are reading the Iris species dataset, downloaded from Kaggle. Notice that when the data is displayed without a header, the index values 0, 1, 2 ... 5 are shown as column labels. In the first display, which has a header, the column names are shown instead.
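Here is a small, self-contained sketch of the same idea. The two data rows are invented Iris-like values rather than the Kaggle file, and the column names passed through names are hypothetical.

```python
import io

import pandas as pd

# Hypothetical data with no header row.
csv_text = "5.1,3.5,setosa\n7.0,3.2,versicolor\n"

# header=None tells pandas that the first line is data, not column names.
df = pd.read_csv(io.StringIO(csv_text), header=None)
print(df.columns.tolist())  # [0, 1, 2]

# Optionally supply our own column names with the names argument.
named = pd.read_csv(
    io.StringIO(csv_text),
    header=None,
    names=['sepal_length', 'sepal_width', 'species'],
)
print(named)
```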
Read CSV File in Pandas With a Header
- By default, read_csv() uses the first row of the file as the header, so its values become the column names.
- Without a header, the columns are labeled with index numbers starting from 0; when names are provided, these index values are replaced with the given names.
- In this section, we will learn how to read CSV files with a header. To do so, simply read the file normally without passing the header argument.
- To make the difference clear, we have displayed the data both with and without a header.
Implementation:
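The sketch below reads the same text twice, once with the default header handling and once with header=None, to show how the header line is treated in each case. The data is a made-up, two-row stand-in for the Iris file.

```python
import io

import pandas as pd

# Hypothetical data whose first line is a header row.
csv_text = "sepal_length,species\n5.1,setosa\n7.0,versicolor\n"

# Default behaviour: the first row becomes the column names.
with_header = pd.read_csv(io.StringIO(csv_text))

# With header=None, the header line is kept as an ordinary data row.
without_header = pd.read_csv(io.StringIO(csv_text), header=None)

print(with_header)
print(without_header)
```

The second DataFrame has one extra row, because the header line is counted as data.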
Append CSV File in Pandas
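- Appending means adding more rows to existing data.
- The append() method was available on both Series and DataFrame. It was deprecated in pandas 1.4 and removed in pandas 2.0; in current versions, pd.concat() is used instead.
- The options of DataFrame.append() were:
- other: data to append
- ignore_index: accepts boolean (True/False); when True, the result is renumbered from 0
- verify_integrity: accepts boolean (True/False); when True, an error is raised if the result contains duplicate index values
- sort: accepts boolean (True/False); sorts the columns when the columns of the two objects are not aligned
- To append rows directly to an existing CSV file, to_csv() can be called with mode='a'.
- Here is the complete demonstration of appending in pandas.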
Implementation:
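Since append() is no longer available in recent pandas releases, the sketch below uses pd.concat() for combining DataFrames in memory and to_csv() with mode='a' for appending rows to a CSV file on disk. The DataFrames, the file name people.csv, and the temporary directory are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd

first = pd.DataFrame({'name': ['Asha'], 'age': [31]})
second = pd.DataFrame({'name': ['Ben'], 'age': [27]})

# In memory: concat stacks the rows; ignore_index=True renumbers them from 0.
combined = pd.concat([first, second], ignore_index=True)

# On disk: mode='a' appends to the file; header=False avoids a second header line.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / 'people.csv'
    first.to_csv(csv_path, index=False)
    second.to_csv(csv_path, mode='a', header=False, index=False)
    appended = pd.read_csv(csv_path)

print(combined)
print(appended)
```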
CSV Pandas NaN
- NaN marks a missing value in the CSV file. When we open the CSV file in Excel, a missing value appears as a blank cell. pandas denotes this blank with NaN.
- The picture below shows five missing values in the CSV file; when the file is read through pandas, these values are represented as NaN.
Now, when the above CSV file is read using pandas, the missing data is denoted with NaN, as you can see in the picture below.
- Now that you know the meaning of NaN in the CSV file, let's understand how to deal with a missing value.
- A missing value can be filled in with a replacement value, or the rows that contain it can be dropped. Handling missing values is an important step in data cleaning.
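As a rough illustration of the points above, the sketch below reads a CSV with two blank cells, counts the NaN values, and then handles them in two ways: filling with fillna() and dropping rows with dropna(). The data and the fill values (the column mean and the string 'unknown') are arbitrary choices for the example.

```python
import io

import pandas as pd

# Hypothetical data: Ben has no age and Cara has no city.
csv_text = "name,age,city\nAsha,31,Pune\nBen,,Leeds\nCara,45,\n"

df = pd.read_csv(io.StringIO(csv_text))
print(df)               # blank cells appear as NaN
print(df.isna().sum())  # number of missing values per column

# Option 1: fill each column with a replacement value.
filled = df.fillna({'age': df['age'].mean(), 'city': 'unknown'})

# Option 2: drop every row that contains at least one missing value.
dropped = df.dropna()

print(filled)
print(dropped)
```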
Write CSV file in Pandas Python
- In this section, we will learn how to create, write, or export CSV files using pandas in Python.
- to_csv() is used to export the file. The name provided as an argument will be the name of the CSV file.
- There are options that we can pass while writing CSV files; the most popular one is setting index to False, which leaves the row index out of the file.
- Here is the demonstration of writing a CSV file in pandas Python.
Implementation:
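To show what the index option changes, the sketch below calls to_csv() without a file name, in which case pandas returns the CSV text as a string instead of writing a file. The DataFrame contents are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({'species': ['setosa', 'virginica'],
                   'petal_length': [1.4, 5.5]})

# Called without a path, to_csv returns the CSV text as a string.
with_index = df.to_csv()
without_index = df.to_csv(index=False)

print(with_index)     # first column holds the row index
print(without_index)  # only the data columns are written
```

Passing a file name as the first argument writes the same text to disk instead.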
Pandas CSV to DataFrame
- In this section, we will learn how to import a CSV file into a DataFrame.
- There was a method pandas.DataFrame.from_csv(file), but it was deprecated and later removed from pandas; instead, we can simply use pd.read_csv(file).
- pd.read_csv() returns a DataFrame whether the file has one column or many, so the data is ready to display and process as soon as it has been read.
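A quick way to check what read_csv() hands back is to inspect the type of the result. The sketch below does this for a hypothetical single-column CSV, and also shows that selecting one column from the DataFrame gives a Series.

```python
import io

import pandas as pd

# Hypothetical CSV with a single column.
single = pd.read_csv(io.StringIO("score\n10\n20\n"))

print(type(single).__name__)           # DataFrame
print(type(single['score']).__name__)  # Series
```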
Pandas CSV to JSON
- JSON stands for JavaScript Object Notation. JSON is a lightweight format for storing and transporting data. It is often used when data is sent from a server to a web page.
- In this section, we will learn how to read & write JSON format files & strings.
Write JSON
We are using iris.csv, which we downloaded from Kaggle. Using this file's content, we will read and write JSON.
DataFrame.to_json() is used to write JSON format.
Syntax
Here is the syntax to write JSON format
import pandas as pd
df = pd.read_csv('workspace-requirements.csv')
# With no path argument, to_json() returns the JSON as a string
df.to_json()
Read JSON
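Make sure the input you are about to read is valid JSON, either a JSON file or a JSON string.
Syntax
from io import StringIO
import pandas as pd
df = pd.read_csv('workspace-requirements.csv')
data = df.to_json()
# Wrap the JSON string in StringIO; newer pandas versions deprecate passing a literal JSON string
pd.read_json(StringIO(data))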
Implementation:
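The following is a minimal sketch of a CSV to JSON round trip. The two-row sample stands in for iris.csv, and orient='records' is one of several layouts that to_json() supports, chosen here because it produces one JSON object per row.

```python
import io
import json

import pandas as pd

# Hypothetical stand-in for a CSV file.
csv_text = "species,petal_length\nsetosa,1.4\nvirginica,5.5\n"
df = pd.read_csv(io.StringIO(csv_text))

# orient='records' writes a JSON list with one object per row.
json_text = df.to_json(orient='records')
print(json_text)

# The string is ordinary JSON, so the standard json module can parse it.
records = json.loads(json_text)

# Reading it back into pandas; StringIO wraps the JSON string as a file-like object.
restored = pd.read_json(io.StringIO(json_text), orient='records')
print(restored)
```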
Pandas CSV to excel
- In this section, we will learn how to export CSV files to Excel files.
- First, we have to read the CSV file, and then we can export it using the to_excel() method.
- We need to install the openpyxl module. The simplest way is to run pip install openpyxl in the Jupyter notebook. This may take some time. Once it completes, you can proceed with the export command.
- Here is the demonstration of exporting the CSV file to Excel.
Implementation:
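Below is a hedged sketch of the CSV to Excel export. The sample data, the file name iris_sample.xlsx, and the temporary directory are assumptions for illustration. Because openpyxl is an optional dependency, the export is wrapped in a try block so that the example reports a missing install instead of crashing.

```python
import io
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical stand-in for a CSV file.
csv_text = "species,petal_length\nsetosa,1.4\nvirginica,5.5\n"
df = pd.read_csv(io.StringIO(csv_text))

with tempfile.TemporaryDirectory() as tmp_dir:
    xlsx_path = Path(tmp_dir) / 'iris_sample.xlsx'
    try:
        # Writing and reading .xlsx files requires the openpyxl package.
        df.to_excel(xlsx_path, index=False)
        restored = pd.read_excel(xlsx_path)
        print(restored)
    except ImportError as exc:
        restored = None
        print(f"openpyxl does not appear to be installed: {exc}")
```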
Pandas CSV to the dictionary
- A Python dictionary stores data as key-value pairs.
- To export CSV to dictionary form, we first read the CSV file and then convert the resulting DataFrame to a dictionary using to_dict().
- This section could also be named Pandas DataFrame to a dictionary.
Implementation:
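As a small sketch, the code below converts a DataFrame read from CSV into dictionaries in three layouts. The sample data is invented, and the orient values shown are only some of the ones that to_dict() accepts.

```python
import io

import pandas as pd

# Hypothetical stand-in for a CSV file.
df = pd.read_csv(io.StringIO("name,age\nAsha,31\nBen,27\n"))

# Default layout: {column name: {row index: value}}
by_column = df.to_dict()

# orient='list': {column name: [values]}
as_lists = df.to_dict(orient='list')

# orient='records': one dictionary per row
as_records = df.to_dict(orient='records')

print(by_column)
print(as_lists)
print(as_records)
```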
Python Pandas CSV to HTML
- In this section, we will learn how to convert CSV files to HTML. This conversion can also be described as converting DataFrames to HTML.
- The first step is to read the CSV file and store it in a variable. In our case, we have used df as the variable name.
- to_html() is used to convert the DataFrame into HTML format.
Syntax:
import pandas as pd
df = pd.read_csv('filepath-name.csv')
df.to_html('filepath-newname.html')
Implementation:
Here is the implementation of the code in a Jupyter notebook. We have used the Iris dataset downloaded from Kaggle.
Here is the output of the generated HTML file. There is more data, but we have taken a screenshot of just the first 10 rows.
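For readers without the Iris file at hand, here is a self-contained sketch of the same conversion. It uses made-up data and calls to_html() without a file name, which returns the HTML table as a string; head(10) limits the output to the first 10 rows.

```python
import io

import pandas as pd

# Hypothetical stand-in for the Iris CSV file.
csv_text = "species,petal_length\nsetosa,1.4\nvirginica,5.5\n"
df = pd.read_csv(io.StringIO(csv_text))

# Without a path, to_html returns the table markup as a string.
html = df.head(10).to_html(index=False)
print(html)
```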
In this tutorial, we have learned to use CSV files in Python pandas. We have covered these topics:
- What is a CSV File in Pandas
- How to Read CSV File in pandas
- How to Save CSV File in pandas
- How to Read CSV File in Pandas without Header
- How to Read CSV File in Pandas with a Header
- How to use Append CSV file in pandas
- What is CSV pandas nan
- Pandas CSV to DataFrame
- Pandas CSV to JSON
- Pandas CSV to excel
- Pandas CSV to the dictionary
I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I have been working with Python, machine learning, and artificial intelligence for the last 5 years. During this time, I have gained expertise in various Python libraries such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, Tensorflow, Scipy, and Scikit-Learn, working for clients in the United States, Canada, the United Kingdom, Australia, New Zealand, and elsewhere. Check out my profile.