Pandas in Python

In this Python machine learning tutorial, we will learn about Pandas in Python. We will cover the following topics:

  • Pandas Library in Python Documentation
  • Installing Pandas via conda
  • Installing Pandas via pip
  • Pandas Version Check
  • Types of Pandas in Python
  • Head Function in Pandas
  • Tail Function in Pandas

Pandas Library in Python Documentation

  • Pandas is a widely used Python library for data analysis in machine learning and data science. It allows creating, reading, manipulating, and deleting data.
  • You might be thinking that Structured Query Language (SQL) provides similar features. One major difference is the data source: Pandas can read many file formats directly (such as CSV, Excel, and JSON), whereas SQL operates on data stored in a database.
  • Pandas is simple to use, integrates with many data science and machine learning tools, and helps get data ready for machine learning.
  • Pandas has two main types of objects:
    • Series
    • DataFrame
  • See the official Pandas documentation for the complete reference.
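As a rough illustration of the create, read, manipulate, and delete operations mentioned above, here is a minimal sketch. The CSV text, column names, and values are made up for illustration, and io.StringIO stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content; StringIO lets read_csv treat it like a file.
csv_text = "name,price\nkeyboard,25\nmouse,10\nmonitor,150\n"

# Read: load the CSV data into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Create: add a new column computed from an existing one.
df["price_with_tax"] = df["price"] * 1.1

# Manipulate: keep only the rows where price is above 20.
expensive = df[df["price"] > 20]

# Delete: drop a column that is no longer needed.
trimmed = df.drop(columns=["price_with_tax"])

print(expensive)
print(trimmed)
```

With a real file, the path to the file would be passed to read_csv in place of the StringIO object.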

Installing Pandas via conda

  • Conda is a package manager commonly used to install the libraries needed for machine learning.
  • In this section, we will learn how to install Pandas using the conda package manager.
  • The first step is to create an environment.
  • Once the environment is created, activate it.
  • With the environment activated, type the command below to install Pandas.
conda install pandas -y
  • Here -y answers yes to the y/n confirmation prompt. The installation may take a couple of minutes depending on your connection speed.
  • Note: Installing inside an environment is the recommended practice, but Pandas can also be installed without creating one.
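After the installation, it can be useful to confirm from Python that a conda environment is active and that Pandas can be found. The snippet below is a small sketch, not part of the original steps. It relies on the CONDA_DEFAULT_ENV environment variable, which conda sets when an environment is activated, and on the standard library's importlib.util.find_spec.

```python
import importlib.util
import os

# CONDA_DEFAULT_ENV is set by `conda activate`; it is absent outside conda.
env_name = os.environ.get("CONDA_DEFAULT_ENV")
print("Active conda environment:", env_name or "none detected")

# find_spec returns None when the package cannot be found for import.
pandas_installed = importlib.util.find_spec("pandas") is not None
print("pandas installed:", pandas_installed)
```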

Installing Pandas via pip

  • pip is the package installer for Python. It can install a wide variety of Python libraries for many different purposes.
  • In case you want to know more about pip or how to install it on your system, refer to the pip documentation.
  • We are assuming that pip is installed on your system.
  • Now we need to install virtualenv globally so that we can create a virtual environment.

Syntax:

Here is the syntax to install virtualenv globally on the system.

pip install virtualenv
  • To install Pandas, we first have to create and activate a virtual environment, inside which we will install all the necessary libraries.
# create a virtual environment named env
virtualenv env

# activate the environment on Windows
env\Scripts\activate

# activate the environment on macOS and Linux
source env/bin/activate
  • Now to install Pandas, simply type pip install pandas
  • Note: Installing inside a virtual environment is the recommended practice, but Pandas can also be installed without creating one.
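To check from Python whether the interpreter is running inside a virtual environment, a sketch like the following can help. The function name in_virtual_environment is hypothetical. The check covers both older virtualenv releases, which set sys.real_prefix, and newer ones, which make sys.prefix differ from sys.base_prefix.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; newer ones (and venv)
    # make sys.prefix differ from sys.base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("Inside a virtual environment:", in_virtual_environment())
print("Interpreter location:", sys.executable)
```

If the first line prints False after activation, the environment was probably not activated in the current shell.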

Pandas Version Check

To find which version of Pandas is running on a system, along with the versions of its dependencies, we can use the utility function pd.show_versions(). The information it provides is useful for troubleshooting problems.
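A short sketch of the version check follows. In addition to pd.show_versions() from the text, it prints pd.__version__, which holds the version of Pandas itself as a string.

```python
import pandas as pd

# Version string of the installed Pandas library itself.
print(pd.__version__)

# Versions of Python, the operating system, and Pandas dependencies.
pd.show_versions()
```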

Types of Pandas in Python (Based on Usage)

  • Pandas in Python can be used in two ways
    • Series
    • DataFrame
  • A Series is used when we have a single column (one dimension) of labeled data.
  • A DataFrame is used when we have data in tabular format, with rows and columns.
  • Let’s understand the working of Series & DataFrame in more detail.

Pandas Series

  • In this section, we will discuss the Pandas Series. We will talk about its definition, purpose, and an example.
  • A Series is a one-dimensional labeled array capable of holding data of any type.

Syntax:

import pandas as pd

pd.Series(['item1', 'item2', 'item3', ..., 'item_n'])
  • To use Pandas, we need to import it first. In the first line, we import pandas and give it the short name pd.
  • You can use any name instead of pd, but pd is the most common alias used by programmers.
  • The items can be integers, floats, strings, or other Python objects.
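To see how a Series handles different item types, the sketch below builds one Series per type and prints the dtype Pandas chose for each. The values are made up, and the exact dtype names printed can vary between Pandas versions and platforms.

```python
import pandas as pd

integers = pd.Series([1, 2, 3])
floats = pd.Series([1.5, 2.5, 3.5])
strings = pd.Series(["cpu", "ram", "ssd"])
mixed = pd.Series([1, "ram", 2.5])  # mixed types are stored as generic objects

# Print the dtype Pandas inferred for each Series.
for name, series in [("integers", integers), ("floats", floats),
                     ("strings", strings), ("mixed", mixed)]:
    print(name, series.dtype)
```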

Example:

In this example, we have created a list of items required to assemble a computer. We have created two Series, one with the name computer and the other with the name feature.
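The original code for this example is not reproduced here, so the following is a minimal sketch of what it might look like. The names computer and feature come from the description above. The items themselves are hypothetical.

```python
import pandas as pd

# Hypothetical parts needed to assemble a computer.
computer = pd.Series(["Processor", "Motherboard", "RAM", "Storage"])

# Hypothetical feature of each part, in the same order.
feature = pd.Series(["8 cores", "ATX", "16 GB", "512 GB SSD"])

print(computer)
print(feature)

# Without an explicit index, the labels are integers starting at 0.
print(computer[0])
```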

Pandas DataFrame

  • In this section, we will discuss the Pandas DataFrame. We will talk about its definition, purpose, and an example.
  • A Pandas DataFrame is a two-dimensional, labeled representation of data.
  • It displays data in table format. Each entry is identified by a row and a column.
  • Each column of a DataFrame is an individual Series, and a DataFrame can be created from a dictionary that maps column names to column values.

Syntax:

In this syntax, we create three columns, each holding the same number of items (rows).

import pandas as pd

pd.DataFrame({'Column1': [item1, item2, ..., item_n], 'Column2': [item1, item2, ..., item_n], 'Column_n': [item1, item2, ..., item_n]})

Example:

In this example, we will see how to create a DataFrame using the Series mentioned above.
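A minimal sketch of this step, assuming two Series named computer and feature with hypothetical values, could look like this. The column names Computer and Feature are also assumptions.

```python
import pandas as pd

# Hypothetical Series; the values are for illustration only.
computer = pd.Series(["Processor", "Motherboard", "RAM", "Storage"])
feature = pd.Series(["8 cores", "ATX", "16 GB", "512 GB SSD"])

# Each dictionary key becomes a column name; each Series becomes that column.
parts = pd.DataFrame({"Computer": computer, "Feature": feature})

print(parts)
```

Because both Series share the same default integer labels, Pandas aligns them row by row.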

Here is another example of a DataFrame, this time created from scratch.
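Since the original code is not reproduced here, the sketch below shows one way to build a DataFrame directly from a dictionary of lists. The column names and values are hypothetical.

```python
import pandas as pd

# Each key is a column name; each list holds that column's values.
students = pd.DataFrame({
    "name": ["Asha", "Ben", "Chen"],
    "age": [21, 22, 20],
    "score": [88.5, 92.0, 79.5],
})

print(students)

# shape is a (rows, columns) tuple.
print(students.shape)
```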

Head Function in Pandas

  • While working with a large dataset, we often make changes and want to check the result. Displaying the entire dataset can be time-consuming, so we use the head function in that case.
  • By default, the head function displays the first 5 rows of the dataset.
  • You can change the number of rows displayed by passing an integer argument; head then displays that many rows from the top.
  • In our example, we will be using the Iris dataset downloaded from Kaggle.
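The Iris file itself is not included here, so the sketch below uses a small made-up DataFrame as a stand-in to show how head behaves. The column names row_id and value are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for a larger dataset: 20 rows, 2 columns.
data = pd.DataFrame({
    "row_id": range(20),
    "value": [i * 0.5 for i in range(20)],
})

first_five = data.head()    # no argument: the first 5 rows
first_three = data.head(3)  # integer argument: the first 3 rows

print(first_five)
print(first_three)
```

With the real dataset, the DataFrame would instead come from reading the downloaded file, for example with pd.read_csv.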

Tail Function in Pandas

  • While working with a large dataset, we often make changes and want to check the result. Displaying the entire dataset can be time-consuming, so we use the tail function in that case.
  • By default, the tail function displays the last 5 rows of the dataset.
  • You can change the number of rows displayed by passing an integer argument; tail then displays that many rows from the bottom.
  • In our example, we will be using the Iris dataset downloaded from Kaggle.
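The Iris file itself is not included here, so the sketch below uses a small made-up DataFrame as a stand-in to show how tail behaves. The column names row_id and value are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for a larger dataset: 20 rows, 2 columns.
data = pd.DataFrame({
    "row_id": range(20),
    "value": [i * 0.5 for i in range(20)],
})

last_five = data.tail()    # no argument: the last 5 rows
last_three = data.tail(3)  # integer argument: the last 3 rows

print(last_five)
print(last_three)
```

The rows keep their original order and index labels, so the last row of the dataset is also the last row of the result.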


In this tutorial, we have learned about Pandas in Python and covered the topics listed below. We will see more functions and applications of Pandas in further tutorials.

  • Pandas Library in Python Documentation
  • Installing Pandas via conda
  • Installing Pandas via pip
  • Pandas Version Check
  • Types of Pandas in Python
  • Head Function in Pandas
  • Tail Function in Pandas