Convert pandas dataframe to tensorflow dataset

TensorFlow Datasets are used in many TensorFlow examples. So how exactly do you convert a pandas DataFrame into a TensorFlow Dataset? In this tutorial, we first create a pandas DataFrame and then load it into TensorFlow as a tf.data.Dataset.

  • Convert pandas dataframe to TensorFlow dataset
  • Convert NumPy array to TensorFlow dataset
  • How to convert the dictionary into a Pandas DataFrame
  • Pandas series to TensorFlow tensor
  • Pandas DataFrame vs TensorFlow

Convert pandas data frame to TensorFlow dataset

  • Here we will discuss how to convert a pandas DataFrame to a TensorFlow dataset in Python.
  • In this example, we will create a dummy pandas DataFrame and transform it into a TensorFlow dataset. For this, we are going to use the tf.data.Dataset.from_tensor_slices() function.
  • tf.data.Dataset.from_tensors() wraps its input as a single dataset element, whereas tf.data.Dataset.from_tensor_slices() slices its input along the first dimension and creates one element per slice, so each element corresponds to one row of your data. Therefore, when you pass several tensors (for example, a dictionary of columns), they must all have the same length in the first dimension, and each element of the resulting dataset has the same structure as the input, with one value per column.
  • In other words, from_tensor_slices() lets you obtain the slices of an array as the individual elements of a tf.data.Dataset.
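
To make the difference between the two functions concrete, here is a minimal sketch. The array values and the variable names (data, whole, sliced) are only illustrative and are not part of the example that follows.

```python
import numpy as np
import tensorflow as tf

# A small 3x2 array: three rows, two columns (illustrative values).
data = np.array([[1, 2], [3, 4], [5, 6]])

# from_tensors wraps the whole array as ONE dataset element of shape (3, 2).
whole = tf.data.Dataset.from_tensors(data)

# from_tensor_slices slices along the first axis: THREE elements of shape (2,).
sliced = tf.data.Dataset.from_tensor_slices(data)

whole_shapes = [x.numpy().shape for x in whole]
sliced_shapes = [x.numpy().shape for x in sliced]

print(whole_shapes)   # [(3, 2)]
print(sliced_shapes)  # [(2,), (2,), (2,)]
```

This is why from_tensor_slices() is usually the right choice for tabular data: each row ends up as its own dataset element.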

Syntax:

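Let’s have a look at the syntax of the tf.data.Dataset.from_tensor_slices() function in Python TensorFlow. The tensors argument can be a single array or a nested structure (such as a dictionary or tuple) of arrays that share the same first dimension.

tf.data.Dataset.from_tensor_slices(tensors, name=None)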

Example:

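import tensorflow as tf
import pandas as pd

# Build a small DataFrame with one numeric and one string column
df = pd.DataFrame({'Department1': [78, 16, 89],
                   'Department2': ['Science', 'Maths', 'Biology']})
print(df)

# dict(df) maps each column name to its Series; every row becomes one element
new_df_val = tf.data.Dataset.from_tensor_slices(dict(df))
print(new_df_val)

# Print the first three elements (here, all the rows)
for i in new_df_val.take(3):
    print(i)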

In the code above, we first imported the tensorflow and pandas libraries and then created a DataFrame by using the pd.DataFrame() function, in which we assigned two columns, 'Department1' and 'Department2'.

Next, we converted the DataFrame to a TensorFlow dataset by using tf.data.Dataset.from_tensor_slices() and then iterated over its first three elements. Each element is a dictionary that maps a column name to the value of that column in one row.

Here is a screenshot of the output of the code above

Convert pandas dataframe to TensorFlow dataset

This is how to convert the Pandas DataFrame to TensorFlow Dataset in Python TensorFlow.
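
In practice, a DataFrame often holds both feature columns and a label column. The sketch below shows one possible way to split them and batch the result. The column names (height, weight, label) and their values are hypothetical, and the columns are converted with to_numpy() first simply to make the input types explicit.

```python
import pandas as pd
import tensorflow as tf

# Hypothetical frame: two feature columns and one label column.
df = pd.DataFrame({
    "height": [1.5, 1.7, 1.8, 1.6],
    "weight": [55.0, 70.0, 82.0, 60.0],
    "label": [0, 1, 1, 0],
})

# Remove the label column from the frame and keep it separately.
labels = df.pop("label").to_numpy()

# Map each remaining column name to a plain NumPy array.
features = {name: column.to_numpy() for name, column in df.items()}

# Each element is a (features dict, label) pair; batch(2) groups two rows.
ds = tf.data.Dataset.from_tensor_slices((features, labels)).batch(2)

batches = [(f["height"].numpy().tolist(), y.numpy().tolist()) for f, y in ds]
print(batches)  # [([1.5, 1.7], [0, 1]), ([1.8, 1.6], [1, 0])]
```

A dataset of (features, label) pairs like this one has the structure that Keras training methods such as model.fit() expect.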


Convert NumPy array to TensorFlow dataset

  • In this section, we will discuss how to convert a NumPy array to a TensorFlow dataset.
  • To perform this task we are going to use the tf.data.Dataset.from_tensor_slices() function, which converts a NumPy array to a dataset by slicing it along its first dimension.

Example:

import tensorflow as tf
import numpy as np

# Create a 4x4 array by using the np.array() function
new_array = np.array([[15, 98, 67, 45],
                      [55, 25, 16, 67],
                      [45, 99, 23, 36],
                      [88, 92, 14, 22]])

# Convert the array to a dataset; each row becomes one element
result = tf.data.Dataset.from_tensor_slices(new_array)

# Print every element as a NumPy array
for m in result:
    print(m.numpy())

In this example, we used the np.array() function to create an array of integer values. Next, we converted the array into a dataset by using the tf.data.Dataset.from_tensor_slices() function, so each row of the array became one element of the dataset.

Here is the output of the code above

Convert NumPy array to TensorFlow dataset

As the screenshot shows, the NumPy array has been converted to a TensorFlow dataset.
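
Once the array is a dataset, you can inspect and regroup its elements. The following sketch reuses the same array and assumes an explicit int64 dtype so that the result is the same on every platform. The names ds and batched are only illustrative.

```python
import numpy as np
import tensorflow as tf

new_array = np.array([[15, 98, 67, 45],
                      [55, 25, 16, 67],
                      [45, 99, 23, 36],
                      [88, 92, 14, 22]], dtype=np.int64)

ds = tf.data.Dataset.from_tensor_slices(new_array)

# element_spec describes a single element: a vector of 4 integers.
print(ds.element_spec)

# batch(2) regroups the four row-elements into two batches of shape (2, 4).
batched = list(ds.batch(2).as_numpy_iterator())
print(batched[0])
```

The element_spec property is a quick way to check the shape and dtype of the elements without iterating over the dataset.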


How to convert the dictionary into a Pandas DataFrame

  • To convert a Python dictionary into a pandas DataFrame, pass the dictionary to the pd.DataFrame() constructor. The keys become the column names and the values become the column data.
  • To go in the opposite direction and transform a DataFrame into a Python dictionary (dict) object, use the pandas.DataFrame.to_dict() method. By default, the column names become the keys and the data of each column becomes the corresponding value. Its parameters define the dictionary’s structure and how the key-value pairs are laid out.

Syntax:

Here is the syntax of the DataFrame.to_dict() method

DataFrame.to_dict(orient='dict', into=dict)
  • It consists of a few parameters
    • orient: Defines the key-value structure of the resulting dict. By default, it takes the 'dict' value, which produces {column -> {index -> value}}. Other options include 'list', 'series', 'split', 'records', and 'index'.
    • into: It is used to define the mapping type of the resultant dict. We can give an actual class or an empty instance.
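
As a quick illustration of the orient parameter, the sketch below converts a small DataFrame to a dictionary in three different layouts. The DataFrame is a shortened version of the car data used in this section, and the variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"Car_names": ["Tesla", "BMW"], "Price": [6782, 345555]})

# Default orient='dict': {column -> {index -> value}}
as_dict = df.to_dict()

# orient='list': {column -> [values]}
as_list = df.to_dict(orient="list")

# orient='records': one dictionary per row
as_records = df.to_dict(orient="records")

print(as_dict)
print(as_list)
print(as_records)
```

The 'list' layout is the same shape of dictionary that pd.DataFrame() accepts, so it is a convenient choice when you need to rebuild the DataFrame later.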

Example:

import pandas as pd

# Dictionary keys become the column names; the lists become the column data
new_val = {'Car_names': ['Tesla', 'BMW', 'TOYOTA'],
           'Price': [6782, 345555, 444323]}
result = pd.DataFrame(new_val, columns=['Car_names', 'Price'])

print(result)
print(type(result))

You can refer to the screenshot below

How to convert the dictionary into a Pandas DataFrame

This is how we can convert the dictionary into a Pandas DataFrame.
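
When the dictionary keys describe rows rather than columns, the pd.DataFrame.from_dict() method with orient='index' may be a better fit. The following is a minimal sketch. The dictionary rows and the column name Price are only illustrative.

```python
import pandas as pd

# Here each key is a row label and each value is that row's data.
rows = {"Tesla": [6782], "BMW": [345555], "TOYOTA": [444323]}

# orient='index' turns the keys into the index instead of the columns.
df = pd.DataFrame.from_dict(rows, orient="index", columns=["Price"])

print(df)
print(df.loc["BMW", "Price"])  # 345555
```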


Pandas series to TensorFlow tensor

  • Let us see how to convert pandas Series data to TensorFlow tensors.
  • To do this task we are going to use the tf.data.Dataset.from_tensor_slices() function. Each column of a DataFrame is a pandas Series, so dict(df) hands the function one Series per column. The function converts each Series to a tensor and slices it along the first dimension, so every element of the dataset holds one value from each Series.

Example:

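import tensorflow as tf
import pandas as pd

# Each column of this DataFrame is a pandas Series
df = pd.DataFrame({'Department1': [178, 965, 156],
                   'Department2': ['Chemistry', 'Maths', 'Biology']})
print(df)

# dict(df) maps each column name to its Series; every row becomes one element
new_df_val = tf.data.Dataset.from_tensor_slices(dict(df))
print(new_df_val)

# Print the first three elements (here, all the rows)
for i in new_df_val.take(3):
    print(i)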

In the code above, we first imported the tensorflow and pandas libraries and then created a DataFrame by using the pd.DataFrame() function, in which we assigned two columns, 'Department1' and 'Department2'.

Next, we converted the DataFrame to a TensorFlow dataset by using tf.data.Dataset.from_tensor_slices() and then iterated over its first three elements. In each element, the values taken from the two Series are TensorFlow tensors.

Here is the output of the code above

Pandas series to TensorFlow tensor
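
If you only need a single tensor from one Series rather than a dataset, an alternative (not the approach used in the example above) is tf.convert_to_tensor(). The sketch below assumes a numeric Series with an explicit int64 dtype. The names series and tensor are only illustrative.

```python
import pandas as pd
import tensorflow as tf

# A single numeric Series; the dtype is set explicitly for portability.
series = pd.Series([178, 965, 156], name="Department1", dtype="int64")

# to_numpy() returns a plain NumPy array, which convert_to_tensor accepts.
tensor = tf.convert_to_tensor(series.to_numpy())

print(tensor.shape)
print(tensor.dtype)
print(tensor.numpy().tolist())  # [178, 965, 156]
```

The result is one tensor of shape (3,) that holds the whole Series, whereas from_tensor_slices() would yield three separate scalar elements.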


Pandas DataFrame vs TensorFlow

  • A pandas DataFrame is a structure that stores two-dimensional data together with its labels. DataFrames are frequently used in data science, machine learning, scientific computing, and many other domains that deal with large amounts of data.
  • A tensor is an n-dimensional array, a generalization of vectors and matrices, that can represent any kind of data. A tensor’s values all have the same data type, and its shape is known (or at least partially known). The shape gives the size of the array along each dimension.
  • The columns of a DataFrame may be of different types, such as int, bool, and others. A DataFrame can be compared to a dictionary of Series in which both the columns and the rows are labeled. The column labels are called “columns” and the row labels are called the “index”.
  • Tensors can be either the input data or the output of a calculation. In TensorFlow, operations can be organized into a graph that represents a series of calculations. Each operation is a node in the graph (an op node), and the connections between the nodes are the tensors that flow from one operation to the next. In TensorFlow 2.x, operations run eagerly by default, and a graph is built when the code is wrapped in tf.function.
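
The dtype difference described above can be seen in a few lines. In this sketch the column names (score, subject) are hypothetical: the DataFrame stores an integer column and a string column side by side, while each tensor built from it has exactly one dtype.

```python
import pandas as pd
import tensorflow as tf

# One DataFrame can hold columns of different types.
df = pd.DataFrame({"score": [78, 16, 89],
                   "subject": ["Science", "Maths", "Biology"]})
print(df.dtypes)

# Each tensor has a single dtype, so every column becomes its own tensor.
score = tf.convert_to_tensor(df["score"].to_numpy())
subject = tf.convert_to_tensor(df["subject"].tolist())

print(score.dtype)    # an integer dtype
print(subject.dtype)  # tf.string
```

This is also why the earlier examples pass dict(df) to from_tensor_slices(): a dictionary lets every column keep its own dtype instead of forcing the whole table into one tensor.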

In this section, we have discussed the major differences between a pandas DataFrame and a TensorFlow tensor.


In this article, we have discussed how to convert a pandas DataFrame to a TensorFlow dataset, and we have also covered the following topics:

  • Convert pandas dataframe to TensorFlow dataset
  • Convert NumPy array to TensorFlow dataset
  • How to convert the dictionary into a Pandas DataFrame
  • Pandas series to TensorFlow tensor
  • Pandas DataFrame vs TensorFlow