In this TensorFlow tutorial, **I will show you how to use the TensorFlow one-hot encoding function, tf.one_hot().**

This function is useful because it converts categorical labels into numerical vectors that a machine learning algorithm can work with.

I was building an image classifier in TensorFlow, and the task was to classify 10 different kinds of animals. I collected and prepared the data, but the labels were still in text form, which an ML model cannot consume directly.

The problem was converting the label for each animal into a numerical format so that the data could be fed to the model. Luckily, TensorFlow provides the **tf.one_hot** function, which helped me convert these labels into numerical values.

So, in this tutorial, I explain how to use **tf.one_hot**, covering its syntax and simple examples from scratch.

## What does One_Hot Encoding mean?

One-hot encoding converts categorical data into a numerical format that can be fed to a machine-learning model.

But **how does one-hot encoding work?** Suppose your data falls into a fixed set of categories. With one-hot encoding, each category is represented as a binary vector that has a 1 in exactly one position and 0 everywhere else.

Let me show you an example. You have two categories: **elephant** and **eagle**. If the elephant takes the first position and the eagle takes the second position, you can represent these categories in a list as **[‘elephant’, ‘eagle’].**

Now, in one-hot encoding, each category in this list becomes a binary vector: **for the elephant,** it is **[1, 0]**, and **for the eagle,** it is **[0, 1].**

Here, **elephant = [1, 0]** and **eagle = [0, 1];** thus, you have converted your categorical data **(‘elephant’ and ‘eagle’)** into numerical data **([1, 0], [0, 1])**. This is how one-hot encoding works: the position of the 1 identifies the category.

You can now work with this numerical data in Python.
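To make the idea concrete before bringing in TensorFlow, here is a minimal sketch of the same encoding done by hand in plain Python. The helper name `one_hot_manual` is my own and is used only for illustration.

```python
# Manual one-hot encoding for the two example categories.
categories = ["elephant", "eagle"]


def one_hot_manual(label, categories):
    """Return a list with 1 at the label's position and 0 everywhere else."""
    return [1 if label == category else 0 for category in categories]


elephant_vec = one_hot_manual("elephant", categories)
eagle_vec = one_hot_manual("eagle", categories)

print(elephant_vec)  # [1, 0]
print(eagle_vec)     # [0, 1]
```

This works for two categories, but writing and maintaining such a helper becomes tedious as the number of categories grows.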

## TensorFlow One_Hot Encoding

But **what happens if you have more than two categories, such as 5, 10, or 50? Will you encode all of them by hand?** I don’t think so.

For that, TensorFlow provides a function called **tf.one_hot** that you can use to convert categorical data into numerical values like the above.

The **tf.one_hot** function requires two things: **the indices of the categorical data** and the **depth (number of categories)**. It then **returns a tensor in which each index is encoded as a one-hot vector.**

The syntax of **tf.one_hot** is given below.

```
tf.one_hot(
    indices,
    depth,
    on_value=None,
    off_value=None,
    axis=None,
    dtype=None,
    name=None
)
```

Where parameters are:

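- **indices:** A tensor of integer indices, one per item to encode. Each index says which position should be ‘hot’.
- **depth:** The length of each one-hot vector, which is normally the number of categories.
- **on_value:** The value written at the ‘hot’ position. By default, it is 1.
- **off_value:** The value written at all other positions. By default, it is 0.
- **axis:** The axis along which the one-hot vector is filled. By default, its value is -1, which adds a new innermost axis.
- **dtype:** The data type of the output tensor. If neither on_value nor off_value is given, it defaults to tf.float32.
- **name:** An optional name for the operation.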
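The optional parameters are easier to understand when you see them change the output. The following is a small sketch, not taken from the original example; the indices `[0, 2]`, the depth of 3, and the variable names are values I picked purely for illustration.

```python
import tensorflow as tf

indices = [0, 2]  # illustrative indices, chosen only for this sketch

# Custom values: 5.0 at the 'hot' position and -1.0 everywhere else.
custom = tf.one_hot(indices, depth=3, on_value=5.0, off_value=-1.0)

# axis=0 places the one-hot dimension first, so the shape is (3, 2)
# instead of the default (2, 3).
depth_first = tf.one_hot(indices, depth=3, axis=0)

# dtype controls the output type; here the 1s and 0s are integers.
as_int = tf.one_hot(indices, depth=3, dtype=tf.int32)

print(custom.numpy())
print(depth_first.shape)
print(as_int.dtype)
```

With the default axis, each row of the result is one encoded item, which is the layout used in the rest of this tutorial.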

For example, suppose you have the same categories, **elephant** and **eagle**.

The first step is to import TensorFlow using the code below.

`import tensorflow as tf`

As you know, **tf.one_hot** accepts the indices of the categorical data, so let’s create them. Suppose **the index of the elephant is 0** and **the index of the eagle is 1**, as shown below.

```
# elephant (0) and eagle (1)
category_indices = [0, 1]
```

**tf.one_hot** also takes a second parameter, **depth** (the number of categories). There are two categories here, so set the depth to 2.

`depth = 2`

Now pass these two values, **category_indices** and **depth**, to the **tf.one_hot()** function, as shown below.

`encoded_data = tf.one_hot(category_indices, depth)`

Output the values of encoded_data.

`print(encoded_data.numpy())`

It outputs a binary matrix that contains two lists: **[1. 0.] representing the elephant** and **[0. 1.] representing the eagle in numerical form.**

There are two terms to know here, hot and cold. If you look at the binary matrix, it contains two separate lists:

- **The elephant is represented as [1. 0.]** because **the first position is for the elephant in the indices**; the 1 in the first position shows that it is an elephant.
- **The eagle is represented as [0. 1.]** because **the second position is for the eagle in the indices**; the 1 in the second position shows that it is an eagle.
- So, in each list, **only one position is ‘hot’, marked as 1**, and **the rest are ‘cold’, marked as 0.**

This is how you can use TensorFlow’s **tf.one_hot()** function to convert categorical data into a binary matrix (numerical data).
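In practice, labels usually arrive as text rather than as ready-made indices. The sketch below shows one possible way to go from text labels to one-hot vectors by first building a lookup dictionary. The sample `labels` list and the names `class_names` and `label_to_index` are assumptions made for this illustration.

```python
import tensorflow as tf

# Hypothetical text labels, as they might come from a dataset.
labels = ["eagle", "elephant", "eagle"]

# Fixed category order: elephant -> 0, eagle -> 1 (same as above).
class_names = ["elephant", "eagle"]
label_to_index = {name: index for index, name in enumerate(class_names)}

# Step 1: text label -> integer index.
label_indices = [label_to_index[label] for label in labels]

# Step 2: integer index -> one-hot vector.
encoded_labels = tf.one_hot(label_indices, depth=len(class_names))

print(label_indices)           # [1, 0, 1]
print(encoded_labels.numpy())  # rows: [0. 1.], [1. 0.], [0. 1.]
```

Setting the depth to `len(class_names)` keeps it in sync with the category list if you add more animals later.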

But **why must we convert categories into numerical values using the TensorFlow tf.one_hot() function?**

There are a few reasons. First, machine learning models perform arithmetic on their inputs, so they need numerical data rather than text or descriptions. One-hot encoding converts category labels into a numerical form that a model can process and analyze.

The second reason is to avoid misinterpretation of the data. Suppose you assign plain numbers such as 1 for the elephant and 2 for the eagle. When a model works directly on these numbers, it might treat the eagle as greater than the elephant, even though the categories have no natural order.

So here, one-hot encoding prevents this by treating each category equally.
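The ordering problem is easier to see with three categories. The following is a small plain-Python sketch under my own assumptions: the third category, "lion", is hypothetical and not part of the original example, and `distance` is a helper I wrote for this comparison.

```python
import math

# Integer codes imply an order and uneven gaps between categories.
# "lion" is a hypothetical third category added only for this sketch.
int_codes = {"elephant": 1, "eagle": 2, "lion": 3}

# One-hot codes for the same three categories.
one_hot_codes = {
    "elephant": [1, 0, 0],
    "eagle": [0, 1, 0],
    "lion": [0, 0, 1],
}


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# With integer codes, the elephant looks closer to the eagle than to the lion.
int_gap_near = abs(int_codes["elephant"] - int_codes["eagle"])  # 1
int_gap_far = abs(int_codes["elephant"] - int_codes["lion"])    # 2

# With one-hot codes, every pair of categories is the same distance apart.
oh_near = distance(one_hot_codes["elephant"], one_hot_codes["eagle"])
oh_far = distance(one_hot_codes["elephant"], one_hot_codes["lion"])

print(int_gap_near, int_gap_far)
print(oh_near == oh_far)  # True
```

Because all one-hot vectors are equally far apart, no category appears larger than or closer to another.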

Third, your data must be compatible with the machine learning model. Neural network models in TensorFlow operate on numerical tensors, so **tf.one_hot** lets us convert the labels into a format these models can work with directly.

Next, I will explain how **tf.one_hot** is used in Natural Language Processing.

### TensorFlow One_Hot Encoding in Natural Language Processing

You can use TensorFlow’s one-hot encoding in natural language processing to represent words or characters. For example, suppose you have a text and need to encode each character for a neural network model.

You can use **tf.one_hot** to encode those characters. Let’s say you have the characters **n, l, p.**

So, the first step is to create indices of these characters and specify their depth (the number of characters in this case).

```
# n is (0), l is (1), p is (2)
char_indices = [0, 1, 2]
depth_numchars = 3
```

Now pass the above **char_indices** and **depth_numchars** to the **tf.one_hot()** function as shown below.

```
char_encoding = tf.one_hot(char_indices, depth_numchars)
print(char_encoding)
```

It converts the given characters into numerical data: **n = [1. 0. 0.]**, **l = [0. 1. 0.]**, and **p = [0. 0. 1.]**.

This is just a simple example of using the **tf.one_hot()** to encode the characters into numerical data in Natural Language Processing.
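Real text is a sequence of characters rather than a ready-made list of indices, so a lookup step is usually needed first. Here is a minimal sketch that encodes a short string and then recovers the indices with `tf.argmax`. The sample string `"nlpnl"` and the names `vocab` and `char_to_index` are assumptions chosen for illustration.

```python
import tensorflow as tf

# Character vocabulary in the same order as above: n -> 0, l -> 1, p -> 2.
vocab = ["n", "l", "p"]
char_to_index = {ch: index for index, ch in enumerate(vocab)}

text = "nlpnl"  # hypothetical input text using only vocabulary characters

# Step 1: map each character to its index.
text_indices = [char_to_index[ch] for ch in text]

# Step 2: one-hot encode the whole sequence; the shape is (5, 3).
text_encoding = tf.one_hot(text_indices, depth=len(vocab))

# Decoding: the position of the 1 in each row gives the index back.
decoded_indices = tf.argmax(text_encoding, axis=1).numpy().tolist()
decoded_text = "".join(vocab[index] for index in decoded_indices)

print(text_encoding.shape)
print(decoded_text)  # nlpnl
```

Each row of `text_encoding` corresponds to one character of the input, in order.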

After learning from the above example and the concepts of the **tf.one_hot()** function, I hope you understand how to use the TensorFlow one-hot encoding method.

## Conclusion

In this TensorFlow tutorial, you learned **how to use the TensorFlow one-hot encoding function, tf.one_hot(),** to convert categorical or textual data into numerical data.

You learned about one-hot encoding and how it works with an example. You also did an example where you converted the two categories ‘elephant’ and ‘eagle’ into numerical form.

You may like to read:

- Training Neural Network in TensorFlow
- Build Artificial Neural Network in Tensorflow
- How to Compile Neural Network in Tensorflow

I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I have been working with Python, machine learning, and artificial intelligence for the last 5 years. During this time, I have gained expertise in various Python libraries such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, TensorFlow, SciPy, and Scikit-Learn, working for clients in the United States, Canada, the United Kingdom, Australia, New Zealand, and other countries.