As I was building a neural network to predict housing prices across different U.S. states, I hit a roadblock. My model couldn’t make sense of the state names in my dataset. That’s when I remembered: neural networks don’t understand categorical data like “California” or “Texas” directly.
This is where one-hot encoding comes to the rescue.
In my decade-plus of Python development, I’ve found that properly encoding categorical variables can make or break a machine learning model. TensorFlow’s one-hot encoding functionality provides an elegant solution to this common problem.
Let me show you how to implement one-hot encoding in TensorFlow based on my hands-on experience.
One-Hot Encoding
One-hot encoding transforms categorical variables into a format that works better with machine learning algorithms. It creates binary columns for each category, where only one column has a value of 1 (hot) and the rest are 0 (cold).
For example, if we have U.S. states like New York, California, and Texas, one-hot encoding would create:
New York = [1, 0, 0]
California = [0, 1, 0]
Texas = [0, 0, 1]

This numerical representation preserves the categorical information without implying any ordinal relationship.
TensorFlow’s tf.one_hot() Function
TensorFlow provides the tf.one_hot() function in Python for one-hot encoding. Let’s break down how it works:
import tensorflow as tf
# Basic syntax
tf.one_hot(indices, depth, on_value=None, off_value=None, axis=None, dtype=None, name=None)

The key parameters are:

- indices: The indices to be encoded
- depth: The length of each one-hot vector
- on_value: The value to use for the "hot" position (defaults to 1)
- off_value: The value to use for the "cold" positions (defaults to 0)
- axis: The axis along which the one-hot dimension is placed (defaults to the last axis)
Method 1: Basic One-Hot Encoding
Let’s start with a simple example. Imagine we’re encoding the top 5 most populous U.S. states:
import tensorflow as tf
# Representing California (0), Texas (1), Florida (2), New York (3), Pennsylvania (4)
state_indices = [0, 1, 2, 3, 4, 1, 0] # Some example data
# One-hot encode the states
encoded_states = tf.one_hot(state_indices, depth=5)
print(encoded_states.numpy())

Output:
[[1. 0. 0. 0. 0.]
[0. 1. 0. 0. 0.]
[0. 0. 1. 0. 0.]
[0. 0. 0. 1. 0.]
[0. 0. 0. 0. 1.]
[0. 1. 0. 0. 0.]
[1. 0. 0. 0. 0.]]
I’ve found this approach works perfectly for small datasets with a fixed number of categories.
Method 2: Custom Values for One-Hot Encoding
Sometimes, you might want values other than 0 and 1. For instance, in a weighted model:
import tensorflow as tf
# U.S. regions (Northeast=0, Midwest=1, South=2, West=3)
region_indices = [0, 2, 3, 1]
# One-hot encode with custom values
encoded_regions = tf.one_hot(
    region_indices,
    depth=4,
    on_value=5.0,   # Use 5.0 for the "hot" position
    off_value=-1.0  # Use -1.0 for the "cold" positions
)
print(encoded_regions.numpy())

Output:
[[ 5. -1. -1. -1.]
[-1. -1. 5. -1.]
[-1. -1. -1. 5.]
[-1. 5. -1. -1.]]
I’ve used this technique when certain categories need more emphasis in my models.
Method 3: Handle Multi-dimensional Input
When working with more complex data structures like U.S. election results by state and year:
import tensorflow as tf
# Multi-dimensional input (2x3 matrix)
# Representing voting patterns across states and years
election_data = [
    [0, 1, 2],
    [2, 0, 1]
]
# One-hot encode the data
encoded_election = tf.one_hot(election_data, depth=3)
print(encoded_election.numpy())

Output:
[[[1. 0. 0.]
[0. 1. 0.]
[0. 0. 1.]]
[[0. 0. 1.]
[1. 0. 0.]
[0. 1. 0.]]]
This creates a 3D tensor where the last dimension represents the one-hot vectors. I frequently use this approach for time-series data across multiple categories.
Method 4: Control the Axis of Expansion
By default, TensorFlow adds the one-hot dimension as the last axis, but you can change this:
import tensorflow as tf
# U.S. cities population ranks
city_ranks = [3, 1, 0, 2] # Representing Chicago, Los Angeles, New York, Houston
# One-hot encode with axis=0 (first dimension)
encoded_cities = tf.one_hot(city_ranks, depth=4, axis=0)
print(encoded_cities.numpy())

Output:
[[0. 0. 1. 0.]
[0. 1. 0. 0.]
[1. 0. 0. 0.]
[0. 0. 0. 1.]]

I find this parameter particularly useful when I need to maintain a specific tensor structure for subsequent operations.
Method 5: One-Hot Encoding in a TensorFlow Model Pipeline
One-hot encoding is often part of a larger machine learning pipeline. Here’s how I integrate it into a model:
import tensorflow as tf
# Create a simple model to predict U.S. housing prices based on state and property type
def create_model():
    # Input for state index (0-49 for 50 states)
    state_input = tf.keras.layers.Input(shape=(1,), name="state")

    # One-hot encode the state inside the model; wrapping the op in a
    # Lambda layer keeps it compatible with the Keras functional API
    state_encoded = tf.keras.layers.Lambda(
        lambda x: tf.one_hot(tf.cast(x, tf.int32), depth=50)
    )(state_input)
    state_flattened = tf.keras.layers.Flatten()(state_encoded)

    # Input for property features
    property_input = tf.keras.layers.Input(shape=(10,), name="property_features")

    # Combine inputs
    combined = tf.keras.layers.Concatenate()([state_flattened, property_input])

    # Rest of the model
    hidden = tf.keras.layers.Dense(64, activation='relu')(combined)
    output = tf.keras.layers.Dense(1, name="price")(hidden)

    model = tf.keras.Model(
        inputs=[state_input, property_input],
        outputs=output
    )
    return model

# Create and compile the model
model = create_model()
model.compile(optimizer='adam', loss='mse')

# Model summary
model.summary()

Embedding the one-hot encoding directly in the model ensures consistency between training and inference.
Common Issues and How to Avoid Them
Through my years of experience, I’ve encountered several issues with one-hot encoding:
- Large category sets: When dealing with zip codes across the U.S. (>40,000), one-hot encoding becomes inefficient. In these cases, I recommend using embedding layers instead.
- Forgetting to convert string labels to indices first: TensorFlow’s one_hot requires numeric indices. Always convert your string categories to numbers first.
- Out-of-range indices: If an index is negative or greater than or equal to the specified depth, TensorFlow silently encodes it as a vector of all off_value (all zeros by default), which can mask data errors. Always ensure your indices fall within [0, depth).
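The string-to-index conversion mentioned above can be sketched with Keras's StringLookup layer. This is a minimal example; the state names and vocabulary here are illustrative:

```python
import tensorflow as tf

# Hypothetical string labels to encode
states = ["California", "Texas", "Florida", "California"]

# StringLookup maps strings to integer indices; num_oov_indices=0
# assumes every label already appears in the vocabulary
lookup = tf.keras.layers.StringLookup(
    vocabulary=["California", "Texas", "Florida"], num_oov_indices=0
)

indices = lookup(tf.constant(states))   # [0 1 2 0]
encoded = tf.one_hot(indices, depth=3)

print(indices.numpy())
print(encoded.numpy())
```

With num_oov_indices=0, vocabulary terms map to indices starting at 0, so the indices line up directly with the depth you pass to tf.one_hot.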
Performance Considerations
For large datasets with many categories (like all U.S. counties), one-hot encoding can be memory-intensive. In such cases:
- Consider using TensorFlow’s sparse tensors
- Use tf.data pipelines to perform encoding on-the-fly
- Use dimensionality reduction techniques before encoding
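The on-the-fly approach can be sketched with a tf.data pipeline. This is a toy example with in-memory indices; in practice the dataset would stream from disk:

```python
import tensorflow as tf

# Toy category indices
indices = tf.data.Dataset.from_tensor_slices([0, 2, 1, 3])

# map() applies one-hot lazily per element, so the full encoded
# matrix is never materialized in memory at once
encoded = indices.map(lambda i: tf.one_hot(i, depth=4))

for vec in encoded.take(2):
    print(vec.numpy())
```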
One-hot encoding is a fundamental technique in my machine learning toolbox. It transforms categorical data like U.S. states, product categories, or customer segments into a format that neural networks can process effectively.
TensorFlow’s implementation is flexible and integrates seamlessly with the rest of the ecosystem. By understanding the various parameters and use cases, you can handle a wide range of categorical data scenarios.
Remember that one-hot encoding is just one approach to handling categorical data. For very high-cardinality features, consider alternatives like embedding layers or feature hashing.
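As a sketch of the embedding alternative, the layer below maps each of 40,000 hypothetical zip-code indices to a dense 16-dimensional vector instead of a 40,000-wide one-hot vector (the sizes here are illustrative):

```python
import tensorflow as tf

# Embedding layer: 40,000 possible categories, 16-dimensional output
embedding = tf.keras.layers.Embedding(input_dim=40000, output_dim=16)

zip_indices = tf.constant([102, 9021, 37000])
vectors = embedding(zip_indices)

print(vectors.shape)  # (3, 16)
```

Unlike one-hot vectors, these dense vectors are learned during training, so similar categories can end up with similar representations.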
I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I have been working on Python, machine learning, and artificial intelligence for the last five years. During this time I have gained expertise in various Python libraries such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, TensorFlow, SciPy, and Scikit-Learn, serving clients in the United States, Canada, the United Kingdom, Australia, New Zealand, and other countries. Check out my profile.