Build Your First Neural Network in TensorFlow

Neural networks may sound intimidating the first time you hear about them, but once you break them down into inputs, hidden layers, and outputs, they become surprisingly approachable.

In this tutorial, you’ll learn exactly how to build your first neural network in TensorFlow, a powerful library for machine learning and deep learning. By the end, you’ll have a working model trained to recognize handwritten digits, and you’ll understand the steps behind it.

What is a Neural Network?

At its core, a neural network is a mathematical system inspired by the way our brains work. Just like the neurons in your brain are connected and pass signals, artificial neural networks are built of simple units called neurons, organized into layers.

  • Input Layer: This is where data enters the network. For example, an image of a handwritten digit will be represented as numbers (pixels).
  • Hidden Layers: These layers apply transformations to the input. Multiple layers stacked together allow the network to learn complex patterns.
  • Output Layer: This is where predictions are made. For digit recognition, the output layer has 10 nodes, one for each digit (0–9).

Each connection between neurons has weights and biases, which the network adjusts during training to minimize errors. Activation functions like ReLU (Rectified Linear Unit) or softmax determine how signals flow and allow for more complex decision-making.

Think of it this way: you give the input layer a blurry picture of a handwritten “5,” the hidden layers work out the edges, curves, and details, and the output layer tells you, “I think this is a 5.”
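The mechanics of a single layer are just a matrix multiply, a bias, and an activation. Here is a minimal NumPy sketch of that idea; the sizes and values are made up purely for illustration:

```python
import numpy as np

def relu(x):
    # ReLU keeps positive values and zeroes out negatives
    return np.maximum(0, x)

# A toy "hidden layer": 4 inputs feeding 3 neurons
rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 3))    # one weight per connection
biases = np.zeros(3)                 # one bias per neuron

x = np.array([0.5, -1.2, 3.0, 0.0])  # a fake input vector
hidden = relu(x @ weights + biases)  # weighted sum, then activation

print(hidden.shape)  # (3,) - one output per neuron
```

Training is the process of nudging those weights and biases so the final outputs get closer to the correct answers.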

Why TensorFlow?

TensorFlow, developed by Google, is one of the most widely used frameworks for deep learning. It provides powerful tools for building, training, and deploying machine learning models, while still being accessible to beginners.

Using the Keras API inside TensorFlow, you can build models in just a few lines of code. TensorFlow also supports GPU acceleration, making it efficient when you scale to larger datasets. Many production-grade models run on TensorFlow, which also makes it future-proof as you progress from small projects to real-world applications.

Set Up Your Environment

Before we start coding, let’s make sure everything is ready. You’ll need:

  • Python 3.8+ installed
  • Jupyter Notebook or Google Colab for an interactive environment
  • TensorFlow package

Install TensorFlow with pip:

pip install tensorflow

Once installed, open a Python shell or notebook and confirm the version:

import tensorflow as tf
print(tf.__version__)

If you see a version number without errors, you’re good to go.
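Optionally, you can also check whether TensorFlow can see a GPU. An empty list just means you'll run on the CPU, which is perfectly fine for this tutorial:

```python
import tensorflow as tf

# Lists any GPUs TensorFlow detects; an empty list means CPU-only
print(tf.config.list_physical_devices('GPU'))
```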

Explore the MNIST Dataset

For our first neural network, we’ll use the classic MNIST dataset. It’s a collection of 70,000 grayscale images of handwritten digits (0 to 9), each 28×28 pixels.

Why MNIST? Because it’s small, simple, well-studied, and still useful for demonstrating the principles of neural networks.

Let’s load it:

from tensorflow.keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
print("Training set shape:", x_train.shape)
print("Test set shape:", x_test.shape)

The output shows the shapes of the two splits:

Training set shape: (60000, 28, 28)
Test set shape: (10000, 28, 28)

So x_train contains 60,000 images for training, and x_test contains 10,000 images for testing.
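Before preprocessing, it can help to look at one sample. This optional sketch uses matplotlib (install it with pip install matplotlib if needed):

```python
import matplotlib.pyplot as plt
from tensorflow.keras.datasets import mnist

(x_train, y_train), _ = mnist.load_data()

# Each image is a 28x28 array of integers from 0 (black) to 255 (white)
plt.imshow(x_train[0], cmap='gray')
plt.title(f"Label: {y_train[0]}")
plt.show()
```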

Preprocessing the Data

Neural networks are sensitive to input ranges, so data needs preprocessing.

  1. Flatten the images: Convert each 28×28 image into a 784-dimensional vector.
  2. Normalize pixel values: Convert values from 0–255 into the 0–1 range.
  3. Convert labels: Transform the labels (0–9) into categorical (one-hot) format.

x_train = x_train.reshape(-1, 28*28).astype("float32") / 255
x_test = x_test.reshape(-1, 28*28).astype("float32") / 255

from tensorflow.keras.utils import to_categorical
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

Now each image is a vector, scaled properly, and the labels are ready for classification.
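A quick, optional sanity check confirms the preprocessing did what we expect (this sketch reloads and preprocesses the training split so it runs on its own):

```python
import numpy as np
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

(x_train, y_train), _ = mnist.load_data()
x_train = x_train.reshape(-1, 28*28).astype("float32") / 255
y_train = to_categorical(y_train, 10)

assert x_train.shape == (60000, 784)                  # flattened
assert x_train.min() == 0.0 and x_train.max() == 1.0  # normalized
assert y_train.shape == (60000, 10)                   # one-hot labels
assert np.all(y_train.sum(axis=1) == 1)               # exactly one class per label
print("Preprocessing looks correct")
```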

Build the Neural Network

Now comes the exciting part: building the model. We’ll use the Sequential API in Keras, which allows us to stack layers easily.

Our architecture:

  • Input: 784 neurons (flattened image vector)
  • Hidden Layer: 128 neurons, ReLU activation
  • Output Layer: 10 neurons, softmax activation

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(128, activation='relu', input_shape=(784,)),
    Dense(10, activation='softmax')
])

  • ReLU is used in the hidden layer because it introduces non-linearity while staying cheap to compute.
  • Softmax in the output layer turns the raw scores into a probability distribution over the 10 classes.
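You can verify the architecture with model.summary(). The parameter counts follow directly from the layer sizes: the hidden layer has 784 × 128 weights plus 128 biases (100,480), and the output layer has 128 × 10 weights plus 10 biases (1,290), for 101,770 trainable parameters in total:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(128, activation='relu', input_shape=(784,)),
    Dense(10, activation='softmax')
])

model.summary()              # prints one row per layer with its output shape
print(model.count_params())  # 101770
```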

Compiling the Model

Before training, we need to tell the model how it should learn. This means choosing:

  • Optimizer: controls how weights update (we’ll use Adam).
  • Loss function: measures how far predictions are from correct labels (categorical cross-entropy for multi-class classification).
  • Metrics: what to track while training (accuracy).

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
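The string shortcuts ('adam', 'categorical_crossentropy') use default settings. If you later want control over something like the learning rate, the equivalent explicit form looks like this; 0.001 is Adam's default, shown here only as a starting point:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.losses import CategoricalCrossentropy

model = Sequential([
    Dense(128, activation='relu', input_shape=(784,)),
    Dense(10, activation='softmax')
])

# Equivalent to the string shortcuts, with the learning rate exposed
model.compile(optimizer=Adam(learning_rate=0.001),
              loss=CategoricalCrossentropy(),
              metrics=['accuracy'])
```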

Training the Model

Training a model means feeding data into the network, adjusting weights, and gradually reducing the error.

We’ll train for 10 epochs with a batch size of 32, reserving 20% of training data for validation.

history = model.fit(x_train, y_train,
                    epochs=10,
                    batch_size=32,
                    validation_split=0.2)


During training, you’ll see loss and accuracy outputs. Training accuracy should increase, while validation accuracy will stabilize. A typical run with MNIST reaches over 97% validation accuracy.
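To see those curves rather than read them from the log, you can plot history.history. The sketch below trains on a small slice of the data for a few epochs just so it runs quickly; with the full fit() call above, you would plot the returned history the same way:

```python
import matplotlib.pyplot as plt
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical

(x_train, y_train), _ = mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255
y_train = to_categorical(y_train, 10)

model = Sequential([
    Dense(128, activation='relu', input_shape=(784,)),
    Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])

# A short run on a slice of the data, just to get curves to plot
history = model.fit(x_train[:5000], y_train[:5000],
                    epochs=3, batch_size=32,
                    validation_split=0.2, verbose=0)

plt.plot(history.history['accuracy'], label='train')
plt.plot(history.history['val_accuracy'], label='validation')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
```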

Evaluating Model Performance

After training, we evaluate on unseen test data to see how well the model generalizes.

test_loss, test_acc = model.evaluate(x_test, y_test)
print("Test Accuracy:", test_acc)


You should expect accuracy close to 97–98%. If it’s much lower, that’s a sign you may need more training or different hyperparameters.

Making Predictions with the Model

Now let’s predict some results. We’ll use the first 5 test samples:

import numpy as np

predictions = model.predict(x_test[:5])
predicted_labels = np.argmax(predictions, axis=1)

print("Predicted:", predicted_labels)
print("Actual:", np.argmax(y_test[:5], axis=1))


The output shows the predicted labels next to the actual ones; after training, most of them should match.
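predict returns the full softmax distribution for each sample, so you can inspect the model's confidence as well as the winning class. This sketch uses a fresh, untrained model and random inputs purely to show the shapes; a trained model behaves the same way:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(128, activation='relu', input_shape=(784,)),
    Dense(10, activation='softmax')
])

fake_batch = np.random.rand(5, 784).astype("float32")  # stand-in for x_test[:5]
probs = model.predict(fake_batch, verbose=0)

print(probs.shape)               # (5, 10): one probability per class
print(np.argmax(probs, axis=1))  # the predicted digit for each sample
print(probs.sum(axis=1))         # each row sums to ~1.0 (softmax)
```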

Improve Your Neural Network

Our simple model already performs very well, but there are ways to improve:

  • Add more hidden layers: Deeper networks capture more patterns.
  • Use regularization: Dropout can help prevent overfitting.
  • Tweak hyperparameters: Experiment with learning rates, batch sizes, or number of epochs.
  • Try other datasets: Move beyond MNIST to more complex sets like CIFAR-10.

For example, here’s how you add dropout:

from tensorflow.keras.layers import Dropout

model = Sequential([
    Dense(128, activation='relu', input_shape=(784,)),
    Dropout(0.2),
    Dense(10, activation='softmax')
])

The dropout layer randomly ignores 20% of neurons during training, forcing the network to generalize better.
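Another common safeguard is early stopping, which halts training once validation loss stops improving. A minimal sketch of the callback (the patience value here is an arbitrary choice):

```python
from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='val_loss',
                           patience=2,                 # stop after 2 epochs without improvement
                           restore_best_weights=True)  # roll back to the best epoch

# Then pass it to fit():
# model.fit(x_train, y_train, epochs=50,
#           validation_split=0.2, callbacks=[early_stop])
```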

Common Issues and Debugging

While working with TensorFlow, you may run into common issues:

  • Shape errors: Always check that inputs are reshaped correctly. MNIST starts as (28,28) but must be flattened to (784).
  • Normalization missing: If your accuracy is stuck, you may have forgotten to scale pixel values.
  • Overfitting: high training accuracy but low test accuracy; use dropout or early stopping.
  • Wrong activation: Softmax should be in the output, not hidden layers.

Understanding these will save you hours of frustration.
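A cheap habit that catches the first two issues early is asserting the expected shapes and ranges right before calling fit(). This sketch uses stand-in arrays in place of the real data:

```python
import numpy as np

x_train = np.zeros((60000, 784), dtype="float32")  # stand-in for the real data
y_train = np.zeros((60000, 10), dtype="float32")

# Fail fast with a clear message instead of a cryptic shape error mid-training
assert x_train.ndim == 2 and x_train.shape[1] == 784, "flatten images to (n, 784) first"
assert y_train.shape[1] == 10, "one-hot encode labels to (n, 10) first"
assert x_train.max() <= 1.0, "normalize pixel values to [0, 1]"
print("shapes look good")
```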

Summary and Next Steps

We’ve successfully built our first neural network in TensorFlow! Let’s recap the steps:

  1. Loaded and explored the MNIST dataset.
  2. Preprocessed images and labels.
  3. Defined a sequential model with one hidden layer.
  4. Compiled with Adam optimizer and categorical cross-entropy loss.
  5. Trained the model for 10 epochs.
  6. Evaluated on test data with ~97% accuracy.
  7. Made predictions on sample inputs.

This is your first hands-on introduction, and you’ve already seen how simple it is to get started with TensorFlow. But this is just the beginning. From here, you can experiment with:

  • Convolutional Neural Networks (CNNs) for image classification.
  • Recurrent Neural Networks (RNNs) for sequential data like text and time series.
  • Transfer Learning using pre-trained models.
  • Deployment in real-world applications.
