Training a Neural Network in TensorFlow

Training a neural network in TensorFlow is an essential step: without training, the model has not learned anything from the data and cannot make useful predictions.

So, in this TensorFlow tutorial, I will explain what model training means in TensorFlow, how to train a model by calling the fit() method on it, and what arguments you can pass to fit().

Finally, I will walk through an example in which you create a TensorFlow model from scratch, train it, and make predictions on new data. This tutorial is beginner-level, but more complex TensorFlow models work on the same concepts.

Training a Neural Network in TensorFlow

What does training a neural network in TensorFlow mean?

First, you have a dataset, which can be about anything: tabular data points, images, audio, video, etc.

This data is normalized and prepared in a form that can be fed to the neural network model. The model takes this data and learns from it by finding the patterns in it.

To learn these patterns, the neural network has parameters (weights and biases) that are adjusted based on the dataset provided to it.

The main goal of training is to find the best parameter values so that the network can make correct predictions on new or unseen data. If you are unfamiliar with the weight and bias parameters, I suggest visiting the article How to Build Perceptron in Python.

Training is an iterative process in which the model learns from the examples in the training dataset to improve its ability to make correct predictions on similar or new data (the testing dataset).

Internally, the model uses different learning techniques (the optimizer and the loss function) to find the best parameter values, and the article How to Compile Neural Network in TensorFlow explains how to specify these techniques.

So, after specifying these techniques, we need to train the neural network model in TensorFlow. Here is how you do that.

To train the model, you call the fit() method on the model, passing the dataset and the training parameters.

The syntax is given below.

model_name.fit(x_train_data, y_train_data, epochs=10, batch_size=32, validation_split=0.2)

Let’s understand how this fit() function works.

First, you compile a neural network model; then, to train it on the dataset, you call the fit() method on the compiled model.


The fit() method accepts some arguments, such as:

x_train_data: This is the input data, the features of the training examples; it is a NumPy array or a TensorFlow tensor. Each row represents a data point, and each column represents a feature.

y_train_data: This is the corresponding set of labels for the training data (x_train_data). It should also be a NumPy array or a TensorFlow tensor.

So you provide the dataset to the model as x_train_data and y_train_data, where x_train_data contains the features and y_train_data contains the labels. For example, suppose your model classifies whether a given fruit is a mango.

So here, your dataset will be as shown below.

dataset = {
    'Weight': [150, 200, 120, 180, 100],
    'Color (0 = Green, 1 = Yellow)': [0, 1, 0, 1, 0],
    'Label (1 = Mango, 0 = Not Mango)': [0, 1, 0, 1, 0]
}

In the above dataset, x_train_data holds the feature columns Weight and Color, and y_train_data holds the Label column, i.e. the label for each data point.

As you can see, the dataset contains 5 data points. For example, look at the data point where the weight is 180 and the colour is 1; its label is 1, so when the model sees this data point, it learns that such a fruit is a mango.

During training, the model learns from these features and their corresponding labels.
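To make this concrete, here is a minimal sketch (using the illustrative mango dictionary above) of how you could split that dataset into x_train_data and y_train_data with NumPy:

import numpy as np

# Illustrative mango dataset from above
dataset = {
    'Weight': [150, 200, 120, 180, 100],
    'Color (0 = Green, 1 = Yellow)': [0, 1, 0, 1, 0],
    'Label (1 = Mango, 0 = Not Mango)': [0, 1, 0, 1, 0]
}

# Features: each row is one data point, each column is one feature (Weight, Color)
x_train_data = np.array(list(zip(dataset['Weight'],
                                 dataset['Color (0 = Green, 1 = Yellow)'])))

# Labels: one value per data point
y_train_data = np.array(dataset['Label (1 = Mango, 0 = Not Mango)'])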

epochs: One epoch is a complete pass through the entire training dataset. For example, if the training dataset contains 5 data points and you specify epochs=2, the model iterates over the entire training dataset 2 times during training.

batch_size: Within each epoch, the training dataset is divided into batches; if you specify batch_size=10, each batch contains 10 data points (a subset of the dataset).

For example, if you have a training dataset containing 50 data points and specify batch_size=5, the data is split into 10 batches, each containing 5 data points.

In each epoch, the model goes through these batches one by one and updates its parameter values after each batch.
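As a quick sanity check (a tiny sketch, not required for training), you can compute the number of batches per epoch yourself for the 50-data-point example above:

import math

num_data_points = 50
batch_size = 5

# Number of batches (training steps) the model processes in each epoch
batches_per_epoch = math.ceil(num_data_points / batch_size)
print(batches_per_epoch)  # 10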

validation_split: This sets aside a small part of the training dataset that the model does not train on. Its purpose is to monitor, during training, how well the model performs on data it has not seen.

So if you specify validation_split=0.3, 30% of the training data is held out for validation, to see how well the model generalizes to data it hasn't seen.


The epochs, batch_size, and validation_split values are adjusted to get optimal training; choosing them is part of hyperparameter tuning.
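Putting these arguments together, a training call might look like the sketch below; it assumes you already have a compiled Keras model named model and the NumPy arrays x_train_data and y_train_data described above:

model.fit(
    x_train_data,
    y_train_data,
    epochs=10,            # 10 complete passes over the training data
    batch_size=32,        # 32 data points per batch
    validation_split=0.2  # hold out 20% of the training data for validation
)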

After knowing the above, you are ready to train your neural network model.

Creating and Training a Neural Network in TensorFlow

For example, suppose you need to create a model to classify whether a fruit is a banana or an orange.

So you have collected a dataset about both fruits, and this dataset contains three columns: weight, colour, and label.

Weight | Colour (1 = mix of red and yellow, 0 = yellow) | Label (1 = orange, 0 = banana)
150    | 1                                              | 1
200    | 0                                              | 0
100    | 1                                              | 1
250    | 1                                              | 1
130    | 0                                              | 0
140    | 0                                              | 0

Here, if the fruit's colour is 1, meaning a mix of red and yellow, it is an orange and is labelled 1. If the fruit's colour is 0, meaning yellow, it is a banana and is labelled 0.

Now, we need to convert the above dataset into a format that can be fed to the model.

So, create numerical data using NumPy arrays, because the model works on numeric data. As shown below, convert the fruit data into NumPy arrays.

First, import the necessary libraries that we will use.

import tensorflow as tf
import numpy as np

Create the features for the model. We have two features, weight and colour, in our dataset, so convert those feature values into a NumPy array, as shown below.

x_train_data = np.array([
    [150, 1],  # each row is [weight, colour]
    [200, 0],
    [100, 1],
    [250, 1],
    [130, 0],
    [140, 0],
])

Also, we have the corresponding label for each data point in x_train_data, so convert those into a NumPy array, as shown below.

y_train_data = np.array([1, 0, 1, 1, 0, 0])
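Before building the model, it is worth confirming that the two arrays line up; this optional check should show 6 data points with 2 features each and 6 matching labels:

print(x_train_data.shape)  # (6, 2) -> 6 data points, each with 2 features (weight, colour)
print(y_train_data.shape)  # (6,)   -> one label per data point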

Next, define the sequential model using the Sequential() function as shown below.

model = tf.keras.models.Sequential()

Create the input, hidden, and output layers using the Dense() function, as shown below.

input_layer = tf.keras.layers.Dense(units=2, activation='relu', input_shape=(2,))
hidden_layer = tf.keras.layers.Dense(units=10, activation='relu')
output_layer = tf.keras.layers.Dense(units=1, activation='sigmoid')  # sigmoid for a binary (0/1) output

Add the above layers to the model using the add() method, as shown below.

model.add(input_layer)
model.add(hidden_layer)
model.add(output_layer)
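Optionally, you can inspect the architecture you just stacked together; model.summary() prints each layer along with its output shape and parameter count. (As an aside, you could also pass the three layers directly to Sequential([...]) instead of calling add() three times.)

model.summary()  # prints the three Dense layers with their output shapes and trainable parameters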

You have created a model. Now configure it for training by compiling it with an optimizer, a loss function, and metrics using the compile() method, as shown below.

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

After compiling the model, let's train it on x_train_data and y_train_data with epochs=100 and validation_split=0.2.

model.fit(x_train_data, y_train_data, epochs=100, validation_split=0.2)

As the output shows, the model is trained on the dataset for 100 epochs. After training, you can use the model to make predictions.
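As a small optional extension, fit() also returns a History object; if you capture it, you can inspect how the loss and accuracy changed over the epochs (the keys below assume the model was compiled with metrics=['accuracy'] and trained with a validation_split, as in this example):

history = model.fit(x_train_data, y_train_data, epochs=100, validation_split=0.2)

# history.history is a dictionary of per-epoch metrics
print(history.history.keys())       # e.g. dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])
print(history.history['loss'][-1])  # training loss after the final epoch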


First, evaluate the model performance using the code below.

loss, accuracy = model.evaluate(x_train_data, y_train_data)
print(f"Loss: {loss}, Accuracy: {accuracy}")

The above code calls the evaluate() method on the model with x_train_data and y_train_data; it returns two values, loss and accuracy, which indicate the model's performance.

Now, use the model to predict new fruit data, as shown below.

new_fruit = np.array([[150, 1]])  # weight 150, colour 1 (mix of red and yellow)
prediction = model.predict(new_fruit)
print("Predicted class:", "Orange" if prediction[0][0] > 0.5 else "Banana")

When I evaluated the model on this run, the accuracy was 0.5, which means 50%, and the loss was about 7.7, which is quite high (your exact numbers will vary from run to run). That indicates our model needs more training, but we have very little data here, so the model cannot learn properly, which is why the loss is high.

In the real world, a large amount of varied data is fed to a neural network model, which learns from that dataset. The goal is to minimise the loss and increase the accuracy so the model can make correct predictions on new, similar data.

Also, we tested the model with a new_fruit value of (150, 1), and you can see it predicted orange. Although the model is not well trained, the point of this tutorial is to show you how to train a neural network model.
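Note that predict() also accepts several data points at once, so you could classify a batch of hypothetical new fruits in a single call, as in this sketch (the fruit values here are made up for illustration):

# Hypothetical new fruits: [weight, colour]
new_fruits = np.array([
    [220, 1],  # colour 1 -> labelled orange in the training data
    [120, 0],  # colour 0 -> labelled banana in the training data
])

predictions = model.predict(new_fruits)
for fruit, p in zip(new_fruits, predictions):
    print(fruit, "->", "Orange" if p[0] > 0.5 else "Banana")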

Also, remember that you can fine-tune hyperparameters such as epochs and batch_size, trying different values to increase the accuracy and decrease the loss.

When training a model in the real world, always provide enough data to the model, and aim to minimise the loss and maximise the accuracy; these are the metrics by which we measure the model's performance.

Also, you can compile the model with different optimizers, losses, and metrics.
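For example, as one hedged illustration, you could recompile the same model with a plain SGD optimizer and an explicit learning rate before training it again (the values here are arbitrary choices, not recommendations):

model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),  # instead of 'adam'
    loss='binary_crossentropy',
    metrics=['accuracy']
)
model.fit(x_train_data, y_train_data, epochs=100, validation_split=0.2)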

Conclusion

You learned how to train a model in TensorFlow using the fit() method, and how to pass the dataset to the model as x_train_data and y_train_data.

Then, you learned how to divide the dataset into subsets using the batch_size parameter.
