Recently, I was working on an image recognition project where I needed to classify thousands of product images for an e-commerce client. Traditional machine learning approaches were falling short, and I needed something more powerful. That’s when I turned to Convolutional Neural Networks (CNNs) with TensorFlow.
In this article, I’ll share everything I’ve learned about implementing CNNs with TensorFlow, from the basics to advanced techniques. We’ll walk through practical examples that you can apply to your projects right away.
Let’s get started!
Convolutional Neural Network
CNNs are specialized neural networks designed primarily for processing grid-like data, such as images. Unlike regular neural networks, CNNs use a mathematical operation called convolution that helps them detect patterns regardless of where they appear in an image.
Think of CNNs as pattern detectors that can identify features like edges, textures, and shapes, and then combine these features to recognize complex objects like faces, cars, or text.
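To make this concrete, here is a minimal NumPy sketch (the tiny image and filter values are made up for illustration) showing how a single 3×3 filter responds wherever a vertical edge appears, regardless of its position:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image ('valid' padding, stride 1)."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# A toy image: dark left half, bright right half (a vertical edge)
image = np.array([
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
], dtype=float)

# A Sobel-like kernel that responds to vertical edges
kernel = np.array([
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
], dtype=float)

response = conv2d_valid(image, kernel)
print(response)  # strongest responses exactly where the edge sits
```

A `Conv2D` layer works the same way, except that it learns many such kernels from data instead of using hand-crafted ones.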
Set Up Your TensorFlow Environment
Before we build our CNN, let’s make sure we have everything set up correctly:
# Install required packages
# pip install tensorflow numpy matplotlib
import tensorflow as tf
from tensorflow.keras import layers, models
import numpy as np
import matplotlib.pyplot as plt
# Check TensorFlow version
print(f"TensorFlow version: {tf.__version__}")
# Check for GPU availability
print(f"GPU available: {tf.config.list_physical_devices('GPU')}")
Build a Basic CNN in TensorFlow
Let’s start with a basic CNN architecture for image classification:
import tensorflow as tf
from tensorflow.keras import layers, models
# Define the CNN model
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')  # 10 classes for CIFAR-10
])
# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# Model summary
model.summary()
# Train the model
history = model.fit(train_images, train_labels, epochs=10,
                    validation_data=(test_images, test_labels),
                    batch_size=64)
# Evaluate on test data
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
print(f"\nTest accuracy: {test_acc:.4f}")

This simple architecture includes three convolutional layers, each followed by a max pooling layer, and then flattens the output before passing it through fully connected layers.
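You can check the shapes this architecture produces by hand: with 'valid' padding, a k×k convolution maps an n×n input to (n − k + 1)×(n − k + 1), and a 2×2 max pool floor-divides the size by 2. A quick sketch of that arithmetic for the model above:

```python
def conv_out(n, k, stride=1):
    """Spatial size after a 'valid' convolution."""
    return (n - k) // stride + 1

def pool_out(n, pool=2):
    """Spatial size after max pooling (floor division)."""
    return n // pool

n = 32                        # CIFAR-10 images are 32x32
n = pool_out(conv_out(n, 3))  # Conv2D(32, 3) -> 30, pool -> 15
n = pool_out(conv_out(n, 3))  # Conv2D(64, 3) -> 13, pool -> 6
n = pool_out(conv_out(n, 3))  # Conv2D(64, 3) -> 4,  pool -> 2
print(n, n * n * 64)  # 2 256 -> Flatten sees 2*2*64 = 256 values
```

This is why the Flatten layer hands only 256 values to the dense layers, even though the input has 32×32×3 = 3,072.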
Load and Preprocess Your Data
For a practical example, let’s use the CIFAR-10 dataset, which contains 60,000 32×32 color images in 10 different classes:
# Load CIFAR-10 dataset
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.cifar10.load_data()
# Normalize pixel values to be between 0 and 1
train_images = train_images.astype('float32') / 255.0
test_images = test_images.astype('float32') / 255.0
# Convert labels to one-hot encoding
train_labels = tf.keras.utils.to_categorical(train_labels, 10)
test_labels = tf.keras.utils.to_categorical(test_labels, 10)
# Display sample images
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']
plt.figure(figsize=(10, 10))
for i in range(25):
    plt.subplot(5, 5, i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i])
    plt.xlabel(class_names[np.argmax(train_labels[i])])
plt.show()

Train Your CNN Model
Now let’s train our CNN model on the CIFAR-10 dataset:
# Reuse the basic CNN model defined in the earlier section
# Add data augmentation to prevent overfitting
data_augmentation = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
])
# Create a new model with data augmentation
augmented_model = models.Sequential([
    data_augmentation,
    model
])
# The wrapping model must be compiled before training
augmented_model.compile(optimizer='adam',
                        loss='categorical_crossentropy',
                        metrics=['accuracy'])
# Train the model
history = augmented_model.fit(
    train_images, train_labels,
    epochs=20,
    batch_size=64,
    validation_split=0.2,
    callbacks=[
        tf.keras.callbacks.EarlyStopping(patience=3, restore_best_weights=True)
    ]
)
# Evaluate the model
test_loss, test_acc = augmented_model.evaluate(test_images, test_labels)
print(f"Test accuracy: {test_acc:.4f}")
Visualize Training Progress
It’s important to visualize how our model is learning over time:
# Plot training & validation accuracy
plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
# Plot training & validation loss
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()
Advanced CNN Architectures
While our basic CNN works well for simple tasks, more complex problems benefit from advanced architectures. Let’s implement a ResNet-inspired model:
def residual_block(x, filters, kernel_size=3, stride=1):
    # Shortcut connection
    shortcut = x
    # First convolution
    y = layers.Conv2D(filters, kernel_size, strides=stride, padding='same')(x)
    y = layers.BatchNormalization()(y)
    y = layers.Activation('relu')(y)
    # Second convolution
    y = layers.Conv2D(filters, kernel_size, padding='same')(y)
    y = layers.BatchNormalization()(y)
    # If dimensions change, adjust the shortcut to match
    if stride > 1 or x.shape[-1] != filters:
        shortcut = layers.Conv2D(filters, 1, strides=stride, padding='same')(x)
        shortcut = layers.BatchNormalization()(shortcut)
    # Add shortcut to output
    y = layers.add([y, shortcut])
    y = layers.Activation('relu')(y)
    return y

def build_resnet(input_shape, num_classes):
    inputs = tf.keras.Input(shape=input_shape)
    # Initial convolution
    x = layers.Conv2D(64, 7, strides=2, padding='same')(inputs)
    x = layers.BatchNormalization()(x)
    x = layers.Activation('relu')(x)
    x = layers.MaxPooling2D(3, strides=2, padding='same')(x)
    # Residual blocks
    x = residual_block(x, 64)
    x = residual_block(x, 64)
    x = residual_block(x, 128, stride=2)
    x = residual_block(x, 128)
    x = residual_block(x, 256, stride=2)
    x = residual_block(x, 256)
    # Global average pooling and final dense layer
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(num_classes, activation='softmax')(x)
    model = tf.keras.Model(inputs, x)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(1e-3),
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )
    return model
Transfer Learning with Pre-trained Models
For many real-world applications, using pre-trained models can save time and improve performance:
def build_transfer_learning_model(input_shape, num_classes):
    # Load pre-trained MobileNetV2 model
    base_model = tf.keras.applications.MobileNetV2(
        input_shape=input_shape,
        include_top=False,
        weights='imagenet'
    )
    # Freeze the base model
    base_model.trainable = False
    # Create new model on top
    inputs = tf.keras.Input(shape=input_shape)
    # Preprocess input for MobileNetV2
    x = tf.keras.applications.mobilenet_v2.preprocess_input(inputs)
    # Base model (keep in inference mode so BatchNorm stats stay frozen)
    x = base_model(x, training=False)
    # Global pooling and classification layer
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(128, activation='relu')(x)
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(num_classes, activation='softmax')(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(1e-4),
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )
    return model, base_model
Fine-tuning Pre-trained Models
After initial training, we can fine-tune the pre-trained layers for even better performance:
def fine_tune_model(model, base_model, train_images, train_labels, test_images, test_labels):
    # First, train only the top layers
    history1 = model.fit(
        train_images, train_labels,
        epochs=5,
        batch_size=32,
        validation_data=(test_images, test_labels)
    )
    # Unfreeze the base model
    base_model.trainable = True
    # Freeze early layers and train only the later ones
    for layer in base_model.layers[:100]:
        layer.trainable = False
    # Recompile the model with a lower learning rate
    model.compile(
        optimizer=tf.keras.optimizers.Adam(1e-5),  # Lower learning rate
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )
    # Fine-tune the model
    history2 = model.fit(
        train_images, train_labels,
        epochs=10,
        batch_size=32,
        validation_data=(test_images, test_labels),
        callbacks=[
            tf.keras.callbacks.EarlyStopping(patience=3, restore_best_weights=True)
        ]
    )
    return history1, history2
Save and Load Your CNN Models
Once you’ve trained a good model, you’ll want to save it for future use:
# Save the model
model.save('my_cnn_model.h5')
# Later, load the model
loaded_model = tf.keras.models.load_model('my_cnn_model.h5')
Real-world Application: Product Image Classification
Let’s consider a real-world example where we might use CNNs, classifying product images for an e-commerce website:
# Example for a hypothetical e-commerce product classifier
def build_ecommerce_classifier():
    # Load a pre-trained model without the top classification layer
    base_model = tf.keras.applications.EfficientNetB0(
        include_top=False,
        weights='imagenet',
        input_shape=(224, 224, 3)
    )
    # Freeze the base model
    base_model.trainable = False
    # Add custom classification head
    model = models.Sequential([
        base_model,
        layers.GlobalAveragePooling2D(),
        layers.Dense(256, activation='relu'),
        layers.Dropout(0.5),
        layers.Dense(128, activation='relu'),
        layers.Dropout(0.3),
        # Assuming we have 10 product categories
        layers.Dense(10, activation='softmax')
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(0.001),
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )
    return model
Make Predictions with Your CNN
Once your model is trained, you can use it to make predictions on new images:
# Function to preprocess a single image for prediction
def preprocess_image(image_path):
    img = tf.keras.preprocessing.image.load_img(
        image_path, target_size=(224, 224)
    )
    img_array = tf.keras.preprocessing.image.img_to_array(img)
    img_array = np.expand_dims(img_array, axis=0)
    img_array = img_array / 255.0  # Normalize
    return img_array

# Make a prediction
def predict_image(model, image_path, class_names):
    processed_img = preprocess_image(image_path)
    predictions = model.predict(processed_img)
    predicted_class = np.argmax(predictions[0])
    confidence = predictions[0][predicted_class]
    return {
        'class': class_names[predicted_class],
        'confidence': float(confidence),
        'top_predictions': [
            {'class': class_names[i], 'confidence': float(predictions[0][i])}
            for i in np.argsort(predictions[0])[-3:][::-1]
        ]
    }
Understand CNN Layers and Parameters
To effectively work with CNNs, it’s crucial to understand the key components:
Convolutional Layers
These layers apply filters to detect patterns in the input:
# Convolutional layer with 32 filters of size 3x3
layers.Conv2D(
    filters=32,           # Number of filters
    kernel_size=(3, 3),   # Filter size
    strides=(1, 1),       # How far the filter moves each step
    padding='same',       # Padding type ('same' or 'valid')
    activation='relu'     # Activation function
)
Pooling Layers
Pooling layers reduce the spatial dimensions of the data:
# Max pooling layer with 2x2 pool size
layers.MaxPooling2D(
    pool_size=(2, 2),  # Size of pooling window
    strides=(2, 2)     # How far the window moves each step
)
Batch Normalization
This normalizes the activations of the previous layer, which can speed up training:
# Batch normalization layer
layers.BatchNormalization()
Techniques to Improve CNN Performance
Over my years of working with CNNs, I’ve found these techniques particularly effective:
Data Augmentation
Augmenting your training data can help prevent overfitting:
# Creating an augmentation pipeline
augmentation = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.2),
    layers.RandomZoom(0.2),
    layers.RandomContrast(0.2),
    layers.RandomTranslation(0.1, 0.1)
])
# Apply augmentation during training
augmented_images = augmentation(train_images)
Dropout
Dropout randomly sets input units to 0 during training, which helps prevent overfitting:
# Add dropout after dense layers
layers.Dense(512, activation='relu'),
layers.Dropout(0.5),  # 50% dropout rate
Learning Rate Scheduling
Adjusting the learning rate during training can improve convergence:
# Learning rate scheduler
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-3,
    decay_steps=10000,
    decay_rate=0.9
)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
Deploy Your CNN Model
Once your model is trained and performing well, you’ll want to deploy it. Here’s a simple Flask API example:
from flask import Flask, request, jsonify
import tensorflow as tf
import numpy as np
from PIL import Image
import io
app = Flask(__name__)
# Load your trained model
model = tf.keras.models.load_model('my_cnn_model.h5')
class_names = ['class1', 'class2', 'class3', 'class4', 'class5']
@app.route('/predict', methods=['POST'])
def predict():
    if 'image' not in request.files:
        return jsonify({'error': 'No image provided'}), 400
    image_file = request.files['image']
    image = Image.open(io.BytesIO(image_file.read()))
    image = image.resize((224, 224))
    image_array = np.array(image) / 255.0
    image_array = np.expand_dims(image_array, axis=0)
    predictions = model.predict(image_array)
    predicted_class = np.argmax(predictions[0])
    confidence = float(predictions[0][predicted_class])
    return jsonify({
        'class': class_names[predicted_class],
        'confidence': confidence
    })

if __name__ == '__main__':
    app.run(debug=True)
Common Challenges and Solutions
Here are the most common challenges you’ll face when training CNNs, along with solutions that work in practice.
Overfitting
When your model performs well on training data but poorly on new data:
- Use more training data
- Apply data augmentation
- Add dropout layers
- Use regularization (L1/L2)
- Early stopping
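As a sketch of the regularization idea, an L2 penalty simply adds the scaled sum of squared weights to the task loss, which discourages large weights. The toy numbers below are illustrative only; in Keras you would pass kernel_regularizer=tf.keras.regularizers.l2(...) to a layer instead of computing this by hand:

```python
import numpy as np

def l2_penalty(weights, lam=1e-4):
    """Sum of squared weights across all tensors, scaled by lambda."""
    return lam * sum(np.sum(w ** 2) for w in weights)

def regularized_loss(data_loss, weights, lam=1e-4):
    """Total loss = task loss + L2 penalty."""
    return data_loss + l2_penalty(weights, lam)

# Toy weight tensors: a 3x3 kernel and a bias vector, all ones
weights = [np.ones((3, 3)), np.ones((3,))]
print(regularized_loss(0.5, weights, lam=0.01))  # 0.5 + 0.01 * 12 = 0.62
```

Because the penalty grows with the squared magnitude of the weights, gradient descent is nudged toward smaller, smoother weight values.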
Vanishing/Exploding Gradients
When gradients become too small or too large during training:
- Use batch normalization
- Choose appropriate activation functions (ReLU, Leaky ReLU)
- Use residual connections
- Initialize weights properly
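For example, Leaky ReLU keeps a small slope for negative inputs instead of zeroing them out, so some gradient always flows and units can’t permanently "die". A minimal NumPy sketch of the function itself (Keras provides layers.LeakyReLU for the same thing):

```python
import numpy as np

def leaky_relu(x, alpha=0.1):
    """Identity for positive inputs; small slope alpha for negative ones."""
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(leaky_relu(x))  # negative values are scaled by alpha, not zeroed
```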
Training Time
When training takes too long:
- Use transfer learning
- Train on GPU/TPU
- Reduce model complexity
- Use mixed precision training
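As a rough sketch of that last point, TensorFlow (2.4+) enables mixed precision through a global policy; the usual precaution is to keep the final softmax layer in float32 for numerical stability. The tiny model below is illustrative only:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Run most ops in float16 while keeping variables in float32
tf.keras.mixed_precision.set_global_policy('mixed_float16')

model = tf.keras.Sequential([
    layers.Conv2D(32, 3, activation='relu', input_shape=(32, 32, 3)),
    layers.GlobalAveragePooling2D(),
    # Keep the output layer in float32 for a numerically stable softmax
    layers.Dense(10, activation='softmax', dtype='float32'),
])
```

On GPUs with Tensor Cores, this can speed up training considerably with little or no accuracy loss.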
Remember that the best CNN architecture depends on your specific problem. Start simple, iterate, and gradually add complexity only when needed.
I’ve used these techniques to build systems that can identify product defects, classify retail items, and even detect emotions in faces. The possibilities are truly endless.

I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I have been working with Python, machine learning, and artificial intelligence for the last five years. During this time I have gained expertise in various Python libraries such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, TensorFlow, SciPy, and Scikit-Learn, for clients in the United States, Canada, the United Kingdom, Australia, New Zealand, and other countries. Check out my profile.