Metric Learning for Image Similarity Search in Keras

Finding similar images is a task I have faced many times while building recommendation engines for e-commerce platforms.

Standard classification often falls short when you need to compare how “alike” two distinct objects are.

In this tutorial, I will show you how to use TensorFlow Similarity to perform metric learning, allowing your models to learn a feature space where similar images are close together.

Use Metric Learning for Image Similarity Search in Keras

Metric learning is powerful because it doesn’t just predict a label; it learns a distance metric between data points.

I find this particularly useful for “Open Set” problems where you might need to find similarities for items the model didn’t see during training.
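To make the idea of "learning a distance" concrete, here is a minimal sketch in plain Python. The three embedding vectors are made up for illustration and do not come from a trained model; the point is only that a good embedding space places similar items at a small distance and dissimilar items at a large one.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical 3-dimensional embeddings, invented for this illustration
sneaker_a = [0.9, 0.1, 0.2]
sneaker_b = [0.8, 0.2, 0.1]
handbag = [0.1, 0.9, 0.7]

d_similar = euclidean_distance(sneaker_a, sneaker_b)
d_different = euclidean_distance(sneaker_a, handbag)

print(f"sneaker vs sneaker: {d_similar:.3f}")
print(f"sneaker vs handbag: {d_different:.3f}")
```

A metric learning model is trained so that this kind of comparison works for real images, including images of classes it never saw during training.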

Set Up Your Environment for TensorFlow Similarity in Keras

Before we dive into the architecture, we need to install the specialized library that extends Keras for similarity tasks.

I always recommend using a virtual environment to avoid version conflicts with your existing TensorFlow installation.

# Install the necessary library
!pip install tensorflow-similarity

Load Data for Similarity Search Using TensorFlow Similarity in Keras

For this example, I will use the CIFAR-10 dataset, a standard computer vision benchmark of 60,000 32x32 color images spread across 10 classes.

We wrap the training data in a MultiShotMemorySampler so that every batch contains several examples of each sampled class, which the loss needs in order to form positive pairs.

import tensorflow as tf
import tensorflow_similarity as tfsim

# Load CIFAR-10 through Keras; labels arrive with shape (N, 1), so flatten them
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
y_train, y_test = y_train.flatten(), y_test.flatten()

# Create a sampler that puts 4 examples of each of 8 classes in every batch
sampler = tfsim.samplers.MultiShotMemorySampler(
    x_train, y_train, classes_per_batch=8, examples_per_class_per_batch=4
)
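If you are curious what a class-balanced sampler does conceptually, the sketch below reproduces the idea in plain Python. It is not the library's implementation; the function name sample_balanced_batch and the toy labels are my own assumptions for illustration.

```python
import random
from collections import defaultdict


def sample_balanced_batch(labels, classes_per_batch, examples_per_class, rng):
    """Return dataset indices for one batch with a fixed count per class."""
    # Group the position of every example under its class label
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Pick the classes for this batch, then draw examples from each one
    chosen_classes = rng.sample(sorted(by_class), classes_per_batch)
    batch = []
    for c in chosen_classes:
        batch.extend(rng.sample(by_class[c], examples_per_class))
    return batch


# Toy labels: 10 classes with 20 examples each (invented for illustration)
labels = [i % 10 for i in range(200)]
rng = random.Random(0)

batch = sample_balanced_batch(
    labels, classes_per_batch=8, examples_per_class=4, rng=rng
)
batch_labels = [labels[i] for i in batch]
print(len(batch), "examples from", len(set(batch_labels)), "classes")
```

Because every class in the batch appears more than once, each example has at least one positive partner, which a purely random batch would not guarantee.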

Define the Backbone Model Architecture in Keras

For real-time similarity search, I prefer a lightweight backbone such as MobileNetV2 because it balances speed and accuracy. To keep this example short, the code below uses a small custom CNN instead.

The backbone has no classification layer on top; in the next step we add a specialized “Metric Head” that outputs the embeddings we need for distance calculation.

from tensorflow.keras import layers, Sequential

# Define a simple CNN backbone
def get_backbone(input_shape):
    return Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation='relu'),
        layers.Flatten(),
        layers.Dense(128, activation='relu')
    ])

backbone = get_backbone((32, 32, 3))

Create the Similarity Model Using TensorFlow Similarity in Keras

The SimilarityModel class in TensorFlow Similarity extends the standard Keras Model with specialized indexing capabilities.

I use this because it allows us to call .index() and .lookup() directly on the model object after training.

from tensorflow_similarity.models import SimilarityModel

# Wrap the backbone and add a 64-dimensional embedding layer
inputs = layers.Input(shape=(32, 32, 3))
x = backbone(inputs)
outputs = tfsim.layers.MetricEmbedding(64)(x)

model = SimilarityModel(inputs, outputs)
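The embedding layer produces L2-normalized vectors, which is why distances between embeddings are easy to compare. The sketch below shows, in plain Python and with made-up vectors, what normalization does: every vector ends up with length 1, so cosine similarity reduces to a dot product. The helper names are my own and are not part of the library.

```python
import math


def l2_normalize(v):
    """Scale a vector so that its Euclidean length is 1."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Two raw feature vectors, invented for illustration
a = l2_normalize([3.0, 4.0])
b = l2_normalize([4.0, 3.0])

# For unit-length vectors the dot product is the cosine similarity
cosine_similarity = dot(a, b)
cosine_distance = 1.0 - cosine_similarity

print(f"cosine similarity: {cosine_similarity:.2f}")
print(f"cosine distance:   {cosine_distance:.2f}")
```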

Implement Triplet Loss for Metric Learning in Keras

Triplet loss is my go-to choice for training similarity models because it pulls “anchors” closer to “positives” and pushes them further from “negatives.”

TensorFlow Similarity provides an implementation of this loss that mines triplets within each batch automatically, so you do not have to assemble them by hand.

from tensorflow_similarity.losses import TripletLoss

# Compile the model with an optimizer and the triplet loss
model.compile(optimizer='adam', loss=TripletLoss())
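For intuition, here is the textbook form of triplet loss for a single triplet, written in plain Python. This is a sketch of the general formula, max(d(a, p) - d(a, n) + margin, 0), using Euclidean distance; the library's version may use a different distance, margin handling, and mining strategy, and the vectors below are invented.

```python
import math


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss for one (anchor, positive, negative) triplet."""
    d_pos = math.dist(anchor, positive)  # should be small
    d_neg = math.dist(anchor, negative)  # should be large
    return max(d_pos - d_neg + margin, 0.0)


anchor = [0.0, 0.0]
positive = [0.1, 0.0]

# Easy triplet: the negative is already far beyond the margin, so no loss
easy_loss = triplet_loss(anchor, positive, negative=[2.0, 0.0])

# Hard triplet: the negative sits close to the anchor, so the loss is positive
hard_loss = triplet_loss(anchor, positive, negative=[0.5, 0.0])

print(f"easy triplet loss: {easy_loss:.2f}")
print(f"hard triplet loss: {hard_loss:.2f}")
```

Only triplets that violate the margin contribute to the gradient, which is why mining informative triplets inside each batch matters so much in practice.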

Train the Model for Image Similarity Search in Keras

Training for similarity is slightly different from standard classification as the loss value depends on the relative distances of embeddings.

I usually monitor the distance metrics to ensure the model is successfully collapsing the distance between related images.

# Train the model using the sampler we created earlier
model.fit(sampler, epochs=10)

Index the Embeddings for Fast Similarity Search in Keras

Once the model is trained, we need to build an index of the feature vectors for all the images in our searchable gallery.

I use the NMSLib indexer for large-scale projects because it provides incredibly fast approximate nearest neighbor lookups.

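import numpy as np

# Embed the first 1,000 training images and add them to the index.
# y stores the class labels; data stores each image's position in x_train
# so that a match can be mapped back to its picture later.
model.index(x_train[:1000], y=y_train[:1000], data=np.arange(1000))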

Perform a Similarity Lookup Using TensorFlow Similarity in Keras

Now comes the exciting part where we take a new, unseen image and find its most similar matches from our indexed gallery.

I find that visualizing the top-k results is the best way to verify if the learned metric space actually makes sense.

# Query the index for the 5 most similar images to a test sample
matches = model.lookup(x_test[:1], k=5)

# Display results (matches[0] is the list of matches for the first query;
# each match carries its distance, label, and stored data)
print(f"Most similar matches found: {matches[0]}")
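Conceptually, a lookup is a nearest neighbor search over the stored embeddings. The sketch below does the exact, brute-force version in plain Python on a made-up gallery; an approximate index trades a little accuracy for much faster searches on large galleries. The function name knn_lookup is my own and is not a library call.

```python
import math


def knn_lookup(query, gallery, k):
    """Return (gallery_index, distance) for the k embeddings nearest to query."""
    scored = [(math.dist(query, emb), idx) for idx, emb in enumerate(gallery)]
    scored.sort()  # smallest distance first
    return [(idx, dist) for dist, idx in scored[:k]]


# Toy 2-dimensional gallery embeddings, invented for illustration
gallery = [
    [0.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [5.0, 5.0],
    [0.1, 0.1],
]
query = [0.0, 0.05]

neighbors = knn_lookup(query, gallery, k=3)
for idx, dist in neighbors:
    print(f"gallery item {idx} at distance {dist:.3f}")
```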

Visualize Similarity Results in Keras

I always build a small helper function to plot the query image alongside its nearest neighbors to evaluate the model qualitatively.

This step is crucial for debugging whether the model is focusing on the right visual features like color, shape, or texture.

import matplotlib.pyplot as plt

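def plot_results(query, matches):
    """Plot the query image next to its nearest neighbors."""
    n_cols = len(matches) + 1
    plt.figure(figsize=(2 * n_cols, 2.5))
    plt.subplot(1, n_cols, 1)
    plt.imshow(query)
    plt.title("Query")
    plt.axis("off")
    for i, match in enumerate(matches):
        plt.subplot(1, n_cols, i + 2)
        # match.data is expected to hold the image's position in x_train
        plt.imshow(x_train[match.data])
        plt.title(f"Dist: {match.distance:.2f}")
        plt.axis("off")
    plt.show()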

# Visualize the first query
plot_results(x_test[0], matches[0])

Evaluate Model Performance Using TensorFlow Similarity in Keras

To get a quantitative measure of how well our similarity search works, we use metrics like “Precision at K.”

I rely on these metrics when I need to prove to stakeholders that the new metric learning approach outperforms the old classification system.

from tensorflow_similarity.retrieval_metrics import PrecisionAtK

# Evaluate retrieval quality on the test set against the indexed gallery.
# This relies on the index having been built with class labels.
metrics = model.evaluate_retrieval(
    x_test, y_test, retrieval_metrics=[PrecisionAtK(k=5)]
)
print(metrics)
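If “Precision at K” is new to you, the sketch below computes it by hand for a single query in plain Python. The labels are invented, and the helper precision_at_k is my own illustration rather than the library's metric class.

```python
def precision_at_k(query_label, neighbor_labels, k):
    """Fraction of the top-k retrieved items that share the query's label."""
    top_k = neighbor_labels[:k]
    hits = sum(1 for label in top_k if label == query_label)
    return hits / k


# Labels of the 5 nearest neighbors for one query, ordered by distance
query_label = 3
neighbor_labels = [3, 3, 5, 3, 1]

p_at_5 = precision_at_k(query_label, neighbor_labels, k=5)
p_at_2 = precision_at_k(query_label, neighbor_labels, k=2)

print(f"precision@5: {p_at_5:.2f}")
print(f"precision@2: {p_at_2:.2f}")
```

In a real evaluation this value is averaged over every query in the test set.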

Save and Reload the Similarity Model in Keras

Saving a similarity model is a bit different because you need to persist both the neural network weights and the search index.

I suggest using the .save() method provided by TensorFlow Similarity, which bundles the index configuration with the Keras model.

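import tensorflow as tf

# Save the network together with its search index
model.save('my_similarity_model', save_index=True)

# Reload the network, then restore the index from the same path
reloaded_model = tf.keras.models.load_model(
    'my_similarity_model', custom_objects={'SimilarityModel': tfsim.models.SimilarityModel})
reloaded_model.load_index('my_similarity_model')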

You can refer to the screenshot below to see the output.

Metric Learning for Image Similarity Search Using TensorFlow Similarity in Keras

I hope you find this tutorial helpful when building your own visual search applications.

Using TensorFlow Similarity with Keras simplifies the complex math behind metric learning into a manageable workflow.

I have found that this approach significantly improves the accuracy of recommendation systems in real-world retail environments.

Bijay Kumar

Bijay Kumar is an experienced Python and AI professional who enjoys helping developers learn modern technologies through practical tutorials and examples. His expertise includes Python development, Machine Learning, Artificial Intelligence, automation, and data analysis using libraries like Pandas, NumPy, TensorFlow, Matplotlib, SciPy, and Scikit-Learn. At PythonGuides.com, he shares in-depth guides designed for both beginners and experienced developers.
