Find Duplicate Keys in Dictionary Python

In this Python tutorial, we will look at duplicate keys in Python dictionaries through several examples. We will also learn different methods for finding duplicates in a list of Python dictionaries.

Before looking at the various methods, let’s first review how dictionaries work in Python.

In Python, dictionaries are versatile data structures that facilitate the storage of information in key-value pairs. The uniqueness of the key in each key-value pair is a fundamental characteristic of a dictionary.

This means that two items in the same dictionary cannot share a key. However, in a list of Python dictionaries the same key typically appears in every dictionary, and several dictionaries may hold the same value under that key, so you may need to identify and handle these duplicates.
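To see this uniqueness in practice, the minimal sketch below writes the same key twice in a dictionary literal. Python does not raise an error here; it simply keeps the last value assigned to that key. The variable name `person` is only illustrative.

```python
# A dictionary literal that repeats the key "name".
# Python keeps a single entry per key; the last value wins.
person = {"name": "John", "age": 25, "name": "Jane"}

print(person)       # {'name': 'Jane', 'age': 25}
print(len(person))  # 2
```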

Python dictionary with duplicate keys

Although a single Python dictionary cannot contain duplicate keys, duplication can occur when working with a list of dictionaries: different dictionaries can carry the same value under the same key. Let’s discuss several ways to find such duplicates.

Using a Naive Approach in Python

The simplest approach iterates over the Python list of dictionaries and uses a temporary list to track the values seen for a given key. If a value is already in the temporary list, it is a duplicate, and we add it to the list of duplicates.

data = [
    {"name": "John", "age": 25, "profession": "Engineer"},
    {"name": "Jane", "age": 30, "profession": "Doctor"},
    {"name": "John", "age": 45, "profession": "Teacher"}
]

def find_duplicates(data, key):
    temp = []
    duplicates = []

    for dictionary in data:
        if dictionary[key] not in temp:
            temp.append(dictionary[key])
        else:
            duplicates.append(dictionary[key])

    return duplicates

print(find_duplicates(data, "name"))

In this function, we create a temporary list, temp, to store the values seen so far for the given key. For each dictionary in our list, we check whether its value for that key is already in temp. If it isn’t, we add it to temp. If it is, we add it to our duplicates list. Finally, we return the duplicates list, which contains every repeated value for that key.

Output:

['John']
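The function above raises a KeyError if any dictionary in the list lacks the requested key. As a hedged sketch of one way to handle that, the variant below skips such records. The name `find_duplicates_safe` and the sample `records` list are assumptions made for illustration, not part of the original example.

```python
def find_duplicates_safe(data, key):
    """Like the naive version, but ignore dictionaries that lack `key`."""
    seen = []
    duplicates = []

    for dictionary in data:
        if key not in dictionary:
            continue  # skip records without this key instead of raising KeyError
        value = dictionary[key]
        if value in seen:
            duplicates.append(value)
        else:
            seen.append(value)

    return duplicates


# Hypothetical sample data: the second record has no "name" key.
records = [
    {"name": "John", "age": 25},
    {"age": 30},
    {"name": "John", "age": 45},
]

print(find_duplicates_safe(records, "name"))  # ['John']
```

Whether skipping is the right behavior depends on the data; in some cases a missing key should be reported as an error instead.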


Using a Python Set for Efficiency

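While the above approach works, it is not the most efficient, because checking for membership in a list is an O(n) operation, which makes the whole function O(n²) in the worst case. With a Python set, each membership check takes O(1) time on average, because sets in Python are implemented as hash tables, so the function runs in O(n) overall. Note that this requires the values to be hashable.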
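To make the difference concrete, here is a rough timing sketch using the standard-library timeit module. The collection size and the number of repetitions are arbitrary assumptions, and the absolute numbers vary by machine, so no expected output is shown.

```python
import timeit

# Arbitrary sizes chosen for illustration only.
values_list = list(range(10_000))
values_set = set(values_list)
target = 9_999  # worst case for the list: the last element

# Time 1,000 membership checks against each container.
list_time = timeit.timeit(lambda: target in values_list, number=1_000)
set_time = timeit.timeit(lambda: target in values_set, number=1_000)

print(f"list membership: {list_time:.6f} s")
print(f"set membership:  {set_time:.6f} s")
```

With that in mind, the set-based version of the function follows.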

data = [
    {"name": "Sam", "age": 35, "profession": "Software Developer"},
    {"name": "Emily", "age": 35, "profession": "Team Lead"},
    {"name": "Oliver", "age": 50, "profession": "Software Developer"},
    {"name": "Emma", "age": 32, "profession": "CEO"},
    {"name": "Aiden", "age": 35, "profession": "Software Senior Developer"}
]

def find_duplicates(data, key):
    temp = set()
    duplicates = []

    for dictionary in data:
        if dictionary[key] not in temp:
            temp.add(dictionary[key])
        else:
            duplicates.append(dictionary[key])

    return duplicates

print(find_duplicates(data, "profession"))

  • The function takes two parameters, data (the list of dictionaries) and key (the key for which we want to find duplicate values).
  • It uses a Python set, temp, to track unique values and a list, duplicates, to store duplicate values.
  • The function iterates over each Python dictionary in the list, checking whether the value associated with the given key has been seen before (i.e., whether it’s in temp).
  • If it hasn’t been seen before, it adds the value to temp.
  • If it has been seen before (i.e., it’s already in temp), it’s a duplicate, so it adds the value to duplicates.
  • Once it has checked all Python dictionaries, it returns the duplicates list, which contains all duplicate values for the given key in the list of dictionaries.

Output:

['Software Developer']
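Note that both versions so far append a value every time it reappears, so a value that occurs three times is listed twice. For example, calling the function above with "age" on this data returns [35, 35]. If each duplicate should appear only once, one possible variation tracks the values already reported in a second set. The name `find_unique_duplicates` and the `people` list are assumptions for illustration.

```python
def find_unique_duplicates(data, key):
    """Return each duplicated value once, in order of first repetition."""
    seen = set()      # values encountered at least once
    reported = set()  # values already added to the result
    duplicates = []

    for dictionary in data:
        value = dictionary[key]
        if value in seen and value not in reported:
            duplicates.append(value)
            reported.add(value)
        seen.add(value)

    return duplicates


# Hypothetical sample data: the age 35 occurs three times.
people = [
    {"name": "Sam", "age": 35},
    {"name": "Emily", "age": 35},
    {"name": "Oliver", "age": 50},
    {"name": "Aiden", "age": 35},
]

print(find_unique_duplicates(people, "age"))  # [35]
```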

Using Python collections.Counter

Python’s collections module provides a Counter class that makes it easy to count data. We can use it to identify values that appear more than once for a given key.

from collections import Counter

data = [
    {"make": "Ford", "model": "Mustang", "year": 2018},
    {"make": "Chevrolet", "model": "Camaro", "year": 2018},
    {"make": "Ford", "model": "Fusion", "year": 2020},
    {"make": "Chevrolet", "model": "Impala", "year": 2020},
    {"make": "Ford", "model": "Mustang", "year": 2020}
]


def find_duplicates(data, key):
    key_values = [dictionary[key] for dictionary in data]
    counter = Counter(key_values)

    return [item for item, count in counter.items() if count > 1]

print(find_duplicates(data, "make"))

In this function, we first use a list comprehension to extract the value stored under the given key from every dictionary. We then create a Counter object from this list. Counter is a subclass of the Python dictionary in which the keys are the items being counted and the values are their counts. Finally, we build a Python list of the items that appear more than once.

Output:

['Ford', 'Chevrolet']
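If the number of occurrences matters as well, the same Counter can return it alongside each value. This is a small sketch; the function name `duplicate_counts` and the `cars` list are hypothetical.

```python
from collections import Counter


def duplicate_counts(data, key):
    """Map each value that occurs more than once to its number of occurrences."""
    counter = Counter(dictionary[key] for dictionary in data)
    return {item: count for item, count in counter.items() if count > 1}


# Hypothetical sample data: only "Ford" appears more than once.
cars = [
    {"make": "Ford", "model": "Mustang"},
    {"make": "Chevrolet", "model": "Camaro"},
    {"make": "Ford", "model": "Fusion"},
    {"make": "Toyota", "model": "Corolla"},
]

print(duplicate_counts(cars, "make"))  # {'Ford': 2}
```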

Using List Comprehensions and Sets in Python

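We can use list comprehensions and sets to create a compact function. Note that calling count() inside the comprehension rescans the list for every element, so this version runs in O(n²) time; it trades efficiency for brevity and is best suited to small lists.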

data = [
    {"state": "California", "capital": "Sacramento", "region": "West"},
    {"state": "Texas", "capital": "Austin", "region": "South"},
    {"state": "Oregon", "capital": "Salem", "region": "West"},
    {"state": "New York", "capital": "Albany", "region": "Northeast"},
    {"state": "Washington", "capital": "Olympia", "region": "West"}
]

def find_duplicates(data, key):
    key_values = [dictionary[key] for dictionary in data]
    return list(set([value for value in key_values if key_values.count(value) > 1]))

print(find_duplicates(data, "region"))

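This function returns the same kind of result as the previous one, but it uses a list comprehension with count() to pick out repeated values and a set to remove repeats from the result. Because a set is unordered, the order of the returned list is not guaranteed.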

Output:

['West']
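Once the duplicated values are known, it is often useful to see which records share them. The following sketch groups the dictionaries by the value of the chosen key with collections.defaultdict and keeps only the groups that have more than one member. The function name `group_duplicates` and the `states` list are assumptions for illustration.

```python
from collections import defaultdict


def group_duplicates(data, key):
    """Return {value: [records...]} for values shared by more than one record."""
    groups = defaultdict(list)
    for dictionary in data:
        groups[dictionary[key]].append(dictionary)

    # Keep only the values that occur in at least two records.
    return {value: records for value, records in groups.items() if len(records) > 1}


# Hypothetical sample data: two states share the region "West".
states = [
    {"state": "California", "region": "West"},
    {"state": "Texas", "region": "South"},
    {"state": "Oregon", "region": "West"},
]

for region, records in group_duplicates(states, "region").items():
    print(region, [record["state"] for record in records])
# West ['California', 'Oregon']
```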

Conclusion

In this article, we have shown how to find duplicates in a list of dictionaries, which can be very useful in data analysis and cleaning. Remember that although a single dictionary does not allow duplicate keys, different dictionaries in a list can hold the same value under the same key, so such duplicates need to be detected and handled appropriately.
