In this Python tutorial, we will look at duplicate keys in Python dictionaries using some examples. In addition, we will learn several methods for finding duplicates in a list of Python dictionaries.
Before going through the methods, let's first understand the dictionary in Python.
In Python, dictionaries are versatile data structures that facilitate the storage of information in key-value pairs. The uniqueness of the key in each key-value pair is a fundamental characteristic of a dictionary.
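This means that two items in the same dictionary cannot share a key. However, in a list of Python dictionaries, several dictionaries can use the same key, and the values stored under that key can repeat from one dictionary to the next. Identifying and handling these repeated values is what this tutorial means by finding duplicates.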
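To see the uniqueness rule in action, here is a small illustrative sketch that repeats a key inside a dictionary literal (the variable name person is just an example). Python does not raise an error in this case; it keeps the last value assigned to the key.

```python
# Repeating a key in a dictionary literal does not raise an error;
# the later value silently replaces the earlier one.
person = {"name": "John", "age": 25, "name": "Jane"}

print(person)       # {'name': 'Jane', 'age': 25}
print(len(person))  # 2
```

The key keeps its original position in the dictionary, but only the most recent value survives.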
Python dictionary with duplicate keys
Although it's not possible to have duplicate keys within a single Python dictionary, such duplication can occur when working with a list of dictionaries. Let's discuss some ways to find the duplicates.
Using a Naive Approach in Python
The simplest approach involves iterating over the Python list of dictionaries and using a temporary list to track the values already seen for the chosen key. If a value is already in the temporary list, it's a duplicate and we add it to the list of duplicates.
data = [
    {"name": "John", "age": 25, "profession": "Engineer"},
    {"name": "Jane", "age": 30, "profession": "Doctor"},
    {"name": "John", "age": 45, "profession": "Teacher"}
]
def find_duplicates(data, key):
    temp = []
    duplicates = []
    for dictionary in data:
        if dictionary[key] not in temp:
            temp.append(dictionary[key])
        else:
            duplicates.append(dictionary[key])
    return duplicates
print(find_duplicates(data, "name"))
In this function, we create a temporary list, temp, to store the values we have already seen. For each dictionary in our list, we check whether its value for the given key is already in the temp list. If it isn't, we add it to the list. If it is, we add it to our duplicates list. Finally, we return the duplicates list, which contains every repeated value for that key.
Output:
['John']
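The naive function assumes every dictionary contains the requested key; if one does not, dictionary[key] raises a KeyError. A minimal sketch of a more forgiving variant follows. The name find_duplicates_safe and the sample records list are hypothetical, and skipping incomplete records is only one possible policy.

```python
def find_duplicates_safe(data, key):
    """Like the naive approach, but skip dictionaries that lack `key`."""
    seen = []
    duplicates = []
    for dictionary in data:
        if key not in dictionary:
            continue  # assumption: incomplete records are simply ignored
        value = dictionary[key]
        if value in seen:
            duplicates.append(value)
        else:
            seen.append(value)
    return duplicates


records = [
    {"name": "John", "age": 25},
    {"age": 30},                  # this record has no "name" key
    {"name": "John", "age": 45},
]
print(find_duplicates_safe(records, "name"))  # ['John']
```

Depending on the data, it may be preferable to raise an error or log the incomplete record instead of skipping it.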
Using a Python Set for Efficiency
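While the above approach works, it's not the most efficient, because checking for membership in a list is an O(n) operation, which makes the whole search O(n^2) in the worst case. With a Python set, each membership check takes O(1) time on average because sets in Python are implemented as hash tables, so the whole search becomes O(n). Note that this requires the values being compared to be hashable.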
data = [
    {"name": "Sam", "age": 35, "profession": "Software Developer"},
    {"name": "Emily", "age": 35, "profession": "Team Lead"},
    {"name": "Oliver", "age": 50, "profession": "Software Developer"},
    {"name": "Emma", "age": 32, "profession": "CEO"},
    {"name": "Aiden", "age": 35, "profession": "Software Senior Developer"}
]
def find_duplicates(data, key):
    temp = set()
    duplicates = []
    for dictionary in data:
        if dictionary[key] not in temp:
            temp.add(dictionary[key])
        else:
            duplicates.append(dictionary[key])
    return duplicates
print(find_duplicates(data, "profession"))
- The function takes two parameters, data (the list of dictionaries) and key (the key for which we want to find duplicate values).
- It uses a Python set temp to track unique values and a list duplicates to store duplicate values.
- The function iterates over each Python dictionary in the list, checking whether the value associated with the given key has been seen before (i.e., whether it's in temp).
- If it hasn't been seen before, it adds the value to temp.
- If it has been seen before (i.e., it's already in temp), it's a duplicate, so it adds the value to duplicates.
- Once it has checked all Python dictionaries, it returns the duplicates list, which contains all duplicate values for the given key in the list of dictionaries.
Output:
['Software Developer']
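One detail worth noting about the set-based function: a value that occurs three times is appended to duplicates twice, once for each repeat. If each duplicated value should be reported only once, a second set can track what has already been reported. The following is a sketch under that assumption; find_unique_duplicates and the employees sample are illustrative names, not part of the original example.

```python
def find_unique_duplicates(data, key):
    """Return each duplicated value once, in the order it first repeats."""
    seen = set()       # every value encountered so far
    reported = set()   # values already added to the result
    duplicates = []
    for dictionary in data:
        value = dictionary[key]
        if value in seen and value not in reported:
            duplicates.append(value)
            reported.add(value)
        seen.add(value)
    return duplicates


employees = [
    {"name": "Sam", "age": 35},
    {"name": "Emily", "age": 35},
    {"name": "Oliver", "age": 50},
    {"name": "Aiden", "age": 35},
    {"name": "Emma", "age": 50},
]
print(find_unique_duplicates(employees, "age"))  # [35, 50]
```

Keeping the result in a list rather than returning the reported set preserves the order in which duplicates were discovered.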
Using Python collections.Counter
Python's collections module provides a Counter class that makes it easy to count data. We can use it to identify values that appear more than once for a given key.
from collections import Counter
data = [
    {"make": "Ford", "model": "Mustang", "year": 2018},
    {"make": "Chevrolet", "model": "Camaro", "year": 2018},
    {"make": "Ford", "model": "Fusion", "year": 2020},
    {"make": "Chevrolet", "model": "Impala", "year": 2020},
    {"make": "Ford", "model": "Mustang", "year": 2020}
]
def find_duplicates(data, key):
    key_values = [dictionary[key] for dictionary in data]
    counter = Counter(key_values)
    return [item for item, count in counter.items() if count > 1]
print(find_duplicates(data, "make"))
In this function, we first extract the value stored under the given key from every dictionary, using a list comprehension. We then create a Counter object from this list. The Counter object is a Python dictionary subclass in which keys are the items to be counted and values are their counts. Finally, we create a Python list of the items that appear more than once.
Output:
['Ford', 'Chevrolet']
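The same counting idea extends to duplicates across a combination of fields. As a sketch, assuming two cars count as duplicates when both make and model match, each record can be reduced to a tuple of the chosen values before counting. The function name find_duplicate_combinations and its keys parameter are hypothetical.

```python
from collections import Counter


def find_duplicate_combinations(data, keys):
    """Return the value tuples for `keys` that occur in more than one dictionary."""
    # Tuples are hashable, so they can be counted like single values.
    combos = Counter(tuple(d[k] for k in keys) for d in data)
    return [combo for combo, count in combos.items() if count > 1]


cars = [
    {"make": "Ford", "model": "Mustang", "year": 2018},
    {"make": "Chevrolet", "model": "Camaro", "year": 2018},
    {"make": "Ford", "model": "Fusion", "year": 2020},
    {"make": "Ford", "model": "Mustang", "year": 2020},
]
print(find_duplicate_combinations(cars, ["make", "model"]))  # [('Ford', 'Mustang')]
```

Here Ford appears three times as a make, but only the Ford Mustang pairing is repeated as a combination.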
Using List Comprehensions and Sets in Python
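We can use list comprehensions and sets to create a compact function. Note that it is less efficient than the previous two approaches, because list.count scans the whole list once for every element.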
data = [
    {"state": "California", "capital": "Sacramento", "region": "West"},
    {"state": "Texas", "capital": "Austin", "region": "South"},
    {"state": "Oregon", "capital": "Salem", "region": "West"},
    {"state": "New York", "capital": "Albany", "region": "Northeast"},
    {"state": "Washington", "capital": "Olympia", "region": "West"}
]
def find_duplicates(data, key):
    key_values = [dictionary[key] for dictionary in data]
    return list(set([value for value in key_values if key_values.count(value) > 1]))
print(find_duplicates(data, "region"))
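This function returns the same kind of result as the previous one but leverages Python's set and list comprehension capabilities for more concise code. Because the result passes through a set, each duplicate is listed once and the order of the returned values is not guaranteed.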
Output:
['West']
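Calling list.count inside the comprehension rescans the whole list for every element, so the function above does quadratic work on large inputs. A possible compact alternative, sketched here under the hypothetical name find_duplicates_compact, counts every value once with collections.Counter and sorts the result so the output order is predictable. Sorting assumes the values can be compared with one another.

```python
from collections import Counter


def find_duplicates_compact(data, key):
    counts = Counter(d[key] for d in data)  # a single pass over the data
    return sorted(value for value, n in counts.items() if n > 1)


states = [
    {"state": "California", "region": "West"},
    {"state": "Texas", "region": "South"},
    {"state": "Oregon", "region": "West"},
    {"state": "Florida", "region": "South"},
    {"state": "New York", "region": "Northeast"},
]
print(find_duplicates_compact(states, "region"))  # ['South', 'West']
```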
Conclusion
In this article, we have shown how to find values that are duplicated under the same key across a list of dictionaries, which can be very useful in data analysis and cleaning. Always remember that although a single dictionary does not allow duplicate keys, repeated keys and values can occur across a list of dictionaries, hence the need to handle them appropriately.
I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I have been working with Python, machine learning, and artificial intelligence for the last 5 years. During this time I have gained expertise in various Python libraries such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, TensorFlow, SciPy, and scikit-learn for various clients in the United States, Canada, the United Kingdom, Australia, and New Zealand.