I was cleaning up some text data for a project where I had to prepare a dataset for analysis.
The dataset came from multiple sources across the USA, including customer feedback forms and survey responses.
The issue? Many of these text fields had non-ASCII characters like emojis, accented letters, or symbols. When I tried to process this data, my scripts failed, and I realized I needed a way to remove or filter out these characters.
In this tutorial, I’ll show you seven simple methods I use to remove non-ASCII characters from strings in Python.
What Are Non-ASCII Characters?
ASCII stands for American Standard Code for Information Interchange. It defines 128 characters (code points 0 through 127): English letters, digits, punctuation, and a set of control characters.
Anything outside this range (like é, ü, ©) is considered non-ASCII. When working with data in the USA, you’ll often want to strip out these characters to keep everything clean and compatible with systems that expect plain ASCII text.
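To make the definition concrete, here is a minimal sketch that lists the characters in a string whose code points fall above 127. The sample string and the variable names are illustrative only.

```python
# Illustrative sample text; any string works here.
text = "Café ü ©"

# ASCII covers code points 0-127, so anything above 127 is non-ASCII.
non_ascii = [(ch, ord(ch)) for ch in text if ord(ch) > 127]

print(non_ascii)  # [('é', 233), ('ü', 252), ('©', 169)]
```

A check like this is handy before cleaning, because it shows exactly which characters a given method is about to drop.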
Method 1 – Use encode() and decode()
One of the most common ways I remove non-ASCII characters is by encoding the Python string into ASCII and ignoring errors.
text = "Café in New York 😊 costs $5"
clean_text = text.encode("ascii", "ignore").decode()
print(clean_text)
Output:
Caf in New York  costs $5

Here, the non-ASCII characters (é and 😊) are removed. The spaces on either side of the emoji stay, so the output contains a double space. This method is fast and works well when you just want plain ASCII text.
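The second argument to encode() is the error handler, and "ignore" is only one of the options. The sketch below compares it with "replace" and with the default strict behavior, and shows one optional way to tidy leftover whitespace. The variable names are my own.

```python
text = "Café in New York 😊 costs $5"

# "ignore" silently drops characters that ASCII cannot represent.
ignored = text.encode("ascii", errors="ignore").decode("ascii")

# "replace" puts a "?" in place of each unencodable character.
replaced = text.encode("ascii", errors="replace").decode("ascii")

# The default handler ("strict") raises UnicodeEncodeError instead.
try:
    text.encode("ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Optional: collapse the double space left where the emoji used to be.
tidy = " ".join(ignored.split())

print(ignored)        # Caf in New York  costs $5
print(replaced)       # Caf? in New York ? costs $5
print(strict_failed)  # True
print(tidy)           # Caf in New York costs $5
```

Using "replace" can be useful during debugging, since the question marks show where characters were lost.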
Method 2 – Use Regular Expressions (re.sub)
If I want more control, I use regular expressions.
import re
text = "Résumé: José lives in Los Ángeles 🌎"
clean_text = re.sub(r'[^\x00-\x7F]+', '', text)
print(clean_text)
Output:
Rsum: Jos lives in Los ngeles 

The regex [^\x00-\x7F] matches all non-ASCII characters and removes them. This method is reliable and works when I want to strip everything outside of ASCII.
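When the same pattern runs over many strings, it may be worth compiling it once and wrapping it in a small helper. This is a sketch under that assumption. The names NON_ASCII and strip_non_ascii and the optional replacement parameter are hypothetical, not part of the re module.

```python
import re

# Compile once and reuse for every string in the dataset.
NON_ASCII = re.compile(r"[^\x00-\x7F]+")

def strip_non_ascii(text, replacement=""):
    """Replace each run of non-ASCII characters with `replacement`."""
    return NON_ASCII.sub(replacement, text)

samples = ["Résumé: José", "Los Ángeles 🌎"]
cleaned = [strip_non_ascii(s) for s in samples]
print(cleaned)  # ['Rsum: Jos', 'Los ngeles ']

# Because of the "+" quantifier, a whole run becomes a single marker.
marked = strip_non_ascii("Los Ángeles 🌎", replacement="?")
print(marked)   # Los ?ngeles ?
```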
Method 3 – Use isascii() (Python 3.7+)
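Python 3.7 introduced the handy str.isascii() method, which returns True when every character in the string is ASCII. Calling it on each character lets me keep only the ASCII ones.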
text = "Python is fun 🐍!"
clean_text = ''.join(char for char in text if char.isascii())
print(clean_text)
Output:
Python is fun !

This is my go-to method when I want a Pythonic one-liner. It’s clean, readable, and efficient.
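Since isascii() also works on a whole string, it can serve as a cheap pre-check before filtering. The helper below is a sketch of that idea, and the function name remove_non_ascii is my own.

```python
def remove_non_ascii(text):
    """Return `text` with every non-ASCII character removed."""
    # Already-clean input is returned unchanged, skipping the
    # character-by-character pass.
    if text.isascii():
        return text
    # On Python versions before 3.7, `ord(ch) < 128` is an equivalent test.
    return "".join(ch for ch in text if ch.isascii())

print(remove_non_ascii("Python is fun 🐍!"))  # Python is fun !
print(remove_non_ascii("Already clean"))      # Already clean
```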
Method 4 – Use filter() with str.isascii
Another simple approach is to use filter() with isascii in Python.
text = "Curaçao is a beautiful island 🌴"
clean_text = ''.join(filter(str.isascii, text))
print(clean_text)
Output:
Curaao is a beautiful island 

This works the same way as Method 3, but it uses filter() in place of a generator expression.
Method 5 – Use unidecode Library
Sometimes, I don’t want to just remove characters; I want to convert them to ASCII equivalents.
For example, é should become e.
For this, I use the unidecode library. It is a third-party package, so install it first with pip install unidecode.
from unidecode import unidecode
text = "Café in Montréal costs €10"
clean_text = unidecode(text)
print(clean_text)
Output:
Cafe in Montreal costs EUR10
This is useful when working with names, addresses, or city names in the USA that may include accented characters.
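If installing a package is not an option, the standard library offers a rougher alternative. This is not what unidecode does internally. It is a different technique that I am swapping in here: Unicode NFKD normalization splits accented letters into a base letter plus a combining mark, and the ASCII encode step then drops the marks. The helper name to_ascii is hypothetical.

```python
import unicodedata

def to_ascii(text):
    """Approximate transliteration using only the standard library."""
    # NFKD splits "é" into "e" followed by a combining acute accent.
    decomposed = unicodedata.normalize("NFKD", text)
    # The combining marks are non-ASCII, so "ignore" removes them.
    return decomposed.encode("ascii", "ignore").decode("ascii")

print(to_ascii("Café in Montréal costs €10"))  # Cafe in Montreal costs 10
```

Notice that the euro sign disappears instead of becoming EUR. Characters with no decomposition are simply dropped, so this approach covers accented Latin letters but not symbols.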
Method 6 – Use str.translate()
Python’s translate() method allows me to remove unwanted characters using a translation table.
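text = "Zoë bought piñata 🎉 for $20"
# Collect the non-ASCII characters that actually occur in the text
non_ascii = ''.join(c for c in set(text) if ord(c) > 127)
# The third argument to maketrans() lists the characters to delete
clean_text = text.translate(str.maketrans('', '', non_ascii))
print(clean_text)
Output:
Zo bought piata  for $20
I build the table from the text itself instead of a fixed range of code points. Emojis such as 🎉 (U+1F389) sit far above the Basic Multilingual Plane, so a small fixed range would miss them. This is a bit advanced, but it gives me full control over which characters to strip.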
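translate() also accepts a plain dictionary that maps code points to replacement strings, or to None for deletion. The sketch below uses that to swap a few chosen characters for ASCII equivalents and delete the rest. The replacements mapping is an illustrative assumption, so extend it to suit your data.

```python
text = "Zoë bought piñata 🎉 for $20"

# Hand-picked substitutions (illustrative); anything else is deleted.
replacements = {"ë": "e", "ñ": "n"}

# Keys are code points. A value of None tells translate() to delete
# the character; dict.get() returns None for characters not listed.
table = {
    ord(ch): replacements.get(ch)
    for ch in set(text)
    if ord(ch) > 127
}

clean_text = text.translate(table)
print(clean_text)  # Zoe bought pinata  for $20
```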
Method 7 – Use map() with isascii
Finally, I sometimes use map() with isascii for functional-style programming in Python.
text = "François works in San José 🚗"
clean_text = ''.join(map(lambda c: c if c.isascii() else '', text))
print(clean_text)
Output:
Franois works in San Jos 
This method is flexible and works well if you like functional programming patterns.
Which Method Should You Use?
- Quick cleanup: Use encode() and decode() (Method 1).
- Regex lovers: Use re.sub() (Method 2).
- Modern Python: Use isascii() (Method 3 or 4).
- Need transliteration: Use unidecode (Method 5).
- Custom filtering: Use translate() or map() (Methods 6 & 7).
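The pure-removal approaches should all produce the same result, so the choice among them is mostly a matter of taste. Here is a quick sanity check along those lines. The wrapper function names are my own.

```python
import re

def via_encode(text):
    return text.encode("ascii", "ignore").decode("ascii")

def via_regex(text):
    return re.sub(r"[^\x00-\x7F]+", "", text)

def via_isascii(text):
    return "".join(ch for ch in text if ch.isascii())

def via_filter(text):
    return "".join(filter(str.isascii, text))

sample = "François works in San José 🚗"
methods = (via_encode, via_regex, via_isascii, via_filter)
results = {func.__name__: func(sample) for func in methods}

# Every removal method yields the same cleaned string.
all_equal = len(set(results.values())) == 1

print(all_equal)                    # True
print(repr(results["via_encode"]))  # 'Franois works in San Jos '
```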
While Python doesn’t have a single built-in function dedicated to removing non-ASCII characters, you can easily achieve it with these seven methods.
I use encode() when I need a quick cleanup, and unidecode when I want to keep text readable for USA-based datasets.
No matter which method you choose, the key is to understand your data cleaning goal: Do you want to just remove characters, or do you want to convert them into usable ASCII equivalents?
Bijay Kumar is an experienced Python and AI professional who enjoys helping developers learn modern technologies through practical tutorials and examples. His expertise includes Python development, Machine Learning, Artificial Intelligence, automation, and data analysis using libraries like Pandas, NumPy, TensorFlow, Matplotlib, SciPy, and Scikit-Learn. At PythonGuides.com, he shares in-depth guides designed for both beginners and experienced developers.