How Can Non-ASCII Characters Be Removed From A String In Python?

I was cleaning up some text data for a project where I had to prepare a dataset for analysis.
The dataset came from multiple sources across the USA, including customer feedback forms and survey responses.

The issue? Many of these text fields had non-ASCII characters like emojis, accented letters, or symbols. When I tried to process this data, my scripts failed, and I realized I needed a way to remove or filter out these characters.

In this tutorial, I’ll show you seven simple methods I use to remove non-ASCII characters from strings in Python.

What Are Non-ASCII Characters?

ASCII stands for American Standard Code for Information Interchange. It defines 128 characters (code points 0 through 127): English letters, digits, punctuation, and control characters.

Anything outside this range (like é, ü, ©) is considered non-ASCII. You’ll often want to strip out these characters to keep your data compatible with systems that expect plain ASCII text.
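
To make the boundary concrete, here is a small sketch that uses ord() to look up each character's code point. The sample string is made up for illustration; anything with a code point above 127 is non-ASCII.

```python
# Print each character's code point and whether it falls in the ASCII range (0-127).
sample = "é©A~"  # hypothetical sample string

for ch in sample:
    code_point = ord(ch)
    print(repr(ch), code_point, code_point < 128)

# Collect only the characters that fall outside ASCII.
non_ascii = [ch for ch in sample if ord(ch) > 127]
print(non_ascii)
```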

Method 1 – Use encode() and decode()

One of the most common ways I remove non-ASCII characters is by encoding the Python string into ASCII and ignoring errors.

text = "Café in New York 😊 costs $5"
clean_text = text.encode("ascii", "ignore").decode()
print(clean_text)

Output:

Caf in New York  costs $5

Here, the non-ASCII characters (é and 😊) are removed. This method is fast and works well when you just want plain ASCII text.
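
If you would rather see where characters were dropped, the error handler can be swapped. The sketch below compares "ignore" with "replace", both standard handlers for str.encode(); the variable names are only illustrative.

```python
text = "Café in New York 😊 costs $5"

# "ignore" drops characters that ASCII cannot represent.
dropped = text.encode("ascii", errors="ignore").decode("ascii")

# "replace" substitutes a "?" for each such character instead of dropping it.
marked = text.encode("ascii", errors="replace").decode("ascii")

print(dropped)
print(marked)
```

The "replace" variant can be handy during data audits, because the question marks show how much was lost.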

Method 2 – Use Regular Expressions (re.sub)

If I want more control, I use regular expressions.

import re

text = "Résumé: José lives in Los Ángeles 🌎"
clean_text = re.sub(r'[^\x00-\x7F]+', '', text)
print(clean_text)

Output:

Rsum: Jos lives in Los ngeles

The pattern [^\x00-\x7F]+ matches runs of characters outside the ASCII range, and re.sub() replaces each run with an empty string. This method is reliable and works when I want to strip everything outside of ASCII.
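
Removing characters this way can leave doubled or trailing spaces behind. As a possible refinement, the sketch below precompiles the pattern and tidies the leftover spaces; the function name strip_non_ascii and the space-collapsing step are my own additions, not part of the method above.

```python
import re

NON_ASCII = re.compile(r"[^\x00-\x7F]+")  # one or more non-ASCII characters
EXTRA_SPACES = re.compile(r" {2,}")       # runs of two or more spaces

def strip_non_ascii(text):
    """Remove non-ASCII characters, then tidy the spaces left behind."""
    cleaned = NON_ASCII.sub("", text)
    return EXTRA_SPACES.sub(" ", cleaned).strip()

print(strip_non_ascii("Résumé: José lives in Los Ángeles 🌎"))
```

Precompiling is mostly a convenience when the same pattern is applied to many rows of a dataset.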

Method 3 – Use isascii() (Python 3.7+)

Python 3.7 introduced the handy str.isascii() method, which returns True when every character in a string is ASCII.

text = "Python is fun 🐍!"
clean_text = ''.join(char for char in text if char.isascii())
print(clean_text)

Output:

Python is fun !

This is my go-to method when I want a Pythonic one-liner. It’s clean, readable, and efficient.
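
On interpreters older than 3.7, where str.isascii() is not available, a comparable filter can be written with ord(). This is a minimal sketch of that fallback rather than a separate method.

```python
text = "Python is fun 🐍!"

# ord(ch) < 128 is the same check that isascii() performs on a single character.
clean_text = "".join(ch for ch in text if ord(ch) < 128)
print(clean_text)
```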

Method 4 – Use filter() with str.isascii

Another simple approach is to use filter() with isascii in Python.

text = "Curaçao is a beautiful island 🌴"
clean_text = ''.join(filter(str.isascii, text))
print(clean_text)

Output:

Curaao is a beautiful island

This works the same way as Method 3, but it passes str.isascii to filter() instead of using a generator expression.

Method 5 – Use `unidecode` Library

Sometimes, I don’t want to just remove characters; I want to convert them to ASCII equivalents.
For example, é should become e.

For this, I use the third-party unidecode library, which you can install with pip install unidecode.

from unidecode import unidecode

text = "Café in Montréal costs €10"
clean_text = unidecode(text)
print(clean_text)

Output:

Cafe in Montreal costs EUR10

This is useful when working with names, addresses, or city names in the USA that may include accented characters.
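
If installing a package is not an option, a rough standard-library approximation is to decompose accented letters with unicodedata.normalize() and then drop what remains. This is a different technique from unidecode, and the helper name strip_accents is hypothetical. It handles accented Latin letters, but symbols such as € are dropped rather than spelled out.

```python
import unicodedata

def strip_accents(text):
    """Decompose accented letters, then drop whatever is still non-ASCII."""
    # NFKD splits a letter like "é" into "e" plus a combining accent mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # The combining marks are non-ASCII, so encoding with "ignore" removes them.
    return decomposed.encode("ascii", "ignore").decode("ascii")

print(strip_accents("Café in Montréal costs €10"))
```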

Method 6 – Use str.translate()

Python’s translate() method allows me to remove unwanted characters using a translation table.

text = "Zoë bought piñata 🎉 for $20"
# Map every non-ASCII character found in the text to None, which deletes it
table = {ord(ch): None for ch in set(text) if ord(ch) > 127}
clean_text = text.translate(table)
print(clean_text)

Output:

Zo bought piata  for $20

This is a bit advanced, but it gives me full control over which characters to strip.
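
As another sketch of the translation-table idea, translate() accepts any object that can be indexed by code point, so a small class can cover all of Unicode without building a large table up front. The class name DropNonAscii is made up for this example.

```python
class DropNonAscii:
    """Translation table that deletes every code point above 127."""

    def __getitem__(self, code_point):
        if code_point > 127:
            return None      # None tells translate() to delete the character
        raise LookupError    # LookupError tells translate() to keep it unchanged

text = "Zoë bought piñata 🎉 for $20"
clean_text = text.translate(DropNonAscii())
print(clean_text)
```

The same instance can be reused across many strings, since it holds no state.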

Method 7 – Use map() with isascii

Finally, I sometimes use map() with isascii for functional-style programming in Python.

text = "François works in San José 🚗"
clean_text = ''.join(map(lambda c: c if c.isascii() else '', text))
print(clean_text)

Output:

Franois works in San Jos 

This method is flexible and works well if you like functional programming patterns.

Which Method Should You Use?

Quick cleanup: Use encode() and decode() (Method 1).
Regex lovers: Use re.sub() (Method 2).
Modern Python: Use isascii() (Method 3 or 4).
Need transliteration: Use unidecode (Method 5).
Custom filtering: Use translate() or map() (Methods 6 & 7).
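
To check that the removal-style options really are interchangeable, here is a quick sketch that runs four of them on the same string and compares the results. The dictionary labels are only for display.

```python
import re

text = "François works in San José 🚗"

results = {
    "encode/decode": text.encode("ascii", "ignore").decode("ascii"),
    "re.sub": re.sub(r"[^\x00-\x7F]+", "", text),
    "isascii": "".join(ch for ch in text if ch.isascii()),
    "filter": "".join(filter(str.isascii, text)),
}

# repr() makes leftover spaces visible in the printed output.
for name, cleaned in results.items():
    print(f"{name:>13}: {cleaned!r}")

# True when every approach produced the same cleaned string.
all_agree = len(set(results.values())) == 1
print(all_agree)
```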

While Python doesn’t have a single built-in function dedicated to removing non-ASCII characters, you can easily achieve it with these seven methods.

I use encode() when I need a quick cleanup, and unidecode when I want to keep text readable for USA-based datasets.

No matter which method you choose, the key is to understand your data cleaning goal: Do you want to just remove characters, or do you want to convert them into usable ASCII equivalents?

Bijay Kumar

Bijay Kumar is an experienced Python and AI professional who enjoys helping developers learn modern technologies through practical tutorials and examples. His expertise includes Python development, Machine Learning, Artificial Intelligence, automation, and data analysis using libraries like Pandas, NumPy, TensorFlow, Matplotlib, SciPy, and Scikit-Learn. At PythonGuides.com, he shares in-depth guides designed for both beginners and experienced developers.
