How can Non-ASCII Characters be Removed from a String in Python?

I was cleaning up some text data for a project where I had to prepare a dataset for analysis.
The dataset came from multiple sources across the USA, including customer feedback forms and survey responses.

The issue? Many of these text fields had non-ASCII characters like emojis, accented letters, or symbols. When I tried to process this data, my scripts failed, and I realized I needed a way to remove or filter out these characters.

In this tutorial, I’ll show you seven simple methods I use to remove non-ASCII characters from strings in Python.

What Are Non-ASCII Characters?

ASCII stands for American Standard Code for Information Interchange. It defines 128 characters (code points 0 to 127): English letters, digits, punctuation, and a set of control characters.

Anything outside this range (like é, ü, ©) is considered non-ASCII. When working with data in the USA, you’ll often want to strip out these characters to keep everything clean and compatible with systems that expect plain ASCII text.
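Before removing anything, it can help to see which characters in a string fall outside the ASCII range. Here is a minimal sketch of that check; the helper name find_non_ascii is my own and not part of any library.

```python
# List the non-ASCII characters in a string with their code points.
# ASCII covers code points 0-127, so anything above 127 is non-ASCII.
def find_non_ascii(text):
    return [(ch, ord(ch)) for ch in text if ord(ch) > 127]

sample = "Café ©"
found = find_non_ascii(sample)
print(found)
```

Each tuple pairs the character with its Unicode code point, which makes it easy to decide whether to drop it or convert it.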

Method 1 – Use encode() and decode()

One of the most common ways I remove non-ASCII characters is by encoding the Python string into ASCII and ignoring errors.

text = "Café in New York 😊 costs $5"
clean_text = text.encode("ascii", "ignore").decode()
print(clean_text)

Output:

Caf in New York  costs $5

Here, the non-ASCII characters (é and 😊) are removed. This method is fast and works well when you just want plain ASCII text.
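The second argument to encode() is the error handler, and "ignore" is only one of the options. As a rough comparison, the sketch below runs the same string through "ignore", "replace", and the default strict mode; the variable names are just for illustration.

```python
text = "Café in New York 😊 costs $5"

# "ignore" silently drops characters that ASCII cannot represent.
ignored = text.encode("ascii", errors="ignore").decode("ascii")

# "replace" substitutes a "?" for each one, which keeps the loss visible.
replaced = text.encode("ascii", errors="replace").decode("ascii")

# The default handler ("strict") raises UnicodeEncodeError instead.
failed_at = None
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    failed_at = exc.start  # index of the first character that could not be encoded

print(ignored)
print(replaced)
print(failed_at)
```

I would reach for "replace" when I want to spot where data was lost, and "ignore" when I only care about the clean result.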

Method 2 – Use Regular Expressions (re.sub)

If I want more control, I use regular expressions.

import re

text = "Résumé: José lives in Los Ángeles 🌎"
clean_text = re.sub(r'[^\x00-\x7F]+', '', text)
print(clean_text)

Output:

Rsum: Jos lives in Los ngeles 

The regex [^\x00-\x7F] matches all non-ASCII characters and removes them. This method is reliable and works when I want to strip everything outside of ASCII.
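Removing a character that sat between two spaces leaves a double space behind. One possible way to handle that is to compile the pattern once and tidy the whitespace in a second pass. This is only a sketch; the names NON_ASCII, EXTRA_SPACES, and strip_non_ascii are ones I made up for the example.

```python
import re

# Compile once so the patterns can be reused across many strings.
NON_ASCII = re.compile(r"[^\x00-\x7F]+")
EXTRA_SPACES = re.compile(r" {2,}")

def strip_non_ascii(text):
    # First drop every run of non-ASCII characters...
    cleaned = NON_ASCII.sub("", text)
    # ...then collapse repeated spaces and trim the ends.
    return EXTRA_SPACES.sub(" ", cleaned).strip()

result = strip_non_ascii("Café in New York 😊 costs $5")
print(result)
```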

Method 3 – Use isascii() (Python 3.7+)

Python 3.7 added the handy str.isascii() method, which returns True when every character in the string is ASCII.

text = "Python is fun 🐍!"
clean_text = ''.join(char for char in text if char.isascii())
print(clean_text)

Output:

Python is fun !

This is my go-to method when I want a Pythonic one-liner. It’s clean, readable, and efficient.
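If the code has to run on a Python version older than 3.7, where isascii() does not exist, comparing each code point against 128 should give the same result. The function name keep_ascii below is hypothetical.

```python
def keep_ascii(text):
    # ord(ch) < 128 is true exactly for the ASCII code points 0-127,
    # so this works without str.isascii().
    return "".join(ch for ch in text if ord(ch) < 128)

result = keep_ascii("Python is fun 🐍!")
print(result)
```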

Method 4 – Use filter() with str.isascii

Another simple approach is to use filter() with isascii in Python.

text = "Curaçao is a beautiful island 🌴"
clean_text = ''.join(filter(str.isascii, text))
print(clean_text)

Output:

Curaao is a beautiful island

This works the same way as Method 3, but it passes str.isascii to filter() instead of using a generator expression.

Method 5 – Use unidecode Library

Sometimes, I don’t want to just remove characters; I want to convert them to their closest ASCII equivalents.
For example, é should become e.

For this, I use the unidecode library. It is a third-party package, so install it first with pip install unidecode.

from unidecode import unidecode

text = "Café in Montréal costs €10"
clean_text = unidecode(text)
print(clean_text)

Output:

Cafe in Montreal costs EUR10

This is useful when working with names, addresses, or city names in the USA that may include accented characters.
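When installing a third-party package is not an option, the standard library's unicodedata module can cover the accented-letter case. This is a different technique from unidecode, not a drop-in replacement: it only strips accents, and symbols with no decomposition, such as €, are dropped rather than spelled out. The function name strip_accents is my own.

```python
import unicodedata

def strip_accents(text):
    # NFKD splits "é" into a plain "e" followed by a combining accent mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # The combining marks are non-ASCII, so encoding with "ignore" drops them.
    return decomposed.encode("ascii", "ignore").decode("ascii")

result = strip_accents("Café in Montréal costs €10")
print(result)
```

Note that the euro sign disappears here, whereas unidecode turns it into EUR.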

Method 6 – Use str.translate()

Python’s translate() method allows me to remove unwanted characters using a translation table.

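text = "Zoë bought piñata 🎉 for $20"
# Map every non-ASCII character in the text to None so translate() deletes it.
# A fixed range such as range(128, 10000) would miss emojis like 🎉 (U+1F389).
table = {ord(char): None for char in text if ord(char) > 127}
clean_text = text.translate(table)
print(clean_text)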

Output:

Zo bought piata  for $20

This is a bit advanced, but it gives me full control over which characters to strip.

Method 7 – Use map() with isascii

Finally, I sometimes use map() with isascii for functional-style programming in Python.

text = "François works in San José 🚗"
clean_text = ''.join(map(lambda c: c if c.isascii() else '', text))
print(clean_text)

Output:

Franois works in San Jos 

This method is flexible and works well if you like functional programming patterns.

Which Method Should You Use?

  • Quick cleanup: Use encode() and decode() (Method 1).
  • Regex lovers: Use re.sub() (Method 2).
  • Modern Python: Use isascii() (Method 3 or 4).
  • Need transliteration: Use unidecode (Method 5).
  • Custom filtering: Use translate() or map() (Methods 6 & 7).

While Python doesn’t have a single built-in function dedicated to removing non-ASCII characters, you can easily achieve it with these seven methods.

I use encode() when I need a quick cleanup, and unidecode when I want to keep text readable for USA-based datasets.

No matter which method you choose, the key is to understand your data cleaning goal: Do you want to just remove characters, or do you want to convert them into usable ASCII equivalents?
