Remove Unicode characters in Python

In this Python tutorial, we will discuss how to remove Unicode characters from a string in Python. We will also cover:

  • Remove Unicode characters from a string in Python
  • Remove the Unicode "u" prefix from a string in Python
  • Remove special characters from a Python string
  • Remove non-ASCII characters in Python

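Remove Unicode characters from a string in Python

Every Python 3 string is a sequence of Unicode characters, so "removing Unicode characters" here means removing the characters that fall outside the ASCII range. To do this, encode the string with str.encode() using the "ascii" codec and the "ignore" error handler, which drops every character that has no ASCII representation, and then decode the resulting bytes back to a string.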

Example:

string_unicode = " Python is easy \u200c to learn. "
string_encode = string_unicode.encode("ascii", "ignore")
string_decode = string_encode.decode()
print(string_decode)

When you run the above code and print string_decode, the output is " Python is easy  to learn. ". The zero-width non-joiner (\u200c) is dropped because encode() was called with "ignore", while the spaces on either side of it remain, which leaves two consecutive spaces. The call to decode() then converts the bytes back to a string.
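The "ignore" argument is one of several error handlers that str.encode() accepts. As a minimal sketch (the variable names below are illustrative, not from the original example), the following compares "ignore", "replace", and the default "strict" handler on the same input:

```python
text = "Python is easy \u200c to learn."

# "ignore" silently drops characters that have no ASCII equivalent.
ignored = text.encode("ascii", "ignore").decode("ascii")

# "replace" substitutes a "?" for each such character instead.
replaced = text.encode("ascii", "replace").decode("ascii")

# The default handler ("strict") raises an error rather than losing data.
try:
    text.encode("ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

print(repr(ignored))
print(repr(replaced))
print(strict_failed)
```

Which handler fits depends on whether silently losing characters is acceptable; "replace" at least leaves a visible marker where something was removed.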


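Remove the Unicode "u" prefix from a string in Python

Sometimes a string contains a literal u' prefix as part of its text, for example when the printed form of a Python 2 Unicode string was saved as text. In that case, we can use the replace() method to remove the "u" from the string.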

Example:

string = "u\'Python is easy'"
string_unicode = string.replace("u'", "'")
print(string_unicode)

When you run the above code and print string_unicode, the output is 'Python is easy', including the single quotes, because they are part of the string's text. Here, replace() swaps every occurrence of u' for ', which removes the "u" prefix. Note that it replaces every u' in the string, not only the one at the start.
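Because replace("u'", "'") matches every u followed by a quote, it can also alter ordinary words such as "you're". A rough sketch of a stricter alternative, assuming a regular expression is acceptable here (the helper name strip_u_prefix is hypothetical), removes the u only when it starts a word and sits directly before a quote:

```python
import re

def strip_u_prefix(text):
    """Remove a literal u that starts a word and directly precedes a quote."""
    # \b requires a word boundary before the u, so the u in "you're" is kept.
    return re.sub(r"\bu(?=['\"])", "", text)

sample = "[u'Python', u'is easy', \"you're\"]"

cleaned = strip_u_prefix(sample)
naive = sample.replace("u'", "'")

print(cleaned)
print(naive)
```

In this sample the plain replace() call turns "you're" into "yo're", while the regular expression leaves it untouched.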


We can also call encode() on a string literal written with the u prefix. In Python 3 the u prefix has no effect, because every str is already Unicode, and encode() returns a bytes object, so the printed value shows a b prefix instead.

Example:

string = u'hello world!'
string_encode = string.encode('ascii')
print(string_encode)

When you run the above code and print string_encode, the output is b'hello world!'. The result is a bytes object rather than a string, which is why its printed form starts with b. To get a plain string back, call decode() on the result.
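To make the distinction between str and bytes concrete, here is a small sketch (variable names are illustrative) that checks the type at each step and decodes the bytes back to a string:

```python
text = u"hello world!"  # the u prefix is accepted but changes nothing in Python 3
encoded = text.encode("ascii")      # str -> bytes
decoded = encoded.decode("ascii")   # bytes -> str

print(type(text).__name__, type(encoded).__name__, type(decoded).__name__)
print(decoded)
```

The decoded value compares equal to the original string, so the round trip loses nothing when the text is pure ASCII.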


This is how we can remove the Unicode "u" prefix from a string in Python.


Remove special characters from a Python string

To remove special characters from a string in Python, we can use isalnum(), which returns True only for letters and digits. Special characters here include whitespace, punctuation, and slashes.

Example:

my_string = "sgr /k !? 100002"
string = ""
for character in my_string:
    if character.isalnum():
        string = string + character
print(string)

When you run the above code and print string, the output is "sgrk100002". The loop iterates through each character of my_string and keeps only those for which isalnum() returns True, so the result contains only letters and digits.
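The same filtering can be written more compactly with str.join() and a generator expression. The sketch below is one possible variation, not the original author's code; the second and third variants are assumptions about what a reader might want (keeping spaces, or restricting the result to ASCII with str.isascii(), which requires Python 3.7 or later):

```python
my_string = "sgr /k !? 100002"

# Keep only letters and digits, as in the loop version.
only_alnum = "".join(ch for ch in my_string if ch.isalnum())

# Variant: keep spaces as well, so words stay separated.
alnum_and_spaces = "".join(ch for ch in my_string if ch.isalnum() or ch == " ")

# isalnum() is also True for non-ASCII letters such as "é";
# add isascii() if those should be removed too.
ascii_alnum = "".join(ch for ch in "café 2.0" if ch.isascii() and ch.isalnum())

print(only_alnum)
print(repr(alnum_and_spaces))
print(ascii_alnum)
```

Building the result with join() avoids repeated string concatenation inside the loop, which matters mainly for long inputs.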


This is how we can remove special characters from a Python string.

Remove non-ASCII characters in Python

To remove non-ASCII characters in Python, call encode() on the string with the encoding set to "ascii" and the error handler set to "ignore". This returns a bytes object without the non-ASCII characters. Then call decode() on that result to get a string back.

Example:

string_nonASCII = " àa fuünny charactersß. "
string_encode = string_nonASCII.encode("ascii", "ignore")
string_decode = string_encode.decode()
print(string_decode)

When you run the above code and print string_decode, the output is " a funny characters. ". Here, encode() with "ignore" drops the non-ASCII characters (à, ü, and ß) and returns bytes, and decode() converts those bytes back to a string.
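Dropping non-ASCII characters outright also discards accented letters entirely. If keeping the base letter is preferable, one commonly used alternative (not part of the original example) is to decompose the text with unicodedata.normalize() before encoding. A minimal sketch, with to_ascii as a hypothetical helper name:

```python
import unicodedata

def to_ascii(text):
    """Strip accents where possible, then drop any remaining non-ASCII characters."""
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # The combining marks are non-ASCII, so "ignore" removes them.
    return decomposed.encode("ascii", "ignore").decode("ascii")

sample = "café naïve"

print(to_ascii(sample))
print(sample.encode("ascii", "ignore").decode("ascii"))
```

With normalization the sample becomes "cafe naive"; without it, the accented letters vanish and the result is "caf nave". Characters that have no ASCII base form are still dropped in both cases.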


This is how we can remove non-ASCII characters in Python.


In this tutorial, we have discussed how to remove Unicode characters from a string in Python, with examples covering the "u" prefix, special characters, and non-ASCII characters.