How to Fix Python Encoding Errors

You tried to read a file or process some text in Python, and you got a confusing error about “encoding,” “UnicodeDecodeError,” or “codec.” These errors look intimidating, but they’re actually quite common and fixable.

Encoding errors happen when Python tries to read text that’s stored in a different format than it expects. Think of it like trying to read a book written in a language you don’t speak — the letters might look like gibberish.

This guide will help you understand and fix Python encoding errors.

1 What Causes Python Encoding Errors
2 Fix 1: Specify the Correct Encoding When Reading Files
3 Fix 2: Handle Encoding Errors Gracefully
4 Fix 3: Fix Encoding When Writing Files
5 What to Do If It Still Doesn’t Work
6 Summary

What Causes Python Encoding Errors

Reading a file with the wrong encoding — The file was saved in one format (like Shift-JIS or Latin-1) but Python is trying to read it as UTF-8 (the most common text format).
Special characters in your data — Characters like accented letters (é, ñ), emojis, or non-English text can cause issues if the encoding doesn’t match.
Mixing bytes and strings — In Python 3, text (strings) and raw data (bytes) are different types, and converting between them incorrectly causes errors.
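
The first and third causes can be reproduced in a few lines without touching a file. This is a minimal sketch using made-up sample text; the variable names are illustrative only.

```python
# Sample text containing one non-ASCII character (illustrative only)
text = "café"

# Encoding turns a str into bytes; the same character can map to different bytes
utf8_bytes = text.encode("utf-8")      # b'caf\xc3\xa9'
latin1_bytes = text.encode("latin-1")  # b'caf\xe9'

# Decoding with the wrong codec can silently produce garbled text...
garbled = utf8_bytes.decode("latin-1")  # 'cafÃ©'

# ...or raise UnicodeDecodeError when the bytes are invalid for that codec
try:
    latin1_bytes.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# str and bytes are separate types and cannot be concatenated directly
try:
    "price: " + utf8_bytes
    mixing_failed = False
except TypeError:
    mixing_failed = True
```

Note that the garbled case raises no error at all, which is why checking the text by eye matters as much as avoiding exceptions.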

Fix 1: Specify the Correct Encoding When Reading Files

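The most common encoding error happens when reading files. If you don’t pass an encoding, open() uses your system’s default, which is UTF-8 on most Linux and macOS systems but often CP1252 on Windows. Not every file matches that default.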

Example of the error:

# This might raise UnicodeDecodeError
with open("data.csv") as f:
    content = f.read()
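
If you are unsure which encoding Python picks when the argument is left out, you can ask it. A minimal check using the standard locale module (the variable name is illustrative):

```python
import locale

# The encoding open() falls back to in text mode when none is given
default_encoding = locale.getpreferredencoding(False)
print(f"Default encoding for open(): {default_encoding}")
```

The result depends on your operating system and settings, so the same script can behave differently on two machines.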

Step 1: Try specifying UTF-8 explicitly. This fixes the common case where the file is UTF-8 but your system default is something else, as on many Windows setups.

# Explicitly specify UTF-8 encoding
with open("data.csv", encoding="utf-8") as f:
    content = f.read()
print("File read successfully!")

If the file reads without errors, you’re done.

Step 2: If UTF-8 doesn’t work, try other common encodings.

# Try one of these at a time, not all three in a row

# Latin-1 (also called ISO-8859-1) is common for Western European text
# Note: Latin-1 accepts every byte, so it never raises an error, even when it is wrong
with open("data.csv", encoding="latin-1") as f:
    content = f.read()

# Shift-JIS is common for Japanese text
with open("data.csv", encoding="shift_jis") as f:
    content = f.read()

# CP1252 is common for files created on Windows
with open("data.csv", encoding="cp1252") as f:
    content = f.read()

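Try each one until the file reads without errors and the text looks correct. Reading without errors is not enough on its own: Latin-1 accepts any sequence of bytes, so always check that accented or non-English characters display properly.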
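
Trying encodings by hand can be automated with a loop. The helper below is a sketch, not a standard function: decode_with_fallback and its default candidate list are hypothetical, and it works on bytes you would normally get from open(path, "rb").

```python
def decode_with_fallback(raw, encodings=("utf-8", "shift_jis", "cp1252", "latin-1")):
    """Return (text, encoding) for the first encoding that decodes raw without error."""
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("None of the candidate encodings could decode the data")


# Sample bytes standing in for the contents of a file opened in "rb" mode
sample = "café".encode("cp1252")
text, used = decode_with_fallback(sample, encodings=("utf-8", "cp1252", "latin-1"))
print(used, text)
```

Order matters here. Put strict encodings such as UTF-8 first and Latin-1 last, because Latin-1 never fails and would hide every candidate after it. The first encoding that succeeds is not guaranteed to be the right one, so still inspect the result.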

Step 3: If you’re not sure what encoding the file uses, detect it automatically.

# First, install the chardet library
# pip install chardet

import chardet

# Read the file as raw bytes first
with open("data.csv", "rb") as f:
    raw_data = f.read()

# Detect the encoding
detected = chardet.detect(raw_data)
print(f"Detected encoding: {detected['encoding']}")
print(f"Confidence: {detected['confidence']}")

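# Now read with the detected encoding
# chardet returns None when it cannot guess; falling back to UTF-8 here is an assumption
encoding = detected["encoding"] or "utf-8"
with open("data.csv", encoding=encoding) as f:
    content = f.read()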

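If the detected encoding reads the file correctly, you’ve found the right encoding. Detection is a statistical guess, so treat a low confidence value as a hint to check the text by eye.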
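
As a complement to statistical detection, some files announce their encoding with a byte order mark (BOM) at the very start. The sketch below is a different technique from chardet and uses only the standard codecs module; sniff_bom and BOM_TABLE are hypothetical names, and a file with no BOM simply gives None.

```python
import codecs

# Check longer BOMs first: the UTF-32 LE BOM starts with the UTF-16 LE BOM
BOM_TABLE = (
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)


def sniff_bom(raw):
    """Return a codec name if raw starts with a known BOM, else None."""
    for bom, encoding in BOM_TABLE:
        if raw.startswith(bom):
            return encoding
    return None


# Sample bytes standing in for the first bytes of a file opened in "rb" mode
sample = "hello".encode("utf-8-sig")
print(sniff_bom(sample))
```

The returned codec names ("utf-8-sig", "utf-16", "utf-32") strip the BOM while decoding, so it does not show up as a stray character at the start of the text.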

Fix 2: Handle Encoding Errors Gracefully

Sometimes you just need to read the file even if a few characters are broken. Python provides error handling options for this.

Step 1: Use errors="replace" to replace unreadable characters with a placeholder.

# Replaces undecodable bytes with the Unicode replacement character "�" (U+FFFD)
with open("data.csv", encoding="utf-8", errors="replace") as f:
    content = f.read()
print(content)

If the file reads and most of the text looks correct, this approach works for your use case.

Step 2: Use errors="ignore" to drop undecodable bytes entirely.

# Silently drops bytes that can't be decoded
with open("data.csv", encoding="utf-8", errors="ignore") as f:
    content = f.read()

Note: This might lose some data, so only use this when you don’t need every character to be perfect.
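
To see the difference between the two options, you can decode the same bytes both ways. This is a small sketch on a made-up byte string rather than a real file; the variable names are illustrative.

```python
# Bytes for "café ok" saved as Latin-1; the byte 0xE9 is not valid UTF-8 here
raw = b"caf\xe9 ok"

replaced = raw.decode("utf-8", errors="replace")  # bad byte becomes U+FFFD
ignored = raw.decode("utf-8", errors="ignore")    # bad byte is dropped

# Counting U+FFFD shows how many spots were damaged
damaged = replaced.count("\ufffd")
print(repr(replaced), repr(ignored), damaged)
```

With errors="replace" you can at least count and locate the damage afterwards. With errors="ignore" the missing characters leave no trace.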

Fix 3: Fix Encoding When Writing Files

You might also get encoding errors when writing text to a file, especially if the text contains special characters.

Step 1: Always specify UTF-8 when writing files.

# Write with UTF-8 encoding to support all characters
with open("output.txt", "w", encoding="utf-8") as f:
    f.write("Hello, café! こんにちは 🎉")
print("File written successfully!")

If the file is created and the text looks correct when you open it, you’re good.
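
The sketch below shows why UTF-8 is the safe choice: a narrower encoding such as CP1252 cannot represent every character in the sample text, while UTF-8 round-trips it unchanged. It writes to a temporary directory so it leaves nothing behind; the variable names are illustrative.

```python
import os
import tempfile

text = "Hello, café! こんにちは 🎉"

# A narrow codec cannot represent every character, so encoding fails
try:
    text.encode("cp1252")
    narrow_codec_failed = False
except UnicodeEncodeError:
    narrow_codec_failed = True

# UTF-8 can represent all of them; write and read back in a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "output.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with open(path, encoding="utf-8") as f:
        round_trip = f.read()

print(round_trip == text)
```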

Step 2: On Windows, if you’re printing special characters to the console:

# Windows command prompt might not display all characters
# Set the console output encoding at the top of your script (Python 3.7+)
import sys
sys.stdout.reconfigure(encoding="utf-8")

What to Do If It Still Doesn’t Work

Check the file source — If you downloaded the file, check if the website or tool offers a UTF-8 version.
Open in a text editor — Programs like Notepad++ or VS Code can show you the file’s encoding and let you convert it. In VS Code, check the bottom-right corner.
Re-save the file as UTF-8 — Open the file in your text editor and save it with UTF-8 encoding. In VS Code, click the encoding in the bottom-right and choose “Save with Encoding.”
Check Python version — Make sure you’re using Python 3, which handles Unicode much better than Python 2.
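
Re-saving a file as UTF-8 can also be done in Python once you know its current encoding. The function below is a sketch: convert_to_utf8 is a hypothetical helper, and the demo file names and contents are made up.

```python
import os
import tempfile


def convert_to_utf8(src_path, dst_path, source_encoding):
    """Read src_path using source_encoding and write the same text to dst_path as UTF-8."""
    # newline="" keeps the original line endings unchanged
    with open(src_path, encoding=source_encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)


# Demo with a temporary Latin-1 file
with tempfile.TemporaryDirectory() as tmp_dir:
    src = os.path.join(tmp_dir, "legacy.csv")
    dst = os.path.join(tmp_dir, "legacy_utf8.csv")
    with open(src, "wb") as f:
        f.write("name,city\nJosé,Málaga\n".encode("latin-1"))
    convert_to_utf8(src, dst, "latin-1")
    with open(dst, "rb") as f:
        converted_bytes = f.read()
```

Writing to a separate destination path keeps the original file intact in case the source encoding turns out to be wrong.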

Summary

Python encoding errors happen when the text format doesn’t match what Python expects.
The most common fix is to add encoding="utf-8" when opening files.
If you don’t know the encoding, use the chardet library to detect it automatically.