You tried to read a file or process some text in Python, and you got a confusing error about “encoding,” “UnicodeDecodeError,” or “codec.” These errors look intimidating, but they’re actually quite common and fixable.
Encoding errors happen when Python tries to read text that’s stored in a different format than it expects. Think of it like trying to read a book written in a language you don’t speak — the letters might look like gibberish.
This guide will help you understand and fix Python encoding errors.
What Causes Python Encoding Errors
- Reading a file with the wrong encoding — The file was saved in one format (like Shift-JIS or Latin-1) but Python is trying to read it as UTF-8 (the most common text format).
- Special characters in your data — Characters like accented letters (é, ñ), emojis, or non-English text can cause issues if the encoding doesn’t match.
- Mixing bytes and strings — In Python 3, text (strings) and raw data (bytes) are different types, and converting between them incorrectly causes errors.
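The causes listed above can be reproduced in a few lines. This is a minimal sketch; the sample text and variable names are made up for illustration.

```python
# Illustrative sample text; any string with non-ASCII characters would do.
text = "café"

# Wrong encoding / special characters: bytes saved as Latin-1 are not valid UTF-8.
latin1_bytes = text.encode("latin-1")  # b'caf\xe9'
try:
    latin1_bytes.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError as exc:
    decode_failed = True
    print(f"Decoding failed: {exc}")

# Decoding with the encoding that was actually used recovers the text.
recovered = latin1_bytes.decode("latin-1")
print(recovered == text)

# Mixing bytes and strings: Python 3 refuses to join them implicitly.
try:
    "price: " + b"100"
    concat_failed = False
except TypeError as exc:
    concat_failed = True
    print(f"Mixing str and bytes failed: {exc}")
```

The first failure is the kind of error you typically see when reading a file; the second is a TypeError, which is fixed by calling .decode() on the bytes (or .encode() on the string) before combining them.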
Fix 1: Specify the Correct Encoding When Reading Files
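The most common encoding error happens when reading files. If you don’t pass an encoding, open() uses a platform-dependent default: usually UTF-8 on Linux and macOS, but often a legacy code page such as cp1252 on Windows. Either way, that default may not match the encoding the file was saved in.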
Example of the error:
# This might raise UnicodeDecodeError
with open("data.csv") as f:
    content = f.read()
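To see which default applies on a given machine, something like the following may help. It is a small sketch using only the standard library; on most current Python 3 versions, the value reported by locale.getpreferredencoding(False) is what open() falls back to when no encoding is given.

```python
import locale
import sys

# Encoding that open() typically uses when no encoding= argument is passed.
default_encoding = locale.getpreferredencoding(False)
print(f"Default encoding for open(): {default_encoding}")

# True when Python runs in UTF-8 mode (python -X utf8 or PYTHONUTF8=1),
# in which case open() defaults to UTF-8 regardless of the locale.
utf8_mode = bool(sys.flags.utf8_mode)
print(f"UTF-8 mode enabled: {utf8_mode}")
```

If the reported value is not UTF-8 and your file is, that mismatch is a likely source of the error.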
Step 1: Try specifying UTF-8 explicitly (this fixes many cases).
# Explicitly specify UTF-8 encoding
with open("data.csv", encoding="utf-8") as f:
    content = f.read()
print("File read successfully!")
If the file reads without errors, you’re done.
Step 2: If UTF-8 doesn’t work, try other common encodings.
# Try Latin-1 (also called ISO-8859-1), common for Western European text
with open("data.csv", encoding="latin-1") as f:
    content = f.read()
# Try Shift-JIS, common for Japanese text
with open("data.csv", encoding="shift_jis") as f:
    content = f.read()
# Try CP1252, common for files created on Windows
with open("data.csv", encoding="cp1252") as f:
    content = f.read()
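Try each one until the file reads without errors and the text looks correct. A clean read alone is not proof: Latin-1 maps every possible byte to a character, so a wrong guess shows up as garbled text rather than as an error.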
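The trial-and-error approach can be wrapped in a small helper. The function name read_with_fallback and the order of the candidate encodings are assumptions made for this sketch, not a standard recipe; Latin-1 goes last because it accepts any input.

```python
import os
import tempfile


def read_with_fallback(path, encodings=("utf-8", "shift_jis", "cp1252", "latin-1")):
    """Return (text, encoding) for the first candidate that decodes the file."""
    for name in encodings:
        try:
            with open(path, encoding=name) as f:
                return f.read(), name
        except UnicodeDecodeError:
            continue  # This candidate failed; try the next one.
    raise ValueError(f"None of {encodings} could decode {path}")


# Demo with a temporary file saved as Shift-JIS.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.txt")
    with open(sample_path, "w", encoding="shift_jis") as f:
        f.write("こんにちは")
    text, used = read_with_fallback(sample_path)

print(f"Decoded with {used}: {len(text)} characters")
```

The helper only reports the first encoding that decodes without an error, so you should still look at the returned text to confirm it is readable.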
Step 3: If you’re not sure what encoding the file uses, detect it automatically.
# First, install the chardet library
# pip install chardet
import chardet
# Read the file as raw bytes first
with open("data.csv", "rb") as f:
    raw_data = f.read()
# Detect the encoding (a best guess; "encoding" can be None if detection fails)
detected = chardet.detect(raw_data)
print(f"Detected encoding: {detected['encoding']}")
print(f"Confidence: {detected['confidence']}")
# Now read with the detected encoding
with open("data.csv", encoding=detected["encoding"]) as f:
    content = f.read()
Detection is a statistical guess, so check the confidence value and look at the decoded text. If the detected encoding reads the file correctly, you’ve found the right encoding.
Fix 2: Handle Encoding Errors Gracefully
Sometimes you just need to read the file even if a few characters are broken. Python provides error handling options for this.
Step 1: Use errors="replace" to replace unreadable characters with a placeholder.
# Replaces undecodable bytes with the Unicode replacement character (U+FFFD, shown as "�")
with open("data.csv", encoding="utf-8", errors="replace") as f:
    content = f.read()
print(content)
If the file reads and most of the text looks correct, this approach works for your use case.
Step 2: Use errors="ignore" to skip unreadable characters entirely.
# Silently skips bytes that can't be decoded
with open("data.csv", encoding="utf-8", errors="ignore") as f:
    content = f.read()
Note: This might lose some data, so only use this when you don’t need every character to be perfect.
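The difference between the two options is easiest to see on a short byte string. This is a small sketch; the sample bytes are made up and stand in for a file that was saved as Latin-1 but is being read as UTF-8.

```python
# "café au lait" saved as Latin-1: the byte 0xE9 is not valid UTF-8 here.
raw = b"caf\xe9 au lait"

replaced = raw.decode("utf-8", errors="replace")
ignored = raw.decode("utf-8", errors="ignore")

# ascii() shows non-ASCII characters as escapes, so the output is unambiguous.
print(ascii(replaced))  # 'caf\ufffd au lait'
print(ascii(ignored))   # 'caf au lait'
```

With "replace" you can still see where data was lost; with "ignore" the bad byte disappears without a trace, which makes mistakes harder to notice later.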
Fix 3: Fix Encoding When Writing Files
You might also get encoding errors when writing text to a file, especially if the text contains special characters.
Step 1: Always specify UTF-8 when writing files.
# Write with UTF-8 encoding to support all characters
with open("output.txt", "w", encoding="utf-8") as f:
    f.write("Hello, café! こんにちは 🎉")
print("File written successfully!")
If the file is created and the text looks correct when you open it, you’re good.
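To illustrate why the encoding matters when writing, the sketch below first shows a limited encoding (ASCII) rejecting the text, then writes the same text as UTF-8 and reads it back. The temporary file path is only there to keep the example self-contained.

```python
import os
import tempfile

text = "Hello, café! こんにちは 🎉"

# A limited encoding such as ASCII cannot represent these characters.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError as exc:
    ascii_failed = True
    print(f"Encoding failed: {exc}")

# UTF-8 can represent every Unicode character, so the round trip is lossless.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "output.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with open(path, encoding="utf-8") as f:
        round_trip = f.read()

print(f"Round trip matches: {round_trip == text}")
```

The same UnicodeEncodeError can appear when open() is called for writing without an encoding on a system whose default cannot represent the text.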
Step 2: On Windows, if you’re printing special characters to the console:
# The Windows command prompt might not display all characters.
# Switch standard output to UTF-8 at the top of your script (Python 3.7+).
import sys
sys.stdout.reconfigure(encoding="utf-8")
What to Do If It Still Doesn’t Work
- Check the file source — If you downloaded the file, check if the website or tool offers a UTF-8 version.
- Open in a text editor — Programs like Notepad++ or VS Code can show you the file’s encoding and let you convert it. In VS Code, check the bottom-right corner.
- Re-save the file as UTF-8 — Open the file in your text editor and save it with UTF-8 encoding. In VS Code, click the encoding in the bottom-right and choose “Save with Encoding.”
- Check Python version — Make sure you’re using Python 3, which handles Unicode much better than Python 2.
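Re-saving a file as UTF-8 can also be done from Python once you know the source encoding. The function name convert_to_utf8 and the sample file names are hypothetical, chosen for this sketch.

```python
import os
import tempfile


def convert_to_utf8(src_path, dst_path, source_encoding):
    """Re-save a text file as UTF-8, given its known source encoding."""
    # newline="" keeps the original line endings unchanged.
    with open(src_path, encoding=source_encoding, newline="") as src:
        content = src.read()
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(content)


# Demo: create a CP1252 file, convert it, and read the result as UTF-8.
with tempfile.TemporaryDirectory() as tmp:
    legacy = os.path.join(tmp, "legacy.csv")
    converted = os.path.join(tmp, "converted.csv")
    with open(legacy, "w", encoding="cp1252", newline="") as f:
        f.write("name,city\r\nRenée,Zürich\r\n")
    convert_to_utf8(legacy, converted, "cp1252")
    with open(converted, encoding="utf-8", newline="") as f:
        result = f.read()

print(f"Converted {len(result)} characters to UTF-8")
```

Writing to a separate destination file keeps the original intact in case the source encoding turns out to be a wrong guess.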
Summary
- Python encoding errors happen when the text format doesn’t match what Python expects.
- The most common fix is to add encoding="utf-8" when opening files.
- If you don’t know the encoding, use the chardet library to detect it automatically.