Encoding problem when reading *.csv files with pandas in Python

1. Problem

When reading a CSV file with pandas in Python, the following error occurred because of the file's encoding:

Traceback (most recent call last):
  File "pandas\_libs\parsers.pyx", line 1134, in pandas._libs.parsers.TextReader._convert_tokens
  File "pandas\_libs\parsers.pyx", line 1240, in pandas._libs.parsers.TextReader._convert_with_dtype
  File "pandas\_libs\parsers.pyx", line 1256, in pandas._libs.parsers.TextReader._string_convert
  File "pandas\_libs\parsers.pyx", line 1494, in pandas._libs.parsers._string_box_utf8
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 19: invalid start byte

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "E:\PyCharm 2017.3.4\helpers\pydev\pydevd.py", line 1668, in <module>
    main()
  File "E:\PyCharm 2017.3.4\helpers\pydev\pydevd.py", line 1662, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "E:\PyCharm 2017.3.4\helpers\pydev\pydevd.py", line 1072, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "E:\PyCharm 2017.3.4\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "F:/OneDrive - emails.bjut.edu.cn/Program/Python/DCAE/test.py", line 18, in <module>
    load_phenotypes_ABIDE2_RfMRIMaps()
  File "F:/OneDrive - emails.bjut.edu.cn/Program/Python/DCAE\Data\load_data.py", line 109, in load_phenotypes_ABIDE2_RfMRIMaps
    pheno = pd.read_csv(pheno_path)
  File "E:\Python\Python35\lib\site-packages\pandas\io\parsers.py", line 678, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "E:\Python\Python35\lib\site-packages\pandas\io\parsers.py", line 446, in _read
    data = parser.read(nrows)
  File "E:\Python\Python35\lib\site-packages\pandas\io\parsers.py", line 1036, in read
    ret = self._engine.read(nrows)
  File "E:\Python\Python35\lib\site-packages\pandas\io\parsers.py", line 1848, in read
    data = self._reader.read(nrows)
  File "pandas\_libs\parsers.pyx", line 876, in pandas._libs.parsers.TextReader.read
  File "pandas\_libs\parsers.pyx", line 891, in pandas._libs.parsers.TextReader._read_low_memory
  File "pandas\_libs\parsers.pyx", line 968, in pandas._libs.parsers.TextReader._read_rows
  File "pandas\_libs\parsers.pyx", line 1094, in pandas._libs.parsers.TextReader._convert_column_data
  File "pandas\_libs\parsers.pyx", line 1141, in pandas._libs.parsers.TextReader._convert_tokens
  File "pandas\_libs\parsers.pyx", line 1240, in pandas._libs.parsers.TextReader._convert_with_dtype
  File "pandas\_libs\parsers.pyx", line 1256, in pandas._libs.parsers.TextReader._string_convert
  File "pandas\_libs\parsers.pyx", line 1494, in pandas._libs.parsers._string_box_utf8
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 19: invalid start byte
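
The last line of the traceback says that byte 0xa0 cannot start a valid UTF-8 sequence. Before changing anything, a quick check along these lines can confirm whether the file as a whole is valid UTF-8. This is only a minimal sketch using the standard library; the helper name `first_invalid_utf8_byte` is made up for illustration, and the offset it reports is counted from the start of the file, so it may not match the "position 19" in the traceback.

```python
from pathlib import Path
from typing import Optional, Tuple


def first_invalid_utf8_byte(path: str) -> Optional[Tuple[int, int]]:
    """Return (file offset, byte value) of the first byte that breaks
    UTF-8 decoding, or None if the whole file decodes cleanly.

    Hypothetical helper for illustration; not part of pandas.
    """
    raw = Path(path).read_bytes()
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the index of the first undecodable byte.
        return err.start, raw[err.start]
    return None
```

If the function returns something like `(offset, 0xa0)`, the file is not valid UTF-8, which is consistent with the error above.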

I believe the problem is that the file is not encoded as 'utf-8', so I tried converting the file's encoding as follows:

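First open the file in a plain-text editor (for example, Notepad), then choose Save As, change the encoding in the lower-right corner to 'UTF-8', and click Save.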

With that, the problem was solved.
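
The manual re-save can also be scripted, which helps when several files are affected. The sketch below is an assumption-laden illustration, not the method used above: the post never states the file's original encoding, so `cp1252` (where byte 0xa0 is a non-breaking space) is only a guess, and the function name `convert_to_utf8` is hypothetical.

```python
from pathlib import Path


def convert_to_utf8(src: str, dst: str, source_encoding: str = "cp1252") -> None:
    """Re-encode the file at `src` as UTF-8 and write it to `dst`.

    `source_encoding` is an assumption; replace it with the file's
    real encoding if you know it. A wrong guess either raises
    UnicodeDecodeError or silently produces wrong characters.
    """
    # Work on bytes so line endings are carried over unchanged.
    text = Path(src).read_bytes().decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
```

Alternatively, `pd.read_csv` accepts an `encoding` argument, so a call such as `pd.read_csv(pheno_path, encoding="cp1252")` may read the original file without re-saving it, under the same assumption about the source encoding.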