1. 程式人生 > >詳解python中xlrd包的安裝與處理Excel表格

詳解python中xlrd包的安裝與處理Excel表格

python處理Excel常用到的模組是xlrd。使用xlrd可以非常方便的處理Excel文件,下面這篇文章將給大家詳細介紹python中包xlrd的安裝與利用xlrd處理Excel表格的方法,有需要的朋友們可以參考學習,下面來一起看看吧。

一、安裝xlrd

下載後,使用 pip install .whl 安裝即好。

檢視幫助:

?

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

>>> import xlrd

>>> help(xlrd)

C:\Users\Administrator>python
Python 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:59:51) [MSC v.1914 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import xlrd
>>> help( xlrd )
Help on package xlrd:

NAME
    xlrd

DESCRIPTION
    # Copyright (c) 2005-2012 Stephen John Machin, Lingfo Pty Ltd
    # This module is part of the xlrd package, which is released under a
    # BSD-style licence.

PACKAGE CONTENTS
    biffh
    book
    compdoc
    formatting
    formula
    info
    sheet
    timemachine
    xldate
    xlsx

FUNCTIONS
    count_records(filename, outfile=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)
        For debugging and analysis: summarise the file's BIFF records.
        ie: produce a sorted file of ``(record_name, count)``.

        :param filename: The path to the file to be summarised.
        :param outfile: An open file, to which the summary is written.

    dump(filename, outfile=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, unnumbered=False)
        For debugging: dump an XLS file's BIFF records in char & hex.

        :param filename: The path to the file to be dumped.
        :param outfile: An open file, to which the dump is written.
        :param unnumbered: If true, omit offsets (for meaningful diffs).

    open_workbook(filename=None, logfile=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, verbosity=0, use_mmap=1, file_contents=None, encoding_override=None, formatting_info=False, on_demand=False, ragged_rows=False)
        Open a spreadsheet file for data extraction.

        :param filename: The path to the spreadsheet file to be opened.

        :param logfile: An open file to which messages and diagnostics are written.

        :param verbosity: Increases the volume of trace material written to the
                          logfile.

        :param use_mmap:

          Whether to use the mmap module is determined heuristically.
          Use this arg to override the result.

          Current heuristic: mmap is used if it exists.

        :param file_contents:

          A string or an :class:`mmap.mmap` object or some other behave-alike
          object. If ``file_contents`` is supplied, ``filename`` will not be used,
          except (possibly) in messages.

        :param encoding_override:

          Used to overcome missing or bad codepage information
          in older-version files. See :doc:`unicode`.

        :param formatting_info:

          The default is ``False``, which saves memory.
          In this case, "Blank" cells, which are those with their own formatting
          information but no data, are treated as empty by ignoring the file's
          ``BLANK`` and ``MULBLANK`` records.
          This cuts off any bottom or right "margin" of rows of empty or blank
          cells.
          Only :meth:`~xlrd.sheet.Sheet.cell_value` and
          :meth:`~xlrd.sheet.Sheet.cell_type` are available.

          When ``True``, formatting information will be read from the spreadsheet
          file. This provides all cells, including empty and blank cells.
          Formatting information is available for each cell.

          Note that this will raise a NotImplementedError when used with an
          xlsx file.

        :param on_demand:

          Governs whether sheets are all loaded initially or when demanded
          by the caller. See :doc:`on_demand`.

        :param ragged_rows:

          The default of ``False`` means all rows are padded out with empty cells so
          that all rows have the same size as found in
          :attr:`~xlrd.sheet.Sheet.ncols`.

          ``True`` means that there are no empty cells at the ends of rows.
          This can result in substantial memory savings if rows are of widely
          varying sizes. See also the :meth:`~xlrd.sheet.Sheet.row_len` method.

        :returns: An instance of the :class:`~xlrd.book.Book` class.

DATA
    FMLA_TYPE_ARRAY = 4
    FMLA_TYPE_CELL = 1
    FMLA_TYPE_COND_FMT = 8
    FMLA_TYPE_DATA_VAL = 16
    FMLA_TYPE_NAME = 32
    FMLA_TYPE_SHARED = 2
    MMAP_AVAILABLE = 1
    USE_MMAP = 1
    XL_CELL_BLANK = 6
    XL_CELL_BOOLEAN = 4
    XL_CELL_DATE = 3
    XL_CELL_EMPTY = 0
    XL_CELL_ERROR = 5
    XL_CELL_NUMBER = 2
    XL_CELL_TEXT = 1
    __VERSION__ = '1.1.0'
    biff_text_from_num = {0: '(not BIFF)', 20: '2.0', 21: '2.1', 30: '3', ...
    empty_cell = empty:''
    error_text_from_code = {0: '#NULL!', 7: '#DIV/0!', 15: '#VALUE!', 23: ...
    oBOOL = 3
    oERR = 4
    oNUM = 2
    oREF = -1
    oREL = -2
    oSTRG = 1
    oUNK = 0
    okind_dict = {-2: 'oREL', -1: 'oREF', 0: 'oUNK', 1: 'oSTRG', 2: 'oNUM'...

FILE
    c:\users\administrator\appdata\local\programs\python\python37\lib\site-packages\xlrd\__init__.py

XLWT:

>>> import xlwt
>>> help( xlwt )
Help on package xlwt:

NAME
    xlwt

PACKAGE CONTENTS
    BIFFRecords
    Bitmap
    Cell
    Column
    CompoundDoc
    ExcelFormula
    ExcelFormulaLexer
    ExcelFormulaParser
    ExcelMagic
    Formatting
    Row
    Style
    UnicodeUtils
    Utils
    Workbook
    Worksheet
    antlr
    compat

DATA
    __VERSION__ = '1.3.0'

FILE
    c:\users\administrator\appdata\local\programs\python\python37\lib\site-packages\xlwt\__init__.py

通過上述方法可以檢視xlrd的幫助資訊,裡面有xlrd包中的一些模組以及一些成員變數、常量、函式。

二、python處理Excel表格

1、開啟Excel表

?

1

2

3

4

5

6

7

8

9

10

11

import xlrd

# 獲取一個Book物件

book = xlrd.open_workbook("1.xls")

# 獲取一個sheet物件的列表

sheets = book.sheets()

# 遍歷每一個sheet,輸出這個sheet的名字(如果是新建的一個xls表,可能是sheet1、sheet2、sheet3)

for sheet in sheets:

print(sheet.name)

上面的幫助資訊出現了這個函式:open_workbook() ,開啟工作簿,這就打開了Excel表。

返回的是一個Book物件,通過Book物件我們可以獲得一個Sheet的列表,上面的程式就簡單地把每個sheet的名字都輸了出來。

2、讀出指定單元格內的資料

?

1

2

3

4

5

6

7

8

9

10

11

import xlrd

# 獲取一個Book物件

book = xlrd.open_workbook("1.xls")

# 獲取一個sheet物件的列表

sheets = book.sheets()

# 遍歷每一個sheet,輸出這個sheet的名字(如果是新建的一個xls表,可能是sheet1、sheet2、sheet3)

for sheet in sheets:

print(sheet.cell_value(0, 0))

讀出單元格內資料函式 cell_value(row, col) ,行列均從0起。

除此之外,可以通過:

?

1

2

sheet.cell(row, col) # 獲取單元格物件

sheet.cell_type(row, col) # 獲取單元格型別

3、讀取日期資料

如果Excel儲存的某一個單元格資料是日期的話,需要進行一下處理,轉換為datetime型別

?

1

2

3

4

5

6

7

8

9

10

11

12

13

14

from datetime import datetime

from xlrd import xldate_as_tuple

# 獲取一個Book物件

book = xlrd.open_workbook("1.xls")

# 獲取一個sheet物件的列表

sheets = book.sheets()

timeVal = sheets[0].cell_value(0,0)

timestamp = datetime(*xldate_as_tuple(timestamp, 0))

print(timestamp)

4、遍歷每行的資料

?

1

2

3

rows = sheet.get_rows()

for row in rows:

print(row[0].value) # 輸出此行第一列的資料