
Traversing XML with lxml in Python and querying it with XPath (tags, attribute names, attribute values, element text)


XML example:

Version 1:

<?xml version="1.0" encoding="UTF-8"?><country name="chain"><provinces><heilongjiang name="citys"><haerbin/><daqing/></heilongjiang><guangdong name="citys"><guangzhou/><shenzhen/><huhai/></guangdong><taiwan name="citys"><taibei/><gaoxiong/></taiwan><xinjiang name="citys"><wulumuqi waith="tianqi">晴</wulumuqi></xinjiang></provinces></country>

This version contains no spaces or line breaks between the tags.

Python example:

# Python 2 syntax (print statements)
from lxml import etree

class r_xpath_xml(object):
    def __init__(self):
        self.xmetrpa = etree.parse('info.xml')   # read the XML data

    def xpxm(self):
        xpxlm = self.xmetrpa
        print etree.tostring(xpxlm)              # print the XML data
        root = xpxlm.getroot()                   # get the root of the tree
        print root.tag, ' ',                     # print the root tag name
        print root.items()                       # attribute names and values of the root
        for a in root:                           # iterate over the first level below the root
            # tag name, tag attributes, tag text
            print a.tag, a.items(), a.text, ' printed type: ', type(a)
            for b in a:
                print b.tag, b.items(), b.text   #,b
                for c in b:
                    print c.tag, c.items(), c.text   #,c
                    for d in c:
                        print d.tag, d.items(), d.text, d
        print xpxlm.xpath('//node()')            #.items()#.tag
        print '====================================================================================================='
        xa = xpxlm.xpath('//heilongjiang/*')     # all children of <heilongjiang>
        print xa
        for xb in xa:
            print xb.tag, xb.items(), xb.text
        xc = xpxlm.xpath('//xinjiang/*')         # all children of <xinjiang>
        print xc, 'type:', type(xc)
        for xd in xc:
            print xd.tag, xd.items(), xd.text

if __name__ == '__main__':
    xpx = r_xpath_xml()
    xpx.xpxm()

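Nested for loops walk the tag hierarchy, one loop per level. tag returns the tag name, items() returns the attributes in the same form as a dict's items(), that is [('attribute name', 'attribute value')], and text returns the data between the opening and closing tags. tag, items() and text are available on objects of type <type 'lxml.etree._Element'>.

Output: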
<country name="chain"><provinces><heilongjiang name="citys"><haerbin/><daqing/></heilongjiang><guangdong name="citys"><guangzhou/><shenzhen/><huhai/></guangdong><taiwan name="citys"><taibei/><gaoxiong/></taiwan><xinjiang name="citys"><wulumuqi waith="tianqi">&#26228;</wulumuqi></xinjiang></provinces></country>
country   [('name', 'chain')]
provinces [] None  printed type:  <type 'lxml.etree._Element'>
heilongjiang [('name', 'citys')] None
haerbin [] None
daqing [] None
guangdong [('name', 'citys')] None
guangzhou [] None
shenzhen [] None
huhai [] None
taiwan [('name', 'citys')] None
taibei [] None
gaoxiong [] None
xinjiang [('name', 'citys')] None
wulumuqi [('waith', 'tianqi')] 晴
[<Element country at 0x2d47b20>, <Element provinces at 0x2d47990>, <Element heilongjiang at 0x2d479b8>, <Element haerbin at 0x2d47558>, <Element daqing at 0x2d47328>, <Element guangdong at 0x2d47300>, <Element guangzhou at 0x2d476e8>, <Element shenzhen at 0x2d47530>, <Element huhai at 0x2d472d8>, <Element taiwan at 0x2d47260>, <Element taibei at 0x2d47238>, <Element gaoxiong at 0x2d47080>, <Element xinjiang at 0x2d47710>, <Element wulumuqi at 0x2d47968>, u'\u6674']
=====================================================================================================
[<Element haerbin at 0x2d479b8>, <Element daqing at 0x2d47148>]
haerbin [] None
daqing [] None
[<Element wulumuqi at 0x2d47968>] type: <type 'list'>
wulumuqi [('waith', 'tianqi')] 晴
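
The nested loops above need one loop per level, so they only work while the depth of the document is known in advance. As a minimal sketch, assuming Python 3 and a trimmed copy of the sample document inlined as a string, a recursive generator handles any depth with the same tag, items() and text accessors. The helper name walk is made up for illustration. The last lines show XPath queries by attribute name and by attribute value, which the article's title mentions.

```python
from lxml import etree

# Trimmed copy of the version 1 sample, inlined so no info.xml file is needed.
XML = (
    '<country name="chain"><provinces>'
    '<heilongjiang name="citys"><haerbin/><daqing/></heilongjiang>'
    '<xinjiang name="citys"><wulumuqi waith="tianqi">晴</wulumuqi></xinjiang>'
    '</provinces></country>'
)


def walk(element, depth=0):
    """Yield (depth, tag, attributes, text) for element and all its descendants."""
    yield depth, element.tag, element.items(), element.text
    for child in element:  # iterating an element yields its direct children
        yield from walk(child, depth + 1)


root = etree.fromstring(XML)
rows = list(walk(root))
for depth, tag, attrs, text in rows:
    print("  " * depth + tag, attrs, text)

# Query by attribute name: the values of every "name" attribute, in document order.
name_values = root.xpath("//@name")
# Query by attribute value: every element whose "waith" attribute equals "tianqi".
by_value = root.xpath('//*[@waith="tianqi"]')
print(name_values)
print([el.tag for el in by_value])
```

The indentation in the printed rows mirrors the nesting depth, which makes the hierarchy easier to read than the flat listing produced by fixed nested loops.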

XML example:

Version 2:

<?xml version="1.0" encoding="UTF-8"?>
<country name="chain">
    <provinces>
        <city:table xmlns:city="http://www.w3school.com.cn/furniture">
        <heilongjiang name="citys"><city:haerbin/><city:daqing/></heilongjiang>
        <guangdong name="citys"><city:guangzhou/><city:shenzhen/><city:zhuhai/></guangdong>
        <taiwan name="citys"><city:taibei/><city:gaoxiong/></taiwan>
        <xinjiang name="citys"><city:wulumuqi>晴</city:wulumuqi></xinjiang>
        </city:table>    
    </provinces>
</country>

Example:
print xpxlm.xpath('//node()')

Output:
The result now contains the whitespace and newline text nodes, and the prefixed tags are shown with their namespace.
[<Element country at 0x2e79b20>, '\n    ', <Element provinces at 0x2e79990>, '\n        ', <Element {http://www.w3school.com.cn/furniture}table at 0x2e79710>, '\n        ', <Element heilongjiang at 0x2e799b8>, <Element {http://www.w3school.com.cn/furniture}haerbin at 0x2e79328>, <Element {http://www.w3school.com.cn/furniture}daqing at 0x2e79968>, '\n        ', <Element guangdong at 0x2e79530>, <Element {http://www.w3school.com.cn/furniture}guangzhou at 0x2e79300>, <Element {http://www.w3school.com.cn/furniture}shenzhen at 0x2e792d8>, <Element {http://www.w3school.com.cn/furniture}zhuhai at 0x2e79260>, '\n        ', <Element taiwan at 0x2e79238>, <Element {http://www.w3school.com.cn/furniture}taibei at 0x2e79080>, <Element {http://www.w3school.com.cn/furniture}gaoxiong at 0x2e79058>, '\n        ', <Element xinjiang at 0x2e796e8>, <Element {http://www.w3school.com.cn/furniture}wulumuqi at 0x2e79558>, u'\u6674', '\n        ', '    \n    ', '\n']
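
The tags in the output above are written as {namespace URI}local-name. A plain XPath name such as //haerbin only matches elements in no namespace, so it no longer finds the prefixed elements. A minimal sketch, assuming Python 3 and a trimmed copy of the version 2 document: the prefix is bound to the URI through the namespaces argument of xpath(), and etree.QName recovers the local name. The variable names are illustrative only.

```python
from lxml import etree

# Trimmed copy of the version 2 sample.
XML = (
    '<country name="chain"><provinces>'
    '<city:table xmlns:city="http://www.w3school.com.cn/furniture">'
    '<heilongjiang name="citys"><city:haerbin/><city:daqing/></heilongjiang>'
    '</city:table></provinces></country>'
)
# Prefix-to-URI mapping used by the XPath expressions below.
NS = {"city": "http://www.w3school.com.cn/furniture"}

root = etree.fromstring(XML)

# An unprefixed name matches only elements without a namespace: nothing is found.
plain = root.xpath("//haerbin")

# With the prefix bound through `namespaces`, the element is found.
qualified = root.xpath("//city:haerbin", namespaces=NS)

# QName splits a {URI}local-name tag; localname is the part after the brace.
local_names = [etree.QName(el).localname for el in root.xpath("//heilongjiang/*")]

print(plain)
print(qualified[0].tag)
print(local_names)
```

The prefix used in the mapping does not have to be the one used in the document; only the URI has to match.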

Removing the whitespace:

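        xp = xpxlm.xpath('//node()')
        print xp,           #.items()#.tag
        for i in xp:
            if ' ' in i or '\n' in i:    # text node containing a space or a newline: skip it
                continue
            elif hasattr(i, 'tag'):      # elements only; other text nodes have no .tag
                print i.tag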

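The if test filters out the text nodes that consist of spaces and newlines.

Output (here {city} abbreviates the full {http://www.w3school.com.cn/furniture} prefix that tag actually returns):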

provinces
{city}table
heilongjiang
{city}haerbin
{city}daqing
guangdong
{city}guangzhou
{city}shenzhen
{city}zhuhai
taiwan
{city}taibei
{city}gaoxiong
xinjiang
{city}wulumuqi
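
The filter above tests for a space or a newline anywhere in the string, so it would also discard a real text value that happens to contain a space. A minimal sketch of three alternatives, assuming Python 3 and a small indented sample in the style of version 2: select elements only with //*, drop text nodes that are empty after strip(), or ask the parser to discard whitespace-only text with remove_blank_text=True.

```python
from lxml import etree

# Small indented sample, so the tree contains whitespace-only text nodes.
XML = """<country name="chain">
    <provinces>
        <xinjiang name="citys"><wulumuqi>晴</wulumuqi></xinjiang>
    </provinces>
</country>"""

root = etree.fromstring(XML)

# Option 1: //* selects element nodes only, so there is nothing to filter.
element_tags = [el.tag for el in root.xpath("//*")]

# Option 2: keep //node(), but drop text nodes that are empty after strip().
kept = [n for n in root.xpath("//node()")
        if not (isinstance(n, str) and not n.strip())]
meaningful = [n if isinstance(n, str) else n.tag for n in kept]

# Option 3: let the parser discard whitespace-only text while reading.
parser = etree.XMLParser(remove_blank_text=True)
compact = etree.fromstring(XML, parser)
compact_nodes = [n if isinstance(n, str) else n.tag
                 for n in compact.xpath("//node()")]

print(element_tags)
print(meaningful)
print(compact_nodes)
```

Option 1 is the simplest when only tags are needed. Options 2 and 3 keep meaningful text such as 晴 in the result.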




