The Most Detailed Tutorial on Working with MongoDB from Python! If You Still Can't Get It After Reading This, Come Find Me Anytime!


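Preparation

My local environment:

  • Python 3.6

  • MongoDB 3.4.3

  • IDE: PyCharm Professional

Because we will operate the database from Python, we also need to install the pymongo package. Note that in my environment, installing it directly with

pip install pymongo

did not give a usable package, possibly because of a version or compatibility problem. So I did the following instead.

Go to www.lfd.uci.edu/~gohlke/pyt…

find the whl file that matches your Python version, and then install that whl file with pip.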


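If your console prints something similar to the following, the client and the collection handle were created successfully. Note that pymongo does not raise an error at this point if the server is down; the first real operation is what confirms the connection.

Collection(Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True), 'test'), 'user')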
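The connection code itself was in a screenshot that did not survive, so here is a minimal sketch of how a handle like the one printed above could be obtained. The helper names `mongo_uri` and `get_collection` are made up for illustration, and the `test` database and `user` collection are taken from the printed output. The `ping` command is added here to force a round trip to the server; it is not something the original necessarily did.

```python
def mongo_uri(host="localhost", port=27017):
    """Build a MongoDB connection string (hypothetical helper)."""
    return "mongodb://{}:{}".format(host, port)


def get_collection(db_name="test", coll_name="user", uri=None):
    """Return a handle to db_name.coll_name, failing fast if the server is down."""
    # Imported here so that the pure helper above works without pymongo installed.
    from pymongo import MongoClient

    # Creating the client does not raise if the server is unreachable.
    client = MongoClient(uri or mongo_uri(), serverSelectionTimeoutMS=3000)
    # "ping" forces a real round trip and raises if no server answers in time.
    client.admin.command("ping")
    return client[db_name][coll_name]


# Usage, assuming a mongod is listening on localhost:27017:
# collection = get_collection()
# print(collection)
```

Subscript access (`client[db_name][coll_name]`) is used instead of attribute access so that the names can come from variables.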

Create: insert

Inserting into MongoDB is also very convenient. I added a few records beforehand in MongoVUE.


Running the code produces the following output:

D:\Software\Python3\python.exe E:/Code/Python/Python3/MyWork/mongo/mongo-connect.py
{‘_id‘: ObjectId(‘58f9e91295ee7820082b562b‘), ‘name‘: ‘郭 璞‘, ‘age‘: 20}
{‘_id‘: ObjectId(‘58f9e96f95ee7820082b562c‘), ‘name‘: ‘張三‘, ‘age‘: 28, ‘Address‘: ‘北京海澱‘, ‘blog‘: ‘http://blog.csdn.net/zhangsan‘}
{‘_id‘: ObjectId(‘58f9e98f95ee7820082b562d‘), ‘name‘: ‘李四‘, ‘age‘: 19, ‘Address‘: ‘上海灘‘, ‘blog‘: ‘http://blog.csdn.net/lisi‘}
{‘_id‘: ObjectId(‘58f9e9a895ee7820082b562e‘), ‘name‘: ‘王五‘, ‘age‘: 23, ‘Address‘: ‘江蘇杭州‘, ‘blog‘: ‘http://blog.csdn.net/wangwu‘}
{‘_id‘: ObjectId(‘58f9e9c595ee7820082b562f‘), ‘name‘: ‘趙六‘, ‘age‘: 32, ‘Address‘: ‘江蘇南京‘, ‘blog‘: ‘http://blog.csdn.net/zhaoliu‘}
{‘_id‘: ObjectId(‘58fac6c495ee78170c9075c7‘), ‘name‘: ‘可愛的噠噠‘, ‘age‘: 19, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘噠噠的博客一般人是不會知道的‘}
{‘_id‘: ObjectId(‘58fae2dd95ee7810044c07cf‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 23, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}

Compared with the initial state, it is easy to see that one newly inserted record has appeared.
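The insert code was also lost with its screenshot. As a rough sketch, a single record like the last one in the output could be added with pymongo's `insert_one`. The helper names `build_user` and `insert_user` are hypothetical, and the field values are copied from the output above.

```python
def build_user(name, age, address, blog):
    """Build one user document; MongoDB adds the _id field on insert."""
    return {"name": name, "age": age, "address": address, "blog": blog}


def insert_user(collection, user):
    """Insert one document and return the _id that was assigned to it."""
    # insert_one returns an InsertOneResult; inserted_id holds the new _id.
    return collection.insert_one(user).inserted_id


# Usage, given a pymongo collection handle named `collection`:
# new_id = insert_user(collection, build_user("刀塔傳奇", 23, "北京朝陽", "沒有博客"))
# for item in collection.find():
#     print(item)
```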

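Inserting multiple records

You may need to insert several records at once. In the command-line shell this is easy: we just write a loop. pymongo, however, also provides an interface for exactly this.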


The output is as follows:

D:\Software\Python3\python.exe E:/Code/Python/Python3/MyWork/mongo/mongo-connect.py
{‘_id‘: ObjectId(‘58f9e91295ee7820082b562b‘), ‘name‘: ‘郭 璞‘, ‘age‘: 20}
{‘_id‘: ObjectId(‘58f9e96f95ee7820082b562c‘), ‘name‘: ‘張三‘, ‘age‘: 28, ‘Address‘: ‘北京海澱‘, ‘blog‘: ‘http://blog.csdn.net/zhangsan‘}
{‘_id‘: ObjectId(‘58f9e98f95ee7820082b562d‘), ‘name‘: ‘李四‘, ‘age‘: 19, ‘Address‘: ‘上海灘‘, ‘blog‘: ‘http://blog.csdn.net/lisi‘}
{‘_id‘: ObjectId(‘58f9e9a895ee7820082b562e‘), ‘name‘: ‘王五‘, ‘age‘: 23, ‘Address‘: ‘江蘇杭州‘, ‘blog‘: ‘http://blog.csdn.net/wangwu‘}
{‘_id‘: ObjectId(‘58f9e9c595ee7820082b562f‘), ‘name‘: ‘趙六‘, ‘age‘: 32, ‘Address‘: ‘江蘇南京‘, ‘blog‘: ‘http://blog.csdn.net/zhaoliu‘}
{‘_id‘: ObjectId(‘58fac6c495ee78170c9075c7‘), ‘name‘: ‘可愛的噠噠‘, ‘age‘: 19, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘噠噠的博客一般人是不會知道的‘}
{‘_id‘: ObjectId(‘58fae2dd95ee7810044c07cf‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 23, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc07‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 1, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc08‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 4, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc09‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 9, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0a‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 16, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0b‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 25, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0c‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 36, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0d‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 49, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}

As you can see, this is quite convenient.
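A batch insert like the one shown could be sketched with `insert_many`. The seven new records in the output have ages 1, 4, 9, ..., 49, which happen to be the squares of 1 through 7; generating them that way is a guess, and `build_batch` and `insert_batch` are hypothetical names.

```python
def build_batch(count=7):
    """Build `count` documents whose ages are the squares 1, 4, 9, ..."""
    return [
        {"name": "刀塔傳奇", "age": i * i, "address": "北京朝陽", "blog": "沒有博客"}
        for i in range(1, count + 1)
    ]


def insert_batch(collection, docs):
    """Insert all documents in one call and return their _id values."""
    # insert_many returns an InsertManyResult; inserted_ids is a list of _ids.
    return collection.insert_many(docs).inserted_ids


# Usage, given a pymongo collection handle named `collection`:
# ids = insert_batch(collection, build_batch())
```

One `insert_many` call sends the whole list to the server, instead of one call per record as a loop around `insert_one` would.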


{‘_id‘: ObjectId(‘58f9e91295ee7820082b562b‘), ‘name‘: ‘郭 璞‘, ‘age‘: 20}
{‘_id‘: ObjectId(‘58f9e96f95ee7820082b562c‘), ‘name‘: ‘張三‘, ‘age‘: 28, ‘Address‘: ‘北京海澱‘, ‘blog‘: ‘http://blog.csdn.net/zhangsan‘}
{‘_id‘: ObjectId(‘58f9e98f95ee7820082b562d‘), ‘name‘: ‘李四‘, ‘age‘: 19, ‘Address‘: ‘上海灘‘, ‘blog‘: ‘http://blog.csdn.net/lisi‘}
{‘_id‘: ObjectId(‘58f9e9a895ee7820082b562e‘), ‘name‘: ‘王五‘, ‘age‘: 23, ‘Address‘: ‘江蘇杭州‘, ‘blog‘: ‘http://blog.csdn.net/wangwu‘}
{‘_id‘: ObjectId(‘58f9e9c595ee7820082b562f‘), ‘name‘: ‘趙六‘, ‘age‘: 32, ‘Address‘: ‘江蘇南京‘, ‘blog‘: ‘http://blog.csdn.net/zhaoliu‘}
{‘_id‘: ObjectId(‘58fac6c495ee78170c9075c7‘), ‘name‘: ‘可愛的噠噠‘, ‘age‘: 19, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘噠噠的博客一般人是不會知道的‘}
{‘_id‘: ObjectId(‘58fae2dd95ee7810044c07cf‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 23, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc07‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 1, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc08‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 4, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc09‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 9, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0a‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 16, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0b‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 25, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0c‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 36, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0d‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 49, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
After the modification:
{‘_id‘: ObjectId(‘58f9e91295ee7820082b562b‘), ‘name‘: ‘郭 璞‘, ‘age‘: 20}
{‘_id‘: ObjectId(‘58f9e96f95ee7820082b562c‘), ‘name‘: ‘張三‘, ‘age‘: 23, ‘address‘: ‘上海新城區‘, ‘blog‘: ‘張飛的博客內容被修改後了的內容‘}
{‘_id‘: ObjectId(‘58f9e98f95ee7820082b562d‘), ‘name‘: ‘李四‘, ‘age‘: 19, ‘Address‘: ‘上海灘‘, ‘blog‘: ‘http://blog.csdn.net/lisi‘}
{‘_id‘: ObjectId(‘58f9e9a895ee7820082b562e‘), ‘name‘: ‘王五‘, ‘age‘: 23, ‘Address‘: ‘江蘇杭州‘, ‘blog‘: ‘http://blog.csdn.net/wangwu‘}
{‘_id‘: ObjectId(‘58f9e9c595ee7820082b562f‘), ‘name‘: ‘趙六‘, ‘age‘: 32, ‘Address‘: ‘江蘇南京‘, ‘blog‘: ‘http://blog.csdn.net/zhaoliu‘}
{‘_id‘: ObjectId(‘58fac6c495ee78170c9075c7‘), ‘name‘: ‘可愛的噠噠‘, ‘age‘: 19, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘噠噠的博客一般人是不會知道的‘}
{‘_id‘: ObjectId(‘58fae2dd95ee7810044c07cf‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 23, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc07‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 1, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc08‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 4, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc09‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 9, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0a‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 16, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0b‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 25, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0c‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 36, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0d‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 49, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}

Method two

Next is a way to modify only some of a record's fields.
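The code for this step was in a lost screenshot. As a sketch of the idea, a partial update in pymongo uses the `$set` operator, while replacing a whole document uses `replace_one`. The function names below are hypothetical, and the filter in the usage comment is only an example, not the author's actual condition.

```python
def set_fields(collection, where, changes, many=False):
    """Change only the fields named in `changes`; other fields stay untouched."""
    update = {"$set": changes}
    if many:
        result = collection.update_many(where, update)  # every matching document
    else:
        result = collection.update_one(where, update)   # first match only
    return result.modified_count


def replace_document(collection, where, new_doc):
    """Swap the first matching document for `new_doc`; only its _id is kept."""
    return collection.replace_one(where, new_doc).modified_count


# Usage, given a pymongo collection handle named `collection`:
# set_fields(collection, {"name": "刀塔傳奇"}, {"name": "可愛的Mongodb"}, many=True)
```

With `$set`, fields that are not mentioned keep their values. With a replacement, any field missing from `new_doc` is gone afterwards.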


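By default, a delete removes every record that matches the condition, as the output before and after the operation shows: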

{‘_id‘: ObjectId(‘58f9e91295ee7820082b562b‘), ‘name‘: ‘郭 璞‘, ‘age‘: 20}
{‘_id‘: ObjectId(‘58f9e96f95ee7820082b562c‘), ‘name‘: ‘張三‘, ‘age‘: 23, ‘address‘: ‘上海新城區‘, ‘blog‘: ‘張飛的博客內容被修改後了的內容‘}
{‘_id‘: ObjectId(‘58f9e98f95ee7820082b562d‘), ‘name‘: ‘李四‘, ‘age‘: 19, ‘Address‘: ‘上海灘‘, ‘blog‘: ‘http://blog.csdn.net/lisi‘}
{‘_id‘: ObjectId(‘58f9e9a895ee7820082b562e‘), ‘name‘: ‘王五‘, ‘age‘: 23, ‘Address‘: ‘江蘇杭州‘, ‘blog‘: ‘http://blog.csdn.net/wangwu‘}
{‘_id‘: ObjectId(‘58f9e9c595ee7820082b562f‘), ‘name‘: ‘趙六‘, ‘age‘: 32, ‘Address‘: ‘江蘇南京‘, ‘blog‘: ‘http://blog.csdn.net/zhaoliu‘}
{‘_id‘: ObjectId(‘58fac6c495ee78170c9075c7‘), ‘name‘: ‘可愛的噠噠‘, ‘age‘: 19, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘噠噠的博客一般人是不會知道的‘}
{‘_id‘: ObjectId(‘58fae2dd95ee7810044c07cf‘), ‘name‘: ‘可愛的Mongodb‘, ‘age‘: 23, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc07‘), ‘name‘: ‘可愛的Mongodb‘, ‘age‘: 1, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc08‘), ‘name‘: ‘可愛的Mongodb‘, ‘age‘: 4, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc09‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 9, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0a‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 16, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0b‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 25, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0c‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 36, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0d‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 49, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
After the modification:
{‘_id‘: ObjectId(‘58f9e91295ee7820082b562b‘), ‘name‘: ‘郭 璞‘, ‘age‘: 20}
{‘_id‘: ObjectId(‘58f9e96f95ee7820082b562c‘), ‘name‘: ‘張三‘, ‘age‘: 23, ‘address‘: ‘上海新城區‘, ‘blog‘: ‘張飛的博客內容被修改後了的內容‘}
{‘_id‘: ObjectId(‘58f9e98f95ee7820082b562d‘), ‘name‘: ‘李四‘, ‘age‘: 19, ‘Address‘: ‘上海灘‘, ‘blog‘: ‘http://blog.csdn.net/lisi‘}
{‘_id‘: ObjectId(‘58f9e9a895ee7820082b562e‘), ‘name‘: ‘王五‘, ‘age‘: 23, ‘Address‘: ‘江蘇杭州‘, ‘blog‘: ‘http://blog.csdn.net/wangwu‘}
{‘_id‘: ObjectId(‘58f9e9c595ee7820082b562f‘), ‘name‘: ‘趙六‘, ‘age‘: 32, ‘Address‘: ‘江蘇南京‘, ‘blog‘: ‘http://blog.csdn.net/zhaoliu‘}
{‘_id‘: ObjectId(‘58fac6c495ee78170c9075c7‘), ‘name‘: ‘可愛的噠噠‘, ‘age‘: 19, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘噠噠的博客一般人是不會知道的‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc09‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 9, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0a‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 16, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0b‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 25, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0c‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 36, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0d‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 49, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
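In the output above, the three records named 可愛的Mongodb are gone after the operation. A delete with that effect could be sketched as follows using pymongo's `delete_many` and `delete_one`. The helper name is hypothetical, and the empty-filter guard is an addition for safety, not part of the original.

```python
def delete_matching(collection, where, only_first=False):
    """Delete documents matching `where` and return how many were removed."""
    if not where:
        # An empty filter matches every document, so refuse it explicitly.
        raise ValueError("refusing to delete with an empty filter")
    if only_first:
        return collection.delete_one(where).deleted_count
    return collection.delete_many(where).deleted_count


# Usage, given a pymongo collection handle named `collection`:
# removed = delete_matching(collection, {"name": "可愛的Mongodb"})
```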

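Read: find

For a database, the most frequently used operation is probably the query, and query speed weighs heavily in how a database's performance is judged. A few common queries are introduced below.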


{‘_id‘: ObjectId(‘58f9e96f95ee7820082b562c‘), ‘name‘: ‘張三‘, ‘age‘: 23, ‘address‘: ‘上海新城區‘, ‘blog‘: ‘張飛的博客內容被修改後了的內容‘}
{‘_id‘: ObjectId(‘58f9e9a895ee7820082b562e‘), ‘name‘: ‘王五‘, ‘age‘: 23, ‘Address‘: ‘江蘇杭州‘, ‘blog‘: ‘http://blog.csdn.net/wangwu‘}
{‘_id‘: ObjectId(‘58f9e9c595ee7820082b562f‘), ‘name‘: ‘趙六‘, ‘age‘: 32, ‘Address‘: ‘江蘇南京‘, ‘blog‘: ‘http://blog.csdn.net/zhaoliu‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0b‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 25, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0c‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 36, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0d‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 49, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
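Every record in the output above has an age greater than 20, which is consistent with a `$gt` filter. A minimal sketch of such conditional queries, with hypothetical helper names, might look like this:

```python
# Map familiar comparison symbols to MongoDB query operators.
COMPARISONS = {">": "$gt", ">=": "$gte", "<": "$lt", "<=": "$lte", "!=": "$ne"}


def compare(field, op, value):
    """Build a filter document such as {"age": {"$gt": 20}}."""
    return {field: {COMPARISONS[op]: value}}


def find_where(collection, field, op, value):
    """Run the comparison query and return the matching documents as a list."""
    return list(collection.find(compare(field, op, value)))


# Usage, given a pymongo collection handle named `collection`:
# for item in find_where(collection, "age", ">", 20):
#     print(item)
```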

Limiting the number of results

collection.find({"age": {'$gt': 20}}).limit(number)

Querying the values of one or more fields


Output:

The user collection contains 11 records in total
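The code for the field query and for the record count did not survive. Here is a sketch of both, under the assumption that a reasonably recent pymongo is used: a projection document is passed as the second argument to `find`, and the count comes from `count_documents` (older pymongo versions used a `count` method instead). The helper names are hypothetical.

```python
def projection(*fields, include_id=False):
    """Build a projection that returns only the named fields."""
    spec = {name: 1 for name in fields}
    if not include_id:
        spec["_id"] = 0  # _id is returned unless it is excluded explicitly
    return spec


def select_fields(collection, where, *fields):
    """Return matching documents reduced to the requested fields."""
    return list(collection.find(where, projection(*fields)))


def count_all(collection):
    """Return the number of documents in the collection."""
    return collection.count_documents({})


# Usage, given a pymongo collection handle named `collection`:
# print(select_fields(collection, {}, "name", "age"))
# print("The user collection contains {} records".format(count_all(collection)))
```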

Sorting query results

items = collection.find().sort([('age', pymongo.ASCENDING), ('Address', pymongo.DESCENDING)])
for item in items:
    print(item)

Output:

{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc09‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 9, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0a‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 16, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58f9e98f95ee7820082b562d‘), ‘name‘: ‘李四‘, ‘age‘: 19, ‘Address‘: ‘上海灘‘, ‘blog‘: ‘http://blog.csdn.net/lisi‘}
{‘_id‘: ObjectId(‘58fac6c495ee78170c9075c7‘), ‘name‘: ‘可愛的噠噠‘, ‘age‘: 19, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘噠噠的博客一般人是不會知道的‘}
{‘_id‘: ObjectId(‘58f9e91295ee7820082b562b‘), ‘name‘: ‘郭 璞‘, ‘age‘: 20}
{‘_id‘: ObjectId(‘58f9e9a895ee7820082b562e‘), ‘name‘: ‘王五‘, ‘age‘: 23, ‘Address‘: ‘江蘇杭州‘, ‘blog‘: ‘http://blog.csdn.net/wangwu‘}
{‘_id‘: ObjectId(‘58f9e96f95ee7820082b562c‘), ‘name‘: ‘張三‘, ‘age‘: 23, ‘address‘: ‘上海新城區‘, ‘blog‘: ‘張飛的博客內容被修改後了的內容‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0b‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 25, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58f9e9c595ee7820082b562f‘), ‘name‘: ‘趙六‘, ‘age‘: 32, ‘Address‘: ‘江蘇南京‘, ‘blog‘: ‘http://blog.csdn.net/zhaoliu‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0c‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 36, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0d‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 49, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}

Fuzzy query
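The original example for this heading was an image that is no longer available. One common way to do a "contains" style match in MongoDB is the `$regex` operator; the sketch below takes that approach, and the helper name `contains` is hypothetical.

```python
import re


def contains(field, text, ignore_case=False):
    """Build a filter matching documents whose `field` contains `text`."""
    # re.escape keeps characters such as "." from acting as regex syntax.
    query = {"$regex": re.escape(text)}
    if ignore_case:
        query["$options"] = "i"
    return {field: query}


# Usage, given a pymongo collection handle named `collection`:
# for item in collection.find(contains("blog", "csdn.net")):
#     print(item)
```

Field names are case sensitive, so with the sample data a filter on `Address` and one on `address` match different records.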


in query

# in query
items = collection.find({'age': {'$in': [18, 19, 22, 28]}})
for item in items:
    print(item)

Output:

{‘_id‘: ObjectId(‘58f9e98f95ee7820082b562d‘), ‘name‘: ‘李四‘, ‘age‘: 19, ‘Address‘: ‘上海灘‘, ‘blog‘: ‘http://blog.csdn.net/lisi‘}
{‘_id‘: ObjectId(‘58fac6c495ee78170c9075c7‘), ‘name‘: ‘可愛的噠噠‘, ‘age‘: 19, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘噠噠的博客一般人是不會知道的‘}

not in query

# not in query
items = collection.find({'age': {'$nin': [18, 19, 28]}})
for item in items:
    print(item)

Output:

{‘_id‘: ObjectId(‘58f9e91295ee7820082b562b‘), ‘name‘: ‘郭 璞‘, ‘age‘: 20}
{‘_id‘: ObjectId(‘58f9e96f95ee7820082b562c‘), ‘name‘: ‘張三‘, ‘age‘: 23, ‘address‘: ‘上海新城區‘, ‘blog‘: ‘張飛的博客內容被修改後了的內容‘}
{‘_id‘: ObjectId(‘58f9e9a895ee7820082b562e‘), ‘name‘: ‘王五‘, ‘age‘: 23, ‘Address‘: ‘江蘇杭州‘, ‘blog‘: ‘http://blog.csdn.net/wangwu‘}
{‘_id‘: ObjectId(‘58f9e9c595ee7820082b562f‘), ‘name‘: ‘趙六‘, ‘age‘: 32, ‘Address‘: ‘江蘇南京‘, ‘blog‘: ‘http://blog.csdn.net/zhaoliu‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc09‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 9, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0a‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 16, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0b‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 25, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0c‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 36, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}
{‘_id‘: ObjectId(‘58fae37b95ee78202cc2cc0d‘), ‘name‘: ‘刀塔傳奇‘, ‘age‘: 49, ‘address‘: ‘北京朝陽‘, ‘blog‘: ‘沒有博客‘}

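Those are most of the simple queries you will use regularly. Once you have these down, you are in reasonably good shape.

Hands-on practice

Let's work through a practical example to build fluency. Say I want to scrape the proxy IP information from the Xici proxy site (xicidaili.com). I could go about it like this.

Scraping module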

# coding: utf8
# @Author: 郭 璞
# @File: spider.py
# @Time: 2017/4/22
# @Contact: [email protected]
# @blog: http://blog.csdn.net/marksinoberg
# @Description: proxy IP scraping module
import requests
from bs4 import BeautifulSoup


def gethtml(url, headers):
    """
    Fetch the page source.
    :param url:
    :param headers:
    :return:
    """
    return requests.get(url=url, headers=headers).text


def parse(data):
    """
    Parse the page source and collect the proxy IP records into one list.
    :param data:
    :return:
    """
    result = []
    soup = BeautifulSoup(data, 'html.parser')
    # Skip the first two rows of the table
    iptable = soup.find('table', {'id': 'ip_list'}).find_all('tr')[2:]
    for item in iptable:
        try:
            ipaddress = item.find_all('td')[1].get_text()
            port = item.find_all('td')[2].get_text()
            address = item.find_all('td')[3].get_text()
            ip = {'ipaddress': ipaddress, 'port': port, 'address': address}
            result.append(ip)
        except Exception as e:
            # Rows without the expected cells are reported and skipped
            print(e)
            continue
    return result


if __name__ == '__main__':
    headers = {
        'Referer': 'http://www.xicidaili.com/',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.110 Safari/537.36'
    }
    url = 'http://www.xicidaili.com'
    result = parse(gethtml(url=url, headers=headers))
    # Each record is a dict, so read its fields by key rather than by attribute
    print("IP: {}\t Port: {}\t Location: {}".format(result[0]['ipaddress'], result[0]['port'], result[0]['address']))

Storage module

# coding: utf8
# @Author: 郭 璞
# @File: storage.py
# @Time: 2017/4/22
# @Contact: [email protected]
# @blog: http://blog.csdn.net/marksinoberg
# @Description: proxy IP storage module
import pymongo


class DbUtils(object):
    def __init__(self, dbname='test', collectionname=''):
        # Look up the database and the collection by name;
        # subscript access replaces the original exec() calls
        self.db = pymongo.MongoClient()[dbname]
        self.collection = self.db[collectionname]

    def storage(self, data=None):
        # insert_many rejects an empty list, so report that case as a failure
        if not data:
            return False
        return self.collection.insert_many(data).acknowledged

    def update(self, old=None, new=None):
        # update_one stands in for the legacy update(), which changed one document by default
        return self.collection.update_one(old or {}, {'$set': new or {}}).acknowledged

    def select(self, where=None, field=None, limits=3, ordering=None):
        cursor = self.collection.find(where or {}, field).limit(limits)
        # sort() rejects an empty list, so only sort when an ordering is given
        return cursor.sort(ordering) if ordering else cursor

    def delete(self, field=None):
        # delete_many stands in for the legacy remove(), which deleted every match by default
        return self.collection.delete_many(field or {}).acknowledged


if __name__ == '__main__':
    dbutils = DbUtils('test', 'iptable')
    data = [
        {'ipaddress': '127.0.0.1', 'port': 8080, 'address': '遼寧大連'},
        {'ipaddress': 'localhost', 'port': '27017', 'address': '北京朝陽'}
    ]
    # flag = dbutils.storage(data=data)
    # old = {'address': '遼寧大連'}
    # new = {'address': '浙江溫州'}
    # flag = dbutils.update(old=old, new=new)
    # field = {'port': 8080}
    # flag = dbutils.delete(field=field)
    # print("OP Result:", flag)
    result = dbutils.select({'ipaddress': 'localhost'}, ['ipaddress', 'address'], 3, [('port', pymongo.ASCENDING)])
    for item in result:
        print(item)

Main module

# coding: utf8
# @Author: 郭 璞
# @File: Main.py
# @Time: 2017/4/22
# @Contact: [email protected]
# @blog: http://blog.csdn.net/marksinoberg
# @Description: integration of the proxy IP modules
from mongo import spider
from mongo import storage

if __name__ == '__main__':
    headers = {
        'Referer': 'http://www.xicidaili.com/',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.110 Safari/537.36'
    }
    url = 'http://www.xicidaili.com'
    # Fetch the list of proxy IPs
    iptable = spider.parse(spider.gethtml(url=url, headers=headers))
    print(iptable)
    # Store the data
    dbutils = storage.DbUtils('test', 'iptable')
    flag = dbutils.storage(data=iptable)
    if flag:
        # Success message: the proxy IPs have been stored
        print('代理IP已經裝填完畢,整裝待發!')
    else:
        # Failure message: storing the proxy IPs did not succeed
        print('代理IP裝填不成功,可能出現了點問題!')

Result

[{'ipaddress': '119.5.0.3', 'port': '808', 'address': '四川南充'}, ..., {'ipaddress': '202.119.199.147', 'port': '1080', 'address': '江蘇徐州'}]
代理IP已經裝填完畢,整裝待發!

[Screenshot: the stored records as they appear in the database]

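Summary

Finally, let's review what we covered today.

  • Used Python to perform basic operations on MongoDB.

  • Demonstrated and explained the common operations.

  • Practice: wrapped the crawler and a database utility class, applying a layered, modular design.

    That is about it. For now the proxy IPs serve as practice material. More features can be added later, for example sending an email or calling an instant-messaging API to notify yourself promptly once suitable data has been scraped. These are all still to be added.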
