Question: How to sort mongodb with pymongo
I’m trying to use the sort feature when querying my mongoDB, but it is failing. The same query works in the MongoDB console but not here. Code is as follows:
import pymongo
from pymongo import Connection
connection = Connection()
db = connection.myDB
print db.posts.count()
for post in db.posts.find({}, {'entities.user_mentions.screen_name':1}).sort({u'entities.user_mentions.screen_name':1}):
    print post
The error I get is as follows:
Traceback (most recent call last):
File "find_ow.py", line 7, in <module>
for post in db.posts.find({}, {'entities.user_mentions.screen_name':1}).sort({'entities.user_mentions.screen_name':1},1):
File "/Library/Python/2.6/site-packages/pymongo-2.0.1-py2.6-macosx-10.6-universal.egg/pymongo/cursor.py", line 430, in sort
File "/Library/Python/2.6/site-packages/pymongo-2.0.1-py2.6-macosx-10.6-universal.egg/pymongo/helpers.py", line 67, in _index_document
TypeError: first item in each key pair must be a string
I found a link elsewhere that says I need to place a ‘u’ in front of the key if using pymongo, but that didn’t work either. Has anyone else gotten this to work, or is this a bug?
Answer 0
.sort(), in pymongo, takes key and direction as parameters.
So if you want to sort by, let’s say, id, then you should use .sort("_id", 1)
For multiple fields:
.sort([("field1", pymongo.ASCENDING), ("field2", pymongo.DESCENDING)])
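To make the (key, direction) convention concrete, here is a minimal sketch that applies the same kind of sort specification to an in-memory list of dicts. It does not talk to MongoDB at all: sort_documents is a hypothetical helper written for illustration, it assumes flat keys that exist in every document, and the two constants simply mirror the values of pymongo.ASCENDING (1) and pymongo.DESCENDING (-1).

```python
ASCENDING = 1    # same value as pymongo.ASCENDING
DESCENDING = -1  # same value as pymongo.DESCENDING


def sort_documents(docs, spec):
    """Sort a list of dicts by a list of (key, direction) pairs.

    Hypothetical helper, not part of pymongo. It only illustrates what a
    multi-key sort specification means for flat keys.
    """
    result = list(docs)
    # Python's sort is stable, so sorting by the least significant key first
    # and the most significant key last yields the combined ordering.
    for key, direction in reversed(spec):
        result.sort(key=lambda d: d[key], reverse=(direction == DESCENDING))
    return result


docs = [
    {"field1": "b", "field2": 1},
    {"field1": "a", "field2": 1},
    {"field1": "a", "field2": 2},
]
spec = [("field1", ASCENDING), ("field2", DESCENDING)]
ordered = sort_documents(docs, spec)
print(ordered)
```

The first pair in the list is the primary sort key, and later pairs only break ties, which is why the order of the pairs matters.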
Answer 1
You can try this:
db.Account.find().sort("UserName")
db.Account.find().sort("UserName",pymongo.ASCENDING)
db.Account.find().sort("UserName",pymongo.DESCENDING)
Answer 2
This also works:
db.Account.find().sort('UserName', -1)
db.Account.find().sort('UserName', 1)
I’m using this in my code; please comment if I’m doing something wrong here, thanks.
Answer 3
Why does Python use a list of tuples instead of a dict?
In Python versions before 3.7, you cannot guarantee that a dictionary will be iterated in the order you declared its keys.
So, in the mongo shell you could do .sort({'field1':1,'field2':1}) and the interpreter would sort by field1 at the first level and by field2 at the second level.
If this syntax were used in Python, there would be a chance of sorting by field2 at the first level. With a list of tuples, there is no such risk.
.sort([("field1",pymongo.ASCENDING), ("field2",pymongo.DESCENDING)])
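The ordering argument can be sketched with a small normalizer that always produces an explicit list of (key, direction) pairs and refuses a dict outright. normalize_sort_spec is a hypothetical helper, not a pymongo function, and the constants are assumed stand-ins for pymongo.ASCENDING and pymongo.DESCENDING.

```python
ASCENDING = 1    # same value as pymongo.ASCENDING
DESCENDING = -1  # same value as pymongo.DESCENDING


def normalize_sort_spec(key_or_list, direction=ASCENDING):
    """Return a list of (key, direction) pairs with an explicit order.

    Hypothetical helper for illustration; it is not how pymongo is implemented.
    """
    if isinstance(key_or_list, str):
        # A single field name plus a direction.
        return [(key_or_list, direction)]
    if isinstance(key_or_list, dict):
        # Reject dicts so the significance of each key is never implicit.
        raise TypeError("pass a list of (key, direction) pairs, not a dict")
    spec = []
    for key, value in key_or_list:
        if not isinstance(key, str):
            raise TypeError("each key must be a string")
        spec.append((key, value))
    return spec


single = normalize_sort_spec("_id")
multi = normalize_sort_spec([("field1", ASCENDING), ("field2", DESCENDING)])
print(single)
print(multi)
```

Rejecting the dict is a design choice of this sketch: it forces the caller to state which field is the primary key instead of relying on how a mapping happens to be iterated.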
Answer 4
.sort([("field1",pymongo.ASCENDING), ("field2",pymongo.DESCENDING)])
pymongo takes a key and a direction, so you can use the form above.
In your case you can do this:
for post in db.posts.find().sort('entities.user_mentions.screen_name', pymongo.ASCENDING):
    print post
Answer 5
TLDR: In my experience, the aggregation pipeline is a bit faster than the conventional .find().sort().
Now moving to the real explanation. There are two ways to perform sorting operations in MongoDB:
- Using .find() and .sort().
- Or using the aggregation pipeline.
As many have suggested, .find().sort() is the simplest way to perform the sorting.
.sort([("field1",pymongo.ASCENDING), ("field2",pymongo.DESCENDING)])
However, it can be slower than the aggregation pipeline.
Coming to the aggregation pipeline method: the steps to implement a simple aggregation pipeline intended for sorting are:
- $match (optional step)
- $sort
NOTE: In my experience, the aggregation pipeline works a bit faster than the .find().sort() method.
Here’s an example of the aggregation pipeline.
db.collection_name.aggregate([
    {
        "$match": {
            # your query - optional step
        }
    },
    {
        "$sort": {
            "field_1": pymongo.ASCENDING,
            "field_2": pymongo.DESCENDING,
            # ... more fields
        }
    },
])
Try this method yourself, compare the speed and let me know about this in the comments.
Edit: Do not forget to pass allowDiskUse=True to aggregate() when sorting large result sets (for example on multiple fields), otherwise the $sort stage can throw an error when it exceeds its in-memory limit.
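As a sketch of the two steps above, the pipeline can be assembled as plain Python data before it is handed to aggregate(). build_sort_pipeline is a hypothetical helper, the "status" filter is an invented example query, and the constants are assumed stand-ins for pymongo.ASCENDING and pymongo.DESCENDING. The $sort stage is built from an OrderedDict so that the key order stays explicit even on Python versions where a plain dict does not preserve it.

```python
from collections import OrderedDict

ASCENDING = 1    # same value as pymongo.ASCENDING
DESCENDING = -1  # same value as pymongo.DESCENDING


def build_sort_pipeline(sort_spec, match=None):
    """Build a [$match, $sort] aggregation pipeline as a list of stages.

    Hypothetical helper: sort_spec is a list of (key, direction) pairs and
    match is an optional query document.
    """
    pipeline = []
    if match:
        # $match is optional; skip the stage when there is no query.
        pipeline.append({"$match": match})
    # OrderedDict keeps the first pair as the primary sort key.
    pipeline.append({"$sort": OrderedDict(sort_spec)})
    return pipeline


pipeline = build_sort_pipeline(
    [("field_1", ASCENDING), ("field_2", DESCENDING)],
    match={"status": "active"},  # hypothetical query, for illustration only
)
print(pipeline)
# With a live collection the result could be passed along as:
# db.collection_name.aggregate(pipeline, allowDiskUse=True)
```

Building the pipeline separately also makes it easy to print and inspect before running it against a real collection.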
Answer 6
Say you want to sort by the ‘created_on’ field; then you can do it like this:
.sort('created_on', 1 if sort_type == 'asc' else -1)
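If the sort direction arrives as a string (for example from a request parameter), the mapping can be pulled into a small function. This is only a sketch: sort_args is a hypothetical helper, and unlike the one-liner above, which treats anything other than 'asc' as descending, it raises on unknown values. The constants are assumed stand-ins for pymongo.ASCENDING and pymongo.DESCENDING.

```python
ASCENDING = 1    # same value as pymongo.ASCENDING
DESCENDING = -1  # same value as pymongo.DESCENDING


def sort_args(field, sort_type="asc"):
    """Return a (field, direction) pair for a textual sort type.

    Hypothetical helper; raises ValueError for anything but 'asc' or 'desc'.
    """
    directions = {"asc": ASCENDING, "desc": DESCENDING}
    try:
        return field, directions[sort_type.lower()]
    except KeyError:
        raise ValueError("sort_type must be 'asc' or 'desc', got %r" % (sort_type,))


print(sort_args("created_on", "desc"))
# With a live cursor the pair could be unpacked into the call:
# db.posts.find().sort(*sort_args("created_on", "desc"))
```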