How to query as GROUP BY in Django?

Question: How to query as GROUP BY in Django?

I query a model:

Members.objects.all()

And it returns:

Eric, Salesman, X-Shop
Freddie, Manager, X2-Shop
Teddy, Salesman, X2-Shop
Sean, Manager, X2-Shop

What I want is to know the best Django way to fire a group_by query to my database, like:

Members.objects.all().group_by('designation')

Which doesn’t work, of course. I know we can do some tricks on django/db/models/query.py, but I am just curious to know how to do it without patching.


Answer 0

If you mean to do aggregation you can use the aggregation features of the ORM:

from django.db.models import Count
Members.objects.values('designation').annotate(dcount=Count('designation'))

This results in a query similar to

SELECT designation, COUNT(designation) AS dcount
FROM members GROUP BY designation

and the output would be of the form

[{'designation': 'Salesman', 'dcount': 2}, 
 {'designation': 'Manager', 'dcount': 2}]
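
To see what that GROUP BY does without a Django project, here is a minimal sketch that runs the same SQL against an in-memory SQLite table. The table layout and the column names `name` and `shop` are assumptions made for illustration; only `designation` and the sample rows come from the question.

```python
import sqlite3

# Hypothetical in-memory table mirroring the rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (name TEXT, designation TEXT, shop TEXT)")
conn.executemany(
    "INSERT INTO members VALUES (?, ?, ?)",
    [
        ("Eric", "Salesman", "X-Shop"),
        ("Freddie", "Manager", "X2-Shop"),
        ("Teddy", "Salesman", "X2-Shop"),
        ("Sean", "Manager", "X2-Shop"),
    ],
)

# Same GROUP BY as above; ORDER BY is added only to make the row order deterministic.
rows = conn.execute(
    "SELECT designation, COUNT(designation) AS dcount "
    "FROM members GROUP BY designation ORDER BY designation"
).fetchall()
conn.close()

# Reshape into the list-of-dicts form that values().annotate() produces.
result = [{"designation": d, "dcount": c} for d, c in rows]
print(result)
```

Without an explicit ordering, the order of the groups is up to the database, so the output in the answer may list the designations in a different order.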

Answer 1

An easy solution, though not the proper way, is to use raw SQL:

results = Members.objects.raw('SELECT * FROM myapp_members GROUP BY designation')

Another solution is to use the group_by property:

from django.db.models.query import QuerySet

query = Members.objects.all().query
query.group_by = ['designation']  # undocumented internal attribute
results = QuerySet(query=query, model=Members)

You can now iterate over the results variable to retrieve your results. Note that group_by is not documented and may change in future versions of Django.

And… why do you want to use group_by? If you don’t use aggregation, you can use order_by to achieve a similar result.
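
As a rough illustration of the order_by idea, the sketch below uses plain tuples in place of model instances and `sorted()` in place of `order_by('designation')`; both substitutions are assumptions made so the example runs without Django. Once the rows arrive ordered by designation, grouping them in Python is a single pass.

```python
from itertools import groupby
from operator import itemgetter

# (name, designation, shop) tuples standing in for Members rows.
members = [
    ("Eric", "Salesman", "X-Shop"),
    ("Freddie", "Manager", "X2-Shop"),
    ("Teddy", "Salesman", "X2-Shop"),
    ("Sean", "Manager", "X2-Shop"),
]

# sorted() stands in for Members.objects.order_by('designation').
ordered = sorted(members, key=itemgetter(1))

# groupby() only merges consecutive rows, which is why the ordering matters.
grouped = {
    designation: [row[0] for row in rows]
    for designation, rows in groupby(ordered, key=itemgetter(1))
}
print(grouped)
```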


Answer 2

You can also use the regroup template tag to group by attributes. From the docs:

cities = [
    {'name': 'Mumbai', 'population': '19,000,000', 'country': 'India'},
    {'name': 'Calcutta', 'population': '15,000,000', 'country': 'India'},
    {'name': 'New York', 'population': '20,000,000', 'country': 'USA'},
    {'name': 'Chicago', 'population': '7,000,000', 'country': 'USA'},
    {'name': 'Tokyo', 'population': '33,000,000', 'country': 'Japan'},
]

...

{% regroup cities by country as country_list %}

<ul>
    {% for country in country_list %}
        <li>{{ country.grouper }}
            <ul>
            {% for city in country.list %}
                <li>{{ city.name }}: {{ city.population }}</li>
            {% endfor %}
            </ul>
        </li>
    {% endfor %}
</ul>

Looks like this:

  • India
    • Mumbai: 19,000,000
    • Calcutta: 15,000,000
  • USA
    • New York: 20,000,000
    • Chicago: 7,000,000
  • Japan
    • Tokyo: 33,000,000

It also works on QuerySets I believe.

source: https://docs.djangoproject.com/en/2.1/ref/templates/builtins/#regroup

Edit: note that the regroup tag does not work as you would expect if your list of dictionaries is not sorted by the grouping key. It only groups consecutive items, so sort your list (or queryset) by the grouper's key before passing it to the regroup tag.
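
The sorting caveat can be reproduced in plain Python. This sketch assumes regroup merges only adjacent items, the way `itertools.groupby` does; the shortened city list and the helper `group_keys` are made up for the illustration.

```python
from itertools import groupby

# A deliberately unsorted list: the two Indian cities are not adjacent.
cities = [
    {"name": "Mumbai", "country": "India"},
    {"name": "New York", "country": "USA"},
    {"name": "Calcutta", "country": "India"},
]


def group_keys(items):
    """Return the group headers produced by grouping consecutive items."""
    return [key for key, _ in groupby(items, key=lambda city: city["country"])]


# Unsorted input: "India" shows up as two separate groups.
unsorted_keys = group_keys(cities)

# Sorted by the grouping key: one group per country.
sorted_keys = group_keys(sorted(cities, key=lambda city: city["country"]))

print(unsorted_keys)
print(sorted_keys)
```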


Answer 3

You need to do custom SQL as exemplified in this snippet:

Custom SQL via subquery

Or in a custom manager as shown in the online Django docs:

Adding extra Manager methods


Answer 4

Django does not support free-form GROUP BY queries. I learned this the hard way. The ORM is not designed to support what you want to do without custom SQL. You are limited to:

  • Raw SQL (i.e. MyModel.objects.raw())
  • cr.execute statements (and hand-made parsing of the result).
  • .annotate() (the GROUP BY is performed over the child model, as in aggregating lines_count=Count('lines')).

On a queryset qs you can set qs.query.group_by = ['field1', 'field2', ...], but this is risky if you don't know exactly which query you are editing, and there is no guarantee that it will work without breaking the internals of the QuerySet object. Besides, it is an internal (undocumented) API, so accessing it directly risks your code no longer being compatible with future Django versions.
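
For the cursor route, a minimal sketch of the "hand-made parsing" step might look like this. It uses the standard-library `sqlite3` cursor as a stand-in; in a Django project the cursor would normally come from `django.db.connection.cursor()`. The table and its columns are assumptions based on the sample data in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE members (name TEXT, designation TEXT, shop TEXT)")
cur.executemany(
    "INSERT INTO members VALUES (?, ?, ?)",
    [
        ("Eric", "Salesman", "X-Shop"),
        ("Freddie", "Manager", "X2-Shop"),
        ("Teddy", "Salesman", "X2-Shop"),
        ("Sean", "Manager", "X2-Shop"),
    ],
)

# Free-form GROUP BY over two columns, written by hand.
cur.execute(
    "SELECT shop, designation, COUNT(*) AS total "
    "FROM members GROUP BY shop, designation ORDER BY shop, designation"
)

# Hand-made parsing: pair each row with the column names from the cursor.
columns = [col[0] for col in cur.description]
parsed = [dict(zip(columns, row)) for row in cur.fetchall()]
conn.close()
print(parsed)
```

The trade-off is the one the answer describes: the rows come back as plain data rather than model instances, so any mapping back to objects is up to you.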


Answer 5

There is a module that allows you to group Django models and still work with a QuerySet in the result: https://github.com/kako-nawao/django-group-by

For example:

from django_group_by import GroupByMixin

class BookQuerySet(QuerySet, GroupByMixin):
    pass

class Book(Model):
    title = TextField(...)
    author = ForeignKey(User, ...)
    shop = ForeignKey(Shop, ...)
    price = DecimalField(...)

class GroupedBookListView(PaginationMixin, ListView):
    template_name = 'book/books.html'
    model = Book
    paginate_by = 100

    def get_queryset(self):
        return Book.objects.group_by('title', 'author').annotate(
            shop_count=Count('shop'), price_avg=Avg('price')).order_by(
            'title', 'author').distinct()

    def get_context_data(self, **kwargs):
        return super().get_context_data(total_count=self.get_queryset().count(), **kwargs)

‘book/books.html’

<ul>
{% for book in object_list %}
    <li>
        <h2>{{ book.title }}</h2>
        <p>{{ book.author.last_name }}, {{ book.author.first_name }}</p>
        <p>{{ book.shop_count }}</p>
        <p>{{ book.price_avg }}</p>
    </li>
{% endfor %}
</ul>

The difference from basic Django annotate/aggregate queries is that you can use the attributes of a related field, e.g. book.author.last_name.

If you need the PKs of the instances that have been grouped together, add the following annotation:

.annotate(pks=ArrayAgg('id'))

NOTE: ArrayAgg is a Postgres specific function, available from Django 1.9 onwards: https://docs.djangoproject.com/en/1.10/ref/contrib/postgres/aggregates/#arrayagg


Answer 6

The documentation says that you can use values() to group the queryset.

class Travel(models.Model):
    interest = models.ForeignKey(Interest)
    user = models.ForeignKey(User)
    time = models.DateTimeField(auto_now_add=True)

# Find the travel and group by the interest:

>>> Travel.objects.values('interest').annotate(Count('user'))
<QuerySet [{'interest': 5, 'user__count': 2}, {'interest': 6, 'user__count': 1}]>
# the interest (id=5) was visited 2 times,
# and the interest (id=6) was visited only once.

>>> Travel.objects.values('interest').annotate(Count('user', distinct=True)) 
<QuerySet [{'interest': 5, 'user__count': 1}, {'interest': 6, 'user__count': 1}]>
# the interest (id=5) was visited by only one person (but this person
# visited the interest 2 times)
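
The difference between the two counts comes down to COUNT versus COUNT(DISTINCT ...) in SQL. The sketch below runs both side by side in SQLite; the `travel` table, its column names, and the user ids are assumptions chosen to match the output shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE travel (interest_id INTEGER, user_id INTEGER)")
# Interest 5 is visited twice by the same user; interest 6 once by another user.
conn.executemany("INSERT INTO travel VALUES (?, ?)", [(5, 1), (5, 1), (6, 2)])

# One row per interest: total visits next to the number of distinct visitors.
rows = conn.execute(
    "SELECT interest_id, COUNT(user_id), COUNT(DISTINCT user_id) "
    "FROM travel GROUP BY interest_id ORDER BY interest_id"
).fetchall()
conn.close()
print(rows)
```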

You can find all the books and group them by name using this code:

Book.objects.values('name').annotate(Count('id')).order_by() # ensure you add the order_by()

You can find a cheat sheet here.


Answer 7

If I’m not mistaken, you can use whatever-query-set.group_by = ['field']


Answer 8

from django.db.models import Sum
Members.objects.annotate(total=Sum('designation'))

First you need to import Sum, then ..