如何打印分组对象

Question 1

I want to print the result of grouping with Pandas.

I have a dataframe:

import pandas as pd
df = pd.DataFrame({'A': ['one', 'one', 'two', 'three', 'three', 'one'], 'B': range(6)})
print(df)

       A  B
0    one  0
1    one  1
2    two  2
3  three  3
4  three  4
5    one  5

When printing after grouping by ‘A’ I have the following:

print(df.groupby('A'))

<pandas.core.groupby.DataFrameGroupBy object at 0x05416E90>

How can I print the dataframe grouped?

If I do:

print(df.groupby('A').head())

I obtain the dataframe as if it was not grouped:

             A  B
A                
one   0    one  0
      1    one  1
two   2    two  2
three 3  three  3
      4  three  4
one   5    one  5

I was expecting something like:

             A  B
A                
one   0    one  0
      1    one  1
      5    one  5
two   2    two  2
three 3  three  3
      4  three  4

Question 2

Simply do:

grouped_df = df.groupby('A')

for key, item in grouped_df:
    print(grouped_df.get_group(key), "\n\n")

This also works,

grouped_df = df.groupby('A')    
gb = grouped_df.groups

for key, values in gb.iteritems():
    print(df.ix[values], "\n\n")

For selective key grouping: Insert the keys you want inside the key_list_from_gb, in following, using gb.keys(): For Example,

gb = grouped_df.groups
gb.keys()

key_list_from_gb = [key1, key2, key3]

for key, values in gb.items():
    if key in key_list_from_gb:
        print(df.ix[values], "\n")

Question 3

If you’re simply looking for a way to display it, you could use describe():

grp = df.groupby['colName']
grp.describe()

This gives you a neat table.

Question 4

I confirmed that the behavior of head() changes between version 0.12 and 0.13. That looks like a bug to me. I created an issue.

But a groupby operation doesn’t actually return a DataFrame sorted by group. The .head() method is a little misleading here — it’s just a convenience feature to let you re-examine the object (in this case, df) that you grouped. The result of groupby is separate kind of object, a GroupBy object. You must apply, transform, or filter to get back to a DataFrame or Series.

If all you wanted to do was sort by the values in columns A, you should use df.sort('A').

Question 5

Another simple alternative:

for name_of_the_group, group in grouped_dataframe:
   print (name_of_the_group)
   print (group)

Question 6

Also, other simple alternative could be:

gb = df.groupby("A")
gb.count() # or,
gb.get_group(your_key)

Question 7

In addition to previous answers:

Taking your example,

df = pd.DataFrame({'A': ['one', 'one', 'two', 'three', 'three', 'one'], 'B': range(6)})

Then simple 1 line code

df.groupby('A').apply(print)

Question 8

Thanks to Surya for good insights. I’d clean up his solution and simply do:

for key, value in df.groupby('A'):
    print(key, value)

Question 9

you cannot see the groupBy data directly by print statement but you can see by iterating over the group using for loop try this code to see the group by data

group = df.groupby('A') #group variable contains groupby data
for A,A_df in group: # A is your column and A_df is group of one kind at a time
  print(A)
  print(A_df)

you will get an output after trying this as a groupby result

I hope it helps

Question 10

Call list() on the GroupBy object

print(list(df.groupby('A')))

gives you:

[('one',      A  B
0  one  0
1  one  1
5  one  5), ('three',        A  B
3  three  3
4  three  4), ('two',      A  B
2  two  2)]

Question 11

In Jupyter Notebook, if you do the following, it prints a nice grouped version of the object. The apply method helps in creation of a multiindex dataframe.

by = 'A'  # groupby 'by' argument
df.groupby(by).apply(lambda a: a[:])

Output:

             A  B
A                
one   0    one  0
      1    one  1
      5    one  5
three 3  three  3
      4  three  4
two   2    two  2

If you want the by column(s) to not appear in the output, just drop the column(s), like so.

df.groupby(by).apply(lambda a: a.drop(by, axis=1)[:])

Output:

Here, I am not sure as to why .iloc[:] does not work instead of [:] at the end. So, if there are some issues in future due to updates (or at present), .iloc[:len(a)] also works.

Question 12

I found a tricky way, just for brainstorm, see the code:

df['a'] = df['A']  # create a shadow column for MultiIndexing
df.sort_values('A', inplace=True)
df.set_index(["A","a"], inplace=True)
print(df)

the output:

             B
A     a
one   one    0
      one    1
      one    5
three three  3
      three  4
two   two    2

The pros is so easy to print, as it returns a dataframe, instead of Groupby Object. And the output looks nice. While the con is that it create a series of redundant data.

Question 13

In python 3

k = None
for name_of_the_group, group in dict(df_group):
    if(k != name_of_the_group):
        print ('\n', name_of_the_group)
        print('..........','\n')
    print (group)
    k = name_of_the_group

In more interactive way

Question 14

to print all (or arbitrarily many) lines of the grouped df:

import pandas as pd
pd.set_option('display.max_rows', 500)

grouped_df = df.group(['var1', 'var2'])
print(grouped_df)

如何打印分组对象

问题：如何打印分组对象

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

回答 10

回答 11

回答 12

排行榜展示

Python 情人节超强技能导出微信聊天记录生成词云

你不得不知道的python超级文献批量搜索下载工具

7行代码 Python热力图可视化分析缺失数据处理

Python 流程图 — 一键转化代码为流程图

Python 优化—算出每条语句执行时间

你的10W块放哪里能赚最多钱？

文章展示

如何在Django中设置时区？

提取每个子列表的第一项

应用程序未拾取.css文件（烧瓶/ python）

熊猫：设置编号。最大行数

在进程运行时不断打印子进程输出

Pydantic-使用Python类型提示进行数据解析和验证

如何打印分组对象

问题：如何打印分组对象

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

回答 10

回答 11

回答 12

相关文章

排行榜展示

文章展示