Question: Pretty-print an entire Pandas Series / DataFrame

I work with Series and DataFrames on the terminal a lot. The default __repr__ for a Series returns a reduced sample, with some head and tail values, but the rest missing.

Is there a built-in way to pretty-print the entire Series / DataFrame? Ideally, it would support proper alignment, perhaps borders between columns, and maybe even color-coding for the different columns.


Answer 0

You can also use option_context with one or more options:

with pd.option_context('display.max_rows', None, 'display.max_columns', None):  # more options can be specified also
    print(df)

This will automatically return the options to their previous values.
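A minimal sketch of that restore behavior, assuming pandas' default display settings are in effect; the sample DataFrame and the variable names are made up for illustration:

```python
import pandas as pd

# Illustrative data: 100 rows is more than the default display.max_rows.
df = pd.DataFrame({"a": range(100), "b": range(100)})

before = pd.get_option("display.max_rows")

with pd.option_context("display.max_rows", None):
    inside = pd.get_option("display.max_rows")  # None while the block runs
    full_text = repr(df)                        # rendered without row truncation

after = pd.get_option("display.max_rows")       # same value as before the block
truncated_text = repr(df)                       # truncated again outside the block

print(before, inside, after)
print(len(truncated_text.splitlines()), "lines vs", len(full_text.splitlines()))
```

Because the option is only changed inside the with block, any print(df) elsewhere in the session keeps its usual truncated output.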

If you are working in a Jupyter notebook, using display(df) instead of print(df) will use Jupyter's rich display logic.


Answer 1

No need to hack settings. There is a simple way:

print(df.to_string())
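As a rough illustration of the difference, the sketch below compares the default repr with to_string() on a made-up 200-row Series, assuming default display options:

```python
import pandas as pd

# Illustrative data only.
s = pd.Series(range(200), name="value")

short_text = repr(s)        # default repr: head and tail values, rest elided
full_text = s.to_string()   # every row, independent of display.max_rows

print(len(short_text.splitlines()), "lines in repr")
print(len(full_text.splitlines()), "lines in to_string()")
```

The same call works on a DataFrame, and no display option is modified along the way.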

Answer 2

Sure, if this comes up a lot, make a function like this one. You can even configure it to load every time you start IPython: https://ipython.org/ipython-doc/1/config/overview.html

def print_full(x):
    pd.set_option('display.max_rows', len(x))
    print(x)
    pd.reset_option('display.max_rows')
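One caveat with the function above: reset_option puts display.max_rows back to the pandas default, not to whatever value you had set earlier, and the reset is skipped if print raises. A possible variant (a sketch, not the answer's original code) wraps the call in option_context instead:

```python
import pandas as pd

def print_full(x):
    """Print every row of a Series or DataFrame without truncation."""
    # option_context restores the previous value on exit, even on error.
    with pd.option_context("display.max_rows", None):
        print(x)

print_full(pd.Series(range(5)))
```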

As for coloring, getting too elaborate with colors sounds counterproductive to me, but I agree something like bootstrap’s .table-striped would be nice. You could always create an issue to suggest this feature.


Answer 3

After importing pandas, as an alternative to using the context manager, set such options for displaying entire dataframes:

pd.set_option('display.max_columns', None)  # or 1000
pd.set_option('display.max_rows', None)  # or 1000
pd.set_option('display.max_colwidth', None)  # or 199; pandas versions before 1.0 used -1

For a full list of useful options, see:

pd.describe_option('display')

Answer 4

Use the tabulate package:

pip install tabulate

And consider the following example usage:

import pandas as pd
from io import StringIO
from tabulate import tabulate

c = """Chromosome Start End
chr1 3 6
chr1 5 7
chr1 8 9"""

df = pd.read_table(StringIO(c), sep=r"\s+", header=0)  # raw string avoids an invalid-escape warning

print(tabulate(df, headers='keys', tablefmt='psql'))

+----+--------------+---------+-------+
|    | Chromosome   |   Start |   End |
|----+--------------+---------+-------|
|  0 | chr1         |       3 |     6 |
|  1 | chr1         |       5 |     7 |
|  2 | chr1         |       8 |     9 |
+----+--------------+---------+-------+

Answer 5

If you are using an IPython Notebook (Jupyter), you can render the DataFrame as HTML:

from IPython.display import HTML, display
display(HTML(df.to_html()))

Answer 6

Using pd.options.display

This answer is a variation of the prior answer by lucidyan. It makes the code more readable by avoiding the use of set_option.

After importing pandas, as an alternative to using the context manager, set such options for displaying large dataframes:

def set_pandas_display_options() -> None:
    # Ref: https://stackoverflow.com/a/52432757/
    display = pd.options.display

    display.max_columns = 1000
    display.max_rows = 1000
    display.max_colwidth = 199
    display.width = None
    # display.precision = 2  # set as needed

set_pandas_display_options()

After this, you can use either display(df) or just df if using a notebook, otherwise print(df).

Using to_string

Pandas 0.25.3 does have DataFrame.to_string and Series.to_string methods which accept formatting options.
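A small sketch of a few of those formatting options; the DataFrame contents here are invented for illustration, and index, float_format, and justify are existing to_string parameters:

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({"name": ["a", "b", "c"], "score": [0.5, 12.3456, 7.0]})

text = df.to_string(
    index=False,                         # drop the row labels
    float_format=lambda v: f"{v:.2f}",   # format floats with two decimals
    justify="left",                      # left-align the column headers
)
print(text)
```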

Using to_markdown

If what you need is markdown output, Pandas 1.0.0 has DataFrame.to_markdown and Series.to_markdown methods.

Using to_html

If what you need is HTML output, Pandas 0.25.3 does have a DataFrame.to_html method but not a Series.to_html. Note that a Series can be converted to a DataFrame.
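For a Series, the conversion can be sketched like this, using to_frame() to get a one-column DataFrame first (the sample values and the name "Start" are just for illustration):

```python
import pandas as pd

# Illustrative data only.
s = pd.Series([3, 5, 8], name="Start")

# Go through a one-column DataFrame, which does provide to_html().
html = s.to_frame().to_html()
print(html)
```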


Answer 7

Try this

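# 'display.height' was removed from later pandas releases and now raises an error there;
# display.max_rows controls how many rows are shown.
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)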

Answer 8

You can achieve this with the method below: pass the total number of columns in the DataFrame as the value for 'display.max_columns'.

For example:

df= DataFrame(..)
with pd.option_context('display.max_rows', None, 'display.max_columns', df.shape[1]):
    print(df)

Answer 9

Try using the display() function. It automatically adds horizontal and vertical scroll bars, so you can display different datasets easily instead of using print().

display(dataframe)

display() also supports proper alignment.

However, if you want to make the output prettier, look at pd.option_context(). It accepts many display options for showing the dataframe clearly.

Note – I am using Jupyter Notebooks.

