Question: Pretty-print an entire Pandas Series / DataFrame
I work with Series and DataFrames on the terminal a lot. The default __repr__ for a Series returns a reduced sample, with some head and tail values, but the rest missing.
Is there a builtin way to pretty-print the entire Series / DataFrame? Ideally, it would support proper alignment, perhaps borders between columns, and maybe even color-coding for the different columns.
Answer 0
You can also use option_context with one or more options:
with pd.option_context('display.max_rows', None, 'display.max_columns', None):  # more options can be specified also
    print(df)
This will automatically return the options to their previous values.
If you are working in a Jupyter notebook, using display(df) instead of print(df) will use Jupyter's rich display logic.
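To see the restore behaviour for yourself, here is a minimal sketch. The 100-row frame and the variable names are made up for illustration; only pd.option_context and pd.get_option come from pandas.

```python
import pandas as pd

# Hypothetical demo frame: 100 rows is enough to be truncated by default.
df = pd.DataFrame({"n": range(100), "square": [i * i for i in range(100)]})

before = pd.get_option("display.max_rows")

with pd.option_context("display.max_rows", None, "display.max_columns", None):
    inside = pd.get_option("display.max_rows")  # None means "no row limit"
    full_text = repr(df)                        # the text print(df) would show

after = pd.get_option("display.max_rows")       # read again outside the block

print(inside, before == after)
```

Inside the block the row limit is lifted, so full_text contains every row; once the block exits, the option reads the same as it did before.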
Answer 1
No need to hack settings. There is a simple way:
print(df.to_string())
Answer 2
Sure, if this comes up a lot, make a function like this one. You can even configure it to load every time you start IPython: https://ipython.org/ipython-doc/1/config/overview.html
def print_full(x):
    pd.set_option('display.max_rows', len(x))
    print(x)
    pd.reset_option('display.max_rows')
As for coloring, getting too elaborate with colors sounds counterproductive to me, but I agree something like bootstrap's .table-striped would be nice. You could always create an issue to suggest this feature.
Answer 3
After importing pandas, as an alternative to using the context manager, set such options for displaying entire dataframes:
pd.set_option('display.max_columns', None) # or 1000
pd.set_option('display.max_rows', None) # or 1000
pd.set_option('display.max_colwidth', None)  # or 199; pandas versions before 1.0 used -1 here
For a full list of useful options, see:
pd.describe_option('display')
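Unlike the context manager, options set this way stay in effect for the rest of the session. A small sketch of undoing one of them with pd.reset_option follows; it assumes a fresh session, and the variable names are only for illustration.

```python
import pandas as pd

# Value in effect at the start (the library default in a fresh session).
default_rows = pd.get_option("display.max_rows")

pd.set_option("display.max_rows", None)   # global: stays until changed back
unlimited = pd.get_option("display.max_rows")

pd.reset_option("display.max_rows")       # back to the library default
restored = pd.get_option("display.max_rows")

print(unlimited, restored)
```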
Answer 4
Use the tabulate package:
pip install tabulate
And consider the following example usage:
import pandas as pd
from io import StringIO
from tabulate import tabulate
c = """Chromosome Start End
chr1 3 6
chr1 5 7
chr1 8 9"""
df = pd.read_table(StringIO(c), sep=r"\s+", header=0)
print(tabulate(df, headers='keys', tablefmt='psql'))
+----+--------------+---------+-------+
|    | Chromosome   |   Start |   End |
|----+--------------+---------+-------|
|  0 | chr1         |       3 |     6 |
|  1 | chr1         |       5 |     7 |
|  2 | chr1         |       8 |     9 |
+----+--------------+---------+-------+
Answer 5
If you are using an IPython Notebook (Jupyter), you can use HTML:
from IPython.display import HTML, display
display(HTML(df.to_html()))
Answer 6
Using pd.options.display
This answer is a variation of the prior answer by lucidyan. It makes the code more readable by avoiding the use of set_option.
After importing pandas, as an alternative to using the context manager, set such options for displaying large dataframes:
def set_pandas_display_options() -> None:
    # Ref: https://stackoverflow.com/a/52432757/
    display = pd.options.display
    display.max_columns = 1000
    display.max_rows = 1000
    display.max_colwidth = 199
    display.width = None
    # display.precision = 2  # set as needed

set_pandas_display_options()
After this, you can use either display(df) or just df if using a notebook, otherwise print(df).
Using to_string
Pandas 0.25.3 does have DataFrame.to_string and Series.to_string methods which accept formatting options.
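As a rough illustration of those formatting options, the sketch below uses index and float_format, two of the parameters to_string accepts. The sample data and variable names are invented for the example.

```python
import pandas as pd

# Made-up sample data for illustration.
df = pd.DataFrame({"name": ["a", "b"], "value": [1.23456, 20.5]})

# Drop the index column and show floats with two decimals.
text = df.to_string(index=False, float_format="{:.2f}".format)
print(text)

# Series.to_string takes float_format as well.
s_text = df["value"].to_string(float_format="{:.1f}".format)
print(s_text)
```

Because to_string returns a plain str, the result can also be written to a file or a log instead of being printed.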
Using to_markdown
If what you need is markdown output, Pandas 1.0.0 has DataFrame.to_markdown and Series.to_markdown methods.
Using to_html
If what you need is HTML output, Pandas 0.25.3 does have a DataFrame.to_html method but not a Series.to_html. Note that a Series can be converted to a DataFrame.
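One way to do that conversion is Series.to_frame, sketched below with a made-up Series; the name "Start" is only an example.

```python
import pandas as pd

# Hypothetical Series; its name becomes the column header.
s = pd.Series([3, 5, 8], name="Start")

# Convert to a one-column DataFrame, then render it as an HTML table.
html = s.to_frame().to_html()
print(html)
```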
Answer 7
Try this:
# pd.set_option('display.height', 1000)  # only in old pandas releases; the option has since been removed
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)
Answer 8
You can achieve this with the method below. Just pass the total number of columns in the DataFrame as the value for 'display.max_columns'.
For example:
df = DataFrame(..)
with pd.option_context('display.max_rows', None, 'display.max_columns', df.shape[1]):
    print(df)
Answer 9
Try using the display() function. It automatically adds horizontal and vertical scroll bars, and it lets you display different datasets easily instead of using print().
display(dataframe)
display() also supports proper alignment.
However, if you want to make the dataset more beautiful, you can check pd.option_context(). It has a lot of options for showing the dataframe clearly.
Note: I am using Jupyter Notebooks.