从pandas数据框转换为html时,如何在html中显示完整(非截断)的数据框信息?

问题:从pandas数据框转换为html时,如何在html中显示完整(非截断)的数据框信息?

我使用该DataFrame.to_html函数将pandas数据框转换为html输出。当我将其保存到单独的html文件中时,该文件显示截断的输出。

例如,在我的“文字”列中,

df.head(1) 将会呈现

这部电影是很棒的努力。

代替

这部电影是在解构这一时期盛行的复杂社会情绪方面做出的出色努力。

在大屏幕熊猫数据框的屏幕友好格式的情况下,这种表示形式很好,但是我需要一个HTML文件,该文件将显示数据框中包含的完整表格数据,即,将显示后一个文本元素而不是前一段文字。

如何在html版本的信息的TEXT列中显示每个元素的完整,不截断的文本数据?我可以想象html表必须显示长单元格才能显示完整的数据,但是据我所知,只能将列宽参数传递给该DataFrame.to_html函数。

I converted a pandas dataframe to an html output using the DataFrame.to_html function. When I save this to a separate html file, the file shows truncated output.

For example, in my TEXT column,

df.head(1) will show

The film was an excellent effort…

instead of

The film was an excellent effort in deconstructing the complex social sentiments that prevailed during this period.

This rendition is fine in the case of a screen-friendly format of a massive pandas dataframe, but I need an html file that will show complete tabular data contained in the dataframe, that is, something that will show the latter text element rather than the former text snippet.

How would I be able to show the complete, non-truncated text data for each element in my TEXT column in the html version of the information? I would imagine that the html table would have to display long cells to show the complete data, but as far as I understand, only column-width parameters can be passed into the DataFrame.to_html function.


回答 0

display.max_colwidth选项设置为-1

pd.set_option('display.max_colwidth', -1)

set_option docs

例如,在iPython中,我们看到信息被截断为50个字符。多余的部分省略:

如果设置该display.max_colwidth选项,则信息将完整显示:

Set the display.max_colwidth option to -1:

pd.set_option('display.max_colwidth', -1)

set_option docs

For example, in iPython, we see that the information is truncated to 50 characters. Anything in excess is ellipsized:

If you set the display.max_colwidth option, the information will be displayed fully:


回答 1

pd.set_option('display.max_columns', None)  

id (第二个参数)可以完全显示列。

pd.set_option('display.max_columns', None)  

id (second argument) can fully show the columns.


回答 2

虽然pd.set_option('display.max_columns', None)套所示的最大列数,选项pd.set_option('display.max_colwidth', -1)设置每个单个场的最大宽度。

出于我的目的,我编写了一个小的辅助函数,以在不影响其余代码的情况下完全打印大型数据帧,还重新格式化了浮点数并设置了虚拟显示宽度。您可以在用例中采用它。

def print_full(x):
    pd.set_option('display.max_rows', len(x))
    pd.set_option('display.max_columns', None)
    pd.set_option('display.width', 2000)
    pd.set_option('display.float_format', '{:20,.2f}'.format)
    pd.set_option('display.max_colwidth', None)
    print(x)
    pd.reset_option('display.max_rows')
    pd.reset_option('display.max_columns')
    pd.reset_option('display.width')
    pd.reset_option('display.float_format')
    pd.reset_option('display.max_colwidth')

While pd.set_option('display.max_columns', None) sets the number of the maximum columns shown, the option pd.set_option('display.max_colwidth', -1) sets the maximum width of each single field.

For my purposes I wrote a small helper function to fully print huge data frames without affecting the rest of the code, it also reformats float numbers and sets the virtual display width. You may adopt it for your use cases.

def print_full(x):
    pd.set_option('display.max_rows', None)
    pd.set_option('display.max_columns', None)
    pd.set_option('display.width', 2000)
    pd.set_option('display.float_format', '{:20,.2f}'.format)
    pd.set_option('display.max_colwidth', None)
    print(x)
    pd.reset_option('display.max_rows')
    pd.reset_option('display.max_columns')
    pd.reset_option('display.width')
    pd.reset_option('display.float_format')
    pd.reset_option('display.max_colwidth')

回答 3

对于那些希望这样做的人。我在dask中找不到类似的选项,但是如果我只是在同一本笔记本中为熊猫做这件事,那么它也适用于dask。

import pandas as pd
import dask.dataframe as dd
pd.set_option('display.max_colwidth', -1) # This will set the no truncate for pandas as well as for dask. Not sure how it does for dask though. but it works

train_data = dd.read_csv('./data/train.csv')    
train_data.head(5)

For those looking to do this in dask. I could not find a similar option in dask but if I simply do this in same notebook for pandas it works for dask too.

import pandas as pd
import dask.dataframe as dd
pd.set_option('display.max_colwidth', -1) # This will set the no truncate for pandas as well as for dask. Not sure how it does for dask though. but it works

train_data = dd.read_csv('./data/train.csv')    
train_data.head(5)

回答 4

以下代码导致以下错误:

pd.set_option('display.max_colwidth', -1)

FutureWarning:在1.0版中不建议使用传递负整数,在将来的版本中将不支持。而是使用“无”不限制列宽。

而是使用:

pd.set_option('display.max_colwidth', None)

这样就完成了任务并符合1.0 版之后的熊猫版本。

The following code results in the error below:

pd.set_option('display.max_colwidth', -1)

FutureWarning: Passing a negative integer is deprecated in version 1.0 and will not be supported in future version. Instead, use None to not limit the column width.

Instead, use:

pd.set_option('display.max_colwidth', None)

This accomplishes the task and complies with versions of pandas following version 1.0.