Convert a pandas DataFrame to a Series

Question 2

I’m fairly new to pandas. I have a pandas DataFrame that is 1 row by 23 columns.

I want to convert it into a Series. What is the most Pythonic way to do this?

I’ve tried pd.Series(myResults), but it complains: ValueError: cannot copy sequence with size 23 to array axis with dimension 1. It’s not smart enough to realize it’s still a “vector” in math terms.

Thanks!

Question 4

It’s not smart enough to realize it’s still a “vector” in math terms.

Say rather that it’s smart enough to recognize a difference in dimensionality. :-)

I think the simplest thing you can do is select that row positionally using iloc, which gives you a Series with the columns as the new index and the values as the values:

>>> import pandas as pd
>>> df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])
>>> df
   a0  a1  a2  a3  a4
0   0   1   2   3   4
>>> df.iloc[0]
a0    0
a1    1
a2    2
a3    3
a4    4
Name: 0, dtype: int64
>>> type(_)
<class 'pandas.core.series.Series'>
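
For the shape described in the question (1 row by 23 columns), the same selection might look like the sketch below. The frame contents and the column names c0 to c22 are invented for illustration; myResults only stands in for the asker's frame.

```python
import pandas as pd

# Hypothetical stand-in for the asker's 1 x 23 frame; column names are invented.
myResults = pd.DataFrame(
    [list(range(23))],
    columns=["c{}".format(i) for i in range(23)],
)

# Select the first (and only) row by position; the result is a Series.
row = myResults.iloc[0]

print(type(row).__name__)       # Series
print(row.shape)                # (23,)
print(row.index[:3].tolist())   # ['c0', 'c1', 'c2']
print(row.name)                 # 0, the label of the selected row
```

The Series keeps the row label as its name, which can be handy if you later need to put it back into a frame.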

Question 6

You can transpose the single-row dataframe (which still results in a dataframe) and then squeeze the results into a series (the inverse of to_frame).

>>> import pandas as pd
>>> df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])
>>> df.T.squeeze()  # Or more simply, df.squeeze() for a single row dataframe.
a0    0
a1    1
a2    2
a3    3
a4    4
Name: 0, dtype: int64

Note: To accommodate the point raised by @IanS (even though it is not in the OP’s question), test for the dataframe’s size. I am assuming that df is a dataframe, but the edge cases are an empty dataframe, a dataframe of shape (1, 1), and a dataframe with more than one row, in which case the user should implement their desired functionality. (For shape (1, 1), squeeze() returns a scalar rather than a Series, hence the separate branch.)

if df.empty:
    # Empty dataframe, so convert to empty Series.
    result = pd.Series()
elif df.shape == (1, 1):
    # DataFrame with one value, so convert to series with appropriate index.
    result = pd.Series(df.iat[0, 0], index=df.columns)
elif len(df) == 1:
    # Convert to series per OP's question.
    result = df.T.squeeze()
else:
    # Dataframe with multiple rows.  Implement desired behavior.
    pass

This can also be simplified along the lines of the answer provided by @themachinist.

if len(df) > 1:
    # Dataframe with multiple rows.  Implement desired behavior.
    pass
else:
    result = pd.Series() if df.empty else df.iloc[0, :]
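
The simplified branch logic can be wrapped in a small function so the edge cases are easy to try. This is only a sketch: frame_to_series is a hypothetical name, and raising ValueError for multi-row frames is an assumption standing in for "implement desired behavior".

```python
import pandas as pd

def frame_to_series(df):
    """Hypothetical helper: turn a frame with at most one row into a Series."""
    if len(df) > 1:
        # Assumption: reject multi-row frames; substitute your own behavior here.
        raise ValueError("expected at most one row, got {}".format(len(df)))
    if df.empty:
        # dtype is given explicitly because some pandas versions warn without it.
        return pd.Series(dtype=object)
    # Selecting a row with iloc yields a Series even for a (1, 1) frame.
    return df.iloc[0, :]

one_row = pd.DataFrame([[1, 2, 3]], columns=["a", "b", "c"])
one_cell = pd.DataFrame([[7]], columns=["a"])
no_rows = pd.DataFrame(columns=["a", "b"])

print(frame_to_series(one_row).tolist())         # [1, 2, 3]
print(frame_to_series(one_cell).index.tolist())  # ['a']
print(len(frame_to_series(no_rows)))             # 0
```

Because iloc always returns a Series for a single row, the (1, 1) case needs no separate branch here.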

Question 8

You can retrieve the Series by slicing your DataFrame with either iloc or loc:

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.iloc.html
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.loc.html

import pandas as pd
import numpy as np

# One row, eight columns of random values
df = pd.DataFrame(data=np.random.randn(1, 8))

# Select the first row by position; the result is a Series
series1 = df.iloc[0, :]
print(type(series1))  # <class 'pandas.core.series.Series'>
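
To make the difference between the two accessors concrete, here is a small sketch that selects the same row once by position and once by label. The row label "r0" is invented for illustration.

```python
import numpy as np
import pandas as pd

# One row, eight columns; the row label "r0" is made up for this example.
df = pd.DataFrame(np.random.randn(1, 8), index=["r0"])

by_position = df.iloc[0, :]   # the row at integer position 0
by_label = df.loc["r0", :]    # the row whose index label is "r0"

print(type(by_position).__name__, type(by_label).__name__)  # Series Series
print(by_position.equals(by_label))                         # True
```

iloc is usually the safer choice when you only know that the frame has one row and do not know its label.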

Question 10

Another way:

Suppose myResult is the DataFrame that contains your data as 1 column and 23 rows. (The question has 1 row and 23 columns, so in that case transpose first with myResult.T.)

# Label the column by passing a list of names
myResult.columns = ['firstCol']

# Selecting a single column by name returns a Series
myResult = myResult['firstCol']

print(type(myResult))

In the same way, you can get a Series from a DataFrame with multiple columns by selecting one column by name.
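
A runnable version of this approach might look as follows. The data are invented, and the second half shows one possible way to handle the 1 row by 23 columns layout from the question by transposing first.

```python
import pandas as pd

# Hypothetical 23 x 1 frame standing in for myResult.
myResult = pd.DataFrame(list(range(23)))
myResult.columns = ["firstCol"]

col = myResult["firstCol"]     # selecting one column by name gives a Series
print(type(col).__name__)      # Series
print(col.shape)               # (23,)

# For a 1 x 23 frame, transpose first; the single column is then named
# after the original row label (0 here).
wide = pd.DataFrame([list(range(23))])
col_from_wide = wide.T[0]
print(col_from_wide.shape)     # (23,)
```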

Question 12

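You can also use stack():

import pandas as pd

df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])

After creating df, run:

df.stack()

This returns the data as a Series. Note that its index is a MultiIndex of (row label, column name) pairs rather than the column names alone.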
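
If you want the column names alone as the index after stacking, one option is to drop the outer index level. A minimal sketch, assuming a pandas version that provides Series.droplevel:

```python
import pandas as pd

df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])

stacked = df.stack()
print(stacked.index.nlevels)   # 2: (row label, column name)

# Drop the row-label level so only the column names remain as the index.
flat = stacked.droplevel(0)
print(flat.index.tolist())     # ['a0', 'a1', 'a2', 'a3', 'a4']
print(flat.tolist())           # [0, 1, 2, 3, 4]
```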

Question 14

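import pandas as pd

data = pd.DataFrame({"a": [1, 2, 3, 34], "b": [5, 6, 7, 8]})
new_data = pd.melt(data)                       # long format: columns "variable" and "value"
new_data.set_index("variable", inplace=True)   # original column names become the index

This gives a DataFrame whose index holds the original column names and whose data all sits in the “value” column. Select that column (new_data["value"]) to get a Series.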
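
Since the result of melt is still a DataFrame, one more step is needed to end up with an actual Series. A short sketch of that step, using the same data:

```python
import pandas as pd

data = pd.DataFrame({"a": [1, 2, 3, 34], "b": [5, 6, 7, 8]})
new_data = pd.melt(data).set_index("variable")

# Select the single remaining column to obtain a Series.
values = new_data["value"]

print(type(values).__name__)   # Series
print(values.index.tolist())   # ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b']
print(values.tolist())         # [1, 2, 3, 34, 5, 6, 7, 8]
```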
