Question: Constructing a pandas DataFrame from values in variables gives "ValueError: If using all scalar values, you must pass an index"

This may be a simple question, but I cannot figure out how to do this. Let's say that I have two variables as follows.

a = 2
b = 3

I want to construct a DataFrame from this:

df2 = pd.DataFrame({'A':a,'B':b})

This generates an error:

ValueError: If using all scalar values, you must pass an index

I tried this also:

df2 = (pd.DataFrame({'a':a,'b':b})).reset_index()

This gives the same error message.
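
A minimal sketch of why the second attempt fails in the same way: the exception is raised inside the `pd.DataFrame(...)` call, so `.reset_index()` never runs. The variable name `message` is only for illustration.

```python
import pandas as pd

a = 2
b = 3

try:
    # The constructor raises here, before .reset_index() is ever called.
    df2 = pd.DataFrame({'A': a, 'B': b}).reset_index()
except ValueError as exc:
    message = str(exc)
    print(message)
```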


Answer 0

The error message says that if you're passing scalar values, you have to pass an index. So you can either avoid scalar values for the columns, e.g. use a list:

>>> df = pd.DataFrame({'A': [a], 'B': [b]})
>>> df
   A  B
0  2  3

or use scalar values and pass an index:

>>> df = pd.DataFrame({'A': a, 'B': b}, index=[0])
>>> df
   A  B
0  2  3
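
As a quick check, the two approaches can be compared side by side. This is only a sketch; the names `from_lists` and `from_scalars` are made up for the comparison.

```python
import pandas as pd

a, b = 2, 3

# Option 1: wrap each scalar in a one-element list.
from_lists = pd.DataFrame({'A': [a], 'B': [b]})

# Option 2: keep the scalars and say which row label(s) they belong to.
from_scalars = pd.DataFrame({'A': a, 'B': b}, index=[0])

# Both frames hold a single row with the same values and column names.
print(from_lists.values.tolist())
print(from_scalars.values.tolist())
```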

Answer 1

You can also use pd.DataFrame.from_records, which is more convenient when you already have the dictionary in hand:

df = pd.DataFrame.from_records([{ 'A':a,'B':b }])

You can also set the index, if you want:

df = pd.DataFrame.from_records([{ 'A':a,'B':b }], index='A')
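
To make the effect of `index='A'` concrete, here is a small sketch: the field named by `index` becomes the row index and no longer appears among the columns.

```python
import pandas as pd

a, b = 2, 3

df = pd.DataFrame.from_records([{'A': a, 'B': b}], index='A')

print(df.index.name)     # the index is named after the chosen field
print(list(df.columns))  # only the remaining field is left as a column
```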

Answer 2

You need to create a pandas Series first. The second step is to convert the Series to a DataFrame.

import pandas as pd
data = {'a': 1, 'b': 2}
pd.Series(data).to_frame()

You can even provide a column name.

pd.Series(data).to_frame('ColumnName')
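
Note that this route puts the dictionary keys in the row index rather than in the columns. A short sketch of the difference, where `tall` and `wide` are illustrative names and the transpose is an optional extra step not mentioned in the answer:

```python
import pandas as pd

data = {'a': 1, 'b': 2}

# Keys become row labels; there is a single column.
tall = pd.Series(data).to_frame('ColumnName')

# Transposing gives one row with the keys as column names.
wide = tall.T

print(tall.shape, wide.shape)
```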

Answer 3

You may try wrapping your dictionary in a list:

my_dict = {'A':1,'B':2}

pd.DataFrame([my_dict])

   A  B
0  1  2
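
The list-of-dicts form also extends naturally to several rows. In this sketch the second record is deliberately missing a key, to show that pandas fills the gap with NaN; the `records` list is made up for illustration.

```python
import pandas as pd

# Each dict is one row; the second one has no 'B' entry.
records = [{'A': 1, 'B': 2}, {'A': 3}]

df = pd.DataFrame(records)
print(df)
```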

Answer 4

Maybe Series would provide all the functions you need:

pd.Series({'A':a,'B':b})

A DataFrame can be thought of as a collection of Series, hence you can:

  • Concatenate multiple Series into one data frame

  • Add a Series variable to an existing data frame
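
Both operations can be sketched roughly as follows. The Series names and values are invented for the example, and `pd.concat` with `axis=1` is one common way to do the concatenation, not necessarily the one the answer had in mind.

```python
import pandas as pd

s1 = pd.Series({'A': 2, 'B': 3}, name='first')
s2 = pd.Series({'A': 4, 'B': 5}, name='second')

# Concatenate Series side by side; each Series becomes a column.
combined = pd.concat([s1, s2], axis=1)

# Add another Series to the existing frame; rows are aligned on the index.
combined['third'] = pd.Series({'A': 6, 'B': 7})

print(combined)
```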


Answer 5

You need to provide iterables as the values for the Pandas DataFrame columns:

df2 = pd.DataFrame({'A':[a],'B':[b]})

Answer 6

I had the same problem with numpy arrays and the solution is to flatten them:

data = {
    'b': array1.flatten(),
    'a': array2.flatten(),
}

df = pd.DataFrame(data)
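
The answer does not show what `array1` and `array2` looked like. The sketch below assumes they were two-dimensional column vectors, which is one case where the DataFrame constructor rejects the input until each array is made one-dimensional.

```python
import numpy as np
import pandas as pd

# Assumed shapes: (3, 1) column vectors rather than flat arrays.
array1 = np.array([[1], [2], [3]])
array2 = np.array([[4], [5], [6]])

data = {
    'b': array1.flatten(),  # shape (3,) after flattening
    'a': array2.flatten(),
}

df = pd.DataFrame(data)
print(df.shape)
```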

Answer 7

If you intend to convert a dictionary of scalars, you have to include an index:

import pandas as pd

alphabets = {'A': 'a', 'B': 'b'}
index = [0]
alphabets_df = pd.DataFrame(alphabets, index=index)
print(alphabets_df)

Although index is not required for a dictionary of lists, the same idea can be expanded to a dictionary of lists:

planets = {'planet': ['earth', 'mars', 'jupiter'], 'length_of_day': ['1', '1.03', '0.414']}
index = [0, 1, 2]
planets_df = pd.DataFrame(planets, index=index)
print(planets_df)

Of course, for the dictionary of lists, you can build the dataframe without an index:

planets_df = pd.DataFrame(planets)
print(planets_df)
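
The index does not have to be integers. As a small variation on the example above, this sketch uses string labels (`'p1'`, `'p2'`, `'p3'` are made up) and looks up a row by label.

```python
import pandas as pd

planets = {'planet': ['earth', 'mars', 'jupiter'],
           'length_of_day': ['1', '1.03', '0.414']}

# One label per row; the index length must match the list length.
planets_df = pd.DataFrame(planets, index=['p1', 'p2', 'p3'])

print(planets_df.loc['p2', 'planet'])
```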

Answer 8

You could try:

df2 = pd.DataFrame.from_dict({'a':a,'b':b}, orient = 'index')

From the documentation on the ‘orient’ argument: If the keys of the passed dict should be the columns of the resulting DataFrame, pass ‘columns’ (default). Otherwise if the keys should be rows, pass ‘index’.
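
With `orient='index'` the keys end up as row labels, so the result has one row per key rather than one column per key. A brief sketch; the column name `'value'` is an arbitrary choice passed through the `columns` parameter.

```python
import pandas as pd

a, b = 2, 3

df2 = pd.DataFrame.from_dict({'a': a, 'b': b}, orient='index', columns=['value'])

print(df2)
```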


Answer 9

Pandas magic at work. All logic is out.

The error message "ValueError: If using all scalar values, you must pass an index" says you must pass an index.

This does not necessarily mean that passing an index makes pandas do what you want it to do.

When you pass an index, pandas will treat your dictionary keys as column names and the values as what the column should contain for each of the values in the index.

a = 2
b = 3
df2 = pd.DataFrame({'A':a,'B':b}, index=[1])

    A   B
1   2   3

Passing a larger index:

df2 = pd.DataFrame({'A':a,'B':b}, index=[1, 2, 3, 4])

    A   B
1   2   3
2   2   3
3   2   3
4   2   3

An index is usually generated automatically when none is given. Here, however, pandas does not know how many rows of 2 and 3 you want. You can be more explicit about it:

df2 = pd.DataFrame({'A':[a]*4,'B':[b]*4})
df2

    A   B
0   2   3
1   2   3
2   2   3
3   2   3

Note that the default index is 0-based.

I would recommend always passing a dictionary of lists to the dataframe constructor when creating dataframes. It's easier to read for other developers. Pandas has a lot of caveats; don't make other developers have to be experts in all of them in order to read your code.
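
One related case worth seeing: the error only appears when every value is a scalar. If at least one column is a list, pandas can infer the number of rows and repeats the scalar to match. A minimal sketch with made-up values:

```python
import pandas as pd

a = 2

# 'B' fixes the length at three rows, so the scalar in 'A' is repeated.
df = pd.DataFrame({'A': a, 'B': [10, 20, 30]})

print(df)
```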


Answer 10

The input does not have to be a list of records; it can be a single dictionary as well:

pd.DataFrame.from_records({'a':1,'b':2}, index=[0])
   a  b
0  1  2

Which seems to be equivalent to:

pd.DataFrame({'a':1,'b':2}, index=[0])
   a  b
0  1  2

Answer 11

This is because a DataFrame has two intuitive dimensions – the columns and the rows.

You are only specifying the columns using the dictionary keys.

If you only want to specify one dimensional data, use a Series!


Answer 12

Convert the dictionary to a data frame:

col_dict_df = pd.Series(col_dict).to_frame('new_col').reset_index()

Give the columns new names:

col_dict_df.columns = ['col1', 'col2']
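
Put together with a concrete dictionary, the two steps look like this. The contents of `col_dict` are hypothetical, since the answer does not define it.

```python
import pandas as pd

col_dict = {'x': 1, 'y': 2}  # hypothetical input, not from the answer

# Keys become the index, values go into 'new_col'; reset_index() then
# moves the keys into an ordinary column named 'index'.
col_dict_df = pd.Series(col_dict).to_frame('new_col').reset_index()

# Rename both columns in one assignment.
col_dict_df.columns = ['col1', 'col2']

print(col_dict_df)
```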

Answer 13

If you have a dictionary you can turn it into a pandas data frame with the following line of code:

pd.DataFrame({"key": list(d.keys()), "value": list(d.values())})
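
An equivalent way to get the same key/value layout is to build the frame from the dictionary's items. This is a sketch with a made-up dictionary `d`; the column names are passed explicitly.

```python
import pandas as pd

d = {'x': 1, 'y': 2}  # illustrative data

# Each (key, value) pair becomes one row.
df = pd.DataFrame(list(d.items()), columns=['key', 'value'])

print(df)
```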

Answer 14

Just wrap the dict in a list:

a = 2
b = 3
df2 = pd.DataFrame([{'A':a,'B':b}])
