Tag archive: scalar

python: how to identify if a variable is an array or a scalar

Question: python: how to identify if a variable is an array or a scalar

I have a function that takes the argument NBins. I want to call this function with either a scalar 50 or an array [0, 10, 20, 30]. How can I identify, within the function, what the length of NBins is? Or, said differently, whether it is a scalar or a vector?

I tried this:

>>> N=[2,3,5]
>>> P = 5
>>> len(N)
3
>>> len(P)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'int' has no len()
>>> 

As you can see, I can’t apply len to P, since it’s not an array. Is there something like isarray or isscalar in Python?

thanks


Answer 0

>>> isinstance([0, 10, 20, 30], list)
True
>>> isinstance(50, list)
False

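To support any type of sequence, check collections.abc.Sequence instead of list (in Python 2 it was collections.Sequence; that alias was removed in Python 3.10).

Note: isinstance also accepts a tuple of classes, so checks such as type(x) in (..., ...) are unnecessary and should be avoided.

You may also want to exclude strings with not isinstance(x, (str, unicode)); in Python 3, where unicode no longer exists, use not isinstance(x, str).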
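
A minimal sketch of the check described above, written for Python 3. The helper name is_sequence is hypothetical and not part of the answer. Note that NumPy arrays are not registered as Sequence, so this check returns False for them.

```python
from collections.abc import Sequence


def is_sequence(x):
    """Return True for list-like sequences, treating text and bytes as scalars."""
    # str, bytes and bytearray are Sequences too, so exclude them explicitly.
    return isinstance(x, Sequence) and not isinstance(x, (str, bytes, bytearray))


print(is_sequence([0, 10, 20, 30]))  # True
print(is_sequence((0, 10)))          # True
print(is_sequence(50))               # False
print(is_sequence("50"))             # False
```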


Answer 1

Previous answers assume that the array is a standard Python list. As someone who uses numpy often, I’d recommend a very Pythonic test:

if hasattr(N, "__len__")
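
One caveat worth keeping in mind: strings and dicts define __len__ as well, so this test treats them as "arrays". A small sketch, with has_len as a hypothetical helper name:

```python
def has_len(x):
    """Duck-typed check: does the object define __len__?"""
    return hasattr(x, "__len__")


print(has_len([2, 3, 5]))  # True
print(has_len(5))          # False
print(has_len("abc"))      # True: strings are sized as well
print(has_len({"a": 1}))   # True: so are dicts
```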

Answer 2

Combining @jamylak and @jpaddison3’s answers together, if you need to be robust against numpy arrays as the input and handle them in the same way as lists, you should use

import numpy as np
isinstance(P, (list, tuple, np.ndarray))

This is robust against subclasses of list, tuple and numpy arrays.

And if you want to be robust against all other subclasses of sequence as well (not just list and tuple), use

import collections.abc
import numpy as np
isinstance(P, (collections.abc.Sequence, np.ndarray))

Why should you do things this way with isinstance and not compare type(P) with a target value? Here is an example, where we make and study the behaviour of NewList, a trivial subclass of list.

>>> class NewList(list):
...     isThisAList = '???'
... 
>>> x = NewList([0,1])
>>> y = list([0,1])
>>> print(x)
[0, 1]
>>> print(y)
[0, 1]
>>> x==y
True
>>> type(x)
<class '__main__.NewList'>
>>> type(x) is list
False
>>> type(y) is list
True
>>> type(x).__name__
'NewList'
>>> isinstance(x, list)
True

Despite x and y comparing as equal, handling them by type would result in different behaviour. However, since x is an instance of a subclass of list, using isinstance(x,list) gives the desired behaviour and treats x and y in the same manner.
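
To tie this back to the original question, here is a sketch of how a function might branch on its argument with isinstance. The function describe_bins and its return values are hypothetical, chosen only for illustration; NumPy arrays are left out so that the example needs no imports.

```python
def describe_bins(nbins):
    """Hypothetical helper: report how an NBins-style argument would be read."""
    if isinstance(nbins, (list, tuple)):
        # A list or tuple (or any subclass of them) is treated as bin edges.
        return "edges", len(nbins)
    # Anything else is treated as a single bin count.
    return "count", 1


class NewList(list):
    """Trivial list subclass, as in the answer above."""


print(describe_bins(50))                 # ('count', 1)
print(describe_bins([0, 10, 20, 30]))    # ('edges', 4)
print(describe_bins(NewList([0, 1])))    # ('edges', 2)
```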


Answer 3

Is there an equivalent to isscalar() in numpy? Yes.

>>> import numpy as np
>>> np.isscalar(3.1)
True
>>> np.isscalar([3.1])
False
>>> np.isscalar(False)
True
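
Two edge cases are worth knowing, shown in the sketch below: np.isscalar treats strings as scalars, and it returns False for zero-dimensional arrays. If those cases matter, checking np.ndim(x) == 0 may be a better fit.

```python
import numpy as np

print(np.isscalar(3.1))            # True
print(np.isscalar("abc"))          # True: strings count as scalars here
print(np.isscalar(np.array(3.1)))  # False: a 0-d array is not a scalar to isscalar

# np.ndim reports the number of dimensions and treats 0-d arrays like scalars.
print(np.ndim(3.1))                # 0
print(np.ndim(np.array(3.1)))      # 0
print(np.ndim([3.1]))              # 1
```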

Answer 4

While @jamylak’s approach is the better one, here is an alternative approach:

>>> N=[2,3,5]
>>> P = 5
>>> type(P) in (tuple, list)
False
>>> type(N) in (tuple, list)
True

Answer 5

Another alternative approach (use of class name property):

N = [2,3,5]
P = 5

type(N).__name__ == 'list'
True

type(P).__name__ == 'int'
True

type(N).__name__ in ('list', 'tuple')
True

No need to import anything.


Answer 6

Here is the best approach I have found: Check existence of __len__ and __getitem__.

You may ask why? The reasons include:

  1. The popular method isinstance(obj, abc.Sequence) fails on some objects including PyTorch’s Tensor because they do not implement __contains__.
  2. Unfortunately, there is nothing in Python’s collections.abc that checks for only __len__ and __getitem__ which I feel are minimal methods for array-like objects.
  3. It works on list, tuple, ndarray, Tensor etc.

So without further ado:

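def is_array_like(obj, string_is_array=False, tuple_is_array=True):
    # Array-like here means: has a length and supports indexing.
    result = hasattr(obj, "__len__") and hasattr(obj, "__getitem__")
    # str, bytes and bytearray are listed directly (bytes and bytearray are what
    # abc.ByteString covered; abc.ByteString is deprecated since Python 3.12).
    if result and not string_is_array and isinstance(obj, (str, bytes, bytearray)):
        result = False
    if result and not tuple_is_array and isinstance(obj, tuple):
        result = False
    return result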

Note that I’ve added default parameters because most of the time you might want to consider strings as values, not arrays. Similarly for tuples.


Answer 7

>>> N=[2,3,5]
>>> P = 5
>>> type(P)==type(0)
True
>>> type([1,2])==type(N)
True
>>> type(P)==type([1,2])
False

Answer 8

You can check the data type of the variable.

N = [2,3,5]
P = 5
type(P)

This outputs the data type of P (shown here for Python 3; Python 2 prints <type 'int'>):

<class 'int'>

That way you can tell whether it is an integer or an array.


Answer 9

I am surprised that such a basic question doesn’t seem to have an immediate answer in Python. It seems to me that nearly all proposed answers use some kind of type checking, which is usually not advised in Python, and they seem restricted to specific cases (they fail with other numerical types or with generic iterable objects that are not tuples or lists).

For me, what works better is importing numpy and using array.size, for example:

>>> a=1
>>> np.array(a)
Out[1]: array(1)

>>> np.array(a).size
Out[2]: 1

>>> np.array([1,2]).size
Out[3]: 2

>>> np.array('125').size
Out[4]: 1

Note also:

>>> len(np.array([1,2]))

Out[5]: 2

but:

>>> len(np.array(a))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-40-f5055b93f729> in <module>()
----> 1 len(np.array(a))

TypeError: len() of unsized object

Answer 10

Simply use size instead of len!

>>> from numpy import array, size
>>> N = [2, 3, 5]
>>> size(N)
3
>>> N = array([2, 3, 5])
>>> size(N)
3
>>> P = 5
>>> size(P)
1
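
A related option, sketched here as an alternative rather than as part of the answer, is to normalize the input with np.atleast_1d so that the rest of the function can always use len. The helper name as_1d is hypothetical.

```python
import numpy as np


def as_1d(nbins):
    """Hypothetical helper: return the input as an array with at least one dimension."""
    return np.atleast_1d(nbins)


print(len(as_1d(50)))               # 1
print(len(as_1d([0, 10, 20, 30])))  # 4
```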

Answer 11

preds_test[0] has shape (128, 128, 1). Let’s check its data type using the isinstance() function, which takes two arguments: the first is the data and the second is the data type. isinstance(preds_test[0], np.ndarray) returns True, which means preds_test[0] is an array.


Answer 12

To answer the question in the title, a direct way to tell if a variable is a scalar is to try to convert it to a float. If you get TypeError, it’s not.

N = [1, 2, 3]
try:
    float(N)
except TypeError:
    print('it is not a scalar')
else:
    print('it is a scalar')
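
Be aware that float() also accepts numeric strings such as "1.5", and raises ValueError rather than TypeError for strings such as "abc". If strings should not count as scalars, one possible alternative (an assumption on my part, not part of the answer) is to test against numbers.Number; is_number is a hypothetical helper name.

```python
from numbers import Number


def is_number(x):
    """Return True for built-in numeric scalars (int, float, complex, bool)."""
    return isinstance(x, Number)


print(is_number(5))          # True
print(is_number(2.5))        # True
print(is_number("1.5"))      # False, although float("1.5") succeeds
print(is_number([1, 2, 3]))  # False
```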

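Constructing pandas DataFrame from values in variables gives "ValueError: If using all scalar values, you must pass an index"

Question: Constructing pandas DataFrame from values in variables gives "ValueError: If using all scalar values, you must pass an index"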

This may be a simple question, but I cannot figure out how to do this. Let’s say that I have two variables as follows.

a = 2
b = 3

I want to construct a DataFrame from this:

df2 = pd.DataFrame({'A':a,'B':b})

This generates an error:

ValueError: If using all scalar values, you must pass an index

I tried this also:

df2 = (pd.DataFrame({'a':a,'b':b})).reset_index()

This gives the same error message.


Answer 0

The error message says that if you’re passing scalar values, you have to pass an index. So you can either avoid scalar values for the columns, e.g. by using lists:

>>> df = pd.DataFrame({'A': [a], 'B': [b]})
>>> df
   A  B
0  2  3

or use scalar values and pass an index:

>>> df = pd.DataFrame({'A': a, 'B': b}, index=[0])
>>> df
   A  B
0  2  3
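
A self-contained sketch of both variants, plus the failing call, assuming pandas is installed. The variable names from_lists and with_index are only for illustration.

```python
import pandas as pd

a, b = 2, 3

from_lists = pd.DataFrame({"A": [a], "B": [b]})         # values wrapped in lists
with_index = pd.DataFrame({"A": a, "B": b}, index=[0])  # scalars plus an explicit index

print(from_lists.values.tolist())  # [[2, 3]]
print(with_index.values.tolist())  # [[2, 3]]

# All-scalar values without an index raise the error from the question.
try:
    pd.DataFrame({"A": a, "B": b})
except ValueError as exc:
    print("ValueError:", exc)
```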

Answer 1

You can also use pd.DataFrame.from_records which is more convenient when you already have the dictionary in hand:

df = pd.DataFrame.from_records([{ 'A':a,'B':b }])

You can also set index, if you want, by:

df = pd.DataFrame.from_records([{ 'A':a,'B':b }], index='A')

Answer 2

You need to create a pandas series first. The second step is to convert the pandas series to pandas dataframe.

import pandas as pd
data = {'a': 1, 'b': 2}
pd.Series(data).to_frame()

You can even provide a column name.

pd.Series(data).to_frame('ColumnName')

Answer 3

You may try wrapping your dictionary in a list:

my_dict = {'A':1,'B':2}

pd.DataFrame([my_dict])

   A  B
0  1  2

Answer 4

Maybe Series would provide all the functions you need:

pd.Series({'A':a,'B':b})

A DataFrame can be thought of as a collection of Series, hence you can:

  • Concatenate multiple Series into one data frame

  • Add a Series to an existing data frame


Answer 5

You need to provide iterables as the values for the Pandas DataFrame columns:

df2 = pd.DataFrame({'A':[a],'B':[b]})

Answer 6

I had the same problem with numpy arrays and the solution is to flatten them:

data = {
    'b': array1.flatten(),
    'a': array2.flatten(),
}

df = pd.DataFrame(data)

Answer 7

If you intend to convert a dictionary of scalars, you have to include an index:

import pandas as pd

alphabets = {'A': 'a', 'B': 'b'}
index = [0]
alphabets_df = pd.DataFrame(alphabets, index=index)
print(alphabets_df)

Although index is not required for a dictionary of lists, the same idea can be expanded to a dictionary of lists:

planets = {'planet': ['earth', 'mars', 'jupiter'], 'length_of_day': ['1', '1.03', '0.414']}
index = [0, 1, 2]
planets_df = pd.DataFrame(planets, index=index)
print(planets_df)

Of course, for the dictionary of lists, you can build the dataframe without an index:

planets_df = pd.DataFrame(planets)
print(planets_df)

Answer 8

You could try:

df2 = pd.DataFrame.from_dict({'a':a,'b':b}, orient = 'index')

From the documentation on the ‘orient’ argument: If the keys of the passed dict should be the columns of the resulting DataFrame, pass ‘columns’ (default). Otherwise if the keys should be rows, pass ‘index’.
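
Note that with orient='index' the dictionary keys become row labels, so the result has one row per key and a single column rather than one row with columns a and b. A short sketch, assuming pandas is installed:

```python
import pandas as pd

a, b = 2, 3
df2 = pd.DataFrame.from_dict({"a": a, "b": b}, orient="index")

print(list(df2.index))  # ['a', 'b']
print(df2.shape)        # (2, 1)
```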


Answer 9

Pandas magic at work. All logic is out.

The error message "ValueError: If using all scalar values, you must pass an index" says you must pass an index.

This does not necessarily mean that passing an index makes pandas do what you want it to do.

When you pass an index, pandas will treat your dictionary keys as column names and the values as what the column should contain for each of the values in the index.

a = 2
b = 3
df2 = pd.DataFrame({'A':a,'B':b}, index=[1])

    A   B
1   2   3

Passing a larger index:

df2 = pd.DataFrame({'A':a,'B':b}, index=[1, 2, 3, 4])

    A   B
1   2   3
2   2   3
3   2   3
4   2   3

An index is usually generated automatically by a dataframe when none is given. However, pandas does not know how many rows of 2 and 3 you want. You can, however, be more explicit about it:

df2 = pd.DataFrame({'A':[a]*4,'B':[b]*4})
df2

    A   B
0   2   3
1   2   3
2   2   3
3   2   3

The default index is 0 based though.

I would recommend always passing a dictionary of lists to the dataframe constructor when creating dataframes. It’s easier for other developers to read. Pandas has a lot of caveats; don’t make other developers have to be experts in all of them in order to read your code.


Answer 10

The input does not have to be a list of records; it can be a single dictionary as well:

pd.DataFrame.from_records({'a':1,'b':2}, index=[0])
   a  b
0  1  2

Which seems to be equivalent to:

pd.DataFrame({'a':1,'b':2}, index=[0])
   a  b
0  1  2

Answer 11

This is because a DataFrame has two intuitive dimensions – the columns and the rows.

You are only specifying the columns using the dictionary keys.

If you only want to specify one dimensional data, use a Series!


Answer 12

Convert Dictionary to Data Frame

col_dict_df = pd.Series(col_dict).to_frame('new_col').reset_index()

Give the columns new names

col_dict_df.columns = ['col1', 'col2']

Answer 13

If you have a dictionary you can turn it into a pandas data frame with the following line of code:

pd.DataFrame({"key": list(d.keys()), "value": list(d.values())})

Answer 14

Just pass the dict in a list:

a = 2
b = 3
df2 = pd.DataFrame([{'A':a,'B':b}])