Tag archive: arrays

How do I create an empty array/matrix in NumPy?

Question: How do I create an empty array/matrix in NumPy?

I can’t figure out how to use an array or matrix in the way that I would normally use a list. I want to create an empty array (or matrix) and then add one column (or row) to it at a time.

At the moment the only way I can find to do this is like:

mat = None
for col in columns:
    if mat is None:
        mat = col
    else:
        mat = hstack((mat, col))

Whereas if it were a list, I’d do something like this:

list = []
for item in data:
    list.append(item)

Is there a way to use that kind of notation for NumPy arrays or matrices?


Answer 0

You have the wrong mental model for using NumPy efficiently. NumPy arrays are stored in contiguous blocks of memory. If you want to add rows or columns to an existing array, the entire array needs to be copied to a new block of memory, creating gaps for the new elements to be stored. This is very inefficient if done repeatedly to build an array.

In the case of adding rows, your best bet is to create an array that is as big as your data set will eventually be, and then assign data to it row-by-row:

>>> import numpy
>>> a = numpy.zeros(shape=(5,2))
>>> a
array([[ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.]])
>>> a[0] = [1,2]
>>> a[1] = [2,3]
>>> a
array([[ 1.,  2.],
       [ 2.,  3.],
       [ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.]])
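
The same preallocation idea applies to the question's column-by-column case. The following is only a minimal sketch: the `columns` data is made up for illustration, and it assumes the number and length of the columns are known before the loop starts.

```python
import numpy as np

# Hypothetical input: three columns of equal length (illustrative data only).
columns = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

n_rows = len(columns[0])
n_cols = len(columns)

# Allocate the full array once, then assign each column in place.
mat = np.zeros((n_rows, n_cols))
for j, col in enumerate(columns):
    mat[:, j] = col

print(mat)
```

Because the array is allocated once, each assignment writes into existing memory and no intermediate copies of the whole array are made.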

Answer 1

A NumPy array is a very different data structure from a list and is designed to be used in different ways. Your use of hstack is potentially very inefficient… every time you call it, all the data in the existing array is copied into a new one. (The append function will have the same issue.) If you want to build up your matrix one column at a time, you might be best off to keep it in a list until it is finished, and only then convert it into an array.

e.g.


mylist = []
for item in data:
    mylist.append(item)
mat = numpy.array(mylist)

item can be a list, an array or any iterable, as long as each item has the same number of elements.
In this particular case (data is some iterable holding the matrix columns) you can simply use


mat = numpy.array(data)

(Also note that using list as a variable name is probably not good practice since it masks the built-in type by that name, which can lead to bugs.)

EDIT:

If for some reason you really do want to create an empty array, you can just use numpy.array([]), but this is rarely useful!
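
One detail worth spelling out: numpy.array treats each collected item as a row. If the items are meant to be columns, as in the question, numpy.column_stack (or transposing the result) puts them side by side. A small sketch, with made-up `columns` data standing in for the real input:

```python
import numpy as np

# Hypothetical input: two columns, each a 1-D array of length 3.
columns = [np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0, 6.0])]

collected = []
for col in columns:
    collected.append(col)          # cheap list append, no array copying

mat_rows = np.array(collected)         # shape (2, 3): each item becomes a row
mat_cols = np.column_stack(collected)  # shape (3, 2): each item becomes a column

print(mat_rows.shape, mat_cols.shape)
```

The conversion to an array happens once, after the loop, so the cost of copying is paid a single time.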


Answer 2

To create an empty multidimensional array in NumPy (e.g. a 2D m*n array to store your matrix) when you don’t know m, the number of rows you will append, and don’t care about the computational cost Stephen Simmons mentioned (namely rebuilding the array at each append), you can set the dimension you want to append along to 0: X = np.empty(shape=[0, n]).

This way you can use for example (here m = 5 which we assume we didn’t know when creating the empty matrix, and n = 2):

import numpy as np

n = 2
X = np.empty(shape=[0, n])

for i in range(5):
    for j  in range(2):
        X = np.append(X, [[i, j]], axis=0)

print(X)

which will give you:

[[ 0.  0.]
 [ 0.  1.]
 [ 1.  0.]
 [ 1.  1.]
 [ 2.  0.]
 [ 2.  1.]
 [ 3.  0.]
 [ 3.  1.]
 [ 4.  0.]
 [ 4.  1.]]

Answer 3

I looked into this a lot because I needed to use a numpy.array as a set in one of my school projects and I needed it to be initialized empty. I didn’t find any relevant answer here on Stack Overflow, so I started doodling something.

# Initialize your variable as an empty list first
In [32]: x=[]
# and now cast it as a numpy ndarray
In [33]: x=np.array(x)

The result will be:

In [34]: x
Out[34]: array([], dtype=float64)

Therefore you can directly initialize an np array as follows:

In [36]: x= np.array([], dtype=np.float64)

I hope this helps.


Answer 4

You can use the append function. The array must already be 2-D, and append returns a new array rather than modifying a in place. For rows:

>>> from numpy import *
>>> a = array([[10,20,30]])
>>> a = append(a, [[1,2,3]], axis=0)
>>> a
array([[10, 20, 30],
       [ 1,  2,  3]])

For columns:

>>> append(a, [[15],[15]], axis=1)
array([[10, 20, 30, 15],
       [ 1,  2,  3, 15]])

EDIT
Of course, as mentioned in other answers, unless you’re doing some processing (ex. inversion) on the matrix/array EVERY time you append something to it, I would just create a list, append to it then convert it to an array.


Answer 5

If you absolutely don’t know the final size of the array, you can increment the size of the array like this:

my_arr = numpy.zeros((0,5))
for i in range(3):
    my_arr=numpy.concatenate( ( my_arr, numpy.ones((1,5)) ) )
print(my_arr)

[[ 1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.]]
  • Notice the 0 in the first line.
  • numpy.append is another option. It calls numpy.concatenate.
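
If the pieces arrive one at a time but nothing needs the full array until the end, a common alternative is to collect them in a list and concatenate once. This is a sketch under that assumption; the per-iteration data here is invented for illustration.

```python
import numpy as np

chunks = []
for i in range(3):
    # Hypothetical per-iteration result: one row of 5 values.
    chunks.append(np.full((1, 5), float(i)))

# A single concatenate copies every row exactly once.
# Note: numpy.concatenate raises ValueError if `chunks` is empty.
my_arr = np.concatenate(chunks, axis=0)

print(my_arr.shape)
```

Concatenating inside the loop copies all previously accumulated rows on every iteration, whereas this version copies each row only once.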

Answer 6

You can use a list comprehension to build any kind of list, such as a list of zeros:

a = range(5)
a = [i*0 for i in a]
print(a)
[0, 0, 0, 0, 0]

Answer 7

Here is a workaround that makes NumPy arrays behave more like lists:

np_arr = np.array([])
np_arr = np.append(np_arr , 2)
np_arr = np.append(np_arr , 24)
print(np_arr)

Output: [ 2. 24.]


Answer 8

Depending on what you are using this for, you may need to specify the data type (see ‘dtype’).

For example, to create a 2D array of 8-bit values (suitable for use as a monochrome image):

myarray = numpy.empty(shape=(H,W),dtype='u1')

For an RGB image, include the number of color channels in the shape: shape=(H,W,3)

You may also want to consider zero-initializing with numpy.zeros instead of using numpy.empty. See the note here.


Answer 9

I think you want to do most of the work with lists and then use the result as a matrix. Maybe this is one way:

ur_list = []
for col in columns:
    ur_list.append(list(col))

mat = np.matrix(ur_list)

Answer 10

I think you can create an empty NumPy array like this:

>>> import numpy as np
>>> empty_array= np.zeros(0)
>>> empty_array
array([], dtype=float64)
>>> empty_array.shape
(0,)

This form is useful when you want to append NumPy arrays to it in a loop.


Answer 11

There are two ways to create an empty NumPy array without defining its shape:

1.

arr = np.array([])

This is preferred, because you know you will be using it as a NumPy array.

2.

arr = []
# use it as a plain list: append to it, etc.

NumPy converts this to an np.ndarray afterward (for example via np.array(arr)), without an extra [] dimension.


What is the difference between NumPy's array() and asarray() functions?

Question: What is the difference between NumPy's array() and asarray() functions?

What is the difference between Numpy’s array() and asarray() functions? When should you use one rather than the other? They seem to generate identical output for all the inputs I can think of.


Answer 0

Since other questions are being redirected to this one which ask about asanyarray or other array creation routines, it’s probably worth having a brief summary of what each of them does.

The differences are mainly about when to return the input unchanged, as opposed to making a new array as a copy.

array offers a wide variety of options (most of the other functions are thin wrappers around it), including flags to determine when to copy. A full explanation would take just as long as the docs (see Array Creation), but briefly, here are some examples:

Assume a is an ndarray, and m is a matrix, and they both have a dtype of float32:

  • np.array(a) and np.array(m) will copy both, because that’s the default behavior.
  • np.array(a, copy=False) and np.array(m, copy=False) will copy m but not a, because m is not an ndarray.
  • np.array(a, copy=False, subok=True) and np.array(m, copy=False, subok=True) will copy neither, because m is a matrix, which is a subclass of ndarray.
  • np.array(a, dtype=int, copy=False, subok=True) will copy both, because the dtype is not compatible.

Most of the other functions are thin wrappers around array that control when copying happens:

  • asarray: The input will be returned uncopied iff it’s a compatible ndarray (copy=False).
  • asanyarray: The input will be returned uncopied iff it’s a compatible ndarray or subclass like matrix (copy=False, subok=True).
  • ascontiguousarray: The input will be returned uncopied iff it’s a compatible ndarray in contiguous C order (copy=False, order='C').
  • asfortranarray: The input will be returned uncopied iff it’s a compatible ndarray in contiguous Fortran order (copy=False, order='F').
  • require: The input will be returned uncopied iff it’s compatible with the specified requirements string.
  • copy: The input is always copied.
  • fromiter: The input is treated as an iterable (so, e.g., you can construct an array from an iterator’s elements, instead of an object array with the iterator); always copied.

There are also convenience functions, like asarray_chkfinite (same copying rules as asarray, but raises ValueError if there are any nan or inf values), and constructors for subclasses like matrix or for special cases like record arrays, and of course the actual ndarray constructor (which lets you create an array directly out of strides over a buffer).
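
The copying rules summarized above can be checked directly with identity and memory tests. This is a minimal sketch using only plain ndarrays (no matrix), and the variable names are illustrative. It deliberately avoids passing copy=False explicitly: in NumPy 2.0 and later that flag means "never copy" and raises ValueError when a copy is unavoidable, whereas older releases treated it as "copy only if needed".

```python
import numpy as np

a = np.ones((2, 2), dtype=np.float32)

copied = np.array(a)                         # default behavior: always a new array
same = np.asarray(a)                         # already a compatible ndarray: returned as-is
converted = np.asarray(a, dtype=np.float64)  # dtype differs, so a copy is required

print(copied is a)                      # False
print(same is a)                        # True
print(np.shares_memory(converted, a))   # False
```

Checking with `is` and numpy.shares_memory is a quick way to confirm whether a given call handed back the input or a fresh copy.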


Answer 1

The definition of asarray is:

def asarray(a, dtype=None, order=None):
    return array(a, dtype, copy=False, order=order)

So it is like array, except it has fewer options, and copy=False. array has copy=True by default.

The main difference is that array (by default) will make a copy of the object, while asarray will not unless necessary.


Answer 2

The difference can be demonstrated by this example:

  1. generate a matrix

    >>> A = numpy.matrix(numpy.ones((3,3)))
    >>> A
    matrix([[ 1.,  1.,  1.],
            [ 1.,  1.,  1.],
            [ 1.,  1.,  1.]])
    
  2. use numpy.array to modify A. Doesn’t work because you are modifying a copy

    >>> numpy.array(A)[2]=2
    >>> A
    matrix([[ 1.,  1.,  1.],
            [ 1.,  1.,  1.],
            [ 1.,  1.,  1.]])
    
  3. use numpy.asarray to modify A. It worked because you are modifying A itself

    >>> numpy.asarray(A)[2]=2
    >>> A
    matrix([[ 1.,  1.,  1.],
            [ 1.,  1.,  1.],
            [ 2.,  2.,  2.]])
    

Hope this helps!


Answer 3

The differences are mentioned quite clearly in the documentation of array and asarray. The differences lie in the argument list and hence the action of the function depending on those parameters.

The function definitions are :

numpy.array(object, dtype=None, copy=True, order=None, subok=False, ndmin=0)

and

numpy.asarray(a, dtype=None, order=None)

The following arguments are those that may be passed to array and not asarray as mentioned in the documentation :

copy : bool, optional If true (default), then the object is copied. Otherwise, a copy will only be made if __array__ returns a copy, if obj is a nested sequence, or if a copy is needed to satisfy any of the other requirements (dtype, order, etc.).

subok : bool, optional If True, then sub-classes will be passed-through, otherwise the returned array will be forced to be a base-class array (default).

ndmin : int, optional Specifies the minimum number of dimensions that the resulting array should have. Ones will be pre-pended to the shape as needed to meet this requirement.


Answer 4

Here’s a simple example that can demonstrate the difference.

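The main difference is that array always makes a copy of the original data, so changes made through the new object do not modify the original array. asarray copies only when it has to, for example when a dtype conversion is requested, as below: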

import numpy as np
a = np.arange(0.0, 10.2, 0.12)
int_cvr = np.asarray(a, dtype = np.int64)

The contents of the array a remain untouched, and we can still perform any operation on the data through the other object without modifying the content of the original array.


Answer 5

asarray(x) is like array(x, copy=False)

Use asarray(x) when you want to ensure that x will be an array before any other operations are done. If x is already an array then no copy would be done. It would not cause a redundant performance hit.

Here is an example of a function that ensures x is converted into an array first.

def mysum(x):
    return np.asarray(x).sum()

How to declare and add items to an array in Python?

Question: How to declare and add items to an array in Python?

I’m trying to add items to an array in python.

I run

array = {}

Then, I try to add something to this array by doing:

array.append(valueToBeInserted)

There doesn’t seem to be a .append method for this. How do I add items to an array?


Answer 0

{} represents an empty dictionary, not an array/list. For lists or arrays, you need [].

To initialize an empty list do this:

my_list = []

or

my_list = list()

To add elements to the list, use append

my_list.append(12)

To extend the list to include the elements from another list use extend

my_list.extend([1,2,3,4])
my_list
--> [12,1,2,3,4]

To remove an element from a list use remove

my_list.remove(2)

Dictionaries represent a collection of key/value pairs also known as an associative array or a map.

To initialize an empty dictionary use {} or dict()

Dictionaries have keys and values

my_dict = {'key':'value', 'another_key' : 0}

To extend a dictionary with the contents of another dictionary you may use the update method

my_dict.update({'third_key' : 1})

To remove a value from a dictionary

del my_dict['key']

Answer 1

No, if you do:

array = {}

In your example you are using array as a dictionary, not an array. If you need an array, in Python you use lists:

array = []

Then, to add items you do:

array.append('a')

Answer 2

Arrays (called list in Python) use the [] notation. {} is for dict (also called hash tables, associative arrays, etc. in other languages), so you won’t have ‘append’ for a dict.

If you actually want an array (list), use:

array = []
array.append(valueToBeInserted)

Answer 3

Just for sake of completion, you can also do this:

array = []
array += [valueToBeInserted]

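If it’s a list of strings, be careful: array += 'string' iterates over the string and adds each character as a separate item. To add the whole string as one item, wrap it in a list:

array += ['string']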

Answer 4

In some languages like Java you define an array using curly braces as follows, but in Python braces have a different meaning:

Java:

int[] myIntArray = {1,2,3};
String[] myStringArray = {"a","b","c"};

However, in Python, curly braces are used to define dictionaries, which needs a key:value assignment as {'a':1, 'b':2}

To actually define an array (which is actually called list in python) you can do:

Python:

mylist = [1,2,3]

or other examples like:

mylist = list()
mylist.append(1)
mylist.append(2)
mylist.append(3)
print(mylist)
>>> [1,2,3]

Answer 5

You can also do:

array = numpy.append(array, value)

Note that the numpy.append() method returns a new object, so if you want to modify your initial array, you have to write: array = ...
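
A short sketch of that behavior, with illustrative variable names: the original array is left unchanged, and the appended result only exists in the returned object.

```python
import numpy as np

original = np.array([1, 2, 3])

# numpy.append builds and returns a new array; `original` is not modified.
result = np.append(original, 4)

print(original)  # [1 2 3]
print(result)    # [1 2 3 4]
```

This is why the assignment back to the same name is needed if the intent is to "grow" the array.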


Answer 6

I believe you are all wrong. You need to do:

array = [] in order to define it, and then:

array.append("hello") to add to it.


Check if multiple strings exist in another string

Question: Check if multiple strings exist in another string

How can I check if any of the strings in an array exists in another string?

Like:

a = ['a', 'b', 'c']
str = "a123"
if a in str:
  print "some of the strings found in str"
else:
  print "no strings found in str"

That code doesn’t work, it’s just to show what I want to achieve.


Answer 0

You can use any:

a_string = "A string is more than its parts!"
matches = ["more", "wholesome", "milk"]

if any(x in a_string for x in matches):
    print("found a match")

Similarly to check if all the strings from the list are found, use all instead of any.
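
The two checks can be wrapped in small helpers. This is only a sketch: contains_any and contains_all are hypothetical names, not standard library functions. It also shows the edge case of an empty candidate list, where any returns False and all returns True.

```python
def contains_any(text, candidates):
    """True if at least one candidate is a substring of text."""
    return any(c in text for c in candidates)


def contains_all(text, candidates):
    """True if every candidate is a substring of text."""
    return all(c in text for c in candidates)


a_string = "A string is more than its parts!"
matches = ["more", "wholesome", "milk"]

print(contains_any(a_string, matches))  # True
print(contains_all(a_string, matches))  # False
print(contains_any(a_string, []))       # False
print(contains_all(a_string, []))       # True
```

Both any and all short-circuit on a generator expression, so the scan stops at the first decisive candidate.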


Answer 1

any() is by far the best approach if all you want is True or False, but if you want to know specifically which string/strings match, you can use a couple things.

If you want the first match (with False as a default):

match = next((x for x in a if x in str), False)

If you want to get all matches (including duplicates):

matches = [x for x in a if x in str]

If you want to get all non-duplicate matches (disregarding order):

matches = {x for x in a if x in str}

If you want to get all non-duplicate matches in the right order:

matches = []
for x in a:
    if x in str and x not in matches:
        matches.append(x)

Answer 2

You should be careful if the strings in a or str get longer. The straightforward solutions take O(S*(A^2)), where S is the length of str and A is the sum of the lengths of all strings in a. For a faster solution, look at the Aho-Corasick algorithm for string matching, which runs in linear time O(S+A).


Answer 3

Just to add some diversity with regex:

import re

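if any(re.findall(r'a|b|c', str, re.IGNORECASE)):
    print('possible matches thanks to regex')
else:
    print('no matches')

or, if your list is too long to write out by hand, build the pattern from it (re.escape keeps any regex metacharacters literal): any(re.findall('|'.join(map(re.escape, a)), str, re.IGNORECASE))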
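
The regex approach can be packaged as a helper. The sketch below is one possible shape, not the answer's own code: find_any is a hypothetical name, it uses re.search instead of re.findall so the scan stops at the first hit, and it escapes every search string so characters such as "." are matched literally.

```python
import re


def find_any(needles, haystack, ignore_case=False):
    """Return the first matched text from needles found in haystack, or None."""
    if not needles:
        # An empty alternation would match the empty string everywhere.
        return None
    pattern = "|".join(re.escape(n) for n in needles)
    flags = re.IGNORECASE if ignore_case else 0
    m = re.search(pattern, haystack, flags)
    return m.group(0) if m else None


print(find_any(["a.b", "c"], "xxa.byy"))                 # a.b
print(find_any(["a.b"], "aXb"))                          # None
print(find_any(["MORE"], "some more", ignore_case=True)) # more
```

Without re.escape, the pattern "a.b" would also match "aXb", which is usually not what a plain substring check intends.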


Answer 4

You need to iterate on the elements of a.

a = ['a', 'b', 'c']
str = "a123"
found_a_string = False
for item in a:    
    if item in str:
        found_a_string = True

if found_a_string:
    print("found a match")
else:
    print("no match found")

Answer 5

jbernadas already mentioned the Aho-Corasick-Algorithm in order to reduce complexity.

Here is one way to use it in Python:

  1. Download aho_corasick.py from here

  2. Put it in the same directory as your main Python file and name it aho_corasick.py

  3. Try the algorithm with the following code:

    from aho_corasick import aho_corasick #(string, keywords)
    
    print(aho_corasick(string, ["keyword1", "keyword2"]))
    

Note that the search is case-sensitive


Answer 6

a = ['a', 'b', 'c']
str =  "a123"

a_match = [True for match in a if match in str]

if True in a_match:
  print("some of the strings found in str")
else:
  print("no strings found in str")

Answer 7

It depends on the context. If you want to check for a single literal (any single character such as a, e, w, etc.), in is enough:

original_word = "hackerearth"
if 'h' in original_word:
    print("YES")

If you want to check whether any character of original_word appears in the input, make use of any:

if any(your_required in yourinput for your_required in original_word):
    print("yes")

If you want every character of original_word to appear in the input, make use of all:

original_word = ['h', 'a', 'c', 'k', 'e', 'r', 'e', 'a', 'r', 't', 'h']
yourinput = str(input()).lower()
if all(requested_word in yourinput for requested_word in original_word):
    print("yes")

Answer 8

Just some more info on how to get all list elements available in the string:

a = ['a', 'b', 'c']
str = "a123" 
list(filter(lambda x:  x in str, a))

Answer 9

A surprisingly fast approach is to use set:

a = ['a', 'b', 'c']
str = "a123"
if set(a) & set(str):
    print("some of the strings found in str")
else:
    print("no strings found in str")

This works if a does not contain any multiple-character values (in which case use any as listed above). If so, it’s simpler to specify a as a string: a = 'abc'.


Answer 10

strlist = ['SUCCESS', 'Done', 'SUCCESSFUL']
res = False
# Read the log file and flag it if any line contains one of the strings
with open('test.txt', 'r') as flog:
    for line in flog:
        for fstr in strlist:
            if line.find(fstr) != -1:
                print('found')
                res = True

if res:
    print('res true')
else:
    print('res false')


Answer 11

I would use this kind of function for speed:

def check_string(string, substring_list):
    for substring in substring_list:
        if substring in string:
            return True
    return False

Answer 12

data = "firstName and favoriteFood"
mandatory_fields = ['firstName', 'lastName', 'age']


# for each
for field in mandatory_fields:
    if field not in data:
        print("Error, missing req field {0}".format(field))

# still fine, multiple if statements
if ('firstName' not in data or
    'lastName' not in data or
    'age' not in data):
    print("Error, missing a req field")

# not very readable, list comprehension
missing_fields = [x for x in mandatory_fields if x not in data]
if len(missing_fields) > 0:
    print("Error, missing fields {0}".format(", ".join(missing_fields)))

Python list vs. array: when to use?

Question: Python list vs. array: when to use?

If you are creating a 1d array, you can implement it as a List, or else use the ‘array’ module in the standard library. I have always used Lists for 1d arrays.

What is the reason or circumstance where I would want to use the array module instead?

Is it for performance and memory optimization, or am I missing something obvious?


Answer 0

Basically, Python lists are very flexible and can hold completely heterogeneous, arbitrary data, and they can be appended to very efficiently, in amortized constant time. If you need to shrink and grow your list time-efficiently and without hassle, they are the way to go. But they use a lot more space than C arrays.

The array.array type, on the other hand, is just a thin wrapper on C arrays. It can hold only homogeneous data, all of the same type, and so it uses only sizeof(one object) * length bytes of memory. Mostly, you should use it when you need to expose a C array to an extension or a system call (for example, ioctl or fctnl).

array.array is also a reasonable way to represent a mutable string in Python 2.x (array('B', bytes)). However, Python 2.6+ and 3.x offer a mutable byte string as bytearray.

However, if you want to do math on a homogeneous array of numeric data, then you’re much better off using NumPy, which can automatically vectorize operations on complex multi-dimensional arrays.

To make a long story short: array.array is useful when you need a homogeneous C array of data for reasons other than doing math.
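
As a rough illustration of the space difference described above, the sketch below compares a list with an array.array holding the same integers. The variable names are made up for this example, and the exact byte counts depend on the platform and Python build, so no specific numbers are claimed.

```python
import sys
from array import array

values = list(range(1000))
typed = array('i', values)  # 'i' = signed C int

# The array's payload is exactly itemsize * length bytes.
payload_bytes = typed.itemsize * len(typed)

# The list stores one pointer per element; each int is a separate
# Python object on top of that, which getsizeof does not count.
list_bytes = sys.getsizeof(values)

# Homogeneity is enforced: a float cannot go into an array of C ints.
try:
    typed.append(1.5)
    rejected = False
except TypeError:
    rejected = True

print(payload_bytes, list_bytes, rejected)
```

The list figure understates the real cost because the integer objects themselves live elsewhere in memory.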


回答 1

在几乎所有情况下,正常列表都是正确的选择。数组模块更像是C数组的一个薄包装器,它为您提供了一种强类型的容器(请参阅docs),可以访问更多类似C的类型,例如有符号/无符号short或double,这不是构建的一部分-in类型。我说只有在确实需要时才使用arrays模块,在所有其他情况下,都坚持使用列表。

For almost all cases the normal list is the right choice. The array module is more like a thin wrapper over C arrays, which gives you a kind of strongly typed container (see docs), with access to more C-like types such as signed/unsigned short or double, which are not part of the built-in types. I’d say use the array module only if you really need it; in all other cases stick with lists.


回答 2

如果您不知道为什么要使用它,那么数组模块就是其中一种您可能不需要的东西(请注意,我并不是要以居高临下的方式来说明这一点!) 。大多数时候,数组模块用于与C代码进行接口。为您提供有关性能问题的更直接答案:

对于某些用途,数组比列表更有效。如果需要分配一个您不会更改的数组,那么数组可以更快并且使用更少的内存。GvR有一个优化轶事,其中阵列模块是赢家(长期阅读,但值得)。

另一方面,列表消耗的内存比数组多的部分原因是因为当所有分配的元素都被使用时,python将分配一些额外的元素。这意味着将项目追加到列表的速度更快。因此,如果您计划添加项目,则要使用列表。

TL; DR如果您有特殊的优化需求或需要与C代码进行接口(并且不能使用pyrex),则仅使用数组。

The array module is kind of one of those things that you probably don’t have a need for if you don’t know why you would use it (and take note that I’m not trying to say that in a condescending manner!). Most of the time, the array module is used to interface with C code. To give you a more direct answer to your question about performance:

Arrays are more efficient than lists for some uses. If you need to allocate an array that you KNOW will not change, then arrays can be faster and use less memory. GvR has an optimization anecdote in which the array module comes out to be the winner (long read, but worth it).

On the other hand, part of the reason why lists eat up more memory than arrays is because python will allocate a few extra elements when all allocated elements get used. This means that appending items to lists is faster. So if you plan on adding items, a list is the way to go.
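
The over-allocation mentioned above can be observed, at least on CPython, by watching sys.getsizeof while appending. This is only a sketch: the growth pattern is an implementation detail and may differ between Python versions and interpreters.

```python
import sys

items = []
sizes = []
for i in range(64):
    items.append(i)
    sizes.append(sys.getsizeof(items))

# Count how often the reported size changed between consecutive appends.
# Each change corresponds to the list growing its underlying storage.
resizes = sum(1 for prev, cur in zip(sizes, sizes[1:]) if cur != prev)

print(resizes, "resizes for", len(items), "appends")
```

Because the storage grows in chunks, most appends do not trigger a reallocation, which is what makes appending cheap on average.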

TL;DR I’d only use an array if you had an exceptional optimization need or you need to interface with C code (and can’t use pyrex).


回答 3

这是一个权衡!

每个人的优点:

清单

  • 灵活
  • 可以是异构的

数组(例如:numpy数组)

  • 统一值数组
  • 同质
  • 紧凑(尺寸)
  • 高效(功能和速度)
  • 方便

It’s a trade-off!

Pros of each one:

list

  • flexible
  • can be heterogeneous

array (ex: numpy array)

  • array of uniform values
  • homogeneous
  • compact (in size)
  • efficient (functionality and speed)
  • convenient

回答 4

我的理解是,数组的存储效率更高(例如,内存的连续块与指向Python对象的指针),但是我不知道任何性能上的好处。另外,对于数组,您必须存储相同类型的原语,而列表可以存储任何内容。

My understanding is that arrays are stored more efficiently (i.e. as contiguous blocks of memory vs. pointers to Python objects), but I am not aware of any performance benefit. Additionally, with arrays you must store primitives of the same type, whereas lists can store anything.


回答 5

标准库数组对于二进制I / O很有用,例如将整数列表转换为要写入例如wave文件的字符串。也就是说,正如许多人已经指出的那样,如果您要进行任何实际工作,则应考虑使用NumPy。

The standard library arrays are useful for binary I/O, such as translating a list of ints to a string to write to, say, a wave file. That said, as many have already noted, if you’re going to do any real work then you should consider using NumPy.
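
A minimal sketch of the binary I/O use case, assuming Python 3 (where the result is a bytes object rather than a string). The sample values are arbitrary; writing them into an actual wave file would additionally need the wave module, which is not shown here.

```python
from array import array

# 'h' = signed short, 2 bytes on common platforms
samples = array('h', [0, 1000, -1000, 32767])

# Pack the values into raw bytes, ready to be written to a binary file.
raw = samples.tobytes()

# Reading works the other way round.
restored = array('h')
restored.frombytes(raw)

print(len(raw), restored.tolist())
```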


回答 6

如果要使用数组,请考虑使用numpy或scipy包,它们为数组提供了更大的灵活性。

If you’re going to be using arrays, consider the numpy or scipy packages, which give you arrays with a lot more flexibility.


回答 7

数组只能用于特定类型,而列表可以用于任何对象。

数组也只能是一种类型的数据,而列表可以具有各种对象类型的条目。

数组对于某些数值计算也更加有效。

Arrays can only be used for specific types, whereas lists can be used for any object.

Arrays can also only hold data of one type, whereas a list can have entries of various object types.

Arrays are also more efficient for some numerical computation.


回答 8

numpy数组和list之间的重要区别是,数组切片是原始数组上的视图。这意味着不会复制数据,并且对视图的任何修改将反映在源数组中。

An important difference between numpy array and list is that array slices are views on the original array. This means that the data is not copied, and any modifications to the view will be reflected in the source array.
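
The difference can be checked directly. In this sketch (names chosen for illustration), writing through a NumPy slice changes the source array, while writing to a list slice does not touch the original list.

```python
import numpy as np

arr = np.arange(5)
window = arr[1:4]          # a view: no data is copied
window[0] = 99             # also changes arr[1]
shares = np.shares_memory(arr, window)

lst = list(range(5))
sub = lst[1:4]             # a new list: elements are copied
sub[0] = 99                # lst is unchanged

# Use .copy() when an independent NumPy slice is needed.
independent = arr[1:4].copy()
independent[0] = -1        # arr is unchanged by this

print(arr, lst, shares)
```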


回答 9

这个答案将总结几乎所有有关何时使用List和Array的查询:

  1. 这两种数据类型之间的主要区别是可以对它们执行的操作。例如,您可以将数组除以3,然后将数组的每个元素除以3。使用列表无法完成相同的操作。

  2. 该列表是python语法的一部分,因此不需要声明它,而您必须在使用它之前声明该数组。

  3. 您可以将不同数据类型的值存储在列表中(异构),而在Array中,您只能存储相同数据类型的值(异构)。

  4. 数组具有丰富的功能和快速的功能,与列表相比,它广泛用于算术运算和存储大量数据。

  5. 与列表相比,数组占用的内存更少。

This answer will sum up almost all the queries about when to use List and Array:

  1. The main difference between these two data types is the operations you can perform on them. For example, you can divide a NumPy array by 3 and each element of the array will be divided by 3. The same cannot be done with a list.

  2. Lists are built into Python’s syntax, so nothing needs to be imported to use them, whereas arrays have to be imported (from the array module or NumPy) before use.

  3. You can store values of different data types in a list (heterogeneous), whereas in an array you can only store values of the same data type (homogeneous).

  4. Arrays are fast and rich in functionality, so compared to lists they are widely used for arithmetic operations and for storing large amounts of data.

  5. Arrays take less memory compared to lists.
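
Point 1 can be sketched as follows, assuming a NumPy array (the standard library array.array does not support division either). The names are illustrative only.

```python
import numpy as np

arr = np.array([3, 6, 9])
divided = arr / 3                  # element-wise division

nums = [3, 6, 9]
try:
    nums / 3                       # lists do not define division
    list_supports_division = True
except TypeError:
    list_supports_division = False

# With a list, the loop has to be written out explicitly.
divided_list = [x / 3 for x in nums]

print(divided, list_supports_division, divided_list)
```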


块数组尺寸

问题:块数组尺寸

我目前正在尝试学习Numpy和Python。给定以下数组:

import numpy as np
a = np.array([[1,2],[1,2]])

有没有返回尺寸的函数a(ega是2 x 2数组)?

size() 返回4并没有太大帮助。

I’m currently trying to learn Numpy and Python. Given the following array:

import numpy as np
a = np.array([[1,2],[1,2]])

Is there a function that returns the dimensions of a (e.g. a is a 2 by 2 array)?

size() returns 4 and that doesn’t help very much.


回答 0

.shape

ndarray。 数组尺寸的形状
元组。

从而:

>>> a.shape
(2, 2)

It is .shape:

ndarray.shape
Tuple of array dimensions.

Thus:

>>> a.shape
(2, 2)

回答 1

第一:

按照惯例,在Python世界中,的快捷方式numpynp,因此:

In [1]: import numpy as np

In [2]: a = np.array([[1,2],[3,4]])

第二:

在Numpy中,维度轴/轴形状是相关的,有时是相似的概念:

尺寸

在“ 数学/物理学”中,维或维数被非正式地定义为指定空间中任何点所需的最小坐标数。但在numpy的,根据numpy的文档,这是相同的轴线/轴:

在Numpy中,尺寸称为轴。轴数为等级。

In [3]: a.ndim  # num of dimensions/axes, *Mathematics definition of dimension*
Out[3]: 2

轴/轴

在Numpy中索引an 的第n个坐标array。多维数组每个轴可以有一个索引。

In [4]: a[1,0]  # to index `a`, we specify 1 at the first axis and 0 at the second axis.
Out[4]: 3  # which results in 3 (located at row 1 and column 0, 0-based index)

形状

描述沿每个可用轴有多少数据(或范围)。

In [5]: a.shape
Out[5]: (2, 2)  # both the first and second axis have 2 (columns/rows/pages/blocks/...) data

First:

By convention, in the Python world, the shortcut for numpy is np, so:

In [1]: import numpy as np

In [2]: a = np.array([[1,2],[3,4]])

Second:

In Numpy, dimension, axis/axes, shape are related and sometimes similar concepts:

dimension

In Mathematics/Physics, dimension or dimensionality is informally defined as the minimum number of coordinates needed to specify any point within a space. But in Numpy, according to the numpy doc, it’s the same as axis/axes:

In Numpy dimensions are called axes. The number of axes is rank.

In [3]: a.ndim  # num of dimensions/axes, *Mathematics definition of dimension*
Out[3]: 2

axis/axes

the nth coordinate used to index an array in Numpy. Multidimensional arrays can have one index per axis.

In [4]: a[1,0]  # to index `a`, we specify 1 at the first axis and 0 at the second axis.
Out[4]: 3  # which results in 3 (located at row 1 and column 0, 0-based index)

shape

describes how many elements (the extent) there are along each available axis.

In [5]: a.shape
Out[5]: (2, 2)  # both the first and second axis have 2 (columns/rows/pages/blocks/...) data
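
The three concepts are easier to tell apart on an array whose axes all have different lengths. A small sketch (the array name is arbitrary):

```python
import numpy as np

cube = np.zeros((2, 3, 4))

n_axes = cube.ndim              # number of axes: 3
extents = cube.shape            # length along each axis: (2, 3, 4)
n_items = cube.size             # total number of elements: 2 * 3 * 4 = 24
middle = cube.shape[1]          # length of axis 1 only: 3

# Reducing along an axis removes that axis from the shape.
collapsed = cube.sum(axis=0).shape   # (3, 4)

print(n_axes, extents, n_items, middle, collapsed)
```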

回答 2

>>> import numpy as np
>>> np.shape(a)
(2, 2)

如果输入不是numpy数组而是列表列表,则也可以使用

>>> a = [[1,2],[1,2]]
>>> np.shape(a)
(2, 2)

或元组的元组

>>> a = ((1,2),(1,2))
>>> np.shape(a)
(2, 2)

>>> import numpy as np
>>> np.shape(a)
(2, 2)

Also works if the input is not a numpy array but a list of lists

>>> a = [[1,2],[1,2]]
>>> np.shape(a)
(2, 2)

Or a tuple of tuples

>>> a = ((1,2),(1,2))
>>> np.shape(a)
(2, 2)

回答 3

您可以使用.shape

In: a = np.array([[1,2,3],[4,5,6]])
In: a.shape
Out: (2, 3)
In: a.shape[0] # number of rows (axis 0)
Out: 2
In: a.shape[1] # number of columns (axis 1)
Out: 3

You can use .shape

In: a = np.array([[1,2,3],[4,5,6]])
In: a.shape
Out: (2, 3)
In: a.shape[0] # number of rows (axis 0)
Out: 2
In: a.shape[1] # number of columns (axis 1)
Out: 3

回答 4

您可以使用.ndim尺寸并.shape知道确切尺寸

var = np.array([[1,2,3,4,5,6], [1,2,3,4,5,6]])

var.ndim
# displays 2

var.shape
# displays (2, 6)

您可以使用.reshape功能更改尺寸

var = np.array([[1,2,3,4,5,6], [1,2,3,4,5,6]]).reshape(3,4)

var.ndim
#display 2

var.shape
# displays (3, 4)

You can use .ndim for the number of dimensions and .shape for the size along each dimension:

var = np.array([[1,2,3,4,5,6], [1,2,3,4,5,6]])

var.ndim
# displays 2

var.shape
# displays (2, 6)

You can change the dimensions using the .reshape method:

var = np.array([[1,2,3,4,5,6], [1,2,3,4,5,6]]).reshape(3,4)

var.ndim
# displays 2

var.shape
# displays (3, 4)
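
A short sketch of what reshape accepts, using a made-up 12-element array: the new shape must account for exactly the same number of elements, and one dimension may be given as -1 to let NumPy infer it.

```python
import numpy as np

flat = np.arange(12)

grid = flat.reshape(3, 4)       # 3 * 4 == 12, so this works
wide = flat.reshape(2, -1)      # -1 is inferred as 6

# 12 elements cannot be split into 5 equal rows.
try:
    flat.reshape(5, -1)
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True

print(grid.shape, wide.shape, mismatch_rejected)
```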

回答 5

shape方法要求它a是一个Numpy ndarray。但是Numpy还可以计算纯python对象的可迭代对象的形状:

np.shape([[1,2],[1,2]])

The .shape attribute requires that a be a Numpy ndarray. But the np.shape function can also calculate the shape of nested sequences of pure Python objects:

np.shape([[1,2],[1,2]])
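
To make the distinction concrete, here is a small sketch with illustrative names: a plain nested list has no shape attribute, but np.shape and np.asarray both accept it.

```python
import numpy as np

nested = [[1, 2], [1, 2]]

has_attr = hasattr(nested, "shape")       # plain lists have no .shape
via_function = np.shape(nested)           # works on nested lists
via_array = np.asarray(nested).shape      # same result after conversion
scalar_shape = np.shape(5)                # a scalar has the empty shape ()

print(has_attr, via_function, via_array, scalar_shape)
```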

回答 6

a.shape只是的受限版本np.info()。看一下这个:

import numpy as np
a = np.array([[1,2],[1,2]])
np.info(a)

class:  ndarray
shape:  (2, 2)
strides:  (8, 4)
itemsize:  4
aligned:  True
contiguous:  True
fortran:  False
data pointer: 0x27509cf0560
byteorder:  little
byteswap:  False
type: int32

a.shape is just a limited version of np.info(). Check this out:

import numpy as np
a = np.array([[1,2],[1,2]])
np.info(a)

Out

class:  ndarray
shape:  (2, 2)
strides:  (8, 4)
itemsize:  4
aligned:  True
contiguous:  True
fortran:  False
data pointer: 0x27509cf0560
byteorder:  little
byteswap:  False
type: int32

numpy数组和矩阵有什么区别?我应该使用哪一个?

问题:numpy数组和矩阵有什么区别?我应该使用哪一个?

每种都有哪些优点和缺点?

从我所看到的情况来看,如果需要,任何一个都可以替代另一个,所以我应该同时使用这两个还是应该仅使用其中之一?

程序的样式会影响我的选择吗?我正在使用numpy进行一些机器学习,因此确实有很多矩阵,但也有很多向量(数组)。

What are the advantages and disadvantages of each?

From what I’ve seen, either one can work as a replacement for the other if need be, so should I bother using both or should I stick to just one of them?

Will the style of the program influence my choice? I am doing some machine learning using numpy, so there are indeed lots of matrices, but also lots of vectors (arrays).


回答 0

根据官方文件,不再建议使用矩阵类,因为将来会删除它。

https://numpy.org/doc/stable/reference/generated/numpy.matrix.html

正如其他答案所指出的那样,您可以使用NumPy数组实现所有操作。

As per the official documentation, using the matrix class is no longer advisable, since it will be removed in the future.

https://numpy.org/doc/stable/reference/generated/numpy.matrix.html

As other answers already state, you can achieve all the operations with NumPy arrays.


回答 1

numpy的矩阵是严格2维的,而numpy的阵列(ndarrays)是N维的。矩阵对象是ndarray的子​​类,因此它们继承了ndarray的所有属性和方法。

numpy矩阵的主要优点是它们为矩阵乘法提供了一种方便的表示法:如果a和b是矩阵,则a*b它们是矩阵乘积。

import numpy as np

a = np.mat('4 3; 2 1')
b = np.mat('1 2; 3 4')
print(a)
# [[4 3]
#  [2 1]]
print(b)
# [[1 2]
#  [3 4]]
print(a*b)
# [[13 20]
#  [ 5  8]]

另一方面,从Python 3.5开始,NumPy使用@运算符支持中缀矩阵乘法,因此您可以在Python> = 3.5中使用ndarrays实现相同的矩阵乘法便捷性。

import numpy as np

a = np.array([[4, 3], [2, 1]])
b = np.array([[1, 2], [3, 4]])
print(a@b)
# [[13 20]
#  [ 5  8]]

矩阵对象和ndarray都.T必须返回转置,但是矩阵对象也必须具有.H共轭转置和.I逆。

相反,numpy数组始终遵守以元素为单位应用操作的规则(除了new @运算符)。因此,如果ab是numpy数组,则a*b该数组是通过按元素逐个乘以组成的:

c = np.array([[4, 3], [2, 1]])
d = np.array([[1, 2], [3, 4]])
print(c*d)
# [[4 6]
#  [6 4]]

要获得矩阵乘法的结果,请使用np.dot(或@在Python> = 3.5中,如上所示):

print(np.dot(c,d))
# [[13 20]
#  [ 5  8]]

**运营商还表现不同:

print(a**2)
# [[22 15]
#  [10  7]]
print(c**2)
# [[16  9]
#  [ 4  1]]

由于a是矩阵,所以a**2返回矩阵乘积a*a。由于c是ndarray,因此c**2返回一个ndarray,每个组件的元素均平方。

矩阵对象和ndarray之间还有其他技术差异(与np.ravel,项目选择和序列行为有关)。

numpy数组的主要优点是它们比二维矩阵更通用。当您需要3维数组时会发生什么?然后,您必须使用ndarray,而不是矩阵对象。因此,学习使用矩阵对象的工作量更大-您必须学习矩阵对象操作和ndarray操作。

编写一个将矩阵和数组混合在一起的程序会使您的生活变得困难,因为您必须跟踪变量是什么类型的对象,以免乘法返回您不期望的东西。

相反,如果仅使用ndarray,则可以执行矩阵对象可以执行的所有操作,以及更多操作,但功能/符号略有不同。

如果您愿意放弃NumPy矩阵产品表示法的视觉吸引力(使用python> = 3.5的ndarrays几乎可以优雅地实现),那么我认为NumPy数组绝对是可行的方法。

PS。当然,您实际上不必选择以牺牲另一个为代价,因为np.asmatrixnp.asarray允许您将一个转换为另一个(只要数组是二维的)。


还有就是与NumPy之间的差异大纲arraysVS NumPy的matrixES 这里

Numpy matrices are strictly 2-dimensional, while numpy arrays (ndarrays) are N-dimensional. Matrix objects are a subclass of ndarray, so they inherit all the attributes and methods of ndarrays.

The main advantage of numpy matrices is that they provide a convenient notation for matrix multiplication: if a and b are matrices, then a*b is their matrix product.

import numpy as np

a = np.mat('4 3; 2 1')
b = np.mat('1 2; 3 4')
print(a)
# [[4 3]
#  [2 1]]
print(b)
# [[1 2]
#  [3 4]]
print(a*b)
# [[13 20]
#  [ 5  8]]

On the other hand, as of Python 3.5, NumPy supports infix matrix multiplication using the @ operator, so you can achieve the same convenience of matrix multiplication with ndarrays in Python >= 3.5.

import numpy as np

a = np.array([[4, 3], [2, 1]])
b = np.array([[1, 2], [3, 4]])
print(a@b)
# [[13 20]
#  [ 5  8]]

Both matrix objects and ndarrays have .T to return the transpose, but matrix objects also have .H for the conjugate transpose, and .I for the inverse.

In contrast, numpy arrays consistently abide by the rule that operations are applied element-wise (except for the new @ operator). Thus, if a and b are numpy arrays, then a*b is the array formed by multiplying the components element-wise:

c = np.array([[4, 3], [2, 1]])
d = np.array([[1, 2], [3, 4]])
print(c*d)
# [[4 6]
#  [6 4]]

To obtain the result of matrix multiplication, you use np.dot (or @ in Python >= 3.5, as shown above):

print(np.dot(c,d))
# [[13 20]
#  [ 5  8]]

The ** operator also behaves differently:

print(a**2)
# [[22 15]
#  [10  7]]
print(c**2)
# [[16  9]
#  [ 4  1]]

Since a is a matrix, a**2 returns the matrix product a*a. Since c is an ndarray, c**2 returns an ndarray with each component squared element-wise.

There are other technical differences between matrix objects and ndarrays (having to do with np.ravel, item selection and sequence behavior).

The main advantage of numpy arrays is that they are more general than 2-dimensional matrices. What happens when you want a 3-dimensional array? Then you have to use an ndarray, not a matrix object. Thus, learning to use matrix objects is more work — you have to learn matrix object operations, and ndarray operations.

Writing a program that mixes both matrices and arrays makes your life difficult because you have to keep track of what type of object your variables are, lest multiplication return something you don’t expect.

In contrast, if you stick solely with ndarrays, then you can do everything matrix objects can do, and more, except with slightly different functions/notation.

If you are willing to give up the visual appeal of NumPy matrix product notation (which can be achieved almost as elegantly with ndarrays in Python >= 3.5), then I think NumPy arrays are definitely the way to go.

PS. Of course, you really don’t have to choose one at the expense of the other, since np.asmatrix and np.asarray allow you to convert one to the other (as long as the array is 2-dimensional).


There is a synopsis of the differences between NumPy arrays vs NumPy matrixes here.
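
The answer above notes that matrix objects offer .H and .I. As a sketch of the ndarray counterparts (the example values are arbitrary), the conjugate transpose can be written as .conj().T and the inverse obtained from np.linalg.inv:

```python
import numpy as np

m = np.array([[4, 3 + 1j],
              [2, 1]], dtype=complex)

conj_transpose = m.conj().T        # ndarray counterpart of matrix .H
inverse = np.linalg.inv(m)         # ndarray counterpart of matrix .I

# The product of a matrix and its inverse should be the identity.
identity_check = np.allclose(inverse @ m, np.eye(2))

print(conj_transpose)
print(identity_check)
```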


回答 2

Scipy.org建议您使用数组:

*’array’或’matrix’?我应该使用哪个?-简短答案

使用数组。

  • 它们是numpy的标准向量/矩阵/张量类型。许多numpy函数返回数组,而不是矩阵。

  • 在逐元素运算和线性代数运算之间有明显的区别。

  • 如果愿意,可以有标准向量或行/列向量。

使用数组类型的唯一缺点是,您将不得不使用dot而不是*乘(减少)两个张量(标量积,矩阵向量乘法等)。

Scipy.org recommends that you use arrays:

‘array’ or ‘matrix’? Which should I use? – Short answer

Use arrays.

  • They are the standard vector/matrix/tensor type of numpy. Many numpy functions return arrays, not matrices.

  • There is a clear distinction between element-wise operations and linear algebra operations.

  • You can have standard vectors or row/column vectors if you like.

The only disadvantage of using the array type is that you will have to use dot instead of * to multiply (reduce) two tensors (scalar product, matrix vector multiplication etc.).


回答 3

只是将一个案例添加到unutbu的列表中。

与numpy矩阵或矩阵语言(如matlab)相比,numpy ndarray对我而言最大的实际差异之一是,在归约运算中未保留维。矩阵始终为2d,而数组的均值则少一维。

例如,矩阵或数组的行为不佳的行:

带矩阵

>>> m = np.mat([[1,2],[2,3]])
>>> m
matrix([[1, 2],
        [2, 3]])
>>> mm = m.mean(1)
>>> mm
matrix([[ 1.5],
        [ 2.5]])
>>> mm.shape
(2, 1)
>>> m - mm
matrix([[-0.5,  0.5],
        [-0.5,  0.5]])

带阵列

>>> a = np.array([[1,2],[2,3]])
>>> a
array([[1, 2],
       [2, 3]])
>>> am = a.mean(1)
>>> am.shape
(2,)
>>> am
array([ 1.5,  2.5])
>>> a - am #wrong
array([[-0.5, -0.5],
       [ 0.5,  0.5]])
>>> a - am[:, np.newaxis]  #right
array([[-0.5,  0.5],
       [-0.5,  0.5]])

我还认为混合数组和矩阵会带来很多“快乐的”调试时间。但是,就乘法而言,scipy.sparse矩阵始终是矩阵。

Just to add one case to unutbu’s list.

One of the biggest practical differences for me of numpy ndarrays compared to numpy matrices or matrix languages like matlab, is that the dimension is not preserved in reduce operations. Matrices are always 2d, while the mean of an array, for example, has one dimension less.

For example, to demean the rows of a matrix or array:

with matrix

>>> m = np.mat([[1,2],[2,3]])
>>> m
matrix([[1, 2],
        [2, 3]])
>>> mm = m.mean(1)
>>> mm
matrix([[ 1.5],
        [ 2.5]])
>>> mm.shape
(2, 1)
>>> m - mm
matrix([[-0.5,  0.5],
        [-0.5,  0.5]])

with array

>>> a = np.array([[1,2],[2,3]])
>>> a
array([[1, 2],
       [2, 3]])
>>> am = a.mean(1)
>>> am.shape
(2,)
>>> am
array([ 1.5,  2.5])
>>> a - am #wrong
array([[-0.5, -0.5],
       [ 0.5,  0.5]])
>>> a - am[:, np.newaxis]  #right
array([[-0.5,  0.5],
       [-0.5,  0.5]])

I also think that mixing arrays and matrices gives rise to many “happy” debugging hours. However, scipy.sparse matrices are always matrices in terms of operators like multiplication.
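
With ndarrays, the reduced axis can also be kept by passing keepdims=True to the reduction, which avoids the explicit np.newaxis step. A minimal sketch using the same data as above:

```python
import numpy as np

a = np.array([[1, 2], [2, 3]])

# keepdims=True keeps the reduced axis with length 1, so the result
# has shape (2, 1) and broadcasts against `a` row by row.
row_means = a.mean(axis=1, keepdims=True)
demeaned = a - row_means

print(row_means.shape)
print(demeaned)
```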


回答 4

正如其他人提到的那样,也许它的主要优点matrix是它为矩阵乘法提供了一种方便的符号。

但是,在Python 3.5中,终于有了一个专用的infix运算符用于矩阵乘法@

在最新的NumPy版本中,它可以与ndarrays 一起使用:

A = numpy.ones((1, 3))
B = numpy.ones((3, 3))
A @ B

因此,如今,如果有更多疑问,您应该坚持ndarray

As others have mentioned, perhaps the main advantage of matrix was that it provided a convenient notation for matrix multiplication.

However, in Python 3.5 there is finally a dedicated infix operator for matrix multiplication: @.

With recent NumPy versions, it can be used with ndarrays:

A = numpy.ones((1, 3))
B = numpy.ones((3, 3))
A @ B

So nowadays, even more, when in doubt, you should stick to ndarray.


按列对NumPy中的数组排序

问题:按列对NumPy中的数组排序

如何按第n列对NumPy中的数组排序?

例如,

a = array([[9, 2, 3],
           [4, 5, 6],
           [7, 0, 5]])

我想按第二列对行进行排序,以便返回:

array([[7, 0, 5],
       [9, 2, 3],
       [4, 5, 6]])

How can I sort an array in NumPy by the nth column?

For example,

a = array([[9, 2, 3],
           [4, 5, 6],
           [7, 0, 5]])

I’d like to sort rows by the second column, such that I get back:

array([[7, 0, 5],
       [9, 2, 3],
       [4, 5, 6]])

回答 0

@steve答案实际上是最优雅的方法。

对于“正确”的方式,请参见numpy.ndarray.sort的order关键字参数。

但是,您需要将数组视为具有字段的数组(结构化数组)。

如果您最初没有使用字段定义数组,那么“正确”的方法就很难看了。

作为一个简单的示例,对其进行排序并返回副本:

In [1]: import numpy as np

In [2]: a = np.array([[1,2,3],[4,5,6],[0,0,1]])

In [3]: np.sort(a.view('i8,i8,i8'), order=['f1'], axis=0).view(np.int64)
Out[3]: 
array([[0, 0, 1],
       [1, 2, 3],
       [4, 5, 6]])

对其进行原位排序:

In [6]: a.view('i8,i8,i8').sort(order=['f1'], axis=0) #<-- returns None

In [7]: a
Out[7]: 
array([[0, 0, 1],
       [1, 2, 3],
       [4, 5, 6]])

据我所知,@ Steve确实是最优雅的方式…

此方法的唯一优点是,“ order”参数是用来对搜索进行排序的字段列表。例如,您可以通过提供order = [‘f1’,’f2’,’f0’]来对第二列,第三列,第一列进行排序。

@steve‘s is actually the most elegant way of doing it.

For the “correct” way see the order keyword argument of numpy.ndarray.sort

However, you’ll need to view your array as an array with fields (a structured array).

The “correct” way is quite ugly if you didn’t initially define your array with fields…

As a quick example, to sort it and return a copy:

In [1]: import numpy as np

In [2]: a = np.array([[1,2,3],[4,5,6],[0,0,1]])

In [3]: np.sort(a.view('i8,i8,i8'), order=['f1'], axis=0).view(np.int64)
Out[3]: 
array([[0, 0, 1],
       [1, 2, 3],
       [4, 5, 6]])

To sort it in-place:

In [6]: a.view('i8,i8,i8').sort(order=['f1'], axis=0) #<-- returns None

In [7]: a
Out[7]: 
array([[0, 0, 1],
       [1, 2, 3],
       [4, 5, 6]])

@Steve’s really is the most elegant way to do it, as far as I know…

The only advantage to this method is that the “order” argument is a list of the fields to sort by. For example, you can sort by the second column, then the third column, then the first column by supplying order=['f1', 'f2', 'f0'].
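
If the array is defined with fields from the start, the order argument needs no view tricks. The sketch below assumes a hypothetical four-row table with fields named f0, f1 and f2, chosen to match the default field names used above.

```python
import numpy as np

rows = np.array([(1, 2, 3), (4, 5, 6), (0, 0, 1), (7, 2, 0)],
                dtype=[('f0', 'i8'), ('f1', 'i8'), ('f2', 'i8')])

# Sort by field f1 first; ties in f1 are broken by f2.
by_f1_then_f2 = np.sort(rows, order=['f1', 'f2'])

print(by_f1_then_f2)
```

The two rows with f1 == 2 end up ordered by their f2 values, which a single-column argsort would not guarantee.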


回答 1

我想这可行: a[a[:,1].argsort()]

这表示的第二列,a并据此对其进行排序。

I suppose this works: a[a[:,1].argsort()]

This takes the second column of a, uses argsort to get the indices that would sort it, and then uses those indices to reorder the rows of a.
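
Spelled out step by step on the array from the question (the intermediate names are only for illustration):

```python
import numpy as np

a = np.array([[9, 2, 3],
              [4, 5, 6],
              [7, 0, 5]])

second_column = a[:, 1]            # [2, 5, 0]
order = second_column.argsort()    # row indices that sort the column: [2, 0, 1]
sorted_rows = a[order]             # rows picked in that order

print(sorted_rows)
```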


回答 2

您可以按照Steve Tjoa的方法对多个列进行排序,方法是使用诸如mergesort之类的稳定排序并对索引从最低有效列到最高有效列进行排序:

a = a[a[:,2].argsort()] # First sort doesn't need to be stable.
a = a[a[:,1].argsort(kind='mergesort')]
a = a[a[:,0].argsort(kind='mergesort')]

排序方式为:第0列,然后是1,然后是2。

You can sort on multiple columns as per Steve Tjoa’s method by using a stable sort like mergesort and sorting the indices from the least significant to the most significant columns:

a = a[a[:,2].argsort()] # First sort doesn't need to be stable.
a = a[a[:,1].argsort(kind='mergesort')]
a = a[a[:,0].argsort(kind='mergesort')]

This sorts by column 0, then 1, then 2.
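
As a cross-check, the chained stable sorts can be compared with a single np.lexsort call, which takes its keys from least to most significant. This is a sketch with made-up data; kind='stable' is used in place of 'mergesort' and is assumed to be available in the installed NumPy version.

```python
import numpy as np

a = np.array([[1, 2, 0],
              [0, 1, 5],
              [1, 0, 3],
              [0, 1, 2],
              [1, 2, -1]])

# Chained approach: least significant column first, stable sorts afterwards.
chained = a[a[:, 2].argsort()]
chained = chained[chained[:, 1].argsort(kind='stable')]
chained = chained[chained[:, 0].argsort(kind='stable')]

# lexsort: the last key is the primary one, so column 0 goes last.
one_pass = a[np.lexsort((a[:, 2], a[:, 1], a[:, 0]))]

print(chained)
print(np.array_equal(chained, one_pass))
```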


回答 3

我认为您可以从Python文档Wiki中进行以下操作:

a = [[1, 2, 3], [4, 5, 6], [0, 0, 1]]
a = sorted(a, key=lambda a_entry: a_entry[1])
print(a)

输出为:

[[0, 0, 1], [1, 2, 3], [4, 5, 6]]

From the Python documentation wiki, I think you can do:

a = [[1, 2, 3], [4, 5, 6], [0, 0, 1]]
a = sorted(a, key=lambda a_entry: a_entry[1])
print(a)

The output is:

[[0, 0, 1], [1, 2, 3], [4, 5, 6]]

回答 4

如果有人想在他们程序的关键部分使用排序,下面是对不同提案的性能比较:

import numpy as np
table = np.random.rand(5000, 10)

%timeit table.view('f8,f8,f8,f8,f8,f8,f8,f8,f8,f8').sort(order=['f9'], axis=0)
1000 loops, best of 3: 1.88 ms per loop

%timeit table[table[:,9].argsort()]
10000 loops, best of 3: 180 µs per loop

import pandas as pd
df = pd.DataFrame(table)
%timeit df.sort_values(9, ascending=True)
1000 loops, best of 3: 400 µs per loop

因此,似乎使用argsort进行索引是迄今为止最快的方法…

In case someone wants to make use of sorting at a critical part of their programs here’s a performance comparison for the different proposals:

import numpy as np
table = np.random.rand(5000, 10)

%timeit table.view('f8,f8,f8,f8,f8,f8,f8,f8,f8,f8').sort(order=['f9'], axis=0)
1000 loops, best of 3: 1.88 ms per loop

%timeit table[table[:,9].argsort()]
10000 loops, best of 3: 180 µs per loop

import pandas as pd
df = pd.DataFrame(table)
%timeit df.sort_values(9, ascending=True)
1000 loops, best of 3: 400 µs per loop

So, it looks like indexing with argsort is the quickest method so far…


回答 5

该NumPy的邮件列表,这里是另一种解决方案:

>>> a
array([[1, 2],
       [0, 0],
       [1, 0],
       [0, 2],
       [2, 1],
       [1, 0],
       [1, 0],
       [0, 0],
       [1, 0],
       [2, 2]])
>>> a[np.lexsort(np.fliplr(a).T)]
array([[0, 0],
       [0, 0],
       [0, 2],
       [1, 0],
       [1, 0],
       [1, 0],
       [1, 0],
       [1, 2],
       [2, 1],
       [2, 2]])

From the NumPy mailing list, here’s another solution:

>>> a
array([[1, 2],
       [0, 0],
       [1, 0],
       [0, 2],
       [2, 1],
       [1, 0],
       [1, 0],
       [0, 0],
       [1, 0],
       [2, 2]])
>>> a[np.lexsort(np.fliplr(a).T)]
array([[0, 0],
       [0, 0],
       [0, 2],
       [1, 0],
       [1, 0],
       [1, 0],
       [1, 0],
       [1, 2],
       [2, 1],
       [2, 2]])

回答 6

我有一个类似的问题。

我的问题:

我想计算SVD,需要按降序对我的特征值进行排序。但是我想保留特征值和特征向量之间的映射。我的特征值在第一行中,而对应的特征向量在同一列中。

因此,我想按降序按第一行在列中对二维数组进行排序。

我的解决方案

a = a[::, a[0,].argsort()[::-1]]

那么这是如何工作的呢?

a[0,] 只是我要排序的第一行。

现在,我使用argsort来获取索引的顺序。

我用 [::-1]是因为我需要降序排列。

最后,我使用a[::, ...]正确的顺序查看各列。

I had a similar problem.

My Problem:

I want to calculate an SVD and need to sort my eigenvalues in descending order. But I want to keep the mapping between eigenvalues and eigenvectors. My eigenvalues were in the first row and the corresponding eigenvector below it in the same column.

So I want to sort a two-dimensional array column-wise by the first row in descending order.

My Solution

a = a[::, a[0,].argsort()[::-1]]

So how does this work?

a[0,] is just the first row I want to sort by.

Now I use argsort to get the order of indices.

I use [::-1] because I need descending order.

Lastly I use a[::, ...] to get the columns in the right order. (Indexing with an index array like this returns a copy, not a view.)
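
The steps above can be sketched on a small made-up array whose first row plays the role of the eigenvalues:

```python
import numpy as np

a = np.array([[2.0, 5.0, 1.0],      # row 0: values to sort by
              [10.0, 20.0, 30.0],   # remaining rows: data tied to each column
              [11.0, 21.0, 31.0]])

order = a[0].argsort()[::-1]        # column indices, largest row-0 value first
reordered = a[:, order]             # whole columns move together

print(order)
print(reordered)
print(np.shares_memory(a, reordered))   # index-array selection copies the data
```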


回答 7

稍微复杂一点的lexsort例子-在第一列下降,在第二列上升。的窍门lexsort是,它对行进行排序(因此.T),并优先考虑最后一行。

In [120]: b=np.array([[1,2,1],[3,1,2],[1,1,3],[2,3,4],[3,2,5],[2,1,6]])
In [121]: b
Out[121]: 
array([[1, 2, 1],
       [3, 1, 2],
       [1, 1, 3],
       [2, 3, 4],
       [3, 2, 5],
       [2, 1, 6]])
In [122]: b[np.lexsort(([1,-1]*b[:,[1,0]]).T)]
Out[122]: 
array([[3, 1, 2],
       [3, 2, 5],
       [2, 1, 6],
       [2, 3, 4],
       [1, 1, 3],
       [1, 2, 1]])

A slightly more complicated lexsort example: descending on the 1st column, then ascending on the 2nd. The tricks with lexsort are that it sorts on rows (hence the .T), and that it gives priority to the last row (the last key).

In [120]: b=np.array([[1,2,1],[3,1,2],[1,1,3],[2,3,4],[3,2,5],[2,1,6]])
In [121]: b
Out[121]: 
array([[1, 2, 1],
       [3, 1, 2],
       [1, 1, 3],
       [2, 3, 4],
       [3, 2, 5],
       [2, 1, 6]])
In [122]: b[np.lexsort(([1,-1]*b[:,[1,0]]).T)]
Out[122]: 
array([[3, 1, 2],
       [3, 2, 5],
       [2, 1, 6],
       [2, 3, 4],
       [1, 1, 3],
       [1, 2, 1]])
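
The same ordering can be written with the keys passed explicitly, which may be easier to read than the multiply-and-transpose form. In this sketch, negating a key is what produces descending order, and the last key is the primary one.

```python
import numpy as np

b = np.array([[1, 2, 1], [3, 1, 2], [1, 1, 3],
              [2, 3, 4], [3, 2, 5], [2, 1, 6]])

# Keys go from least to most significant:
#   b[:, 1]   ascending on the 2nd column (tie-breaker)
#   -b[:, 0]  descending on the 1st column (primary key)
order = np.lexsort((b[:, 1], -b[:, 0]))
result = b[order]

print(result)
```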

回答 8

这是考虑所有列的另一种解决方案(JJ的答案的更紧凑方式);

ar=np.array([[0, 0, 0, 1],
             [1, 0, 1, 0],
             [0, 1, 0, 0],
             [1, 0, 0, 1],
             [0, 0, 1, 0],
             [1, 1, 0, 0]])

用lexsort排序,

ar[np.lexsort(([ar[:, i] for i in range(ar.shape[1]-1, -1, -1)]))]

输出:

array([[0, 0, 0, 1],
       [0, 0, 1, 0],
       [0, 1, 0, 0],
       [1, 0, 0, 1],
       [1, 0, 1, 0],
       [1, 1, 0, 0]])

Here is another solution that considers all columns (a more compact version of J.J’s answer):

ar=np.array([[0, 0, 0, 1],
             [1, 0, 1, 0],
             [0, 1, 0, 0],
             [1, 0, 0, 1],
             [0, 0, 1, 0],
             [1, 1, 0, 0]])

Sort with lexsort,

ar[np.lexsort(([ar[:, i] for i in range(ar.shape[1]-1, -1, -1)]))]

Output:

array([[0, 0, 0, 1],
       [0, 0, 1, 0],
       [0, 1, 0, 0],
       [1, 0, 0, 1],
       [1, 0, 1, 0],
       [1, 1, 0, 0]])

回答 9

只需使用排序,即可使用要排序的列号。

a = np.array([[1,1], [1,-1], [-1,1], [-1,-1]])
print (a)
a=a.tolist() 
a = np.array(sorted(a, key=lambda a_entry: a_entry[0]))
print (a)

Simply use sorted, with a key that picks the column number you want to sort by.

import numpy as np
a = np.array([[1,1], [1,-1], [-1,1], [-1,-1]])
print (a)
a=a.tolist() 
a = np.array(sorted(a, key=lambda a_entry: a_entry[0]))
print (a)

回答 10

这是一个古老的问题,但是如果您需要将其推广到2维以上的数组,则可以采用以下解决方案:

np.einsum('ij->ij', a[a[:,1].argsort(),:])

这对于两个维度来说是一个过大的杀伤力,并且a[a[:,1].argsort()]每个@steve的答案就足够了,但是不能将该答案推广到更高的维度。您可以在此问题中找到3D阵列的示例。

输出:

[[7 0 5]
 [9 2 3]
 [4 5 6]]

It is an old question, but if you need to generalize this to arrays with more than 2 dimensions, here is a solution that can be easily generalized:

np.einsum('ij->ij', a[a[:,1].argsort(),:])

This is overkill for two dimensions, and a[a[:,1].argsort()] would be enough per @steve’s answer; however, that answer cannot be generalized to higher dimensions. You can find an example of a 3D array in this question.

Output:

[[7 0 5]
 [9 2 3]
 [4 5 6]]

如何在不截断的情况下打印完整的NumPy数组?

问题:如何在不截断的情况下打印完整的NumPy数组?

当我打印一个numpy数组时,我得到一个截断的表示形式,但是我想要完整的数组。

有什么办法吗?

例子:

>>> numpy.arange(10000)
array([   0,    1,    2, ..., 9997, 9998, 9999])

>>> numpy.arange(10000).reshape(250,40)
array([[   0,    1,    2, ...,   37,   38,   39],
       [  40,   41,   42, ...,   77,   78,   79],
       [  80,   81,   82, ...,  117,  118,  119],
       ..., 
       [9880, 9881, 9882, ..., 9917, 9918, 9919],
       [9920, 9921, 9922, ..., 9957, 9958, 9959],
       [9960, 9961, 9962, ..., 9997, 9998, 9999]])

When I print a numpy array, I get a truncated representation, but I want the full array.

Is there any way to do this?

Examples:

>>> numpy.arange(10000)
array([   0,    1,    2, ..., 9997, 9998, 9999])

>>> numpy.arange(10000).reshape(250,40)
array([[   0,    1,    2, ...,   37,   38,   39],
       [  40,   41,   42, ...,   77,   78,   79],
       [  80,   81,   82, ...,  117,  118,  119],
       ..., 
       [9880, 9881, 9882, ..., 9917, 9918, 9919],
       [9920, 9921, 9922, ..., 9957, 9958, 9959],
       [9960, 9961, 9962, ..., 9997, 9998, 9999]])

回答 0

用途numpy.set_printoptions

import sys
import numpy
numpy.set_printoptions(threshold=sys.maxsize)

Use numpy.set_printoptions:

import sys
import numpy
numpy.set_printoptions(threshold=sys.maxsize)

回答 1

import numpy as np
np.set_printoptions(threshold=np.inf)

我建议使用,np.inf而不是np.nan别人建议的。它们都为您的目的而工作,但是通过将阈值设置为“无穷大”,对于每个阅读您的代码的人来说都是显而易见的。对我来说,达到“没有数字”的门槛似乎有点模糊。

import numpy as np
np.set_printoptions(threshold=np.inf)

I suggest using np.inf instead of np.nan, which is suggested by others. By setting the threshold to “infinity” it is obvious to everybody reading your code what you mean, whereas a threshold of “not a number” seems a little vague to me. Note also that np.nan only works on older NumPy releases: recent versions reject a NaN threshold with a ValueError.


回答 2

先前的答案是正确的,但是作为较弱的选择,您可以转换为列表:

>>> numpy.arange(100).reshape(25,4).tolist()

[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18, 19], [20, 21,
22, 23], [24, 25, 26, 27], [28, 29, 30, 31], [32, 33, 34, 35], [36, 37, 38, 39], [40, 41,
42, 43], [44, 45, 46, 47], [48, 49, 50, 51], [52, 53, 54, 55], [56, 57, 58, 59], [60, 61,
62, 63], [64, 65, 66, 67], [68, 69, 70, 71], [72, 73, 74, 75], [76, 77, 78, 79], [80, 81,
82, 83], [84, 85, 86, 87], [88, 89, 90, 91], [92, 93, 94, 95], [96, 97, 98, 99]]

The previous answers are the correct ones, but as a weaker alternative you can transform into a list:

>>> numpy.arange(100).reshape(25,4).tolist()

[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18, 19], [20, 21,
22, 23], [24, 25, 26, 27], [28, 29, 30, 31], [32, 33, 34, 35], [36, 37, 38, 39], [40, 41,
42, 43], [44, 45, 46, 47], [48, 49, 50, 51], [52, 53, 54, 55], [56, 57, 58, 59], [60, 61,
62, 63], [64, 65, 66, 67], [68, 69, 70, 71], [72, 73, 74, 75], [76, 77, 78, 79], [80, 81,
82, 83], [84, 85, 86, 87], [88, 89, 90, 91], [92, 93, 94, 95], [96, 97, 98, 99]]

回答 3

NumPy 1.15或更高版本

如果您使用NumPy 1.15(2018年7月23日发行)或更高版本,则可以使用printoptions上下文管理器:

with numpy.printoptions(threshold=numpy.inf):
    print(arr)

(当然,如果您导入的方式是,请替换numpy为)npnumpy

使用上下文管理器(with-block)可确保在上下文管理器完成后,打印选项将恢复为块启动之前的状态。它确保设置是临时的,并且仅应用于块内的代码。

有关上下文管理器及其支持的其他参数的详细信息,请参见numpy.printoptions文档

NumPy 1.15 or newer

If you use NumPy 1.15 (released 2018-07-23) or newer, you can use the printoptions context manager:

with numpy.printoptions(threshold=numpy.inf):
    print(arr)

(of course, replace numpy by np if that’s how you imported numpy)

The use of a context manager (the with-block) ensures that after the context manager is finished, the print options will revert to whatever they were before the block started. It ensures the setting is temporary, and only applied to code within the block.

See numpy.printoptions documentation for details on the context manager and what other arguments it supports.
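
A small sketch of the temporary nature of the setting, assuming NumPy 1.15 or newer and the default threshold in effect beforehand. The array and variable names are illustrative.

```python
import numpy as np

arr = np.arange(2000)

threshold_before = np.get_printoptions()['threshold']

with np.printoptions(threshold=np.inf):
    full_text = str(arr)            # every element is written out

threshold_after = np.get_printoptions()['threshold']
short_text = str(arr)               # summarised again outside the block

print('...' in full_text, '...' in short_text)
print(threshold_before == threshold_after)
```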


回答 4

听起来您正在使用numpy。

如果是这样,您可以添加:

import numpy as np
np.set_printoptions(threshold=np.nan)

这将禁用边角打印。有关更多信息,请参见此NumPy教程

This sounds like you’re using numpy.

If that’s the case, you can add:

import numpy as np
np.set_printoptions(threshold=np.nan)

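That will disable the corner printing on older NumPy versions. Recent versions raise a ValueError for a NaN threshold, so use sys.maxsize or np.inf there instead. For more information, see this NumPy Tutorial.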


回答 5

这是一种一次性的方法,如果您不想更改默认设置,这将非常有用:

def fullprint(*args, **kwargs):
  from pprint import pprint
  import numpy
  opt = numpy.get_printoptions()
  numpy.set_printoptions(threshold=numpy.inf)
  pprint(*args, **kwargs)
  numpy.set_printoptions(**opt)

Here is a one-off way to do this, which is useful if you don’t want to change your default settings:

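def fullprint(*args, **kwargs):
    from pprint import pprint
    import numpy
    opt = numpy.get_printoptions()  # remember the current settings
    numpy.set_printoptions(threshold=numpy.inf)
    try:
        pprint(*args, **kwargs)
    finally:
        # restore the previous settings even if printing raises
        numpy.set_printoptions(**opt)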

Answer 6

Using a context manager, as Paul Price suggested:

import numpy as np


class fullprint:
    'context manager for printing full numpy arrays'

    def __init__(self, **kwargs):
        kwargs.setdefault('threshold', np.inf)
        self.opt = kwargs

    def __enter__(self):
        self._opt = np.get_printoptions()
        np.set_printoptions(**self.opt)

    def __exit__(self, type, value, traceback):
        np.set_printoptions(**self._opt)


if __name__ == '__main__': 
    a = np.arange(1001)

    with fullprint():
        print(a)

    print(a)

    with fullprint(threshold=None, edgeitems=10):
        print(a)

Answer 7

numpy.savetxt

numpy.savetxt(sys.stdout, numpy.arange(10000))

or if you need a string:

import io  # Python 3 form; on Python 2 this was "import StringIO" / StringIO.StringIO()
sio = io.StringIO()
numpy.savetxt(sio, numpy.arange(10000))
s = sio.getvalue()
print(s)

The default output format is:

0.000000000000000000e+00
1.000000000000000000e+00
2.000000000000000000e+00
3.000000000000000000e+00
...

and it can be configured with further arguments.

Note in particular that this also does not print the square brackets, and that it allows a lot of customization, as mentioned at: How to print a Numpy array without brackets?

Tested on Python 2.7.12, numpy 1.11.1.
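
To illustrate the "further arguments" part, here is a small sketch that uses the `fmt` parameter of `numpy.savetxt` to write integers instead of the exponential default. It assumes Python 3 and a NumPy version whose `savetxt` accepts text-mode file objects such as `io.StringIO`; the variable names are made up for the example.

```python
import io
import numpy as np

buf = io.StringIO()
# fmt="%d" writes plain integers, one array element per line
np.savetxt(buf, np.arange(5), fmt="%d")
text = buf.getvalue()
print(text)
```

Here `text` holds the five numbers 0 to 4, one per line, without brackets.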


Answer 8

This is a slight modification of neok's answer (it removes the option to pass additional arguments to set_printoptions).

It shows how you can use contextlib.contextmanager to create such a context manager with fewer lines of code:

import numpy as np
from contextlib import contextmanager

@contextmanager
def show_complete_array():
    oldoptions = np.get_printoptions()
    np.set_printoptions(threshold=np.inf)
    try:
        yield
    finally:
        np.set_printoptions(**oldoptions)

In your code it can be used like this:

a = np.arange(1001)

print(a)      # shows the truncated array

with show_complete_array():
    print(a)  # shows the complete array

print(a)      # shows the truncated array (again)

Answer 9

To complement this answer: besides the limit on the number of elements shown (lifted with numpy.set_printoptions(threshold=sys.maxsize); the threshold=numpy.nan form from older answers is rejected by recent NumPy versions), there is also a limit on the number of characters displayed per line. In some environments, such as when running Python from bash rather than in an interactive session, this can be fixed by setting the linewidth parameter as follows.

import numpy as np
np.set_printoptions(linewidth=2000)    # default = 75
Mat = np.arange(20000,20150).reshape(2,75)    # 150 elements (75 columns)
print(Mat)

In this case, it is your terminal window that determines where each line wraps.

For those out there using sublime text and wanting to see results within the output window, you should add the build option "word_wrap": false to the sublime-build file [source] .
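
To see the effect of the line width in isolation, here is a small sketch using `numpy.array2string` and its `max_line_width` argument, which does not touch the global print options. The array and the two widths are arbitrary choices for the example.

```python
import numpy as np

a = np.arange(100)

narrow = np.array2string(a, max_line_width=40)    # wraps onto several lines
wide = np.array2string(a, max_line_width=2000)    # fits on a single line

print(narrow.count("\n"), wide.count("\n"))
```

The same 100 elements are printed in both cases; only the number of line breaks changes.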


Answer 10

Since NumPy version 1.16, pass sys.maxsize as the threshold; for more details see GitHub ticket 12251.

from sys import maxsize
from numpy import set_printoptions

set_printoptions(threshold=maxsize)

Answer 11

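To turn full printing off again and return to the normal (summarized) mode, set the threshold back to its default:

np.set_printoptions(threshold=1000)  # 1000 is NumPy's default; threshold=False would act like 0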

Answer 12

Suppose you have a numpy array

 arr = numpy.arange(10000).reshape(250,40)

If you want to print the full array in a one-off way (without toggling np.set_printoptions), but want something simpler (less code) than the context manager, just do

for row in arr:
    print(row)  # each 40-element row is below the threshold, so it prints in full

Answer 13

A slight modification: (since you are going to print a huge list)

import numpy as np
np.set_printoptions(threshold=np.inf, linewidth=200)

x = np.arange(1000)
print(x)

This will increase the number of characters per line (the default linewidth is 75). Use whatever linewidth suits your coding environment. Putting more characters on each line saves you from scrolling through a huge number of output lines.


Answer 14

You can use the array2string function – docs.

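import sys
import numpy
a = numpy.arange(10000).reshape(250,40)
# sys.maxsize instead of numpy.nan, which newer NumPy versions reject as a threshold
print(numpy.array2string(a, threshold=sys.maxsize, max_line_width=sys.maxsize))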
# [Big output]

Answer 15

You won’t always want all items printed, especially for large arrays.

A simple way to show more items:

In [349]: ar
Out[349]: array([1, 1, 1, ..., 0, 0, 0])

In [350]: ar[:100]
Out[350]:
array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1,
       1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1])

With the default settings this works as long as the slice has no more than 1000 elements, which is the default threshold.


Answer 16

If you have pandas available,

    import numpy
    import pandas
    a = numpy.arange(10000).reshape(250,40)
    print(pandas.DataFrame(a).to_string(header=False, index=False))

avoids the side effect of having to reset numpy.set_printoptions(threshold=sys.maxsize) afterwards, and the output contains neither the numpy.array wrapper nor the brackets. I find this convenient for dumping a wide array into a log file.


Answer 17

If an array is too large to be printed, NumPy automatically skips the central part of the array and only prints the corners. To disable this behaviour and force NumPy to print the entire array, change the printing options using set_printoptions:

>>> import sys
>>> np.set_printoptions(threshold=sys.maxsize)

To go back to the default options afterwards:

>>> np.set_printoptions(edgeitems=3,infstr='inf',
... linewidth=75, nanstr='nan', precision=8,
... suppress=False, threshold=1000, formatter=None)

You can also refer to the NumPy documentation for more help.


Dump a NumPy array into a csv file

Question: Dump a NumPy array into a csv file

Is there a way to dump a NumPy array into a CSV file? I have a 2D NumPy array and need to dump it in human-readable format.


Answer 0

numpy.savetxt saves an array to a text file.

import numpy
a = numpy.asarray([ [1,2,3], [4,5,6], [7,8,9] ])
numpy.savetxt("foo.csv", a, delimiter=",")
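
A slightly fuller sketch, in case you also want a header line and want to read the file back. The column names `x,y,z` and the temporary path are made up for the example; `fmt`, `header` and `comments` are regular `numpy.savetxt` parameters.

```python
import os
import tempfile
import numpy as np

a = np.asarray([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
path = os.path.join(tempfile.mkdtemp(), "foo.csv")

# fmt="%d" keeps integers as integers; comments="" stops savetxt from
# prefixing the header line with "# "
np.savetxt(path, a, delimiter=",", fmt="%d", header="x,y,z", comments="")

with open(path) as fh:
    text = fh.read()

# read the data back, skipping the header row
b = np.loadtxt(path, delimiter=",", skiprows=1, dtype=int)
```

`b` should come back equal to `a`, which is a cheap way to confirm the file is well formed.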

Answer 1

You can use pandas. It does take some extra memory so it’s not always possible, but it’s very fast and easy to use.

import pandas as pd 
pd.DataFrame(np_array).to_csv("path/to/file.csv")

If you don't want a header or index, use to_csv("path/to/file.csv", header=False, index=False)


Answer 2

tofile is a convenient function to do this:

import numpy as np
a = np.asarray([ [1,2,3], [4,5,6], [7,8,9] ])
a.tofile('foo.csv',sep=',',format='%10.5f')

The documentation has some useful notes:

This is a convenience function for quick storage of array data. Information on endianness and precision is lost, so this method is not a good choice for files intended to archive data or transport data between machines with different endianness. Some of these problems can be overcome by outputting the data as text files, at the expense of speed and file size.

Note. This function does not produce multi-line csv files, it saves everything to one line.
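
To make the one-line behaviour concrete, here is a small sketch (the temporary path and the `%d` format are choices made for this example). Because the shape is not stored, it has to be supplied again when reading the values back.

```python
import os
import tempfile
import numpy as np

a = np.asarray([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
path = os.path.join(tempfile.mkdtemp(), "foo.csv")

a.tofile(path, sep=",", format="%d")   # text mode because sep is non-empty

with open(path) as fh:
    text = fh.read()                   # all nine values on a single line

# the 3x3 shape is lost in the file, so reshape explicitly
restored = np.fromfile(path, sep=",", dtype=int).reshape(3, 3)
```

If you need one CSV row per array row, `numpy.savetxt` is the better fit.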


Answer 3

Writing record arrays as CSV files with headers requires a bit more work.

This example reads a CSV file with the header on the first line, then writes the same file.

import numpy as np

# Write an example CSV file with headers on first line
with open('example.csv', 'w') as fp:
    fp.write('''\
col1,col2,col3
1,100.1,string1
2,222.2,second string
''')

# Read it as a Numpy record array
ar = np.recfromcsv('example.csv')
print(repr(ar))
# rec.array([(1, 100.1, 'string1'), (2, 222.2, 'second string')], 
#           dtype=[('col1', '<i4'), ('col2', '<f8'), ('col3', 'S13')])

# Write as a CSV file with headers on first line
with open('out.csv', 'w') as fp:
    fp.write(','.join(ar.dtype.names) + '\n')
    np.savetxt(fp, ar, '%s', ',')

Note that this example does not consider strings with commas. To consider quotes for non-numeric data, use the csv package:

import csv

with open('out2.csv', 'w', newline='') as fp:  # Python 3: text mode with newline='' ('wb' was Python 2)
    writer = csv.writer(fp, quoting=csv.QUOTE_NONNUMERIC)
    writer.writerow(ar.dtype.names)
    writer.writerows(ar.tolist())
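
For Python 3, a variant of the same round trip can be sketched with `np.genfromtxt`, which as far as I know is the suggested replacement for `np.recfromcsv` in recent NumPy releases. The temporary directory and file names are assumptions for the example; `encoding="utf-8"` is there so the string column comes back as `str` rather than bytes.

```python
import csv
import os
import tempfile
import numpy as np

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "example.csv")
dst = os.path.join(workdir, "out.csv")

with open(src, "w") as fp:
    fp.write("col1,col2,col3\n1,100.1,string1\n2,222.2,second string\n")

# names=True takes the field names from the first line,
# dtype=None lets NumPy pick a type per column
ar = np.genfromtxt(src, delimiter=",", names=True, dtype=None, encoding="utf-8")

with open(dst, "w", newline="") as fp:
    writer = csv.writer(fp, quoting=csv.QUOTE_NONNUMERIC)
    writer.writerow(ar.dtype.names)   # header row
    writer.writerows(ar.tolist())     # one tuple per record

with open(dst, newline="") as fp:
    written = fp.read()
```

With `QUOTE_NONNUMERIC` the strings are quoted, so a value containing a comma would still land in a single field.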

Answer 4

As already discussed, the best way to dump the array into a CSV file is the numpy.savetxt(...) function. However, there are a few things to know in order to use it properly.

For example, if you have a numpy array with dtype = np.int32 as

   narr = np.array([[1,2],
                 [3,4],
                 [5,6]], dtype=np.int32)

and want to save using savetxt as

np.savetxt('values.csv', narr, delimiter=",")

It will store the data in floating point exponential format as

1.000000000000000000e+00,2.000000000000000000e+00
3.000000000000000000e+00,4.000000000000000000e+00
5.000000000000000000e+00,6.000000000000000000e+00

You will have to change the formatting by using a parameter called fmt as

np.savetxt('values.csv', narr, fmt="%d", delimiter=",")

to store data in its original format

Saving Data in Compressed gz format

Also, savetxt can be used for storing data in .gz compressed format which might be useful while transferring data over network.

We just need to change the extension of the file as .gz and numpy will take care of everything automatically

np.savetxt('values.gz', narr, fmt="%d", delimiter=",")

Hope it helps
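
As a sanity check of the .gz behaviour, here is a short sketch that writes the compressed file, inspects it with the standard `gzip` module, and loads it back with `np.loadtxt`. The temporary path is an assumption for the example.

```python
import gzip
import os
import tempfile
import numpy as np

narr = np.array([[1, 2],
                 [3, 4],
                 [5, 6]], dtype=np.int32)
path = os.path.join(tempfile.mkdtemp(), "values.gz")

np.savetxt(path, narr, fmt="%d", delimiter=",")   # .gz suffix triggers gzip compression

with gzip.open(path, "rt") as fh:
    text = fh.read()                              # the plain CSV text inside the archive

back = np.loadtxt(path, delimiter=",", dtype=np.int32)   # loadtxt decompresses .gz as well
```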


Answer 5

I believe you can also accomplish this quite simply as follows:

  1. Convert Numpy array into a Pandas dataframe
  2. Save as CSV

Step 1:

    # Libraries to import
    import pandas as pd
    import numpy as np

    # N x N numpy array (dimensions don't matter)
    corr_mat = np.eye(3)            # placeholder; substitute your own numpy array
    my_df = pd.DataFrame(corr_mat)  # converting it to a pandas dataframe

Step 2:

    # save as csv
    my_df.to_csv('foo.csv', index=False)   # "foo" is the name you want to give
                                           # to csv file. Make sure to add ".csv"
                                           # after whatever name like in the code

Answer 6

If you want to write the values in a column (one value per line):

    for x in np.nditer(a.T, order='C'): 
            file.write(str(x))
            file.write("\n")

Here 'a' is the numpy array and 'file' is an open, writable file object.

If you want to write the values in a single row:

    import csv
    row = []  # collect all values first, then write them as one row
    writer = csv.writer(file, delimiter=',')
    for x in np.nditer(a.T, order='C'):
        row.append(str(x))
    writer.writerow(row)

Answer 7

If you want to save your numpy array (e.g. your_array = np.array([[1,2],[3,4]])) to one cell, you could convert it first with your_array.tolist().

Then save it the normal way to one cell, with delimiter=';', and the cell in the csv file will look like this: [[1, 2], [3, 4]]

Then you could restore your array like this: your_array = np.array(ast.literal_eval(cell_string))
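
Put together, the round trip could look like the sketch below. It writes to an in-memory buffer instead of a real file, and the extra `sample-1` label column is invented just to show that the array occupies a single cell.

```python
import ast
import csv
import io
import numpy as np

your_array = np.array([[1, 2], [3, 4]])

buf = io.StringIO()
# delimiter=';' so the commas inside the list text do not split the cell
csv.writer(buf, delimiter=";").writerow(["sample-1", your_array.tolist()])

buf.seek(0)
label, cell_string = next(csv.reader(buf, delimiter=";"))

# literal_eval safely parses the nested-list text back into Python lists
restored = np.array(ast.literal_eval(cell_string))
```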


Answer 8

You can also do it with pure python without using any modules.

# format as a block of csv text to do whatever you want
csv_rows = ["{},{}".format(i, j) for i, j in array]
csv_text = "\n".join(csv_rows)

# write it to a file
with open('file.csv', 'w') as f:
    f.write(csv_text)
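
The snippet above unpacks exactly two values per row. If the number of columns varies, a sketch along the same lines (the sample `rows` are made up, and any 2D array or list of lists should work the same way) could join each row generically:

```python
rows = [[1, 2, 3], [4, 5, 6]]  # stand-in for a 2D array with any number of columns

# join the values of each row with commas, then the rows with newlines
csv_text = "\n".join(",".join(str(v) for v in row) for row in rows)
print(csv_text)
```

Like the original, this does no quoting, so it is only suitable for values that contain no commas or newlines.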

Answer 9

In Python we use the csv.writer() function from the csv module to write data into csv files. It is the counterpart of csv.reader().

import csv

person = [['SN', 'Person', 'DOB'],
['1', 'John', '18/1/1997'],
['2', 'Marie','19/2/1998'],
['3', 'Simon','20/3/1999'],
['4', 'Erik', '21/4/2000'],
['5', 'Ana', '22/5/2001']]

csv.register_dialect('myDialect',
delimiter = '|',
quoting=csv.QUOTE_NONE,
skipinitialspace=True)

with open('dob.csv', 'w', newline='') as f:  # the with-block closes the file, no f.close() needed
    writer = csv.writer(f, dialect='myDialect')
    for row in person:
        writer.writerow(row)

A delimiter is a one-character string used to separate fields. The default value is a comma (,).