将NumPy数组转储到csv文件中

问题:将NumPy数组转储到csv文件中

有没有办法将NumPy数组转储到CSV文件中?我有一个2D NumPy数组,需要以人类可读的格式转储它。

Is there a way to dump a NumPy array into a CSV file? I have a 2D NumPy array and need to dump it in human-readable format.


回答 0

numpy.savetxt 将数组保存到文本文件。

import numpy
a = numpy.asarray([ [1,2,3], [4,5,6], [7,8,9] ])
numpy.savetxt("foo.csv", a, delimiter=",")

numpy.savetxt saves an array to a text file.

import numpy
a = numpy.asarray([ [1,2,3], [4,5,6], [7,8,9] ])
numpy.savetxt("foo.csv", a, delimiter=",")

回答 1

您可以使用pandas。它确实需要一些额外的内存,因此并不总是可能的,但是它非常快速且易于使用。

import pandas as pd 
pd.DataFrame(np_array).to_csv("path/to/file.csv")

如果您不想要标题或索引,请使用 to_csv("/path/to/file.csv", header=None, index=None)

You can use pandas. It does take some extra memory so it’s not always possible, but it’s very fast and easy to use.

import pandas as pd 
pd.DataFrame(np_array).to_csv("path/to/file.csv")

if you don’t want a header or index, use to_csv("/path/to/file.csv", header=None, index=None)


回答 2

tofile 是执行此操作的便捷功能:

import numpy as np
a = np.asarray([ [1,2,3], [4,5,6], [7,8,9] ])
a.tofile('foo.csv',sep=',',format='%10.5f')

手册页有一些有用的注释:

这是用于快速存储阵列数据的便利功能。有关字节序和精度的信息会丢失,因此对于打算在不同字节序的计算机之间存档数据或传输数据的文件,此方法不是一个好的选择。这些问题中的一些可以通过将数据输出为文本文件来克服,而这是以速度和文件大小为代价的。

注意。此功能不会生成多行的CSV文件,而是将所有内容保存到一行。

tofile is a convenient function to do this:

import numpy as np
a = np.asarray([ [1,2,3], [4,5,6], [7,8,9] ])
a.tofile('foo.csv',sep=',',format='%10.5f')

The man page has some useful notes:

This is a convenience function for quick storage of array data. Information on endianness and precision is lost, so this method is not a good choice for files intended to archive data or transport data between machines with different endianness. Some of these problems can be overcome by outputting the data as text files, at the expense of speed and file size.

Note. This function does not produce multi-line csv files, it saves everything to one line.


回答 3

将记录数组编写为带有标题的CSV文件需要更多的工作。

本示例读取标题为第一行的CSV文件,然后写入相同的文件。

import numpy as np

# Write an example CSV file with headers on first line
with open('example.csv', 'w') as fp:
    fp.write('''\
col1,col2,col3
1,100.1,string1
2,222.2,second string
''')

# Read it as a Numpy record array
ar = np.recfromcsv('example.csv')
print(repr(ar))
# rec.array([(1, 100.1, 'string1'), (2, 222.2, 'second string')], 
#           dtype=[('col1', '<i4'), ('col2', '<f8'), ('col3', 'S13')])

# Write as a CSV file with headers on first line
with open('out.csv', 'w') as fp:
    fp.write(','.join(ar.dtype.names) + '\n')
    np.savetxt(fp, ar, '%s', ',')

请注意,此示例不考虑带逗号的字符串。要考虑非数字数据的引号,请使用以下csv软件包:

import csv

with open('out2.csv', 'wb') as fp:
    writer = csv.writer(fp, quoting=csv.QUOTE_NONNUMERIC)
    writer.writerow(ar.dtype.names)
    writer.writerows(ar.tolist())

Writing record arrays as CSV files with headers requires a bit more work.

This example reads a CSV file with the header on the first line, then writes the same file.

import numpy as np

# Write an example CSV file with headers on first line
with open('example.csv', 'w') as fp:
    fp.write('''\
col1,col2,col3
1,100.1,string1
2,222.2,second string
''')

# Read it as a Numpy record array
ar = np.recfromcsv('example.csv')
print(repr(ar))
# rec.array([(1, 100.1, 'string1'), (2, 222.2, 'second string')], 
#           dtype=[('col1', '<i4'), ('col2', '<f8'), ('col3', 'S13')])

# Write as a CSV file with headers on first line
with open('out.csv', 'w') as fp:
    fp.write(','.join(ar.dtype.names) + '\n')
    np.savetxt(fp, ar, '%s', ',')

Note that this example does not consider strings with commas. To consider quotes for non-numeric data, use the csv package:

import csv

with open('out2.csv', 'wb') as fp:
    writer = csv.writer(fp, quoting=csv.QUOTE_NONNUMERIC)
    writer.writerow(ar.dtype.names)
    writer.writerows(ar.tolist())

回答 4

如前所述,将数组转储为CSV文件的最佳方法是使用.savetxt(...)方法。但是,有些事情我们应该知道如何正确完成。

例如,如果您有一个带dtype = np.int32as 的numpy数组

   narr = np.array([[1,2],
                 [3,4],
                 [5,6]], dtype=np.int32)

并想另存savetxt

np.savetxt('values.csv', narr, delimiter=",")

它将数据以浮点指数格式存储为

1.000000000000000000e+00,2.000000000000000000e+00
3.000000000000000000e+00,4.000000000000000000e+00
5.000000000000000000e+00,6.000000000000000000e+00

你必须使用一个名为参数更改格式fmt

np.savetxt('values.csv', narr, fmt="%d", delimiter=",")

以原始格式存储数据

以压缩的gz格式保存数据

此外,savetxt还可用于以.gz压缩格式存储数据,这在通过网络传输数据时可能很有用。

我们只需要更改文件的扩展名,因为.gznumpy会自动处理所有内容

np.savetxt('values.gz', narr, fmt="%d", delimiter=",")

希望能帮助到你

As already discussed, the best way to dump the array into a CSV file is by using .savetxt(...)method. However, there are certain things we should know to do it properly.

For example, if you have a numpy array with dtype = np.int32 as

   narr = np.array([[1,2],
                 [3,4],
                 [5,6]], dtype=np.int32)

and want to save using savetxt as

np.savetxt('values.csv', narr, delimiter=",")

It will store the data in floating point exponential format as

1.000000000000000000e+00,2.000000000000000000e+00
3.000000000000000000e+00,4.000000000000000000e+00
5.000000000000000000e+00,6.000000000000000000e+00

You will have to change the formatting by using a parameter called fmt as

np.savetxt('values.csv', narr, fmt="%d", delimiter=",")

to store data in its original format

Saving Data in Compressed gz format

Also, savetxt can be used for storing data in .gz compressed format which might be useful while transferring data over network.

We just need to change the extension of the file as .gz and numpy will take care of everything automatically

np.savetxt('values.gz', narr, fmt="%d", delimiter=",")

Hope it helps


回答 5

我相信您也可以很简单地完成此操作,如下所示:

  1. 将Numpy数组转换为Pandas数据框
  2. 另存为CSV

例如#1:

    # Libraries to import
    import pandas as pd
    import nump as np

    #N x N numpy array (dimensions dont matter)
    corr_mat    #your numpy array
    my_df = pd.DataFrame(corr_mat)  #converting it to a pandas dataframe

例如#2:

    #save as csv 
    my_df.to_csv('foo.csv', index=False)   # "foo" is the name you want to give
                                           # to csv file. Make sure to add ".csv"
                                           # after whatever name like in the code

I believe you can also accomplish this quite simply as follows:

  1. Convert Numpy array into a Pandas dataframe
  2. Save as CSV

e.g. #1:

    # Libraries to import
    import pandas as pd
    import nump as np

    #N x N numpy array (dimensions dont matter)
    corr_mat    #your numpy array
    my_df = pd.DataFrame(corr_mat)  #converting it to a pandas dataframe

e.g. #2:

    #save as csv 
    my_df.to_csv('foo.csv', index=False)   # "foo" is the name you want to give
                                           # to csv file. Make sure to add ".csv"
                                           # after whatever name like in the code

回答 6

如果要在列中写:

    for x in np.nditer(a.T, order='C'): 
            file.write(str(x))
            file.write("\n")

这里的“ a”是numpy数组的名称,“文件”是要写入文件的变量。

如果要写在行中:

    writer= csv.writer(file, delimiter=',')
    for x in np.nditer(a.T, order='C'): 
            row.append(str(x))
    writer.writerow(row)

if you want to write in column:

    for x in np.nditer(a.T, order='C'): 
            file.write(str(x))
            file.write("\n")

Here ‘a’ is the name of numpy array and ‘file’ is the variable to write in a file.

If you want to write in row:

    writer= csv.writer(file, delimiter=',')
    for x in np.nditer(a.T, order='C'): 
            row.append(str(x))
    writer.writerow(row)

回答 7

如果要将numpy数组(例如your_array = np.array([[1,2],[3,4]]))保存到一个单元格,可以先使用进行转换your_array.tolist()

然后将其以正常方式保存到一个单元格中,并且delimiter=';' 和,csv文件中的单元格将如下所示[[1, 2], [2, 4]]

然后,您可以像这样恢复阵列: your_array = np.array(ast.literal_eval(cell_string))

If you want to save your numpy array (e.g. your_array = np.array([[1,2],[3,4]])) to one cell, you could convert it first with your_array.tolist().

Then save it the normal way to one cell, with delimiter=';' and the cell in the csv-file will look like this [[1, 2], [2, 4]]

Then you could restore your array like this: your_array = np.array(ast.literal_eval(cell_string))


回答 8

您也可以使用纯python而不使用任何模块来完成此操作。

# format as a block of csv text to do whatever you want
csv_rows = ["{},{}".format(i, j) for i, j in array]
csv_text = "\n".join(csv_rows)

# write it to a file
with open('file.csv', 'w') as f:
    f.write(csv_text)

You can also do it with pure python without using any modules.

# format as a block of csv text to do whatever you want
csv_rows = ["{},{}".format(i, j) for i, j in array]
csv_text = "\n".join(csv_rows)

# write it to a file
with open('file.csv', 'w') as f:
    f.write(csv_text)

回答 9

在Python中,我们使用csv.writer()模块将数据写入csv文件。该模块类似于csv.reader()模块。

import csv

person = [['SN', 'Person', 'DOB'],
['1', 'John', '18/1/1997'],
['2', 'Marie','19/2/1998'],
['3', 'Simon','20/3/1999'],
['4', 'Erik', '21/4/2000'],
['5', 'Ana', '22/5/2001']]

csv.register_dialect('myDialect',
delimiter = '|',
quoting=csv.QUOTE_NONE,
skipinitialspace=True)

with open('dob.csv', 'w') as f:
    writer = csv.writer(f, dialect='myDialect')
    for row in person:
       writer.writerow(row)

f.close()

定界符是用于分隔字段的字符串。默认值为comma(,)。

In Python we use csv.writer() module to write data into csv files. This module is similar to the csv.reader() module.

import csv

person = [['SN', 'Person', 'DOB'],
['1', 'John', '18/1/1997'],
['2', 'Marie','19/2/1998'],
['3', 'Simon','20/3/1999'],
['4', 'Erik', '21/4/2000'],
['5', 'Ana', '22/5/2001']]

csv.register_dialect('myDialect',
delimiter = '|',
quoting=csv.QUOTE_NONE,
skipinitialspace=True)

with open('dob.csv', 'w') as f:
    writer = csv.writer(f, dialect='myDialect')
    for row in person:
       writer.writerow(row)

f.close()

A delimiter is a string used to separate fields. The default value is comma(,).