Reading .mat files in Python

Question: Reading .mat files in Python

Is it possible to read binary MATLAB .mat files in Python?

I’ve seen that SciPy has alleged support for reading .mat files, but I’m unsuccessful with it. I installed SciPy version 0.7.0, and I can’t find the loadmat() method.


Answer 0

The scipy.io submodule has to be imported explicitly with import scipy.io:

import scipy.io
mat = scipy.io.loadmat('file.mat')

Answer 1

Neither scipy.io.savemat nor scipy.io.loadmat works for MAT-files saved in the version 7.3 format. The good news is that version 7.3 MAT-files are HDF5 files, so they can be read with any HDF5 tool and the contents converted to NumPy arrays.

For Python, you will need the h5py extension, which requires HDF5 on your system.

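import numpy as np
import h5py

# A v7.3 MAT-file is a regular HDF5 file, so open it read-only with h5py
with h5py.File('somefile.mat', 'r') as f:
    # 'data/variable1' is a placeholder path to a dataset inside the file
    data = np.array(f['data/variable1'])  # Copy into a NumPy array before the file closes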
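
To decide which reader applies before loading, the descriptive text at the start of the file can be checked. The following is a minimal sketch using only the standard library; the function name mat_file_format and its return labels are made up for illustration. It assumes the file carries the usual 128-byte header, whose text begins with "MATLAB 7.3 MAT-file" for HDF5-based files and "MATLAB 5.0 MAT-file" for the older Level 5 layout (which the v6 and v7 save options also use).

```python
def mat_file_format(path):
    """Classify a MAT-file by the descriptive text at the start of its header.

    Returns 'hdf5' for v7.3 files, 'level5' for v5/v6/v7 files,
    and 'unknown' for anything else (for example v4 files, which
    have no text header).
    """
    with open(path, "rb") as f:
        header = f.read(128)  # the MAT-file header is 128 bytes long
    if len(header) < 128:
        return "unknown"
    if header.startswith(b"MATLAB 7.3 MAT-file"):
        return "hdf5"
    if header.startswith(b"MATLAB 5.0 MAT-file"):
        return "level5"
    return "unknown"
```

A caller could then use h5py.File when the result is 'hdf5' and scipy.io.loadmat when it is 'level5'.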

Answer 2

First, save the .mat file from MATLAB in the version 7 format:

save('test.mat', '-v7')

After that, in Python, use the usual loadmat function:

import scipy.io as sio
test = sio.loadmat('test.mat')

Answer 3

There is a nice package called mat4py, which can easily be installed using:

pip install mat4py

It is straightforward to use (from the website):

Load data from a MAT-file

The function loadmat loads all variables stored in the MAT-file into a simple Python data structure, using only Python’s dict and list objects. Numeric and cell arrays are converted to row-ordered nested lists. Arrays are squeezed to eliminate arrays with only one element. The resulting data structure is composed of simple types that are compatible with the JSON format.

Example: Load a MAT-file into a Python data structure:

from mat4py import loadmat

data = loadmat('datafile.mat')

The variable data is a dict with the variables and values contained in the MAT-file.
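
Because the result consists only of dicts, lists, strings, and numbers, it can be handed straight to the standard json module. The sketch below uses a hand-written dict as a stand-in for what loadmat might return; the variable names and values are invented for illustration and do not come from a real MAT-file.

```python
import json

# Stand-in for a mat4py loadmat() result: plain dicts, lists, str, and numbers
data = {
    "x": [[1, 2], [3, 4]],   # a 2x2 numeric array as row-ordered nested lists
    "name": "run1",          # a char array becomes a str
    "gain": 0.5,             # a 1x1 array squeezed to a scalar
}

# Serialize to JSON text and parse it back; no custom encoder is needed
text = json.dumps(data, indent=2)
restored = json.loads(text)
```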

Save a Python data structure to a MAT-file

Python data can be saved to a MAT-file, with the function savemat. Data has to be structured in the same way as for loadmat, i.e. it should be composed of simple data types, like dict, list, str, int, and float.

Example: Save a Python data structure to a MAT-file:

from mat4py import savemat

savemat('datafile.mat', data)

The parameter data shall be a dict with the variables.


Answer 4

If MATLAB R2014b or newer is installed, the MATLAB Engine for Python can be used:

import matlab.engine
eng = matlab.engine.start_matlab()
content = eng.load("example.mat", nargout=1)

Answer 5

Reading the file

import scipy.io
mat = scipy.io.loadmat(file_name)

Inspecting the type of MAT variable

print(type(mat))
#OUTPUT - <class 'dict'>

The keys inside the dictionary are MATLAB variables, and the values are the objects assigned to those variables.
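
Besides the MATLAB variables, the dictionary returned by scipy.io.loadmat also contains the metadata entries __header__, __version__, and __globals__. A small helper can filter these out. This is a sketch only: user_variables is a made-up name, and the dict below is a hand-written stand-in for a loadmat result so that the example runs without a .mat file.

```python
def user_variables(mat):
    """Return only the entries whose keys are not wrapped in double underscores."""
    return {
        key: value
        for key, value in mat.items()
        if not (key.startswith("__") and key.endswith("__"))
    }

# Hand-written stand-in for the dict that scipy.io.loadmat returns
mat = {
    "__header__": b"MATLAB 5.0 MAT-file",
    "__version__": "1.0",
    "__globals__": [],
    "x": [[1.0, 2.0]],
}

print(sorted(user_variables(mat)))  # ['x']
```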


Answer 6

There is also the MATLAB Engine for Python by MathWorks itself. If you have MATLAB, this might be worth considering (I haven’t tried it myself but it has a lot more functionality than just reading MATLAB files). However, I don’t know if it is allowed to distribute it to other users (it is probably not a problem if those persons have MATLAB. Otherwise, maybe NumPy is the right way to go?).

Also, if you want to do all the basics yourself, MathWorks provides detailed documentation on the structure of the file format (if the link changes, search for matfile_format.pdf or its title, MAT-File Format). It’s not as complicated as I personally thought, but obviously, this is not the easiest way to go. It also depends on how many features of .mat files you want to support.

I’ve written a “small” (about 700 lines) Python script which can read some basic .mat-files. I’m neither a Python expert nor a beginner and it took me about two days to write it (using the MathWorks documentation linked above). I’ve learned a lot of new stuff and it was quite fun (most of the time). As I’ve written the Python script at work, I’m afraid I cannot publish it… But I can give some advice here:

  • First read the documentation.
  • Use a hex editor (such as HxD) and look into a reference .mat-file you want to parse.
  • Try to figure out the meaning of each byte by saving the bytes to a .txt file and annotating each line.
  • Use classes to hold each data element (such as miCOMPRESSED, miMATRIX, mxDOUBLE, or miINT32).
  • The structure of .mat files lends itself to storing the data elements in a tree: each node has one class and subnodes.
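
The tree idea from the list above can be sketched as follows. This is not the script described in this answer, only a minimal illustration: Element and parse_elements are hypothetical names, the type codes are a subset taken from the MAT-file Level 5 format description, and the input is assumed to be the bytes that follow the 128-byte file header. It also assumes that miCOMPRESSED elements are not padded and that all other elements are padded to 8-byte boundaries.

```python
import struct
import zlib
from dataclasses import dataclass, field

# Subset of the data type codes from the MAT-file Level 5 format
MI_NAMES = {
    1: "miINT8", 2: "miUINT8", 3: "miINT16", 4: "miUINT16",
    5: "miINT32", 6: "miUINT32", 7: "miSINGLE", 9: "miDOUBLE",
    12: "miINT64", 13: "miUINT64", 14: "miMATRIX", 15: "miCOMPRESSED",
    16: "miUTF8",
}
MI_MATRIX, MI_COMPRESSED = 14, 15

@dataclass
class Element:
    """One node of the tree: a data element and its sub-elements."""
    type_name: str
    payload: bytes
    children: list = field(default_factory=list)

def parse_elements(buf, endian="<"):
    """Parse consecutive data elements in buf into a list of Element trees."""
    elements = []
    pos = 0
    while pos + 8 <= len(buf):
        (word,) = struct.unpack_from(endian + "I", buf, pos)
        if word >> 16:
            # Small element: type and size share one 4-byte word,
            # and up to 4 data bytes follow in the same 8-byte slot
            mi_type, size = word & 0xFFFF, word >> 16
            start, step = pos + 4, 8
        else:
            # Regular element: 4-byte type, 4-byte size, then the data
            mi_type = word
            (size,) = struct.unpack_from(endian + "I", buf, pos + 4)
            start, step = pos + 8, 8 + size
            if mi_type != MI_COMPRESSED:
                step += -size % 8  # skip padding up to the next 8-byte boundary
        payload = buf[start:start + size]
        node = Element(MI_NAMES.get(mi_type, f"unknown({mi_type})"), payload)
        if mi_type == MI_COMPRESSED:
            # The payload is a zlib stream that wraps further elements
            node.children = parse_elements(zlib.decompress(payload), endian)
        elif mi_type == MI_MATRIX:
            # A matrix holds sub-elements: array flags, dimensions, name, data
            node.children = parse_elements(payload, endian)
        elements.append(node)
        pos += step
    return elements
```

Calling parse_elements(raw[128:]) on the bytes of a Level 5 file would return the top-level elements. Interpreting the array flags, dimensions, and numeric payloads is left out here.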

Answer 7

from os.path import dirname, join as pjoin
import scipy.io as sio
data_dir = pjoin(dirname(sio.__file__), 'matlab', 'tests', 'data')
mat_fname = pjoin(data_dir, 'testdouble_7.4_GLNX86.mat')
mat_contents = sio.loadmat(mat_fname)

You can use the code above to read a .mat file saved with MATLAB's default settings; here it loads one of the test files that ship with SciPy.