问题:如何修复imdb.load_data()函数的“ allow_pickle = False时无法加载对象数组”?

我正在尝试使用Google Colab中的IMDb数据集实现二进制分类示例。我以前已经实现了此模型。但是,几天后我再次尝试执行此操作时,它返回一个值错误:对于load_data()函数,当allow_pickle = False时无法加载对象数组。

我已经尝试解决此问题,请参考一个类似问题的现有答案:如何修复sketch_rnn算法中的“当allow_pickle = False时无法加载对象数组”, 但事实证明,仅添加allow_pickle参数是不够的。

我的代码:

from keras.datasets import imdb
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

错误:

ValueError                                Traceback (most recent call last)
<ipython-input-1-2ab3902db485> in <module>()
      1 from keras.datasets import imdb
----> 2 (train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

2 frames
/usr/local/lib/python3.6/dist-packages/keras/datasets/imdb.py in load_data(path, num_words, skip_top, maxlen, seed, start_char, oov_char, index_from, **kwargs)
     57                     file_hash='599dadb1135973df5b59232a0e9a887c')
     58     with np.load(path) as f:
---> 59         x_train, labels_train = f['x_train'], f['y_train']
     60         x_test, labels_test = f['x_test'], f['y_test']
     61 

/usr/local/lib/python3.6/dist-packages/numpy/lib/npyio.py in __getitem__(self, key)
    260                 return format.read_array(bytes,
    261                                          allow_pickle=self.allow_pickle,
--> 262                                          pickle_kwargs=self.pickle_kwargs)
    263             else:
    264                 return self.zip.read(key)

/usr/local/lib/python3.6/dist-packages/numpy/lib/format.py in read_array(fp, allow_pickle, pickle_kwargs)
    690         # The array contained Python objects. We need to unpickle the data.
    691         if not allow_pickle:
--> 692             raise ValueError("Object arrays cannot be loaded when "
    693                              "allow_pickle=False")
    694         if pickle_kwargs is None:

ValueError: Object arrays cannot be loaded when allow_pickle=False

I’m trying to implement the binary classification example using the IMDb dataset in Google Colab. I have implemented this model before. But when I tried to do it again after a few days, it returned a value error: 'Object arrays cannot be loaded when allow_pickle=False' for the load_data() function.

I have already tried solving this, referring to an existing answer for a similar problem: How to fix ‘Object arrays cannot be loaded when allow_pickle=False’ in the sketch_rnn algorithm. But it turns out that just adding an allow_pickle argument isn’t sufficient.

My code:

from keras.datasets import imdb
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

The error:

ValueError                                Traceback (most recent call last)
<ipython-input-1-2ab3902db485> in <module>()
      1 from keras.datasets import imdb
----> 2 (train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

2 frames
/usr/local/lib/python3.6/dist-packages/keras/datasets/imdb.py in load_data(path, num_words, skip_top, maxlen, seed, start_char, oov_char, index_from, **kwargs)
     57                     file_hash='599dadb1135973df5b59232a0e9a887c')
     58     with np.load(path) as f:
---> 59         x_train, labels_train = f['x_train'], f['y_train']
     60         x_test, labels_test = f['x_test'], f['y_test']
     61 

/usr/local/lib/python3.6/dist-packages/numpy/lib/npyio.py in __getitem__(self, key)
    260                 return format.read_array(bytes,
    261                                          allow_pickle=self.allow_pickle,
--> 262                                          pickle_kwargs=self.pickle_kwargs)
    263             else:
    264                 return self.zip.read(key)

/usr/local/lib/python3.6/dist-packages/numpy/lib/format.py in read_array(fp, allow_pickle, pickle_kwargs)
    690         # The array contained Python objects. We need to unpickle the data.
    691         if not allow_pickle:
--> 692             raise ValueError("Object arrays cannot be loaded when "
    693                              "allow_pickle=False")
    694         if pickle_kwargs is None:

ValueError: Object arrays cannot be loaded when allow_pickle=False

回答 0

这是一个强制imdb.load_data在笔记本中允许泡菜代替此行的技巧:

(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

这样:

import numpy as np
# save np.load
np_load_old = np.load

# modify the default parameters of np.load
np.load = lambda *a,**k: np_load_old(*a, allow_pickle=True, **k)

# call load_data with allow_pickle implicitly set to true
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

# restore np.load for future normal usage
np.load = np_load_old

Here’s a trick to force imdb.load_data to allow pickle by, in your notebook, replacing this line:

(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

by this:

import numpy as np
# save np.load
np_load_old = np.load

# modify the default parameters of np.load
np.load = lambda *a,**k: np_load_old(*a, allow_pickle=True, **k)

# call load_data with allow_pickle implicitly set to true
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

# restore np.load for future normal usage
np.load = np_load_old

回答 1

这个问题仍然存在于keras git上。我希望它能尽快解决。在此之前,请尝试将numpy版本降级为1.16.2。看来解决了问题。

!pip install numpy==1.16.1
import numpy as np

此版本的numpy的默认值为allow_pickleas True

This issue is still up on keras git. I hope it gets solved as soon as possible. Until then, try downgrading your numpy version to 1.16.2. It seems to solve the problem.

!pip install numpy==1.16.1
import numpy as np

This version of numpy has the default value of allow_pickle as True.


回答 2

在GitHub上发布此问题之后,官方解决方案是编辑imdb.py文件。此修复程序对我来说效果很好,无需降级numpy。在以下位置找到imdb.py文件tensorflow/python/keras/datasets/imdb.py(对我来说完整路径是:C:\Anaconda\Lib\site-packages\tensorflow\python\keras\datasets\imdb.py-其他安装将有所不同),并根据差异更改第85行:

-  with np.load(path) as f:
+  with np.load(path, allow_pickle=True) as f:

进行更改的原因是为了防止在腌制文件中使用与SQL注入等效的Python。上面的更改仅会影响imdb数据,因此您将安全性保留在其他位置(不降低numpy的级别)。

Following this issue on GitHub, the official solution is to edit the imdb.py file. This fix worked well for me without the need to downgrade numpy. Find the imdb.py file at tensorflow/python/keras/datasets/imdb.py (full path for me was: C:\Anaconda\Lib\site-packages\tensorflow\python\keras\datasets\imdb.py – other installs will be different) and change line 85 as per the diff:

-  with np.load(path) as f:
+  with np.load(path, allow_pickle=True) as f:

The reason for the change is security to prevent the Python equivalent of an SQL injection in a pickled file. The change above will ONLY effect the imdb data and you therefore retain the security elsewhere (by not downgrading numpy).


回答 3

我只是使用allow_pickle = True作为np.load()的参数,它对我有用。

I just used allow_pickle = True as an argument to np.load() and it worked for me.


回答 4

就我而言:

np.load(path, allow_pickle=True)

In my case worked with:

np.load(path, allow_pickle=True)

回答 5

我认为cheez的答案(https://stackoverflow.com/users/122933/cheez)是最简单,最有效。我将对其进行详细说明,以便在整个会话期间都不会修改numpy函数。

我的建议如下。我正在使用它从keras下载路透社数据集,该数据集显示了相同类型的错误:

old = np.load
np.load = lambda *a,**k: old(*a,**k,allow_pickle=True)

from keras.datasets import reuters
(train_data, train_labels), (test_data, test_labels) = reuters.load_data(num_words=10000)

np.load = old
del(old)

I think the answer from cheez (https://stackoverflow.com/users/122933/cheez) is the easiest and most effective one. I’d elaborate a little bit over it so it would not modify a numpy function for the whole session period.

My suggestion is below. I´m using it to download the reuters dataset from keras which is showing the same kind of error:

old = np.load
np.load = lambda *a,**k: old(*a,**k,allow_pickle=True)

from keras.datasets import reuters
(train_data, train_labels), (test_data, test_labels) = reuters.load_data(num_words=10000)

np.load = old
del(old)

回答 6

您可以尝试更改标志的值

np.load(training_image_names_array,allow_pickle=True)

You can try changing the flag’s value

np.load(training_image_names_array,allow_pickle=True)

回答 7

上面列出的解决方案都不对我有用:我使用python 3.7.3运行anaconda。对我有用的是

  • 从Anaconda powershell运行“ conda install numpy == 1.16.1”

  • 关闭并重新打开笔记本

none of the above listed solutions worked for me: i run anaconda with python 3.7.3. What worked for me was

  • run “conda install numpy==1.16.1” from Anaconda powershell

  • close and reopen the notebook


回答 8

在jupyter笔记本上使用

np_load_old = np.load

# modify the default parameters of np.load
np.load = lambda *a,**k: np_load_old(*a, allow_pickle=True, **k)

工作正常,但问题是在spyder中使用此方法时出现的(您每次必须重新启动内核,否则会收到类似以下的错误:

TypeError :()为关键字参数“ allow_pickle”获得了多个值

我在这里使用解决方案解决了这个问题:

on jupyter notebook using

np_load_old = np.load

# modify the default parameters of np.load
np.load = lambda *a,**k: np_load_old(*a, allow_pickle=True, **k)

worked fine, but the problem appears when you use this method in spyder(you have to restart the kernel every time or you will get an error like:

TypeError : () got multiple values for keyword argument ‘allow_pickle’

I solved this issue using the solution here:


回答 9

我降落在这里,尝试了您的方式,无法解决。

我实际上是在编写给定的代码

pickle.load(path)

用过,所以我用

np.load(path, allow_pickle=True)

I landed up here, tried your ways and could not figure out.

I was actually working on a pregiven code where

pickle.load(path)

was used so i replaced it with

np.load(path, allow_pickle=True)

回答 10

是的,安装以前的numpy版本可以解决此问题。

对于使用PyCharm IDE的用户:

在我的IDE(Pycharm)中,依次单击File-> Settings-> Project Interpreter:我发现numpy为1.16.3,所以我恢复为1.16.1。单击+并在搜索中键入numpy,在“指定版本”上打勾:1.16.1并选择->安装软件包。

Yes, installing previous a version of numpy solved the problem.

For those who uses PyCharm IDE:

in my IDE (Pycharm), File->Settings->Project Interpreter: I found my numpy to be 1.16.3, so I revert back to 1.16.1. Click + and type numpy in the search, tick “specify version” : 1.16.1 and choose–> install package.


回答 11

找到imdb.py的路径,然后将标志添加到np.load(path,… flag …)

    def load_data(.......):
    .......................................
    .......................................
    - with np.load(path) as f:
    + with np.load(path,allow_pickle=True) as f:

find the path to imdb.py then just add the flag to np.load(path,…flag…)

    def load_data(.......):
    .......................................
    .......................................
    - with np.load(path) as f:
    + with np.load(path,allow_pickle=True) as f:

回答 12

它对我的工作

        np_load_old = np.load
        np.load = lambda *a: np_load_old(*a, allow_pickle=True)
        (x_train, y_train), (x_test, y_test) = reuters.load_data(num_words=None, test_split=0.2)
        np.load = np_load_old

Its work for me

        np_load_old = np.load
        np.load = lambda *a: np_load_old(*a, allow_pickle=True)
        (x_train, y_train), (x_test, y_test) = reuters.load_data(num_words=None, test_split=0.2)
        np.load = np_load_old

回答 13

我发现TensorFlow 2.0(我正在使用2.0.0-alpha0)与最新版本的Numpy不兼容,即v1.17.0(可能还有v1.16.5 +)。导入TF2后,它会抛出巨大的FutureWarning列表,如下所示:

FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/anaconda3/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/anaconda3/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/anaconda3/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.

尝试从keras加载imdb数据集时,这还会导致allow_pickle错误

我尝试使用以下效果很好的解决方案,但是我必须在导入TF2或tf.keras的每个项目中都这样做。

np.load = lambda *a,**k: np_load_old(*a, allow_pickle=True, **k)

我找到的最简单的解决方案是全局安装numpy 1.16.1,或者在虚拟环境中使用tensorflow和numpy的兼容版本。

我的答案是指出,这不仅是imdb.load_data的问题,而且是TF2和Numpy版本不兼容所引起的更大的问题,并且可能导致许多其他隐藏的错误或问题。

What I have found is that TensorFlow 2.0 (I am using 2.0.0-alpha0) is not compatible with the latest version of Numpy i.e. v1.17.0 (and possibly v1.16.5+). As soon as TF2 is imported, it throws a huge list of FutureWarning, that looks something like this:

FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/anaconda3/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/anaconda3/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/anaconda3/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.

This also resulted in the allow_pickle error when tried to load imdb dataset from keras

I tried to use the following solution which worked just fine, but I had to do it every single project where I was importing TF2 or tf.keras.

np.load = lambda *a,**k: np_load_old(*a, allow_pickle=True, **k)

The easiest solution I found was to either install numpy 1.16.1 globally, or use compatible versions of tensorflow and numpy in a virtual environment.

My goal with this answer is to point out that its not just a problem with imdb.load_data, but a larger problem vaused by incompatibility of TF2 and Numpy versions and may result in many other hidden bugs or issues.


回答 14

Tensorflow在tf-nightly版本中有一个修复。

!pip install tf-nightly

当前版本是’2.0.0-dev20190511’。

Tensorflow has a fix in tf-nightly version.

!pip install tf-nightly

The current version is ‘2.0.0-dev20190511’.


回答 15

@cheez答案有时不起作用,并一次又一次地递归调用该函数。要解决此问题,您应该深入复制该功能。您可以使用函数来完成此操作partial,因此最终代码为:

import numpy as np
from functools import partial

# save np.load
np_load_old = partial(np.load)

# modify the default parameters of np.load
np.load = lambda *a,**k: np_load_old(*a, allow_pickle=True, **k)

# call load_data with allow_pickle implicitly set to true
(train_data, train_labels), (test_data, test_labels) = 
imdb.load_data(num_words=10000)

# restore np.load for future normal usage
np.load = np_load_old

The answer of @cheez sometime doesn’t work and recursively call the function again and again. To solve this problem you should copy the function deeply. You can do this by using the function partial, so the final code is:

import numpy as np
from functools import partial

# save np.load
np_load_old = partial(np.load)

# modify the default parameters of np.load
np.load = lambda *a,**k: np_load_old(*a, allow_pickle=True, **k)

# call load_data with allow_pickle implicitly set to true
(train_data, train_labels), (test_data, test_labels) = 
imdb.load_data(num_words=10000)

# restore np.load for future normal usage
np.load = np_load_old

回答 16

我通常不发布这些东西,但这太烦人了。混淆来自某些Keras imdb.py文件已经更新的事实:

with np.load(path) as f:

到的版本allow_pickle=True。确保检查imdb.py文件以查看是否已实施此更改。如果已进行调整,则可以很好地进行以下操作:

from keras.datasets import imdb
(train_text, train_labels), (test_text, test_labels) = imdb.load_data(num_words=10000)

I don’t usually post to these things but this was super annoying. The confusion comes from the fact that some of the Keras imdb.py files have already updated:

with np.load(path) as f:

to the version with allow_pickle=True. Make sure check the imdb.py file to see if this change was already implemented. If it has been adjusted, the following works fine:

from keras.datasets import imdb
(train_text, train_labels), (test_text, test_labels) = imdb.load_data(num_words=10000)

回答 17

最简单的方法是将imdb.py设置更改allow_pickle=True为引发错误np.load的行imdb.py

The easiest way is to change imdb.py setting allow_pickle=True to np.load at the line where imdb.py throws error.


声明:本站所有文章,如无特殊说明或标注,均为本站原创发布。任何个人或组织,在未征得本站同意时,禁止复制、盗用、采集、发布本站内容到任何网站、书籍等各类媒体平台。如若本站内容侵犯了原著者的合法权益,可联系我们进行处理。