Question: Can Keras with the TensorFlow backend be forced to use the CPU or GPU at will?
I have Keras installed with the TensorFlow backend and CUDA. Sometimes I'd like to force Keras to use the CPU on demand. Can this be done without, say, installing a separate CPU-only TensorFlow in a virtual environment? If so, how? If the backend were Theano, flags could be set, but I have not heard of TensorFlow flags accessible via Keras.
Answer 1
A rather separable way of doing this is to use

import tensorflow as tf
from keras import backend as K

# Set exactly one of these flags to choose the device (TensorFlow 1.x API).
GPU = False
CPU = True

num_cores = 4

if GPU:
    num_GPU = 1
    num_CPU = 1
if CPU:
    num_CPU = 1
    num_GPU = 0

config = tf.ConfigProto(intra_op_parallelism_threads=num_cores,
                        inter_op_parallelism_threads=num_cores,
                        allow_soft_placement=True,
                        device_count={'CPU': num_CPU,
                                      'GPU': num_GPU})
session = tf.Session(config=config)
K.set_session(session)
Here, the booleans GPU and CPU indicate whether to run the code on the GPU or the CPU, by rigidly defining the number of GPUs and CPUs the TensorFlow session is allowed to access. The variables num_GPU and num_CPU hold these counts. num_cores then sets the number of CPU cores available for use via intra_op_parallelism_threads and inter_op_parallelism_threads.
The intra_op_parallelism_threads setting dictates the number of threads a parallel operation in a single node of the computation graph is allowed to use (intra), while the inter_op_parallelism_threads setting defines the number of threads available for parallel operations across the nodes of the computation graph (inter).
allow_soft_placement allows operations to run on the CPU if any of the following criteria are met:
- there is no GPU implementation for the operation
- there are no GPU devices known or registered
- there is a need to co-locate with other inputs from the CPU
All of this is executed in the constructor of my class before any other operations, and is completely separable from any model or other code I use.
Note: This requires tensorflow-gpu and cuda/cudnn to be installed, because the option is given to use a GPU.
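As a minimal sketch of the same idea, the device choice can be wrapped in a small helper so it is made in one place. The names session_config_kwargs and make_session are hypothetical, and make_session assumes TensorFlow 1.x (where tf.ConfigProto and tf.Session exist) together with standalone Keras:

```python
def session_config_kwargs(use_gpu, num_cores=4):
    """Return keyword arguments for tf.ConfigProto (hypothetical helper).

    With use_gpu=False the GPU device count is 0, so the session sees no GPU.
    """
    return {
        "intra_op_parallelism_threads": num_cores,
        "inter_op_parallelism_threads": num_cores,
        "allow_soft_placement": True,
        "device_count": {"CPU": 1, "GPU": 1 if use_gpu else 0},
    }


def make_session(use_gpu, num_cores=4):
    """Create a session and register it with Keras (assumes TensorFlow 1.x)."""
    # Imported lazily so the helper above can be used without TensorFlow.
    import tensorflow as tf
    from keras import backend as K

    config = tf.ConfigProto(**session_config_kwargs(use_gpu, num_cores))
    session = tf.Session(config=config)
    K.set_session(session)
    return session


if __name__ == "__main__":
    # Show the settings a CPU-only session would use.
    print(session_config_kwargs(use_gpu=False))
```

Calling make_session(use_gpu=False) before building any model would then correspond to the CPU branch above.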
Answer 2
This worked for me (Windows 10). Place this before you import Keras:
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
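To switch on demand, the assignment can be wrapped in a small function, sketched below. The name select_device is hypothetical. The sketch assumes it is called before Keras or TensorFlow is imported, since the variable is only read when CUDA is initialised:

```python
import os


def select_device(use_gpu, gpu_index=0):
    """Expose one GPU, or hide all GPUs, for libraries imported afterwards."""
    # "-1" matches no device, so CUDA-based libraries fall back to the CPU.
    value = str(gpu_index) if use_gpu else "-1"
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


select_device(use_gpu=False)
# Import Keras only after this point, for example:
# import keras
print(os.environ["CUDA_VISIBLE_DEVICES"])
```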
Answer 3
Just import TensorFlow and run your Keras code inside a tf.device scope; it's that easy.
import tensorflow as tf
# your code here
with tf.device('/gpu:0'):  # use '/cpu:0' to run on the CPU instead
    model.fit(X, y, epochs=20, batch_size=128, callbacks=callbacks_list)
Answer 4
As per the Keras tutorial, you can simply use the same tf.device scope as in regular TensorFlow:
import tensorflow as tf
from keras.layers import LSTM

with tf.device('/gpu:0'):
    x = tf.placeholder(tf.float32, shape=(None, 20, 64))
    y = LSTM(32)(x)  # all ops in the LSTM layer will live on GPU:0

with tf.device('/cpu:0'):
    x = tf.placeholder(tf.float32, shape=(None, 20, 64))
    y = LSTM(32)(x)  # all ops in the LSTM layer will live on CPU:0
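To pick the device at run time rather than hard-coding it, the device string can be derived from a flag. This is only a sketch: device_name and build_lstm are hypothetical names, and build_lstm assumes TensorFlow 1.x (for tf.placeholder) with standalone Keras.

```python
def device_name(use_gpu, index=0):
    """Return a TensorFlow device string such as '/gpu:0' or '/cpu:0'."""
    return "/gpu:{}".format(index) if use_gpu else "/cpu:0"


def build_lstm(use_gpu):
    """Build the LSTM layer from the snippet above on the chosen device."""
    # Imported lazily; assumes TensorFlow 1.x and standalone Keras.
    import tensorflow as tf
    from keras.layers import LSTM

    with tf.device(device_name(use_gpu)):
        x = tf.placeholder(tf.float32, shape=(None, 20, 64))
        return LSTM(32)(x)


if __name__ == "__main__":
    print(device_name(use_gpu=False))
```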
Answer 5
I just spent some time figuring this out.
Thoma's answer is not complete.
Say your program is test.py, and you want to run it on GPU 0 while keeping the other GPUs free.
You should write:
CUDA_VISIBLE_DEVICES=0 python test.py
Note that the variable is DEVICES, not DEVICE.
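The same per-process setting can be applied when launching the script from Python. In the sketch below, run_with_visible_gpus is a hypothetical helper, and the child command just prints the variable so the effect is visible; in practice it would be something like [sys.executable, "test.py"].

```python
import os
import subprocess
import sys


def run_with_visible_gpus(argv, devices):
    """Run a command with CUDA_VISIBLE_DEVICES set only for that process."""
    # Copy the current environment and override the one variable.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=devices)
    return subprocess.run(argv, env=env, capture_output=True,
                          text=True, check=True)


# Stand-in for "python test.py": the child reports which devices it may see.
child = [sys.executable, "-c",
         "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"]
result = run_with_visible_gpus(child, "0")
print(result.stdout.strip())
```

Because the variable is set only in the child's environment, the parent process and other programs keep their own view of the GPUs.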
Answer 6
For people working in PyCharm who want to force CPU use, add the following line in the Run/Debug configuration, under Environment variables:
<OTHER_ENVIRONMENT_VARIABLES>;CUDA_VISIBLE_DEVICES=-1