Tag Archives: gpu

How to get the currently available GPUs in TensorFlow?

Question: How to get the currently available GPUs in TensorFlow?

I have a plan to use distributed TensorFlow, and I saw TensorFlow can use GPUs for training and testing. In a cluster environment, each machine could have 0 or 1 or more GPUs, and I want to run my TensorFlow graph into GPUs on as many machines as possible.

I found that when running tf.Session(), TensorFlow gives information about the GPU in log messages like those below:

I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)

My question is: how do I get information about the currently available GPUs from TensorFlow? I can get the loaded GPU information from the log, but I want to do it in a more sophisticated, programmatic way. I also could restrict the GPUs intentionally using the CUDA_VISIBLE_DEVICES environment variable, so I don’t want a way of getting GPU information from the OS kernel.

In short, I want a function like tf.get_available_gpus() that will return ['/gpu:0', '/gpu:1'] if there are two GPUs available in the machine. How can I implement this?


Answer 0

There is an undocumented method called device_lib.list_local_devices() that enables you to list the devices available in the local process. (N.B. As an undocumented method, this is subject to backwards incompatible changes.) The function returns a list of DeviceAttributes protocol buffer objects. You can extract a list of string device names for the GPU devices as follows:

from tensorflow.python.client import device_lib

def get_available_gpus():
    local_device_protos = device_lib.list_local_devices()
    return [x.name for x in local_device_protos if x.device_type == 'GPU']

Note that (at least up to TensorFlow 1.4), calling device_lib.list_local_devices() will run some initialization code that, by default, will allocate all of the GPU memory on all of the devices (GitHub issue). To avoid this, first create a session with an explicitly small per_process_gpu_memory_fraction, or with allow_growth=True, to prevent all of the memory being allocated. See this question for more details.
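
For example, a minimal sketch of that workaround for TF 1.x (allow_growth is one of the two choices the note mentions; treat the exact config as illustrative):

import tensorflow as tf
from tensorflow.python.client import device_lib

# Create a session first so device initialization does not grab all GPU memory.
config = tf.ConfigProto(gpu_options=tf.GPUOptions(allow_growth=True))
sess = tf.Session(config=config)

print([x.name for x in device_lib.list_local_devices() if x.device_type == 'GPU'])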


Answer 1

You can check the full device list using the following code:

from tensorflow.python.client import device_lib

device_lib.list_local_devices()

Answer 2

There is also a method in the test util. So all that has to be done is:

tf.test.is_gpu_available()

and/or

tf.test.gpu_device_name()

Look up the Tensorflow docs for arguments.


Answer 3

In TensorFlow 2.0, you can use tf.config.experimental.list_physical_devices('GPU'):

import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    print("Name:", gpu.name, "  Type:", gpu.device_type)

If you have two GPUs installed, it outputs this:

Name: /physical_device:GPU:0   Type: GPU
Name: /physical_device:GPU:1   Type: GPU

From 2.1, you can drop experimental:

gpus = tf.config.list_physical_devices('GPU')

See: https://www.tensorflow.org/api_docs/python/tf/config/list_physical_devices


Answer 4

The accepted answer gives you the number of GPUs, but it also allocates all the memory on those GPUs, which may be unwanted for some applications. You can avoid this by creating a session with a fixed, lower memory limit before calling device_lib.list_local_devices().

I ended up using nvidia-smi to get the number of GPUs without allocating any memory on them.

import subprocess

n = str(subprocess.check_output(["nvidia-smi", "-L"])).count('UUID')
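
A slightly more defensive sketch of the same idea (still assuming nvidia-smi is on the PATH) decodes the output and counts device lines instead of substring hits:

import subprocess

def count_gpus():
    # `nvidia-smi -L` prints one line per GPU, e.g. "GPU 0: GeForce GTX 1080 (UUID: GPU-...)".
    out = subprocess.check_output(["nvidia-smi", "-L"]).decode("utf-8")
    return sum(1 for line in out.splitlines() if line.startswith("GPU "))

print(count_gpus())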

Answer 5

Apart from the excellent explanation by mrry, who suggested using device_lib.list_local_devices(), I can show you how to check for GPU-related information from the command line.

Because currently only Nvidia’s GPUs work for NN frameworks, the answer covers only them. Nvidia has a page where they document how you can use the /proc filesystem interface to obtain run-time information about the driver, any installed NVIDIA graphics cards, and the AGP status.

/proc/driver/nvidia/gpus/0..N/information

This provides information about each of the installed NVIDIA graphics adapters (model name, IRQ, BIOS version, bus type). Note that the BIOS version is only available while X is running.

So you can run cat /proc/driver/nvidia/gpus/0/information from the command line and see information about your first GPU. It is easy to run this from Python, and you can check the second, third, fourth GPU until the read fails.

Definitely mrry’s answer is more robust, and I am not sure whether my answer will work on non-Linux machines, but Nvidia’s page provides other interesting information which not many people know about.
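
As a sketch of the Python variant (Linux only, and assuming the /proc paths from Nvidia’s page exist for your driver version; globbing avoids guessing how many GPUs there are):

import glob

def gpu_info_from_proc():
    # One directory per GPU under /proc/driver/nvidia/gpus/
    paths = sorted(glob.glob('/proc/driver/nvidia/gpus/*/information'))
    return [open(p).read() for p in paths]

for info in gpu_info_from_proc():
    print(info)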


Answer 6

The following works in tensorflow 2:

import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    print("Name:", gpu.name, "  Type:", gpu.device_type)

From 2.1, you can drop experimental:

    gpus = tf.config.list_physical_devices('GPU')

https://www.tensorflow.org/api_docs/python/tf/config/list_physical_devices


Answer 7

I have a GPU called NVIDIA GTX GeForce 1650 Ti in my machine, with tensorflow-gpu==2.2.0 installed.

Run the following two lines of code:

import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))

Output:

Num GPUs Available:  1

Answer 8

Use this to check all the parts of your setup:

from __future__ import absolute_import, division, print_function, unicode_literals

import numpy as np
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_datasets as tfds


version = tf.__version__
executing_eagerly = tf.executing_eagerly()
hub_version = hub.__version__
available = tf.config.experimental.list_physical_devices("GPU")

print("Version: ", version)
print("Eager mode: ", executing_eagerly)
print("Hub Version: ", hub_version)
print("GPU is", "available" if available else "NOT AVAILABLE")

Answer 9

Ensure you have the latest TensorFlow 2.x GPU build installed on a GPU-capable machine, then execute the following code in Python:

from __future__ import absolute_import, division, print_function, unicode_literals

import tensorflow as tf 

print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))

You will get an output that looks like:

2020-02-07 10:45:37.587838: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2020-02-07 10:45:37.588896: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0, 1, 2, 3, 4, 5, 6, 7 Num GPUs Available: 8


How to check if PyTorch is using the GPU?

Question: How to check if PyTorch is using the GPU?

I would like to know if pytorch is using my GPU. It’s possible to detect with nvidia-smi if there is any activity from the GPU during the process, but I want something written in a python script.

Is there a way to do so?


Answer 0

This is going to work:

In [1]: import torch

In [2]: torch.cuda.current_device()
Out[2]: 0

In [3]: torch.cuda.device(0)
Out[3]: <torch.cuda.device at 0x7efce0b03be0>

In [4]: torch.cuda.device_count()
Out[4]: 1

In [5]: torch.cuda.get_device_name(0)
Out[5]: 'GeForce GTX 950M'

In [6]: torch.cuda.is_available()
Out[6]: True

This tells me the GPU GeForce GTX 950M is being used by PyTorch.


Answer 1

As it hasn’t been proposed here, I’m adding a method using torch.device, as this is quite handy, also when initializing tensors on the correct device.

# setting device on GPU if available, else CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Using device:', device)
print()

#Additional Info when using cuda
if device.type == 'cuda':
    print(torch.cuda.get_device_name(0))
    print('Memory Usage:')
    print('Allocated:', round(torch.cuda.memory_allocated(0)/1024**3,1), 'GB')
    print('Cached:   ', round(torch.cuda.memory_reserved(0)/1024**3,1), 'GB')

Edit: torch.cuda.memory_cached has been renamed to torch.cuda.memory_reserved. So use memory_cached for older versions.

Output:

Using device: cuda

Tesla K80
Memory Usage:
Allocated: 0.3 GB
Cached:    0.6 GB

As mentioned above, using device it is possible to:

  • Move tensors to the respective device:

      torch.rand(10).to(device)

  • Create a tensor directly on the device:

      torch.rand(10, device=device)

Which makes switching between CPU and GPU comfortable without changing the actual code.


Edit:

As there have been some questions and confusion about cached and allocated memory, I’m adding some additional information about it:


You can either directly hand over a device as specified further above in the post or you can leave it None and it will use the current_device().
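
A minimal sketch of both call styles for the memory queries (the returned values are in bytes):

import torch

if torch.cuda.is_available():
    device = torch.device('cuda')
    # Explicit device argument ...
    print(torch.cuda.memory_allocated(device))
    # ... or None, which falls back to torch.cuda.current_device()
    print(torch.cuda.memory_allocated(None))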


Additional note: Old graphic cards with Cuda compute capability 3.0 or lower may be visible but cannot be used by Pytorch!
Thanks to hekimgil for pointing this out! – “Found GPU0 GeForce GT 750M which is of cuda capability 3.0. PyTorch no longer supports this GPU because it is too old. The minimum cuda capability that we support is 3.5.”


Answer 2

After you start running the training loop, if you want to watch manually from the terminal whether your program is utilizing the GPU resources, and to what extent, you can simply use watch, as in:

$ watch -n 2 nvidia-smi

This will continuously update the usage stats every 2 seconds until you press ctrl+c.


If you need more control over which GPU stats are shown, you can use a more sophisticated version of nvidia-smi with --query-gpu=.... Below is a simple illustration of this:

$ watch -n 3 nvidia-smi --query-gpu=index,gpu_name,memory.total,memory.used,memory.free,temperature.gpu,pstate,utilization.gpu,utilization.memory --format=csv

which would output the stats something like:

Note: There should not be any space between the comma-separated query names in --query-gpu=.... Otherwise those values will be ignored and no stats are returned.


Also, you can check whether your installation of PyTorch detects your CUDA installation correctly by doing:

In [13]: import  torch

In [14]: torch.cuda.is_available()
Out[14]: True

A True status means that PyTorch is configured correctly and can use the GPU, although you still have to move/place the tensors on it with the necessary statements in your code.


If you want to do this inside Python code, then look into this module:

https://github.com/jonsafari/nvidia-ml-py or in pypi here: https://pypi.python.org/pypi/nvidia-ml-py/
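
As a sketch of using that module programmatically (assuming the pynvml bindings that nvidia-ml-py installs; this queries the driver directly, so no GPU memory is allocated):

import pynvml

pynvml.nvmlInit()
for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
    print(i, pynvml.nvmlDeviceGetName(handle), mem.used, '/', mem.total, 'bytes used')
pynvml.nvmlShutdown()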


Answer 3

On the official site, on the Get Started page, you can check GPU support for PyTorch as below:

import torch
torch.cuda.is_available()

Reference: PyTorch | Get Started


Answer 4

From a practical standpoint, just one minor digression:

import torch
dev = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

This dev now knows whether it is cuda or cpu.

And there is a difference in how you deal with models versus tensors when moving to cuda. It is a bit strange at first.

import torch
import torch.nn as nn
dev = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
t1 = torch.randn(1,2)
t2 = torch.randn(1,2).to(dev)
print(t1)  # tensor([[-0.2678,  1.9252]])
print(t2)  # tensor([[ 0.5117, -3.6247]], device='cuda:0')
t1.to(dev) 
print(t1)  # tensor([[-0.2678,  1.9252]]) 
print(t1.is_cuda) # False
t1 = t1.to(dev)
print(t1)  # tensor([[-0.2678,  1.9252]], device='cuda:0') 
print(t1.is_cuda) # True

class M(nn.Module):
    def __init__(self):        
        super().__init__()        
        self.l1 = nn.Linear(1,2)

    def forward(self, x):                      
        x = self.l1(x)
        return x
model = M()   # not on cuda
model.to(dev) # is on cuda (all parameters)
print(next(model.parameters()).is_cuda) # True

This is all a bit tricky, but understanding it once helps you work fast, with less debugging.


Answer 5

To check if there is a GPU available:

torch.cuda.is_available()

If the above function returns False,

  1. you either have no GPU,
  2. or the Nvidia drivers have not been installed so the OS does not see the GPU,
  3. or the GPU is being hidden by the environment variable CUDA_VISIBLE_DEVICES. When the value of CUDA_VISIBLE_DEVICES is -1, all your devices are being hidden. You can check that value in code with this line: os.environ['CUDA_VISIBLE_DEVICES'] (see the sketch below)
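
A small sketch of that check (using os.environ.get to avoid a KeyError when the variable is not set at all):

import os

visible = os.environ.get('CUDA_VISIBLE_DEVICES')
if visible == '-1':
    print('All CUDA devices are hidden')
else:
    print('CUDA_VISIBLE_DEVICES =', visible)  # None means the variable is unset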

If the above function returns True, that does not necessarily mean that you are using the GPU. In Pytorch you can allocate tensors to devices when you create them. By default, tensors get allocated to the cpu. To check where your tensor is allocated, do:

# assuming that 'a' is a tensor created somewhere else
a.device  # returns the device where the tensor is allocated

Note that you cannot operate on tensors allocated in different devices. To see how to allocate a tensor to the GPU, see here: https://pytorch.org/docs/stable/notes/cuda.html


Answer 6

Almost all answers here reference torch.cuda.is_available(). However, that’s only one side of the coin. It tells you whether the GPU (actually CUDA) is available, not whether it’s actually being used. In a typical setup, you would set your device with something like this:

device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

but in larger environments (e.g. research) it is also common to give the user more options, so based on input they can disable CUDA, specify CUDA IDs, and so on. In such a case, whether or not the GPU is used is not only based on whether it is available. After the device has been set to a torch device, you can get its type property to verify whether it’s CUDA or not.

if device.type == 'cuda':
    # do something
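
A sketch of that kind of user-controlled setup (the --disable-cuda flag name is an illustrative choice, not a PyTorch convention):

import argparse
import torch

parser = argparse.ArgumentParser()
parser.add_argument('--disable-cuda', action='store_true',
                    help='disable CUDA even if it is available')
args = parser.parse_args()

if not args.disable_cuda and torch.cuda.is_available():
    device = torch.device('cuda')
else:
    device = torch.device('cpu')

if device.type == 'cuda':
    print('Running on', torch.cuda.get_device_name(device))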

Answer 7

Simply run the following command from a command prompt or Linux shell.

python -c 'import torch; print(torch.cuda.is_available())'

The above should print True

python -c 'import torch; print(torch.rand(2,3).cuda())'

This one should print the following:

tensor([[0.7997, 0.6170, 0.7042], [0.4174, 0.1494, 0.0516]], device='cuda:0')

Answer 8

If you are here because your pytorch always gives False for torch.cuda.is_available(), that’s probably because you installed a pytorch version without GPU support. (E.g.: you coded on a laptop, then are testing on a server.)

The solution is to uninstall pytorch and install it again with the right command from the pytorch downloads page. Also refer to this pytorch issue.


Answer 9

Create a tensor on the GPU as follows:

$ python
>>> import torch
>>> print(torch.rand(3,3).cuda()) 

Do not quit, open another terminal and check if the python process is using the GPU using:

$ nvidia-smi

How to tell if tensorflow is using gpu acceleration from inside the python shell?

Question: How to tell if tensorflow is using gpu acceleration from inside the python shell?

I have installed tensorflow in my ubuntu 16.04 using the second answer here with ubuntu’s builtin apt cuda installation.

Now my question is: how can I test if tensorflow is really using the gpu? I have a gtx 960m gpu. When I import tensorflow, this is the output:

I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally

Is this output enough to check if tensorflow is using the gpu?


Answer 0

No, I don’t think “open CUDA library” is enough to tell, because different nodes of the graph may be on different devices.

To find out which device is used, you can enable log device placement like this:

sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

Check your console for this type of output.


Answer 1

Apart from using sess = tf.Session(config=tf.ConfigProto(log_device_placement=True)) which is outlined in other answers as well as in the official TensorFlow documentation, you can try to assign a computation to the gpu and see whether you have an error.

import tensorflow as tf
with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)

with tf.Session() as sess:
    print (sess.run(c))

Here

  • “/cpu:0”: The CPU of your machine.
  • “/gpu:0”: The GPU of your machine, if you have one.

If you have a gpu and can use it, you will see the result. Otherwise you will see an error with a long stacktrace. In the end you will have something like this:

Cannot assign a device to node ‘MatMul’: Could not satisfy explicit device specification ‘/device:GPU:0’ because no devices matching that specification are registered in this process


Recently a few helpful functions appeared in TF:

You can also check for available devices in the session:

with tf.Session() as sess:
  devices = sess.list_devices()

devices will return you something like

[_DeviceAttributes(/job:tpu_worker/replica:0/task:0/device:CPU:0, CPU, -1, 4670268618893924978),
 _DeviceAttributes(/job:tpu_worker/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 17179869184, 6127825144471676437),
 _DeviceAttributes(/job:tpu_worker/replica:0/task:0/device:XLA_GPU:0, XLA_GPU, 17179869184, 16148453971365832732),
 _DeviceAttributes(/job:tpu_worker/replica:0/task:0/device:TPU:0, TPU, 17179869184, 10003582050679337480),
 _DeviceAttributes(/job:tpu_worker/replica:0/task:0/device:TPU:1, TPU, 17179869184, 5678397037036584928)

Answer 2

The following piece of code should give you all devices available to tensorflow.

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

Sample Output

[name: "/cpu:0" device_type: "CPU" memory_limit: 268435456 locality { } incarnation: 4402277519343584096,

name: "/gpu:0" device_type: "GPU" memory_limit: 6772842168 locality { bus_id: 1 } incarnation: 7471795903849088328 physical_device_desc: "device: 0, name: GeForce GTX 1070, pci bus id: 0000:05:00.0" ]


Answer 3

I think there is an easier way to achieve this.

import tensorflow as tf
if tf.test.gpu_device_name():
    print('Default GPU Device: {}'.format(tf.test.gpu_device_name()))
else:
    print("Please install GPU version of TF")

It usually prints something like:

Default GPU Device: /device:GPU:0

This seems easier to me than reading those verbose logs.


Answer 4

Tensorflow 2.0

Sessions are no longer used in 2.0. Instead, one can use tf.test.is_gpu_available:

import tensorflow as tf

assert tf.test.is_gpu_available()
assert tf.test.is_built_with_cuda()

If you get an error, you need to check your installation.


Answer 5

Will this confirm that tensorflow is using the GPU while training as well?

Code

sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

Output

I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties: 
name: GeForce GT 730
major: 3 minor: 5 memoryClockRate (GHz) 0.9015
pciBusID 0000:01:00.0
Total memory: 1.98GiB
Free memory: 1.72GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GT 730, pci bus id: 0000:01:00.0)
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GeForce GT 730, pci bus id: 0000:01:00.0
I tensorflow/core/common_runtime/direct_session.cc:255] Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GeForce GT 730, pci bus id: 0000:01:00.0

Answer 6

In addition to other answers, the following should help you to make sure that your version of tensorflow includes GPU support.

import tensorflow as tf
print(tf.test.is_built_with_cuda())

Answer 7

Ok, first launch an ipython shell from the terminal and import TensorFlow:

$ ipython --pylab
Python 3.6.5 |Anaconda custom (64-bit)| (default, Apr 29 2018, 16:14:56) 
Type 'copyright', 'credits' or 'license' for more information
IPython 6.4.0 -- An enhanced Interactive Python. Type '?' for help.
Using matplotlib backend: Qt5Agg

In [1]: import tensorflow as tf

Now, we can watch the GPU memory usage in a console using the following command:

# realtime update for every 2s
$ watch -n 2 nvidia-smi

Since we’ve only imported TensorFlow but have not used any GPU yet, the usage stats will be:

Notice how the GPU memory usage is quite low (~700 MB); sometimes the GPU memory usage might even be as low as 0 MB.


Now, let’s load the GPU in our code. As indicated in tf documentation, do:

In [2]: sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

Now, the watch stats should show updated GPU memory usage, as below:

Observe now how our Python process from the ipython shell is using ~ 7 GB of the GPU memory.


P.S. You can continue watching these stats as the code is running, to see how intense the GPU usage is over time.


Answer 8

This should give the list of devices available to Tensorflow (under Py-3.6):

import tensorflow as tf

sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
sess.list_devices()
# _DeviceAttributes(/job:localhost/replica:0/task:0/device:CPU:0, CPU, 268435456)

Answer 9

I prefer to use nvidia-smi to monitor GPU usage. If it goes up significantly when you start your program, it’s a strong sign that your tensorflow is using the GPU.


Answer 10

With the recent updates of Tensorflow, you can check it as follows:

tf.test.is_gpu_available( cuda_only=False, min_cuda_compute_capability=None)

This will return True if GPU is being used by Tensorflow, and return False otherwise.

If you want the device name, you can type: tf.test.gpu_device_name(). Get more details from here.


Answer 11

Run the following in Jupyter,

import tensorflow as tf
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

If you’ve set up your environment properly, you’ll get the following output in the terminal where you ran “jupyter notebook”,

2017-10-05 14:51:46.335323: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Quadro K620, pci bus id: 0000:02:00.0)
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Quadro K620, pci bus id: 0000:02:00.0
2017-10-05 14:51:46.337418: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\direct_session.cc:265] Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Quadro K620, pci bus id: 0000:02:00.0

You can see here that I’m using TensorFlow with an Nvidia Quadro K620.


Answer 12

I find just querying the gpu from the command line is easiest:

nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.98                 Driver Version: 384.98                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 980 Ti  Off  | 00000000:02:00.0  On |                  N/A |
| 22%   33C    P8    13W / 250W |   5817MiB /  6075MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1060      G   /usr/lib/xorg/Xorg                            53MiB |
|    0     25177      C   python                                      5751MiB |
+-----------------------------------------------------------------------------+

If your training runs as a background process, the pid from jobs -p should match one of the pids from nvidia-smi.
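
A sketch of the same check from inside Python (assuming nvidia-smi is on the PATH; --query-compute-apps lists the processes that currently hold a CUDA context):

import os
import subprocess

out = subprocess.check_output(
    ["nvidia-smi", "--query-compute-apps=pid", "--format=csv,noheader"]
).decode("utf-8")
gpu_pids = {int(tok) for tok in out.split()}
print("this process is on the GPU:", os.getpid() in gpu_pids)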


Answer 13

You can check if you are currently using the GPU by running the following code:

import tensorflow as tf
tf.test.gpu_device_name()

If the output is '', it means you are using the CPU only;
If the output is something like /device:GPU:0, it means the GPU is working.


And use the following code to check which GPU you are using:

from tensorflow.python.client import device_lib 
device_lib.list_local_devices()

Answer 14

Put this near the top of your jupyter notebook. Comment out what you don’t need.

# confirm TensorFlow sees the GPU
from tensorflow.python.client import device_lib
assert 'GPU' in str(device_lib.list_local_devices())

# confirm Keras sees the GPU (for TensorFlow 1.X + Keras)
from keras import backend
assert len(backend.tensorflow_backend._get_available_gpus()) > 0

# confirm PyTorch sees the GPU
from torch import cuda
assert cuda.is_available()
assert cuda.device_count() > 0
print(cuda.get_device_name(cuda.current_device()))

NOTE: With the release of TensorFlow 2.0, Keras is now included as part of the TF API.

Originally answered here.


Answer 15

For Tensorflow 2.0

import tensorflow as tf

tf.test.is_gpu_available(
    cuda_only=False,
    min_cuda_compute_capability=None
)

Source here.

Another option is:

tf.config.experimental.list_physical_devices('GPU')

Answer 16

UPDATE FOR TENSORFLOW >= 2.1.

The recommended way in which to check if TensorFlow is using GPU is the following:

tf.config.list_physical_devices('GPU') 

As of TensorFlow 2.1, tf.test.gpu_device_name() has been deprecated in favour of the aforementioned.

Then, in the terminal, you can use nvidia-smi to check how much GPU memory has been allotted; at the same time, watch -n K nvidia-smi would tell you, every K seconds, how much memory you are using (you may want to use K = 1 for near real-time updates).


Answer 17

This is the line I am using to list devices available to tf.session directly from bash:

python -c "import os; os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'; import tensorflow as tf; sess = tf.Session(); [print(x) for x in sess.list_devices()]; print(tf.__version__);"

It will print available devices and tensorflow version, for example:

_DeviceAttributes(/job:localhost/replica:0/task:0/device:CPU:0, CPU, 268435456, 10588614393916958794)
_DeviceAttributes(/job:localhost/replica:0/task:0/device:XLA_GPU:0, XLA_GPU, 17179869184, 12320120782636586575)
_DeviceAttributes(/job:localhost/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 17179869184, 13378821206986992411)
_DeviceAttributes(/job:localhost/replica:0/task:0/device:GPU:0, GPU, 32039954023, 12481654498215526877)
1.14.0

Answer 18

I found the snippet below very handy for testing the gpu:

Tensorflow 2.0 Test

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)

with tf.Session() as sess:
    print (sess.run(c))

Tensorflow 1 Test

import tensorflow as tf
with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)

with tf.Session() as sess:
    print (sess.run(c))

Answer 19

The following will also return the name of your GPU devices.

import tensorflow as tf
tf.test.gpu_device_name()

Answer 20

With tensorflow >= 2.0:

import tensorflow as tf
sess = tf.compat.v1.Session(config=tf.compat.v1.ConfigProto(log_device_placement=True))


Answer 21

You have some options to test whether GPU acceleration is being used by your TensorFlow installation.

You can type the following commands on three different platforms.

import tensorflow as tf
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
  1. Jupyter Notebook – Check the console which is running the Jupyter Notebook. You will be able to see the GPU being used.
  2. Python Shell – You will be able to directly see the output. (Note- do not assign the output of the second command to the variable ‘sess’; if that helps).
  3. Spyder – Type in the following command in the console.

    import tensorflow as tf
    tf.test.is_gpu_available()


Answer 22

Tensorflow 2.1

A simple calculation that can be verified with nvidia-smi for memory usage on the GPU.

import tensorflow as tf 

c1 = []
n = 10

def matpow(M, n):
    if n < 1: #Abstract cases where n < 1
        return M
    else:
        return tf.matmul(M, matpow(M, n-1))

with tf.device('/gpu:0'):
    a = tf.Variable(tf.random.uniform(shape=(10000, 10000)), name="a")
    b = tf.Variable(tf.random.uniform(shape=(10000, 10000)), name="b")
    c1.append(matpow(a, n))
    c1.append(matpow(b, n))

Answer 23

>>> import tensorflow as tf 
>>> tf.config.list_physical_devices('GPU')

2020-05-10 14:58:16.243814: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-05-10 14:58:16.262675: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-10 14:58:16.263119: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1060 6GB computeCapability: 6.1
coreClock: 1.7715GHz coreCount: 10 deviceMemorySize: 5.93GiB deviceMemoryBandwidth: 178.99GiB/s
2020-05-10 14:58:16.263143: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-05-10 14:58:16.263188: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-05-10 14:58:16.264289: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-05-10 14:58:16.264495: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-05-10 14:58:16.265644: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-05-10 14:58:16.266329: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-05-10 14:58:16.266357: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-05-10 14:58:16.266478: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-10 14:58:16.266823: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-10 14:58:16.267107: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

As suggested by @AmitaiIrron:

This section indicates that a GPU was found:

2020-05-10 14:58:16.263119: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:

pciBusID: 0000:01:00.0 name: GeForce GTX 1060 6GB computeCapability: 6.1
coreClock: 1.7715GHz coreCount: 10 deviceMemorySize: 5.93GiB deviceMemoryBandwidth: 178.99GiB/s

And here it got added as an available physical device:

2020-05-10 14:58:16.267107: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

Answer 24

If you are using TensorFlow 2.0, you can list the devices within a session like this:

with tf.compat.v1.Session() as sess:
  devices = sess.list_devices()
devices

Answer 25

If you are using tensorflow 2.x, use:

sess = tf.compat.v1.Session(config=tf.compat.v1.ConfigProto(log_device_placement=True))

Answer 26

Run this command in Jupyter or your IDE to check if Tensorflow is using a GPU or not: tf.config.list_physical_devices('GPU')


Fastai – the fast deep learning library

Welcome to fastai

fastai simplifies training fast and accurate neural nets using modern best practices

Installing

You can use fastai without any installation by using Google Colab. In fact, every page of this documentation is also available as an interactive notebook – click “Open in Colab” at the top of any page to open it (be sure to change the Colab runtime to “GPU” so that it runs fast!). See the fast.ai documentation on Using Colab for more information.

You can install fastai on your own machines with conda (highly recommended), as long as you’re running Linux or Windows (Mac is not supported). For Windows, please see the “Running on Windows” section for important notes.

If you’re using miniconda (recommended) then run (note that if you replace conda with mamba the install process will be much faster and more reliable):

conda install -c fastchan fastai

Or if you’re using Anaconda then run:

conda install -c fastchan fastai anaconda

To install with pip, use: pip install fastai. If you install with pip, you should install PyTorch first by following the PyTorch installation instructions.

If you plan to develop fastai yourself, or want to be on the cutting edge, you can use an editable install (if you do this, you should also use an editable install of fastcore to go with it). First install PyTorch, and then:

git clone https://github.com/fastai/fastai
pip install -e "fastai[dev]"

Learning fastai

The best way to get started with fastai (and deep learning) is to read the book and complete the free course.

To see what’s possible with fastai, take a look at the Quick Start, which shows how to use around 5 lines of code to build an image classifier, an image segmentation model, a text sentiment model, a recommendation system, and a tabular model. For each of the applications, the code is much the same.
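
As a taste of that style, here is a sketch along the lines of the Quick Start image classifier (it uses the Quick Start’s pets example; the exact argument values are illustrative, and on older fastai versions vision_learner was called cnn_learner):

from fastai.vision.all import *

path = untar_data(URLs.PETS) / 'images'

def is_cat(x):
    # In the Oxford-IIIT Pet filenames, cat breeds start with an uppercase letter.
    return x[0].isupper()

dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))

learn = vision_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(1)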

通读一下Tutorials学习如何在您自己的数据集上训练您自己的模型。使用导航侧栏浏览FAST AI文档。这里记录了每个类、函数和方法

要了解图书馆的设计和动机,请阅读peer reviewed paper

关于FAST

Fastai是一个深度学习库,它为从业者提供高级组件,这些组件可以快速、轻松地在标准深度学习领域提供最先进的结果,并为研究人员提供可以混合和匹配的低级组件,以建立新的方法。它的目标是在易用性、灵活性或性能方面不做实质性妥协的情况下完成这两件事。这要归功于精心分层的体系结构,该体系结构以分离的抽象方式表达了许多深度学习和数据处理技术的公共底层模式。通过利用底层Python语言的动态性和PyTorch库的灵活性,可以简洁而清晰地表达这些抽象。FAST AI包括:

  • 一种新的Python类型调度系统和张量的语义类型层次结构
  • 一个GPU优化的计算机视觉库,可以在纯Python中扩展
  • 一种优化器,它将现代优化器的常见功能重构为两个基本部分,允许在4-5行代码中实现优化算法
  • 一种新颖的双向回调系统,可以访问数据、模型或优化器的任何部分,并在培训期间随时更改它们
  • 一种全新的数据挡路接口
  • 还有更多

FAST AI围绕两个主要设计目标进行组织:平易近人和快速高效,同时也是深度可破解和可配置的。它构建在提供可组合构建块的低级API层次结构之上。这样,想要重写高级API的一部分或添加特定行为以满足其需求的用户不必学习如何使用最低级别

从其他库迁移

从普通的PyTorch、Ignite或任何其他基于PyTorch的库迁移非常容易,甚至可以将Fastai与其他库结合使用。通常,您将能够使用所有现有的数据处理代码,但将能够减少培训所需的代码量,并更容易地利用现代最佳实践。以下是一些受欢迎的库中的迁移指南,可帮助您上路:

Windows支持

使用安装时mambaconda替换-c fastchan在使用的安装中-c pytorch -c nvidia -c fastai,因为Windows当前不支持Fastchan

由于Jupyter和Windows上的Python多处理问题,num_workersDataloader自动重置为0,以避免Jupyter挂起。这使得Windows上的Jupyter上的计算机视觉等任务比Linux上的要慢很多倍。如果从脚本使用FastAI,则不存在此限制

See this example to make the most of the fastai API on Windows.

Tests

To run the tests in parallel, launch:

nbdev_test_nbs or make test

For all the tests to pass, you'll need to install the following optional dependencies:

pip install "sentencepiece<0.1.90" wandb tensorboard albumentations pydicom opencv-python scikit-image pyarrow kornia \
    catalyst captum neptune-cli

The tests are written using nbdev; see, for example, the documentation for test_eq.

Contributing

After you clone this repository, please run nbdev_install_git_hooks in your terminal. This sets up git hooks, which clean up the notebooks to remove the extraneous stuff stored in them (e.g. which cells you ran), since that causes unnecessary merge conflicts.

Before submitting a PR, check that the local library and the notebooks match. The script nbdev_diff_nbs can let you know if there is a difference between the local library and the notebooks.

  • If you made a change to the notebooks in one of the exported cells, you can export it to the library with nbdev_build_lib or make fastai.
  • If you made a change to the library, you can export it back to the notebooks with nbdev_update_lib.

Docker Containers

For those interested in official docker containers for this project, they can be found here.

PyTorch – Tensors and dynamic neural networks in Python with strong GPU acceleration

PyTorch is a Python package that provides two high-level features:

  • Tensor computation (like NumPy) with strong GPU acceleration
  • Deep neural networks built on a tape-based autograd system

You can reuse your favorite Python packages such as NumPy, SciPy, and Cython to extend PyTorch when needed.

(CI build-status matrix for Python 3.6 / 3.7 / 3.8 on Linux CPU, Linux GPU, Windows CPU/GPU, Linux (ppc64le) CPU, Linux (ppc64le) GPU, and Linux (aarch64) CPU; status badges omitted.)

See also the ci.pytorch.org HUD.

More About PyTorch

At a granular level, PyTorch is a library that consists of the following components:

Component Description
torch A Tensor library like NumPy, with strong GPU support
torch.autograd A tape-based automatic differentiation library that supports all differentiable Tensor operations in torch
torch.jit A compilation stack (TorchScript) to create serializable and optimizable models from PyTorch code
torch.nn A neural networks library deeply integrated with autograd, designed for maximum flexibility
torch.multiprocessing Python multiprocessing, but with magical memory sharing of torch Tensors across processes. Useful for data loading and Hogwild training
torch.utils DataLoader and other utility functions for convenience

Usually, PyTorch is used either as:

  • A replacement for NumPy, to use the power of GPUs
  • A deep learning research platform that provides maximum flexibility and speed

Elaborating further:

A GPU-Ready Tensor Library

If you use NumPy, then you have used Tensors (a.k.a. ndarray).

PyTorch provides Tensors that can live either on the CPU or the GPU, accelerating computation by a huge amount.

We provide a wide variety of tensor routines to accelerate and fit your scientific computation needs, such as slicing, indexing, math operations, linear algebra, and reductions. And they are fast!
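
A minimal sketch of this CPU/GPU symmetry (the .cuda() calls assume a CUDA-capable GPU is visible):

import torch

x = torch.rand(1000, 1000)
y = torch.rand(1000, 1000)

z_cpu = x @ y                      # matrix multiply on the CPU
if torch.cuda.is_available():
    x_gpu, y_gpu = x.cuda(), y.cuda()
    z_gpu = x_gpu @ y_gpu          # the same routine, now on the GPU
    print(z_gpu.device)            # e.g. cuda:0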

Dynamic Neural Networks: Tape-Based Autograd

PyTorch has a unique way of building neural networks: using and replaying a tape recorder.

Most frameworks, such as TensorFlow, Theano, Caffe, and CNTK, have a static view of the world. One has to build a neural network and reuse the same structure again and again. Changing the way the network behaves means starting from scratch.

With PyTorch, we use a technique called reverse-mode auto-differentiation, which allows you to change the way your network behaves arbitrarily with zero lag or overhead. Our inspiration comes from several research papers on this topic, as well as current and past work such as torch-autograd, autograd, and Chainer.

While this technique is not unique to PyTorch, it's one of the fastest implementations of it to date. You get the best of speed and flexibility for your crazy research.
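
A minimal sketch of what "dynamic" means here: the graph is rebuilt from whatever Python actually executes, so the structure can change on every run (the random layer count below is purely illustrative):

import torch

x = torch.randn(3, requires_grad=True)
n_layers = torch.randint(1, 4, (1,)).item()   # structure decided at run time

y = x
for _ in range(n_layers):    # the "tape" records only the ops that actually ran
    y = torch.tanh(y)

y.sum().backward()           # reverse-mode autodiff replays the tape
print(x.grad)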

Python First

PyTorch is not a Python binding into a monolithic C++ framework. It is built to be deeply integrated into Python. You can use it naturally, as you would use NumPy / SciPy / scikit-learn. You can write your new neural network layers in Python itself, using your favorite libraries, and use packages such as Cython and Numba. Our goal is to not reinvent the wheel where appropriate.

Imperative Experiences

PyTorch is designed to be intuitive, linear in thought, and easy to use. When you execute a line of code, it gets executed. There isn't an asynchronous view of the world. When you drop into a debugger or receive error messages and stack traces, understanding them is straightforward: the stack trace points to exactly where your code was defined. We hope you never spend hours debugging your code because of bad stack traces or asynchronous and opaque execution engines.

Fast and Lean

PyTorch has minimal framework overhead. We integrate acceleration libraries such as Intel MKL and NVIDIA (cuDNN, NCCL) to maximize speed. At the core, its CPU and GPU Tensor and neural network backends (TH, THC, THNN, THCUNN) are mature and have been tested for years.

Hence, PyTorch is quite fast, whether you run small or large neural networks.

The memory usage in PyTorch is extremely efficient compared to Torch or some of the alternatives. We've written custom memory allocators for the GPU to make sure that your deep learning models are maximally memory efficient. This enables you to train bigger deep learning models than before.

Extensions Without Pain

Writing new neural network modules, or interfacing with PyTorch's Tensor API, was designed to be straightforward and with minimal abstractions.

You can write new neural network layers in Python using the torch API, or your favorite NumPy-based libraries such as SciPy; a short sketch follows.
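
A minimal sketch of a new layer written in plain Python (the Affine class is our own illustration, not a PyTorch built-in):

import torch
import torch.nn as nn

class Affine(nn.Module):
    """Custom layer computing y = x @ W + b, with parameters tracked by autograd."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(in_features, out_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        return x @ self.weight + self.bias

layer = Affine(4, 2)
out = layer(torch.randn(8, 4))   # backward() works automatically via autograd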

If you want to write your layers in C/C++, we provide a convenient extension API that is efficient and has minimal boilerplate. No wrapper code needs to be written. You can see a tutorial here and an example here.

Installation

Binaries

Commands to install from binaries via Conda or pip wheels are available on our website: https://pytorch.org

NVIDIA Jetson Platforms

Python wheels for NVIDIA's Jetson Nano, Jetson TX2, and Jetson AGX Xavier are available via the following URLs:

They require JetPack 4.2 and above, and @dusty-nv maintains them.

From Source

If you are installing from source, you will need Python 3.6.2 or later and a C++14 compiler. We also highly recommend installing an Anaconda environment: you will get a high-quality BLAS library (MKL) and controlled dependency versions regardless of your Linux distro.

Once you have Anaconda installed, here are the instructions.

If you want to compile with CUDA support, install:

If you want to disable CUDA support, export the environment variable USE_CUDA=0. Other potentially useful environment variables may be found in setup.py.

If you are building for NVIDIA's Jetson platforms (Jetson Nano, TX1, TX2, AGX Xavier), instructions to install PyTorch for Jetson Nano are available here.

If you want to compile with ROCm support, install:

  • AMD ROCm 4.0 and above
  • ROCm is currently supported only for Linux systems

If you want to disable ROCm support, export the environment variable USE_ROCM=0. Other potentially useful environment variables may be found in setup.py.

Install Dependencies

Common

conda install astunparse numpy ninja pyyaml mkl mkl-include setuptools cmake cffi typing_extensions future six requests dataclasses

On Linux

# CUDA only: Add LAPACK support for the GPU if needed
conda install -c pytorch magma-cuda110  # or the magma-cuda* that matches your CUDA version from https://anaconda.org/pytorch/repo

On macOS

# Add these packages if torch.distributed is needed
conda install pkg-config libuv

On Windows

# Add these packages if torch.distributed is needed.
# Distributed package support on Windows is a prototype feature and is subject to changes.
conda install -c conda-forge libuv=1.39

Get the PyTorch Source

git clone --recursive https://github.com/pytorch/pytorch
cd pytorch
# if you are updating an existing checkout
git submodule sync
git submodule update --init --recursive --jobs 0

Install PyTorch

On Linux

export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
python setup.py install

Note that if you are compiling for ROCm, you must run this command first:

python tools/amd_build/build_amd.py

Note that if you are using Anaconda's Python, you may encounter an error caused by the linker:

build/temp.linux-x86_64-3.7/torch/csrc/stub.o: file not recognized: file format not recognized
collect2: error: ld returned 1 exit status
error: command 'g++' failed with exit status 1

This is caused by ld from the Conda environment shadowing the system ld. You should use a newer version of Python that fixes this issue. The recommended Python versions are 3.6.10+, 3.7.6+, and 3.8.1+.

On macOS

export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ python setup.py install

Each CUDA version only supports one particular Xcode version. The following combinations have been reported to work with PyTorch.

CUDA version Xcode version
10.0 Xcode 9.4
10.1 Xcode 10.1

On Windows

Choose the Correct Visual Studio Version

Sometimes there are regressions in new versions of Visual Studio, so it's best to use the same Visual Studio version 16.8.5 as PyTorch CI. Although PyTorch CI uses Visual Studio BuildTools, you can use Visual Studio Enterprise, Professional, or Community.

If you want to build legacy Python code, please refer to Building on legacy code and CUDA.

Build with CPU

It's fairly easy to build with CPU.

Note on OpenMP: The desired OpenMP implementation is Intel OpenMP (iomp). In order to link against iomp, you'll need to manually download the library and set up the build environment by tweaking CMAKE_INCLUDE_PATH and LIB. The instruction here is an example of setting up both MKL and Intel OpenMP. Without these CMake configurations, the Microsoft Visual C OpenMP runtime (vcomp) will be used.

Build with CUDA

NVTX is needed to build PyTorch with CUDA. NVTX is part of the CUDA distribution, where it is called "Nsight Compute". To install it onto already installed CUDA, run the CUDA installation once again and check the corresponding checkbox. Make sure that CUDA with Nsight Compute is installed after Visual Studio.

Currently, VS 2017 / 2019 are supported, and Ninja is supported as the generator of CMake. If ninja.exe is detected in PATH, Ninja will be used as the default generator; otherwise VS 2017 / 2019 will be used.
If Ninja is selected as the generator, the latest MSVC will be selected as the underlying toolchain.

Additional libraries such as Magma, oneDNN (also known as MKLDNN or DNNL), and Sccache are often needed. Please refer to the installation helper to install them.

You can refer to the build_pytorch.bat script for some other environment-variable configurations.

cmd

:: [Optional] If you want to build with the VS 2017 generator for old CUDA and PyTorch, please change the value in the next line to `Visual Studio 15 2017`.
:: Note: This value is useless if Ninja is detected. However, you can force that by using `set USE_NINJA=OFF`.
set CMAKE_GENERATOR=Visual Studio 16 2019

:: Read the content in the previous section carefully before you proceed.
:: [Optional] If you want to override the underlying toolset used by Ninja and Visual Studio with CUDA, please run the following script block.
:: "Visual Studio 2019 Developer Command Prompt" will be run automatically.
:: Make sure you have CMake >= 3.12 before you do this when you use the Visual Studio generator.
set CMAKE_GENERATOR_TOOLSET_VERSION=14.27
set DISTUTILS_USE_SDK=1
for /f "usebackq tokens=*" %i in (`"%ProgramFiles(x86)%\Microsoft Visual Studio\Installer\vswhere.exe" -version [15^,16^) -products * -latest -property installationPath`) do call "%i\VC\Auxiliary\Build\vcvarsall.bat" x64 -vcvars_ver=%CMAKE_GENERATOR_TOOLSET_VERSION%

:: [Optional] If you want to override the CUDA host compiler
set CUDAHOSTCXX=C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.27.29110\bin\HostX64\x64\cl.exe

python setup.py install

Adjust Build Options (Optional)

You can adjust the configuration of cmake variables optionally (without building first) by doing the following. For example, the pre-detected directories for CuDNN or BLAS can be adjusted with such a step.

On Linux

export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
python setup.py build --cmake-only
ccmake build  # or cmake-gui build

On macOS

export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ python setup.py build --cmake-only
ccmake build  # or cmake-gui build

Docker Image

Using pre-built images

You can also pull a pre-built docker image from Docker Hub and run it with docker v19.03+:

docker run --gpus all --rm -ti --ipc=host pytorch/pytorch:latest

Please note that PyTorch uses shared memory to share data between processes, so if torch multiprocessing is used (e.g. for multithreaded data loaders), the default shared memory segment size the container runs with is not enough; you should increase the shared memory size with either the --ipc=host or the --shm-size command line option to nvidia-docker run.

Building the image yourself

NOTE: The image must be built with a docker version > 18.06.

The Dockerfile is supplied to build images with CUDA 11.1 support and cuDNN v8. You can pass the PYTHON_VERSION=x.y make variable to specify which Python version is to be used by Miniconda, or leave it unset to use the default.

make -f docker.Makefile
# images are tagged as docker.io/${your_docker_username}/pytorch

Building the Documentation

To build documentation in various formats, you will need Sphinx and the readthedocs theme.

cd docs/
pip install -r requirements.txt

You can then build the documentation by running make <format> from the docs/ folder. Run make to get a list of all available output formats.

If you get a katex error, run npm install katex. If it persists, try npm install -g katex.

Previous Versions

Installation instructions and binaries for previous PyTorch versions may be found on our website.

Getting Started

Three pointers to get you started:

Resources

Communication

Releases and Contributing

PyTorch has a 90-day release cycle (major releases). Please let us know if you encounter a bug by filing an issue.

We appreciate all contributions. If you are planning to contribute bug-fixes, please do so without any further discussion.

If you plan to contribute new features, utility functions, or extensions to the core, please first open an issue and discuss the feature with us. Sending a PR without discussion might end up in a rejected PR, because we might be taking the core in a different direction than you are aware of.

To learn more about making a contribution to PyTorch, please see our Contribution page.

The Team

PyTorch is a community-driven project with several skillful engineers and researchers contributing to it.

PyTorch is currently maintained by Adam Paszke, Sam Gross, Soumith Chintala, and Gregory Chanan, with major contributions coming from hundreds of talented individuals in various forms and means. A non-exhaustive list worth mentioning: Trevor Killeen, Sasank Chilamkurthy, Sergey Zagoruyko, Adam Lerer, Francisco Massa, Alykhan Tejani, Luca Antiga, Alban Desmaison, Andreas Koepf, James Bradbury, Zeming Lin, Yuandong Tian, Guillaume Lample

Note: This project is unrelated to hughperkins/pytorch, which shares the same name. Hugh is a valued contributor to the Torch community and has helped with many things Torch and PyTorch.

License

PyTorch has a BSD-style license, as found in the LICENSE file.