Tag archive: system-information

How to find out the number of CPUs using Python

Question: How to find out the number of CPUs using Python

I want to know the number of CPUs on the local machine using Python. The result should be user/real as output by time(1) when called with an optimally scaling userspace-only program.


Answer 0

If you have Python 2.6 or newer, you can simply use

import multiprocessing

multiprocessing.cpu_count()

http://docs.python.org/library/multiprocessing.html#multiprocessing.cpu_count


Answer 1

If you’re interested in the number of processors available to your current process, you have to check cpuset first. Otherwise (or if cpuset is not in use), multiprocessing.cpu_count() is the way to go in Python 2.6 and newer. The following function checks cpuset first and falls back to several alternative methods for older versions of Python:

import os
import re
import subprocess


def available_cpu_count():
    """ Number of available virtual or physical CPUs on this system, i.e.
    user/real as output by time(1) when called with an optimally scaling
    userspace-only program"""

    # cpuset
    # cpuset may restrict the number of *available* processors
    try:
        m = re.search(r'(?m)^Cpus_allowed:\s*(.*)$',
                      open('/proc/self/status').read())
        if m:
            res = bin(int(m.group(1).replace(',', ''), 16)).count('1')
            if res > 0:
                return res
    except IOError:
        pass

    # Python 2.6+
    try:
        import multiprocessing
        return multiprocessing.cpu_count()
    except (ImportError, NotImplementedError):
        pass

    # https://github.com/giampaolo/psutil
    try:
        import psutil
        return psutil.cpu_count()   # psutil.NUM_CPUS on old versions
    except (ImportError, AttributeError):
        pass

    # POSIX
    try:
        res = int(os.sysconf('SC_NPROCESSORS_ONLN'))

        if res > 0:
            return res
    except (AttributeError, ValueError):
        pass

    # Windows
    try:
        res = int(os.environ['NUMBER_OF_PROCESSORS'])

        if res > 0:
            return res
    except (KeyError, ValueError):
        pass

    # jython
    try:
        from java.lang import Runtime
        runtime = Runtime.getRuntime()
        res = runtime.availableProcessors()
        if res > 0:
            return res
    except ImportError:
        pass

    # BSD
    try:
        sysctl = subprocess.Popen(['sysctl', '-n', 'hw.ncpu'],
                                  stdout=subprocess.PIPE)
        scStdout = sysctl.communicate()[0]
        res = int(scStdout)

        if res > 0:
            return res
    except (OSError, ValueError):
        pass

    # Linux
    try:
        res = open('/proc/cpuinfo').read().count('processor\t:')

        if res > 0:
            return res
    except IOError:
        pass

    # Solaris
    try:
        pseudoDevices = os.listdir('/devices/pseudo/')
        res = 0
        for pd in pseudoDevices:
            if re.match(r'^cpuid@[0-9]+$', pd):
                res += 1

        if res > 0:
            return res
    except OSError:
        pass

    # Other UNIXes (heuristic)
    try:
        try:
            dmesg = open('/var/run/dmesg.boot').read()
        except IOError:
            dmesgProcess = subprocess.Popen(['dmesg'], stdout=subprocess.PIPE,
                                            universal_newlines=True)  # text, not bytes
            dmesg = dmesgProcess.communicate()[0]

        res = 0
        while '\ncpu' + str(res) + ':' in dmesg:
            res += 1

        if res > 0:
            return res
    except OSError:
        pass

    raise Exception('Can not determine number of CPUs on this system')
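
The cpuset branch above reads a hexadecimal bit mask from /proc/self/status and counts its set bits. As a minimal sketch of just that step, the helper below (count_allowed_cpus is a hypothetical name, not part of the answer) takes the file contents as a string, so it can be tried on sample text without a Linux machine:

```python
import re


def count_allowed_cpus(status_text):
    """Count the set bits in the Cpus_allowed mask of /proc/<pid>/status text.

    Returns None when the field is missing (for example on non-Linux systems).
    """
    m = re.search(r'(?m)^Cpus_allowed:\s*(.*)$', status_text)
    if m is None:
        return None
    # The mask is hexadecimal; on wide machines it is split into
    # comma-separated 32-bit groups, so the commas are dropped first.
    mask = int(m.group(1).replace(',', ''), 16)
    return bin(mask).count('1')


# Made-up sample: mask 0xf has four bits set, i.e. CPUs 0-3 are allowed.
sample = "Name:\tpython3\nCpus_allowed:\t0000000f\nCpus_allowed_list:\t0-3\n"
print(count_allowed_cpus(sample))  # prints 4
```

The pattern ends with a colon after Cpus_allowed, so the neighbouring Cpus_allowed_list line is not matched by mistake.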

Answer 2

Another option is to use the psutil library, which always turns out to be useful in these situations:

>>> import psutil
>>> psutil.cpu_count()
2

This should work on any platform supported by psutil (Unix and Windows).

Note that on some occasions multiprocessing.cpu_count may raise a NotImplementedError while psutil will be able to obtain the number of CPUs. This is simply because psutil first tries to use the same techniques used by multiprocessing and, if those fail, it also uses other techniques.
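
A rough sketch of how that fallback could look in calling code. The function name cpu_count_with_fallback is made up for illustration, and psutil is treated as optional, so the function returns None when neither source gives an answer:

```python
import multiprocessing


def cpu_count_with_fallback():
    """Try multiprocessing first, then psutil if it happens to be installed."""
    try:
        return multiprocessing.cpu_count()
    except NotImplementedError:
        pass  # multiprocessing could not determine the count
    try:
        import psutil  # third-party package, may be missing
    except ImportError:
        return None
    return psutil.cpu_count()  # may itself be None if undetermined


print(cpu_count_with_fallback())
```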


Answer 3

In Python 3.4+: os.cpu_count().

multiprocessing.cpu_count() is implemented in terms of this function but raises NotImplementedError if os.cpu_count() returns None (“can’t determine number of CPUs”).
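
Because os.cpu_count() can return None, code that sizes a worker pool usually needs a default. A minimal sketch, where worker_count and its default of 1 are illustrative choices rather than anything from the answer:

```python
import os


def worker_count(default=1):
    """Return os.cpu_count(), or `default` when the count is unknown."""
    n = os.cpu_count()  # None means "can't determine number of CPUs"
    return default if n is None else n


print(worker_count())
```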


Answer 4

len(os.sched_getaffinity(0)) is what you usually want

https://docs.python.org/3/library/os.html#os.sched_getaffinity

os.sched_getaffinity(0) (added in Python 3.3) returns the set of CPUs available to the process, taking into account the sched_setaffinity Linux system call, which limits which CPUs a process and its children can run on.

0 means to get the value for the current process. The function returns a set() of allowed CPUs, thus the need for len().

multiprocessing.cpu_count() on the other hand just returns the total number of logical CPUs in the system, regardless of affinity.

The difference is especially important because certain cluster management systems such as Platform LSF limit job CPU usage with sched_setaffinity.

Therefore, if you use multiprocessing.cpu_count(), your script might try to use way more cores than it has available, which may lead to overload and timeouts.

We can see the difference concretely by restricting the affinity with the taskset utility.

For example, if I restrict Python to just 1 core (core 0) in my 16 core system:

taskset -c 0 ./main.py

with the test script:

main.py

#!/usr/bin/env python3

import multiprocessing
import os

print(multiprocessing.cpu_count())
print(len(os.sched_getaffinity(0)))

then the output is:

16
1

nproc however does respect the affinity by default and:

taskset -c 0 nproc

outputs:

1

and man nproc makes that quite explicit:

print the number of processing units available

nproc has the --all flag for the less common case where you want the number of installed processors, ignoring affinity:

taskset -c 0 nproc --all

The only downside of this method is that it appears to be UNIX only. I suppose Windows must have a similar affinity API, possibly SetProcessAffinityMask, so I wonder why it hasn’t been ported. But I know nothing about Windows.

Tested in Ubuntu 16.04, Python 3.5.2.
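
The same experiment can be sketched from inside Python, without taskset, by pinning the current process with os.sched_setaffinity and restoring the mask afterwards. This assumes a platform that provides the sched_* functions (mainly Linux). The helper name affinity_demo is made up, and it returns None where the calls are missing or not permitted:

```python
import os


def affinity_demo():
    """Return (allowed_before, allowed_while_pinned), or None if unsupported."""
    if not hasattr(os, "sched_getaffinity"):
        return None  # e.g. macOS or Windows
    original = os.sched_getaffinity(0)  # 0 means the calling process
    try:
        os.sched_setaffinity(0, {min(original)})  # pin to a single CPU
    except OSError:
        return None  # the environment does not allow changing affinity
    try:
        pinned = len(os.sched_getaffinity(0))
    finally:
        os.sched_setaffinity(0, original)  # restore the original mask
    return len(original), pinned


print(affinity_demo())
```

On an ordinary Linux machine the second number should be 1 while os.cpu_count() is unchanged, which mirrors the taskset example above.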


Answer 5

Platform independent (logical=False asks for physical cores rather than logical CPUs):

psutil.cpu_count(logical=False)

https://github.com/giampaolo/psutil/blob/master/INSTALL.rst
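
To see the difference between the two modes side by side, something like the following could be used. It is only a sketch: cpu_counts is a hypothetical helper, psutil must be installed separately, and either value may be None when psutil cannot determine it.

```python
def cpu_counts():
    """Return logical and physical CPU counts via psutil, or None without psutil."""
    try:
        import psutil  # third-party package, may be missing
    except ImportError:
        return None
    return {
        "logical": psutil.cpu_count(logical=True),    # includes hyperthreads
        "physical": psutil.cpu_count(logical=False),  # physical cores only
    }


print(cpu_counts())
```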


Answer 6

These give you the hyperthreaded CPU count

  1. multiprocessing.cpu_count()
  2. os.cpu_count()

These give you the virtual machine CPU count

  1. psutil.cpu_count()
  2. numexpr.detect_number_of_cores()

This only matters if you work on VMs.


Answer 7

multiprocessing.cpu_count() will return the number of logical CPUs, so if you have a quad-core CPU with hyperthreading, it will return 8. If you want the number of physical CPUs, use the python bindings to hwloc:

#!/usr/bin/env python
import hwloc
topology = hwloc.Topology()
topology.load()
print(topology.get_nbobjs_by_type(hwloc.OBJ_CORE))

hwloc is designed to be portable across OSes and architectures.
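
If hwloc is not available, a rough Linux-only alternative (a different technique from hwloc, not part of the answer) is to count the distinct (physical id, core id) pairs in /proc/cpuinfo. The sketch below assumes the x86 layout of that file and takes its contents as a string. count_physical_cores is a made-up name, and it returns None when the fields are absent, as on many ARM systems:

```python
def count_physical_cores(cpuinfo_text):
    """Count distinct (physical id, core id) pairs in /proc/cpuinfo text."""
    cores = set()
    # Each logical processor is described by one blank-line-separated block.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "physical id" in fields and "core id" in fields:
            cores.add((fields["physical id"], fields["core id"]))
    return len(cores) or None


# Made-up sample: one socket, two cores, two hyperthreads per core.
sample = "\n\n".join(
    "processor\t: %d\nphysical id\t: 0\ncore id\t\t: %d" % (i, i % 2)
    for i in range(4)
)
print(count_physical_cores(sample))  # prints 2
```

On a real machine the input would come from open('/proc/cpuinfo').read().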


Answer 8

Can’t figure out how to add to the code or reply to the message but here’s support for jython that you can tack in before you give up:

# jython
try:
    from java.lang import Runtime
    runtime = Runtime.getRuntime()
    res = runtime.availableProcessors()
    if res > 0:
        return res
except ImportError:
    pass

Answer 9

This may work for those of us who use different OSes/systems, but want to get the best of all worlds:

import os
workers = os.cpu_count()
if hasattr(os, 'sched_getaffinity'):  # not available on every platform
    workers = len(os.sched_getaffinity(0))

Answer 10

You can also use “joblib” for this purpose.

import joblib
print(joblib.cpu_count())

This method will give you the number of CPUs in the system. joblib needs to be installed, though. More information on joblib can be found here: https://pythonhosted.org/joblib/parallel.html

Alternatively, you can use the numexpr package. It has a lot of simple functions that help get information about the system CPU.

import numexpr as ne
print(ne.detect_number_of_cores())

Answer 11

Another option if you don’t have Python 2.6:

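import commands  # Python 2 only; this module was removed in Python 3
# Anchor the pattern so only the per-CPU "processor" lines are counted (Linux only)
n = int(commands.getoutput("grep -c ^processor /proc/cpuinfo"))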