如何构造包含Cython代码的Python包

问题:如何构造包含Cython代码的Python包

我想制作一个包含一些Cython代码的Python包。我的Cython代码运行良好。但是,现在我想知道如何最好地打包它。

对于大多数只想安装软件包的人,我想包括.cCython创建的文件,并安排对其setup.py进行编译以生成模块。然后,用户不需要安装Cython即可安装软件包。

但是对于那些可能想要修改程序包的人,我也想提供Cython .pyx文件,并以某种方式还允许setup.py使用Cython构建它们(因此这些用户需要安装Cython)。

我应该如何构造软件包中的文件以适应这两种情况?

用Cython文档提供了一些指导。但这并没有说明如何制作一个setup.py处理Cython情况的单例。

I’d like to make a Python package containing some Cython code. I’ve got the the Cython code working nicely. However, now I want to know how best to package it.

For most people who just want to install the package, I’d like to include the .c file that Cython creates, and arrange for setup.py to compile that to produce the module. Then the user doesn’t need Cython installed in order to install the package.

But for people who may want to modify the package, I’d also like to provide the Cython .pyx files, and somehow also allow for setup.py to build them using Cython (so those users would need Cython installed).

How should I structure the files in the package to cater for both these scenarios?

The Cython documentation gives a little guidance. But it doesn’t say how to make a single setup.py that handles both the with/without Cython cases.


回答 0

我现在已经在Python程序包simplerandomBitBucket repo-编辑:now github)中亲自完成了这个任务(我不希望这是一个受欢迎的程序包,但这是学习Cython的好机会)。

此方法依赖于以下事实:.pyx使用Cython.Distutils.build_ext(至少使用Cython版本0.14)构建文件似乎总是.c在与源.pyx文件相同的目录中创建文件。

这是一个精简版setup.py,我希望其中显示要点:

from distutils.core import setup
from distutils.extension import Extension

try:
    from Cython.Distutils import build_ext
except ImportError:
    use_cython = False
else:
    use_cython = True

cmdclass = {}
ext_modules = []

if use_cython:
    ext_modules += [
        Extension("mypackage.mycythonmodule", ["cython/mycythonmodule.pyx"]),
    ]
    cmdclass.update({'build_ext': build_ext})
else:
    ext_modules += [
        Extension("mypackage.mycythonmodule", ["cython/mycythonmodule.c"]),
    ]

setup(
    name='mypackage',
    ...
    cmdclass=cmdclass,
    ext_modules=ext_modules,
    ...
)

我还进行了编辑,MANIFEST.in以确保将mycythonmodule.c其包含在源分发中(使用创建的源分发python setup.py sdist):

...
recursive-include cython *
...

我不承诺mycythonmodule.c版本控制“ trunk”(或Mercurial的“ default”)。发布时,我需要记住先进行操作python setup.py build_ext,以确保mycythonmodule.c该源代码是最新的并且是最新的。我还创建了一个release分支,并将C文件提交到该分支中。这样,我就拥有与该发行版一起分发的C文件的历史记录。

I’ve done this myself now, in a Python package simplerandom (BitBucket repo – EDIT: now github) (I don’t expect this to be a popular package, but it was a good chance to learn Cython).

This method relies on the fact that building a .pyx file with Cython.Distutils.build_ext (at least with Cython version 0.14) always seems to create a .c file in the same directory as the source .pyx file.

Here is a cut-down version of setup.py which I hope shows the essentials:

from distutils.core import setup
from distutils.extension import Extension

try:
    from Cython.Distutils import build_ext
except ImportError:
    use_cython = False
else:
    use_cython = True

cmdclass = {}
ext_modules = []

if use_cython:
    ext_modules += [
        Extension("mypackage.mycythonmodule", ["cython/mycythonmodule.pyx"]),
    ]
    cmdclass.update({'build_ext': build_ext})
else:
    ext_modules += [
        Extension("mypackage.mycythonmodule", ["cython/mycythonmodule.c"]),
    ]

setup(
    name='mypackage',
    ...
    cmdclass=cmdclass,
    ext_modules=ext_modules,
    ...
)

I also edited MANIFEST.in to ensure that mycythonmodule.c is included in a source distribution (a source distribution that is created with python setup.py sdist):

...
recursive-include cython *
...

I don’t commit mycythonmodule.c to version control ‘trunk’ (or ‘default’ for Mercurial). When I make a release, I need to remember to do a python setup.py build_ext first, to ensure that mycythonmodule.c is present and up-to-date for the source code distribution. I also make a release branch, and commit the C file into the branch. That way I have a historical record of the C file that was distributed with that release.


回答 1

克雷格·麦昆(Craig McQueen)的答案有所添加:请参见下文,了解如何覆盖sdist命令以使Cython在创建源代码分发之前自动编译源文件。

这样一来,您就可以避免意外分发过期C资源的风险。在您对分发过程的控制有限的情况下(例如,通过持续集成自动创建分发时),这也很有帮助。

from distutils.command.sdist import sdist as _sdist

...

class sdist(_sdist):
    def run(self):
        # Make sure the compiled Cython files in the distribution are up-to-date
        from Cython.Build import cythonize
        cythonize(['cython/mycythonmodule.pyx'])
        _sdist.run(self)
cmdclass['sdist'] = sdist

Adding to Craig McQueen’s answer: see below for how to override the sdist command to have Cython automatically compile your source files before creating a source distribution.

That way your run no risk of accidentally distributing outdated C sources. It also helps in the case where you have limited control over the distribution process e.g. when automatically creating distributions from continuous integration etc.

from distutils.command.sdist import sdist as _sdist

...

class sdist(_sdist):
    def run(self):
        # Make sure the compiled Cython files in the distribution are up-to-date
        from Cython.Build import cythonize
        cythonize(['cython/mycythonmodule.pyx'])
        _sdist.run(self)
cmdclass['sdist'] = sdist

回答 2

http://docs.cython.org/en/latest/src/userguide/source_files_and_compilation.html#distributing-cython-modules

强烈建议您分发生成的.c文件以及Cython源,以便用户无需使用Cython即可安装模块。

还建议您分发的版本中默认不启用Cython编译。即使用户安装了Cython,他也可能不想仅使用它来安装模块。另外,他使用的版本可能与您使用的版本不同,并且可能无法正确编译您的源代码。

这只是意味着您附带的setup.py文件将只是生成的.c文件上的常规distutils文件,对于基本示例,我们将使用:

from distutils.core import setup
from distutils.extension import Extension
 
setup(
    ext_modules = [Extension("example", ["example.c"])]
)

http://docs.cython.org/en/latest/src/userguide/source_files_and_compilation.html#distributing-cython-modules

It is strongly recommended that you distribute the generated .c files as well as your Cython sources, so that users can install your module without needing to have Cython available.

It is also recommended that Cython compilation not be enabled by default in the version you distribute. Even if the user has Cython installed, he probably doesn’t want to use it just to install your module. Also, the version he has may not be the same one you used, and may not compile your sources correctly.

This simply means that the setup.py file that you ship with will just be a normal distutils file on the generated .c files, for the basic example we would have instead:

from distutils.core import setup
from distutils.extension import Extension
 
setup(
    ext_modules = [Extension("example", ["example.c"])]
)

回答 3

最简单的方法是同时包含两者,而仅使用c文件?包括.pyx文件是不错的选择,但是无论如何只要有了.c文件就不需要了。想要重新编译.pyx的人可以安装Pyrex并手动进行。

否则,您需要有一个用于distutils的自定义build_ext命令,该命令首先生成C文件。Cython已经包含一个。http://docs.cython.org/src/userguide/source_files_and_compilation.html

该文档没有做的是说如何使其成为条件,但是

try:
     from Cython.distutils import build_ext
except ImportError:
     from distutils.command import build_ext

应该处理。

The easiest is to include both but just use the c-file? Including the .pyx file is nice, but it’s not needed once you have the .c file anyway. People who want to recompile the .pyx can install Pyrex and do it manually.

Otherwise you need to have a custom build_ext command for distutils that builds the C file first. Cython already includes one. http://docs.cython.org/src/userguide/source_files_and_compilation.html

What that documentation doesn’t do is say how to make this conditional, but

try:
     from Cython.distutils import build_ext
except ImportError:
     from distutils.command import build_ext

Should handle it.


回答 4

包含(Cython)生成的.c文件非常奇怪。尤其是当我们在git中包含它时。我更喜欢使用setuptools_cython。当Cython不可用时,它将构建一个具有内置Cython环境的鸡蛋,然后使用该鸡蛋构建代码。

一个可能的示例:https : //github.com/douban/greenify/blob/master/setup.py


更新(2017-01-05):

因为setuptools 18.0,没有必要使用setuptools_cython是一个从头开始构建Cython项目而无需的示例setuptools_cython

Including (Cython) generated .c files are pretty weird. Especially when we include that in git. I’d prefer to use setuptools_cython. When Cython is not available, it will build an egg which has built-in Cython environment, and then build your code using the egg.

A possible example: https://github.com/douban/greenify/blob/master/setup.py


Update(2017-01-05):

Since setuptools 18.0, there’s no need to use setuptools_cython. Here is an example to build Cython project from scratch without setuptools_cython.


回答 5

这是我编写的安装脚本,它使在构建中包括嵌套目录更加容易。需要从一个程序包中的文件夹运行它。

Givig结构如下:

__init__.py
setup.py
test.py
subdir/
      __init__.py
      anothertest.py

setup.py

from setuptools import setup, Extension
from Cython.Distutils import build_ext
# from os import path
ext_names = (
    'test',
    'subdir.anothertest',       
) 

cmdclass = {'build_ext': build_ext}
# for modules in main dir      
ext_modules = [
    Extension(
        ext,
        [ext + ".py"],            
    ) 
    for ext in ext_names if ext.find('.') < 0] 
# for modules in subdir ONLY ONE LEVEL DOWN!! 
# modify it if you need more !!!
ext_modules += [
    Extension(
        ext,
        ["/".join(ext.split('.')) + ".py"],     
    )
    for ext in ext_names if ext.find('.') > 0]

setup(
    name='name',
    ext_modules=ext_modules,
    cmdclass=cmdclass,
    packages=["base", "base.subdir"],
)
#  Build --------------------------
#  python setup.py build_ext --inplace

编译愉快;)

This is a setup script I wrote which makes it easier to include nested directories inside the build. One needs to run it from folder within a package.

Givig structure like this:

__init__.py
setup.py
test.py
subdir/
      __init__.py
      anothertest.py

setup.py

from setuptools import setup, Extension
from Cython.Distutils import build_ext
# from os import path
ext_names = (
    'test',
    'subdir.anothertest',       
) 

cmdclass = {'build_ext': build_ext}
# for modules in main dir      
ext_modules = [
    Extension(
        ext,
        [ext + ".py"],            
    ) 
    for ext in ext_names if ext.find('.') < 0] 
# for modules in subdir ONLY ONE LEVEL DOWN!! 
# modify it if you need more !!!
ext_modules += [
    Extension(
        ext,
        ["/".join(ext.split('.')) + ".py"],     
    )
    for ext in ext_names if ext.find('.') > 0]

setup(
    name='name',
    ext_modules=ext_modules,
    cmdclass=cmdclass,
    packages=["base", "base.subdir"],
)
#  Build --------------------------
#  python setup.py build_ext --inplace

Happy compiling ;)


回答 6

我想到的简单技巧:

from distutils.core import setup

try:
    from Cython.Build import cythonize
except ImportError:
    from pip import pip

    pip.main(['install', 'cython'])

    from Cython.Build import cythonize


setup(…)

如果无法导入,只需安装Cython。一个人可能不应该共享此代码,但是对于我自己的依赖关系来说已经足够了。

The simple hack I came up with:

from distutils.core import setup

try:
    from Cython.Build import cythonize
except ImportError:
    from pip import pip

    pip.main(['install', 'cython'])

    from Cython.Build import cythonize


setup(…)

Just install Cython if it could not be imported. One should probably not share this code, but for my own dependencies it’s good enough.


回答 7

所有其他答案都依赖

  • 发行版
  • 从导入Cython.Build,在通过cython setup_requires导入和导入cython之间会产生鸡与蛋的问题。

一种现代的解决方案是改用setuptools,请参见以下答案(自动处理Cython扩展需要setuptools 18.0,即,它已经可用了很多年)。setup.py具有需求处理,入口点和cython模块的现代标准可能如下所示:

from setuptools import setup, Extension

with open('requirements.txt') as f:
    requirements = f.read().splitlines()

setup(
    name='MyPackage',
    install_requires=requirements,
    setup_requires=[
        'setuptools>=18.0',  # automatically handles Cython extensions
        'cython>=0.28.4',
    ],
    entry_points={
        'console_scripts': [
            'mymain = mypackage.main:main',
        ],
    },
    ext_modules=[
        Extension(
            'mypackage.my_cython_module',
            sources=['mypackage/my_cython_module.pyx'],
        ),
    ],
)

All other answers either rely on

  • distutils
  • importing from Cython.Build, which creates a chicken-and-egg problem between requiring cython via setup_requires and importing it.

A modern solution is to use setuptools instead, see this answer (automatic handling of Cython extensions requires setuptools 18.0, i.e., it’s available for many years already). A modern standard setup.py with requirements handling, an entry point, and a cython module could look like this:

from setuptools import setup, Extension

with open('requirements.txt') as f:
    requirements = f.read().splitlines()

setup(
    name='MyPackage',
    install_requires=requirements,
    setup_requires=[
        'setuptools>=18.0',  # automatically handles Cython extensions
        'cython>=0.28.4',
    ],
    entry_points={
        'console_scripts': [
            'mymain = mypackage.main:main',
        ],
    },
    ext_modules=[
        Extension(
            'mypackage.my_cython_module',
            sources=['mypackage/my_cython_module.pyx'],
        ),
    ],
)

回答 8

我发现仅使用setuptools而非功能受限的distutils的最简单方法是

from setuptools import setup
from setuptools.extension import Extension
try:
    from Cython.Build import cythonize
except ImportError:
    use_cython = False
else:
    use_cython = True

ext_modules = []
if use_cython:
    ext_modules += cythonize('package/cython_module.pyx')
else:
    ext_modules += [Extension('package.cython_module',
                              ['package/cython_modules.c'])]

setup(name='package_name', ext_modules=ext_modules)

The easiest way I found using only setuptools instead of the feature limited distutils is

from setuptools import setup
from setuptools.extension import Extension
try:
    from Cython.Build import cythonize
except ImportError:
    use_cython = False
else:
    use_cython = True

ext_modules = []
if use_cython:
    ext_modules += cythonize('package/cython_module.pyx')
else:
    ext_modules += [Extension('package.cython_module',
                              ['package/cython_modules.c'])]

setup(name='package_name', ext_modules=ext_modules)

回答 9

我想我通过提供自定义build_ext命令找到了一种很好的方法。这个想法如下:

  1. 我通过重写finalize_options()import numpy在函数的主体中添加numpy标头,很好地避免了numpy在setup()安装之前不可用的问题。

  2. 如果cython在系统上可用,它将挂接到命令的check_extensions_list()方法中,并通过cython化所有过时的cython模块,将其替换为C扩展,稍后可通过该build_extension() 方法处理。我们也只是在模块中提供功能的后一部分:这意味着,如果cython不可用,但是我们有C扩展名,它仍然可以工作,从而可以进行源代码分发。

这是代码:

import re, sys, os.path
from distutils import dep_util, log
from setuptools.command.build_ext import build_ext

try:
    import Cython.Build
    HAVE_CYTHON = True
except ImportError:
    HAVE_CYTHON = False

class BuildExtWithNumpy(build_ext):
    def check_cython(self, ext):
        c_sources = []
        for fname in ext.sources:
            cname, matches = re.subn(r"(?i)\.pyx$", ".c", fname, 1)
            c_sources.append(cname)
            if matches and dep_util.newer(fname, cname):
                if HAVE_CYTHON:
                    return ext
                raise RuntimeError("Cython and C module unavailable")
        ext.sources = c_sources
        return ext

    def check_extensions_list(self, extensions):
        extensions = [self.check_cython(ext) for ext in extensions]
        return build_ext.check_extensions_list(self, extensions)

    def finalize_options(self):
        import numpy as np
        build_ext.finalize_options(self)
        self.include_dirs.append(np.get_include())

这样一来,人们就可以编写setup()参数而不必担心导入以及是否有可用的cython的问题:

setup(
    # ...
    ext_modules=[Extension("_my_fast_thing", ["src/_my_fast_thing.pyx"])],
    setup_requires=['numpy'],
    cmdclass={'build_ext': BuildExtWithNumpy}
    )

I think I found a pretty good way of doing this by providing a custom build_ext command. The idea is the following:

  1. I add the numpy headers by overriding finalize_options() and doing import numpy in the body of the function, which nicely avoids the problem of numpy not being available before setup() installs it.

  2. If cython is available on the system, it hooks into the command’s check_extensions_list() method and by cythonizes all out-of-date cython modules, replacing them with C extensions that can later handled by the build_extension() method. We just provide the latter part of the functionality in our module too: this means that if cython is not available but we have a C extension present, it still works, which allows you to do source distributions.

Here’s the code:

import re, sys, os.path
from distutils import dep_util, log
from setuptools.command.build_ext import build_ext

try:
    import Cython.Build
    HAVE_CYTHON = True
except ImportError:
    HAVE_CYTHON = False

class BuildExtWithNumpy(build_ext):
    def check_cython(self, ext):
        c_sources = []
        for fname in ext.sources:
            cname, matches = re.subn(r"(?i)\.pyx$", ".c", fname, 1)
            c_sources.append(cname)
            if matches and dep_util.newer(fname, cname):
                if HAVE_CYTHON:
                    return ext
                raise RuntimeError("Cython and C module unavailable")
        ext.sources = c_sources
        return ext

    def check_extensions_list(self, extensions):
        extensions = [self.check_cython(ext) for ext in extensions]
        return build_ext.check_extensions_list(self, extensions)

    def finalize_options(self):
        import numpy as np
        build_ext.finalize_options(self)
        self.include_dirs.append(np.get_include())

This allows one to just write the setup() arguments without worrying about imports and whether one has cython available:

setup(
    # ...
    ext_modules=[Extension("_my_fast_thing", ["src/_my_fast_thing.pyx"])],
    setup_requires=['numpy'],
    cmdclass={'build_ext': BuildExtWithNumpy}
    )