Introducing an easy-to-use skeleton package
For one of my academic papers – GenSVM – I implemented the method in a C library for speed. Although this library comes with a number of executables, I wanted to create a Python package for this method as well to make it easier to use. Since the library uses a lot of linear algebra methods, the Python package needed to work nicely with NumPy arrays as well.
One of the things that turned out to be quite a bit harder than expected was structuring the package and creating a setup.py file that compiles both the C library and the Cython extension, and installs everything properly. After quite a bit of trial and error, I eventually arrived at a setup.py file that works. Below, I’ll present a skeleton package that wraps a C library as a NumPy extension for use in Python. You can find the complete package on GitHub, ready to be extended for your own use.
The following directory structure is what I used:
├── Makefile
├── numpy_c_skeleton
│   ├── core.py
│   └── __init__.py
├── README.rst
├── setup.py
├── src
│   ├── c_package
│   │   ├── c_package_helper.c
│   │   ├── include
│   │   │   └── code.h
│   │   └── src
│   │       └── code.c
│   ├── cython_wrapper.pxd
│   └── cython_wrapper.pyx
└── tests
    └── test_package.py
All the Python code is in the numpy_c_skeleton directory, the Cython code is in the top level of the src directory, and the C library is in the src/c_package directory. If you’re using Git, it might be a good idea to add the C library as a Git submodule.
For this illustration the C code only has a single function for computing the
sum of a NumPy array. However, it should be straightforward to extend this for
your own C library. In the skeleton package each file is extensively
documented to describe what it does, but here is a brief description:
numpy_c_skeleton/core.py is the Python code that you want to expose to users. This works just as in a regular Python package, and you’re not restricted to having only one file here. For this example package, the core.py file defines a single function which uses the code from the Cython wrapper.
src/cython_wrapper.pxd is a Cython pxd file, which is like a C header file. This is where you declare where each function or structure that you want to use comes from.
src/cython_wrapper.pyx is the main Cython wrapper file. It contains a function that takes a NumPy array and calls the external function from the C helper file on it.
src/c_package/c_package_helper.c adds a layer between the C library and the Cython wrapper code. It’s not strictly necessary, but it can help to expose only those parts of your C library to Cython that you want, or to make sure you don’t have to change anything in your library to work nicely with Cython.
src/c_package/include contains the header files for your C library. In this case there is only one file, which contains the declaration of the function defined in src/c_package/src/code.c.
src/c_package/src contains the source files for your C library. Note that this code can be agnostic to the fact that NumPy is used for the arrays. This makes it easy to reuse a library in a Python package.
tests/test_package.py contains unit tests, which show how the function from the core.py file can be used.
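To make the layering described above a little more concrete, here is a rough pure-Python stand-in for the three layers. It is not how the skeleton package does it (the real wrapper is written in Cython and the library in C). The function names library_sum, wrapper_sum and compute_sum are hypothetical, and ctypes and array are used only so the sketch runs without compiling anything. The point it tries to show is that the library routine only ever sees a pointer and a length, while the wrapper is the layer that knows about the array object.

```python
import ctypes
from array import array


def library_sum(values, n):
    """Stand-in for the C routine: it only sees a pointer and a length."""
    total = 0.0
    for i in range(n):
        total += values[i]
    return total


def wrapper_sum(data):
    """Stand-in for the wrapper layer: hand the raw buffer to the library."""
    buf = array('d', data)  # contiguous block of C doubles
    n = len(buf)
    # View the buffer as a C array of doubles and take a pointer to it
    c_array = (ctypes.c_double * n).from_buffer(buf)
    pointer = ctypes.cast(c_array, ctypes.POINTER(ctypes.c_double))
    return library_sum(pointer, n)


def compute_sum(data):
    """Stand-in for the public function that core.py would expose."""
    return wrapper_sum(data)


print(compute_sum([1.0, 2.0, 3.5]))
```

Because library_sum never touches the array object itself, the same routine could be called from a plain C program, which is the kind of reuse the list above refers to.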
As mentioned, the setup.py file can be a bit tricky to get right in terms of compiling the extension. In the end, I settled on the following:
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import os
import re

import numpy

from setuptools import setup
from numpy.distutils.misc_util import Configuration

# Set this to True to enable building extensions using Cython. Set it to False
# to build extensions from the C file (that was previously generated using
# Cython). Set it to 'auto' to build with Cython if available, otherwise from
# the C file.
USE_CYTHON = 'auto'

# If we are in a release, we never use Cython directly
IS_RELEASE = os.path.exists('PKG-INFO')
if IS_RELEASE:
    USE_CYTHON = False

# If we do want to use Cython, we double check if it is available
if USE_CYTHON:
    try:
        from Cython.Build import cythonize
    except ImportError:
        if USE_CYTHON == 'auto':
            USE_CYTHON = False
        else:
            raise


def configuration():
    # This is the numpy Configuration class
    config = Configuration('numpy_c_skeleton', '', None)

    # Wrapper code in Cython uses the .pyx extension if we want to USE_CYTHON,
    # otherwise it ends in .c.
    wrapper_ext = '*.pyx' if USE_CYTHON else '*.c'

    # Sources include the C/Cython code from the wrapper and the source code of
    # the C library
    sources = [
        os.path.join('src', wrapper_ext),
        os.path.join('src', 'c_package', 'src', '*.c'),
    ]

    # Dependencies are the header files of the C library and any potential
    # helper code between the library and the Cython code
    depends = [
        os.path.join('src', 'c_package', 'include', '*.h'),
        os.path.join('src', 'c_package', 'c_package_helper.c'),
    ]

    # Register the extension
    config.add_extension('cython_wrapper',
                         sources=sources,
                         include_dirs=[
                             os.path.join('src', 'c_package'),
                             os.path.join('src', 'c_package', 'include'),
                             numpy.get_include(),
                         ],
                         depends=depends)

    # Cythonize if necessary
    if USE_CYTHON:
        config.ext_modules = cythonize(config.ext_modules)

    return config


def read(fname):
    # Read a file relative to the directory that contains this setup.py
    with open(os.path.join(os.path.dirname(__file__), fname)) as fid:
        return fid.read()


if __name__ == '__main__':
    # Pull the version from the package __init__.py
    init_py = read(os.path.join('numpy_c_skeleton', '__init__.py'))
    version = re.search("__version__ = '([^']+)'", init_py).group(1)

    # Load the configuration
    attr = configuration().todict()

    # Add the other setup attributes
    attr['description'] = 'Python package skeleton for numpy C extension'
    attr['long_description'] = read('README.rst')
    attr['packages'] = ['numpy_c_skeleton']
    attr['version'] = version
    attr['author'] = "G.J.J. van den Burg"
    attr['author_email'] = "gertjanvandenburg@gmail.com"
    attr['license'] = 'GPL v2'
    attr['install_requires'] = ['numpy']
    attr['zip_safe'] = True

    # Finally, run the setup command
    setup(**attr)
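As a side note on the version lookup at the end of the script: the sketch below applies the same regular expression to a made-up __init__.py. The file contents and the find_version helper are assumptions for illustration only. It also adds an explicit error for the no-match case, which the script above does not have.

```python
import re

# Hypothetical contents of numpy_c_skeleton/__init__.py
init_source = "from .core import *\n\n__version__ = '0.1.0'\n"


def find_version(source):
    """Extract the version string using the same pattern as setup.py."""
    match = re.search("__version__ = '([^']+)'", source)
    if match is None:
        raise RuntimeError("no __version__ = '...' line found")
    return match.group(1)


print(find_version(init_source))
```

Note that the pattern expects single quotes and exactly one space on either side of the equals sign, so a line such as __version__ = "0.1.0" would not match.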
This setup.py file uses Cython to compile the *.pyx files when in development mode. In release mode (detected by the presence of a PKG-INFO file), the previously generated “cythonized” C files are used instead. This way the cythonize function is run only when needed. Note further that the regular setup() function arguments are supplied as entries of the attr dictionary.
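The decision between the two modes can be summarised as a small function. The sketch below restates the module-level logic of the script in a form that is easy to test. The names resolve_use_cython and wrapper_pattern are hypothetical and do not appear in the skeleton package, and the cython_available flag stands in for the try/except around the Cython import.

```python
import os


def resolve_use_cython(setting, is_release, cython_available):
    """Return True to build from the .pyx files, False to use the C files.

    setting is True, False or 'auto', with the same meaning as USE_CYTHON.
    """
    if is_release:
        # A release ships the generated C files, so Cython is never run
        return False
    if not setting:
        return False
    if cython_available:
        return True
    if setting == 'auto':
        # Fall back to the generated C files when Cython is missing
        return False
    raise ImportError("Cython was requested but is not installed")


def wrapper_pattern(use_cython):
    """Glob pattern for the wrapper sources in the src directory."""
    return os.path.join('src', '*.pyx' if use_cython else '*.c')


print(wrapper_pattern(resolve_use_cython('auto', False, True)))
```

With the 'auto' setting, a development checkout with Cython installed builds from the .pyx file, while a checkout without Cython or an unpacked release falls back to the generated C file.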
This is it! I hope this skeleton package helps you with adding your C/C++ libraries to a Python package. If so, you can “star” the repository on GitHub to let me know.