December 19, 2023

Writing a C/C++ extension for Python code

There are a few reasons why you might want to write a C/C++ extension for your Python code. An extension can implement new built-in object types, make system calls, or call C/C++ library functions. As an ML engineer, you will most likely use the last of these to speed up your algorithms. Most commonly used algorithms are already implemented in highly optimized libraries such as OpenCV, SciPy, scikit-image, NumPy, or pandas, so there is usually no need to write your own extensions. However, especially in the field of computer vision, you may encounter obstacles that a C/C++ extension can overcome. In my work, I had to use a flood-filling algorithm to segment 3D volumes. scikit-image offers a flood-filling algorithm, but it leaves no room for customization, and my custom Python implementation was extremely slow. I therefore decided to write my own Python extension in C. The C algorithm was 100 times faster than the Python equivalent. Pretty impressive.

There are many approaches to integrating C/C++ code with Python. I will briefly go through a selection of them and show how the binding between C/C++ and Python can be created.

Ctypes

Ctypes is the simplest solution: it is great if you want to send a matrix to C, do heavy calculations, and get the results back. The main drawback is that ctypes supports only C, so if you would like to use C++, you need to write a C bridge to your C++ code or choose another solution.

Ctypes should be used for small extensions only.

Create the ctypes_example.c file:

Create the makefile:

Run make in a terminal. It will create the py_ext.so file in the current directory.

Create ctypes_example.py:

Then run the Python file. A simple NumPy array editing script is ready.
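
To try the ctypes calling pattern without compiling anything, here is a minimal sketch that targets the C standard library instead of py_ext.so. It assumes a Unix-like system where libc can be loaded. The helper name fill_in_c is hypothetical, and memset merely stands in for a custom C function that edits a buffer in place.

```python
import ctypes
import ctypes.util

# Assumption: a Unix-like system. find_library("c") locates libc; if it
# returns None, CDLL(None) falls back to the symbols already loaded into
# the running process, which include libc.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature so ctypes converts the arguments correctly:
#   void *memset(void *s, int c, size_t n);
libc.memset.argtypes = [ctypes.c_void_p, ctypes.c_int, ctypes.c_size_t]
libc.memset.restype = ctypes.c_void_p


def fill_in_c(values, fill_value):
    """Copy byte values (0-255) into a C buffer, let C overwrite it in place, copy back."""
    # Allocate a C array of unsigned bytes and initialise it from the Python list.
    buffer = (ctypes.c_uint8 * len(values))(*values)
    # The C function receives a pointer to the buffer and edits the memory directly.
    libc.memset(buffer, fill_value, ctypes.sizeof(buffer))
    # Read the modified memory back into a Python list.
    return list(buffer)


print(fill_in_c([1, 2, 3, 4], 9))  # prints [9, 9, 9, 9]
```

A function from your own shared library would be called in the same way: load the .so with ctypes.CDLL, declare argtypes and restype, then pass the buffer.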

Python C API

This is currently the Python-recommended way of writing C and C++ extensions. Alternatively, you can use Cython, a language that sits between Python and C. Cython is great for writing code quickly: it definitely takes less code to get things done in Cython than in C. However, this comes at the cost of speed, and it is often hard to match the performance of C code in Cython.

To write the Python C API extension, create the py_objects.c file:

Compile the C code with the following command:

Finally, create the Python file py_objects.py:

Boost Python

Boost gives a very C++-like interface between C++ and Python. It provides a direct bridge between std::vector<>, std::map<>, and lists and dictionaries in Python. Moreover, C++ code automatically shares reference counts in smart pointers with Python reference counts and allows C++ virtual methods to be overridden by Python classes.

On Debian-based Linux distributions, Boost can be installed with:

sudo apt-get install libboost-all-dev

Create boost_python.cpp:

Create the makefile:

where the PYTHON_VERSION_SUFFIX in -lboost_python{PYTHON_VERSION_SUFFIX} and -lboost_numpy{PYTHON_VERSION_SUFFIX} matches the system Python version (these libraries are usually located in /usr/lib by default), and -I/home/<user>/anaconda3/envs/sample_env/include/python3.10 is the path to the Python include directory.

Create boost_python.py:

Then run the Python file. A simple NumPy array editing script is ready. The benefit of using Boost is that we work in C++ in a very natural way.

Replacing the makefile with setup.py

When we create a Python project, we don’t want to bother with compiling C code manually. The makefile can be replaced with the well-known setup.py Python script.

After that, we just need to install the package through setup.py. We can wrap building and testing the extension in a short makefile, as presented below:

Conclusions

In this article, I presented the motivation for integrating C/C++ extensions with Python code, along with a few examples. If you have a small function that does heavy calculations, the easiest way is to go for ctypes. The more Pythonic way of writing an extension is usually to use the Python C API or Cython. Cython is fast to write in but might not match the performance of C code. The Python C API gives the best speed but requires more code. Lastly, Boost provides very C++-like integration with Python code. It supports overriding virtual methods in Python classes, a direct bridge between Python lists and C++ vectors, and more. Each use case should be treated individually and the best method chosen for it.