Recently, I came across Cython and started experimenting with it. After some basic testing, I found several things of interest.
- Used properly, Cython is a fantastic way to speed up Python code.
- It is extremely liberal in what Python code it will accept, but works best when the code is tweaked for Cython.
- Cython works extremely well, but it is not magic and can reduce performance and make code more brittle if not used carefully.
- Really, use the profiler before doing any serious optimization work.
What is Cython?
Cython is a programming language that compiles to C or C++ code, which can then be compiled with a normal C/C++ compiler. It supports optional static typing of variables with C types in order to improve performance. One common use is to create optimized extension modules that are imported into Python. Cython also seems able to create standalone executables, but I have not tried that yet.
The story and what I found.
When Cython caught my interest, I read about it on the web and played just a bit with some of the examples that were posted. Then I decided I should play with it in something a bit closer to the real world and preferably with code that I had written.
I had a small project lying around that I had been toying with, and it seemed perfect. The entire thing was under a thousand lines of actual working code, cleanly split into two files: one that did all the work and one that just supported a PyQt GUI. The module with the working functions could be exercised completely separately from the GUI, I had unit tests for all the important functions, and I was unhappy with the performance. This seemed like a perfect opportunity to test Cython.
I knew that under normal circumstances my first step in optimizing my simple program should be to run it through the profiler. I've read that many times and I've seen good reasons for it in practice. But this time my goal was to play with Cython; if it fixed the performance issues in the process, that would be a nice bonus. So I was going to mess with Cython no matter what the profiler said, and I didn't see a reason to bother with it at this point.
So I forked the project into a new folder and set aside the GUI, focusing on the working module for now. I pulled out nearly all of the code and put it into a new .pyx file. I left the original .py file as a small shell that set a couple of parameters and called the main function. Then at the top of that file I added:
import pyximport; pyximport.install()
import (the pyx module)
Then I ran it, and it took substantially longer than the original pure Python file. It didn't take much looking to see that this was because of the time spent compiling the new pyx module. So I ran it again, and the time dropped to only a fraction of a second slower than the original, but it was still slower than pure Python. Just to play with it, I built a setup.py and made a .pyd file to take pyximport entirely out of the equation. In this case, it didn't make any measurable difference, but I have not done extensive testing on it, and there are cases where a setup.py is needed to compile the file.
These initial tests showed me two things immediately. The first is that Cython is extremely liberal in what it accepts. The Python code I ran through the Cython compiler included imports of various libraries and other idiomatic Python features, and Cython handled it all without complaint. I saw references saying that Cython is not a full Python implementation (at least not yet) and that some types of valid Python will fail, but it certainly seems to handle most cases. All of the unit tests continued to pass at this point. The second is that while many examples show that Cython can speed up certain types of Python code with no changes at all, it is not magic, and certain types of Python programs can be slowed down by using Cython.
Along those lines, I played with it by passing in the “wrong” type. Namely, I passed in a float where I had defined an int in the Cython code. The Python version happily switched to outputting float values. The Cython version executed without errors but the output remained an int and was now substantially rounded compared to the results from the Python version.
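The effect can be imitated in plain Python. In the sketch below, scale_dynamic and scale_as_c_int are hypothetical names, and the explicit int() call is only a stand-in for the coercion to a C int that I saw in the compiled code. It is not Cython itself, and a different Cython version may behave differently.

```python
def scale_dynamic(value, factor=3):
    """Pure-Python behaviour: the result type follows the argument type."""
    return value * factor


def scale_as_c_int(value, factor=3):
    """Stand-in for a parameter declared as a C int.

    int() drops the fractional part (truncation toward zero), which imitates
    the silent loss of precision described above. Assumption: this mimics
    the observed behaviour; it is not code generated by Cython.
    """
    return int(value) * factor


dynamic_result = scale_dynamic(2.75)   # stays a float
typed_result = scale_as_c_int(2.75)    # fraction is lost before multiplying

print(type(dynamic_result).__name__, dynamic_result)
print(type(typed_result).__name__, typed_result)
```

The point is that nothing fails loudly: both calls succeed, and only a comparison of the outputs reveals the difference.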
So I pulled out just one class that mostly did different types of arithmetic. I went through it meticulously, changing def to cpdef and adding types everywhere it made any sense to do so. I tried that, and there was still no speedup.
So I finally ran it through the profiler, something that under normal circumstances would have been my first action. I found that the vast majority of the time was being spent in a function of PIL that got called several times (though the functions from the class did seem to be minutely faster). This proved a solid reminder of the importance of profiling before doing any kind of optimization. Here, the time certainly wasn’t wasted since it gave me a chance to play with Cython, but had I started with the profiler I might readily have decided this wasn’t a great test case and found a different one.
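For anyone who has not used it, a minimal profiling run with the standard library's cProfile looks roughly like this. The functions cheap_arithmetic and expensive_call are hypothetical stand-ins for my arithmetic class and the slow library call, not the real code.

```python
import cProfile
import io
import pstats


def cheap_arithmetic(n):
    """Hypothetical stand-in for the small arithmetic methods."""
    return sum(i * i for i in range(n))


def expensive_call(n):
    """Hypothetical stand-in for the slow library function."""
    return sorted(str(i) for i in range(n))


def main():
    cheap_arithmetic(1_000)
    for _ in range(5):
        expensive_call(20_000)


profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

# Sort by cumulative time so the functions that dominate the run come first.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(10)
print(report.getvalue())
```

If the top of the report is dominated by a function you did not write and cannot compile, as it was for me, typing your own code will not help much.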
To keep playing with it, I set up a new Python file that imported that class in both its pure Python version and its Cython version with type declarations, and ran some of the methods a few thousand times with timeit. This showed that the Cython version ran in about a tenth of the time.
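A timing harness along these lines is enough for that kind of comparison. This is a sketch rather than my actual test file: PureVector, its dot method, and the vector_cy module mentioned in the comments are hypothetical names, and only the pure Python side is run here.

```python
import timeit


class PureVector:
    """Hypothetical stand-in for the arithmetic class."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def dot(self, other):
        return self.x * other.x + self.y * other.y


def time_dot(vector_cls, number=5_000, repeat=3):
    """Return the best total time in seconds for `number` calls to dot()."""
    a = vector_cls(1.5, 2.5)
    b = vector_cls(3.0, 4.0)
    # Taking the minimum of several repeats reduces background noise.
    return min(timeit.repeat(lambda: a.dot(b), number=number, repeat=repeat))


pure_seconds = time_dot(PureVector)
print(f"pure Python: {pure_seconds:.6f} s")

# With a compiled module available (hypothetical name `vector_cy`), the same
# harness would time the typed class:
#     from vector_cy import CyVector
#     print(f"Cython: {time_dot(CyVector):.6f} s")
```

Passing the class into the harness keeps the measured code identical for both versions, so any difference comes from the implementation and not from the test setup.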
Cython seems to have a great ability to increase the speed of Python code and will almost certainly be a tool I use repeatedly. That said, getting the full advantage of it requires some additional work. Also, the first step in almost any sensible optimization plan should be to apply the profiler.