'Protecting' Python code with Cython and Docker
I was looking for a way to distribute some proprietary code to third parties with some level of IP protection. Knowing that this is a losing battle, and suspecting that the goal was more about limiting curiosity than stopping deliberate reverse engineering, I decided to investigate using Cython to compile Python, via C, into a binary.
This repo demonstrates the process of building a Flask app into a Docker container with a compiled binary of the application using Cython. It also uses a multi-stage Docker build to ensure that the original source files are never copied into the target image.
For more thorough obfuscation, I would suggest looking into a more robust solution such as this.
How it works
There is a single compilation step, defined in setup.py, that refers to the src/api.pyx file. The setup script uses Cython to do the build and outputs a binary library file, which can later be referenced via gunicorn or imported from other Python files as a normal module.
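# distutils was removed in Python 3.12; setuptools provides the same setup()
from setuptools import setup
from Cython.Distutils import build_ext
from Cython.Build import cythonize

setup(
    name='Hello Flask',
    cmdclass={'build_ext': build_ext},
    # Compile src/api.pyx to C (written under build/) and then to a shared library
    ext_modules=cythonize(
        "src/api.pyx",
        compiler_directives={'language_level': "3"},
        build_dir="build",
    ),
)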
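To illustrate what "a normal module reference" means in practice, here is a minimal sketch of how a compiled module can be located on disk once the build step has run. Python's import system accepts a fixed set of platform-specific filename suffixes for extension modules, so `import api` resolves to the compiled binary when a matching file is on the path. The helper name `find_compiled_module` is hypothetical, and the assumption that the binary is named after `api` and sits in the current directory follows from the `--inplace` build and the `gunicorn api:app` command rather than from anything verified here.

```python
import importlib.machinery
from pathlib import Path


def find_compiled_module(name, search_dir):
    """Return the path of a compiled extension for `name` in `search_dir`, or None."""
    directory = Path(search_dir)
    # EXTENSION_SUFFIXES lists the filename endings the import system accepts
    # for extension modules on the current platform and interpreter.
    for suffix in importlib.machinery.EXTENSION_SUFFIXES:
        candidate = directory / (name + suffix)
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Assumption: the module is called "api" and the in-place build left the
    # binary in the current working directory.
    found = find_compiled_module("api", ".")
    print(found if found else "no compiled 'api' module found here")
```

If a path is printed, `import api` from that directory should load the binary rather than any source file, which is also what `gunicorn api:app` relies on.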
Local Build
pip install -r requirements.txt
python setup.py build_ext --inplace
Confirming the binary is built
As you can see, the binary is compiled as a shared object file. The following screenshot shows a hex dump of this file. Although it would be possible to reverse engineer it, the effort required would be considerable.
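The same check can be done without a hex editor. The sketch below is one possible way to do it, not the method used for the screenshot: it reads the first bytes of a file, matches them against the well-known magic numbers of native binary formats, and renders a small hex dump. The function names `identify_binary` and `hex_dump` are hypothetical, and the `api*.so` glob pattern assumes a Linux or macOS in-place build in the current directory.

```python
from pathlib import Path

# Leading "magic" bytes of common native binary formats.
MAGIC_NUMBERS = {
    b"\x7fELF": "ELF (Linux shared object)",
    b"\xcf\xfa\xed\xfe": "Mach-O 64-bit (macOS)",
    b"MZ": "PE (Windows DLL/.pyd)",
}


def identify_binary(path):
    """Guess the binary format of `path` from its first four bytes."""
    with open(path, "rb") as handle:
        header = handle.read(4)
    for magic, label in MAGIC_NUMBERS.items():
        if header.startswith(magic):
            return label
    return "unknown"


def hex_dump(data, width=16):
    """Format `data` as offset, hex bytes and printable ASCII, one row per `width` bytes."""
    pad = width * 3
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        rows.append(f"{offset:08x}  {hex_part:<{pad}} {text_part}")
    return "\n".join(rows)


if __name__ == "__main__":
    # Assumption: the in-place build produced a file matching api*.so here.
    for built in Path(".").glob("api*.so"):
        print(built, "->", identify_binary(built))
        with open(built, "rb") as handle:
            print(hex_dump(handle.read(64)))
```

A compiled module should be reported as a native format, whereas a plain `.py` or `.pyx` file would come back as "unknown" and show readable source text in the dump.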
Running the local binary with gunicorn
gunicorn api:app
Building the image for docker
Build single-arch image, load to local docker
docker buildx build --platform linux/arm64 --load --tag helloflask .
docker buildx build --platform linux/amd64 --load --tag helloflask .
Build multi-arch image
docker buildx build --platform linux/arm64,linux/amd64 --tag helloflask .
Running the image
docker run -p 80:80 -it helloflask
Confirming the binary is in the image
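One way to check this is to walk the application directory inside the container and separate compiled binaries from readable source files. The sketch below is only an illustration: `audit_tree` is a hypothetical helper, the suffix lists are a guess at what counts as "source" for this project, and the directory to inspect depends on the Dockerfile, which is not reproduced here.

```python
import os

# Assumption: these suffixes cover the source and binary files of interest.
SOURCE_SUFFIXES = (".py", ".pyx", ".c")
BINARY_SUFFIXES = (".so", ".pyd")


def audit_tree(root):
    """Split files under `root` into readable sources and compiled binaries."""
    sources, binaries = [], []
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            full_path = os.path.join(folder, name)
            if name.endswith(BINARY_SUFFIXES):
                binaries.append(full_path)
            elif name.endswith(SOURCE_SUFFIXES):
                sources.append(full_path)
    return sorted(sources), sorted(binaries)
```

Run against the application directory of the image, the expected result is a binaries list containing the compiled `api` module and a sources list without `api.pyx`. Pointing it at the whole filesystem would also list every installed dependency, so a narrow root is more useful.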
Conclusion
Although this is literally the definition of security through obscurity, for many use cases this might be enough.
I haven't gone beyond a hello world example of this approach, so there are likely deficiencies in it. I'm not sure, for example, what would happen with more complex dependencies. The fact that Flask worked more or less out of the box is encouraging, though.
I've stored this in a public repo, so if you have any suggestions or improvements, please feel free to raise an issue or PR.