'Protecting' python code with Cython and Docker

Shaw Innes
Builder of things

Security Through Obscurity

I was looking for a way to distribute some proprietary code to third parties with some level of IP protection. I knew this was a losing battle, but I suspected the goal was more about limiting curiosity than stopping deliberate reverse engineering, so I decided to investigate using Cython to compile Python, via C, into a binary.

This repo demonstrates the process of building a Flask app into a Docker container with a compiled binary of the application using Cython. It also uses a multi-stage docker build to ensure that the original source files are never copied into the target image.

For more thorough obfuscation, I would suggest looking into a more robust solution such as this.

How it works

The build is a single compilation step defined in setup.py, which points Cython at the src/api.pyx file. Cython translates that file to C and compiles it into a binary shared library, which can later be referenced by gunicorn, or imported from other Python files like any normal module.

setup.py
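# setuptools replaces distutils, which was removed in Python 3.12
from setuptools import setup
from Cython.Build import cythonize
from Cython.Distutils import build_ext

setup(
    name='Hello Flask',
    cmdclass={'build_ext': build_ext},
    # Translate src/api.pyx to C (generated files go in build/), then compile it
    ext_modules=cythonize(
        "src/api.pyx",
        compiler_directives={'language_level': "3"},
        build_dir="build",
    ),
)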
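The contents of src/api.pyx aren't shown in this post. As a rough idea of what gets compiled, a hello world Flask app along these lines would do. This is a minimal sketch, not the repo's actual file: the route and the response text are assumptions, and the only name taken from the post is app, which gunicorn looks up via api:app. Plain Python like this is generally valid Cython input.

```python
# src/api.pyx (hypothetical contents, written as ordinary Python)
from flask import Flask

# gunicorn's "api:app" argument refers to this module-level object
app = Flask(__name__)


@app.route("/")
def hello():
    # Assumed response body for a hello world example
    return "Hello, World!"
```

Nothing in the file is Cython-specific; the .pyx extension simply marks it as input for the Cython compiler.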

Local Build

pip install -r requirements.txt
python setup.py build_ext --inplace

Confirming the binary is built

As you can see, the build produces a shared object file; the following screenshot shows a hex dump of it. Although it would be possible to reverse engineer this file, the effort required would be considerable.

XXD hexdump of the compiled library
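A hex dump is one way to check the result. The same check can be sketched in Python by asking the import system which file the api module resolves to and whether that file looks like a compiled binary. The helper names below are my own, and the ELF check assumes a Linux build (a macOS build produces a different binary format).

```python
import importlib.machinery
import importlib.util
from typing import Optional


def is_extension_path(path: str) -> bool:
    """True if the filename ends with a compiled-extension suffix for this interpreter."""
    return any(path.endswith(suffix) for suffix in importlib.machinery.EXTENSION_SUFFIXES)


def has_elf_magic(path: str) -> bool:
    """True if the file starts with the ELF magic bytes, as Linux shared objects do."""
    with open(path, "rb") as handle:
        return handle.read(4) == b"\x7fELF"


def resolve_module_file(name: str) -> Optional[str]:
    """Return the file the import system would load for a module, or None if not found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec else None


if __name__ == "__main__":
    # "api" is the module name used with gunicorn in this post
    origin = resolve_module_file("api")
    if origin is None:
        print("api module not found on the import path")
    elif is_extension_path(origin):
        print(f"api resolves to a compiled extension: {origin}")
        print(f"ELF header present: {has_elf_magic(origin)}")
    else:
        print(f"api resolves to something other than a compiled extension: {origin}")
```

Run it from the directory that contains the built library, so that api is importable.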

Running the local binary with gunicorn

gunicorn api:app

Building the image for docker

Build single-arch image, load to local docker

docker buildx build --platform linux/arm64 --load --tag helloflask .

docker buildx build --platform linux/amd64 --load --tag helloflask .

Build multi-arch image

docker buildx build --platform linux/arm64,linux/amd64 --tag helloflask .

Running the image

docker run -p 80:80 -it helloflask

Running the built docker image

Confirming the binary is in the image

Screenshot showing that there are no source files in the docker image
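The same check can be scripted rather than eyeballed. The sketch below walks a directory and reports any files that look like leftover sources. The function name and the list of suffixes are my own assumptions (.c is included because Cython generates an intermediate C file), not something taken from the repo.

```python
import os
from typing import List, Tuple

# Suffixes treated as "source" for this check (an assumption, adjust as needed)
SOURCE_SUFFIXES: Tuple[str, ...] = (".py", ".pyx", ".c")


def find_source_files(root: str, suffixes: Tuple[str, ...] = SOURCE_SUFFIXES) -> List[str]:
    """Return a sorted list of files under root whose names end with a source suffix."""
    matches = []
    for directory, _subdirs, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(suffixes):
                matches.append(os.path.join(directory, filename))
    return sorted(matches)
```

Inside the container, a call like find_source_files("/app") should return an empty list, where /app stands in for whatever working directory the image actually uses. Point it at the application directory only, since installed dependencies such as Flask legitimately ship .py files.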

Conclusion

Although this is literally the definition of security through obscurity, for many use cases this might be enough.

I haven't gone beyond a hello world example with this approach, so there are likely deficiencies. I'm not sure, for example, what would happen with more complex dependencies. The fact that Flask worked more or less straight away is encouraging, though.

I've stored this in a public repo, so if you have any suggestions or improvements, please feel free to raise an issue or PR.