RedisGears Design: Isolation Techniques

RedisGears ships with an embedded Python interpreter, which makes it possible to run Python code using the RG.PYEXECUTE command. Because the interpreter is a singleton shared among all calls to RG.PYEXECUTE, different executions risk colliding on the same identifiers (global variables, functions, class names, and so on).

One way to address this problem is to restart the interpreter before each execution. However, restarting the interpreter is slow, which makes this approach unattractive. In addition, two executions may run in parallel, either in two different threads or even in the same thread: while the Python code itself never runs in parallel, the rest of an execution's lifecycle might. Once more than one execution is running, restarting the interpreter is no longer an option.

Currently, RedisGears gives each execution its own globals dictionary, which isolates executions from one another. This means that if two executions declare the same function name (with different implementations), they will not step on each other's toes. Though this isolation is sufficient in most cases, it isn't perfect. Its biggest shortcoming concerns imported modules: all executions that import a module X get the same instance of module X. So even if we create a virtual environment per execution, each with a different implementation of module X, all of the executions will use whichever implementation was imported first.
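The behavior described above can be reproduced in plain Python. The following is a minimal sketch, not RedisGears' actual implementation: the run_isolated helper and the two script strings are hypothetical names introduced for illustration, and json merely stands in for "module X".

```python
# Two scripts that define the same function name and import the same module.
SCRIPT_A = """
def greet():
    return "hello from A"

import json
json.marker = "set by A"   # mutate the imported module object
"""

SCRIPT_B = """
def greet():
    return "hello from B"

import json
seen_marker = getattr(json, "marker", None)
"""


def run_isolated(source):
    """Run source with its own globals dictionary and return that dictionary.

    Hypothetical helper: it only illustrates the idea of one globals
    dictionary per execution.
    """
    namespace = {"__name__": "__gear__"}
    exec(source, namespace)
    return namespace


ns_a = run_isolated(SCRIPT_A)
ns_b = run_isolated(SCRIPT_B)

# Same function name, separate definitions: no collision.
print(ns_a["greet"](), "|", ns_b["greet"]())

# The import is not isolated: both namespaces hold the same module object,
# so script B observes the attribute that script A set.
print(ns_a["json"] is ns_b["json"], ns_b["seen_marker"])
```

Each call gets a fresh dictionary, so the two greet functions coexist. The module, however, is cached once per interpreter, which is why the second script sees the first script's change.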

Another way to provide isolation is with Python's sub-interpreters. A sub-interpreter is an (almost) separate environment for the execution of Python code. The Python C API makes it possible to create a new sub-interpreter using Py_NewInterpreter, destroy it using Py_EndInterpreter, and switch between sub-interpreters using PyThreadState_Swap.

RedisGears has used sub-interpreters before, but once we found out that some libraries are not compatible with them, we decided to switch to global dictionaries. We plan to bring back sub-interpreter support in the future, with an on/off switch so that libraries that are not compatible with sub-interpreters can still be used. The idea is to maintain the association between the user's call to RG.PYEXECUTE and its respective sub-interpreter.

When RG.PYEXECUTE is called, a new sub-interpreter will be created to execute the provided script. That sub-interpreter will also be "inherited" by all subsequent operations (that is, executions and registrations) that the script creates. Because a sub-interpreter may have multiple owners, RedisGears will keep an internal reference count for each one so that it can be freed safely once its last owner is done with it.
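The ownership scheme described above can be modeled as a reference-counted handle. This is only an illustrative Python model under assumed names (InterpreterHandle, acquire, release and the on_free callback are all hypothetical); the real bookkeeping would live in RedisGears' C code around the sub-interpreter itself.

```python
import threading


class InterpreterHandle:
    """Hypothetical stand-in for a sub-interpreter plus its owner count."""

    def __init__(self, name, on_free):
        self.name = name
        self._on_free = on_free      # called once, when the last owner lets go
        self._refs = 1               # the creating RG.PYEXECUTE call is the first owner
        self._lock = threading.Lock()

    def acquire(self):
        """Register one more owner (for example an execution or a registration)."""
        with self._lock:
            if self._refs == 0:
                raise RuntimeError("interpreter already freed")
            self._refs += 1
        return self

    def release(self):
        """Drop one owner; free the interpreter when no owners remain."""
        with self._lock:
            if self._refs == 0:
                raise RuntimeError("interpreter already freed")
            self._refs -= 1
            last_owner = self._refs == 0
        if last_owner:
            self._on_free(self.name)


freed = []
handle = InterpreterHandle("script-1", freed.append)

registration = handle.acquire()   # a registration created by the script
execution = handle.acquire()      # an execution created by the script

handle.release()                  # the script itself has finished
execution.release()               # the execution has finished
still_alive = not freed           # the registration still owns the interpreter

registration.release()            # last owner gone: the interpreter is freed
print(still_alive, freed)
```

The point of the count is that the script finishing does not free the interpreter by itself: it stays alive for as long as any execution or registration created by that script still refers to it.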

Note that the isolation between sub-interpreters is also imperfect. For example, when using low-level file operations like os.close(), sub-interpreters can (accidentally or maliciously) affect each other's open files because of the way extensions are shared.
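The underlying reason is that file descriptors belong to the process, not to any one Python namespace or interpreter. The sketch below illustrates this with two separate globals dictionaries rather than real sub-interpreters, which is an assumption made to keep the example runnable with the standard library alone; the variable names are hypothetical.

```python
import errno
import os

# "Execution A" opens a pipe and keeps both ends open.
namespace_a = {}
exec("import os\nread_fd, write_fd = os.pipe()", namespace_a)

# "Execution B" shares no Python names with A, but a file descriptor is just
# a small integer, so B can close A's descriptor by passing that number.
namespace_b = {"guessed_fd": namespace_a["write_fd"]}
exec("import os\nos.close(guessed_fd)", namespace_b)

# A's next write fails because its descriptor was closed behind its back.
try:
    os.write(namespace_a["write_fd"], b"ping")
    write_error = None
except OSError as exc:
    write_error = exc.errno

print(write_error == errno.EBADF)

# Clean up the end of the pipe that is still open.
os.close(namespace_a["read_fd"])
```

Nothing at the Python level connects the two namespaces, yet the second one breaks the first one's pipe. The same reasoning applies to any isolation scheme that keeps all executions inside a single process.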

Further reference

* Python Sub-Interpreters