The major difficulty with this approach is C calling Python. The
problem is that the C stack now holds a nested execution of the
byte-code interpreter. In that situation, a coroutine /
microthread extension cannot be permitted to transfer control to a
frame in a different invocation of the byte-code interpreter. If a
frame were to complete and exit back to C from the wrong
interpreter, the C stack could be trashed.

The ideal solution is to create a mechanism where nested
executions of the byte-code interpreter are never needed. The easy
solution is for the coroutine / microthread extension(s) to
recognize the situation and refuse to allow transfers outside the
current invocation.
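
The "easy solution" can be pictured with a small model. This is a
sketch only: the Frame class, its invocation attribute, the transfer
function and the TransferRefused exception are hypothetical names
invented for illustration and are not part of any existing extension.
It assumes each frame can be tagged with the interpreter invocation
that is currently running it.

```python
class TransferRefused(RuntimeError):
    """Raised when a transfer would leave the current interpreter invocation."""


class Frame:
    """Toy frame that remembers which interpreter invocation is running it."""

    def __init__(self, name, invocation):
        self.name = name
        self.invocation = invocation  # hypothetical id of the owning invocation


def transfer(current, target):
    """Return the frame to run next, refusing cross-invocation transfers."""
    if target.invocation != current.invocation:
        raise TransferRefused(
            f"cannot transfer from {current.name} to {target.name}: "
            "target belongs to a different interpreter invocation"
        )
    return target
```

A transfer between two frames of the same invocation succeeds, while
a transfer to a frame owned by a nested (or outer) invocation raises
instead of risking the C stack.
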
We can categorize code that involves C calling Python into two
camps: Python's implementation, and C extensions. And hopefully we
can offer a compromise: Python's internal usage (and C extension
writers who want to go to the effort) will no longer use a nested
invocation of the interpreter. Extensions which do not go to the
effort will still be safe, but will not play well with coroutines
/ microthreads.

Generally, when a recursive call is transformed into a loop, a bit
of extra bookkeeping is required. The loop will need to keep its
own "stack" of arguments and results since the real stack can now
only hold the most recent. The code will be more verbose, because
it's not quite as obvious when we're done. While Stackless is not
implemented this way, it has to deal with the same issues.
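
The bookkeeping involved can be seen in a small, generic example. The
sketch below is not how Stackless works; it evaluates a toy
expression tree (an assumption chosen for illustration) first
recursively, then with a loop that keeps its own stacks of pending
arguments and intermediate results.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}


def eval_recursive(expr):
    """Evaluate an int or an (op, left, right) tuple using real recursion."""
    if isinstance(expr, int):
        return expr
    op, left, right = expr
    return OPS[op](eval_recursive(left), eval_recursive(right))


def eval_iterative(expr):
    """Same result, but the loop manages its own stacks."""
    pending = [("visit", expr)]  # explicit stack of arguments still to process
    results = []                 # explicit stack of intermediate results
    while pending:               # "done" must now be detected explicitly
        action, item = pending.pop()
        if action == "visit":
            if isinstance(item, int):
                results.append(item)
            else:
                op, left, right = item
                # Apply the operator only after both operands are evaluated.
                pending.append(("apply", op))
                pending.append(("visit", right))
                pending.append(("visit", left))
        else:  # action == "apply"
            right_value = results.pop()
            left_value = results.pop()
            results.append(OPS[item](left_value, right_value))
    return results.pop()
```

The iterative version is noticeably longer, but its depth is limited
only by memory, not by the size of the real stack.
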
In normal Python, PyEval_EvalCode is used to build a frame and
execute it. Stackless Python introduces the concept of a
FrameDispatcher. Like PyEval_EvalCode, it executes one frame. But
the interpreter may signal the FrameDispatcher that a new frame
has been swapped in, and the new frame should be executed. When a
frame completes, the FrameDispatcher follows the back pointer to
resume the "calling" frame.

So Stackless transforms recursions into a loop, but it is not the
FrameDispatcher that manages the frames. This is done by the
interpreter (or an extension that knows what it's doing).

The general idea is that where C code needs to execute Python
code, it creates a frame for the Python code, setting its back
pointer to the current frame. Then it swaps in the frame, signals
the FrameDispatcher and gets out of the way. The C stack is now
clean - the Python code can transfer control to any other frame
(if an extension gives it the means to do so).
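
The dispatch loop and the frame hand-off described above can be
modelled in a few lines of Python. This is a toy sketch, not the C
implementation in Stackless: the Frame class, the frame_dispatcher
function and the use of generators to stand in for frames are all
assumptions made for illustration. A frame "swaps in" a new frame by
yielding it, after setting the new frame's back pointer to itself.

```python
class Frame:
    """Toy frame: a suspended generator plus a back pointer to its caller."""

    def __init__(self, func, *args, back=None):
        self.back = back
        self.gen = func(self, *args)  # created suspended; nothing runs yet


def frame_dispatcher(frame):
    """Run frames one at a time in a single loop, never recursing."""
    value = None
    while frame is not None:
        try:
            swapped_in = frame.gen.send(value)
        except StopIteration as stop:
            # Frame completed: follow the back pointer and hand the
            # result to the "calling" frame.
            frame, value = frame.back, stop.value
        else:
            # The running frame signalled that a new frame should execute.
            frame, value = swapped_in, None
    return value


def sum_to(this, n):
    """Sum 1..n by 'calling' itself through the dispatcher."""
    if n == 0:
        return 0
    # Build the callee's frame, point it back at us, and get out of the way.
    partial = yield Frame(sum_to, n - 1, back=this)
    return n + partial
```

Calling frame_dispatcher(Frame(sum_to, 10000)) goes ten thousand
"calls" deep without nesting the dispatcher, because each frame is
resumed from the same loop. Note that the loop only follows pointers;
it is the code inside sum_to that creates and links the frames.
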
In the vanilla case, this magic can be hidden from the programmer
(even, in most cases, from the Python-internals programmer). Many
situations present another level of difficulty, however.

The map builtin function involves two obstacles to this
approach. It cannot simply construct a frame and get out of the
way, not only because there's a loop involved, but also because
each pass through the loop requires some "post" processing. In
order to play
well with others, Stackless constructs a frame object for map
itself.
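
Why map needs a frame of its own can be shown with a toy model. The
sketch below is an illustration under stated assumptions, not
Stackless code: Frame, run and map_frame are hypothetical names, and
generators stand in for frames. The map frame keeps the loop state
and performs the "post" step (storing each result) every time it is
resumed.

```python
class Frame:
    """Toy frame: a suspended generator plus a back pointer to its caller."""

    def __init__(self, func, *args, back=None):
        self.back = back
        self.gen = func(self, *args)


def run(frame):
    """Minimal dispatch loop: run a frame, follow back pointers on exit."""
    value = None
    while frame is not None:
        try:
            swapped_in = frame.gen.send(value)
        except StopIteration as stop:
            frame, value = frame.back, stop.value
        else:
            frame, value = swapped_in, None
    return value


def map_frame(this, func, items):
    """Frame for map itself: it holds the loop state between calls."""
    results = []
    for item in items:
        # Swap in a frame for the callee and get out of the way.
        value = yield Frame(func, item, back=this)
        # "Post" processing happens here, when this frame is resumed.
        results.append(value)
    return results


def square(this, x):
    """Leaf callee that makes no further calls."""
    return x * x
    yield  # unreachable; only present to make this function a generator
```

Here run(Frame(map_frame, square, [1, 2, 3])) produces [1, 4, 9]
without ever nesting the dispatch loop: between items, control always
returns to the single loop in run.
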
Most recursions of the interpreter are not this complex, but
fairly frequently, some "post" operations are required. Stackless
does not fix these situations because of the number of code changes
required. Instead, Stackless prohibits transfers out of a nested
interpreter. While not ideal (and sometimes puzzling), this
limitation is hardly crippling.