Thursday, 27 August 2009

Control flows in Heaven with Scala 2.8 continuations - Part 1

I'm no expert by any means but I'm very interested in high-performance and high-scalability topics (who hasn't broken his neck on those issues too late in the development process?) and continuations are a very promising language concept in this area. They allow to suspend a program control flow and then to resume it when (or if ever) needed.
Yes, that's useful as, for example, you could write a program that processes an HTTP request, then renders a page to gather more info and "freezes" (in the sense that it suspends and releases its thread) while it waits for an answer, then resumes execution in a new request processing thread. You could say: "well, that's more easily done by blocking the thread until an answer arrives or by writing event-style code as any sane framework will force you to do".
The thing is:
  1. Event-style code can be difficult to read and mantain when it doesn't match your logic
  2. OS threads are an expensive resource, as many platforms won't let you have more than a few thousands alive at the same time in a system (you thought many more, didn't you?), so having one just to wait for an answer is an highly suboptimal approach as far as scalability is concerned
See this Apache Cocoon page for a more thorough explanation of this use case. You can easily see, anyway, that any concurrent request-processing software piece could benefit from them.

On a less immediately pragmatic side, continuations can be used to unify seemingly different control flow structures like exceptions or coroutines (which, as an aside, are very useful to write highly-scalable concurrent programs as they allow cooperative multi-threading using only one OS thread).

The Scala 2.8 language will feature a compiler plugin and a library-based DSL for composable continuations (there's a preview post at Scala's website and another very good post by Rich Dougherty here and more excellent continuation-related ones follow in his blog; the full-blown original paper is here).
Continuations can be thought of as values that store the (partial or total) rest of a control flow as it was at a given point in time. In Scala 2.8, they are functions built by the shift call, which always stores a partial remaining flow delimited by an outer (call-wise, not necessarily syntactically) reset call.

Since neither the JVM nor .NET provide primitives to build continuations (or not yet, see the Da Vinci Machine project), the Scala team has implemented them by using a program transformation technique called "Continuation-Passing Style (CPS) transform": they built a compiler plugins that rewrites select portions of a Scala program in such a way that continuations work as intended. This plugins also provides a typechecker / type inferrer extension, on which it relies to check the "reset outside shift" calls structure and to track the portions of the program that need the cps transform to be applied; very intuitively it assigns annotations to shift calls, it propagates them up the calls structure in the program and expects to find them back when checking reset calls.

Perhaps it's just me but the exact behaviour of these new primitives has escaped my full understanding at first, so I decided to drill a bit into the topic and to perform several small experiments to figure out what was happening.

In this post and the next ones I'll show a few Scala 2.8 examples I built that helped me in grasping gradually how exactly Scala continuations and their typechecking work. All of them are self-contained, show explicit annotated types and try to avoid any Scala syntactic sugar in order to make things more apparent and explicit. They also abound in tracing println statements and there are comments with numbers that track the control flow throughout the code.

First of all, how do you invoke shift and reset and what's happening when you do?
  • A reset call takes as its argument the toplevel continuation function, i.e. the code that will be suspended by some shift call (and will possibly be resumed later). The shift call must be performed at a certain point during the execution of the toplevel continuation function itself (possibly not directly in its body but at least somewhere in sub-calls), else the compiler plugin will be able to detect this anomaly and will complain.
  • A shift call builds an object (the continuation) that will "freeze" and store the rest of the control flow up to the end of the toplevel continuation function passed to the reset call. Control is then transferred immediately to the argument that we could name the "redirect" or "replacement" function. The redirect function will, of course, receive the continuation object as its argument and, after doing anything it wants, will have the option to call it to resume the enclosed control flow (and then do again anything it wants with the result of it). Please note that, unless the redirect function calls the continuation, the toplevel continuation function won't finish executing anymore and will just stay freezed until the continuation is garbage collected. The shift call ends when its redirect function returns. The reset call, instead, ends when the first shift call has returned. In case there are subsequent shift calls, they'll be part of the continuation built by a previous shift, which means they're not necessarily executed. But, if they are, the reset call returns when their redirect function has finished as well. This is an odd behaviour at first but, as you'll see, it makes perfectly sense.
The Example1 program below show, in a simplified procedural version (all types involved are Unit) how all of the above pieces fit and work together: you can see the reset call in main method on the toplevel continuation function. The latter doesn't call directly shift but invokes a couple of times a private metod that will take care of it insted. The shift call receives a redirect function as its argument, which uses the continuation parameter to resume the frozen flow. Please read it, follow and understand the control sequence by following number comments, make sure they make sense with the above explanation and then use the info in Rich Dougherty's post to compile and run it:

import scala.continuations._
import ControlContext._

object Example1 {
private def l( msg : String ) {
println( "[Thread '" + Thread.currentThread().getName() + "'] " + msg )

private def redirectFunction( continuation : Unit => Unit ) : Unit = {
l( "Entered redirectFunction, calling continuation" )

/* 3 */
/* 11 */

val r : Unit = /* <-- 19 */ /* <-- 22 */ continuation()

/* 20 */
/* 23 */

l( "Exiting redirectFunction, continuation called, returning ()" )

/* 21 */
/* 24 */


private def packContAndUseItIfNeeded() : Unit @cps[Unit, Unit] = {
l( "Entered packContAndUseItIfNeeded, doing shift" )

/* 2 */
/* 10 */

val r : Unit @cps[Unit, Unit] = /* <-- 4 */ /* <-- 12 */ shift( redirectFunction )

/* 5 */
/* 13 */

"Exiting packContAndUseItIfNeeded, shift done, " +
"ACTUAL CONTINUATION (1 or 2) RUNNING! returning ()"

/* 6 */
/* 14 */


private def toplevelContinuationFunction() : Unit @cps[Unit, Unit] = {
l( "Entered reset and toplevelContinuationFunction, before first shift" )

/* 1 */

val r1 : Unit @cps[Unit, Unit] = /* <-- 7 */ packContAndUseItIfNeeded

/* 8 */

"In toplevelContinuationFunction, after first packContAndUseItIfNeeded, " +
"ACTUAL CONTINUATION 1 RUNNING! Before second shift"

/* 9 */

val r2 : Unit @cps[Unit, Unit] = /* <-- 15 */ packContAndUseItIfNeeded

/* 16 */

"In toplevelContinuationFunction, after second packContAndUseItIfNeeded, " +

/* 17 */

l( "Exiting toplevelContinuationFunction and reset, returning ()" )

/* 18 */


def main( args: Array[String] ) : Unit = {
l( "Before reset" )

val x : Unit = /* <-- 25 */ reset( toplevelContinuationFunction )

l( "After reset, terminating" )

I really hope this first part has been useful to you. If you find any mistakes or misunderstandings please let me know and I'll fix them ASAP (plus I'll learn something new, thank you!).
In the next posts we'll see slightly different versions of the above example in order to better understand the continuations type-checking and type restrictions.

No comments:

Post a Comment