Rewriting recursion as iteration

A consequence of the Church-Turing thesis is that iteration and recursion are equivalent. This means that you can always take a piece of code using iteration, and rewrite it to use recursion, while changing nothing about the behavior of the code itself, and vice versa.

For instance, you can rewrite iterating over a list with recursion in Rust:

fn foo_iter<T>(xs: Vec<T>) {
  for x in xs.into_iter().rev() {
    bar(x);
  }
}

fn foo_rec<T>(mut xs: Vec<T>) {
  match xs.pop() {
    None => {}
    Some(x) => {
      bar(x);
      foo_rec(xs);
    }
  }
}

Of course, this is a contrived example, and these functions may not compile down to the exact same assembly, especially depending on whether the tail-call optimizer kicks in. But importantly, both foo_iter and foo_rec have the same behavior, in that they call bar with all the same x in the same order.

But how may we rewrite recursion as iteration? The answer is not always as clear, especially when considering things like:

Mutual recursion
Multiple recursive calls
Non-tail calls

In this post, we will start with some complex recursive code, containing all three of the above mentioned tricky bits. We will then gradually rewrite the code, preserving behavior along the way, until finally all recursive calls are removed.

The code for this post is available in this repository, which also shows each stage of the rewrite as git commits.

Overview

We’ll have 2 functions, func and gunc.
They recursively call themselves and each other.
They pass around various changing data to each other.
The functions will also pass around and append to a log of events, represented as a list.
As we rewrite, we will test that the original version and the rewritten version produce the same event log, given the same starting data.

Remember, though, that the important thing is not the specific contrived example code that we’ll be working with. Rather, it is the process itself, of incrementally rewriting and transforming the code while preserving its behavior, that is the most useful.

Common code

First we define the events, some data, and a threshold so we don’t infinitely recur:

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Event {
  A(bool),
  B(usize),
  C,
  D,
  E(usize),
  F,
  G,
}

#[derive(Debug, Clone)]
pub struct Data {
  pub num: usize,
  pub cond: bool,
}

pub const THRESHOLD: usize = 150;

Starter code

Now we may define our glorious, contrived, mutually recursive func and gunc.

use crate::common::{Data, Event, THRESHOLD};

pub fn func(es: &mut Vec<Event>, mut data: Data) -> usize {
  if data.num >= THRESHOLD {
    es.push(Event::A(data.cond));
    return es.len() + data.num;
  }
  data.cond = !data.cond;
  if data.cond {
    es.push(Event::B(data.num));
    data.num += 1;
    gunc(es, data.num).num + 2
  } else {
    es.push(Event::C);
    data.num += 6;
    func(es, data) + 3
  }
}

pub fn gunc(es: &mut Vec<Event>, num: usize) -> Data {
  let cond = es.len() % 2 == 0;
  if num >= THRESHOLD {
    es.push(Event::D);
    return Data { num: es.len() | num, cond };
  }
  let data = Data { num: num + 2, cond };
  if es.len() < 5 {
    es.push(Event::E(es.len()));
    Data { num: func(es, data) + 3, cond }
  } else if es.len() % 3 > 0 && func(es, data.clone()) % 2 == 0 {
    es.push(Event::F);
    let mut tmp = gunc(es, num + 4);
    tmp.cond = !tmp.cond;
    tmp
  } else {
    es.push(Event::G);
    let fst = func(es, data);
    let mut tmp = gunc(es, fst);
    tmp.num += fst;
    tmp
  }
}

Do easy vars

We’re going to be putting each recursive call on its own line, since we’re going to be moving and changing the calls around a lot.

For most calls this is pretty easy.

@@ -9,11 +9,13 @@
   if data.cond {
     es.push(Event::B(data.num));
     data.num += 1;
-    gunc(es, data.num).num + 2
+    let tmp = gunc(es, data.num).num;
+    tmp + 2
   } else {
     es.push(Event::C);
     data.num += 6;
-    func(es, data) + 3
+    let tmp = func(es, data);
+    tmp + 3
   }
 }

@@ -26,7 +28,8 @@
   let data = Data { num: num + 2, cond };
   if es.len() < 5 {
     es.push(Event::E(es.len()));
-    Data { num: func(es, data) + 3, cond }
+    let tmp = func(es, data);
+    Data { num: tmp + 3, cond }
   } else if es.len() % 3 > 0 && func(es, data.clone()) % 2 == 0 {
     es.push(Event::F);
     let mut tmp = gunc(es, num + 4);

Split `else if`

It’s a little harder to do this when the call is inside the condition of an if. It’s even harder when && is involved.

We’ll start by splitting an else if into else and if.

@@ -30,7 +30,8 @@
     es.push(Event::E(es.len()));
     let tmp = func(es, data);
     Data { num: tmp + 3, cond }
-  } else if es.len() % 3 > 0 && func(es, data.clone()) % 2 == 0 {
+  } else {
+    if es.len() % 3 > 0 && func(es, data.clone()) % 2 == 0 {
       es.push(Event::F);
       let mut tmp = gunc(es, num + 4);
       tmp.cond = !tmp.cond;
@@ -43,3 +44,4 @@
       tmp
     }
   }
+}

Make cond var

Now we can make a variable for that new if condition.

@@ -31,7 +31,8 @@
     let tmp = func(es, data);
     Data { num: tmp + 3, cond }
   } else {
-    if es.len() % 3 > 0 && func(es, data.clone()) % 2 == 0 {
+    let cond = es.len() % 3 > 0 && func(es, data.clone()) % 2 == 0;
+    if cond {
       es.push(Event::F);
       let mut tmp = gunc(es, num + 4);
       tmp.cond = !tmp.cond;

Split `&&`

To split up && we have to be careful to be maximally lazy.

@@ -31,7 +31,10 @@
     let tmp = func(es, data);
     Data { num: tmp + 3, cond }
   } else {
-    let cond = es.len() % 3 > 0 && func(es, data.clone()) % 2 == 0;
+    let mut cond = es.len() % 3 > 0;
+    if cond {
+      cond = func(es, data.clone()) % 2 == 0;
+    }
     if cond {
       es.push(Event::F);
       let mut tmp = gunc(es, num + 4);

Use `tmp`

Now we can put the call to func on its own line.

@@ -33,7 +33,8 @@
   } else {
     let mut cond = es.len() % 3 > 0;
     if cond {
-      cond = func(es, data.clone()) % 2 == 0;
+      let tmp = func(es, data.clone());
+      cond = tmp % 2 == 0;
     }
     if cond {
       es.push(Event::F);

Avoid early `return`

We have some early returns in func and gunc for the base case. We’re going to remove the return and instead use a big else, indenting everything that used to be after the return in one level.

This isn’t great coding style, but we need to do this because we’re going to be putting both functions in one big wrapper function, to remove the mutual recursion (we’ll still be recursive, but it’ll be one function). That wrapper function will need to do its own bookkeeping, so we can’t be returning explicitly at inopportune times.

Note that the diffs shown here are ignoring whitespace changes (using git show -w), otherwise they’d be massive and confusing.

@@ -3,8 +3,8 @@
 pub fn func(es: &mut Vec<Event>, mut data: Data) -> usize {
   if data.num >= THRESHOLD {
     es.push(Event::A(data.cond));
-    return es.len() + data.num;
-  }
+    es.len() + data.num
+  } else {
     data.cond = !data.cond;
     if data.cond {
       es.push(Event::B(data.num));
@@ -18,13 +18,14 @@
       tmp + 3
     }
   }
+}

 pub fn gunc(es: &mut Vec<Event>, num: usize) -> Data {
   let cond = es.len() % 2 == 0;
   if num >= THRESHOLD {
     es.push(Event::D);
-    return Data { num: es.len() | num, cond };
-  }
+    Data { num: es.len() | num, cond }
+  } else {
     let data = Data { num: num + 2, cond };
     if es.len() < 5 {
       es.push(Event::E(es.len()));
@@ -50,3 +51,4 @@
       }
     }
   }
+}

Add unused `hunc`

Now we can start to construct the wrapper function. We’ll call it hunc. It will take a type that represents either the arguments to func or gunc, and will return a type that represents either the return type of func or gunc.

One unfortunate caveat of this approach is that it doesn’t encode in the type system the information that whenever we pass it a func arg, we expect a func return, and same for gunc. We’ll have to check every time at runtime, which is a performance penalty.

@@ -52,3 +52,20 @@
     }
   }
 }
+
+enum Arg {
+  Func(Data),
+  Gunc(usize),
+}
+
+enum Ret {
+  Func(usize),
+  Gunc(Data),
+}
+
+fn hunc(es: &mut Vec<Event>, arg: Arg) -> Ret {
+  match arg {
+    Arg::Func(data) => Ret::Func(func(es, data)),
+    Arg::Gunc(num) => Ret::Gunc(gunc(es, num)),
+  }
+}

Swap `func`

Now we can put the body of func into hunc and swap func to just delegate to hunc.

@@ -1,22 +1,9 @@
 use crate::common::{Data, Event, THRESHOLD};

-pub fn func(es: &mut Vec<Event>, mut data: Data) -> usize {
-  if data.num >= THRESHOLD {
-    es.push(Event::A(data.cond));
-    es.len() + data.num
-  } else {
-    data.cond = !data.cond;
-    if data.cond {
-      es.push(Event::B(data.num));
-      data.num += 1;
-      let tmp = gunc(es, data.num).num;
-      tmp + 2
-    } else {
-      es.push(Event::C);
-      data.num += 6;
-      let tmp = func(es, data);
-      tmp + 3
-    }
+pub fn func(es: &mut Vec<Event>, data: Data) -> usize {
+  match hunc(es, Arg::Func(data)) {
+    Ret::Func(ret) => ret,
+    Ret::Gunc(_) => unreachable!(),
   }
 }

@@ -65,7 +52,23 @@

 fn hunc(es: &mut Vec<Event>, arg: Arg) -> Ret {
   match arg {
-    Arg::Func(data) => Ret::Func(func(es, data)),
+    Arg::Func(mut data) => Ret::Func(if data.num >= THRESHOLD {
+      es.push(Event::A(data.cond));
+      es.len() + data.num
+    } else {
+      data.cond = !data.cond;
+      if data.cond {
+        es.push(Event::B(data.num));
+        data.num += 1;
+        let tmp = gunc(es, data.num).num;
+        tmp + 2
+      } else {
+        es.push(Event::C);
+        data.num += 6;
+        let tmp = func(es, data);
+        tmp + 3
+      }
+    }),
     Arg::Gunc(num) => Ret::Gunc(gunc(es, num)),
   }
 }

Swap `gunc`

And same for gunc.

@@ -8,35 +8,9 @@
 }

 pub fn gunc(es: &mut Vec<Event>, num: usize) -> Data {
-  let cond = es.len() % 2 == 0;
-  if num >= THRESHOLD {
-    es.push(Event::D);
-    Data { num: es.len() | num, cond }
-  } else {
-    let data = Data { num: num + 2, cond };
-    if es.len() < 5 {
-      es.push(Event::E(es.len()));
-      let tmp = func(es, data);
-      Data { num: tmp + 3, cond }
-    } else {
-      let mut cond = es.len() % 3 > 0;
-      if cond {
-        let tmp = func(es, data.clone());
-        cond = tmp % 2 == 0;
-      }
-      if cond {
-        es.push(Event::F);
-        let mut tmp = gunc(es, num + 4);
-        tmp.cond = !tmp.cond;
-        tmp
-      } else {
-        es.push(Event::G);
-        let fst = func(es, data);
-        let mut tmp = gunc(es, fst);
-        tmp.num += fst;
-        tmp
-      }
-    }
+  match hunc(es, Arg::Gunc(num)) {
+    Ret::Func(_) => unreachable!(),
+    Ret::Gunc(ret) => ret,
   }
 }

@@ -69,6 +43,37 @@
         tmp + 3
       }
     }),
-    Arg::Gunc(num) => Ret::Gunc(gunc(es, num)),
+    Arg::Gunc(num) => Ret::Gunc({
+      let cond = es.len() % 2 == 0;
+      if num >= THRESHOLD {
+        es.push(Event::D);
+        Data { num: es.len() | num, cond }
+      } else {
+        let data = Data { num: num + 2, cond };
+        if es.len() < 5 {
+          es.push(Event::E(es.len()));
+          let tmp = func(es, data);
+          Data { num: tmp + 3, cond }
+        } else {
+          let mut cond = es.len() % 3 > 0;
+          if cond {
+            let tmp = func(es, data.clone());
+            cond = tmp % 2 == 0;
+          }
+          if cond {
+            es.push(Event::F);
+            let mut tmp = gunc(es, num + 4);
+            tmp.cond = !tmp.cond;
+            tmp
+          } else {
+            es.push(Event::G);
+            let fst = func(es, data);
+            let mut tmp = gunc(es, fst);
+            tmp.num += fst;
+            tmp
+          }
+        }
+      }
+    }),
   }
 }

Use `unwrap_*`

In func and gunc, which are now just stubs that call hunc, we have the previously described run-time match to extract the appropriate func or gunc return value. We’re going to put that matching logic into some helper functions, since we’ll be needing it a lot.

@@ -1,17 +1,11 @@
 use crate::common::{Data, Event, THRESHOLD};

 pub fn func(es: &mut Vec<Event>, data: Data) -> usize {
-  match hunc(es, Arg::Func(data)) {
-    Ret::Func(ret) => ret,
-    Ret::Gunc(_) => unreachable!(),
-  }
+  hunc(es, Arg::Func(data)).unwrap_func()
 }

 pub fn gunc(es: &mut Vec<Event>, num: usize) -> Data {
-  match hunc(es, Arg::Gunc(num)) {
-    Ret::Func(_) => unreachable!(),
-    Ret::Gunc(ret) => ret,
-  }
+  hunc(es, Arg::Gunc(num)).unwrap_gunc()
 }

 enum Arg {
@@ -24,6 +18,22 @@
   Gunc(Data),
 }

+impl Ret {
+  fn unwrap_func(self) -> usize {
+    match self {
+      Ret::Func(ret) => ret,
+      Ret::Gunc(_) => unreachable!(),
+    }
+  }
+
+  fn unwrap_gunc(self) -> Data {
+    match self {
+      Ret::Func(_) => unreachable!(),
+      Ret::Gunc(ret) => ret,
+    }
+  }
+}
+
 fn hunc(es: &mut Vec<Event>, arg: Arg) -> Ret {
   match arg {
     Arg::Func(mut data) => Ret::Func(if data.num >= THRESHOLD {

Inline `func`

Notice how short the definition of func is now. We can just inline every usage of func with a call to hunc with func args that unwraps a func return value.

@@ -49,7 +49,7 @@
       } else {
         es.push(Event::C);
         data.num += 6;
-        let tmp = func(es, data);
+        let tmp = hunc(es, Arg::Func(data)).unwrap_func();
         tmp + 3
       }
     }),
@@ -62,12 +62,12 @@
         let data = Data { num: num + 2, cond };
         if es.len() < 5 {
           es.push(Event::E(es.len()));
-          let tmp = func(es, data);
+          let tmp = hunc(es, Arg::Func(data)).unwrap_func();
           Data { num: tmp + 3, cond }
         } else {
           let mut cond = es.len() % 3 > 0;
           if cond {
-            let tmp = func(es, data.clone());
+            let tmp = hunc(es, Arg::Func(data.clone())).unwrap_func();
             cond = tmp % 2 == 0;
           }
           if cond {
@@ -77,7 +77,7 @@
             tmp
           } else {
             es.push(Event::G);
-            let fst = func(es, data);
+            let fst = hunc(es, Arg::Func(data)).unwrap_func();
             let mut tmp = gunc(es, fst);
             tmp.num += fst;
             tmp

Inline `gunc`

And same for gunc.

Notice at this point, we have removed the mutual recursion. func and gunc just delegate to hunc, and hunc only recursively calls itself.

@@ -44,7 +44,7 @@
       if data.cond {
         es.push(Event::B(data.num));
         data.num += 1;
-        let tmp = gunc(es, data.num).num;
+        let tmp = hunc(es, Arg::Gunc(data.num)).unwrap_gunc().num;
         tmp + 2
       } else {
         es.push(Event::C);
@@ -72,13 +72,13 @@
           }
           if cond {
             es.push(Event::F);
-            let mut tmp = gunc(es, num + 4);
+            let mut tmp = hunc(es, Arg::Gunc(num + 4)).unwrap_gunc();
             tmp.cond = !tmp.cond;
             tmp
           } else {
             es.push(Event::G);
             let fst = hunc(es, Arg::Func(data)).unwrap_func();
-            let mut tmp = gunc(es, fst);
+            let mut tmp = hunc(es, Arg::Gunc(fst)).unwrap_gunc();
             tmp.num += fst;
             tmp
           }

Wrap every branch in a `Ret`

When we inlined the original bodies of func and gunc into hunc, we took both whole blocks and wrapped them each in their own appropriate Ret enum variant.

We’re now going to go and wrap each individual final expression instead, since we’re going to be moving around the various inner blocks more.

@@ -36,34 +36,36 @@

 fn hunc(es: &mut Vec<Event>, arg: Arg) -> Ret {
   match arg {
-    Arg::Func(mut data) => Ret::Func(if data.num >= THRESHOLD {
+    Arg::Func(mut data) => {
+      if data.num >= THRESHOLD {
         es.push(Event::A(data.cond));
-      es.len() + data.num
+        Ret::Func(es.len() + data.num)
       } else {
         data.cond = !data.cond;
         if data.cond {
           es.push(Event::B(data.num));
           data.num += 1;
           let tmp = hunc(es, Arg::Gunc(data.num)).unwrap_gunc().num;
-        tmp + 2
+          Ret::Func(tmp + 2)
         } else {
           es.push(Event::C);
           data.num += 6;
           let tmp = hunc(es, Arg::Func(data)).unwrap_func();
-        tmp + 3
+          Ret::Func(tmp + 3)
         }
-    }),
-    Arg::Gunc(num) => Ret::Gunc({
+      }
+    }
+    Arg::Gunc(num) => {
       let cond = es.len() % 2 == 0;
       if num >= THRESHOLD {
         es.push(Event::D);
-        Data { num: es.len() | num, cond }
+        Ret::Gunc(Data { num: es.len() | num, cond })
       } else {
         let data = Data { num: num + 2, cond };
         if es.len() < 5 {
           es.push(Event::E(es.len()));
           let tmp = hunc(es, Arg::Func(data)).unwrap_func();
-          Data { num: tmp + 3, cond }
+          Ret::Gunc(Data { num: tmp + 3, cond })
         } else {
           let mut cond = es.len() % 3 > 0;
           if cond {
@@ -74,16 +76,16 @@
             es.push(Event::F);
             let mut tmp = hunc(es, Arg::Gunc(num + 4)).unwrap_gunc();
             tmp.cond = !tmp.cond;
-            tmp
+            Ret::Gunc(tmp)
           } else {
             es.push(Event::G);
             let fst = hunc(es, Arg::Func(data)).unwrap_func();
             let mut tmp = hunc(es, Arg::Gunc(fst)).unwrap_gunc();
             tmp.num += fst;
-            tmp
+            Ret::Gunc(tmp)
+          }
         }
       }
     }
-    }),
   }
 }

Add `Cont`

This next step is also a bit of pre-work. The general idea of the next stage of the transformation is that we are going to change how we handle all the bits of code that comes after a given recursive call.

Instead of recursively calling hunc and then doing things with the return value, we’re going to keep track of an explicit stack of reified local variables that we need to access after the recursive call.

Right before we make a call, we will push onto that stack the local variables we need around after that call is done.
Then we’ll make the call.
Then finally we’ll look at the stack in reverse order, popping off the stack and doing whatever it is we were going to do with the recursive call return value and the local variables (which we have in the stack).

This is basically what the “call stack” is and does. We’re just making it explicit.

Doing this means every recursive call is a tail call.

We’re going to create a type that will represent:

All of the possible different recursive call sites
The actions we have to do after those calls
The local variables we need to be live across the calls to do those actions

We’ll call it Cont, for “continuation”. We might have also called it Frame, which is terminology associated with the call stack.

@@ -34,8 +34,11 @@
   }
 }

+enum Cont {}
+
 fn hunc(es: &mut Vec<Event>, arg: Arg) -> Ret {
-  match arg {
+  let mut cs = Vec::<Cont>::new();
+  let ret = match arg {
     Arg::Func(mut data) => {
       if data.num >= THRESHOLD {
         es.push(Event::A(data.cond));
@@ -87,5 +90,9 @@
         }
       }
     }
+  };
+  while let Some(cont) = cs.pop() {
+    match cont {}
   }
+  ret
 }

Wrap in `loop`

We discussed in the last step how we’re going to use an explicit stack of continuations to make it so that every recursive call is a tail call.

It turns out that if we have a recursive function, where every recursive call is a tail call, we can transform it into an iterative function with a loop and mutation:

We take the entire function and wrap it in a loop.
Every time we would make a recursive call, we instead update the arguments to this function to be the new arguments, and continue along the loop.
At the bottom we actually return out of the loop.

This is essentially what tail-call optimization is. We’re just going to make it explicit.

In this step, we add a loop around everything that never loops (note the clippy allow), because we just unconditionally (for now) return at the very bottom.

If these last few steps seem a bit weird, stay with me! The next few steps, where we actually start using Cont and this outer loop, should help make it make sense.

Remember that at every step of the transformation so far, and to come, the behavior of func and gunc has not changed, and will not change.

@@ -38,6 +38,8 @@

 fn hunc(es: &mut Vec<Event>, arg: Arg) -> Ret {
   let mut cs = Vec::<Cont>::new();
+  #[allow(clippy::never_loop)]
+  loop {
     let ret = match arg {
       Arg::Func(mut data) => {
         if data.num >= THRESHOLD {
@@ -94,5 +96,6 @@
     while let Some(cont) = cs.pop() {
       match cont {}
     }
-  ret
+    return ret;
+  }
 }

Do `C1`

This is the first defunctionalization of a continuation we perform. Most of the rest of the steps will basically look like this one.

We:

Add a new variant to Cont.
Update a single recursive call site to push that new Cont variant to the stack before the call.
Remove the recursive call and instead mutate the arg and continue.
Take all the code that was after the recursive call and move it to the match cont that we do as we pop off the stack.

Note that in the continuation code, we are updating the local variable ret, which was initially set above. When we’re done removing all the recursion, ret will always start as a base-case value, and then we will pop off all the continuations from cs that we popped on when recurring (aka continueing).

Note also that in this case, we didn’t need any local variables to be live across the recursive call, so the Cont variant we’re adding doesn’t carry any data.

@@ -34,13 +34,14 @@
   }
 }

-enum Cont {}
+enum Cont {
+  C1,
+}

-fn hunc(es: &mut Vec<Event>, arg: Arg) -> Ret {
+fn hunc(es: &mut Vec<Event>, mut arg: Arg) -> Ret {
   let mut cs = Vec::<Cont>::new();
-  #[allow(clippy::never_loop)]
   loop {
-    let ret = match arg {
+    let mut ret = match arg {
       Arg::Func(mut data) => {
         if data.num >= THRESHOLD {
           es.push(Event::A(data.cond));
@@ -50,16 +51,16 @@
           if data.cond {
             es.push(Event::B(data.num));
             data.num += 1;
-            let tmp = hunc(es, Arg::Gunc(data.num)).unwrap_gunc().num;
-            Ret::Func(tmp + 2)
-          } else {
+            cs.push(Cont::C1);
+            arg = Arg::Gunc(data.num);
+            continue;
+          }
           es.push(Event::C);
           data.num += 6;
           let tmp = hunc(es, Arg::Func(data)).unwrap_func();
           Ret::Func(tmp + 3)
         }
       }
-      }
       Arg::Gunc(num) => {
         let cond = es.len() % 2 == 0;
         if num >= THRESHOLD {
@@ -94,7 +95,12 @@
       }
     };
     while let Some(cont) = cs.pop() {
-      match cont {}
+      match cont {
+        Cont::C1 => {
+          let tmp = ret.unwrap_gunc().num;
+          ret = Ret::Func(tmp + 2);
+        }
+      }
     }
     return ret;
   }

Do `C2`

This is pretty similar to the previous step.

@@ -36,6 +36,7 @@

 enum Cont {
   C1,
+  C2,
 }

 fn hunc(es: &mut Vec<Event>, mut arg: Arg) -> Ret {
@@ -57,8 +58,9 @@
           }
           es.push(Event::C);
           data.num += 6;
-          let tmp = hunc(es, Arg::Func(data)).unwrap_func();
-          Ret::Func(tmp + 3)
+          cs.push(Cont::C2);
+          arg = Arg::Func(data);
+          continue;
         }
       }
       Arg::Gunc(num) => {
@@ -100,6 +102,10 @@
           let tmp = ret.unwrap_gunc().num;
           ret = Ret::Func(tmp + 2);
         }
+        Cont::C2 => {
+          let tmp = ret.unwrap_func();
+          ret = Ret::Func(tmp + 3);
+        }
       }
     }
     return ret;

Do `C3`

This is mostly the same, but note that we need a local variable from before the call now. So we have this variant carry data, namely, the local variable we need after the call.

@@ -37,6 +37,7 @@
 enum Cont {
   C1,
   C2,
+  C3(bool),
 }

 fn hunc(es: &mut Vec<Event>, mut arg: Arg) -> Ret {
@@ -72,9 +73,10 @@
           let data = Data { num: num + 2, cond };
           if es.len() < 5 {
             es.push(Event::E(es.len()));
-            let tmp = hunc(es, Arg::Func(data)).unwrap_func();
-            Ret::Gunc(Data { num: tmp + 3, cond })
-          } else {
+            cs.push(Cont::C3(cond));
+            arg = Arg::Func(data);
+            continue;
+          }
           let mut cond = es.len() % 3 > 0;
           if cond {
             let tmp = hunc(es, Arg::Func(data.clone())).unwrap_func();
@@ -94,7 +96,6 @@
           }
         }
       }
-      }
     };
     while let Some(cont) = cs.pop() {
       match cont {
@@ -106,6 +107,10 @@
           let tmp = ret.unwrap_func();
           ret = Ret::Func(tmp + 3);
         }
+        Cont::C3(cond) => {
+          let tmp = ret.unwrap_func();
+          ret = Ret::Gunc(Data { num: tmp + 3, cond });
+        }
       }
     }
     return ret;

Use `post_if_c4`

There’s a bit of difficulty around transforming recursive calls that are followed by other recursive calls, as well as transforming calls that are part of a conditional branch that ends and rejoins the pre-existing control flow.

We’ll see more in the next step, but for now we’re just going to move some stuff into a helper function. It has only one call site now, but in the next step we’ll add another call site.

@@ -82,18 +82,7 @@
             let tmp = hunc(es, Arg::Func(data.clone())).unwrap_func();
             cond = tmp % 2 == 0;
           }
-          if cond {
-            es.push(Event::F);
-            let mut tmp = hunc(es, Arg::Gunc(num + 4)).unwrap_gunc();
-            tmp.cond = !tmp.cond;
-            Ret::Gunc(tmp)
-          } else {
-            es.push(Event::G);
-            let fst = hunc(es, Arg::Func(data)).unwrap_func();
-            let mut tmp = hunc(es, Arg::Gunc(fst)).unwrap_gunc();
-            tmp.num += fst;
-            Ret::Gunc(tmp)
-          }
+          post_if_c4(es, data, num, cond)
         }
       }
     };
@@ -116,3 +105,18 @@
     return ret;
   }
 }
+
+fn post_if_c4(es: &mut Vec<Event>, data: Data, num: usize, cond: bool) -> Ret {
+  if cond {
+    es.push(Event::F);
+    let mut tmp = hunc(es, Arg::Gunc(num + 4)).unwrap_gunc();
+    tmp.cond = !tmp.cond;
+    Ret::Gunc(tmp)
+  } else {
+    es.push(Event::G);
+    let fst = hunc(es, Arg::Func(data)).unwrap_func();
+    let mut tmp = hunc(es, Arg::Gunc(fst)).unwrap_gunc();
+    tmp.num += fst;
+    Ret::Gunc(tmp)
+  }
+}

Do `C4`

Now we see why we needed the post_if_c4 helper from last step.

@@ -38,6 +38,7 @@
   C1,
   C2,
   C3(bool),
+  C4(Data, usize),
 }

 fn hunc(es: &mut Vec<Event>, mut arg: Arg) -> Ret {
@@ -77,10 +78,11 @@
             arg = Arg::Func(data);
             continue;
           }
-          let mut cond = es.len() % 3 > 0;
+          let cond = es.len() % 3 > 0;
           if cond {
-            let tmp = hunc(es, Arg::Func(data.clone())).unwrap_func();
-            cond = tmp % 2 == 0;
+            cs.push(Cont::C4(data.clone(), num));
+            arg = Arg::Func(data);
+            continue;
           }
           post_if_c4(es, data, num, cond)
         }
@@ -100,6 +102,11 @@
           let tmp = ret.unwrap_func();
           ret = Ret::Gunc(Data { num: tmp + 3, cond });
         }
+        Cont::C4(data, num) => {
+          let tmp = ret.unwrap_func();
+          let cond = tmp % 2 == 0;
+          ret = post_if_c4(es, data, num, cond);
+        }
       }
     }
     return ret;

Return `ControlFlow`

This next step is more pre-work.

In the next few steps, we’re going to transform our helper function. Note that when we added the helper function, we actually unfortunately returned to a state where we have not only recursion, but mutual recursion: hunc called post_if_c4 and vice versa.

We need to change post_if_c4 to push to cs, update arg, and continue instead of recursively calling hunc, as in the previous steps. But there’s a problem: we can’t use continue, because we’re not syntactically inside the outer loop in hunc.

The solution is to use std::ops::ControlFlow, in the Rust standard library. This reifies the control flow choice between break and continue.

When we want to continue from post_if_c4, we’ll return a Continue variant containing the new argument for the recursive call. When we want to just return, we’ll use Break.

Then hunc, which is the only caller of post_if_c4, can examine the ControlFlow value and either use the returned Break value or mutate its arg to the new argument inside the Continue variant and continue.

Note that we need to add a label to the 'outer wrapper loop, so that, inside the while loop popping off the stack of Conts, we can jump out of that loop and back into the main outer loop.

This corresponds to making more than one recursive call. If we have more than one recursive call, those further calls are necessarily after that first recursive call. Everything after a recursive call is inside a continuation, and we handle continuations in the while loop.

@@ -1,4 +1,5 @@
 use crate::common::{Data, Event, THRESHOLD};
+use std::ops::ControlFlow;

 pub fn func(es: &mut Vec<Event>, data: Data) -> usize {
   hunc(es, Arg::Func(data)).unwrap_func()
@@ -43,7 +44,7 @@

 fn hunc(es: &mut Vec<Event>, mut arg: Arg) -> Ret {
   let mut cs = Vec::<Cont>::new();
-  loop {
+  'outer: loop {
     let mut ret = match arg {
       Arg::Func(mut data) => {
         if data.num >= THRESHOLD {
@@ -84,7 +85,13 @@
             arg = Arg::Func(data);
             continue;
           }
-          post_if_c4(es, data, num, cond)
+          match post_if_c4(es, data, num, cond) {
+            ControlFlow::Continue(a) => {
+              arg = a;
+              continue;
+            }
+            ControlFlow::Break(r) => r,
+          }
         }
       }
     };
@@ -105,7 +112,13 @@
         Cont::C4(data, num) => {
           let tmp = ret.unwrap_func();
           let cond = tmp % 2 == 0;
-          ret = post_if_c4(es, data, num, cond);
+          match post_if_c4(es, data, num, cond) {
+            ControlFlow::Continue(a) => {
+              arg = a;
+              continue 'outer;
+            }
+            ControlFlow::Break(r) => ret = r,
+          }
         }
       }
     }
@@ -113,17 +126,17 @@
   }
 }

-fn post_if_c4(es: &mut Vec<Event>, data: Data, num: usize, cond: bool) -> Ret {
+fn post_if_c4(es: &mut Vec<Event>, data: Data, num: usize, cond: bool) -> ControlFlow<Ret, Arg> {
   if cond {
     es.push(Event::F);
     let mut tmp = hunc(es, Arg::Gunc(num + 4)).unwrap_gunc();
     tmp.cond = !tmp.cond;
-    Ret::Gunc(tmp)
+    ControlFlow::Break(Ret::Gunc(tmp))
   } else {
     es.push(Event::G);
     let fst = hunc(es, Arg::Func(data)).unwrap_func();
     let mut tmp = hunc(es, Arg::Gunc(fst)).unwrap_gunc();
     tmp.num += fst;
-    Ret::Gunc(tmp)
+    ControlFlow::Break(Ret::Gunc(tmp))
   }
 }

Do `C5`

With the new ControlFlow machinery, we can continue along (heh). Note that the continuations we’re pushing are now inside post_if_c4.

We also have to pass cs into post_if_c4.

@@ -40,6 +40,7 @@
   C2,
   C3(bool),
   C4(Data, usize),
+  C5,
 }

 fn hunc(es: &mut Vec<Event>, mut arg: Arg) -> Ret {
@@ -85,7 +86,7 @@
             arg = Arg::Func(data);
             continue;
           }
-          match post_if_c4(es, data, num, cond) {
+          match post_if_c4(&mut cs, es, data, num, cond) {
             ControlFlow::Continue(a) => {
               arg = a;
               continue;
@@ -112,7 +113,7 @@
         Cont::C4(data, num) => {
           let tmp = ret.unwrap_func();
           let cond = tmp % 2 == 0;
-          match post_if_c4(es, data, num, cond) {
+          match post_if_c4(&mut cs, es, data, num, cond) {
             ControlFlow::Continue(a) => {
               arg = a;
               continue 'outer;
@@ -120,18 +121,28 @@
             ControlFlow::Break(r) => ret = r,
           }
         }
+        Cont::C5 => {
+          let mut tmp = ret.unwrap_gunc();
+          tmp.cond = !tmp.cond;
+          ret = Ret::Gunc(tmp);
+        }
       }
     }
     return ret;
   }
 }

-fn post_if_c4(es: &mut Vec<Event>, data: Data, num: usize, cond: bool) -> ControlFlow<Ret, Arg> {
+fn post_if_c4(
+  cs: &mut Vec<Cont>,
+  es: &mut Vec<Event>,
+  data: Data,
+  num: usize,
+  cond: bool,
+) -> ControlFlow<Ret, Arg> {
   if cond {
     es.push(Event::F);
-    let mut tmp = hunc(es, Arg::Gunc(num + 4)).unwrap_gunc();
-    tmp.cond = !tmp.cond;
-    ControlFlow::Break(Ret::Gunc(tmp))
+    cs.push(Cont::C5);
+    ControlFlow::Continue(Arg::Gunc(num + 4))
   } else {
     es.push(Event::G);
     let fst = hunc(es, Arg::Func(data)).unwrap_func();

Do `C6`

Note that we recursively call hunc in the newly moved block for Cont::C6. We’ll get rid of that in the next step.

@@ -41,6 +41,7 @@
   C3(bool),
   C4(Data, usize),
   C5,
+  C6,
 }

 fn hunc(es: &mut Vec<Event>, mut arg: Arg) -> Ret {
@@ -126,6 +127,12 @@
           tmp.cond = !tmp.cond;
           ret = Ret::Gunc(tmp);
         }
+        Cont::C6 => {
+          let fst = ret.unwrap_func();
+          let mut tmp = hunc(es, Arg::Gunc(fst)).unwrap_gunc();
+          tmp.num += fst;
+          ret = Ret::Gunc(tmp);
+        }
       }
     }
     return ret;
@@ -145,9 +152,7 @@
     ControlFlow::Continue(Arg::Gunc(num + 4))
   } else {
     es.push(Event::G);
-    let fst = hunc(es, Arg::Func(data)).unwrap_func();
-    let mut tmp = hunc(es, Arg::Gunc(fst)).unwrap_gunc();
-    tmp.num += fst;
-    ControlFlow::Break(Ret::Gunc(tmp))
+    cs.push(Cont::C6);
+    ControlFlow::Continue(Arg::Func(data))
   }
 }

Do `C7`

As predicted, we replace the recursive call with a cont push, arg set, and continue, but we must use the 'outer label because we’re in the while loop.

This is the last one!

@@ -42,6 +42,7 @@
   C4(Data, usize),
   C5,
   C6,
+  C7(usize),
 }

 fn hunc(es: &mut Vec<Event>, mut arg: Arg) -> Ret {
@@ -129,7 +130,12 @@
         }
         Cont::C6 => {
           let fst = ret.unwrap_func();
-          let mut tmp = hunc(es, Arg::Gunc(fst)).unwrap_gunc();
+          cs.push(Cont::C7(fst));
+          arg = Arg::Gunc(fst);
+          continue 'outer;
+        }
+        Cont::C7(fst) => {
+          let mut tmp = ret.unwrap_gunc();
           tmp.num += fst;
           ret = Ret::Gunc(tmp);
         }

Remove `ControlFlow`

Since we only return Continue from post_if_c4, we can remove ControlFlow and just always continue.

@@ -1,5 +1,4 @@
 use crate::common::{Data, Event, THRESHOLD};
-use std::ops::ControlFlow;

 pub fn func(es: &mut Vec<Event>, data: Data) -> usize {
   hunc(es, Arg::Func(data)).unwrap_func()
@@ -88,14 +87,9 @@
             arg = Arg::Func(data);
             continue;
           }
-          match post_if_c4(&mut cs, es, data, num, cond) {
-            ControlFlow::Continue(a) => {
-              arg = a;
+          arg = post_if_c4(&mut cs, es, data, num, cond);
           continue;
         }
-            ControlFlow::Break(r) => r,
-          }
-        }
       }
     };
     while let Some(cont) = cs.pop() {
@@ -115,14 +109,9 @@
         Cont::C4(data, num) => {
           let tmp = ret.unwrap_func();
           let cond = tmp % 2 == 0;
-          match post_if_c4(&mut cs, es, data, num, cond) {
-            ControlFlow::Continue(a) => {
-              arg = a;
+          arg = post_if_c4(&mut cs, es, data, num, cond);
           continue 'outer;
         }
-            ControlFlow::Break(r) => ret = r,
-          }
-        }
         Cont::C5 => {
           let mut tmp = ret.unwrap_gunc();
           tmp.cond = !tmp.cond;
@@ -145,20 +134,14 @@
   }
 }

-fn post_if_c4(
-  cs: &mut Vec<Cont>,
-  es: &mut Vec<Event>,
-  data: Data,
-  num: usize,
-  cond: bool,
-) -> ControlFlow<Ret, Arg> {
+fn post_if_c4(cs: &mut Vec<Cont>, es: &mut Vec<Event>, data: Data, num: usize, cond: bool) -> Arg {
   if cond {
     es.push(Event::F);
     cs.push(Cont::C5);
-    ControlFlow::Continue(Arg::Gunc(num + 4))
+    Arg::Gunc(num + 4)
   } else {
     es.push(Event::G);
     cs.push(Cont::C6);
-    ControlFlow::Continue(Arg::Func(data))
+    Arg::Func(data)
   }
 }

Conclusion

This transformation process should work for almost any arbitrarily complicated set of mutually recursive functions.

One case I found that I couldn’t handle was when one of the mutually recursive functions takes an argument by reference, and its caller constructs an owned local variable and passes a reference to it. This is tricky to handle with Rust’s borrow checker when transforming the continuations.

Even if I tried to move the temporary in the continuation variant and then grab a reference to that, it would still not work because we might mutate the continuation stack, which could invalidate the reference if the backing array was reallocated.

It might be possible to do this with some hackery with Pin. Basically, we need a guarantee that the local variable will not move until we process the continuation, at which point we can drop it.