Let's take the first example from the Rust generators RFC:
#[async]
fn print_lines() -> io::Result<()> {
let addr = "127.0.0.1:8080".parse().unwrap();
let tcp = await!(TcpStream::connect(&addr))?;
...
}
The way this works is that calling TcpStream::connect
would immediately start the process of connecting to the address,
and return an object (a future) that represents the moment when the connection succeeds. Then the call to await!
suspends
the current function and resumes it only when the connection succeeded.
What the RFC doesn't show is that one of the interesting things with generators is that we can do this:
#[async]
fn print_lines() -> io::Result<()> {
let addr = "127.0.0.1:8080".parse().unwrap();
let connection1 = TcpStream::connect(&addr);
let connection2 = TcpStream::connect(&addr);
let connection3 = TcpStream::connect(&addr);
let actual_c1 = await!(connection1)?;
let actual_c2 = await!(connection2)?;
let actual_c3 = await!(connection3)?;
...
}
In other words, we immediately start all three connections, and only then do we suspend the current function three times
(of course as an optimization, a call to await!
would not interrupt the function if the connection has
already succeeded).
The advantage of splitting TcpStream::connect
from await!
is that the three connections execute concurrently.
In the context of a game, what you want is to process each frame as fast as possible. You have a list of CPU actions to perform, and they should all be finished in under 16 milliseconds after the first one has started. Since we are in the era of multi-core processors, you usually want to dispatch these actions between multiple threads.
Most modern game engines handle this by using tasks. All the CPU actions are split into small tasks that are processed by a thread pool. The tasks should be small enough that they can be evenly dispatched between all the CPU cores, without some cores sitting idle while they wait. The fact that the tasks must be small means that we want a way to make it easy to create lots of tasks.
"But that's easy!", you might say. After all, it is possible to use the futures-cpupool
crate or an equivalent, and
write something like this:
#[async]
fn create_task() -> u32 {
let t1 = spawn_task(|| 5);
let t2 = spawn_task(|| 6);
let t3 = spawn_task(|| 7);
await!(t1) + await!(t2) + await!(t3)
}
But this approach has several problems.
The biggest problem of this approach is that the closure of the task must be 'static
.
If it wasn't then you could write something like this, which is undefined behavior:
#[async]
fn create_task() {
let mut my_string = "test".to_owned();
{
let task = spawn_task(|| {
println!("{}", my_string);
});
mem::forget(task);
}
drop(my_string);
}
In this code, calling spawn_task
immediately starts executing the closure, but since the rest of the create_task
function
continues to execute concurrently, chances are that my_string
is going to be dropped before the println!
executes.
Requiring 'static
everywhere is really impractical, especially since we want to split operations into lots of tasks. You would basically need Arc
wrapped around structs that themselves contain Arc
s, that themselves wrap around structs that contain Arc
s. And consequently you would also need mutexes everywhere to go with the Arcs. We really want to avoid that.
Let me know if I'm missing something, but for me the only way to solve this leak safety problem is to ensure that the function that spawns the task doesn't return until that task is finished.
In other words you create a task then pass it to a function that immediately executes it, like this:
#[task]
fn create_task(some_borrow: &Foo) {
some_borrow.bar();
}
let local_foo: Foo = ...;
let task = create_task(&local_foo); // Doesn't start the task yet
task.start_the_task_in_a_thread_and_wait_until_it_is_finished();
This way you can't destroy local_foo
while it is in use by the task.
A consequence of this is that calling create_task
doesn't actually do anything except return a dummy object (which the compiler
will probably optimize into a no-op).
Obviously you might say that this constraint kills the point of using tasks. But here's the trick: a task can
immediately spawn other tasks, provided that the lifetime of the spawned task outlives the lifetime of the spawner. In
other words, during its execution create_task
can immediately spawn subtasks if it only passes some_borrow
(or some of the
fields of some_borrow
) to them. But create_task
cannot create a local variable and spawn a subtask that borrows this
local variable, unless it uses start_the_task_and_wait_until_it_is_finished
.
To sum things up, tasks can safely behave in two possible ways:
- If we're within a task and create a subtask whose lifetime outlives the parent, then the subtask can begin its execution immediately.
- Otherwise the task must not immediately start when it is created. Instead the helping library must provide functions (
start_the_task_and_wait_until_it_is_finished
or similar) that start the task and block the current thread until the task finishes.
So creating a task must sometimes do nothing and sometimes start the task immediately, depending on the context of the caller. This obviously can't be decided automatically, so we need a macro for the case where we want to start the task immediately.
In other words, it would look like this:
impl Foo {
#[task]
fn foo1(&self) -> u32 {
let foo2_task = task!(self.foo2()); // `self.foo2()` doesn't do anything, and `task!` starts the task
let foo3_task = task!(self.foo3());
await!(foo2_task) + await!(foo3_task) // `await!` waits for the task to be finished
}
#[task]
fn foo2(&self) -> u32 { .. }
#[task]
fn foo3(&self) -> u32 { .. }
}
The task!
macro would have to enforce that the lifetime of its content doesn't outlive the lifetime of the parent. Fortunately
this can be done with the procedural macro that handles #[task]
. The task!
macro wouldn't come from an external crate, but
instead be redeclared locally at the start of each function with the #[task]
attribute.
If, however, you want to spawn a subtask that uses a local variable, then you have to block:
impl Foo {
#[task]
fn foo1(&self) -> u32 {
let local_variable = Bar::new();
let bar_task = local_variable.bar(); // Doesn't start any execution immediately
let foo2_task = task!(self.foo2()); // Does start executing immediately
task_await!(bar_task) + await!(foo2_task) // Notice `task_await!`, a combination of `task!` + `await!`
}
}
The task_await!
macro would also be able to take multiple parameters so that we can block multiple tasks at once:
impl Foo {
#[task]
fn foo1(&self) -> u32 {
let local_variable = Bar::new();
let bar_task1 = local_variable.bar();
let bar_task2 = local_variable.bar();
let (r1, r2) = task_await!(bar_task1, bar_task2);
r1 + r2
}
}
The task_await!
macro works by yielding. If the user advances the coroutine generated by calling
foo.foo1()
until task_await!
yields, then mem::forget
s that coroutine, then we get undefined behavior.
This means that it is important that only the helper library be able to resume coroutines.
Outside of the context of a task, you would have to call start_the_task_and_wait_until_it_is_finished()
as explained.
let my_foo: Foo = ..;
let task = my_foo.foo1();
let ret_val = task.start_the_task_and_wait_until_it_is_finished();
A task system is still not simple either to create or to use. The leakpocalypse problem still makes multithreaded code difficult to write.
I just realized that the "subtask lifetime must outlive parent" thing still isn't safe, because the parent can return before the child if it
mem::forget
s the child.