By Marvin Desmond

Posted 31st March 2023

Asynchronous Programming in Julia

When a program needs to interact with the outside world, for example communicating with another machine over the internet, operations in the program may need to happen in an unpredictable order.
Say your program needs to download a file. We would like to initiate the download operation, perform other operations while we wait for it to complete, and then resume the code that needs the downloaded file when it is available. This sort of scenario falls in the domain of asynchronous programming, sometimes also referred to as concurrent programming (since, conceptually, multiple things are happening at once).

To address these scenarios, Julia provides Tasks (also known by several other names, such as symmetric coroutines, lightweight threads, cooperative multitasking, or one-shot continuations). When a piece of computing work (in practice, executing a particular function) is designated as a Task, it becomes possible to interrupt it by switching to another Task.

The original Task can later be resumed, at which point it will pick up right where it left off. At first, this may seem similar to a function call. However, there are two key differences.
~ First, switching tasks does not use any space, so any number of task switches can occur without consuming the call stack.
~ Second, switching among tasks can occur in any order, unlike function calls, where the called function must finish executing before control returns to the calling function.

Basic Task operations

You can think of a Task as a handle to a unit of computational work to be performed. It has a create-start-run-finish lifecycle. Tasks are created by calling the Task constructor on a 0-argument function to run, or by using the @task macro.
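As a rough illustration of that lifecycle, the sketch below creates the same unit of work both ways and walks it through the stages. The work itself (summing 1 to 10) and the variable names are made up for the example; istaskstarted, istaskdone, schedule, wait and fetch are standard Base functions.

```julia
# Two equivalent ways to create a task from a 0-argument function.
t1 = Task(() -> sum(1:10))   # constructor form
t2 = @task sum(1:10)         # macro form; wraps the expression in () -> ...

# Created, but not started: nothing has run yet.
started_before = istaskstarted(t1)   # false

schedule(t1)   # start: put the task on the scheduler's queue
wait(t1)       # run/finish: block until the task has completed

done_after = istaskdone(t1)          # true
result1 = fetch(t1)                  # the value the task's function returned

schedule(t2)
result2 = fetch(t2)                  # fetch waits for the task, then returns its value

println((started_before, done_after, result1, result2))
```

Nothing runs at creation time; the work only happens once the task has been scheduled and the current code waits or otherwise yields.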

What is a macro?

Macros change existing source code or generate entirely new code. They are not some kind of more powerful function that unlocks secret abilities of Julia; they are just a way to automatically write code that you could have written out by hand anyway. The question is only whether writing that code by hand is practical, not whether it is possible.

Macros run when the code is parsed, before the program executes: the expression a macro returns is inserted directly into your source code at the point where the macro is called. The macro itself therefore adds no call overhead at run time, although this does not make macros inherently faster than functions, since Julia's compiler can already inline small function calls.

Note: A decorator (as found in Python) is a higher-order function: it takes a function in and returns another function. A macro is different because it takes in the syntax it is applied to as an unevaluated expression, manipulates that syntax as it sees fit, and then returns the changed syntax tree to be evaluated later. It's like it gets the text of the thing being decorated and can rewrite it however it wants.

Let's consider the function below:

function show_fruit(a::String)
    println("The fruit you passed is $a")
end

first_fruit = "orange"
second_fruit = "apple"

show_fruit(first_fruit)
show_fruit(second_fruit)

[fourier]$ julia main.jl
The fruit you passed is orange
The fruit you passed is apple
[fourier]$
          

In this code snippet above, there is no way for the author of the function to know what the user named their variable. The function just receives a value, and as far as it is concerned, that value is named a. Any information about what the user wrote is lost, as the function only knows "orange" and "apple" were passed. If we want to incorporate the information contained in the variable names, we need a macro.

macro show_fruit(a)
    # string(a) is the variable's name; esc(a) refers to the caller's variable
    return :( println("The ", $(string(a)), " you passed is ", $(esc(a))) )
end

first_fruit = "orange"
second_fruit = "apple"

@show_fruit(first_fruit) # OR @show_fruit first_fruit
@show_fruit(second_fruit) # OR @show_fruit second_fruit

[fourier]$ julia main.jl
The first_fruit you passed is orange
The second_fruit you passed is apple
[fourier]$
          

Macro invocation

It is important to emphasize that macros receive their arguments as expressions, literals, or symbols. The above argument to the macro was taken as a Symbol.
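To see those three kinds of argument side by side, here is a small sketch. The macro name argtype is hypothetical; it simply reports the type of whatever it receives at expansion time, before anything is evaluated.

```julia
# Hypothetical helper: returns the type name of the macro's raw argument.
# The body runs at expansion time, so typeof sees the syntax, not a value.
macro argtype(x)
    return string(typeof(x))
end

fruit = "orange"

sym_type  = @argtype fruit      # a bare name arrives as a Symbol
expr_type = @argtype 1 + 2      # an operation arrives as an Expr
lit_type  = @argtype 42         # a literal arrives as itself (an Int here)
str_type  = @argtype "orange"   # string literals also arrive as themselves

println((sym_type, expr_type, lit_type, str_type))
```

Note that the macro never sees the value "orange" stored in fruit, only the name fruit.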

In Julia, an operation can be turned into an expression by quoting it, as below:

julia> 1 + 2
3

julia> :(1 + 2)
:(1 + 2)

julia> typeof(:(1 + 2))
Expr

julia> eval(:(1 + 2))
3

julia>
          

Let's define an expression and evaluate it with eval. Then let's write a custom evaluate macro that receives an operation as an expression and returns it unchanged, so that Julia evaluates it in place at the call site.

quiz = :(4 * 8)
println("Expression $(quiz) composed of $(quiz.args) becomes $(eval(quiz))")

macro evaluate(ex)
    return :($ex)
end
a = @evaluate 4 * 8
println(a)
a = @evaluate(4 + 6)
println(a)

[fourier]$ julia main.jl
Expression 4 * 8 composed of Any[:*, 4, 8] becomes 32
32
10
[fourier]$
          

Let's create a simplified version of Julia's @assert macro and call it superassert. We will also use macroexpand, which shows the code a macro call expands to and is an extremely useful tool for debugging macros.

macro superassert(ex)
    return :( $ex ? nothing : throw(AssertionError($(string(ex)))) )
end

println(macroexpand(Fourier, :(@superassert(1 == 1))))
# Fourier is the name of the module this code lives in;
# macroexpand fails if the module name is wrong.
# Alternatively, use the macro equivalent of macroexpand:
# println(@macroexpand @superassert 1 == 1)
@superassert 1 == 1.0

[fourier]$ julia main.jl
if 1 == 1
    Main.Fourier.nothing
else
    Main.Fourier.throw(Main.Fourier.AssertionError("1 == 1"))
end
[fourier]$
          

As a final macro example, let's implement a macro that takes a call to a cube function and computes 2x^2 - x, where x is the result of that call.

More on what happens behind @macro

function cube(x::Int)::Integer
    x * x * x
end

macro macroPower(a)
    return :(2 * $(a) ^ 2 - $(a))
end

println(@macroexpand @macroPower cube(3))
res = @macroPower cube(3)
println(res)

[fourier]$ julia main.jl
2 * Main.cube(3) ^ 2 - Main.cube(3)
1431
[fourier]$
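The expansion above shows cube(3) appearing twice, which means the function is called twice. If that matters (for instance when the argument is expensive or has side effects), one possible variation is to bind the argument to a local variable first. The names power_once, counted_cube and calls below are invented for this sketch.

```julia
# Variation on macroPower that evaluates its argument only once.
macro power_once(a)
    return quote
        let x = $(esc(a))     # esc: evaluate the argument in the caller's scope
            2 * x^2 - x
        end
    end
end

calls = Ref(0)                                  # counts how often the function runs
counted_cube(x::Int) = (calls[] += 1; x * x * x)

res = @power_once counted_cube(3)
println(res)        # 1431, same value as before
println(calls[])    # 1: the argument was evaluated a single time
```

The x inside the quoted block is renamed by macro hygiene, so it cannot clash with a variable called x at the call site.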
          

Now back onto Tasks

Let's implement a Task that will wait for five seconds, and then print done. A defined task does not run immediately but only after you schedule it to start running. Let's also include a for loop to see how the output of both occurs.

t = @task begin; sleep(5); println("done"); end

schedule(t)

for i = 0:10
    println("i -> $i")
    sleep(1)
end

[fourier]$ julia main.jl
i -> 0
i -> 1
i -> 2
i -> 3
i -> 4
done
i -> 5
i -> 6
i -> 7
i -> 8
i -> 9
i -> 10
[fourier]$
          

When we want to block until the Task has finished before moving on to anything else, we can use the wait function.

t = @task begin; sleep(5); println("done"); end

schedule(t)
wait(t)

for i = 0:10
    println("i -> $i")
    sleep(1)
end

[fourier]$ julia main.jl
done
i -> 0
i -> 1
i -> 2
i -> 3
i -> 4
i -> 5
i -> 6
i -> 7
i -> 8
i -> 9
i -> 10
[fourier]$
          

It is common to want to create a task and schedule it right away, so the macro @async is provided for that purpose: @async x is equivalent to schedule(@task x).

@async begin; sleep(5); println("done"); end


for i = 0:10
    println("i -> $i")
    sleep(1)
end

[fourier]$ julia main.jl
i -> 0
i -> 1
i -> 2
i -> 3
i -> 4
done
i -> 5
i -> 6
i -> 7
i -> 8
i -> 9
i -> 10
[fourier]$
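Because @async returns the Task it created, the result of the work can be collected later with fetch, and several tasks can be awaited together with @sync. The following is only a sketch, with made-up workloads and short sleeps standing in for real I/O.

```julia
# @async hands back the scheduled Task, so we can ask for its result later.
t = @async begin
    sleep(0.1)      # stand-in for slow I/O
    21 * 2
end
answer = fetch(t)   # blocks until the task finishes, then returns 42

# @sync waits for every @async started inside its block.
finished = Int[]
@sync for i in 1:3
    @async begin
        sleep(0.1 * (4 - i))   # task 3 sleeps least, task 1 sleeps most
        push!(finished, i)
    end
end

println(answer)
println(finished)   # all three ids; shorter sleeps usually finish first
```

The three sleeps overlap, so the @sync block takes roughly as long as the longest sleep rather than the sum of all three.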
          

Communicating with channels

In some problems, the various pieces of required work are not naturally related by function calls; there is no obvious "caller" or "callee" among the jobs that need to be done. An example is the producer-consumer problem, where one complex procedure is generating values and another complex procedure is consuming them. The consumer cannot simply call a producer function to get a value, because the producer may have more values to generate and so might not yet be ready to return. With tasks, the producer and consumer can both run as long as they need to, passing values back and forth as necessary.

Julia provides a Channel mechanism for solving this problem. A Channel is a waitable first-in first-out queue which can have multiple tasks reading from and writing to it.
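A minimal sketch of the first-in first-out behaviour, using a buffered channel so that a single task can both write and read without blocking. The buffer size and the values are arbitrary.

```julia
ch = Channel{Int}(3)        # buffered channel that holds up to 3 values

put!(ch, 10)
put!(ch, 20)
put!(ch, 30)

first_out = take!(ch)       # 10: the value that went in first comes out first

close(ch)                   # no more writes are accepted from here on
rest = collect(ch)          # values already buffered can still be read: [20, 30]

println(first_out)
println(rest)
```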

Let's define a producer task, which produces values via the put! call. To consume values, we need to schedule the producer to run in a new task. A special Channel constructor which accepts a 1-arg function as an argument can be used to run a task bound to a channel. We can then take! values repeatedly from the channel object.

function producer(c::Channel)
    put!(c, "start")
    for n=1:4
        put!(c, 2n)
    end
    put!(c, "stop")
end


# EITHER
chnl = Channel(producer)
res = take!(chnl)
println(res)
res = take!(chnl)
println(res)
res = take!(chnl)
println(res)
res = take!(chnl)
println(res)
res = take!(chnl)
println(res)
res = take!(chnl)
println(res)

#= OR
for x in Channel(producer)
    println(x)
end
=#

[fourier]$ julia main.jl
start
2
4
6
8
stop
[fourier]$
          

One way to think of this behavior is that producer was able to return multiple times. Between calls to put!, the producer's execution is suspended and the consumer has control.

With the EITHER approach, calling take! again after the consumer has already read every value and the producer has closed the channel raises an error (LoadError ~ Channel is closed). The OR approach stops iterating by itself once the channel is closed and empty, so I recommend always using the OR way. Always!
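The difference can be seen in a small sketch. The two-value producer here is made up; the exception type, InvalidStateException, is what Base raises when take! is called on a closed, empty channel.

```julia
# A task-bound channel whose producer emits just two values.
ch = Channel(c -> foreach(i -> put!(c, i), 1:2))

a = take!(ch)
b = take!(ch)

# A third take! has nothing to receive: the producer ends, the channel
# closes, and take! throws instead of returning a value.
err = try
    take!(ch)
catch e
    e
end
println(err isa InvalidStateException)   # true

# Iterating stops cleanly once the channel is closed and drained.
seen = Int[]
for x in Channel(c -> foreach(i -> put!(c, i), 1:2))
    push!(seen, x)
end
println(seen)
```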

While the Task constructor expects a 0-argument function, the Channel method that creates a task-bound channel expects a function that accepts a single argument of type Channel. A common pattern is for the producer to be parameterized, in which case a partial function application is needed to create a 0 or 1 argument anonymous function.

#= By partial function application I mean that, instead of
calling a function f directly this way =#
f(args)
# we wrap the call in a 0-argument anonymous function
() -> f(args)
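Applied to channels, the same idea looks like the sketch below: a producer that needs an extra parameter is wrapped in a 1-argument anonymous function before being handed to Channel. The producer name evens and its count parameter are hypothetical.

```julia
# A parameterized producer: it needs the channel and a count.
function evens(c::Channel, count::Int)
    for n in 1:count
        put!(c, 2n)
    end
end

# Channel expects a function of one argument (the channel itself),
# so we fix count = 4 with an anonymous function.
ch = Channel(c -> evens(c, 4))
values_out = collect(ch)
println(values_out)   # the values 2, 4, 6, 8

# The do-block form builds the same kind of anonymous function.
ch2 = Channel() do c
    evens(c, 3)
end
values_out2 = collect(ch2)
println(values_out2)
```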

Let's create a recursive fibonacci function and then run it in a Task created by way of partial function application.

function fibonacci(fib::Vector{Int64}, needed::Int)
    while length(fib) < needed
        a = fib[end - 1] # second-to-last element
        b = last(fib)
        a, b = b, a + b
        push!(fib, b)
        fibonacci(fib, needed)
    end
    return fib
end

fib_numbers = Vector{Int64}([0, 1])

fib_task = Task(() -> global fib_numbers = fibonacci(fib_numbers, 20))
schedule(fib_task)
# schedule only queues the task; wait blocks until it has actually run
wait(fib_task)

println(fib_numbers)

Rewriting the fibonacci function to compute the first two values itself instead of receiving them in the input vector, the modified function becomes

function fibonacci(fib::Vector{Int64}, needed::Int)
    while length(fib) < needed
        a = length(fib) == 0 ? 0 :
            (length(fib) == 1 ? last(fib) : fib[end - 1])
        b = length(fib) == 0 ? a :
            (length(fib) == 1 ? 1 : last(fib))
        a, b = b, a + b
        push!(fib, b)
        fibonacci(fib, needed)
    end
    return fib
end

fib_numbers = Int64[]

fib_task = @task global fib_numbers = fibonacci(fib_numbers, 20)
schedule(fib_task)
# schedule only queues the task, so wait for it to finish
# before reading the updated list
wait(fib_task)
println(fib_numbers)

[fourier]$ julia main.jl
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181]
[fourier]$
          

Binding tasks/channels

We've handled channels with producer and consumer tasks, and before that we handled tasks together with schedule, wait and @async. Now let's bind multiple tasks to a channel and multiple channels to a task. In other words, let's explicitly link a set of channels with a set of producer/consumer tasks.

When a channel is bound to multiple tasks, the first task to terminate will close the channel. When multiple channels are bound to the same task, termination of the task will close all of the bound channels.

Let's bind a single task to a single channel

c = Channel(0)

task = @async begin; foreach(i -> put!(c, i), 1:4); end

bind(c, task)

for i in c 
    @show i
end
# println(take!(c))

[fourier]$ julia main.jl
i = 1
i = 2
i = 3
i = 4
[fourier]$
          

If you try to take a value from the channel after all values have been read, an error occurs. This is because binding the channel to the task closes the channel when the task ends.

One Channel, Many tasks

Let's bind two tasks to a channel:
Result ~ When the first task completes, the channel should close and the program ends without the second task completing.

c = Channel(0)

task = @async begin; foreach(i -> put!(c, i), 1:4); end
task2 = @async begin; foreach(i -> put!(c, i), 20:30); end

bind(c, task)
bind(c, task2)

for i in c 
    @show i
end

[fourier]$ julia main.jl
i = 1
i = 20
i = 2
i = 21
i = 3
i = 22
i = 4
[fourier]$
          

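Two ways to handle this are to:
1~ Declare the channel with a large buffer size. put! then returns immediately instead of waiting for the consumer, so in the run below each task deposits all of its values before the channel is closed, and values already in the buffer can still be read after the channel closes.
2~ Instead of binding the tasks to the channel, only close the channel when all the tasks are done.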

# Step 1 ~ Declare the channel with large buffer size
c = Channel{Int64}(100)

task = @async begin; foreach(i -> put!(c, i), 1:4); end
task2 = @async begin; foreach(i -> put!(c, i), 20:30); end

bind(c, task)
bind(c, task2)

for i in c 
    @show i
end

[fourier]$ julia main.jl
i = 1
i = 2
i = 3
i = 4
i = 20
i = 21
i = 22
i = 23
i = 24
i = 25
i = 26
i = 27
i = 28
i = 29
i = 30
[fourier]$
          

#=
Step 2
Instead of binding the tasks to the channels, 
only close the channel when all the tasks are done.
=#
c = Channel{Int64}(0)

task = @async begin; foreach(i -> put!(c, i), 1:4); end
task2 = @async begin; foreach(i -> put!(c, i), 20:30); end

for i in c
    @show i
    # close the channel once both producer tasks have finished
    if all(istaskdone, [task, task2])
        close(c)
    end
end

[fourier]$ julia main.jl
i = 1
i = 20
i = 2
i = 21
i = 3
i = 22
i = 4
i = 23
i = 24
i = 25
i = 26
i = 27
i = 28
i = 29
i = 30
[fourier]$
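The check inside the loop above relies on the producers having finished by the time the last value is shown. A possible alternative, sketched here under the same setup, moves the closing logic into a small supervising task that waits for both producers and then closes the channel. The names producers and received are invented for the example.

```julia
c = Channel{Int64}(0)

# Same two producers as above, kept in a list so we can wait on them.
producers = [
    @async(foreach(i -> put!(c, i), 1:4)),
    @async(foreach(i -> put!(c, i), 20:30)),
]

# Supervising task: close the channel only after every producer is done.
@async begin
    foreach(wait, producers)
    close(c)
end

received = collect(c)        # ends once the channel is closed and empty
println(sort(received))
```

With this arrangement the consumer does not need to know how many producers exist; it just reads until the channel closes.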
          

One Task, Many Channels

c = Channel{Int64}(100)
d = Channel{Float64}(100)

task2 = @async begin; foreach(i -> (put!(c, i); put!(d, sqrt(i));), 20:30); end

bind(c, task2)
bind(d, task2)

for i in c 
    @show i 
end

for i in d 
    @show i
end

[fourier]$ julia main.jl
i = 20
i = 21
i = 22
i = 23
i = 24
i = 25
i = 26
i = 27
i = 28
i = 29
i = 30
i = 4.47213595499958
i = 4.58257569495584
i = 4.69041575982343
i = 4.795831523312719
i = 4.898979485566356
i = 5.0
i = 5.0990195135927845
i = 5.196152422706632
i = 5.291502622129181
i = 5.385164807134504
i = 5.477225575051661
[fourier]$
          

This is the first step in parallel computing. Thanks!