SyncBarriers.jl
SyncBarriers — Module
SyncBarriers.jl provides several barrier implementations for shared-memory synchronization and reductions in concurrent Julia programs. It respects the cooperative multitasking nature of Julia's task system while allowing programmers to express and leverage the structure of the parallelism in their programs.
See the documentation for more information.
Note: Inserting barriers appropriately, so that a parallel program is both correct and efficient, is rather hard. For casual programming, it is recommended to use higher-level data-parallel approaches.
A toy example
julia> using SyncBarriers
julia> xs = zeros(Bool, 20);
julia> xs[end÷2] = true;
julia> barrier = Barrier(length(xs) - 2);
julia> @sync for i in 2:length(xs)-1
           b = barrier[i-1]
           Threads.@spawn begin
               if i == 2
                   println()
                   join(stdout, (" █"[x + 1] for x in xs))
                   println()
               end
               for _ in 1:8
                   cycle!(b)  # wait for print
                   l, c, r = xs[i-1:i+1]  # (loading)
                   cycle!(b)  # wait for load
                   xs[i] = l ⊻ (c | r)  # (storing)
                   cycle!(b)  # wait for store
                   if i == 2
                       join(stdout, (" █"[x + 1] for x in xs))
                       println()
                   end
               end
           end
       end
         █
        ███
       ██  █
      ██ ████
     ██  █   █
    ██ ████ ███
   ██  █    █  █
  ██ ████  ██████
 ██  █   ███     █
See the benchmarks for examples with actual performance considerations.
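For comparison with the note above on higher-level approaches, the following is a minimal sketch of the same cellular automaton written without explicit barriers. It assumes double buffering plus `Threads.@threads` from Base, so the end of each threaded loop is the only synchronization point. The names `step!` and `evolve` are hypothetical and are not part of SyncBarriers.jl.

```julia
# Hypothetical helpers (not part of SyncBarriers.jl): one update step that
# reads from `prev` and writes to `next`, so loads and stores never overlap.
function step!(next::Vector{Bool}, prev::Vector{Bool})
    Threads.@threads for i in 2:length(prev)-1
        l, c, r = prev[i-1], prev[i], prev[i+1]
        next[i] = l ⊻ (c | r)
    end
    return next
end

function evolve(xs::Vector{Bool}, nsteps::Integer)
    prev = copy(xs)
    next = copy(xs)  # boundary cells are copied once and never updated
    for _ in 1:nsteps
        step!(next, prev)        # implicit join at the end of @threads
        prev, next = next, prev  # swap buffers for the next step
    end
    return prev
end

xs = zeros(Bool, 20)
xs[end÷2] = true
ys = evolve(xs, 8)
println(join(" █"[x + 1] for x in ys))
```

This version spends one extra array and one fork-join per step, while the barrier version keeps the same tasks alive across all steps.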
Barrier factories
Barrier factories create a barrier with a given property without specifying the actual implementation. They use simple heuristics to determine an appropriate implementation.
SyncBarriers.Barrier — Type
Barrier(ntasks::Integer) -> barrier
Create a barrier for ntasks tasks. Call cycle!(barrier[i]) in the i-th task to wait for the other tasks to arrive at the same phase.
The concrete type returned is not part of the API. It is CentralizedBarrier for small ntasks and DisseminationBarrier for large ntasks.
Supported method: cycle!
SyncBarriers.reduce_barrier — Function
reduce_barrier(op, T::Type, ntasks::Integer) -> barrier::Barrier
Create a reduce barrier for ntasks tasks. A reduce barrier supports computing a reduction with an associative operator op(::T, ::T) across tasks by calling reduce!(barrier[i], xᵢ::T).
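As a small illustration of the signature above, the sketch below uses `max` as the associative operator so that every task learns the largest value held by any task. The input values and the variable names `vals` and `out` are made up for the example.

```julia
using SyncBarriers

vals = [3, 1, 4, 1, 5]  # one (made-up) value per task
out = similar(vals)
barrier = reduce_barrier(max, Int, length(vals))

@sync for i in eachindex(vals)
    Threads.@spawn begin
        # Every task contributes vals[i] and receives the combined result.
        out[i] = reduce!(barrier[i], vals[i])
    end
end

println(out)
```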
SyncBarriers.fuzzy_barrier — Function
fuzzy_barrier(ntasks::Integer) -> barrier::Barrier
Create a fuzzy barrier for ntasks tasks. In addition to the methods supported by "plain" barriers (see Barrier), fuzzy barriers support arrive!(barrier[i]) and depart!(barrier[i]) to do cycle! in two steps.
SyncBarriers.fuzzy_reduce_barrier — Function
fuzzy_reduce_barrier(op, T::Type, ntasks::Integer) -> barrier::Barrier
Create a fuzzy reduce barrier for ntasks tasks. In addition to the methods supported by reduce barriers (see reduce_barrier), fuzzy reduce barriers support reduce_arrive!(barrier[i], xᵢ) and depart!(barrier[i]).
Barrier constructors
SyncBarriers.CentralizedBarrier — Type
CentralizedBarrier(ntasks::Integer)
Create the sense-reversing centralized barrier for ntasks tasks. It supports fuzzy barrier methods. For small ntasks (⪅ 32), it provides the best performance.
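To make the term "sense-reversing centralized barrier" concrete, here is a minimal textbook-style sketch in plain Julia. It is an illustration only and is not the SyncBarriers.jl implementation: `ToyCentralBarrier` and `toy_cycle!` are hypothetical names, and waiting is done with a simple `yield()` loop.

```julia
# Illustrative sketch; NOT how SyncBarriers.jl implements CentralizedBarrier.
struct ToyCentralBarrier
    ntasks::Int
    count::Threads.Atomic{Int}   # tasks still expected in the current phase
    sense::Threads.Atomic{Bool}  # shared flag, flipped once per phase
end

ToyCentralBarrier(n::Integer) =
    ToyCentralBarrier(n, Threads.Atomic{Int}(n), Threads.Atomic{Bool}(false))

# Each task keeps its own sense and passes the returned value to the next call.
function toy_cycle!(b::ToyCentralBarrier, local_sense::Bool)
    local_sense = !local_sense
    if Threads.atomic_sub!(b.count, 1) == 1  # old value 1: last task to arrive
        b.count[] = b.ntasks                 # reset the counter for the next phase
        b.sense[] = local_sense              # flip the flag to release the others
    else
        while b.sense[] != local_sense
            yield()                          # let other tasks run while waiting
        end
    end
    return local_sense
end

n = 4
b = ToyCentralBarrier(n)
xs = zeros(Int, n)
ys = zeros(Int, n)
@sync for i in 1:n
    Threads.@spawn begin
        sense = false
        xs[i] = i^2
        sense = toy_cycle!(b, sense)  # all stores to xs are complete after this
        ys[i] = xs[mod1(i + 1, n)]    # safe to read a neighbour's slot
    end
end
println(ys)
```

All tasks contend on the same counter and flag, which is simple and cheap for few tasks but is one reason other designs are preferred as ntasks grows.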
SyncBarriers.DisseminationBarrier — Type
DisseminationBarrier(ntasks::Integer)
Create the dissemination barrier for ntasks tasks. It provides the best performance, especially for large ntasks (⪆ 32).
Supported method: cycle!
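The communication pattern of a textbook dissemination barrier can be sketched as follows: in round k (counting from 0), task i signals task i + 2^k, wrapping around, and after ⌈log₂ n⌉ rounds every task has heard, directly or indirectly, from every other task. The helper names below are hypothetical, and the sketch describes the general algorithm rather than the internals of SyncBarriers.jl.

```julia
# Number of rounds: the smallest r with 2^r >= n.
function dissemination_rounds(n::Integer)
    r = 0
    while 2^r < n
        r += 1
    end
    return r
end

# Tasks that task i (1-based) signals, one entry per round.
dissemination_partners(i::Integer, n::Integer) =
    [mod1(i + 2^k, n) for k in 0:dissemination_rounds(n)-1]

for i in 1:5
    println("task ", i, " signals ", dissemination_partners(i, 5))
end
```

Because each task only ever signals a logarithmic number of distinct partners, no single memory location is contended by all tasks.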
SyncBarriers.StaticTreeBarrier — Type
StaticTreeBarrier{NArrive,NDepart}(ntasks::Integer)
StaticTreeBarrier{NArrive,NDepart,T}(op, ntasks::Integer)
Create the static tree barrier for ntasks tasks with the branching factor for arrival NArrive::Integer and departure NDepart::Integer specified by the type parameters.
It supports fuzzy reduce barrier methods if the associative operation op and its domain T are given. Otherwise, it only supports fuzzy barrier methods.
It provides the best performance for large ntasks (⪆ 32) when reduction is needed.
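A usage sketch based on the second signature above, summing the task indices with a fuzzy reduction. The branching factors `2` and `2` are arbitrary choices for illustration, not tuned recommendations, and the constructor is written with the module prefix because this page does not say whether it is exported.

```julia
using SyncBarriers

n = 8
# NArrive = 2 and NDepart = 2 are assumed values for the example.
barrier = SyncBarriers.StaticTreeBarrier{2,2,Int}(+, n)
sums = zeros(Int, n)

@sync for i in 1:n
    Threads.@spawn begin
        reduce_arrive!(barrier[i], i)  # contribute i to the sum
        sums[i] = depart!(barrier[i])  # wait, then fetch the total
    end
end

println(sums)
```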
SyncBarriers.TreeBarrier — Type
TreeBarrier{NBranches}(ntasks::Integer)
TreeBarrier{NBranches,T}(op, ntasks::Integer)
Create the tree barrier for ntasks tasks with the branching factor specified by the type parameter NBranches::Integer.
It supports fuzzy reduce barrier methods if the associative operation op and its domain T are given. Otherwise, it only supports fuzzy barrier methods.
Supported methods: cycle!, arrive!, depart!, reduce!, reduce_arrive!
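To give a feel for what a branching factor means, the sketch below lays tasks out as a complete k-ary tree using the usual array-based indexing. This is only an assumed layout for illustration; the page does not describe how SyncBarriers.jl arranges tasks internally, and `tree_parent` and `tree_children` are hypothetical helpers.

```julia
# Assumed layout: task 1 is the root and the children of task i occupy
# consecutive indices (standard k-ary heap indexing, 1-based).
tree_parent(i::Integer, k::Integer) = i == 1 ? nothing : (i - 2) ÷ k + 1

tree_children(i::Integer, k::Integer, n::Integer) =
    (k * (i - 1) + 2):min(k * (i - 1) + k + 1, n)

n, k = 7, 2
for i in 1:n
    println("task ", i, ": parent = ", tree_parent(i, k),
            ", children = ", collect(tree_children(i, k, n)))
end
```

In such a layout each task waits only for its own children before notifying its parent, so a larger branching factor gives a shallower tree at the cost of more waiting per node.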
SyncBarriers.FlatTreeBarrier — Type
FlatTreeBarrier{NBranches}(ntasks::Integer)
FlatTreeBarrier{NBranches,T}(op, ntasks::Integer)
Create the tree barrier for ntasks tasks with the branching factor specified by the type parameter NBranches::Integer. The departure is done serially (hence "flat").
It supports fuzzy reduce barrier methods if the associative operation op and its domain T are given. Otherwise, it only supports fuzzy barrier methods.
Supported methods: cycle!, arrive!, depart!, reduce!, reduce_arrive!
Synchronizing operations
SyncBarriers.cycle! — Function
cycle!(barrier[i])
Using a barrier::Barrier, signal that the i::Integer-th task has reached a certain phase of the program and wait for other tasks to reach the same phase.
Examples
julia> using SyncBarriers
julia> xs = [1:3;];
julia> barrier = Barrier(3);
julia> @sync for i in 1:3
           Threads.@spawn begin
               x = i^2
               xs[i] = x
               cycle!(barrier[i])
               xs[mod1(i + 1, 3)] -= x
           end
       end
julia> xs
3-element Vector{Int64}:
 -8
  3
  5
SyncBarriers.arrive! — Function
arrive!(barrier[i])
Signal that the i::Integer-th task has reached a certain phase but postpone the synchronization for the departure.
A call to cycle! is equivalent to arrive! followed by depart!. However, the task calling arrive! can work on some other local computations before calling depart!, which waits for other tasks to call arrive!.
Note that not all Barrier subtypes support arrive!.
See fuzzy_barrier, depart!.
Examples
julia> using SyncBarriers
julia> xs = [1:3;];
julia> ys = similar(xs);
julia> barrier = fuzzy_barrier(3);
julia> @sync for i in 1:3
           Threads.@spawn begin
               x = i^2
               xs[i] = x
               arrive!(barrier[i])  # does not `wait`
               ys[i] = x - 1        # do some work while waiting for other tasks
               depart!(barrier[i])  # ensure all tasks have reached `arrive!`
               xs[mod1(i + 1, 3)] -= x
           end
       end
julia> xs
3-element Vector{Int64}:
 -8
  3
  5
julia> ys
3-element Vector{Int64}:
 0
 3
 8
SyncBarriers.depart! — Function
depart!(barrier[i])
depart!(barrier[i]) -> acc::T
Wait for all calls to arrive!(barrier[i]) or reduce_arrive!(barrier[i], _) for i = 1, 2, ..., ntasks.
If the barrier is a fuzzy reduce barrier (created, e.g., by fuzzy_reduce_barrier(op, T, ntasks)), it returns the result of reduction started by the prior call to reduce_arrive!(barrier[i], xᵢ::T).
Note that not all Barrier subtypes support depart!.
See fuzzy_barrier, arrive!, reduce_arrive!.
SyncBarriers.reduce! — Function
reduce!(barrier[i], xᵢ::T) -> acc::T
Using a reduce barrier barrier (created, e.g., by reduce_barrier(⊗, T, n)), it computes acc = x₁ ⊗ x₂ ⊗ ⋯ ⊗ xₙ.
Examples
julia> using SyncBarriers
julia> xs = Float64[1:4;];
julia> barrier = reduce_barrier(+, Float64, length(xs));
julia> @sync for i in eachindex(xs)
           Threads.@spawn begin
               x = i^2
               s = reduce!(barrier[i], x)
               m = s / length(xs)
               xs[i] = x - m
           end
       end
julia> xs
4-element Vector{Float64}:
 -6.5
 -3.5
  1.5
  8.5
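The requirement that the operator be associative matters because a parallel reduction is free to group the operands differently from a left-to-right fold. The sketch below shows this with a plain pairwise reduction in Base Julia. `tree_reduce` is a hypothetical helper used only to show the grouping, and it does not reflect the order SyncBarriers.jl actually uses.

```julia
# Pairwise (tree-shaped) reduction over a non-empty vector.
function tree_reduce(op, xs::AbstractVector)
    n = length(xs)
    n == 1 && return xs[1]
    mid = n ÷ 2
    left = tree_reduce(op, view(xs, 1:mid))
    right = tree_reduce(op, view(xs, mid+1:n))
    return op(left, right)
end

xs = [1, 2, 3, 4]

# Associative operator: the grouping does not change the result.
println(tree_reduce(+, xs) == foldl(+, xs))

# Non-associative operator: (1 - 2) - (3 - 4) differs from ((1 - 2) - 3) - 4.
println(tree_reduce(-, xs), " vs ", foldl(-, xs))
```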
SyncBarriers.reduce_arrive! — Function
reduce_arrive!(barrier[i], xᵢ::T)
Using a fuzzy reduce barrier barrier (created, e.g., by fuzzy_reduce_barrier(op, T, ntasks)), it initiates the reduction across tasks. The result of the reduction can be retrieved by a call to depart!(barrier[i]) once all tasks have called reduce_arrive!.
Examples
julia> using SyncBarriers
julia> xs = Float64[1:4;];
julia> ys = similar(xs);
julia> barrier = fuzzy_reduce_barrier(+, Float64, length(xs));
julia> @sync for i in eachindex(xs)
           Threads.@spawn begin
               x = i^2
               reduce_arrive!(barrier[i], x)
               ys[i] = x - 1
               s = depart!(barrier[i])
               m = s / length(xs)
               xs[i] = x - m
           end
       end
julia> xs
4-element Vector{Float64}:
 -6.5
 -3.5
  1.5
  8.5
julia> ys
4-element Vector{Float64}:
  0.0
  3.0
  8.0
 15.0