Pipeline macros
All macros have a curried version, so they can be easily chained using |>. For example:
julia> t = table([1,2,1,2], [4,5,6,7], [0.1, 0.2, 0.3,0.4], names = [:x, :y, :z]);
julia> t |> @where(:x >= 2) |> @transform({:x+:y})
Table with 2 rows, 4 columns:
x  y  z    x + y
────────────────
2  5  0.2  7
2  7  0.4  9
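To make the idea of a curried version concrete, here is a minimal sketch in plain Julia that does not use JuliaDB at all. A vector of named tuples stands in for the table, and the helpers `keep_where` and `add_column` are hypothetical names invented for this illustration; they are not part of JuliaDBMeta and say nothing about how its macros are implemented. The point is only that a function which takes its arguments first and returns a one-argument function of the data can be placed on the right-hand side of `|>`.

```julia
# A vector of named tuples stands in for the table (assumption for this sketch).
rows = [(x = 1, y = 4, z = 0.1), (x = 2, y = 5, z = 0.2),
        (x = 1, y = 6, z = 0.3), (x = 2, y = 7, z = 0.4)]

# Hypothetical curried helpers: each takes its arguments first and returns
# a one-argument function that still expects the data.
keep_where(pred) = data -> filter(pred, data)
add_column(f) = data -> map(row -> merge(row, f(row)), data)

# `|>` feeds the value on its left into the one-argument function on its right,
# so the curried helpers can be chained from left to right.
result = rows |> keep_where(r -> r.x >= 2) |> add_column(r -> (xy = r.x + r.y,))
```

Here `result` holds the two rows with `x >= 2`, each extended with a field `xy` equal to `x + y`, which mirrors the filter-then-transform order of the example above.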
To avoid the parentheses and to use the _ currying syntax, you can use the @apply macro instead:
JuliaDBMeta.@apply — Macro.
@apply(args...)
Concatenate a series of operations. Non-macro operations from JuliaDB are supported via the _ currying syntax, where _ marks the position of the incoming table (as in sort(_, :z)). A second optional argument is used for grouping:
julia> t = table([1,2,1,2], [4,5,6,7], [0.1, 0.2, 0.3,0.4], names = [:x, :y, :z]);
julia> @apply t begin
           @where :x >= 2
           @transform {:x+:y}
           sort(_, :z)
       end
Table with 2 rows, 4 columns:
x  y  z    x + y
────────────────
2  5  0.2  7
2  7  0.4  9
julia> @apply t :x flatten=true begin
           @transform {w = :y + 1}
           sort(_, :w)
       end
Table with 4 rows, 4 columns:
x  y  z    w
────────────
1  4  0.1  5
1  6  0.3  7
2  5  0.2  6
2  7  0.4  8
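The grouped form can be read as "split by the key, run the pipeline on each group, then flatten the pieces back into one table". The following sketch spells out that reading in plain Julia, again with a vector of named tuples standing in for the table. The names `pipeline`, `group_keys`, `groups` and `flattened` are hypothetical, and the code is an assumption about the semantics suggested by the output above, not JuliaDBMeta's implementation.

```julia
# A vector of named tuples stands in for the table (assumption for this sketch).
rows = [(x = 1, y = 4, z = 0.1), (x = 2, y = 5, z = 0.2),
        (x = 1, y = 6, z = 0.3), (x = 2, y = 7, z = 0.4)]

# The pipeline applied to each group: add w = y + 1, then sort the group by w.
pipeline(group) = sort(map(r -> merge(r, (w = r.y + 1,)), group), by = r -> r.w)

# Split the rows by the grouping key x, visiting the keys in sorted order.
group_keys = sort(unique(r.x for r in rows))
groups = [filter(r -> r.x == k, rows) for k in group_keys]

# Run the pipeline on each group separately, then concatenate ("flatten") the results.
flattened = reduce(vcat, map(pipeline, groups))
```

Because the sort runs inside each group, the `w` values in `flattened` are ordered within each value of `x` rather than across the whole table, which matches the row order shown in the output above.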
Use @applychunked to apply your pipeline independently on different processors:
JuliaDBMeta.@applychunked — Macro.
@applychunked(args...)
Split the table into chunks, apply the processing pipeline separately to each chunk and return the result as a distributed table.
julia> t = table([1,2,1,2], [4,5,6,7], [0.1, 0.2, 0.3,0.4], names = [:x, :y, :z], chunks = 2);
julia> @applychunked t begin
           @where :x >= 2
           @transform {:x+:y}
           sort(_, :z)
       end
Distributed Table with 2 rows in 2 chunks:
x  y  z    x + y
────────────────
2  5  0.2  7
2  7  0.4  9
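As a rough mental model of chunked processing, the sketch below splits the rows into two chunks, runs the same pipeline on each chunk on its own, and then concatenates the results. It is sequential plain Julia with a vector of named tuples standing in for the table; the real macro works on a distributed table and returns one, which this sketch does not attempt to reproduce. The names `pipeline`, `chunks`, `processed` and `combined` are hypothetical, and the assumption that the four rows are split into two consecutive pairs is made only for illustration.

```julia
# A vector of named tuples stands in for the table (assumption for this sketch).
rows = [(x = 1, y = 4, z = 0.1), (x = 2, y = 5, z = 0.2),
        (x = 1, y = 6, z = 0.3), (x = 2, y = 7, z = 0.4)]

# The pipeline from the example: keep x >= 2, add x + y as xy, sort by z.
function pipeline(chunk)
    kept = filter(r -> r.x >= 2, chunk)
    extended = map(r -> merge(r, (xy = r.x + r.y,)), kept)
    return sort(extended, by = r -> r.z)
end

# Split the rows into chunks of two consecutive rows each (assumed chunking).
chunks = collect(Iterators.partition(rows, 2))

# Each chunk is processed without seeing the others; here this happens in a
# plain sequential loop rather than on separate processors.
processed = map(pipeline, chunks)

# Concatenating the per-chunk results gives the overall set of rows.
combined = reduce(vcat, processed)
```

One consequence of this model is that an operation such as sorting acts within each chunk only, so the combined result is not guaranteed to be globally sorted when chunks overlap in the sort key.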