Scientists need to easily explore new ideas:
A need for customization (avoid black boxes, try variations)
But also a need for performance (as an intrinsic goal, or to enable reuse)
What is done now (when you need to go further than existing packages, e.g. for data analysis):
1) Prototype in R/Python/Matlab
2) Rewrite the whole program, or parts of it (given the money/manpower), in a performant low-level language such as C/Fortran
Julia's goal: solve this two-language problem with a dynamic, high-level language that has good mathematical expressivity yet can be as fast as C/Fortran.
A = randn(4,4) # 4 x 4 matrix of normally distributed random numbers
4×4 Array{Float64,2}:
 -1.7897    -1.27394   -0.796802  -1.12804
 -0.843538  -0.607065   0.187264   0.0447423
 -0.83152   -0.483635   0.0594034 -0.607388
  0.194026   1.4971     0.533445  -0.379708
A[:,1] # familiar colon syntax: extract the first column of A
4-element Array{Float64,1}:
 -1.7896952667660175
 -0.843538211646361
 -0.8315200775522081
  0.19402554880895634
using LinearAlgebra
b = randn(4)
x = A\b # solve Ax=b for x using backslash operator
norm(A*x-b)
5.551115123125783e-16
U, σ, V = svd(A); # singular value decomposition; note the Unicode variable name σ
Σ = diagm(0 => σ)
norm(A - U*Σ*V') # compute error of SVD factorization A = U Σ V'
8.164634113057559e-16
function loop()
    a = 0
    for i in 1:1_000_000
        a += 1
    end
end
@time loop() # first call: includes compilation time
@time loop() # now fast
0.004979 seconds (20.83 k allocations: 1.173 MiB)
0.000001 seconds (4 allocations: 160 bytes)
I know this is not Matlab-friendly; that's the point!
function [] = loop()
a=0;
for i=1:1000000
    a=a+1;
end
end
f = @() loop();
timeit(f) => 0.0018s
More than 1000x slower!
Takeaway: Julia timings cluster around the C timing, while Matlab/Python/R are orders of magnitude slower.
Takeaway: the Julia PDE code is almost as compact as Matlab/Python and almost as fast as C/Fortran.
Just-in-time / ahead-of-time compilation
Functions are compiled to machine code when first run; subsequent runs are as fast as compiled C, C++, or Fortran. This is not a tracing JIT (like PyPy).
Like a C struct, or a Python object... but without methods
struct Spaceship
    speed::Float64
    position::Array{Float64,2}
end
collide(a::Asteroid, s::Spaceship) # a first method of the function collide
collide(s1::Spaceship, s2::Spaceship) # another method of the function collide
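A runnable sketch of this (the Asteroid type and the method bodies are illustrative assumptions, not from the talk):
struct Asteroid
    mass::Float64
end
# each method is chosen on the types of *all* arguments: multiple dispatch
collide(a::Asteroid, s::Spaceship) = "asteroid hit a spaceship flying at $(s.speed)"
collide(s1::Spaceship, s2::Spaceship) = "two spaceships collided!"
collide(Asteroid(1e6), Spaceship(0.5, zeros(2, 2))) # dispatches on both argument types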
Ref: https://medium.com/@Jernfrost/defining-custom-units-in-julia-and-python-513c34a4c971
See also (very interesting!): https://www.youtube.com/watch?v=kc9HwsxE1OY
abstract type Temperature end
struct Celsius <: Temperature
    value::Float64
end
struct Kelvin <: Temperature
    value::Float64
end
struct Fahrenheit <: Temperature
    value::Float64
end
import Base: promote_rule, convert
promote_rule(::Type{Kelvin}, ::Type{Celsius}) = Kelvin
promote_rule(::Type{Fahrenheit}, ::Type{Celsius}) = Celsius
promote_rule(::Type{Fahrenheit}, ::Type{Kelvin}) = Kelvin
convert(::Type{Kelvin}, t::Celsius) = Kelvin(t.value + 273.15)
convert(::Type{Kelvin}, t::Fahrenheit) = convert(Kelvin, convert(Celsius, t))
convert(::Type{Celsius}, t::Kelvin) = Celsius(t.value - 273.15)
convert(::Type{Celsius}, t::Fahrenheit) = Celsius((t.value - 32)*5/9)
convert(::Type{Fahrenheit}, t::Celsius) = Fahrenheit(t.value*9/5 + 32)
convert(::Type{Fahrenheit}, t::Kelvin) = convert(Fahrenheit, convert(Celsius, t))
import Base: +,-,*
+(x::T, y::T) where {T <: Temperature} = T(x.value + y.value)
-(x::T, y::T) where {T <: Temperature} = T(x.value - y.value)
+(x::Temperature, y::Temperature) = +(promote(x,y)...)
-(x::Temperature, y::Temperature) = -(promote(x,y)...)
*(x::Number, y::T) where {T <: Temperature} = T(x * y.value)
const °C = Celsius(1); const °F = Fahrenheit(1); const K = Kelvin(1);
1K+1°C
Kelvin(275.15)
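Mixed units combine through promotion; two illustrative follow-ups (the results follow from the rules above):
1°C + 1°F # both promote to Celsius: ≈ Celsius(-16.22)
2 * 1K # scalar multiplication: Kelvin(2.0)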
f(x) = 4x*(1-x) # define logistic map
@time f(0.3); # first run is slow
@time f(0.4); # second run is a thousand times faster
0.008778 seconds (18.00 k allocations: 1.003 MiB)
0.000003 seconds (5 allocations: 176 bytes)
User Julia code -> generic Julia expression (lowered) -> typed Julia expression -> intermediate compiler representation (LLVM IR) -> machine code
@code_lowered f(0.3) # show f(x) as generic Julia expression
CodeInfo(
1 ─ %1 = 4 * x
│   %2 = 1 - x
│   %3 = %1 * %2
└──      return %3
)
@code_typed f(0.3) # show f(x) as Julia expr with inferred types, based on the arg types
CodeInfo(
1 ─ %1 = (Base.mul_float)(4.0, x)::Float64
│   %2 = (Base.sub_float)(1.0, x)::Float64
│   %3 = (Base.mul_float)(%1, %2)::Float64
└──      return %3
) => Float64
@code_llvm f(0.3) # show f(x) in intermediate compiler language (LLVM)
;  @ In[9]:1 within `f'
define double @julia_f_12939(double) {
top:
; ┌ @ promotion.jl:314 within `*' @ float.jl:399
  %1 = fmul double %0, 4.000000e+00
; └
; ┌ @ promotion.jl:315 within `-' @ float.jl:397
  %2 = fsub double 1.000000e+00, %0
; └
; ┌ @ float.jl:399 within `*'
  %3 = fmul double %1, %2
; └
  ret double %3
}
Note that calling f on an Integer instead yields code specialized for integers (i64), as the second assembly listing below shows.
@code_native f(0.3) # show f(x) in x86-64 assembly (Float64 method)
.text
; ┌ @ In[9]:1 within `f'
	movabsq	$140409732451304, %rax  # imm = 0x7FB3B039CBE8
; │┌ @ promotion.jl:314 within `*' @ float.jl:399
	vmulsd	(%rax), %xmm0, %xmm1
	movabsq	$140409732451312, %rax  # imm = 0x7FB3B039CBF0
; │└
; │┌ @ promotion.jl:315 within `-' @ float.jl:397
	vmovsd	(%rax), %xmm2           # xmm2 = mem[0],zero
	vsubsd	%xmm0, %xmm2, %xmm0
; │└
; │┌ @ float.jl:399 within `*'
	vmulsd	%xmm0, %xmm1, %xmm0
; │└
	retq
	nopw	%cs:(%rax,%rax)
; └
@code_native f(3) # show f(x) in x86-64 assembly (Int64 method)
.text
; ┌ @ In[9]:1 within `f'
; │┌ @ In[9]:1 within `-'
	movl	$1, %eax
	subq	%rdi, %rax
; │└
; │┌ @ int.jl:54 within `*'
	imulq	%rdi, %rax
	shlq	$2, %rax
; │└
	retq
	nopw	%cs:(%rax,%rax)
; └
using LinearAlgebra
# First we write a generic function (that takes an array A and a list of vectors
# and sums the inner products of A and each vector)
function inner_sum(A, vs)
    t = zero(eltype(A))
    for v in vs
        t += inner(v, A, v)
    end
    return t
end
# we define the inner product used in our inner_sum function
inner(v,A,w) = dot(v,A*w)
## Now we want a new type, the famous One Hot Vector
import Base: *
struct OneHotVector <: AbstractVector{Bool}
    idx::UInt32
    len::UInt32
end
Base.size(xs::OneHotVector) = (Int64(xs.len),)
Base.getindex(xs::OneHotVector, i::Integer) = i == xs.idx
Base.getindex(xs::OneHotVector, ::Colon) = OneHotVector(xs.idx, xs.len)
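A quick illustrative check that the new type behaves like an ordinary vector:
OneHotVector(3, 5) # displays as a 5-element Bool vector: 0 0 1 0 0
OneHotVector(3, 5)[3] # true
sum(OneHotVector(3, 5)) # 1; generic reductions just work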
# It works (it's generic!), but it's slow
v = [OneHotVector(i, 1000) for i in rand(1:1000, 100)]
A = rand(1000, 1000)
inner_sum(A,v)
@time inner_sum(A,v)
# Now specialize to get speed!
A::AbstractMatrix * b::OneHotVector = A[:, b.idx]
inner(v::OneHotVector, A, w::OneHotVector) = A[v.idx, w.idx] # how would you do this with single dispatch?
inner_sum(A,v)
@time inner_sum(A,v)
0.037617 seconds (105 allocations: 793.922 KiB)
0.000003 seconds (5 allocations: 176 bytes)
51.52316795511521
using Pkg # load the Pkg stdlib
Pkg.activate(".") # activate the local environment
"/home/raphael/Projets/Perso/tests-julia/julia-intro/Project.toml"
Solve the Lorenz equations:
$$ \begin{align} \frac{dx}{dt} &= \sigma(y-x) \\ \frac{dy}{dt} &= x(\rho-z) - y \\ \frac{dz}{dt} &= xy - \beta z \end{align} $$
#Pkg.add("DifferentialEquations") # add the differential equations suite
using DifferentialEquations # first time is very slow (precompilation)
using Plots
gr()
Plots.GRBackend()
function lorenz(du, u, p, t)
    du[1] = 10.0 * (u[2] - u[1])        # σ (y - x), with σ = 10
    du[2] = u[1] * (28.0 - u[3]) - u[2] # x (ρ - z) - y, with ρ = 28
    du[3] = u[1] * u[2] - (8/3) * u[3]  # x y - β z, with β = 8/3
end
u0 = [1.0;0.0;0.0] ; tspan = (0.0,100.0);
prob = ODEProblem(lorenz,u0,tspan)
sol = DifferentialEquations.solve(prob)
Plots.plot(sol,vars=(1,2,3))
using JuMP, GLPK, Test
"""
Formulate and solve a simple LP:
    max  5x + 3y
    s.t. 1x + 5y <= 3
         0 <= x <= 2
         0 <= y <= 30
"""
function example_basic()
    model = Model(with_optimizer(GLPK.Optimizer))
    @variable(model, 0 <= x <= 2)
    @variable(model, 0 <= y <= 30)
    @objective(model, Max, 5x + 3y)
    @constraint(model, 1x + 5y <= 3.0)
    println(model)
    JuMP.optimize!(model)
    obj_value = JuMP.objective_value(model)
    x_value = JuMP.value(x)
    y_value = JuMP.value(y)
    println("Objective value: ", obj_value)
    println("x = ", x_value)
    println("y = ", y_value)
    @test obj_value ≈ 10.6
    @test x_value ≈ 2
    @test y_value ≈ 0.2
end
example_basic()
Max 5 x + 3 y
Subject to
 x + 5 y ≤ 3.0
 x ≥ 0.0
 y ≥ 0.0
 x ≤ 2.0
 y ≤ 30.0

Objective value: 10.6
x = 2.0
y = 0.2
Test Passed
using DifferentialEquations, Measurements, Plots
g = 9.79 ± 0.02; # gravitational acceleration
L = 1.00 ± 0.01; # Length of the pendulum
#Initial Conditions
u₀ = [0 ± 0, π / 60 ± 0.01] # initial angle θ₀ and initial angular velocity dθ₀
tspan = (0.0, 6.3)
#Define the problem
function simplependulum(du, u, p, t)
    θ = u[1]
    dθ = u[2]
    du[1] = dθ
    du[2] = -(g/L) * θ # small-angle approximation of -(g/L)*sin(θ)
end
#Pass to solvers
prob = ODEProblem(simplependulum, u₀, tspan)
sol = DifferentialEquations.solve(prob, Tsit5(), reltol = 1e-6)
# Analytic solution
u = u₀[2] .* cos.(sqrt(g / L) .* sol.t)
plot(sol.t, getindex.(sol.u, 2), label = "Numerical")
plot!(sol.t, u, label = "Analytic")
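The same uncertainty propagation works on plain scalar computations; a minimal standalone illustration (the numbers are just examples):
a = 4.0 ± 0.1
b = 2.0 ± 0.05
a / b # ≈ 2.0 ± 0.071; uncertainties are propagated in quadrature
sin(a) # propagation works through arbitrary real functions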
#Pkg.add("ForwardDiff")
using ForwardDiff
# Define a function f
f(x::Vector) = sum(sin, x) + prod(tan, x) * sum(sqrt, x);
x = rand(5) # small size for example's sake
# Get the gradient of f
g = x -> ForwardDiff.gradient(f, x); # g = ∇f
@show g(x)
# Get the Hessian
ForwardDiff.hessian(f, x)
g(x) = [2.57287, 2.51714, 2.3224, 1.87393, 1.8107]
5×5 Array{Float64,2}:
 1.39246  4.80673  4.31075  3.52545  3.74384
 4.80673  1.37213  4.18168  3.41991  3.63178
 4.31075  4.18168  1.32308  3.06703  3.25711
 3.52545  3.41991  3.06703  1.78856  2.66366
 3.74384  3.63178  3.25711  2.66366  2.8778
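ForwardDiff also differentiates scalar functions directly; a one-line illustration:
ForwardDiff.derivative(sin, 0.0) # == cos(0.0) == 1.0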
Classic MNIST number classification with Flux.jl
using Flux, Flux.Data.MNIST, Statistics
using Flux: onehotbatch, onecold, crossentropy, throttle
using Base.Iterators: repeated
# using CuArrays
# Classify MNIST digits with a simple multi-layer-perceptron
# Load images
imgs = MNIST.images()
# Stack images into one large batch
X = hcat(float.(reshape.(imgs, :))...) |> gpu
labels = MNIST.labels()
# One-hot-encode the labels
Y = onehotbatch(labels, 0:9) |> gpu
# Define the neural network (two Dense layers)
m = Chain(
    Dense(28^2, 32, relu),
    Dense(32, 10),
    softmax) |> gpu
# define loss function and metric
loss(x, y) = crossentropy(m(x), y)
accuracy(x, y) = mean(onecold(m(x)) .== onecold(y))
# repeat the full batch 100 times (i.e. 100 training iterations over the same data)
dataset = repeated((X, Y), 100)
# define a callback that shows the current loss
evalcb = () -> @show(loss(X, Y))
opt = ADAM()
# Train
Flux.train!(loss, params(m), dataset, opt, cb = throttle(evalcb, 10))
@show accuracy(X, Y)
# Test set accuracy
tX = hcat(float.(reshape.(MNIST.images(:test), :))...) |> gpu
tY = onehotbatch(MNIST.labels(:test), 0:9) |> gpu
@show accuracy(tX, tY)
loss(X, Y) = 2.2864406f0 (tracked)
loss(X, Y) = 1.495227f0 (tracked)
loss(X, Y) = 1.018613f0 (tracked)
loss(X, Y) = 0.7433482f0 (tracked)
loss(X, Y) = 0.60261893f0 (tracked)
loss(X, Y) = 0.51817137f0 (tracked)
loss(X, Y) = 0.46358913f0 (tracked)
loss(X, Y) = 0.41939938f0 (tracked)
loss(X, Y) = 0.39135918f0 (tracked)
accuracy(X, Y) = 0.8994333333333333
accuracy(tX, tY) = 0.9029
0.9029
Predict the number for a given image
n = rand(1:length(MNIST.images(:test)))
print("Prediction:", onecold(m(tX[:, n]), 0:9)) # predict with the model, not with the true labels
heatmap(MNIST.images(:test)[n])
Prediction:5
=> Scientific Machine learning
http://www.stochasticlifestyle.com/the-essential-tools-of-scientific-machine-learning-scientific-ml/
https://fluxml.ai/2019/03/05/dp-vs-rl.html
https://fluxml.ai/2019/02/07/what-is-differentiable-programming.html
Interop is possible with Python, Matlab, R, Java, C/Fortran, ...
#Pkg.add("PyCall") # add python binding package
using PyCall
@pyimport math # import python math library
@show math.sin(math.pi / 4)
@show sin(pi / 4) ; # Julia's native sine function
math.sin(math.pi / 4) = 0.7071067811865475
sin(pi / 4) = 0.7071067811865475
# call C
t = ccall((:clock, "libc.so.6"), Int32, ())
854653866
# @time macro inserts timing and memory profiling into expression, then evaluates, and prints
@time f(2//3)
0.076290 seconds (114.51 k allocations: 6.258 MiB, 24.08% gc time)
8//9
# @which macro determines which function is called, provides link to source code on GitHub
@which exp(π)
@which exp(π*im)
Macros enable run-time code generation and transformation.
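As a toy illustration (not from the talk), a macro receives its argument as an unevaluated expression and returns new code to compile:
macro twice(ex)
    quote
        $(esc(ex))
        $(esc(ex))
    end
end
@twice println("hello") # the expression is spliced in twice, so "hello" is printed two times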
Applications:
Some very trivial examples of Julia's built-in parallelism
using Distributed
#Pkg.add("DistributedArrays")
# add 4 worker processes
addprocs(4);
; cat codes/count_heads.jl
function count_heads(n)
    c::Int = 0
    for i = 1:n
        c += rand(Bool)
    end
    c
end
# make count_heads available on all processes
@everywhere include("codes/count_heads.jl")
# dispatch two tasks in parallel
a = @spawn count_heads(10000000)
b = @spawn count_heads(10000000)
(fetch(a) + fetch(b)) / 20000000 # estimate the probability of drawing heads
0.49999545
# distribute the simulation over all processes and sum the results
nheads = @distributed (+) for i = 1:200000000
    Int(rand(Bool))
end
nheads/200000000
0.499957515
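Shared-memory threading is also built in; a minimal sketch (assumes Julia was started with several threads, e.g. JULIA_NUM_THREADS=4):
using Base.Threads
acc = zeros(Int, nthreads())
@threads for i in 1:1_000_000
    acc[threadid()] += rand(Bool) # one accumulator per thread to avoid data races
end
sum(acc) / 1_000_000 # the same coin-flip estimate, computed on threads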
And more:
Just changing the underlying array type can bring new hardware support to all packages, as sketched below.
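For example (a sketch assuming the CuArrays package of that Julia era and a CUDA GPU), moving data into a GPU array type makes generic code run on the GPU unchanged:
using CuArrays
A = cu(rand(1000, 1000)) # a CuArray stored on the GPU
b = cu(rand(1000))
A * b # the generic matrix-vector product now runs on the GPU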
Julia
Not covered
Thanks: the Julia community for most of the examples, and xkcd.
Julia website: http://www.julialang.org, this talk: https://github.com/raphbacher/julia-intro