Can't decide between Matlab and Python?

Try Julia.

Outline

  • State of the art: the two-language problem
  • What is Julia?
  • Why Julia is fast
  • Julia Ecosystem & interop

Matlab

Pros :

  • Polished product with support
  • Simulink
  • High level syntax

Cons :

  • Closed source: not everyone has access to it, and it is impossible to put a demonstrator online
  • Slow loops (better since 2015): not everything is pretty once vectorized
  • Not fast per se (Fortran bindings under the hood), and memory management is difficult
  • Not a general-purpose language (e.g. cumbersome to put a demo on the web)
  • License fees every year for a lab

Python

Pros :

  • Free and open-source, general-purpose, widely used outside the scientific community
  • Many scientific communities are embracing it
  • Lots of effort to make it fast (numba, ...)

Cons :

  • Scientific computing is not native:
    • everything falls back to C/Fortran black boxes -> limits flexibility
  • The object-oriented paradigm can be cumbersome for scientific code

Science and the two-language problem

Scientists need to easily explore new ideas :

  • Need for mathematical abstractions
  • Need for customizations (avoid black boxes, try variations)

  • But also need for performance (intrinsic goal or to allow re-use)

What is done now (when you need to go further than existing packages, e.g. for data analysis):

1) Prototype in R/Python/Matlab

2) Rewrite the whole code, or parts of it (if money/manpower allows), in a performant low-level language such as C or Fortran

Yet another language

Here comes Julia :

  • an innovative open-source, general-purpose programming language with scientific computing at its core
  • easy as Matlab, fast as Fortran, flexible as Python, deep as Lisp
  • leverages (for now...) major C/Fortran scientific packages, e.g. LAPACK, MPI, FFTW...
  • 5th language to run at 1 petaflops (Celeste project), after assembly, Fortran, C, and C++
  • state-of-the-art packages in optimization (JuMP) and differential equations (DiffEq), ... well positioned for ML (Flux, Knet, autodiff...)
  • solves the "two-language problem"

Background

  • origins at MIT and UCSB, applied math & computer science, 2010
  • founders: Viral Shah, Jeff Bezanson, Stefan Karpinski (now at Julia Computing LLC), Alan Edelman (MIT)
  • ~10-person core language team, ~870 contributors, ~2500 registered packages
  • support from Intel, Microsoft, Wall St., the Moore Foundation, NumFOCUS
  • julia-0.1 released in 2012, julia-1.0 in August 2018, now julia-1.2

Its goal: solve the two-language problem by offering a dynamic, high-level language with good mathematical expressivity that can be as fast as C/Fortran.

Disclaimer

  • I am not a Julia expert: I've been following the language for about two years and I am currently porting Python research codes to Julia.
  • Julia (like Python...) is not a drop-in replacement for Matlab; it is an alternative.
  • Don't fix what ain't broken:
    • if Matlab/R/Python/C works for you, there is no need to change
    • but if you are sometimes faced with the limitations of these languages, give Julia a try

Julia: expressive as Matlab

Julia can be as expressive as Matlab for interactive calculations, with built-in numerical libraries and plotting. A few examples...

Arrays

In [1]:
A = randn(4,4)       # 4 x 4 matrix of normally distributed random numbers
Out[1]:
4×4 Array{Float64,2}:
 -1.7897    -1.27394   -0.796802   -1.12804  
 -0.843538  -0.607065   0.187264    0.0447423
 -0.83152   -0.483635   0.0594034  -0.607388 
  0.194026   1.4971     0.533445   -0.379708 
In [2]:
A[:,1]                # familiar colon syntax --extract first column of A
Out[2]:
4-element Array{Float64,1}:
 -1.7896952667660175 
 -0.843538211646361  
 -0.8315200775522081 
  0.19402554880895634

Linear Algebra

In [3]:
using LinearAlgebra
b = randn(4)          
x = A\b               # solve Ax=b for x using backslash operator
norm(A*x-b)
Out[3]:
5.551115123125783e-16
In [4]:
U, σ, V = svd(A);     # singular value decomposition --note unicode variable name σ
In [5]:
Σ = diagm(0 => σ)
norm(A - U*Σ*V')      # compute error of SVD factorization A = U Σ V'
Out[5]:
8.164634113057559e-16

Fast as Fortran

In [6]:
function loop()
    a=0 
    for i in 1:1_000_000
        a+=1;
    end
end
@time loop() # compile time
@time loop() # now fast
  0.004979 seconds (20.83 k allocations: 1.173 MiB)
  0.000001 seconds (4 allocations: 160 bytes)

Compare with Matlab (R2018)

I know this is not Matlab-friendly code; that's the point!

function [] = loop()
    a=0;
    for i=1:1000000
        a=a+1;
    end
end

f = @() loop();

timeit(f) => 0.0018s

~1000x slower!

Benchmarks on common code patterns

Takeaway: Julia timings cluster around the C timing, while Matlab/Python/R are orders of magnitude slower.

KS-CNAB2 benchmark: CPU time versus lines of code

Takeaway: The Julia PDE code is almost as compact as Matlab/Python and almost as fast as C/Fortran.

Julia: easy, dynamic, and fast. How?

  • Just-in-time compilation (JIT)
    • user-level code is compiled to machine code on-the-fly
  • Meticulous type system
    • designed to maximize impact of JIT
    • type inference: compiler determines types of variables
  • Multiple dispatch
    • functions are compiled for each set of argument types
    • function dispatch determined at compile time when possible, run time when not
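To make the last two bullets concrete, here is a minimal dispatch sketch (the function and values are mine, not from the talk):

half(x::Int)     = x ÷ 2      # method for integers
half(x::Float64) = x / 2      # method for floats

half(10)                      # argument type known statically -> the Int method is picked at compile time
half(10.0)                    # the Float64 method
xs = Any[1, 2.0]              # element type Any: the method must be picked at run time
[half(x) for x in xs]         # dynamic dispatch, element by element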

Just-in-time / ahead-of-time compilation

Functions are compiled to machine code when first run; subsequent runs are as fast as compiled C, C++, or Fortran. It is not a tracing JIT (like PyPy).

Caveats

  • It's a marathon, not a sprint: Julia can be slow at first
  • It's an open-source language with lots of well-meaning contributors; it's not a product (see JuliaPro...)
  • If you directly translate from Matlab to Julia you will not always see an improvement: you have to exploit Julia's strengths (loops to avoid allocations, type stability, ...), as sketched below
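As an illustration of that last point, here is a small sketch (my own example, not from the talk) comparing a type-unstable accumulator with a type-stable one:

function unstable_sum(xs)
    s = 0                 # starts as an Int, becomes a Float64 inside the loop -> type-unstable
    for x in xs
        s += x^2
    end
    return s
end

function stable_sum(xs)
    s = zero(eltype(xs))  # a single concrete type for the whole loop -> type-stable
    for x in xs
        s += x^2
    end
    return s
end

xs = rand(10^6)
# @code_warntype unstable_sum(xs)   # flags the Union{Float64,Int64} instability
# @time stable_sum(xs)              # no allocations inside the loop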

A Julia type

Like a C struct, or a Python object... but without methods

struct Spaceship
    speed::Float64
    position::Array{Float64,2}
end
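A quick usage sketch (the helper function is mine): construct an instance with the default constructor and attach behaviour through ordinary functions defined outside the type.

ship = Spaceship(0.5, zeros(1, 3))    # default constructor takes one value per field
describe(s::Spaceship) = "speed $(s.speed), position $(s.position)"
describe(ship)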

Multiple dispatch

collide(a::Asteroid, s::Spaceship) # a first method of the function collide
collide(s1::Spaceship, s2::Spaceship) # another method of the function collide
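A self-contained, hedged version of this example (the Asteroid type and the method bodies are mine; the talk only shows the signatures):

struct Asteroid
    mass::Float64
end

collide(a::Asteroid, s::Spaceship)    = "an asteroid hits a spaceship moving at $(s.speed)"
collide(s1::Spaceship, s2::Spaceship) = "two spaceships collide"

collide(Asteroid(1e6), Spaceship(0.5, zeros(1, 3)))                # dispatch picks the first method
collide(Spaceship(0.5, zeros(1, 3)), Spaceship(1.0, zeros(1, 3)))  # ... and the second one here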
In [7]:
abstract type Temperature end

struct Celsius <: Temperature
    value
end

struct Kelvin <: Temperature
   value::Float64 
end

struct Fahrenheit <: Temperature
    value::Float64
end

import Base: promote_rule, convert
promote_rule(::Type{Kelvin}, ::Type{Celsius})     = Kelvin
promote_rule(::Type{Fahrenheit}, ::Type{Celsius}) = Celsius
promote_rule(::Type{Fahrenheit}, ::Type{Kelvin})  = Kelvin

convert(::Type{Kelvin},  t::Celsius)     = Kelvin(t.value + 273.15)
convert(::Type{Kelvin},  t::Fahrenheit)  = convert(Kelvin, convert(Celsius, t))
convert(::Type{Celsius}, t::Kelvin)      = Celsius(t.value - 273.15)
convert(::Type{Celsius}, t::Fahrenheit)  = Celsius((t.value - 32)*5/9)
convert(::Type{Fahrenheit}, t::Celsius)  = Fahrenheit(t.value*9/5 + 32)
convert(::Type{Fahrenheit}, t::Kelvin)   = convert(Fahrenheit, convert(Celsius, t))

import Base: +,-,*
+(x::T, y::T) where {T <: Temperature} = T(x.value + y.value)
-(x::T, y::T) where {T <: Temperature} = T(x.value - y.value)

+(x::Temperature, y::Temperature) = +(promote(x,y)...)
-(x::Temperature, y::Temperature) = -(promote(x,y)...)

*(x::Number, y::T) where {T <: Temperature} = T(x * y.value)

const °C = Celsius(1); const °F = Fahrenheit(1); const K = Kelvin(1);
In [8]:
1K+1°C
Out[8]:
Kelvin(275.15)
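With these promotion and conversion rules, other mixed-unit operations work the same way (a small usage sketch, not from the notebook):

Kelvin(273.15) - 32°F    # Fahrenheit is converted through Celsius; the result is Kelvin(0.0)
2 * 3°C                  # scaling by a plain number keeps the unit: Celsius(6)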

Introspection

$$f(x) = 4\, x\, (1-x)$$
In [9]:
f(x) = 4x*(1-x)     # define logistic map
@time f(0.3);       # first run is slow
@time f(0.4);       # second run is a thousand times faster
  0.008778 seconds (18.00 k allocations: 1.003 MiB)
  0.000003 seconds (5 allocations: 176 bytes)

Observe the compilation of $f(x)$ by stages

user Julia -> generic Julia expression -> typed Julia expression -> intermediate compiler language -> machine code

In [11]:
@code_lowered f(0.3)  # show f(x) as generic Julia expression
Out[11]:
CodeInfo(
1 ─ %1 = 4 * x
 %2 = 1 - x
 %3 = %1 * %2
└──      return %3
)
In [12]:
@code_typed f(0.3)    # show f(x) as Julia expr with inferred types, based on the arg types
Out[12]:
CodeInfo(
1 ─ %1 = (Base.mul_float)(4.0, x)::Float64
 %2 = (Base.sub_float)(1.0, x)::Float64
 %3 = (Base.mul_float)(%1, %2)::Float64
└──      return %3
) => Float64
In [13]:
@code_llvm f(0.3)    # show f(x) in intermediate compiler language (LLVM)
;  @ In[9]:1 within `f'
define double @julia_f_12939(double) {
top:
; ┌ @ promotion.jl:314 within `*' @ float.jl:399
   %1 = fmul double %0, 4.000000e+00
; └
; ┌ @ promotion.jl:315 within `-' @ float.jl:397
   %2 = fsub double 1.000000e+00, %0
; └
; ┌ @ float.jl:399 within `*'
   %3 = fmul double %1, %2
; └
  ret double %3
}

Type inference and dispatch

Below we can see that calling f on a Float64 produces floating-point machine code, while calling it on an Integer produces code specialized for integers (i64).

In [14]:
@code_native f(0.3) # show f(x) in x86-64 assembly language
	.text
; ┌ @ In[9]:1 within `f'
	movabsq	$140409732451304, %rax  # imm = 0x7FB3B039CBE8
; │┌ @ promotion.jl:314 within `*' @ float.jl:399
	vmulsd	(%rax), %xmm0, %xmm1
	movabsq	$140409732451312, %rax  # imm = 0x7FB3B039CBF0
; │└
; │┌ @ promotion.jl:315 within `-' @ float.jl:397
	vmovsd	(%rax), %xmm2           # xmm2 = mem[0],zero
	vsubsd	%xmm0, %xmm2, %xmm0
; │└
; │┌ @ float.jl:399 within `*'
	vmulsd	%xmm0, %xmm1, %xmm0
; │└
	retq
	nopw	%cs:(%rax,%rax)
; └
In [15]:
@code_native f(3)    # show f(x) in x86-64 assembly language
	.text
; ┌ @ In[9]:1 within `f'
; │┌ @ In[9]:1 within `-'
	movl	$1, %eax
	subq	%rdi, %rax
; │└
; │┌ @ int.jl:54 within `*'
	imulq	%rdi, %rax
	shlq	$2, %rax
; │└
	retq
	nopw	%cs:(%rax,%rax)
; └
In [16]:
using LinearAlgebra

# First we write a generic function (that takes an array A and a list of vectors 
# and sums the inner products of A and each vector)
function inner_sum(A,vs)
    t = zero(eltype(A))
    for v in vs
        t += inner(v,A,v)
    end
    return t
end

# we define the inner product used in our inner_sum function
inner(v,A,w) = dot(v,A*w)

## Now we want a new type, the famous One Hot Vector

import Base: *

struct OneHotVector <: AbstractVector{Bool}
  idx::UInt32
  len::UInt32
end

Base.size(xs::OneHotVector) = (Int64(xs.len),)
Base.getindex(xs::OneHotVector, i::Integer) = i == xs.idx
Base.getindex(xs::OneHotVector, ::Colon) = OneHotVector(xs.idx, xs.len)
In [17]:
# It works (generic) ! but it's slow
v = [OneHotVector(i,1000) for i in rand(1:1000,100)]
A=rand(1000,1000)

inner_sum(A,v)
@time inner_sum(A,v)

# And now specialize to get speed!
A::AbstractMatrix * b::OneHotVector = A[:, b.idx]
inner(v::OneHotVector,A,w::OneHotVector) = A[v.idx,w.idx] # How to do that in single dispatch ??

inner_sum(A,v)
@time inner_sum(A,v)
  0.037617 seconds (105 allocations: 793.922 KiB)
  0.000003 seconds (5 allocations: 176 bytes)
Out[17]:
51.52316795511521

An ecosystem of packages

  • Most packages are available on GitHub (easy collaboration)
  • The main fields are grouped into GitHub organizations (see https://julialang.org/community/)
  • Julia comes with a powerful package/environment manager
In [19]:
using Pkg #load the Pkg stdlib
Pkg.activate(".") #activate the local environment
Out[19]:
"/home/raphael/Projets/Perso/tests-julia/julia-intro/Project.toml"

Solving Differential Equations

Solve the Lorenz equations:

$$ \begin{align} \frac{dx}{dt} &= σ(y-x) \\ \frac{dy}{dt} &= x(ρ-z) - y \\ \frac{dz}{dt} &= xy - βz \\ \end{align} $$
In [49]:
#Pkg.add("DifferentialEquations") # add the Differential equation suite
using DifferentialEquations # first time is very slow (precompilation)
using Plots
gr()
Out[49]:
Plots.GRBackend()
In [3]:
function lorenz(du,u,p,t)
 du[1] = 10.0*(u[2]-u[1])
 du[2] = u[1]*(28.0-u[3]) - u[2]
 du[3] = u[1]*u[2] - (8/3)*u[3]
end

u0 = [1.0;0.0;0.0] ; tspan = (0.0,100.0);
prob = ODEProblem(lorenz,u0,tspan)
sol = DifferentialEquations.solve(prob)
Plots.plot(sol,vars=(1,2,3))
Out[3]:
[3D phase-space plot of the Lorenz attractor, axes (u1, u2, u3)]

Optimization

In [22]:
using JuMP, GLPK, Test

"""
Formulate and solve a simple LP:
    max 5x + 3y
     st 1x + 5y <= 3
         0 <= x <= 2
         0 <= y <= 30
"""
function example_basic()
    model = Model(with_optimizer(GLPK.Optimizer))

    @variable(model, 0 <= x <= 2)
    @variable(model, 0 <= y <= 30)

    @objective(model, Max, 5x + 3y)
    @constraint(model, 1x + 5y <= 3.0)

    println(model)
    JuMP.optimize!(model)

    obj_value = JuMP.objective_value(model)
    x_value = JuMP.value(x)
    y_value = JuMP.value(y)

    println("Objective value: ", obj_value)
    println("x = ", x_value)
    println("y = ", y_value)
    @test obj_value ≈ 10.6
    @test x_value ≈ 2
    @test y_value ≈ 0.2

end

example_basic()
┌ Info: Precompiling GLPK [60bf3e95-4087-53dc-ae20-288a0d20c6a6]
└ @ Base loading.jl:1186
Max 5 x + 3 y
Subject to
 x + 5 y ≤ 3.0
 x ≥ 0.0
 y ≥ 0.0
 x ≤ 2.0
 y ≤ 30.0

Objective value: 10.6
x = 2.0
y = 0.2
Out[22]:
Test Passed
In [2]:
using DifferentialEquations, Measurements, Plots

g = 9.79 ± 0.02; # Gravitational constants
L = 1.00 ± 0.01; # Length of the pendulum

#Initial Conditions
u₀ = [0 ± 0, π / 60 ± 0.01] # Initial speed and initial angle
tspan = (0.0, 6.3)

#Define the problem
function simplependulum(du,u,p,t)
    θ  = u[1]
    dθ = u[2]
    du[1] = dθ
    du[2] = -(g/L)*θ
end

#Pass to solvers
prob = ODEProblem(simplependulum, u₀, tspan)
sol = DifferentialEquations.solve(prob, Tsit5(), reltol = 1e-6)

# Analytic solution
u = u₀[2] .* cos.(sqrt(g / L) .* sol.t)

plot(sol.t, getindex.(sol.u, 2), label = "Numerical")
plot!(sol.t, u, label = "Analytic")
Out[2]:
[Plot: numerical vs. analytic pendulum solutions over t = 0..6.3]

Automatic differentiation

In [42]:
#Pkg.add("ForwardDiff")
using ForwardDiff

# Define a function f
f(x::Vector) = sum(sin, x) + prod(tan, x) * sum(sqrt, x);

x = rand(5) # small size for example's sake

# Get the gradient of f
g = x -> ForwardDiff.gradient(f, x); # g = ∇f

@show g(x)

# Get the Hessian
ForwardDiff.hessian(f, x)
g(x) = [2.57287, 2.51714, 2.3224, 1.87393, 1.8107]
Out[42]:
5×5 Array{Float64,2}:
 1.39246  4.80673  4.31075  3.52545  3.74384
 4.80673  1.37213  4.18168  3.41991  3.63178
 4.31075  4.18168  1.32308  3.06703  3.25711
 3.52545  3.41991  3.06703  1.78856  2.66366
 3.74384  3.63178  3.25711  2.66366  2.8778 

Deep learning

Classic MNIST number classification with Flux.jl

In [4]:
using Flux, Flux.Data.MNIST, Statistics
using Flux: onehotbatch, onecold, crossentropy, throttle
using Base.Iterators: repeated
# using CuArrays

# Classify MNIST digits with a simple multi-layer-perceptron

# Load images
imgs = MNIST.images()
# Stack images into one large batch
X = hcat(float.(reshape.(imgs, :))...) |> gpu

labels = MNIST.labels()
# One-hot-encode the labels
Y = onehotbatch(labels, 0:9) |> gpu

# Define the neural network (two Dense layers)
m = Chain(
  Dense(28^2, 32, relu),
  Dense(32, 10),
  softmax) |> gpu

# define loss function and metric
loss(x, y) = crossentropy(m(x), y)
accuracy(x, y) = mean(onecold(m(x)) .== onecold(y))

# repeat the full batch 100 times (100 training iterations)
dataset = repeated((X, Y), 100)
# Define a callback that shows the current loss
evalcb = () -> @show(loss(X, Y))

opt = ADAM()

# Train
Flux.train!(loss, params(m), dataset, opt, cb = throttle(evalcb, 10))
@show accuracy(X, Y)
# Test set accuracy
tX = hcat(float.(reshape.(MNIST.images(:test), :))...) |> gpu
tY = onehotbatch(MNIST.labels(:test), 0:9) |> gpu
@show accuracy(tX, tY)
loss(X, Y) = 2.2864406f0 (tracked)
loss(X, Y) = 1.495227f0 (tracked)
loss(X, Y) = 1.018613f0 (tracked)
loss(X, Y) = 0.7433482f0 (tracked)
loss(X, Y) = 0.60261893f0 (tracked)
loss(X, Y) = 0.51817137f0 (tracked)
loss(X, Y) = 0.46358913f0 (tracked)
loss(X, Y) = 0.41939938f0 (tracked)
loss(X, Y) = 0.39135918f0 (tracked)
accuracy(X, Y) = 0.8994333333333333
accuracy(tX, tY) = 0.9029
Out[4]:
0.9029

Prediction

Predict the number for a given image

In [6]:
n = rand(1:length(MNIST.images(:test)))
print("Prediction:",argmax(tY[:,n])-1)
heatmap(MNIST.images(:test)[n])
Prediction:5
Out[6]:
[Heatmap of the selected test digit image]

Ecosystem

And much more :

  • JuliaRobotics (real-time !)
  • Stats/Machine Learning
  • Web
  • GPU
  • ...

Interop

Interop is possible with Python, Matlab, R, Java, C/Fortran, ...

In [38]:
#Pkg.add("PyCall") # add python binding package

using PyCall 
@pyimport math # import python math library
@show math.sin(math.pi / 4) 
@show sin(pi / 4) ; # Julia's native sine function
math.sin(math.pi / 4) = 0.7071067811865475
sin(pi / 4) = 0.7071067811865475
In [29]:
# call C
t = ccall((:clock, "libc.so.6"), Int32, ())
Out[29]:
854653866
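R can be reached in the same spirit (a hedged sketch; RCall is assumed to be installed, it is not part of this notebook):

# call R
using RCall
R"mean(c(1, 2, 3))"   # run an R snippet from Julia; returns an RObject wrapping 2.0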

Macros: code that transforms code

In [30]:
# @time macro inserts timing and memory profiling into expression, then evaluates, and prints
@time f(2//3)
  0.076290 seconds (114.51 k allocations: 6.258 MiB, 24.08% gc time)
Out[30]:
8//9
In [31]:
# @which macro determines which function is called, provides link to source code on GitHub
@which exp(π)
Out[31]:
exp(x::Real) in Base.Math at special/exp.jl:73
In [32]:
@which exp(π*im)
Out[32]:
exp(z::Complex) in Base at complex.jl:604

Macros enable run-time code generation and transformation.

Applications :

  • generation and execution of boilerplate code
  • run-time generation and optimization of algorithms, e.g. FFTW, ATLAS
  • symbolic mathematics, automatic differentiation
  • all written like high-level Python, running like compiled C !!!
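To make "code that transforms code" concrete, here is a toy macro (my own example, not from the talk): it receives the expression unevaluated, prints it as data, then splices it back in to be evaluated.

macro show_expr(ex)
    return quote
        println("evaluating: ", $(string(ex)))
        $(esc(ex))
    end
end

@show_expr 1 + 2   # prints "evaluating: 1 + 2", then returns 3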

Parallelism in Julia: just change the array type

Some very trivial examples of Julia's built-in parallelism

In [33]:
using Distributed
#Pkg.add("DistributedArrays")
# add 4 worker processes
addprocs(4);
In [34]:
; cat codes/count_heads.jl
function count_heads(n)
    c::Int = 0
    for i=1:n
        c += rand(Bool)
    end
    c
end
In [44]:
# define the function on all processes
@everywhere include("codes/count_heads.jl")

# dispatch two tasks
a = @spawn count_heads(10000000)
b = @spawn count_heads(10000000)
(fetch(a)+fetch(b))/20000000 # estimate the probability of drawing heads
Out[44]:
0.49999545

Parallel loops with reduction

In [48]:
# distribute the simulation across all processes and sum the results
nheads = @distributed (+) for i=1:200000000
  Int(rand(Bool))
end 
nheads/200000000
Out[48]:
0.499957515

And more :

  • Distributed arrays
  • GPUArrays
  • TPUArrays ...
  • Nested Threading (PARTR)

Just changing the underlying array type can bring new hardware support to all packages, as sketched below.
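For instance, with the DistributedArrays package mentioned above, an ordinary array can be split across the worker processes (a hedged sketch, not run in this notebook):

using DistributedArrays
d = distribute(rand(10^6))   # chunks of the array live on the 4 worker processes
sum(d)                       # the same reduction now runs distributed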

Code as a first-class citizen product of research

  • The (new) Julia package manager has reproducibility at its core (each code project is linked to a deterministic set of dependencies)
  • Creating a Julia package comes with tools for documentation, unit testing, and continuous integration
  • New scientific collaborations can be based on software (see GitHub organizations such as JuliaDiff, JuliaStats, etc.)
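A minimal sketch of how such a project starts (the package name "MyAnalysis" is hypothetical):

using Pkg
Pkg.generate("MyAnalysis")   # creates MyAnalysis/Project.toml and src/MyAnalysis.jl
Pkg.activate("MyAnalysis")   # develop inside the project's own environment
Pkg.add("Statistics")        # the dependency is recorded in MyAnalysis/Project.toml
# Pkg.test("MyAnalysis")     # would run the package's test/runtests.jl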

Food for thought

Conclusions

Julia

  • Easy as Matlab, fast as Fortran, flexible as Python, deep as Lisp
  • Scientific computing, from interactive exploration to HPC
  • Paradigms (multiple dispatch) that foster collaboration

Not covered

  • large-scale programming, development ecosystem, environments, debuggers, etc.
  • Abstract Types, compositions, ...
  • rough edges: plotting, static compilation (not quite there), package loading times, young 1.x

Thanks to the Julia community for most of the examples, and to xkcd.

Julia website: http://www.julialang.org, this talk: https://github.com/raphbacher/julia-intro