The Julia Data Ecosystem

Author

Marcel Angenvoort

Published

February 1, 2025

Abstract

lorem ipsum

Keywords

julia, programming, scientific computing

Important Julia Packages

  • Plots.jl: Modern plotting library similar to Matplotlib
  • LinearSolve.jl: High-performance library for solving systems of linear equations
  • DifferentialEquations.jl: Efficient solvers for various differential equations
  • FFTW.jl: Bindings to the FFTW library for Fast Fourier Transforms
  • StatsKit.jl: Convenience meta-package that loads essential packages for statistics
  • MLJ.jl: Machine learning framework for Julia
  • Turing.jl: Bayesian inference with probabilistic programming
  • JuliaGPU: Packages for programming on GPUs with Julia

Plots.jl

Plots.jl is a data visualisation and plotting library similar to Matplotlib. It provides an interface for several backends, offering great flexibility while remaining simple to use.

Installation:

using Pkg
Pkg.add("Plots")
Pkg.add("PlotThemes")
Pkg.add("LaTeXStrings")

We also need LaTeXStrings.jl for LaTeX-formatted labels. Load the packages and select the dark theme:

using Plots
using LaTeXStrings
using WebIO

theme(:dark)

Example plot of the sine and cosine functions:

x = range(0, 2π, length=100)
y = sin.(x)
plot(x, y)
plot!(x, cos.(x))
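
Since Plots.jl is a front end to several backends, the same plotting calls can be rendered by a different backend simply by activating it first. The following is a minimal sketch, assuming only the default GR backend is available; other backends such as PlotlyJS need their own package installed before they can be activated.

```julia
using Plots

gr()                      # activate the GR backend (the default in Plots.jl)
println(backend_name())   # name of the active backend as a Symbol

# The plotting code itself does not depend on the backend
x = range(0, 2π, length=100)
p = plot(x, sin.(x), label="sin(x)")
plot!(p, x, cos.(x), label="cos(x)")
```

Keeping the backend call separate from the plotting code makes it easy to try another backend later without touching the plot commands.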

We can use LaTeXStrings to add LaTeX labels to each curve. Let's also customize the plot with different colors and line widths, and add a title and axis labels:

x = range(0, 2π, length=100)
plot(x, sin.(x), label=L"$\sin(x)$", color="darkred", lw=3)
plot!(x, cos.(x), label=L"$\cos(x)$", color="teal", lw=2)

title!("Trigonometric functions")
xlabel!("x-axis")
ylabel!("y-axis")
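
The L"..." prefix used for the labels above comes from LaTeXStrings.jl. As a small sketch of what it does (based on the package's documented behaviour, not on anything specific to Plots.jl): backslashes do not need escaping, and a string without dollar signs is wrapped in math mode automatically.

```julia
using LaTeXStrings

a = L"\sin(x)"        # no dollar signs: math mode is added automatically
b = L"$\sin(x)$"      # explicit math mode, as in the labels above
println(a == b)

# An ordinary string needs the backslash and dollar signs escaped
c = "\$\\sin(x)\$"
println(String(b) == c)
```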

Scatter plots can be overlaid on a line plot, here noisy samples on top of the exact sine curve:

x = range(0, 10, length=100)
y = sin.(x)
y_noisy = @. sin(x) + 0.1*randn()

plot(x, y, label=L"$\sin(x)$")
scatter!(x, y_noisy, label="data", markersize=2.5)
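
The noisy data above relies on a detail of the `@.` macro: it turns every call in the expression into a broadcast call, including `randn()`, so a new random number is drawn for each element. The sketch below contrasts this with the undotted version; the seed and the variable names `y_shifted`, `noise` and `offset` are only illustrative.

```julia
using Random

Random.seed!(42)                 # fixed seed so the run is reproducible
x = range(0, 10, length=100)

# `@.` dots every call, including randn(): one fresh sample per element
y_noisy = @. sin(x) + 0.1 * randn()

# Without `@.`, randn() is evaluated once and the same offset is added everywhere
y_shifted = sin.(x) .+ 0.1 * randn()

noise  = y_noisy .- sin.(x)
offset = y_shifted .- sin.(x)
println(length(unique(noise)))               # many distinct noise values
println(maximum(offset) - minimum(offset))   # close to zero (rounding only)
```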

Histograms can be useful for visualizing statistical data:

using StatsPlots
x = randn(1000)

histogram(x, bins=50, normalize=:pdf, label="data")
density!(x, trim=true, label="KDE", lw=3)
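
The option `normalize=:pdf` rescales the bar heights so that the total area of the histogram is 1, which is what makes it comparable to the density estimate. The following sketch reproduces that normalization by hand. It assumes equal-width bins between the smallest and largest sample, which is not necessarily how Plots.jl chooses its bin edges.

```julia
using Random

Random.seed!(1)
x = randn(1000)

nbins = 50
edges = range(minimum(x), maximum(x), length=nbins + 1)
width = step(edges)

# Count the samples per bin; clamp so the largest sample lands in the last bin
counts = zeros(Int, nbins)
for v in x
    i = clamp(floor(Int, (v - first(edges)) / width) + 1, 1, nbins)
    counts[i] += 1
end

# PDF normalization: divide counts by (number of samples × bin width)
density = counts ./ (length(x) * width)
area = sum(density) * width
println(area)   # approximately 1
```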

3D Plots

Surface plot of the Rosenbrock function:

# Use the PlotlyJS backend for interactive 3D plots
plotlyjs()
# Alternative with MathJax support for LaTeX labels:
#plotlyjs(extra_plot_kwargs=KW(:include_mathjax=>"cdn"))

# Rosenbrock function:
f(x, y) = (1 - x)^2 + 100 * (y - x^2)^2

x = range(-2, 2, length=100)
y = range(-1, 3, length=100)

surface(x, y, f, color=:viridis, extra_plot_kwargs=KW(:width=>900, :height=>600))
#surface(x, y, f, color=:viridis)
title!("Rosenbrock function")
xlabel!("x")
ylabel!("y")

Contour plot of the Himmelblau function:

# Himmelblau function:
f(x, y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2

x = range(-5, 5, length=100)
y = range(-5, 5, length=100)
# Broadcast over a row (x') and a column (y): z[i, j] = f(x[j], y[i])
z = @. f(x', y)

contour(x, y, z, levels=20, clabels=true, color=:turbo)
title!("Himmelblau function")
xlabel!(L"x")
ylabel!(L"y")
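
The matrix `z` in the contour example is built by broadcasting: `x'` is a row, `y` is a column, and combining them evaluates `f` at every grid point. A small sketch on a deliberately tiny grid makes the layout visible (rows follow `y`, columns follow `x`); the grid sizes here are chosen only for illustration.

```julia
# Himmelblau function, as in the example above
f(x, y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2

x = range(-5, 5, length=5)   # 5 points along x
y = range(-5, 5, length=3)   # 3 points along y

# x' is a 1×5 row and y a 3-element column, so the result is a 3×5 matrix
z = @. f(x', y)

println(size(z))                    # (3, 5)
println(z[2, 4] == f(x[4], y[2]))   # entry (i, j) holds f(x[j], y[i])
println(f(3.0, 2.0))                # 0.0, since (3, 2) is a minimum of the function
```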

DataFrames

With the StatsPlots package, the @df macro lets you pass the columns of a DataFrame as plot arguments:

using StatsPlots, RDatasets

data = dataset("datasets", "iris")
@df data scatter(:SepalLength, :SepalWidth;
    group = :Species,
    title = "Scatter Plot of the iris dataset",
    xlabel = "Sepal Length (cm)",
    ylabel = "Sepal Width (cm)",
)

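Unlike in Seaborn, the axes are not labeled automatically from the column names, so the labels have to be set explicitly. For comparison, the equivalent Seaborn code in Python: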

import seaborn as sns
import matplotlib.pyplot as plt

data = sns.load_dataset("iris")
sns.scatterplot(data, x="sepal_length", y="sepal_width", hue="species")
plt.show()
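
Back in Julia, one way to approximate Seaborn's automatic labels is a small wrapper that reuses the column names as axis labels. The sketch below is only an illustration: `labeled_scatter` is a hypothetical helper, not part of Plots.jl or StatsPlots, and the table is a tiny toy DataFrame with iris-like column names.

```julia
using DataFrames, Plots

# Hypothetical helper: scatter two columns and label the axes with their names
function labeled_scatter(df, xcol::Symbol, ycol::Symbol, groupcol::Symbol; kwargs...)
    scatter(df[!, xcol], df[!, ycol];
        group  = df[!, groupcol],   # one series per distinct value
        xlabel = string(xcol),
        ylabel = string(ycol),
        kwargs...)
end

# Toy data for illustration only
df = DataFrame(
    SepalLength = [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    SepalWidth  = [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
    Species     = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"],
)

p = labeled_scatter(df, :SepalLength, :SepalWidth, :Species; title="Toy iris subset")
```

Additional keyword arguments are forwarded to `scatter`, so the usual Plots.jl attributes such as `title` or `markersize` can still be set per call.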