Data Visualization
Astro 497, Week 11, Day 1
TableOfContents()
General Presentation Tips
What is the purpose of your presentation/slide/figure?
What is essential to accomplishing that goal?
What content is only peripheral?
Who is your audience?
What do they...
Already know well?
Have already heard, but need a reminder?
Not know, but are well-prepared to understand?
Not know and need scaffolding to appreciate?
Match the complexity of the figure (or entire presentation) to your audience.
Foldable("Example: Color-luminosity diagram of stars in local neighborhood",
md"""
$(RobustLocalResource("https://vlas.dev/post/gaia-dr2-hrd/gaia-hrd-dr2.png", "../_assets/week11/gaia-hrd-dr2.png", :height=>"80%", :alt=>"Color-Luminosity Diagram from Gaia DR2"))
--- Credit: [Gaia Collaboration (2018)](https://doi.org/10.1051/0004-6361/201832843) (DR2) & [Vlas Sokolov](https://vlas.dev/post/gaia-dr2-hrd/)
""")
Example: Color-luminosity diagram of stars in local neighborhood
Credit: Gaia Collaboration (2018) (DR2) & Vlas Sokolov
Figures
Choose what variables to plot
Can you choose a set of variables that:
Shows what is most directly observable
Makes the relationship simpler or more obvious
Makes the figure more general ("switch to theorist units")
Makes the figure easier to compare to observations ("switch to observer units")
Would a transformation make the relationship (nearly) linear?
Log
Power-law
Divide by a baseline model
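As a small illustration of the transformation idea, the sketch below builds a synthetic power law and shows that it becomes a straight line after taking logs of both variables. The constants `c` and `k` and the grid of `x` values are assumptions chosen only for the example.

```julia
using Statistics

# Synthetic power law y = c * x^k (values chosen only for illustration)
c, k = 3.0, 1.5
x = 10.0 .^ range(0, 3, length=20)
y = c .* x .^ k

# In log-log space the relation is linear: log10(y) = log10(c) + k * log10(x)
lx, ly = log10.(x), log10.(y)

# Ordinary least-squares slope and intercept of the transformed data
slope = cov(lx, ly) / var(lx)
intercept = mean(ly) - slope * mean(lx)

println("slope ≈ ", round(slope, digits=3), ", 10^intercept ≈ ", round(10^intercept, digits=3))
```

The fitted slope recovers the exponent and the intercept recovers the normalization, which is why a log-log plot is often the natural choice for power-law data.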
Foldable("Example: Transit timing variations",md"""
- Most directly observable: Transit times vs Time
- Makes the relationship more obvious: Residual of Transit time minus linear transit timing model
- Observer units
- X-axis: Transit times ($t_n$'s) in days or years
- Y-axis: Residuals ($\Delta~t_n$'s) in minutes
- Theorist units
- X-axis: Transit Number ($n$'s)
- Y-axis: Residuals ($\Delta~t_n$'s) in fraction of orbital period
""")
Example: Transit timing variations
Most directly observable: Transit times vs Time
Makes the relationship more obvious: Residual of Transit time minus linear transit timing model
Observer units
X-axis: Transit times ($t_n$'s) in days or years
Y-axis: Residuals ($\Delta~t_n$'s) in minutes
Theorist units
X-axis: Transit Number ($n$'s)
Y-axis: Residuals ($\Delta~t_n$'s) in fraction of orbital period
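The choice between observer and theorist units can be sketched in a few lines. The example below generates synthetic transit times, fits a linear ephemeris by ordinary least squares, and expresses the residuals both in minutes and as a fraction of the orbital period. All numerical values (period, epoch, TTV amplitude and period) are illustrative assumptions, not values from a real system.

```julia
using Statistics

# Synthetic transit times (days): linear ephemeris plus a small sinusoidal TTV.
# All numbers here are illustrative assumptions.
n = collect(1.0:50.0)                 # transit number
period_true, t0_true = 10.0, 1000.0   # days
ttv_amp = 10 / (24 * 60)              # 10 minutes, expressed in days
t = t0_true .+ period_true .* n .+ ttv_amp .* sin.(2π .* n ./ 25)

# Fit the linear model t_n ≈ t0 + P * n by ordinary least squares
P_fit = cov(n, t) / var(n)
t0_fit = mean(t) - P_fit * mean(n)

# Residuals in observer units (minutes) and theorist units (fraction of period)
resid_days = t .- (t0_fit .+ P_fit .* n)
resid_min = resid_days .* (24 * 60)
resid_frac = resid_days ./ P_fit
```

Plotting `resid_min` against `t` gives the observer-unit figure, while plotting `resid_frac` against `n` gives the theorist-unit figure. The underlying data are the same in both cases.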
Figure type
Start with a figure type that's well suited to the data and the purpose of the plot.
Points – measurements
Lines – predictions of models
Contours or Heatmap – 2-d histogram or probability density
Bar chart – comparing categorical data
Vector field or quiver – velocities in fluid flow
Box plot or Violin plots – comparing distributions
let
x = range(0,step=π/16,stop=π);
X = repeat(x,1,length(x))
Y = repeat(x',length(x))
U = cos.(X.*Y)
V = sin.(X.*Y)
quiver_scale = 0.2
plt = quiver(X,Y,quiver=(quiver_scale.*U,quiver_scale.*V), color=:black)
Foldable("Example: Quiver Plot", plt)
end
Example: Quiver Plot
let
plt = violin(repeat(["Type I", "Type II", "Type III"],outer=100),randn(300), legend=:none)
Foldable("Example: Violin Plot",plt)
end
Example: Violin Plot
Fonts
Use font large enough to read
Sans serif is easier to read (especially if small)
Check that the font includes the accents and symbols you need
Choose colors wisely
Proportional spacing vs. monospaced
begin
df = DataFrame(udec=Int64[], uhex=String[], Glyph=String[], LaTeX=String[], Description=String[])
for i in (
[0x00278, "ɸ", "\\ltphi", "Latin Small Letter Phi"] ,
[0x003C6, "φ", "\\varphi", "Greek Small Letter Phi"] ,
[0x003D5, "ϕ", "\\phi", "Greek Phi Symbol / Greek Small Letter Script Phi"] ,
[0x01D60, "ᵠ", "\\^phi", "Modifier Letter Small Greek Phi"] ,
[0x01D69, "ᵩ", "\\_phi", "Greek Subscript Small Letter Phi"] ,
[0x1D6D7, "𝛗", "\\bfvarphi", "Mathematical Bold Small Phi"] ,
[0x1D6DF, "𝛟", "\\bfphi", "Mathematical Bold Phi Symbol"] ,
[0x1D711, "𝜑", "\\itphi", "Mathematical Italic Small Phi"] ,
[0x1D719, "𝜙", "\\itvarphi", "Mathematical Italic Phi Symbol"] ,
[0x1D74B, "𝝋", "\\biphi", "Mathematical Bold Italic Small Phi"] ,
[0x1D753, "𝝓", "\\bivarphi", "Mathematical Bold Italic Phi Symbol"] ,
[0x1D785, "𝞅", "\\bsansphi", "Mathematical Sans-Serif Bold Small Phi"] ,
[0x1D78D, "𝞍", "\\bsansvarphi", "Mathematical Sans-Serif Bold Phi Symbol"] ,
[0x1D7BF, "𝞿", "\\bisansphi", "Mathematical Sans-Serif Bold Italic Small Phi"] ,
[0x1D7C7, "𝟇", "\\bisansvarphi", "Mathematical Sans-Serif Bold Italic Phi Symbol"])
push!(df, (i[1], string(i[1], base=16), string(Char(i[1])), i[3], i[4]))
end
Foldable("Example of mathematical symbols",
md"""
$(df[!,3:5])
#### [JuliaMono](https://juliamono.netlify.app/) has great coverage.
"""
)
end
Example of mathematical symbols
| Row | Glyph | LaTeX | Description |
|---|---|---|---|
| 1 | ɸ | `\ltphi` | Latin Small Letter Phi |
| 2 | φ | `\varphi` | Greek Small Letter Phi |
| 3 | ϕ | `\phi` | Greek Phi Symbol / Greek Small Letter Script Phi |
| 4 | ᵠ | `\^phi` | Modifier Letter Small Greek Phi |
| 5 | ᵩ | `\_phi` | Greek Subscript Small Letter Phi |
| 6 | 𝛗 | `\bfvarphi` | Mathematical Bold Small Phi |
| 7 | 𝛟 | `\bfphi` | Mathematical Bold Phi Symbol |
| 8 | 𝜑 | `\itphi` | Mathematical Italic Small Phi |
| 9 | 𝜙 | `\itvarphi` | Mathematical Italic Phi Symbol |
| 10 | 𝝋 | `\biphi` | Mathematical Bold Italic Small Phi |
| 11 | 𝝓 | `\bivarphi` | Mathematical Bold Italic Phi Symbol |
| 12 | 𝞅 | `\bsansphi` | Mathematical Sans-Serif Bold Small Phi |
| 13 | 𝞍 | `\bsansvarphi` | Mathematical Sans-Serif Bold Phi Symbol |
| 14 | 𝞿 | `\bisansphi` | Mathematical Sans-Serif Bold Italic Small Phi |
| 15 | 𝟇 | `\bisansvarphi` | Mathematical Sans-Serif Bold Italic Phi Symbol |
JuliaMono has great coverage.
Axes
Label axes
Specify units (unless dimensionless)
≥ 3 tick marks per axis
Plenty of space between axis labels
Axis Range
Don't just accept default axis scales!
Scale that includes all points may hide important variations.
Zooming in all the way can make variations look large even when they are small.
Questions to ask:
Does zero (or one) have significance?
Linear vs Log?
Would showing residuals to a baseline model be more helpful (or just more confusing)?
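One way to make the axis-range decision explicit is to compute the limits yourself rather than accept the default. The sketch below uses a hypothetical helper, `padded_limits` (not part of Plots.jl), that pads the data range and can optionally force a reference value such as zero into view. The data values and the 5% padding are assumptions for illustration.

```julia
# Hypothetical helper (not part of Plots.jl): pad the data range by a fraction,
# optionally forcing a reference value such as zero or one to be visible.
function padded_limits(y; pad=0.05, anchor=nothing)
    lo, hi = extrema(y)
    if anchor !== nothing
        lo, hi = min(lo, anchor), max(hi, anchor)
    end
    span = hi - lo
    span == 0 && (span = 1.0)   # avoid a zero-height axis for constant data
    return (lo - pad * span, hi + pad * span)
end

y = [99.2, 99.5, 100.1, 100.4]
tight = padded_limits(y)                # emphasizes the roughly 1% variation
with_zero = padded_limits(y, anchor=0)  # shows the variation is small relative to the value
```

Either tuple can be passed to the `ylims` attribute of a Plots.jl plot. Which one is appropriate depends on whether zero is meaningful for the quantity being shown.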
let
plt1 = plot(xlabel="Transit Number", ylabel="Time (d)",
markersize=1, legend=:none)
n = 100
tr_num = collect(1:n)
period = 10.0
t0 = 1000.0
t_linear = t0 .+ tr_num .* period
p_ttv = 365
t = t_linear .+ 10/(24*60) .* sin.(2π.*t_linear/p_ttv)
t .+= 5/(24*60) *randn(n)
resid = (t.-t_linear).*(24*60)
σt = ones(n).*15 ./(24*60)
mask = rand(n) .< 0.75
scatter!(plt1, tr_num[mask], t[mask], )
plt2 = plot(xlabel="Time (days)", ylabel="Δt (min)",
markersize=2, legend=:none)
σt .*= (24*60)
if plt_errorbars=="All"
scatter!(plt2, t[mask], resid[mask], yerr=σt[mask], )
else
scatter!(plt2, t[mask], resid[mask], )
if plt_errorbars == "One"
scatter!(plt2, [t[mask][end]], [resid[mask][end]], yerr=[σt[mask][end]], markercolor=1)
end
end
plot(plt1, plt2, layout=(2,1) )
end
Show error bars?
Lines
Increase line width (or weight) when used for print or projection
Hard to tell apart more than 4 line styles
Distinguish lines with both color and line style
let
plt = plot( xlabel="X", ylabel="Y", color_palette=cs_ex_categories_unordered, legend=:none)
x = 0:0.1:2π
y = sin.(x)
lw = 4
plot!(plt, x, y, linewidth=lw, linestyle= :solid)
plot!(plt, x, y.+0.1, linewidth=lw, linestyle= :dash)
plot!(plt, x, y.+0.2, linewidth=lw, linestyle= :dot)
plot!(plt, x, y.+0.3, linewidth=lw, linestyle= :dashdot)
plot!(plt, x, y.+0.4, linewidth=lw, linestyle= :dashdotdot)
end
Points
Increase point size, especially for print or projection
Hard to tell apart more than ~5 point (marker) shapes
Too many points overlapping can be misleading
Convey measurement uncertainties
let
plt = plot( xlabel="X", ylabel="Y", color_palette=ColorScheme(cvd_dict[cvd].(ColorSchemes.Paired_12)), legend=:none)
x = 0:0.1:2π
y = sin.(x)
lw = 4
shape_wo_stroke = [:circle, :rect, :star5, :diamond, :hexagon, :utriangle, :dtriangle, :rtriangle, :ltriangle, :pentagon, :heptagon, :octagon, :star4, :star6, :star7, :star8]
shape_w_stroke = [:cross, :xcross, :vline, :hline]
for i in 1:length(shape_wo_stroke)
scatter!(plt, x, y.+i.*0.1, markershape=shape_wo_stroke[i], markersize=3.5, markerstrokewidth=0)
end
for i in 1:length(shape_w_stroke)
j = i + length(shape_wo_stroke)
scatter!(plt, x, y.+j.*0.1, markershape=shape_w_stroke[i], markersize=3.5, markerstrokewidth=3)
end
plt
end
a, b = randn(4_000), randn(4_000);
let
if plt_heatmap
plt = histogram2d(a,b, bins=40, xlabel="X", ylabel="Y", color_palette=:lajolla, legend=:none)
else
plt = plot(xlabel="X", ylabel="Y", color_palette=:lajolla, legend=:none)
end
xlims!(-3.5, 3.5)
ylims!(-3.5, 3.5)
dens = kde((a,b))
dens_interp = InterpKDE(dens)
mask = map(i->pdf(dens_interp, a[i], b[i]) .<= plt_point_threshold, 1:length(a) )
#mask = trues(length(a))
scatter!(plt, a[mask],b[mask], markercolor=:blue, markersize=pointsize, markerstrokewidth=0)
if plt_contours
contour_color = plt_heatmap ? :yellow : :blue
plot!(plt, dens)
end
plt
end
Contours:
Point size:
Colors
Why are you using color?
1. To convey additional information
begin
local t = range(0,stop=5,length=500)
y = 10 .+ (2.0.+t./20) .* sin.(2π.*(t.+0.2.*sin.(2π.*t./20)))
tmod1 = mod.(t,1)
plt_dim1 = scatter(t, y, markerz=t, markersize=2.5, markerstrokewidth=0, xlabel = "Time", ylabel="Flux", legend=:none)
plt_dim2 = scatter(tmod1, y, markerz=t, markersize=2.5, markerstrokewidth=0, xlabel = "Time modulo 1", ylabel="Flux",legend=:none)
plt_dim = plot(plt_dim1, plt_dim2, layout=(2,1) )
end
#$plt_dim
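The second panel in the cell above plots flux against time modulo 1, with color carrying the original time. A minimal sketch of that folding step for an arbitrary period follows. The function name `fold_phase` and the parameters `period` and `t_ref` are assumptions introduced for illustration.

```julia
# Fold observation times on an assumed period so repeating structure lines up.
# `period` and `t_ref` are illustrative parameters, not values from the notebook.
fold_phase(t, period; t_ref=0.0) = mod.((t .- t_ref) ./ period, 1)

t = range(0, stop=5, length=11)   # times 0.0, 0.5, ..., 5.0
phase = fold_phase(t, 1.0)        # same operation as mod.(t, 1) when period = 1
```

After folding, points from different cycles overlap in phase, so mapping the unfolded time to color (as `markerz=t` does above) is what lets the reader see how the signal changes from one cycle to the next.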
md"""
#### 2. Draw attention to one element
"""
2. Draw attention to one element
begin # Microlensing magnification for single lens
A(u) = (2+u^2)/(u*sqrt(4+u^2))
u(t; u0::Real, t0::Real=zero(t), tE::Real=one(t) ) = sqrt(u0^2+(t-t0)^2/tE^2)
end;
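Written as equations, the two helper functions above are the point-source, single-lens magnification and the lens-source separation as a function of time. This is a direct transcription of the code, with the usual interpretation of the parameters (an assumption about intent, since the notebook does not define them): $u_0$ is the minimum separation in units of the Einstein radius, $t_0$ the time of closest approach, and $t_E$ the Einstein crossing time.

```latex
A(u) = \frac{u^2 + 2}{u\sqrt{u^2 + 4}}, \qquad
u(t) = \sqrt{u_0^2 + \left(\frac{t - t_0}{t_E}\right)^2}
```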
begin
plt_accent = plot(xlabel = "Time", ylabel="Magnification", legend=:none)
local t = -5:0.02:5
local flux = 0.5 .*( A.(u.(t,u0=0.75)) .+ A.(u.(t,u0=2,t0=2,tE=0.05)) )
flux .+= 0.005.*randn(length(flux))
scatter!(plt_accent, t, flux, markersize=2, markerstrokewidth=0 )
local idx_accent = findall(t->abs(t-2)<0.15, t)
scatter!(plt_accent, t[idx_accent], flux[idx_accent], markersize=2, markerstrokewidth=0 )
end
#$plt_accent
md"""
#### 3. To fit in with a color palette
$cs_ex_psu_penn
"""
3. To fit in with a color palette
begin
plt_theme = plot(xlabel = "Time", ylabel="Magnification", legend=:none,
fg_color_subplot=cs_pennsylvania[2], markersize=2.5, framestyle=:box,
fontfamily="Bookman Demi", color_palette=cs_ex_psu_accent
)
local λ = 5428:0.05:5432
line(λ; λ0=zero(λ), depth=1, width=7000/3e8 ) = 1-depth*exp(-0.5*((λ-λ0)^2/(λ0*width)^2))
local flux = line.(λ, λ0=5430).+0.05.*randn(length(λ))
scatter!(plt_theme, λ, flux, markercolor=2, markerstrokewidth=0, )
flux = 0.3.+line.(λ, λ0=5430.05).+0.05.*randn(length(λ))
scatter!(plt_theme, λ, flux, markercolor=3, markerstrokewidth=0, )
flux = 0.6.+line.(λ, λ0=5430.10).+0.05.*randn(length(λ))
scatter!(plt_theme, λ, flux, markercolor=4, markerstrokewidth=0, )
flux = 0.9.+line.(λ, λ0=5430.15).+0.05.*randn(length(λ))
scatter!(plt_theme, λ, flux, markercolor=5, markerstrokewidth=0, )
flux = 1.2.+line.(λ, λ0=5430.20).+0.05.*randn(length(λ))
scatter!(plt_theme, λ, flux, markercolor=6, markerstrokewidth=0, )
end
Choosing a Color Palette
Continuous Values
Should it be perceptually uniform?
Otherwise it can give an inaccurate impression
What if you print it in black & white?
Is there sufficient contrast?
Especially important when using projector
Can truncate palette to avoid using too light a color
How is it perceived by people with a color vision deficiency?
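The black-and-white and contrast questions can be checked numerically. The sketch below computes the relative luminance of each color using the standard sRGB transfer function and Rec. 709 weights, tests whether luminance changes monotonically along the palette, and drops colors above a brightness threshold. The five-color palette and the 0.8 threshold are made-up assumptions. For real work, ColorSchemes.jl and Colors.jl provide proper color types and conversions.

```julia
# sRGB channel (0 to 1) to linear light, per the standard sRGB transfer function
linearize(c) = c <= 0.04045 ? c / 12.92 : ((c + 0.055) / 1.055)^2.4

# Relative luminance with Rec. 709 weights
luminance((r, g, b)) = 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b)

# Hypothetical five-step palette as (r, g, b) tuples; not taken from ColorSchemes.jl
palette = [(0.10, 0.05, 0.20), (0.35, 0.15, 0.40), (0.70, 0.30, 0.35),
           (0.95, 0.60, 0.30), (1.00, 0.95, 0.70)]

lum = luminance.(palette)

# A palette that survives greyscale printing should change luminance in one direction
is_monotonic = issorted(lum) || issorted(lum, rev=true)

# Drop colors too light to show up on a projector (the threshold is an assumption)
truncated = [c for c in palette if luminance(c) <= 0.8]
```

A monotonic luminance ramp is necessary but not sufficient for perceptual uniformity. It only guarantees that the ordering survives conversion to greyscale.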
Examples:
Linear (e.g., lajolla) vs. reversed hot vs. jet1
Diverging (e.g., vik) vs. coolwarm
Cyclic (e.g., cyclic_protanopic_deuteranopic_bwyk_16_96_c31_n256) vs. phase
Color Vision Deficiency:
Categories
Unordered
Ordered Categories
Paired
Other common mistakes
Cramming too much information into a figure
The goal of visualization is to make it easy for the audience to understand, and not to show off your skill at making complicated plots.
When giving a presentation, you can build up a figure step by step to turn a complex figure into a story.
Making plot 3d when 2d would be easier to interpret
Using yellows in a presentation to be displayed with a projector
Keep your eyes open for...
Particularly effective plots
Poorly executed plots
Plots that are complex, but still readable thanks to good design
Think about why they were good or bad and what you can learn from them.
Setup & Helper Code
begin
using PlutoUI, PlutoTeachingTools
using Plots, Plots.PlotMeasures, LaTeXStrings
using StatsPlots, KernelDensity
using Colors, ColorSchemes, FixedPointNumbers
using DataFrames
end
ChooseDisplayMode()
Figures
PSU Palettes
cs_pennsylvania = ColorScheme(reinterpret.(RGB24,[0x001E44, 0x1E407C, 0x009CDE, 0x314D64, 0x3EA39E, 0xA2AAAD, 0xffffff]))
cs_classic_accent = ColorScheme(reinterpret.(RGB24,[0x6A3028,0xB88965,0xBF8226,0x4A7729,0x96BEE6,0xAC8DCE,0x444444,0xBC204B]))
cs_vibrant_accent = ColorScheme(reinterpret.(RGB24,[0xF2665E,0xE98300,0xFFD100,0x99CC00,0x008755,0x491D70,0x000321]))
Color vision deficiency code
begin
cvd_names = ["None", "Protanopic","Deuteranopic","Tritanopic","Greyscale"]
cvd_funcs = [identity, protanopic, deuteranopic, tritanopic,Gray]
cvd_dict = Dict(zip(cvd_names,cvd_funcs))
end;
begin
cs_ex_linear = ColorScheme(cvd_dict[cvd].(ColorSchemes.lajolla))
cs_ex_diverge = ColorScheme(cvd_dict[cvd].(ColorSchemes.vik))
cs_ex_cyclic = ColorScheme(cvd_dict[cvd].(ColorSchemes.cyclic_protanopic_deuteranopic_bwyk_16_96_c31_n256))
cs_ex_categories_unordered = ColorScheme(cvd_dict[cvd].(ColorSchemes.tol_bright))
cs_ex_categories_ordered = ColorScheme(cvd_dict[cvd].(ColorSchemes.YlGnBu_6))
cs_ex_paired = ColorScheme(cvd_dict[cvd].(ColorSchemes.Paired_6))
cs_ex_psu_penn = ColorScheme(cvd_dict[cvd].(cs_pennsylvania))
cs_ex_psu_accent = ColorScheme(cvd_dict[cvd].(cs_classic_accent))
cs_ex_psu_vibrant = ColorScheme(cvd_dict[cvd].(cs_vibrant_accent))
end;
begin
cs_ex_not_linear1 = ColorScheme(cvd_dict[cvd].(reverse(ColorSchemes.hot)))
cs_ex_not_linear2 = ColorScheme(cvd_dict[cvd].(ColorSchemes.jet1))
cs_ex_not_diverge = ColorScheme(cvd_dict[cvd].(ColorSchemes.coolwarm))
cs_ex_not_cyclic = ColorScheme(cvd_dict[cvd].(ColorSchemes.phase))
end;
Built with Julia 1.8.2 and
ColorSchemes 3.19.0
Colors 0.12.8
DataFrames 1.4.2
FixedPointNumbers 0.8.4
KernelDensity 0.6.5
LaTeXStrings 1.3.0
Plots 1.35.5
PlutoTeachingTools 0.2.3
PlutoUI 0.7.48
StatsPlots 0.15.4
To run this tutorial locally, download this file and open it with Pluto.jl.