Distribution
Distributions are the flagship data type in Squiggle. The distribution type is a generic data type that contains one of three different formats of distributions. These subtypes are point set, sample set, and symbolic. The first two of these have a few custom functions that only work on them. You can read more about the differences between these formats here.
Several functions below only can work on particular distribution formats. For example, scoring and pointwise math requires the point set format. When this happens, the types are automatically converted to the correct format. These conversions are lossy.
Distributions are created as sample sets by default. To create a symbolic distribution, use Sym.
namespace: Sym.normal
, Sym.beta
and so on.
Distribution Creation
These are functions for creating primitive distributions. Many of these could optionally take in distributions as inputs. In these cases, Monte Carlo Sampling will be used to generate the greater distribution. This can be used for simple hierarchical models.
See a longer tutorial on creating distributions here.
normal
normal: (distribution|number, distribution|number) => distribution
normal: (dict<{p5: distribution|number, p95: distribution|number}>) => distribution
normal: (dict<{p10: distribution|number, p90: distribution|number}>) => distribution
normal: (dict<{p25: distribution|number, p75: distribution|number}>) => distribution
normal: (dict<{mean: distribution|number, stdev: distribution|number}>) => distribution
Examples
normal(5, 1)
normal({ p5: 4, p95: 10 })
normal({ p10: 5, p95: 9 })
normal({ p25: 5, p75: 9 })
normal({ mean: 5, stdev: 2 })
normal(5 to 10, normal(3, 2))
normal({ mean: uniform(5, 9), stdev: 3 })
lognormal
lognormal: (distribution|number, distribution|number) => distribution
lognormal: (dict<{p5: distribution|number, p95: distribution|number}>) => distribution
lognormal: (dict<{p10: distribution|number, p90: distribution|number}>) => distribution
lognormal: (dict<{p25: distribution|number, p75: distribution|number}>) => distribution
lognormal: (dict<{mean: distribution|number, stdev: distribution|number}>) => distribution
Examples
lognormal(0.5, 0.8)
lognormal({ p5: 4, p95: 10 })
lognormal({ p10: 5, p95: 9 })
lognormal({ p25: 5, p75: 9 })
lognormal({ mean: 5, stdev: 2 })
to
The to
function is an easy way to generate simple distributions using predicted 5th and 95th percentiles.
To
is simply an alias for lognormal({p5:low, p95:high})
. It does not accept values of 0 or less, as those are not valid for lognormal distributions.
to: (distribution|number, distribution|number) => distribution
Examples
5 to 10
to(5,10)
uniform
uniform: (distribution|number, distribution|number) => distribution
Examples
uniform(10, 12)
beta
beta: (distribution|number, distribution|number) => distribution
beta: (dict<{mean: distribution|number, stdev: distribution|number}>) => distribution
Examples
beta(20, 25)
beta({ mean: 0.39, stdev: 0.1 })
cauchy
cauchy: (distribution|number, distribution|number) => distribution
Examples
cauchy(5, 1)
gamma
gamma: (distribution|number, distribution|number) => distribution
Examples
gamma(5, 1)
logistic
logistic: (distribution|number, distribution|number) => distribution
Examples
gamma(5, 1)
exponential
exponential: (distribution|number) => distribution
Examples
exponential(2)
bernoulli
bernoulli: (distribution|number) => distribution
Examples
bernoulli(0.5)
triangular
triangular: (number, number, number) => distribution;
Examples
triangular(5, 10, 20)
mixture
mixture: (...distributionLike, weights?:list<float>) => distribution
mixture: (list<distributionLike>, weights?:list<float>) => distribution
Note: If you want to pass in over 5 distributions, you must use the list syntax.
Examples
mixture(normal(5, 1), normal(10, 1), 8)
mx(normal(5, 1), normal(10, 1), [0.3, 0.7])
mx([normal(5, 1), normal(10, 1)], [0.3, 0.7])
Functions
make
Make a distribution, starting with either a number of a distribution.
make: (distribution|number) => distribution
sample
One random sample from the distribution
sample: (distribution) => number
Examples
sample(normal(5, 2))
sampleN
N random samples from the distribution
sampleN: (distribution, number) => list<number>
Examples
sampleN(normal(5, 2), 100)
mean
The distribution mean
mean: (distribution) => number
Examples
mean(normal(5, 2))
stdev
Standard deviation. Only works now on sample set distributions (so converts other distributions into sample set in order to calculate.)
stdev: (distribution) => number
variance
Variance. Similar to stdev, only works now on sample set distributions.
variance: (distribution) => number
Median
Median. The quantile at 0.5.
median: (distribution) => number
cdf
cdf: (distribution, number) => number
Examples
cdf(normal(5, 2), 3)
pdf: (distribution, number) => number
Examples
pdf(normal(5, 2), 3)
quantile
quantile: (distribution, number) => number
Examples
quantile(normal(5, 2), 0.5)
truncate
Truncates both the left side and the right side of a distribution.
truncate: (distribution, left: number, right: number) => distribution
Implementation Details
Sample set distributions are truncated by filtering samples, but point set
distributions are truncated using direct geometric manipulation. Uniform
distributions are truncated symbolically. Symbolic but non-uniform
distributions get converted to Point Set distributions.
truncateLeft
Truncates the left side of a distribution.
truncateLeft: (distribution, left: number) => distribution
Examples
truncateLeft(normal(5, 2), 3)
truncateRight
Truncates the right side of a distribution.
truncateRight: (distribution, right: number) => distribution
Examples
truncateRight(normal(5, 2), 6)
klDivergence
Kullback–Leibler divergence (opens in a new tab) between two distributions.
Note that this can be very brittle. If the second distribution has probability mass at areas where the first doesn't, then the result will be infinite. Due to numeric approximations, some probability mass in point set distributions is rounded to zero, leading to infinite results with klDivergence.
klDivergence: (distribution, distribution) => number
Examples
Dist.klDivergence(normal(5, 2), normal(5, 4)) // returns 0.57
logScore
A log loss score. Often that often acts as a scoring rule (opens in a new tab). Useful when evaluating the accuracy of a forecast.
Note that it is fairly slow.
Dist.logScore: ({estimate: distribution, ?prior: distribution, answer: distribution|number}) => number
Examples
Dist.logScore({
estimate: normal(5, 2),
answer: normal(4.5, 1.2),
prior: normal(6, 4),
}) // returns -0.597.57
Display
toString
toString: (distribution) => string
Examples
toString(normal(5, 2))
sparkline
Produce a sparkline of length n. For example, ▁▁▁▁▁▂▄▆▇██▇▆▄▂▁▁▁▁▁
. These can be useful for testing or quick text visualizations.
sparkline: (distribution, n = 20) => string
Examples
sparkline(truncateLeft(normal(5, 2), 3), 20) // produces ▁▇█████▇▅▄▃▂▂▁▁▁▁▁▁▁
Normalization
There are some situations where computation will return unnormalized distributions. This means that their cumulative sums are not equal to 1.0. Unnormalized distributions are not valid for many relevant functions; for example, klDivergence and scoring.
The only functions that do not return normalized distributions are the pointwise arithmetic operations and the scalewise arithmetic operations. If you use these functions, it is recommended that you consider normalizing the resulting distributions.
normalize
Normalize a distribution. This means scaling it appropriately so that it's cumulative sum is equal to 1. This only impacts Point Set distributions, because those are the only ones that can be non-normlized.
normalize: (distribution) => distribution
Examples
normalize(normal(5, 2))
isNormalized
Check of a distribution is normalized. Most distributions are typically normalized, but there are some commands that could produce non-normalized distributions.
isNormalized: (distribution) => bool
Examples
isNormalized(normal(5, 2)) // returns true
integralSum
Note: If you have suggestions for better names for this, please let us know.
Get the sum of the integral of a distribution. If the distribution is normalized, this will be 1.0. This is useful for understanding unnormalized distributions.
integralSum: (distribution) => number
Examples
integralSum(normal(5, 2))
Regular Arithmetic Operations
Regular arithmetic operations cover the basic mathematical operations on distributions. They work much like their equivalent operations on numbers.
The infixes +
,-
, *
, /
, ^
are supported for addition, subtraction, multiplication, division, power, and unaryMinus.
pointMass(5 + 10) == pointMass(5) + pointMass(10);
add
add: (distributionLike, distributionLike) => distribution
Examples
normal(0, 1) + normal(1, 3) // returns normal(1, 3.16...)
add(normal(0, 1), normal(1, 3)) // returns normal(1, 3.16...)
sum
sum: (list<distributionLike>) => distribution
cumulative sum
cumsum: (list<distributionLike>) => distribution
multiply
multiply: (distributionLike, distributionLike) => distribution
product
product: (list<distributionLike>) => distribution
cumulative product
cumprod: (list<distributionLike>) => list<distribution>
diff
diff: (list<distributionLike>) => list<distribution>
subtract
subtract: (distributionLike, distributionLike) => distribution
divide
divide: (distributionLike, distributionLike) => distribution
pow
pow: (distributionLike, distributionLike) => distribution
exp
exp: (distributionLike, distributionLike) => distribution
log
log: (distributionLike, distributionLike) => distribution
log10
log10: (distributionLike, distributionLike) => distribution
unaryMinus
unaryMinus: (distribution) => distribution
Examples
-normal(5, 2) // same as normal(-5, 2)
unaryMinus(normal(5, 2)) // same as normal(-5, 2)
Pointwise Arithmetic Operations
Unnormalized Results
Pointwise arithmetic operations typically return unnormalized or completely
invalid distributions. For example, the operation
normal(5,2) .- uniform(10,12)
results in a distribution-like
object with negative probability mass.
Pointwise arithmetic operations cover the standard arithmetic operations, but work in a different way than the regular operations. These operate on the y-values of the distributions instead of the x-values. A pointwise addition would add the y-values of two distributions.
The infixes .+
,.-
, .*
, ./
, .^
are supported for their respective operations.
The mixture
methods works with pointwise addition.
dotAdd
dotAdd: (distributionLike, distributionLike) => distribution
dotMultiply
dotMultiply: (distributionLike, distributionLike) => distribution
dotSubtract
dotSubtract: (distributionLike, distributionLike) => distribution
dotDivide
dotDivide: (distributionLike, distributionLike) => distribution
dotPow
dotPow: (distributionLike, distributionLike) => distribution
dotExp
dotExp: (distributionLike, distributionLike) => distribution