Distribution

Distributions are the flagship data type in Squiggle. The distribution type is a generic data type that contains one of three different formats of distributions. These subtypes are point set, sample set, and symbolic. The first two of these have a few custom functions that only work on them. You can read more about the differences between these formats here.

Several functions below can only work on particular distribution formats. For example, scoring and pointwise math require the point set format. When this happens, the types are automatically converted to the correct format. These conversions are lossy.

Distribution Creation

These are functions for creating primitive distributions. Many of these can optionally take distributions as inputs. In these cases, Monte Carlo sampling is used to generate the resulting distribution. This can be used for simple hierarchical models.

See a longer tutorial on creating distributions here.
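To make the Monte Carlo step concrete, here is a minimal sketch in Python rather than Squiggle. It is not Squiggle's implementation, only an illustration of what sampling from something like normal({ mean: uniform(5, 9), stdev: 3 }) involves: draw the uncertain parameter first, then draw from the distribution it defines. The function name, sample count, and seed are assumptions made for this example.

```python
import random
import statistics

def sample_hierarchical_normal(n, seed=0):
    """Draw n samples from a normal whose mean is itself uniform(5, 9)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        mean = rng.uniform(5, 9)            # sample the uncertain parameter first
        samples.append(rng.gauss(mean, 3))  # then sample the outer normal
    return samples

samples = sample_hierarchical_normal(10_000)
# The sample mean should land near 7, the midpoint of uniform(5, 9).
print(round(statistics.fmean(samples), 2))
```

The resulting samples are wider than a plain normal(7, 3) because the uncertainty in the mean adds to the spread.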

normal

normal: (distribution|number, distribution|number) => distribution
normal: (dict<{p5: distribution|number, p95: distribution|number}>) => distribution
normal: (dict<{mean: distribution|number, stdev: distribution|number}>) => distribution

Examples

normal(5, 1)
normal({ p5: 4, p95: 10 })
normal({ mean: 5, stdev: 2 })
normal(5 to 10, normal(3, 2))
normal({ mean: uniform(5, 9), stdev: 3 })
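The p5/p95 form can be understood as solving for the mean and standard deviation that place the 5th and 95th percentiles at the given values. The following Python sketch (an illustration only, not Squiggle code; the helper name normal_from_p5_p95 is hypothetical) shows that calculation for normal({ p5: 4, p95: 10 }).

```python
from statistics import NormalDist

def normal_from_p5_p95(p5, p95):
    """Return (mean, stdev) of the normal with the given 5th/95th percentiles."""
    z95 = NormalDist().inv_cdf(0.95)  # about 1.645 standard deviations
    mean = (p5 + p95) / 2             # a normal is symmetric around its mean
    stdev = (p95 - p5) / (2 * z95)
    return mean, stdev

mean, stdev = normal_from_p5_p95(4, 10)
fitted = NormalDist(mean, stdev)

# Check that the fitted distribution reproduces the requested percentiles.
print(mean, round(stdev, 2))
print(round(fitted.inv_cdf(0.05), 6), round(fitted.inv_cdf(0.95), 6))
```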

lognormal

lognormal: (distribution|number, distribution|number) => distribution
lognormal: (dict<{p5: distribution|number, p95: distribution|number}>) => distribution
lognormal: (dict<{mean: distribution|number, stdev: distribution|number}>) => distribution

Examples

lognormal(0.5, 0.8);
lognormal({ p5: 4, p95: 10 });
lognormal({ mean: 5, stdev: 2 });

uniform

uniform: (distribution|number, distribution|number) => distribution

Examples

uniform(10, 12);

beta

beta: (distribution|number, distribution|number) => distribution
beta: (dict<{mean: distribution|number, stdev: distribution|number}>) => distribution

Examples

beta(20, 25);
beta({ mean: 0.39, stdev: 0.1 });

cauchy

cauchy: (distribution|number, distribution|number) => distribution

Examples

cauchy(5, 1);

gamma

gamma: (distribution|number, distribution|number) => distribution

Examples

gamma(5, 1);

logistic

logistic: (distribution|number, distribution|number) => distribution

Examples

logistic(5, 1);

exponential

exponential: (distribution|number) => distribution

Examples

exponential(2);

bernoulli

bernoulli: (distribution|number) => distribution

Examples

bernoulli(0.5);

triangular

triangular: (number, number, number) => distribution

Examples

triangular(5, 10, 20);

to / credibleIntervalToDistribution

The to function is an easy way to generate simple distributions using predicted 5th and 95th percentiles.

If both values are above zero, a lognormal distribution is used. If not, a normal distribution is used.

to is an alias for credibleIntervalToDistribution. Because it is used so frequently, the shorter name is recommended.

to: (distribution|number, distribution|number) => distribution
credibleIntervalToDistribution: (distribution|number, distribution|number) => distribution

Examples

5 to 10
to(5,10)
-5 to 5
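The selection rule described above can be sketched as follows. This is Python, not Squiggle, and it is an assumption about how the two percentiles might map to parameters, not a copy of Squiggle's code. For the lognormal case the fit is done in log space, so mu and sigma are the parameters of the underlying normal. The helper name to_params is hypothetical.

```python
import math
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.95)  # about 1.645

def to_params(low, high):
    """Pick a distribution family from a 5th/95th percentile pair."""
    if low > 0 and high > 0:
        # Both bounds positive: fit a normal to the logs of the bounds.
        mu = (math.log(low) + math.log(high)) / 2
        sigma = (math.log(high) - math.log(low)) / (2 * Z95)
        return ("lognormal", mu, sigma)
    # Otherwise fit a normal directly to the bounds.
    mean = (low + high) / 2
    stdev = (high - low) / (2 * Z95)
    return ("normal", mean, stdev)

kind, mu, sigma = to_params(5, 10)
print(kind, round(math.exp(mu - Z95 * sigma), 6))  # recovers the 5th percentile
print(to_params(-5, 5)[0])
```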

mixture

mixture: (...distributionLike, weights?:list<float>) => distribution
mixture: (list<distributionLike>, weights?:list<float>) => distribution

Examples

mixture(normal(5, 1), normal(10, 1), 8);
mx(normal(5, 1), normal(10, 1), [0.3, 0.7]);
mx([normal(5, 1), normal(10, 1)], [0.3, 0.7]);
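One way to think about a weighted mixture is as a two-step sampling process: pick a component with probability proportional to its weight, then sample from that component. The Python sketch below illustrates this for the components normal(5, 1) and normal(10, 1) with weights [0.3, 0.7]. It is an illustration only, not Squiggle's implementation, and sample_mixture is a hypothetical helper.

```python
import random
import statistics

def sample_mixture(components, weights, n, seed=0):
    """Sample from a mixture of normals given as (mean, stdev) pairs."""
    rng = random.Random(seed)
    # random.choices treats the weights as relative, so they need not sum to 1.
    picks = rng.choices(components, weights=weights, k=n)
    return [rng.gauss(mean, stdev) for mean, stdev in picks]

samples = sample_mixture([(5, 1), (10, 1)], [0.3, 0.7], 10_000)
# The sample mean should be close to 0.3 * 5 + 0.7 * 10 = 8.5.
print(round(statistics.fmean(samples), 2))
```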

Functions

sample

One random sample from the distribution

sample: (distribution) => number

Examples

sample(normal(5, 2));

sampleN

N random samples from the distribution

sampleN: (distribution, number) => list<number>

Examples

sampleN(normal(5, 2), 100);

mean

The distribution mean

mean: (distribution) => number

Examples

mean(normal(5, 2));

stdev

Standard deviation. Currently this only works on sample set distributions, so other distributions are converted to sample sets before the calculation.

stdev: (distribution) => number

variance

Variance. Like stdev, this currently only works on sample set distributions.

variance: (distribution) => number

mode

mode: (distribution) => number

cdf

cdf: (distribution, number) => number

Examples

cdf(normal(5, 2), 3);

pdf

pdf: (distribution, number) => number

Examples

pdf(normal(5, 2), 3);

quantile

quantile: (distribution, number) => number

Examples

quantile(normal(5, 2), 0.5);
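As a cross-check on what cdf, pdf, and quantile compute, the same three quantities for a normal with mean 5 and standard deviation 2 can be evaluated with Python's standard library. This is not Squiggle code; quantile corresponds to the inverse CDF, which Python calls inv_cdf.

```python
from statistics import NormalDist

dist = NormalDist(mu=5, sigma=2)

cdf_at_3 = dist.cdf(3)      # probability mass at or below 3
pdf_at_3 = dist.pdf(3)      # density at 3
median = dist.inv_cdf(0.5)  # the value below which half the mass lies

print(round(cdf_at_3, 4), round(pdf_at_3, 4), median)
```

Because 3 is exactly one standard deviation below the mean, cdf_at_3 is about 0.1587, and the 0.5 quantile of a normal equals its mean.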

truncate

Truncates both the left side and the right side of a distribution.

truncate: (distribution, left: number, right: number) => distribution

Implementation Details

Sample set distributions are truncated by filtering samples, but point set distributions are truncated using direct geometric manipulation. Uniform distributions are truncated symbolically. Symbolic but non-uniform distributions get converted to Point Set distributions.
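The sample set case is the simplest to picture. A minimal Python sketch of truncation by filtering, under the assumption that samples outside the bounds are simply dropped (the helper truncate_samples is hypothetical and not part of Squiggle):

```python
import random

def truncate_samples(samples, left, right):
    """Keep only the samples that fall inside [left, right]."""
    return [x for x in samples if left <= x <= right]

rng = random.Random(0)
samples = [rng.gauss(5, 2) for _ in range(10_000)]
truncated = truncate_samples(samples, 3, 6)

# Filtering shrinks the sample count, so heavy truncation leaves fewer samples.
print(len(samples), len(truncated))
```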

truncateLeft

Truncates the left side of a distribution.

truncateLeft: (distribution, left: number) => distribution

Examples

truncateLeft(normal(5, 2), 3);

truncateRight

Truncates the right side of a distribution.

truncateRight: (distribution, right: number) => distribution

Examples

truncateRight(normal(5, 2), 6);

klDivergence

Kullback–Leibler divergence between two distributions.

klDivergence: (distribution, distribution) => number

Examples

klDivergence(normal(5, 2), normal(5, 4)); // returns 0.57

logScore

A log loss score, often used as a scoring rule. Useful when evaluating the accuracy of a forecast.

Note that it is fairly slow.

Dist.logScore: ({estimate: distribution, ?prior: distribution, answer: distribution|number}) => number

Examples

Dist.logScore({
  estimate: normal(5, 2),
  answer: normal(4.5, 1.2),
  prior: normal(6, 4),
}); // returns -0.597.57

Display

toString

toString: (distribution) => string

Examples

toString(normal(5, 2));

sparkline

Produce a sparkline of length n. For example, ▁▁▁▁▁▂▄▆▇██▇▆▄▂▁▁▁▁▁. These can be useful for testing or quick text visualizations.

sparkline: (distribution, n = 20) => string

Examples

sparkline(truncateLeft(normal(5, 2), 3), 20); // produces ▁▇█████▇▅▄▃▂▂▁▁▁▁▁▁▁

Normalization

There are some situations where computation will return unnormalized distributions. This means that their cumulative sums are not equal to 1.0. Unnormalized distributions are not valid for many relevant functions; for example, klDivergence and scoring.

The only functions that do not return normalized distributions are the pointwise arithmetic operations and the scalewise arithmetic operations. If you use these functions, it is recommended that you consider normalizing the resulting distributions.

normalize

Normalize a distribution. This means scaling it appropriately so that its cumulative sum is equal to 1. This only impacts Point Set distributions, because those are the only ones that can be non-normalized.

normalize: (distribution) => distribution

Examples

normalize(normal(5, 2));
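To illustrate what normalizing a point set means, the Python sketch below treats a distribution as a list of x-values and y-values, measures the area under it with the trapezoid rule, and rescales the y-values so the area becomes 1. This is an assumption about the general technique, not Squiggle's code. The helper names integral_sum and normalize mirror the function names on this page but are independent reimplementations for illustration.

```python
def integral_sum(xs, ys):
    """Trapezoid-rule area under a piecewise-linear density."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])
    )

def normalize(xs, ys):
    """Rescale the y-values so the area under the curve is 1."""
    total = integral_sum(xs, ys)
    return xs, [y / total for y in ys]

xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 2.0, 2.0, 0.0]  # unnormalized: area is 1 + 2 + 1 = 4

print(integral_sum(xs, ys))
norm_xs, norm_ys = normalize(xs, ys)
print(integral_sum(norm_xs, norm_ys))
```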

isNormalized

Check if a distribution is normalized. Most distributions are normalized, but some operations can produce non-normalized distributions.

isNormalized: (distribution) => bool

Examples

isNormalized(normal(5, 2)); // returns true

integralSum

Note: If you have suggestions for better names for this, please let us know.

Get the sum of the integral of a distribution. If the distribution is normalized, this will be 1.0. This is useful for understanding unnormalized distributions.

integralSum: (distribution) => number

Examples

integralSum(normal(5, 2));

Regular Arithmetic Operations

Regular arithmetic operations cover the basic mathematical operations on distributions. They work much like their equivalent operations on numbers.

The infix operators +, -, *, /, and ^ are supported for addition, subtraction, multiplication, division, and power. A prefix - is supported for unaryMinus.

pointMass(5 + 10) == pointMass(5) + pointMass(10);

add

add: (distributionLike, distributionLike) => distribution

Examples

normal(0, 1) + normal(1, 3); // returns normal(1, 3.16...)
add(normal(0, 1), normal(1, 3)); // returns normal(1, 3.16...)
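The 3.16... in the result comes from the rule for adding independent normal distributions: the means add, and the variances (not the standard deviations) add. A short Python check of that arithmetic, assuming the two inputs are independent (add_normals is a hypothetical helper, not a Squiggle function):

```python
import math

def add_normals(a, b):
    """Sum of two independent normals given as (mean, stdev) pairs."""
    (mean_a, stdev_a), (mean_b, stdev_b) = a, b
    # Variances add, so the new stdev is sqrt(stdev_a**2 + stdev_b**2).
    return mean_a + mean_b, math.hypot(stdev_a, stdev_b)

mean, stdev = add_normals((0, 1), (1, 3))
print(mean, round(stdev, 2))  # stdev is sqrt(10), about 3.16
```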

sum

Todo: Not yet implemented

sum: (list<distributionLike>) => distribution

Examples

sum([normal(0, 1), normal(1, 3), uniform(1, 10)]);

multiply

multiply: (distributionLike, distributionLike) => distribution

product

product: (list<distributionLike>) => distribution

subtract

subtract: (distributionLike, distributionLike) => distribution

divide

divide: (distributionLike, distributionLike) => distribution

pow

pow: (distributionLike, distributionLike) => distribution

exp

exp: (distributionLike) => distribution

log

log: (distributionLike, distributionLike) => distribution

log10

log10: (distributionLike) => distribution

unaryMinus

unaryMinus: (distribution) => distribution

Examples

-normal(5, 2); // same as normal(-5, 2)
unaryMinus(normal(5, 2)); // same as normal(-5, 2)

Pointwise Arithmetic Operations

Unnormalized Results

Pointwise arithmetic operations typically return unnormalized or completely invalid distributions. For example, the operation normal(5,2) .- uniform(10,12) results in a distribution-like object with negative probability mass.

Pointwise arithmetic operations cover the standard arithmetic operations, but work in a different way than the regular operations. These operate on the y-values of the distributions instead of the x-values. A pointwise addition would add the y-values of two distributions.

The infix operators .+, .-, .*, ./, and .^ are supported for their respective operations.

The mixture function works using pointwise addition.
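To see why pointwise results are usually unnormalized, the Python sketch below adds the y-values (densities) of two normal distributions on a shared grid and then measures the area under the result. It is an illustration of the idea only, not Squiggle's implementation. The grid bounds, the step size, and the helper integral_sum are assumptions chosen for this example.

```python
from statistics import NormalDist

def integral_sum(xs, ys):
    """Trapezoid-rule area under a curve sampled at xs."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])
    )

a = NormalDist(5, 2)
b = NormalDist(10, 1)

# Shared grid from -10 to 25 in steps of 0.05, wide enough to cover both tails.
xs = [-10 + 0.05 * i for i in range(701)]

# Pointwise addition: add the densities at each x, leaving the x-values alone.
ys = [a.pdf(x) + b.pdf(x) for x in xs]

total = integral_sum(xs, ys)
print(round(total, 3))  # close to 2, so the result is not a valid distribution
```

Each input integrates to 1, so their pointwise sum integrates to about 2. Dividing the y-values by that total, or weighting each input before adding, brings the area back to 1.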

dotAdd

dotAdd: (distributionLike, distributionLike) => distribution

dotMultiply

dotMultiply: (distributionLike, distributionLike) => distribution

dotSubtract

dotSubtract: (distributionLike, distributionLike) => distribution

dotDivide

dotDivide: (distributionLike, distributionLike) => distribution

dotPow

dotPow: (distributionLike, distributionLike) => distribution

dotExp

dotExp: (distributionLike, distributionLike) => distribution