The STATISTICS library
The STATISTICS library provides some basic statistical functions along with optimized implementations.
Note
Currently, the statistical functions are only available for limited vectors of <double-float> values. This is expected to change in the future.
The STATISTICS module
Types
- <double-float-vector> Type
A <vector> that only contains <double-float> values.
- Equivalent:
limited(<vector>, of: <double-float>)
- Discussion:
This type is used for implementations of statistical functions which are specialized for <double-float> values.
- See also:
- <double-float?-vector> Type
A <vector> that contains values that are either <double-float> or #f.
- Equivalent:
limited(<vector>, of: false-or(<double-float>))
- Discussion:
This type is used for implementations of statistical functions which may need to handle missing data. Because it is a separate type from <double-float-vector>, the overhead of handling missing values is confined to the places where it is needed.
Note
Implementations of the statistical functions which handle missing data have not yet been provided.
- See also:
- <numeric-sequence> Type
- Equivalent:
type-union(<double-float-vector>, <double-float?-vector>)
- See also:
Coercion Functions
- double-float-vector Function
Utility function for converting a sequence that contains only <double-float> values to a <double-float-vector> for use with the optimized implementations of the basic statistical functions.
- Signature:
double-float-vector (seq) => (vec)
- Parameters:
seq – An instance of <sequence>.
- Values:
vec – An instance of <double-float-vector>.
- Example:
let dv = double-float-vector(#[1.0d0, 2.0d0, 3.0d0]);
Extrema
- maximum Open Generic function
Returns the maximum value from a numeric sequence.
- Signature:
maximum (sample) => (maximum)
- Parameters:
sample – An instance of <numeric-sequence>.
- Values:
maximum – An instance of <number>.
- Example:
Assuming that dv contains the values #[1.0d0, -1.0d0, 2.0d0]:
? maximum(dv)
=> 2.0d0
- See also:
- maximum(<double-float-vector>) Sealed Method
A specialized implementation of maximum for <double-float>.
- Parameters:
sample – An instance of <double-float-vector>.
- Values:
maximum – An instance of <double-float>.
- maximum/trimmed Open Generic function
Returns the maximum value from a numeric sequence that is below (or optionally equal to) an upper limit.
- Signature:
maximum/trimmed (sample upper-limit #key inclusive?) => (maximum)
- Parameters:
sample – An instance of <numeric-sequence>.
upper-limit – An instance of <number>.
inclusive? (#key) – An instance of <boolean>. Default value is #t.
- Values:
maximum – An instance of <number>.
- Discussion:
If inclusive? is true (the default), then values equal to the upper-limit are included when calculating the maximum value.
- Example:
Assuming that dv contains the values #[1.0d0, 2.0d0, 3.0d0, 4.0d0]:
? maximum/trimmed(dv, 3.0d0, inclusive?: #t)
=> 3.0d0
? maximum/trimmed(dv, 3.0d0, inclusive?: #f)
=> 2.0d0
- See also:
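The trimming rule can be illustrated outside of Dylan. The following is a minimal Python sketch, not the library's implementation: the name maximum_trimmed and the choice to return None when no value qualifies are assumptions made for this example, since the behaviour for that case is not documented here.

```python
from typing import Optional, Sequence


def maximum_trimmed(sample: Sequence[float], upper_limit: float,
                    inclusive: bool = True) -> Optional[float]:
    """Largest value below (or, if inclusive, equal to) upper_limit."""
    if inclusive:
        kept = [x for x in sample if x <= upper_limit]
    else:
        kept = [x for x in sample if x < upper_limit]
    # Returning None when nothing qualifies is a choice made for this sketch.
    return max(kept) if kept else None


dv = [1.0, 2.0, 3.0, 4.0]
print(maximum_trimmed(dv, 3.0, inclusive=True))
print(maximum_trimmed(dv, 3.0, inclusive=False))
```

The two calls mirror the Dylan example above: the limit itself is returned when inclusive, and the next value down otherwise. A trimmed minimum is the mirror image, with the comparisons reversed and min in place of max.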
- maximum/trimmed(<double-float-vector>, <double-float>) Sealed Method
A specialized implementation of maximum/trimmed for <double-float>.
- Parameters:
sample – An instance of <double-float-vector>.
upper-limit – An instance of <double-float>.
inclusive? (#key) – An instance of <boolean>.
- Values:
maximum – An instance of <double-float>.
- minimum Open Generic function
Returns the minimum value from a numeric sequence.
- Signature:
minimum (sample) => (minimum)
- Parameters:
sample – An instance of <numeric-sequence>.
- Values:
minimum – An instance of <number>.
- Example:
Assuming that dv contains the values #[1.0d0, -1.0d0, 2.0d0]:
? minimum(dv)
=> -1.0d0
- See also:
- minimum(<double-float-vector>) Sealed Method
A specialized implementation of minimum for <double-float>.
- Parameters:
sample – An instance of <double-float-vector>.
- Values:
minimum – An instance of <double-float>.
- minimum/trimmed Open Generic function
Returns the minimum value from a numeric sequence that is above (or optionally equal to) a lower limit.
- Signature:
minimum/trimmed (sample lower-limit #key inclusive?) => (minimum)
- Parameters:
sample – An instance of <numeric-sequence>.
lower-limit – An instance of <number>.
inclusive? (#key) – An instance of <boolean>.
- Values:
minimum – An instance of <number>.
- Discussion:
If inclusive? is true (the default), then values equal to the lower-limit are included when calculating the minimum value.
- Example:
Assuming that dv contains the values #[1.0d0, 2.0d0, 3.0d0, 4.0d0]:
? minimum/trimmed(dv, 2.0d0, inclusive?: #t)
=> 2.0d0
? minimum/trimmed(dv, 2.0d0, inclusive?: #f)
=> 3.0d0
- See also:
- minimum/trimmed(<double-float-vector>, <double-float>) Sealed Method
A specialized implementation of minimum/trimmed for <double-float>.
- Parameters:
sample – An instance of <double-float-vector>.
lower-limit – An instance of <double-float>.
inclusive? (#key) – An instance of <boolean>.
- Values:
minimum – An instance of <double-float>.
- minimum+maximum Open Generic function
Returns both the minimum and maximum values within a numeric sequence.
- Signature:
minimum+maximum (sample) => (minimum maximum)
- Parameters:
sample – An instance of <numeric-sequence>.
- Values:
- Example:
Assuming that dv contains the values #[1.0d0, -1.0d0, 2.0d0]:
? minimum+maximum(dv)
=> values(-1.0d0, 2.0d0)
- See also:
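One plausible reason to offer a combined function is that both extrema can be found in a single traversal. The Python sketch below illustrates that idea only. Whether the library's implementation works this way is an assumption, and the name minimum_and_maximum is hypothetical.

```python
from typing import Sequence, Tuple


def minimum_and_maximum(sample: Sequence[float]) -> Tuple[float, float]:
    """Return (minimum, maximum), visiting each element once."""
    if not sample:
        raise ValueError("sample must not be empty")
    lo = hi = sample[0]
    for x in sample[1:]:
        if x < lo:
            lo = x
        elif x > hi:
            hi = x
    return lo, hi


print(minimum_and_maximum([1.0, -1.0, 2.0]))
```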
- minimum+maximum(<double-float-vector>) Sealed Method
A specialized implementation of minimum+maximum for <double-float>.
- Parameters:
sample – An instance of <double-float-vector>.
- Values:
minimum – An instance of <double-float>.
maximum – An instance of <double-float>.
Means
- mean/arithmetic Open Generic function
Returns the arithmetic mean of a numeric sequence.
- Signature:
mean/arithmetic (sample) => (mean)
- Parameters:
sample – An instance of <numeric-sequence>.
- Values:
mean – An instance of <number>.
- Discussion:
Commonly known as just ‘mean’ or ‘average’, the arithmetic mean is the sum of the values of the sequence, divided by the number of values in the sequence. It is distinct from other ways of calculating a mean such as those provided by mean/geometric and mean/harmonic.
A naive implementation of the arithmetic mean is simple and slightly faster, but is subject to numerical inaccuracy. This implementation instead follows the method presented by Knuth in The Art of Computer Programming, 3rd edition on page 232.
- Equivalent:
The arithmetic mean is given by:
\[\frac{1}{n} \sum_{i=1}^{n} x_{i}\]
Our implementation is computed as follows, with the result being the final term \(m_{n}\):
\[\begin{split}&m_{1} = x_{1} \\ &m_{k} = m_{k-1} + \frac{x_{k} - m_{k-1}}{k}\end{split}\]
- Example:
Assuming that dv contains the values #[1.0d0, 2.0d0, 8.0d0, 9.0d0]:
? mean/arithmetic(dv)
=> 5.0d0
- See also:
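The recurrence for mean/arithmetic can be followed step by step in a short Python sketch. This illustrates the formula only and is not the library's Dylan source; the name running_mean is hypothetical.

```python
from typing import Sequence


def running_mean(sample: Sequence[float]) -> float:
    """Arithmetic mean via m_k = m_(k-1) + (x_k - m_(k-1)) / k."""
    if not sample:
        raise ValueError("sample must not be empty")
    mean = 0.0
    for k, x in enumerate(sample, start=1):
        # For k = 1 this yields x_1, matching m_1 = x_1.
        mean += (x - mean) / k
    return mean


print(running_mean([1.0, 2.0, 8.0, 9.0]))
print(running_mean([1e308, 1e308]))
```

The second call shows one benefit of the update form: no running total is kept, so the intermediate value stays within the range of the data even when the plain sum of the inputs would overflow a double.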
- mean/arithmetic(<double-float-vector>) Sealed Method
A specialized implementation of mean/arithmetic for <double-float>.
- Parameters:
sample – An instance of <double-float-vector>.
- Values:
mean – An instance of <double-float>.
- mean/fast Open Generic function
Returns the arithmetic mean of a numeric sequence.
- Signature:
mean/fast (sample) => (mean)
- Parameters:
sample – An instance of <numeric-sequence>.
- Values:
mean – An instance of <number>.
- Discussion:
This differs from mean/arithmetic by using a naive algorithm that is slightly faster, but subject to numerical inaccuracy. You should only use this function if you’re aware of the risks.
- Equivalent:
\(\frac{1}{n} \sum_{i=1}^{n} x_{i}\)
- Example:
Assuming that dv contains the values #[1.0d0, 2.0d0, 8.0d0, 9.0d0]:
? mean/fast(dv)
=> 5.0d0
- See also:
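The kind of inaccuracy a sum-then-divide mean is exposed to can be seen in a small Python sketch. This illustrates the general technique and is not the library's code; naive_mean is a hypothetical name, and math.fsum from the Python standard library is used only as an accurately rounded reference.

```python
import math
from typing import Sequence


def naive_mean(sample: Sequence[float]) -> float:
    """Sum everything with plain floating-point addition, then divide."""
    total = 0.0
    for x in sample:
        total += x
    return total / len(sample)


values = [1e16, 1.0, -1e16]
# The 1.0 is absorbed when it is added to 1e16, so it never reaches the result.
print(naive_mean(values))
# Reference value computed from an accurately rounded sum.
print(math.fsum(values) / len(values))
```

For well-scaled data such as the example above, the naive result is typically indistinguishable from the careful one, which is why the faster function can be a reasonable choice when the inputs are known to be benign.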
- mean/fast(<double-float-vector>) Sealed Method
A specialized implementation of mean/fast for <double-float>.
- Parameters:
sample – An instance of <double-float-vector>.
- Values:
mean – An instance of <double-float>.
- mean/geometric Open Generic function
Returns the geometric mean of a numeric sequence.
- Signature:
mean/geometric (sample) => (mean)
- Parameters:
sample – An instance of <numeric-sequence>.
- Values:
mean – An instance of <number>.
- Discussion:
For greater numerical accuracy, our implementation exponentiates the arithmetic mean of the natural logarithms of the values in sample.
- Equivalent:
The geometric mean is given by:
\[\left(\prod_{i=1}^{n} x_{i}\right)^{1/n}\]
Our implementation is computed as follows:
\[\exp\left[\frac{1}{n} \sum_{i=1}^{n} \ln x_{i}\right]\]
- Example:
Assuming that dv contains the values #[2.0d0, 4.0d0, 8.0d0]:
? mean/geometric(dv)
=> 4.0d0
- See also:
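The log-based formulation can be tried out in a few lines of Python. This is a sketch of the formula under the assumption that every value is strictly positive, not a copy of the library's implementation; geometric_mean is a hypothetical name.

```python
import math
from typing import Sequence


def geometric_mean(sample: Sequence[float]) -> float:
    """exp of the arithmetic mean of the natural logs (values must be > 0)."""
    if not sample:
        raise ValueError("sample must not be empty")
    log_sum = sum(math.log(x) for x in sample)  # math.log rejects x <= 0
    return math.exp(log_sum / len(sample))


print(geometric_mean([2.0, 4.0, 8.0]))
# The direct product 1e200 * 1e200 overflows a double; the log form does not.
print(geometric_mean([1e200, 1e200]))
```

Working in log space keeps the intermediate values small, so large or numerous inputs do not overflow the way a running product can.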
- mean/geometric(<double-float-vector>) Sealed Method
A specialized implementation of mean/geometric for <double-float>.
- Parameters:
sample – An instance of <double-float-vector>.
- Values:
mean – An instance of <double-float>.
- mean/harmonic Open Generic function
Returns the harmonic mean of a numeric sequence.
- Signature:
mean/harmonic (sample) => (mean)
- Parameters:
sample – An instance of <numeric-sequence>.
- Values:
mean – An instance of <number>.
- Discussion:
The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of the values of the sequence.
- Equivalent:
The harmonic mean is given by:
\[\frac{n}{\sum_{i=1}^{n} \frac{1}{x_{i}}}\]
- See also:
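As a worked illustration of the harmonic mean formula, here is a minimal Python sketch. It is not the library's implementation, the name harmonic_mean is hypothetical, and it assumes that no value in the sequence is zero.

```python
from typing import Sequence


def harmonic_mean(sample: Sequence[float]) -> float:
    """n divided by the sum of reciprocals (values must be non-zero)."""
    if not sample:
        raise ValueError("sample must not be empty")
    # 1.0 / x raises ZeroDivisionError if a value is zero.
    return len(sample) / sum(1.0 / x for x in sample)


# Equal distances driven at 60 and 40 units of speed average out to about 48.
print(harmonic_mean([60.0, 40.0]))
print(harmonic_mean([1.0, 2.0, 4.0]))
```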
- mean/harmonic(<double-float-vector>) Sealed Method
A specialized implementation of mean/harmonic for <double-float>.
- Parameters:
sample – An instance of <double-float-vector>.
- Values:
mean – An instance of <double-float>.
Scaling
- scale Open Generic function
- Signature:
scale (sample lower-bound upper-bound) => (res)
- Parameters:
sample – An instance of <numeric-sequence>.
lower-bound – An instance of <number>.
upper-bound – An instance of <number>.
- Values:
res – An instance of <numeric-sequence>.
- scale(<double-float-vector>, <double-float>, <double-float>) Sealed Method
A specialized implementation of scale for <double-float>.
- Parameters:
sample – An instance of <double-float-vector>.
lower-bound – An instance of <double-float>.
upper-bound – An instance of <double-float>.
- Values:
res – An instance of <double-float-vector>.
Variance and Deviation
- standard-deviation/population Open Generic function
- Signature:
standard-deviation/population (sample) => (standard-deviation)
- Parameters:
sample – An instance of <numeric-sequence>.
- Values:
standard-deviation – An instance of <number>.
- See also:
- standard-deviation/population(<double-float-vector>) Sealed Method
A specialized implementation of standard-deviation/population for <double-float>.
- Parameters:
sample – An instance of <double-float-vector>.
- Values:
standard-deviation – An instance of <double-float>.
- standard-deviation/sample Open Generic function
- Signature:
standard-deviation/sample (sample) => (standard-deviation)
- Parameters:
sample – An instance of <numeric-sequence>.
- Values:
standard-deviation – An instance of <number>.
- Discussion:
The standard deviation calculation for a sample, rather than a complete population, divides by sample.size - 1 rather than by the sample size. This is Bessel’s correction.
- See also:
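The effect of Bessel’s correction is easiest to see by computing both forms side by side. The Python sketch below uses the textbook two-pass formula, which is an assumption and not necessarily how the library computes its result; the function names are hypothetical.

```python
import math
from typing import Sequence


def variance(sample: Sequence[float], bessel: bool) -> float:
    """Two-pass variance; divides by n - 1 when bessel is true, else by n."""
    n = len(sample)
    if n < (2 if bessel else 1):
        raise ValueError("not enough values")
    mean = sum(sample) / n
    squared_deviations = sum((x - mean) ** 2 for x in sample)
    return squared_deviations / (n - 1 if bessel else n)


def standard_deviation(sample: Sequence[float], bessel: bool) -> float:
    """Square root of the corresponding variance."""
    return math.sqrt(variance(sample, bessel))


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(standard_deviation(data, bessel=False))  # population form, divides by 8
print(standard_deviation(data, bessel=True))   # sample form, divides by 7
```

For this data the squared deviations sum to 32, so the population variance is 32 / 8 and the sample variance is 32 / 7. The sample figure is always the larger of the two, and the gap shrinks as the number of values grows.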
- standard-deviation/sample(<double-float-vector>) Sealed Method
A specialized implementation of standard-deviation/sample for <double-float>.
- Parameters:
sample – An instance of <double-float-vector>.
- Values:
standard-deviation – An instance of <double-float>.
- variance/population Open Generic function
- Signature:
variance/population (sample) => (variance)
- Parameters:
sample – An instance of <numeric-sequence>.
- Values:
variance – An instance of <number>.
- See also:
- variance/population(<double-float-vector>) Sealed Method
A specialized implementation of variance/population for <double-float>.
- Parameters:
sample – An instance of <double-float-vector>.
- Values:
variance – An instance of <double-float>.
- variance/sample Open Generic function
- Signature:
variance/sample (sample) => (variance)
- Parameters:
sample – An instance of <numeric-sequence>.
- Values:
variance – An instance of <number>.
- See also:
- variance/sample(<double-float-vector>) Sealed Method
A specialized implementation of variance/sample for <double-float>.
- Parameters:
sample – An instance of <double-float-vector>.
- Values:
variance – An instance of <double-float>.
- standard-scores Open Generic function
- Signature:
standard-scores (population) => (scores)
- Parameters:
population – An instance of <numeric-sequence>.
- Values:
scores – An instance of <numeric-sequence>.
- Equivalent:
The standard score of a value in a sequence is given by:
\[z = \frac{x - \mu}{\sigma}\]
Where:
μ is the mean of the population
σ is the standard deviation of the population
- See also:
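The standard score formula can be applied element by element, as in the Python sketch below. It takes σ to be the population standard deviation, as the definition above states. The name standard_scores and the error raised for a zero deviation are assumptions made for this example, not documented library behaviour.

```python
import math
from typing import List, Sequence


def standard_scores(population: Sequence[float]) -> List[float]:
    """Return (x - mu) / sigma for every x, using the population sigma."""
    n = len(population)
    if n == 0:
        raise ValueError("population must not be empty")
    mu = sum(population) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in population) / n)
    if sigma == 0.0:
        # All values identical: the score is undefined (choice for this sketch).
        raise ValueError("standard deviation is zero")
    return [(x - mu) / sigma for x in population]


print(standard_scores([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
```

With a mean of 5 and a population standard deviation of 2, the value 9.0 maps to a score of 2 and the value 2.0 to a score of -1.5.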
- standard-scores(<double-float-vector>) Sealed Method
A specialized implementation of standard-scores for <double-float>.
- Parameters:
population – An instance of <double-float-vector>.
- Values:
scores – An instance of <double-float-vector>.