This is the cl-mathstats Reference Manual, version 0.8.2, generated automatically by Declt version 4.0 beta 2 "William Riker" on Sun Dec 15 05:08:33 2024 GMT+0.
cl-mathstats/cl-mathstats.asd
cl-mathstats/dev/package.lisp
cl-mathstats/dev/api.lisp
cl-mathstats/dev/parameters.lisp
cl-mathstats/dev/math-utilities.lisp
cl-mathstats/dev/class-defs.lisp
cl-mathstats/dev/definitions.lisp
cl-mathstats/dev/binary-math.lisp
cl-mathstats/dev/matrices.lisp
cl-mathstats/dev/matrix-fns.lisp
cl-mathstats/dev/density-fns.lisp
cl-mathstats/dev/svd.lisp
cl-mathstats/dev/utilities.lisp
cl-mathstats/dev/define-statistical-fun.lisp
cl-mathstats/dev/basic-statistics.lisp
cl-mathstats/dev/smoothing.lisp
cl-mathstats/dev/correlation-regression.lisp
cl-mathstats/dev/anova.lisp
The main system appears first, followed by any subsystem dependency.
cl-mathstats
Common Lisp math and statistics routines
Maintainer: Gary Warren King <gwking@metabang.com>
Author: Gary Warren King <gwking@metabang.com>
License: MIT Style License
Version: 0.8.2
Dependencies: metatilities-base (system), cl-containers (system).
Modules are listed depth-first from the system components tree.
cl-mathstats/dev
cl-mathstats
(system).
package.lisp
(file).
api.lisp
(file).
parameters.lisp
(file).
math-utilities.lisp
(file).
class-defs.lisp
(file).
definitions.lisp
(file).
binary-math.lisp
(file).
matrices.lisp
(file).
matrix-fns.lisp
(file).
density-fns.lisp
(file).
svd.lisp
(file).
utilities.lisp
(file).
define-statistical-fun.lisp
(file).
basic-statistics.lisp
(file).
smoothing.lisp
(file).
correlation-regression.lisp
(file).
anova.lisp
(file).
cl-mathstats/website
cl-mathstats
(system).
source
(module).
Files are sorted by type and then listed depth-first from the systems components trees.
cl-mathstats/cl-mathstats.asd
cl-mathstats
(system).
cl-mathstats/dev/api.lisp
package.lisp
(file).
dev
(module).
dot-product
(generic function).
cl-mathstats/dev/parameters.lisp
package.lisp
(file).
dev
(module).
*gaussian-cdf-signals-zero-standard-deviation-error*
(special variable).
cl-mathstats/dev/math-utilities.lisp
package.lisp
(file).
dev
(module).
+e+
(constant).
2fpi
(constant).
combination-count
(function).
degrees->radians
(function).
ensure-float
(function).
f-measure
(function).
fpi
(constant).
linear-scale
(function).
on-interval
(function).
permutation-count
(function).
radians->degrees
(function).
round-to-factor
(function).
square
(function).
truncate-to-factor
(function).
cl-mathstats/dev/class-defs.lisp
package.lisp
(file).
dev
(module).
data-error
(condition).
enormous-contingency-table
(condition).
insufficient-data
(condition).
no-data
(condition).
not-binary-variables
(condition).
unmatched-sequences
(condition).
zero-standard-deviation
(condition).
zero-variance
(condition).
cl-mathstats/dev/definitions.lisp
math-utilities.lisp
(file).
dev
(module).
+0degrees+
(constant).
+10degrees+
(constant).
+120degrees+
(constant).
+135degrees+
(constant).
+150degrees+
(constant).
+15degrees+
(constant).
+180degrees+
(constant).
+210degrees+
(constant).
+225degrees+
(constant).
+240degrees+
(constant).
+270degrees+
(constant).
+300degrees+
(constant).
+30degrees+
(constant).
+315degrees+
(constant).
+330degrees+
(constant).
+360degrees+
(constant).
+45degrees+
(constant).
+5degrees+
(constant).
+60degrees+
(constant).
+90degrees+
(constant).
cl-mathstats/dev/binary-math.lisp
package.lisp
(file).
dev
(module).
cl-mathstats/dev/matrices.lisp
package.lisp
(file).
dev
(module).
normalize-matrix
(function).
sum-of-array-elements
(function).
transpose-matrix
(function).
1-or-2d-arrayp
(function).
check-type-of-arg
(macro).
fill-2d-array
(function).
invert-matrix
(function).
invert-matrix-iterate
(function).
list-2d-array
(function).
matrix-norm
(function).
multiply-matrices
(function).
scalar-matrix-multiply
(function).
cl-mathstats/dev/matrix-fns.lisp
matrices.lisp
(file).
dev
(module).
matrix-multiply
(function).
matrix-trace
(function).
matrix-addition
(function).
matrix-plus-matrix
(function).
matrix-plus-scalar
(function).
matrix-times-matrix
(function).
matrix-times-scalar
(function).
matrix-times-scalar!
(function).
reduce-matrix
(function).
cl-mathstats/dev/density-fns.lisp
parameters.lisp
(file).
dev
(module).
beta
(function).
beta-incomplete
(function).
binomial-cdf
(function).
binomial-cdf-exact
(function).
binomial-coefficient
(function).
binomial-coefficient-exact
(function).
binomial-probability
(function).
binomial-probability-exact
(function).
chi-square-significance
(function).
error-function
(function).
error-function-complement
(function).
f-significance
(function).
factorial
(function).
factorial-exact
(function).
factorial-ln
(function).
gamma-incomplete
(function).
gamma-ln
(function).
gaussian-cdf
(function).
gaussian-significance
(function).
poisson-cdf
(function).
safe-exp
(function).
students-t-significance
(function).
underflow-goes-to-zero
(macro).
+log-pi+
(constant).
+sqrt-pi+
(constant).
error-function-complement-short-1
(function).
error-function-complement-short-2
(function).
cl-mathstats/dev/svd.lisp
matrix-fns.lisp
(file).
dev
(module).
aref1
(macro).
aref11
(macro).
pythag-df
(function).
pythag-sf
(function).
sign-df
(macro).
sign-sf
(macro).
singular-value-decomposition
(function).
svbksb-df
(function).
svbksb-sf
(function).
svd-back-substitute
(function).
svd-inverse-fast-df
(function).
svd-inverse-fast-sf
(function).
svd-inverse-slow-df
(function).
svd-inverse-slow-sf
(function).
svd-matrix-inverse
(function).
svd-solve-linear-system
(function).
svd-zero
(function).
svdvar
(function).
svzero-df
(function).
svzero-sf
(function).
cl-mathstats/dev/utilities.lisp
package.lisp
(file).
dev
(module).
extract-unique-values
(function).
with-temp-table
(macro).
with-temp-vector
(macro).
*temporary-table*
(special variable).
*temporary-vector*
(special variable).
cl-mathstats/dev/define-statistical-fun.lisp
package.lisp
(file).
dev
(module).
convert
(method).
*create-statistical-objects*
(special variable).
composite-statistic
(class).
composite-statistic-p
(method).
composite-statistic-p
(method).
data
(class).
define-statistic
(macro).
make-statistic
(method).
remove-&rest
(function).
simple-statistic
(class).
simple-statistic-p
(method).
simple-statistic-p
(method).
statistic
(class).
statisticp
(method).
statisticp
(method).
with-routine-error-handling
(macro).
cl-mathstats/dev/basic-statistics.lisp
class-defs.lisp
(file).
define-statistical-fun.lisp
(file).
utilities.lisp
(file).
dev
(module).
confidence-interval
(function).
confidence-interval
(class).
confidence-interval-proportion
(function).
confidence-interval-proportion
(class).
confidence-interval-t
(function).
confidence-interval-t
(class).
confidence-interval-t-summaries
(function).
confidence-interval-z
(function).
confidence-interval-z
(class).
convert
(method).
convert
(method).
covariance
(function).
covariance
(class).
cross-product
(method).
d-test
(function).
d-test
(class).
data-length
(function).
data-length
(class).
dot-product
(method).
interquartile-range
(function).
interquartile-range
(class).
maximum
(function).
maximum
(class).
mean
(function).
mean
(class).
median
(function).
median
(class).
minimum
(function).
minimum
(class).
mode
(function).
mode
(class).
multiple-modes
(function).
multiple-modes
(class).
quantile
(function).
quantile
(class).
r-score
(function).
range
(function).
range
(class).
scheffe-tests
(function).
significance
(function).
significance
(class).
skewness
(function).
skewness
(class).
standard-deviation
(function).
standard-deviation
(class).
statistical-summary
(function).
statistical-summary
(class).
t-significance
(function).
t-significance
(class).
t-test
(function).
t-test
(class).
t-test-matched
(function).
t-test-matched
(class).
t-test-one-sample
(function).
t-test-one-sample
(class).
trimmed-mean
(function).
trimmed-mean
(class).
tukey-summary
(function).
tukey-summary
(class).
variance
(function).
variance
(class).
z-test-one-sample
(function).
z-test-one-sample
(class).
*continous-data-window-divisor*
(special variable).
*continuous-variable-uniqueness-factor*
(special variable).
*way-too-big-contingency-table-dimension*
(special variable).
chi-square-2x2
(function).
chi-square-2x2-counts
(function).
chi-square-rxc
(function).
chi-square-rxc-counts
(function).
confidence-interval-internal
(function).
confidence-interval-proportion-internal
(function).
confidence-interval-t-internal
(function).
confidence-interval-z-internal
(function).
confidence-interval-z-summaries
(function).
covariance-internal
(function).
d-test-internal
(function).
data-continuous-p
(function).
data-length-internal
(function).
difference-list
(function).
find-critical-value
(function).
g-test
(function).
inner-product
(function).
interquartile-range-internal
(function).
make-contingency-table
(function).
maximum-internal
(function).
mean-internal
(function).
median-internal
(function).
minimum-internal
(function).
mode-for-continuous-data
(function).
mode-internal
(function).
multiple-modes-internal
(function).
print-scheffe-table
(function).
quantile-internal
(function).
range-internal
(function).
significance-internal
(function).
skewness-internal
(function).
smart-mode
(function).
standard-deviation-internal
(function).
start/end
(macro).
statistical-summary-internal
(function).
sum-list
(function).
sum-of-squares
(function).
t-significance-internal
(function).
t-test-internal
(function).
t-test-matched-internal
(function).
t-test-one-sample-internal
(function).
trimmed-mean-internal
(function).
tukey-summary-internal
(function).
variance-internal
(function).
z-test-one-sample-internal
(function).
cl-mathstats/dev/smoothing.lisp
utilities.lisp
(file).
dev
(module).
smooth-4253h
(function).
smooth-hanning
(function).
smooth-mean-2
(function).
smooth-mean-3
(function).
smooth-mean-4
(function).
smooth-mean-5
(function).
smooth-median-2
(function).
smooth-median-3
(function).
smooth-median-4
(function).
smooth-median-5
(function).
cl-mathstats/dev/correlation-regression.lisp
package.lisp
(file).
define-statistical-fun.lisp
(file).
dev
(module).
autocorrelation
(function).
autocorrelation
(class).
correlation
(function).
correlation
(class).
correlation-from-summaries
(function).
correlation-matrix
(function).
cross-correlation
(function).
cross-correlation
(class).
lagged-correlation
(function).
linear-regression-brief
(function).
linear-regression-brief-summaries
(function).
linear-regression-minimal
(function).
linear-regression-minimal-summaries
(function).
linear-regression-verbose
(function).
linear-regression-verbose-summaries
(function).
multiple-linear-regression-arrays
(function).
multiple-linear-regression-brief
(function).
multiple-linear-regression-minimal
(function).
multiple-linear-regression-normal
(function).
multiple-linear-regression-verbose
(function).
partials-from-parents
(function).
autocorrelation-internal
(function).
correlation-internal
(function).
cross-correlation-internal
(function).
cl-mathstats/dev/anova.lisp
package.lisp
(file).
dev
(module).
anova-one-way-variables
(function).
anova-one-way-variables
(class).
anova-two-way-variables
(function).
anova-two-way-variables
(class).
anova-two-way-variables-unequal-cell-sizes
(function).
anova-two-way-variables-unequal-cell-sizes
(class).
anova-one-way-groups
(function).
anova-one-way-variables-internal
(function).
anova-two-way-groups
(function).
anova-two-way-variables-internal
(function).
anova-two-way-variables-unequal-cell-sizes-internal
(function).
make-3d-table
(function).
print-anova-table
(function).
Packages are listed by definition order.
cl-mathstats
Nickname: metabang.math
Use list: common-lisp, metabang.cl-containers, metabang.utilities.
+0degrees+
(constant).
+10degrees+
(constant).
+120degrees+
(constant).
+135degrees+
(constant).
+150degrees+
(constant).
+15degrees+
(constant).
+180degrees+
(constant).
+210degrees+
(constant).
+225degrees+
(constant).
+240degrees+
(constant).
+270degrees+
(constant).
+300degrees+
(constant).
+30degrees+
(constant).
+315degrees+
(constant).
+330degrees+
(constant).
+360degrees+
(constant).
+45degrees+
(constant).
+5degrees+
(constant).
+60degrees+
(constant).
+90degrees+
(constant).
+e+
(constant).
2fpi
(constant).
anova-one-way-variables
(function).
anova-one-way-variables
(class).
anova-two-way-variables
(function).
anova-two-way-variables
(class).
anova-two-way-variables-unequal-cell-sizes
(function).
anova-two-way-variables-unequal-cell-sizes
(class).
autocorrelation
(function).
autocorrelation
(class).
beta
(function).
beta-incomplete
(function).
binomial-cdf
(function).
binomial-cdf-exact
(function).
binomial-coefficient
(function).
binomial-coefficient-exact
(function).
binomial-probability
(function).
binomial-probability-exact
(function).
chi-square-significance
(function).
combination-count
(function).
confidence-interval
(function).
confidence-interval
(class).
confidence-interval-proportion
(function).
confidence-interval-proportion
(class).
confidence-interval-t
(function).
confidence-interval-t
(class).
confidence-interval-t-summaries
(function).
confidence-interval-z
(function).
confidence-interval-z
(class).
convert
(generic function).
correlation
(function).
correlation
(class).
correlation-from-summaries
(function).
correlation-matrix
(function).
covariance
(function).
covariance
(class).
cross-correlation
(function).
cross-correlation
(class).
cross-product
(generic function).
d-test
(function).
d-test
(class).
data-length
(function).
data-length
(class).
degrees->radians
(function).
div2
(function).
dot-product
(generic function).
ensure-float
(function).
error-function
(function).
error-function-complement
(function).
exp2
(function).
extract-unique-values
(function).
f-measure
(function).
f-significance
(function).
factorial
(function).
factorial-exact
(function).
factorial-ln
(function).
fpi
(constant).
gamma-incomplete
(function).
gamma-ln
(function).
gaussian-cdf
(function).
gaussian-significance
(function).
interquartile-range
(function).
interquartile-range
(class).
lagged-correlation
(function).
linear-regression-brief
(function).
linear-regression-brief-summaries
(function).
linear-regression-minimal
(function).
linear-regression-minimal-summaries
(function).
linear-regression-verbose
(function).
linear-regression-verbose-summaries
(function).
linear-scale
(function).
log2
(function).
matrix-multiply
(function).
matrix-trace
(function).
maximum
(function).
maximum
(class).
mean
(function).
mean
(class).
median
(function).
median
(class).
minimum
(function).
minimum
(class).
mod2
(function).
mode
(function).
mode
(class).
multiple-linear-regression-arrays
(function).
multiple-linear-regression-brief
(function).
multiple-linear-regression-minimal
(function).
multiple-linear-regression-normal
(function).
multiple-linear-regression-verbose
(function).
multiple-modes
(function).
multiple-modes
(class).
normalize-matrix
(function).
on-interval
(function).
partials-from-parents
(function).
permutation-count
(function).
poisson-cdf
(function).
quantile
(function).
quantile
(class).
r-score
(function).
radians->degrees
(function).
range
(function).
range
(class).
round-to-factor
(function).
safe-exp
(function).
scheffe-tests
(function).
significance
(function).
significance
(class).
skewness
(function).
skewness
(class).
smooth-4253h
(function).
smooth-hanning
(function).
smooth-mean-2
(function).
smooth-mean-3
(function).
smooth-mean-4
(function).
smooth-mean-5
(function).
smooth-median-2
(function).
smooth-median-3
(function).
smooth-median-4
(function).
smooth-median-5
(function).
square
(function).
standard-deviation
(function).
standard-deviation
(class).
statistical-summary
(function).
statistical-summary
(class).
students-t-significance
(function).
sum-of-array-elements
(function).
svdcmp-df
(function).
svdcmp-sf
(function).
t-significance
(function).
t-significance
(class).
t-test
(function).
t-test
(class).
t-test-matched
(function).
t-test-matched
(class).
t-test-one-sample
(function).
t-test-one-sample
(class).
times2
(function).
transpose-matrix
(function).
trimmed-mean
(function).
trimmed-mean
(class).
trunc2
(function).
truncate-to-factor
(function).
tukey-summary
(function).
tukey-summary
(class).
underflow-goes-to-zero
(macro).
variance
(function).
variance
(class).
with-temp-table
(macro).
with-temp-vector
(macro).
z-test-one-sample
(function).
z-test-one-sample
(class).
*continous-data-window-divisor*
(special variable).
*continuous-variable-uniqueness-factor*
(special variable).
*create-statistical-objects*
(special variable).
*gaussian-cdf-signals-zero-standard-deviation-error*
(special variable).
*temporary-table*
(special variable).
*temporary-vector*
(special variable).
*way-too-big-contingency-table-dimension*
(special variable).
+log-pi+
(constant).
+sqrt-pi+
(constant).
1-or-2d-arrayp
(function).
anova-one-way-groups
(function).
anova-one-way-variables-internal
(function).
anova-two-way-groups
(function).
anova-two-way-variables-internal
(function).
anova-two-way-variables-unequal-cell-sizes-internal
(function).
aref1
(macro).
aref11
(macro).
autocorrelation-internal
(function).
check-type-of-arg
(macro).
chi-square-2x2
(function).
chi-square-2x2-counts
(function).
chi-square-rxc
(function).
chi-square-rxc-counts
(function).
composite-statistic
(class).
composite-statistic-p
(generic function).
confidence-interval-internal
(function).
confidence-interval-proportion-internal
(function).
confidence-interval-t-internal
(function).
confidence-interval-z-internal
(function).
confidence-interval-z-summaries
(function).
correlation-internal
(function).
covariance-internal
(function).
cross-correlation-internal
(function).
d-test-internal
(function).
data
(class).
data-continuous-p
(function).
data-error
(condition).
data-length-internal
(function).
define-statistic
(macro).
difference-list
(function).
enormous-contingency-table
(condition).
error-function-complement-short-1
(function).
error-function-complement-short-2
(function).
fill-2d-array
(function).
find-critical-value
(function).
g-test
(function).
inner-product
(function).
insufficient-data
(condition).
interquartile-range-internal
(function).
invert-matrix
(function).
invert-matrix-iterate
(function).
list-2d-array
(function).
make-3d-table
(function).
make-contingency-table
(function).
make-statistic
(generic function).
matrix-addition
(function).
matrix-norm
(function).
matrix-plus-matrix
(function).
matrix-plus-scalar
(function).
matrix-times-matrix
(function).
matrix-times-scalar
(function).
matrix-times-scalar!
(function).
maximum-internal
(function).
mean-internal
(function).
median-internal
(function).
minimum-internal
(function).
mode-for-continuous-data
(function).
mode-internal
(function).
multiple-modes-internal
(function).
multiply-matrices
(function).
no-data
(condition).
not-binary-variables
(condition).
print-anova-table
(function).
print-scheffe-table
(function).
pythag-df
(function).
pythag-sf
(function).
quantile-internal
(function).
range-internal
(function).
reduce-matrix
(function).
remove-&rest
(function).
scalar-matrix-multiply
(function).
sign-df
(macro).
sign-sf
(macro).
significance-internal
(function).
simple-statistic
(class).
simple-statistic-p
(generic function).
singular-value-decomposition
(function).
skewness-internal
(function).
smart-mode
(function).
standard-deviation-internal
(function).
start/end
(macro).
statistic
(class).
statistical-summary-internal
(function).
statisticp
(generic function).
sum-list
(function).
sum-of-squares
(function).
svbksb-df
(function).
svbksb-sf
(function).
svd-back-substitute
(function).
svd-inverse-fast-df
(function).
svd-inverse-fast-sf
(function).
svd-inverse-slow-df
(function).
svd-inverse-slow-sf
(function).
svd-matrix-inverse
(function).
svd-solve-linear-system
(function).
svd-zero
(function).
svdvar
(function).
svzero-df
(function).
svzero-sf
(function).
t-significance-internal
(function).
t-test-internal
(function).
t-test-matched-internal
(function).
t-test-one-sample-internal
(function).
trimmed-mean-internal
(function).
tukey-summary-internal
(function).
unmatched-sequences
(condition).
variance-internal
(function).
with-routine-error-handling
(macro).
z-test-one-sample-internal
(function).
zero-standard-deviation
(condition).
zero-variance
(condition).
Definitions are sorted by export status, category, package, and then by lexicographic order.
An approximation of the constant e (named for Euler!).
The constant 2*pi, in single-float format. Using this constant avoids run-time double-float contagion.
The constant pi, in single-float format. Using this constant avoids run-time double-float contagion.
Protects against floating point underflow errors and sets the value to 0.0 instead.
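As a rough illustration of the behaviour described, and not the library's actual definition, a macro of this kind can be sketched with ‘handler-case.’ The name ‘with-underflow-as-zero’ is hypothetical, and whether an underflow is signalled at all depends on the implementation's floating-point trap settings.

```lisp
;; Hypothetical sketch (not the cl-mathstats definition): evaluate BODY
;; and return 0.0 if a floating-point underflow is signalled meanwhile.
(defmacro with-underflow-as-zero (&body body)
  `(handler-case (progn ,@body)
     (floating-point-underflow () 0.0)))

;; Ordinary results pass through unchanged.
(with-underflow-as-zero (* 2.0 3.0))
```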
Binds ‘temp’ to a hash table.
Binds ‘temp’ to a vector of length at least ‘min-size.’ It’s a vector of pointers and has a fill-pointer, initialized to ‘min-size.’
ANOVA-ONE-WAY-VARIABLES (IV DV &OPTIONAL (SCHEFFE-TESTS-P T)
CONFIDENCE-INTERVALS)
Performs a one-way analysis of variance (ANOVA) on the input data, which
should be two equal-length sequences: ‘iv’ is the independent variable,
represented as a sequence of categories or group identifiers, and ‘dv’ is the
dependent variable, represented as a sequence of numbers. The ‘iv’ variable
must be “sorted,” meaning that AAABBCCCCCDDDD is okay but ABCDABCDABDCDC is
not, where A, B, C and D are group identifiers. Furthermore, each group should
consist of at least 2 elements.
The significance of the result indicates that the group means are not all equal;
that is, at least two of the groups have significantly different means. If
there were only two groups, this would be semantically equivalent to an
unmatched, two-tailed t-test, so you can think of the one-way ANOVA as a
multi-group, two-tailed t-test.
This function returns five values:
1. an ANOVA table;
2. a list of group means;
3. either a Scheffe table or nil, depending on ‘scheffe-tests-p’;
4. an alternate value for SST; and
5. a list of confidence intervals in the form (mean lower upper) for each group, if ‘confidence-intervals’ is a number between zero and one giving the confidence level, such as 0.9.
The fourth value is only interesting if you think there are numerical accuracy problems; it should be approximately equal to the SST value in the ANOVA table. This function differs from ‘anova-one-way-groups’ only in its input representation. See the manual for more information.
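The requirement that ‘iv’ be “sorted” (equal group identifiers adjacent) can be checked before calling the function. The following is a minimal sketch; the helper name ‘groups-contiguous-p’ is hypothetical and not part of cl-mathstats.

```lisp
;; Hypothetical helper: true when equal group identifiers in IV are
;; contiguous, e.g. (a a a b b c) but not (a b a b).
(defun groups-contiguous-p (iv &key (test #'eql))
  (let ((seen '())      ; identifiers of groups already started
        (current nil)   ; identifier of the group being read
        (started nil))  ; false until the first element is seen
    (every (lambda (id)
             (cond ((and started (funcall test id current)) t)
                   ;; A group that reappears after another one began
                   ;; means the sequence is not sorted.
                   ((member id seen :test test) nil)
                   (t (push id seen)
                      (setf current id
                            started t)
                      t)))
           iv)))

(groups-contiguous-p '(a a a b b c c c c c d d d d))  ; acceptable input
(groups-contiguous-p '(a b c d a b c d a b d c d c))  ; not acceptable
```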
ANOVA-TWO-WAY-VARIABLES (DV IV1 IV2)
Calculates the analysis of variance when there are two factors that may
affect the dependent variable, specifically ‘iv1’ and ‘iv2.’ Unlike the one-way
ANOVA, there are mathematical difficulties with the two-way ANOVA if there are
unequal cell sizes; therefore, we require all cells to be the same size; that
is, the same number of values (of the dependent variable) for each combination
of the independent factors.
The result of the analysis is an anova-table, as described in the manual. This
function differs from ‘anova-two-way-groups’ only in its input representation.
See the manual for further discussion of analysis of variance.
The row effect is ‘iv1’ and the column effect is ‘iv2.’
ANOVA-TWO-WAY-VARIABLES-UNEQUAL-CELL-SIZES (IV1 IV2 DV)
Calculates the analysis of variance when there are two factors that may
affect the dependent variable, specifically ‘iv1’ and ‘iv2.’
Unlike the one-way ANOVA, there are mathematical difficulties with the two-way
ANOVA if there are unequal cell sizes. This function differs from the standard
two-way ANOVA by (1) the use of cell means as single scores, (2) the division of
squared quantities by the number of cell means contributing to the quantity
that is squared and (3) the multiplication of a "sum of squares" by the harmonic
mean of the sample sizes.
The result of the analysis is an anova-table, as described in the manual.
See the manual for further discussion of analysis of variance. The row effect
is ‘iv1’ and the column effect is ‘iv2.’
AUTOCORRELATION (SAMPLE MAX-LAG &OPTIONAL (MIN-LAG 0))
Autocorrelation is merely a cross-correlation between a sample and itself.
This function returns a list of correlations, where the i’th element is the
correlation of the sample with the sample starting at ‘i.’
Returns the value of the Beta function, defined in terms of the complete gamma function, G, as: G(z)G(w)/G(z+w). The implementation follows Numerical Recipes in C, section 6.1.
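For positive integer arguments the definition above can be checked by hand, because G(n) = (n-1)! in that case. The sketch below uses only exact integer arithmetic; the names ‘factorial-int’ and ‘beta-int’ are hypothetical, and the library itself works with floating-point gamma values rather than this shortcut.

```lisp
;; Exact factorial of a non-negative integer.
(defun factorial-int (n)
  (loop with acc = 1
        for i from 2 to n
        do (setf acc (* acc i))
        finally (return acc)))

;; Beta function for positive integers only, from
;; B(z,w) = G(z)G(w)/G(z+w) with G(n) = (n-1)!.
(defun beta-int (z w)
  (/ (* (factorial-int (1- z)) (factorial-int (1- w)))
     (factorial-int (+ z w -1))))

(beta-int 2 3)  ; 1! * 2! / 4! = 1/12
```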
This function is useful in defining the cumulative distributions for
Student’s t and the F distribution.
All arguments must be floating-point numbers; ‘a’ and ‘b’ must be positive and ‘x’ must be between 0.0 and 1.0, inclusive.
Suppose an event occurs with probability ‘p’ per trial. This function
computes the probability of ‘k’ or more events occurring in ‘n’ trials. Note
that this is the complement of the usual definition of cdf. This function
approximates the actual computation using the incomplete beta function, but is
preferable for large ‘n’ (greater than a dozen or so) because it avoids
summing many tiny floating-point numbers.
The implementation follows Numerical Recipes in C, section 6.3.
This is an exact but computationally intensive form of the preferred function, ‘binomial-cdf.’
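The quantity described, the probability of ‘k’ or more events in ‘n’ trials, can be written as a direct sum of binomial terms. The sketch below illustrates that sum with exact rational arithmetic; the function names and the argument order are assumptions made for this example and are not taken from the library.

```lisp
;; Exact binomial coefficient via the multiplicative formula.
(defun choose (n k)
  (let ((result 1))
    (loop for i from 1 to (min k (- n k))
          do (setf result (/ (* result (- n (- i 1))) i)))
    result))

;; Probability of K or more successes in N trials with success
;; probability P. Pass P as a rational for an exact answer.
(defun binomial-upper-tail (n p k)
  (loop for j from (max k 0) to n
        sum (* (choose n j) (expt p j) (expt (- 1 p) (- n j)))))

(binomial-upper-tail 3 1/2 2)  ; 3/8 + 1/8 = 1/2
```

The number of terms grows with ‘n,’ which is why the approximate form is preferred for large ‘n.’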
Returns the binomial coefficient, ‘n’ choose ‘k,’ as an integer. The result may not be exactly correct, since the computation is done with logarithms. The result is rounded to an integer. The implementation follows Numerical Recipes in C, section 6.1.
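To make the “computed with logarithms, then rounded” idea concrete, here is a small sketch. It sums logarithms directly instead of calling a log-gamma routine, which is an assumption made to keep the example self-contained; the names are hypothetical.

```lisp
;; Natural log of n! as a sum of double-float logarithms.
(defun ln-factorial (n)
  (loop for i from 2 to n
        sum (log (float i 1d0))))

;; n choose k computed in log space and rounded to the nearest integer.
;; For very large results the rounding may not recover the exact value.
(defun binomial-coefficient-via-logs (n k)
  (values (round (exp (- (ln-factorial n)
                         (ln-factorial k)
                         (ln-factorial (- n k)))))))

(binomial-coefficient-via-logs 10 3)  ; 120
```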
This is an exact but computationally intensive form of the preferred function, ‘binomial-coefficient.’
Returns the probability of ‘k’ successes in ‘n’ trials, where at each trial the probability of success is ‘p.’ This function uses floating-point approximations, and so is computationally efficient but not necessarily exact.
This is an exact but computationally intensive form of the preferred function, ‘binomial-probability.’
Computes the complement of the cumulative distribution function for a Chi-square random variable with ‘dof’ degrees of freedom evaluated at ‘x.’ The result is the probability that the observed chi-square for a correct model should be greater than ‘x.’ The implementation follows Numerical Recipes in C, section 6.2. Small values suggest that the null hypothesis should be rejected; in other words, this computes the significance of ‘x.’
Returns the number of combinations of n elements taken k at a time. Assumes valid input.
CONFIDENCE-INTERVAL NIL NIL
CONFIDENCE-INTERVAL-PROPORTION (X N CONFIDENCE)
Suppose we have a sample of ‘n’ things and ‘x’ of them are “successes.” We
can estimate the population proportion of successes as x/n; call it ‘p-hat.’
This function computes the estimate and a confidence interval on it. This
function is not appropriate for small samples with p-hat far from 1/2: ‘x’
should be at least 5, and so should ‘n’-‘x.’ This function returns three values:
p-hat, and the lower and upper bounds of the confidence interval. ‘Confidence’
should be a number between 0 and 1, exclusive.
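A common way to build such an interval is the normal approximation, p-hat plus or minus z times sqrt(p-hat(1 - p-hat)/n). The manual does not state which formula the library uses, so the sketch below is an assumption; the function name is hypothetical and the critical value ‘z’ is supplied by the caller (about 1.645 for 90 percent confidence).

```lisp
;; Normal-approximation interval for a proportion (assumed formula).
;; Returns three values: p-hat, lower bound, upper bound.
(defun proportion-interval (x n z)
  (let* ((p-hat (/ x n))
         (half-width (* z (sqrt (/ (* p-hat (- 1 p-hat)) n)))))
    (values p-hat (- p-hat half-width) (+ p-hat half-width))))

;; 40 successes out of 100, 90 percent confidence.
(proportion-interval 40 100 1.645)
```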
CONFIDENCE-INTERVAL-T (DATA CONFIDENCE)
Suppose you have a sample of 10 numbers and you want to compute a 90 percent
confidence interval on the population mean. This function is the one to use.
This function uses the t-distribution, and so it is appropriate for small sample
sizes. It can also be used for large sample sizes, but the function
‘confidence-interval-z’ may be computationally faster. It returns three values:
the mean and the lower and upper bound of the confidence interval. True, only
two numbers are necessary, but the confidence intervals of other statistics may
be asymmetrical and these values would be consistent with those confidence
intervals. ‘Data’ should be a sequence of numbers. ‘Confidence’ should be a
number between 0 and 1, exclusive.
This function is just like ‘confidence-interval-t,’ except that instead of its arguments being the actual data, it takes the following summary statistics: ‘mean,’ which is the estimator of some t-distributed parameter; ‘dof,’ which is the number of degrees of freedom in estimating the mean; and the ‘standard-error’ of the estimator. In general, ‘mean’ is a point estimator of the mean of a t-distribution, which may be the slope parameter of a regression, the difference between two means, or other practical t-distributions. ‘Confidence’ should be a number between 0 and 1, exclusive.
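Given the summary statistics, the interval is the estimate plus or minus a critical value times the standard error. In the sketch below the critical value is passed in directly, whereas the library presumably derives it from ‘dof’ and ‘confidence’; the function name and argument list are hypothetical.

```lisp
;; Interval from summary statistics. CRITICAL-VALUE is the t quantile
;; for the desired confidence and degrees of freedom, supplied by the
;; caller in this sketch.
(defun interval-from-summaries (mean standard-error critical-value)
  (let ((half-width (* critical-value standard-error)))
    (values mean (- mean half-width) (+ mean half-width))))

;; 9 degrees of freedom at 90 percent confidence: t is about 1.833.
(interval-from-summaries 10.0 0.5 1.833)
```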
CONFIDENCE-INTERVAL-Z (DATA CONFIDENCE)
Suppose you have a sample of 50 numbers and you want to compute a 90 percent
confidence interval on the population mean. This function is the one to use.
Note that it makes the assumption that the sampling distribution is normal, so
it’s inappropriate for small sample sizes; use ‘confidence-interval-t’ for those.
It returns three values: the mean and the lower and upper bound of the
confidence interval. True, only two numbers are necessary, but the confidence
intervals of other statistics may be asymmetrical and these values would be
consistent with those confidence intervals. This function handles 90, 95 and 99
percent confidence intervals as special cases, so those will be quite fast.
‘Data’ should be a sequence of numbers. ‘Confidence’ should be a number
between 0 and 1, exclusive.
CORRELATION (SAMPLE1 SAMPLE2 &KEY START1 END1 START2 END2)
Computes the correlation coefficient of two samples, which should be equal-length sequences of numbers.
Computes the correlation of two variables given summary statistics of the variables. All of these arguments are summed over the variable: ‘x’ is the sum of the x’s, ‘x2’ is the sum of the squares of the x’s, and ‘xy’ is the sum of the cross-products, which is also known as the inner product of the variables x and y. Of course, ‘n’ is the number of data values in each variable.
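The summed quantities combine into the usual computational formula for the correlation coefficient. The sketch below assumes the sums for y (‘y’ and ‘y2’) are supplied alongside those for x; the function name and the argument order are hypothetical.

```lisp
;; Correlation from sums: N data values, X and Y the sums of each
;; variable, X2 and Y2 the sums of squares, XY the sum of products.
(defun correlation-from-sums (n x y x2 y2 xy)
  (/ (- (* n xy) (* x y))
     (sqrt (* (- (* n x2) (* x x))
              (- (* n y2) (* y y))))))

;; x = (1 2 3), y = (2 4 6): perfectly correlated, so the result is 1.0.
(correlation-from-sums 3 6 12 14 56 28)
```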
Returns a matrix of all the correlations of all the variables. The dependent variable is row and column zero.
COVARIANCE (SAMPLE1 SAMPLE2 &KEY START1 END1 START2 END2)
Computes the covariance of two samples, which should be equal-length
sequences of numbers. Covariance is the inner product of differences between
sample elements and their sample means. For more information, see the manual.
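The inner product of the deviations can be sketched as follows. Dividing by n - 1 is an assumption (the usual sample covariance); the description above does not state the divisor, and the function name is hypothetical.

```lisp
;; Sample covariance of two equal-length sequences of numbers.
(defun sample-covariance (xs ys)
  (let* ((n (length xs))
         (mean-x (/ (reduce #'+ xs) n))
         (mean-y (/ (reduce #'+ ys) n)))
    (/ (reduce #'+ (map 'list
                        (lambda (x y) (* (- x mean-x) (- y mean-y)))
                        xs ys))
       (- n 1))))

(sample-covariance '(1 2 3) '(2 4 6))  ; (2 + 0 + 2) / 2 = 2
```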
CROSS-CORRELATION (SEQUENCE1 SEQUENCE2 MAX-LAG &OPTIONAL (MIN-LAG 0))
Returns a list of the correlation coefficients for all lags from ‘min-lag’ to ‘max-lag,’ inclusive, where the ‘i’th list element is the correlation of the first (length-of-sequence1 - i) elements of sequence1 with the last (length-of-sequence2 - i) elements of sequence2. Both sequences should be sequences of numbers and of equal length.
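One reading of this description is that, for lag i, the first n - i elements of the first sequence are paired with the elements of the second sequence starting at index i. The sketch below follows that reading for lists and non-negative lags only; ‘pearson’ and ‘lagged-correlations’ are hypothetical names, not the library's functions.

```lisp
;; Pearson correlation of two equal-length lists of numbers.
(defun pearson (xs ys)
  (let* ((n (length xs))
         (mx (/ (reduce #'+ xs) n))
         (my (/ (reduce #'+ ys) n))
         (sxy 0) (sxx 0) (syy 0))
    (loop for x in xs
          for y in ys
          do (incf sxy (* (- x mx) (- y my)))
             (incf sxx (expt (- x mx) 2))
             (incf syy (expt (- y my) 2)))
    (/ sxy (sqrt (* sxx syy)))))

;; Correlations for lags MIN-LAG..MAX-LAG (non-negative in this sketch).
(defun lagged-correlations (seq1 seq2 max-lag &optional (min-lag 0))
  (let ((n (length seq1)))
    (loop for lag from min-lag to max-lag
          collect (pearson (subseq seq1 0 (- n lag))
                           (subseq seq2 lag)))))

;; The second sequence is the first one delayed by one step,
;; so the correlation at lag 1 is 1.0.
(lagged-correlations '(1 2 3 4 5 6) '(9 1 2 3 4 5) 1)
```

Calling the same function with one sequence passed twice gives the autocorrelation.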
D-TEST (SAMPLE-1 SAMPLE-2 TAILS &KEY (TIMES 1000) (H0MEAN 0))
Two-sample test for difference in means. Competes with the unmatched,
two-sample t-test. Each sample should be a sequence of numbers. We calculate
the mean of ‘sample-1’ minus the mean of ‘sample-2’; call that D. Under the null
hypothesis, D is zero. There are three possible alternative hypotheses: D is
positive, D is negative, and D is either, and they are selected by the ‘tails’
parameter, which must be :positive, :negative, or :both, respectively. We count
the number of chance occurrences of D in the desired rejection region, and
return the estimated probability.
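The counting step can be illustrated with a randomization scheme: shuffle the pooled data ‘times’ times, recompute D each time, and count how often it falls in the rejection region. The manual does not say which resampling scheme the library uses, so the pooled shuffle here is an assumption, as are the helper names; the sketch also expects both samples as lists and ignores ‘h0mean.’

```lisp
(defun mean-of (xs)
  (/ (reduce #'+ xs) (length xs)))

;; Fisher-Yates shuffle, in place.
(defun shuffle (vector)
  (loop for i from (1- (length vector)) downto 1
        do (rotatef (aref vector i) (aref vector (random (1+ i)))))
  vector)

;; Estimated probability of a difference in means at least as extreme
;; as the observed one. TAILS is :positive, :negative or :both.
(defun randomization-p-value (sample-1 sample-2 tails &key (times 1000))
  (let* ((n1 (length sample-1))
         (observed (- (mean-of sample-1) (mean-of sample-2)))
         (pool (coerce (append sample-1 sample-2) 'vector))
         (hits 0))
    (loop repeat times
          do (shuffle pool)
             (let ((d (- (mean-of (subseq pool 0 n1))
                         (mean-of (subseq pool n1)))))
               (when (ecase tails
                       (:positive (>= d observed))
                       (:negative (<= d observed))
                       (:both (>= (abs d) (abs observed))))
                 (incf hits))))
    (/ hits times)))

(randomization-p-value '(10 11 12 13 14) '(1 2 3 4 5) :positive)
```

Because the procedure is random, repeated calls return slightly different estimates.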
DATA-LENGTH (DATA &KEY START END KEY)
Returns the number of data values in ‘data.’ Essentially, this is the Common
Lisp ‘length’ function, except it handles sequences where there is a ‘start’ or
‘end’ parameter. The ‘key’ parameter is ignored.
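The effect of ‘start’ and ‘end’ on the count amounts to a subtraction. A minimal sketch, with a hypothetical function name:

```lisp
;; Number of elements of DATA between START (default 0) and
;; END (default: the length of DATA).
(defun length-between (data &key start end)
  (- (or end (length data)) (or start 0)))

(length-between '(a b c d e) :start 1 :end 4)  ; 3
```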
Convert degrees to radians.
Divide positive fixnum ‘i’ by 2 or a power of 2, yielding an integer result. For example, (div2 35 5) => 1.
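Division by a power of 2 with an integer result is the same as an arithmetic shift to the right. The sketch below shows this with the standard function ‘ash’; the name ‘shift-divide’ and the default of 1 for the power are assumptions made for the example.

```lisp
;; Divide non-negative integer I by 2^POWER, discarding the remainder.
(defun shift-divide (i &optional (power 1))
  (ash i (- power)))

(shift-divide 35 5)  ; 35 / 32 truncates to 1
```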
Computes the error function, which is typically used to compute areas under
the Gaussian probability distribution. See the manual for more information.
Also see the function ‘gaussian-cdf.’
This implementation follows Numerical Recipes in C, section 6.2
This function computes the complement of the error function, “erfc(x),” defined as 1-erf(x). See the documentation for ‘error-function’ for a more complete definition and description. Essentially, this function on z/sqrt2 returns the two-tailed significance of z in a standard Gaussian distribution.
This function implements the function that Numerical Recipes in C calls erfcc (see section 6.3); that is, it’s the one using the Chebyshev approximation, since that is the one they call from their statistical functions. It is quick to compute and has fractional error everywhere less than 1.2x10^{-7}.
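The Chebyshev approximation can be sketched as below. The coefficients are those published for erfcc in Numerical Recipes in C; the function name and the double-float arithmetic are choices made for this example and may differ from the library's code.

```lisp
(defun sketch-erfc (x)
  (let* ((z (abs (float x 1d0)))
         (u (/ 1d0 (+ 1d0 (* 0.5d0 z))))
         ;; Horner evaluation of the polynomial in U, highest-order
         ;; coefficient first.
         (poly (reduce (lambda (acc c) (+ c (* u acc)))
                       '(0.17087277d0 -0.82215223d0 1.48851587d0
                         -1.13520398d0 0.27886807d0 -0.18628806d0
                         0.09678418d0 0.37409196d0 1.00002368d0)
                       :initial-value 0d0))
         (ans (* u (exp (+ (- (* z z)) -1.26551223d0 (* u poly))))))
    ;; erfc(-x) = 2 - erfc(x)
    (if (>= x 0) ans (- 2d0 ans))))

;; Two-tailed significance of z = 1.96 in a standard Gaussian distribution.
(sketch-erfc (/ 1.96d0 (sqrt 2d0)))
```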
2^n
A faster version of ‘remove-duplicates’. Note you cannot specify a :TEST (it is always #’eq).
Returns the f-measure, a combination of precision and recall weighted by the
parameter ‘beta.’ The default, beta = .5, weights precision and recall equally;
beta = 1 puts all the weight on precision, and beta = 0 puts all the weight on recall.
From a recent statistics book - All of Statistics - springer verlag http://www2.springeronline.com/sgw/cda/frontpage/0,,4-10128-22-13887455-0,00.html
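One formula consistent with the behavior described for ‘beta’ is a weighted harmonic mean of precision and recall. The sketch below assumes that form; the name and lambda list are illustrative and the library's actual formula is not shown in this manual.

```lisp
(defun sketch-f-measure (precision recall &optional (beta 0.5))
  ;; Weighted harmonic mean: BETA weights precision, (1 - BETA) weights recall.
  (/ 1 (+ (/ beta precision) (/ (- 1 beta) recall))))

(sketch-f-measure 0.5 1.0)   ; equal weighting, i.e. 2PR/(P+R), about 0.667
(sketch-f-measure 0.5 1.0 1) ; => 0.5, the precision
(sketch-f-measure 0.5 1.0 0) ; => 1.0, the recall
```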
This function occurs in the statistical test of whether two observed samples
have the same variance. A certain statistic, F, essentially the ratio of the
observed dispersion of the first sample to that of the second one, is
calculated. This function computes the tail areas of the null hypothesis: that
the variances of the numerator and denominator are equal. It can be used for
either a one-tailed or two-tailed test. The default is two-tailed, but
one-tailed can be computed by setting the optional argument ‘one-tailed-p’ to
true.
For a two-tailed test, this function computes the probability that F would be as
different from 1.0 (larger or smaller) as it is, if the null hypothesis is
true.
For a one-tailed test, this function computes the probability that F would be as
LARGE as it is if the first sample’s underlying distribution actually has
SMALLER variance than the second’s, where ‘numerator-dof’ and ‘denominator-dof’
are the numbers of degrees of freedom in the numerator sample and the denominator
sample. In other words, this computes the significance level at which the
hypothesis “the numerator sample has smaller variance than the denominator
sample” can be rejected.
A small numerical value implies a very significant rejection.
The ‘f-statistic’ must be a non-negative floating-point number. The degrees of
freedom arguments must be positive integers. The ‘one-tailed-p’ argument is
treated as a boolean.
This implementation follows Numerical Recipes in C, section 6.3, and the ‘ftest’ function in section 13.4. Some of the documentation is also drawn from section 6.3, since I couldn’t improve on their explanation.
Returns the factorial of ‘n,’ which should be a non-negative integer. The
result will be returned as a floating-point number, single-float if possible,
otherwise double-float. If it is returned as a double-float, it won’t
necessarily be integral, since the actual computation is
(exp (gamma-ln (1+ n)))
Implementation is loosely based on Numerical Recipes in C, section 6.1. On the TI Explorer, the largest argument that won’t cause a floating overflow is 170.
Returns the factorial of ‘n,’ which should be an integer. The result will be returned as a fixnum or bignum. This implementation is exact, but is more computationally expensive than ‘factorial,’ which is to be preferred when an exact result is not required.
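An exact factorial relies on Common Lisp's arbitrary-precision integers. A minimal sketch (the name is hypothetical, and the library may compute it differently):

```lisp
(defun sketch-factorial-exact (n)
  (check-type n (integer 0))
  (let ((result 1))
    ;; The product silently becomes a bignum once it outgrows a fixnum.
    (loop for i from 2 to n
          do (setf result (* result i)))
    result))

(sketch-factorial-exact 5)  ; => 120
(sketch-factorial-exact 20) ; => 2432902008176640000
```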
Returns the natural logarithm of n!; ‘n’ should be an integer. The result will be a single-precision, floating point number. The implementation follows Numerical Recipes in C, section 6.1
This is an incomplete gamma function, what Numerical Recipes in C calls “gammp.” This function also returns, as the second value, g(a,x). See the manual for more information.
Returns the natural logarithm of the Gamma function evaluated at ‘x.’ Mathematically, the Gamma function is defined to be the integral from 0 to Infinity of t^(x-1) exp(-t) dt. The implementation is copied, with extensions for the reflection formula, from Numerical Recipes in C, section 6.1. The argument ‘x’ must be positive. Full accuracy is obtained for x>1. For x<1, the reflection formula is used. The computation is done using double-floats, and the result is a double-float.
Computes the cumulative distribution function for a Gaussian random variable (defaults: mean=0.0, s.d.=1.0) evaluated at ‘x.’ The result is the probability of getting a random number less than or equal to ‘x,’ from the given Gaussian distribution.
Computes the significance of ‘x’ in a Gaussian distribution with mean=‘mean’
(default 0.0) and standard deviation=‘sd’ (default 1.0); that is, it returns
the area that lies farther from the mean than ‘x’ does.
The null hypothesis is roughly that ‘x’ is zero; you must specify your alternative hypothesis (H1) via the ‘tails’ parameter, which must be :both, :positive or :negative. The first corresponds to a two-tailed test: H1 is that ‘x’ is not zero, but you are not specifying a direction. If the parameter is :positive, H1 is that ‘x’ is positive, and similarly for :negative.
INTERQUARTILE-RANGE (DATA)
The interquartile range is similar to the variance of a sample because both
are statistics that measure how “spread out” a sample is. The interquartile
range is the difference between the 3/4 quantile (the upper quartile) and the
1/4 quantile (the lower quartile).
Returns the correlations of ‘sequence1’ with ‘sequence2’ after shifting ‘sequence1’ by ‘lag’. This means that for all n, element n of ‘sequence1’ is paired with element n+‘lag’ of ‘sequence2’, where both of those elements exist.
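The pairing rule can be illustrated on its own. The helper below is hypothetical and only builds the pairs that would then be correlated; it is not part of the library.

```lisp
(defun sketch-lagged-pairs (sequence1 sequence2 lag)
  (let ((v1 (coerce sequence1 'vector))
        (v2 (coerce sequence2 'vector)))
    ;; Pair element n of SEQUENCE1 with element n+LAG of SEQUENCE2,
    ;; skipping every n for which the shifted index does not exist.
    (loop for n from 0 below (length v1)
          for m = (+ n lag)
          when (< -1 m (length v2))
            collect (cons (aref v1 n) (aref v2 m)))))

(sketch-lagged-pairs '(1 2 3 4) '(10 20 30 40) 1)
;; => ((1 . 20) (2 . 30) (3 . 40))
```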
Calculates the main statistics of a linear regression: the slope and
intercept of the line, the coefficient of determination, also known as r-square,
the standard error of the slope, and the p-value for the regression. This
function takes two equal-length sequences of raw data. Note that the dependent
variable, as always, comes first in the argument list.
You should first look at your data with a scatter plot to see if a linear model is plausible. See the manual for a fuller explanation of linear regression statistics.
Calculates the main statistics of a linear regression: the slope and
intercept of the line, the coefficient of determination, also known as r-square,
the standard error of the slope, and the p-value for the regression. This
function differs from ‘linear-regression-brief’ in that it takes summary
variables: ‘x’ and ‘y’ are the sums of the independent variable and dependent
variables, respectively; ‘x2’ and ‘y2’ are the sums of the squares of the
independent variable and dependent variables, respectively; and ‘xy’ is the sum
of the products of the independent and dependent variables.
You should first look at your data with a scatter plot to see if a linear model is plausible. See the manual for a fuller explanation of linear regression statistics.
Calculates the slope and intercept of the regression line. This function
takes two equal-length sequences of raw data. Note that the dependent variable,
as always, comes first in the argument list.
You should first look at your data with a scatter plot to see if a linear model is plausible. See the manual for a fuller explanation of linear regression statistics.
Calculates the slope and intercept of the regression line. This function
differs from ‘linear-regression-minimal’ in that it takes summary statistics:
‘x’ and ‘y’ are the sums of the independent variable and dependent variables,
respectively; ‘x2’ and ‘y2’ are the sums of the squares of the independent
variable and dependent variables, respectively; and ‘xy’ is the sum of the
products of the independent and dependent variables.
You should first look at your data with a scatter plot to see if a linear model is plausible. See the manual for a fuller explanation of linear regression statistics.
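The slope and intercept follow directly from the sums. In this sketch the function name, the argument order, and the explicit count ‘n’ are assumptions made for the example; ‘y2’ is omitted because the slope and intercept do not need it.

```lisp
(defun sketch-regression-from-summaries (n x y x2 xy)
  (let* ((slope (/ (- (* n xy) (* x y))
                   (- (* n x2) (* x x))))
         (intercept (/ (- y (* slope x)) n)))
    (values slope intercept)))

;; x = (1 2 3), y = (3 5 7), which lie exactly on y = 2x + 1.
(sketch-regression-from-summaries 3 6 15 14 34) ; => 2, 1
```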
Calculates almost every statistic of a linear regression: the slope and
intercept of the line, the standard error on each, the correlation coefficient,
the coefficient of determination, also known as r-square, and an ANOVA table as
described in the manual.
This function takes two equal-length sequences of raw data. Note that the
dependent variable, as always, comes first in the argument list. If you don’t
need all this information, consider using the “-brief,” or “-minimal”
functions, which do less computation.
You should first look at your data with a scatter plot to see if a linear model is plausible. See the manual for a fuller explanation of linear regression statistics.
Calculates almost every statistic of a linear regression: the slope and
intercept of the line, the standard error on each, the correlation coefficient,
the coefficient of determination, also known as r-square, and an ANOVA table as
described in the manual.
If you don’t need all this information, consider using the “-brief” or
“-minimal” functions, which do less computation.
This function differs from ‘linear-regression-verbose’ in that it takes summary
variables: ‘x’ and ‘y’ are the sums of the independent variable and dependent
variables, respectively; ‘x2’ and ‘y2’ are the sums of the squares of the
independent variable and dependent variables, respectively; and ‘xy’ is the sum
of the products of the independent and dependent variables.
You should first look at your data with a scatter plot to see if a linear model is plausible. See the manual for a fuller explanation of linear regression statistics.
Rescales value linearly from the old-min/old-max scale to the new-min/new-max one.
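A linear rescaling of this kind can be written in one expression. The name and argument order below are assumptions for illustration.

```lisp
(defun sketch-linear-scale (value old-min old-max new-min new-max)
  (+ new-min
     (* (- value old-min)
        (/ (- new-max new-min) (- old-max old-min)))))

(sketch-linear-scale 5 0 10 0 100)     ; => 50
(sketch-linear-scale 212 32 212 0 100) ; => 100
```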
Log of ‘n’ to base 2.
Successively multiplies the elements of ‘args’. If two elements are scalars, their product is i * j. If a scalar is multiplied by a matrix, each element of the matrix is multiplied by the scalar. If two matrices are multiplied, standard matrix multiplication is applied, and the dimensions must be compatible: if ARGi is a x b and ARGj is c x d, then b must be equal to c.
MAXIMUM (DATA &KEY START END KEY)
Returns the element of the sequence ‘data’ whose ‘key’ is maximum. Signals
‘no-data’ if there is no data. If there is only one element in the data
sequence, that element will be returned, regardless of whether it is valid (a
number).
MEAN (DATA &KEY START END KEY)
Returns the arithmetic mean of ‘data,’ which should be a sequence.
Signals ‘no-data’ if there is no data.
MEDIAN (DATA &KEY START END KEY)
Returns the median of the subsequence of ‘data’ from ‘start’ to ‘end’, using
‘key’. The median is just the 0.5 quantile, and so this function returns the
same values as the ‘quantile’ function.
MINIMUM (DATA &KEY START END KEY)
Returns the element of the sequence ‘data’ whose ‘key’ is minimum. Signals
‘no-data’ if there is no data. If there is only one element in the data
sequence, that element will be returned, regardless of whether it is valid (a
number).
Find ‘n’ mod a power of 2.
MODE (DATA &KEY START END KEY)
Returns the most frequent element of ‘data,’ which should be a sequence. The
algorithm involves sorting, and so the data must be numbers or the ‘key’
function must produce numbers. Consider ‘sxhash’ if no better function is
available. Also returns the number of occurrences of the mode. If there is
more than one mode, this returns the first mode, as determined by the sorting of
the numbers.
This is an internal function for the use of the multiple-linear-regression functions. It takes the lists of values given by CLASP and puts them into a pair of arrays, A and b, suitable for solving the matrix equation Ax=b, to find the regression equation. The values are A and b. The first column of A is the constant 1, so that an intercept will be included in the regression model.
Let m be the number of independent variables, ‘ivs.’ This function returns a
vector of length m which are the coefficients of a linear equation that best
predicts the dependent variable, ‘dv,’ in the least squares sense. It also
returns, as the second value, the sum of squared deviations of the data from the
fitted model, aka SSE, aka chi-square. The third value is the number of degrees
of freedom for the chi-square, if you want to test the fit.
This function returns an intermediate amount of information. Consider using the sibling functions -minimal and -verbose if you want less or more information.
Let m be the number of independent variables, ‘ivs.’ This function returns
a vector of length m which are the coefficients of a linear equation that best
predicts the dependent variable, ‘dv,’ in the least squares sense.
This function returns the minimal information for a least squares regression model, namely a list of the coefficients of the ivs, with the constant term first. Consider using the sibling functions -brief and -verbose if you want more information.
Performs linear regression of the dependent variable, ‘dv,’ on multiple independent variables, ‘ivs.’ Y on multiple X’s, calculating the intercept and regression coefficient. Calculates the F statistic, intercept and the correlation coefficient for Y on X’s.
Let m be the number of independent variables, ‘ivs.’ This function returns
fourteen values:
1. the intercept
2. a list of coefficients
3. a list of correlations of each iv to the dv and to each iv
4. a list of the t-statistic for each coefficient
5. a list of the standardized coefficients (betas)
6. the fraction of variance accounted for, aka r-square
7. the ratio of MSR (see #12) to MSE (see #13), aka F
8. a list of the portion of the SSR due to each iv
9. a list of the fraction of variance accounted for by each iv
10. the sum of squares of the regression, aka SSR
11. the sum of squares of the residuals, aka SSE, aka chi-square
12. the mean squared error of the regression, aka MSR
13. the mean squared error of the residuals, aka MSE
14. a list of indices of “zeroed” independent variables
This function returns a lot of information about the regression. Consider using the sibling functions -minimal and -brief if you need less information.
MULTIPLE-MODES (DATA K &KEY START END KEY)
Returns the ‘k’ most frequent elements of ‘data,’ which should be a sequence.
The algorithm involves sorting, and so the data must be numbers or the ‘key’
function must produce numbers. Consider #’sxhash if no better function is
available. Also returns the number of occurrences of each mode. The value is
an association list of modes and their counts. This function is a little more
computationally expensive than ‘mode,’ so only use it if you really need
multiple modes.
Returns a new matrix such that the sum of its elements is 1.0
Returns t iff x is in the interval.
Returns the number of possible ways of taking k elements out of n total.
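This count is the binomial coefficient. A minimal exact sketch using integer arithmetic (the name is hypothetical; the library may instead go through ‘factorial’ or ‘gamma-ln’):

```lisp
(defun sketch-combination-count (n k)
  (let ((result 1))
    ;; After step i, RESULT equals C(n-k+i, i), so every division is exact.
    (loop for i from 1 to k
          do (setf result (/ (* result (+ (- n k) i)) i)))
    result))

(sketch-combination-count 5 2)  ; => 10
(sketch-combination-count 52 5) ; => 2598960
```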
Computes the cumulative distribution function for a Poisson random variable with mean ‘x’ evaluated at ‘k.’ The result is the probability that the number of Poisson random events occurring will be between 0 and k-1 inclusive, if the expected number is ‘x.’ The argument ‘k’ should be an integer, while ‘x’ should be a float. The implementation follows Numerical Recipes in C, section 6.2
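For small ‘k’ the same probability can be obtained by summing the Poisson terms directly. This sketch uses that direct sum rather than the incomplete gamma function from Numerical Recipes, and its name and argument order are assumptions.

```lisp
(defun sketch-poisson-cdf (k x)
  (let ((term (exp (- (float x 1d0)))) ; P(0 events) = e^-x
        (sum 0d0))
    ;; Accumulate P(0) + P(1) + ... + P(k-1).
    (dotimes (i k sum)
      (incf sum term)
      ;; P(i+1) = P(i) * x / (i+1)
      (setf term (/ (* term x) (1+ i))))))

(sketch-poisson-cdf 2 1.0d0) ; P(0) + P(1) = 2/e, about 0.7358
```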
QUANTILE (DATA Q &KEY START END KEY)
Returns the element which is the q’th percentile of the data when accessed by
‘key.’ That is, it returns the element such that ‘q’ of the data is smaller than
it and 1-‘q’ is above it, where ‘q’ is a number between zero and one, inclusive.
For example, if ‘q’ is .5, this returns the median; if ‘q’ is 0, this returns
the minimum (although the ‘minimum’ function is more efficient).
This function uses the bisection method, doing linear interpolation between elements i and i+1, where i=floor(q(n-1)). See the manual for more information. The function returns three values: the interpolated quantile and the two elements that determine the interval it was interpolated in. If the quantile was exact, the second two values are the same element of the data.
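The interpolation step can be sketched as follows. For simplicity this version sorts a copy of the data instead of using bisection, and it ignores ‘start,’ ‘end,’ and ‘key’; the name is hypothetical.

```lisp
(defun sketch-quantile (data q)
  (let* ((sorted (sort (copy-seq (coerce data 'vector)) #'<))
         (n (length sorted)))
    ;; i = floor(q(n-1)); FRACTION is the distance from element i to i+1.
    (multiple-value-bind (i fraction) (floor (* q (1- n)))
      (let* ((lower (aref sorted i))
             (upper (if (zerop fraction) lower (aref sorted (1+ i)))))
        (values (+ lower (* fraction (- upper lower))) lower upper)))))

(sketch-quantile '(5 1 4 2 3) 1/2)  ; => 3, 3, 3
(sketch-quantile '(5 1 4 2 3) 1/10) ; => 7/5, 1, 2
```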
Takes two sequences and returns the correlation coefficient.
Formula: Sum (Cross-product (Difference-list (number-list-1),
Difference-list (number-list-2))) /
Sqrt (Sum-of-Squares (number-list-1) *
Sum-of-Squares (number-list-2)).
Convert radians to degrees. Does not round the result.
RANGE (DATA &KEY START END KEY)
Returns the range of the sequence ‘data.’ Signals ‘no-data’ if there is no
data. The range is given by max - min.
Equivalent to (* factor (round n factor)). For example, ‘round-to-factor’ of 65 and 60 is 60. Useful for converting to certain units, say when converting minutes to the nearest hours. See also ‘truncate-to-factor.’
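The stated equivalence can be tried directly. The wrapper name below is hypothetical.

```lisp
(defun sketch-round-to-factor (n factor)
  ;; Two-argument ROUND returns the nearest integer quotient of N by FACTOR.
  (* factor (round n factor)))

(sketch-round-to-factor 65 60) ; => 60
(sketch-round-to-factor 95 60) ; => 120
```

Note that ‘round’ resolves exact ties toward the even quotient, so 30 rounds to 0 and 90 rounds to 120 with a factor of 60.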
Eliminates floating point underflow for the exponential function. Instead, it just returns 0.0d0
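One way to get this behavior is to handle the standard ‘floating-point-underflow’ condition. This is a sketch under that assumption, with a hypothetical name; the library's approach may differ.

```lisp
(defun sketch-safe-exp (x)
  ;; Implementations that trap underflow signal FLOATING-POINT-UNDERFLOW;
  ;; others already return zero, so the result is 0.0d0 either way.
  (handler-case (exp (float x 1d0))
    (floating-point-underflow () 0.0d0)))

(sketch-safe-exp -100000) ; => 0.0d0
```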
Performs all pairwise comparisons between group means, testing for significance using Scheffe’s F-test. Returns an upper-triangular table in a format described in the manual. Also see the function ‘print-scheffe-table.’
‘Group-means’ and ‘group-sizes’ should be sequences. The arguments ‘ms-error’ and ‘df-error’ are the mean square error within groups and its degrees of freedom, both of which are computed by the analysis of variance. An ANOVA test should always be run first, to see if there are any significant differences.
SIGNIFICANCE NIL NIL
SKEWNESS (DATA &KEY START END KEY)
Returns the skewness of ‘data’, which is the sum of cubed distances from the
mean divided by the standard deviation, divided by N.
Smooths ‘data’ by successive smoothing: 4,median; then 2,median; then 5,median; then 3,median; then hanning. The ends are handled by duplicating the end elements. This function is not destructive; it returns a list the same length as ‘data,’ which should be a list of numbers.
Smooths ‘data’ by replacing each element with the weighted mean of it and its two neighbors. The weights are 1/2 for itself and 1/4 for each neighbor. The ends are handled by duplicating the end elements. This function is not destructive; it returns a list the same length as ‘data,’ which should be a sequence of numbers.
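The weighting and the end handling can be sketched as below. The name is hypothetical and the code is an illustration of the description, not the library's implementation.

```lisp
(defun sketch-smooth-hanning (data)
  (let* ((v (coerce data 'vector))
         (end (1- (length v))))
    (loop for i from 0 to end
          ;; Clamping the index duplicates the end elements.
          for left = (aref v (max 0 (1- i)))
          for right = (aref v (min end (1+ i)))
          collect (+ (/ left 4) (/ (aref v i) 2) (/ right 4)))))

(sketch-smooth-hanning '(0 4 0)) ; => (1 2 1)
```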
With a window of size two, the median and mean smooth functions are the same.
Smooths ‘data’ by replacing each element with the mean of it and its two neighbors. The ends are handled by duplicating the end elements. This function is not destructive; it returns a list the same length as ‘data,’ which should be a sequence of numbers.
Smooths ‘data’ by replacing each element with the mean of it, its left neighbor, and its two right neighbors. The ends are handled by duplicating the end elements. This function is not destructive; it returns a list the same length as ‘data,’ which should be a sequence of numbers.
Smooths ‘data’ by replacing each element with the median of it, its two left neighbors and its two right neighbors. The ends are handled by duplicating the end elements. This function is not destructive; it returns a list the same length as ‘data,’ which should be a sequence of numbers.
Smooths ‘data’ by replacing each element with the median of it and its neighbor on the left. A median of two elements is the same as their mean. The end is handled by duplicating the end element. This function is not destructive; it returns a list the same length as ‘data,’ which should be a sequence of numbers.
Smooths ‘data’ by replacing each element with the median of it and its two neighbors. The ends are handled by duplicating the end elements. This function is not destructive; it returns a list the same length as ‘data,’ which should be a sequence of numbers.
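A window-of-three median smooth can be sketched the same way. The name is hypothetical.

```lisp
(defun sketch-smooth-median-3 (data)
  (let* ((v (coerce data 'vector))
         (end (1- (length v))))
    (loop for i from 0 to end
          ;; The median of three values is the middle one after sorting.
          collect (second (sort (list (aref v (max 0 (1- i)))
                                      (aref v i)
                                      (aref v (min end (1+ i))))
                                #'<)))))

(sketch-smooth-median-3 '(1 9 2 8 3)) ; => (1 2 8 3 3)
```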
Smooths ‘data’ by replacing each element with the median of it, its left neighbor, and its two right neighbors. The ends are handled by duplicating the end elements. This function is not destructive; it returns a list the same length as ‘data,’ which should be a sequence of numbers.
Smooths ‘data’ by replacing each element with the median of it, its two left neighbors and its two right neighbors. The ends are handled by duplicating the end elements. This function is not destructive; it returns a list the same length as ‘data,’ which should be a sequence of numbers.
STANDARD-DEVIATION (DATA &KEY START END KEY)
Returns the standard deviation of ‘data,’ which is just the square root of
the variance.
Signals ‘no-data’ if there is no data. Signals ‘insufficient-data’ if there is only one datum.
STATISTICAL-SUMMARY (DATA &KEY START END KEY)
Compute the length, minimum, maximum, range, median, mode, mean, variance,
standard deviation, and interquartile-range of ‘sequence’ from ‘start’ to ‘end’,
accessed by ‘key’.
Student’s distribution is much like the Gaussian distribution except with
heavier tails, depending on the number of degrees of freedom, ‘dof.’ As ‘dof’
goes to infinity, Student’s distribution approaches the Gaussian. This function
computes the significance of ‘t-statistic.’ Values range from 0.0 to 1.0: small
values suggest that the null hypothesis—that ‘t-statistic’ is drawn from a t
distribution—should be rejected. The ‘t-statistic’ parameter should be a
float, while ‘dof’ should be an integer.
The null hypothesis is roughly that ‘t-statistic’ is zero; you must specify your
alternative hypothesis (H1) via the ‘tails’ parameter, which must be :both,
:positive or :negative. The first corresponds to a two-tailed test: H1 is that
‘t-statistic’ is not zero, but you are not specifying a direction. If the
parameter is :positive, H1 is that ‘t-statistic’ is positive, and similarly for
:negative.
This implementation follows Numerical Recipes in C, section 6.3.
Given an ‘m’x‘n’ matrix ‘A,’ this routine computes its singular value
decomposition, A = U W V^T. The matrix U replaces ‘A’ on output. The diagonal
matrix of singular values W is output as a vector ‘W’ of length ‘n.’ The matrix
‘V’ – not the transpose V^T – is output as an ‘n’x‘n’ matrix ‘V.’ The row
dimension ‘m’ must be greater than or equal to ‘n’; if it is smaller, then ‘A’ should
be filled up to square with zero rows. See the discussion in Numerical Recipes
in C, section 2.6.
This routine returns no values, storing the results in ‘A,’ ‘W,’ and ‘V.’ It does use some auxiliary storage, which can be passed in as ‘rv1,’ a double-float array of length ‘n,’ if you want to avoid consing.
Given an ‘m’x‘n’ matrix ‘A,’ this routine computes its singular value
decomposition, A = U W V^T. The matrix U replaces ‘A’ on output. The diagonal
matrix of singular values W is output as a vector ‘W’ of length ‘n.’ The matrix
‘V’ – not the transpose V^T – is output as an ‘n’x‘n’ matrix ‘V.’ The row
dimension ‘m’ must be greater than or equal to ‘n’; if it is smaller, then ‘A’ should
be filled up to square with zero rows. See the discussion in Numerical Recipes
in C, section 2.6.
This routine returns no values, storing the results in ‘A,’ ‘W,’ and ‘V.’ It does use some auxiliary storage, which can be passed in as ‘rv1,’ a single-float array of length ‘n,’ if you want to avoid consing. All input arrays should be of single-floats.
T-SIGNIFICANCE NIL NIL
T-TEST (SAMPLE-1 SAMPLE-2 &OPTIONAL (TAILS BOTH) (H0MEAN 0))
Returns the t-statistic for the difference in the means of two samples, which
should each be a sequence of numbers. Let D=mean1-mean2. The null hypothesis
is that D=0. The alternative hypothesis is specified by ‘tails’: ‘:both’ means
D/=0, ‘:positive’ means D>0, and ‘:negative’ means D<0. Unless you’re using
:both tails, be careful what order the two samples are in: it matters!
The function also returns the significance, the standard error, and the degrees of freedom. Signals ‘standard-error-is-zero’ if that condition occurs. Signals ‘insufficient-data’ unless there are at least two elements in each sample.
T-TEST-MATCHED (SAMPLE1 SAMPLE2 &OPTIONAL (TAILS BOTH))
Returns the t-statistic for two matched samples, which should be equal-length
sequences of numbers. Let D=mean1-mean2. The null hypothesis is that D=0. The
alternative hypothesis is specified by ‘tails’: ‘:both’ means D/=0, ‘:positive’
means D>0, and ‘:negative’ means D<0. Unless you’re using :both tails, be
careful what order the two samples are in: it matters!
The function also returns the significance, the standard error, and the degrees of freedom. Signals ‘standard-error-is-zero’ if that condition occurs. Signals ‘insufficient-data’ unless there are at least two elements in each sample.
T-TEST-ONE-SAMPLE (DATA TAILS &OPTIONAL (H0-MEAN 0) &KEY START END KEY)
Returns the t-statistic for the mean of the data, which should be a sequence of numbers. Let D be the sample mean. The null hypothesis is that D equals the ‘H0-mean.’ The alternative hypothesis is specified by ‘tails’: ‘:both’ means D /= H0-mean, ‘:positive’ means D > H0-mean, and ‘:negative’ means D < H0-mean.
The function also returns the significance, the standard error, and the degrees of freedom. Signals ‘zero-variance’ if that condition occurs. Signals ‘insufficient-data’ unless there are at least two elements in the sample.
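The statistic itself is the textbook one-sample t. The sketch below returns the t-statistic, the standard error, and the degrees of freedom, and leaves out the significance; the name is hypothetical, and unlike the library function it signals no special conditions.

```lisp
(defun sketch-t-statistic-one-sample (data &optional (h0-mean 0))
  (let* ((n (length data))
         (mean (/ (reduce #'+ data) n))
         ;; Sample variance with n-1 in the denominator.
         (variance (/ (reduce #'+ data
                              :key (lambda (x) (expt (- x mean) 2)))
                      (1- n)))
         (standard-error (sqrt (/ variance n))))
    (values (/ (- mean h0-mean) standard-error)
            standard-error
            (1- n))))

;; Mean 3, variance 5/2, so the standard error is sqrt(1/2).
(sketch-t-statistic-one-sample '(1 2 3 4 5) 2)
```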
Multiply ‘i’ by a power of 2.
TRIMMED-MEAN (DATA PERCENTAGE &KEY START END KEY)
Returns a trimmed mean of ‘data.’ A trimmed mean is an ordinary, arithmetic
mean of the data, except that an outlying percentage has been discarded. For
example, suppose there are ten elements in ‘data,’ and ‘percentage’ is 0.1: the
result would be the mean of the middle eight elements, having discarded the
biggest and smallest elements. If ‘percentage’ doesn’t result in a whole number
of elements being discarded, then a fraction of the remaining biggest and
smallest is discarded. For example, suppose ‘data’ is ’(1 2 3 4 5) and
‘percentage’ is 0.25: the result is (.75(2) + 3 + .75(4))/(.75+1+.75) or 3. By
convention, the 0.5 trimmed mean is the median, which is always returned as a
number.
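The fractional trimming in the example above can be sketched as a weighted mean. This illustration assumes ‘percentage’ is below 0.5 and ignores ‘start,’ ‘end,’ and ‘key’; the name is hypothetical.

```lisp
(defun sketch-trimmed-mean (data percentage)
  (assert (and (<= 0 percentage) (< percentage 1/2)))
  (let* ((sorted (sort (copy-seq (coerce data 'vector)) #'<))
         (n (length sorted)))
    ;; WHOLE elements are dropped from each end; the next element in from
    ;; each end keeps only the weight (1 - FRACTION).
    (multiple-value-bind (whole fraction) (floor (* percentage n))
      (let ((low whole)
            (high (- n 1 whole))
            (edge-weight (- 1 fraction))
            (sum 0)
            (weights 0))
        (loop for i from low to high
              for w = (if (or (= i low) (= i high)) edge-weight 1)
              do (incf sum (* w (aref sorted i)))
                 (incf weights w))
        (/ sum weights)))))

(sketch-trimmed-mean '(1 2 3 4 5) 1/4) ; => 3
```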
Truncate ‘n’ to a power of 2.
Equivalent to (* factor (truncate n factor)). For example, ‘truncate-to-factor’ of 65 and 60 is 60. Useful for converting to certain units, say when converting minutes to hours and minutes. See also ‘round-to-factor.’
TUKEY-SUMMARY (DATA &KEY START END KEY)
Computes a Tukey five-number summary of the data. That is, it returns, in
increasing order, the extremes and the quartiles: the minimum, the 1/4 quartile,
the median, the 3/4 quartile, and the maximum.
VARIANCE (DATA &KEY START END KEY)
Returns the variance of ‘data,’ that is, the ‘sum-of-squares’ divided by
n-1. Signals ‘no-data’ if there is no data. Signals ‘insufficient-data’ if
there is only one datum.
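The definition translates directly into code. This sketch uses a hypothetical name and, unlike the library function, signals no ‘no-data’ or ‘insufficient-data’ conditions.

```lisp
(defun sketch-variance (data)
  (let* ((n (length data))
         (mean (/ (reduce #'+ data) n))
         ;; Sum of squared differences from the mean.
         (sum-of-squares (reduce #'+ data
                                 :key (lambda (x) (expt (- x mean) 2)))))
    (/ sum-of-squares (1- n))))

(sketch-variance '(1 2 3 4 5)) ; => 5/2
```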
Z-TEST-ONE-SAMPLE (DATA TAILS &OPTIONAL (H0-MEAN 0) (H0-STD-DEV 1) &KEY START
END KEY)
NIL
Takes two sequences of numbers and returns a sequence of cross products. Formula: XYi = Xi * Yi.
http://en.wikipedia.org/wiki/Dot_product
Takes two sequences of numbers and returns the dot product.
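The two operations are closely related: the dot product is the sum of the element-wise cross products. A minimal sketch with hypothetical names, returning the cross products as a list:

```lisp
(defun sketch-cross-product (number-list-1 number-list-2)
  ;; Element-wise products: XYi = Xi * Yi.
  (map 'list #'* number-list-1 number-list-2))

(defun sketch-dot-product (number-list-1 number-list-2)
  (reduce #'+ (sketch-cross-product number-list-1 number-list-2)))

(sketch-cross-product '(1 2 3) '(4 5 6)) ; => (4 10 18)
(sketch-dot-product '(1 2 3) #(4 5 6))   ; => 32
```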
A temporary table. This avoids consing.
A temporary vector for use by statistical functions such as ‘quantile,’ which uses it for sorting data. This avoids consing or rearranging the user’s data.
Generate error if the value of ARG-NAME doesn’t satisfy PREDICATE.
PREDICATE is a function name (a symbol) or an expression to compute.
TYPE-STRING is a string to use in the error message, such as "a list".
ERROR-TYPE-NAME is a keyword that tells condition handlers what type was desired.
In clasp, statistical objects have two parts, a class which stores the
various parts of the object and a computing function which computes the value
of the object from arguments. The define-statistic macro allows the
definition of new statistical types. The define-statistic macro must be
provided with all the information necessary to create a statistical object,
that is, everything required to create a new class, everything required to
create a computing function and some information to connect the two. This
last part consists of a list of arguments and their types and a list which
determines how the values of a statistical function should be used to fill the
slots of a statistical object.
When define-statistic is invoked, two things happen. First, a class is defined
which is a subclass of ’statistic and any other named ‘superclasses’. Second,
a pair of functions is defined. ‘clasp-statistics::name’ is an internal
function which has the supplied ‘body’ and ‘lambda-list’ and must return as
many values as there are slots in the class ‘name’. The function ‘name’ is
also defined; it is basically a wrapper function which converts its arguments
to those which are accepted by ‘body’ and then calls ‘clasp-statistics::name’.
The parameter clasp:*create-statistical-objects* determines whether the
wrapper function packages the values returned by the internal function into a
statistical object or just returns them as multiple values.
The ‘argument-types’ argument must be an alist in which the keys are the names of arguments as given in ‘lambda-list’ and the values are lisp types which those arguments will be converted to before calling the internal statistical function. The primary purpose of this is to allow for coercion of clasp variables to sequences, but any coercion which is allowed by lisp is acceptable. The ‘values’ argument is intended to allow the programmer to specify which slots in the statistical object are filled by which of the values returned by the statistical function. By default, the order of the values is assumed to be direct slots in order of specification, followed by inherited slots in order of specification in the superclasses which are also statistics.
Performs a one-way analysis of variance (ANOVA) on the ‘data,’ which should
be a sequence of sequences, where each interior sequence is the data for a
particular group. Furthermore, each sequence should consist entirely of
numbers, and each should have at least 2 elements.
The significance of the result indicates that the group means are not all equal;
that is, at least two of the groups have significantly different means. If
there were only two groups, this would be semantically equivalent to an
unmatched, two-tailed t-test, so you can think of the one-way ANOVA as a
multi-group, two-tailed t-test.
This function returns five values: 1. an ANOVA table; 2. a list of group means; 3. either a Scheffe table or nil depending on ‘scheffe-tests-p’; 4. an alternate value for SST; and 5. a list of confidence intervals in the form ‘(,mean ,lower ,upper) for each group, if ‘confidence-intervals’ is a number between zero and one, giving the kind of confidence interval, such as 0.9. The fourth value is only interesting if you think there are numerical accuracy problems; it should be approximately equal to the SST value in the ANOVA table. This function differs from ‘anova-one-way-variables’ only in its input representation. See the manual for more information.
See ANOVA-ONE-WAY-VARIABLES
Calculates the analysis of variance when there are two factors that may
affect the dependent variable. Because the input is represented as an array, we
can refer to these two factors as the row-effect and the column effect. Unlike
the one-way ANOVA, there are mathematical difficulties with the two-way ANOVA if
there are unequal cell sizes; therefore, we require all cells to be the same
size, and so the input is a three-dimensional array.
The result of the analysis is an anova-table, as described in the manual. This function differs from ‘anova-two-way-variables’ only in its input representation. See the manual for further discussion of analysis of variance.
See ANOVA-TWO-WAY-VARIABLES
See ANOVA-TWO-WAY-VARIABLES-UNEQUAL-CELL-SIZES
See AUTOCORRELATION
Performs a chi-square test for independence of the two variables, ‘v1’ and ‘v2.’ These should be categorical variables with only two values; the function will construct a 2x2 contingency table by counting the number of occurrences of each combination of the variables. See the manual for more details.
Runs a chi-square test for association on a simple 2 x 2 table. If ‘yates’
is nil, the correction for continuity is not done; default is t.
Returns the chi-square statistic and the significance of the value.
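The effect of the continuity correction can be seen in a small Python sketch of the textbook 2 x 2 formula. This is an illustration under stated assumptions, not the library's implementation: the name `chi_square_2x2_sketch`, the cell arguments `a`, `b`, `c`, `d`, and the flooring of the corrected difference at zero are all assumptions. The significance uses the fact that a chi-square variable with one degree of freedom has upper-tail probability erfc(sqrt(x/2)).

```python
import math

def chi_square_2x2_sketch(a, b, c, d, yates=True):
    """Chi-square statistic and significance for the table [[a, b], [c, d]].

    Hypothetical helper for illustration; raises ZeroDivisionError if any
    row or column total is zero.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Yates's correction shrinks |ad - bc| by n/2 (floored at zero here).
        diff = max(0.0, diff - n / 2.0)
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    statistic = n * diff ** 2 / denominator
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    significance = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, significance
```

With the cells 10, 20, 30, 40, the uncorrected statistic is about 0.794 and the corrected one about 0.446, so the correction makes the test more conservative.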
Performs a chi-square test for independence of the two variables, ‘v1’ and ‘v2.’ These should be categorical variables; the function will construct a contingency table by counting the number of occurrences of each combination of the variables. See the manual for more details.
Calculates the chi-square statistic and corresponding p-value for the given contingency table. The result says whether the row factor is independent of the column factor. Does not apply Yates’s correction.
See CONFIDENCE-INTERVAL-PROPORTION
See CONFIDENCE-INTERVAL-T
See CONFIDENCE-INTERVAL-Z
This function is just like ‘confidence-interval-z,’ except that instead of its arguments being the actual data, it takes the following summary statistics: ‘mean’, a point estimator of the mean of some normally distributed population; and the ‘standard-error’ of the estimator, that is, the estimated standard deviation of the normal population. ‘Confidence’ should be a number between 0 and 1, exclusive.
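A z-based interval from summary statistics amounts to mean ± z × standard error, where z is the normal quantile for the requested confidence. The Python sketch below shows that calculation; the name `confidence_interval_z_sketch` and the (mean, lower, upper) return shape are assumptions, not the library's documented interface.

```python
from statistics import NormalDist

def confidence_interval_z_sketch(mean, standard_error, confidence):
    """Return (mean, lower, upper) for a two-sided z interval.

    Hypothetical helper for illustration; not part of cl-mathstats.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Two-sided interval: put (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return mean, mean - z * standard_error, mean + z * standard_error
```

For a mean of 10, a standard error of 2, and a confidence of 0.95, z is about 1.96 and the interval is roughly 6.08 to 13.92.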
See CORRELATION
See COVARIANCE
See CROSS-CORRELATION
See D-TEST
See DATA-LENGTH
Takes a sequence of numbers and returns a sequence of differences
from the mean.
Formula: xi = Xi - Mean (X).
Returns the critical value of some statistic. The function ‘p-function’ should be a unary function mapping statistics—x values—to their significance—p values. The function will find the value of x such that the p-value is ‘p-value.’ The function works by binary search. A secant method might be better, but this seems to be acceptably fast. Only positive values of x are considered, and ‘p-function’ should be monotonically decreasing from its value at x=0. The binary search ends when either the function value is within ‘y-tolerance’ of ‘p-value’ or the size of the search region shrinks to less than ‘x-tolerance.’
Calculates the G-test for a contingency table. The formula for the
G-test statistic is
2 * sum[f_ij log [f_ij/f-hat_ij]]
where f_ij is the ith by jth cell in the table and f-hat_ij is the
expected value of that cell. If an expected-value-matrix is supplied,
it must be the same size as table and it is used for expected values,
in which case the G-test is a test of goodness-of-fit. If the
expected value matrix is not supplied, it is calculated using the
formula
e_ij = [f_i* * f_*j] / f_**
where f_i*, f_*j and f_** are the row, column and grand totals
respectively. In this case, the G-test is a test of independence. The degrees of freedom is the same as for the chi-square statistic and the significance is obtained by comparing
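The two formulas above translate directly into a short Python sketch. It is an illustration only: the name `g_test_sketch` is hypothetical, tables are plain lists of rows, cells with a zero count are taken to contribute nothing to the sum (an assumption), and the significance is not computed.

```python
import math

def g_test_sketch(table, expected=None):
    """Return the G statistic 2 * sum(f_ij * log(f_ij / f-hat_ij)).

    Hypothetical helper for illustration; not part of cl-mathstats.
    """
    rows, cols = len(table), len(table[0])
    if expected is None:
        # Test of independence: e_ij = (row total * column total) / grand total.
        row_totals = [sum(row) for row in table]
        col_totals = [sum(table[i][j] for i in range(rows)) for j in range(cols)]
        grand_total = sum(row_totals)
        expected = [[row_totals[i] * col_totals[j] / grand_total
                     for j in range(cols)] for i in range(rows)]
    g = 0.0
    for i in range(rows):
        for j in range(cols):
            f = table[i][j]
            if f > 0:  # assumed convention: an empty cell adds zero
                g += f * math.log(f / expected[i][j])
    return 2.0 * g
```

For the table with rows (10, 20) and (30, 40), the expected counts are (12, 18) and (28, 42), and G comes out at about 0.804. A table whose rows are exactly proportional gives G = 0.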
Returns the inner product of the two samples, which should be sequences of numbers. The inner product, also called the dot product or scalar product, is the sum of the pairwise products of the numbers. Stops when either sample runs out; it doesn’t check that they have the same length.
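The stop-at-the-shorter-sample behaviour is easy to overlook, so here is a minimal Python illustration of it. The name `inner_product_sketch` is hypothetical; Python's `zip` happens to truncate in the same way the description says this function does.

```python
def inner_product_sketch(sample1, sample2):
    """Sum of pairwise products; silently stops at the shorter sample."""
    return sum(x * y for x, y in zip(sample1, sample2))
```

With `[1, 2, 3]` and `[4, 5, 6]` the result is 32; with `[1, 2, 3]` and `[4, 5]` it is 14, because the unmatched 3 is ignored rather than reported as an error.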
See INTERQUARTILE-RANGE
If matrix is singular returns nil, else returns its inverse. If into-matrix is supplied, inverse is returned in it, otherwise a new array is created.
If matrix is singular returns nil, else returns the inverse of matrix. Uses iterative improvement until no further improvement is possible.
Collects the ‘dv’ values for each unique combination of an element of ‘v1’ and an element of ‘v2.’ Returns a three-dimensional table of dv values.
Counts each unique combination of an element of ‘v1’ and an element of ‘v2.’ Returns a two-dimensional table of integers.
Returns the norm of matrix.
The norm is the maximum, over the rows, of the sum of the absolute values of the entries in that row.
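This is the quantity usually called the infinity norm (maximum absolute row sum). A one-line Python illustration, with the hypothetical name `matrix_norm_sketch` and the matrix given as a list of rows:

```python
def matrix_norm_sketch(matrix):
    """Maximum over the rows of the sum of absolute values in that row."""
    return max(sum(abs(x) for x in row) for row in matrix)
```

For the rows (1, -2) and (3, 4) the row sums are 3 and 7, so the norm is 7.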
Adds two matrices together
Add a scalar value to a matrix
Multiplies two matrices together
Multiply a matrix by a scalar value
Multiply a matrix by a scalar value
See MAXIMUM
See MEAN
See MEDIAN
See MINIMUM
Returns the most frequent element of ‘data,’ which should be a sequence. The
algorithm involves sorting, and so the data must be numbers or the ‘key’
function must produce numbers. Consider ‘sxhash’ if no better function is
available. Also returns the number of occurrences of the mode. If there is
more than one mode, this returns the first mode, as determined by the sorting of
the numbers.
Keep in mind that if the data has multiple runs of like values that are bigger than the window size (currently defaults to 10% of the size of the data), this function will blindly pick the first one. If this is the case, you should probably be calling ‘mode’ instead of this function.
See MODE
See MULTIPLE-MODES
Multiply matrices MATRIX-1 and MATRIX-2, storing into MATRIX-3 if supplied.
If MATRIX-3 is not supplied, then a new (ART-Q type) array is returned, else
MATRIX-3 must have exactly the right dimensions for holding the result of the multiplication.
Both MATRIX-1 and MATRIX-2 must be either one- or two-dimensional.
The first dimension of MATRIX-2 must equal the second dimension of MATRIX-1, unless MATRIX-1
is one-dimensional, when the first dimensions must match (thus allowing multiplications of the
form VECTOR x MATRIX)
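The dimension rule for the VECTOR x MATRIX case can be illustrated with a small Python sketch. It is not the library's code: the name `multiply_matrices_sketch` is hypothetical, matrices are lists of rows, and only a one-dimensional first argument is handled (a one-dimensional second argument is left out).

```python
def multiply_matrices_sketch(matrix_1, matrix_2):
    """Multiply a matrix or a vector (treated as one row) by a 2-D matrix."""
    is_vector = not isinstance(matrix_1[0], (list, tuple))
    rows = [list(matrix_1)] if is_vector else matrix_1
    inner = len(matrix_2)
    if len(rows[0]) != inner:
        raise ValueError("inner dimensions do not match")
    product = [[sum(row[k] * matrix_2[k][j] for k in range(inner))
                for j in range(len(matrix_2[0]))]
               for row in rows]
    # A vector input yields a vector result.
    return product[0] if is_vector else product
```

Multiplying the vector `[1, 2]` by the rows (5, 6) and (7, 8) gives `[19, 22]`, the same as the first row of the full product of (1, 2), (3, 4) with that matrix.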
Prints ‘anova-table’ on ‘stream.’
Prints ‘scheffe-table’ on ‘stream.’ If the original one-way anova data had N groups, the Scheffe table prints as an n-1 x n-1 upper-triangular table. If ‘group-means’ is given, it should be a list of the group means, which will be printed along with the table.
Computes square root of a*a + b*b without destructive overflow or underflow.
Computes square root of a*a + b*b without destructive overflow or underflow.
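The usual way to avoid overflow here is to factor out the larger magnitude before squaring. The Python sketch below shows that standard technique; the name `pythag_sketch` is hypothetical and the library's exact arithmetic may differ.

```python
import math

def pythag_sketch(a, b):
    """sqrt(a*a + b*b) computed as larger * sqrt(1 + (smaller/larger)**2)."""
    a, b = abs(a), abs(b)
    if a < b:
        a, b = b, a          # make `a` the larger magnitude
    if a == 0.0:
        return 0.0           # both inputs are zero
    ratio = b / a            # at most 1, so squaring it cannot overflow
    return a * math.sqrt(1.0 + ratio * ratio)
```

With both arguments equal to 1e200, squaring directly would overflow a double, whereas the scaled form returns about 1.414e200.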
See QUANTILE
See RANGE
Uses the Gauss-Jordan reduction method to reduce a matrix.
Removes the ’&rest arg’ part from a lambda-list (strictly for documentation purposes).
Multiplies a matrix by a scalar value in the form M[i,j] = s*M[i,j].
Returns as three values the U, W, and V of the singular value decomposition. If you have already consed up these matrices, you should call ‘svdcmp-sf’ or ‘svdcmp-df’ directly. The input matrix is preserved.
See SKEWNESS
See STANDARD-DEVIATION
See STATISTICAL-SUMMARY
Takes a sequence of numbers and returns their sum. Formula: Sum(X).
Returns the sum of squared distances from the mean of ‘data’.
Signals ‘no-data’ if there is no data.
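In other words, the result is the sum over the data of (x - mean)^2. A minimal Python illustration, where the name `sum_of_squares_sketch` is hypothetical and `ValueError` stands in for the library's ‘no-data’ condition:

```python
def sum_of_squares_sketch(data):
    """Sum of squared deviations of the data from their mean."""
    if not data:
        raise ValueError("no data")  # stand-in for the 'no-data' condition
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data)
```

For `[1, 2, 3, 4]` the mean is 2.5 and the result is 5.0.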
Solves A X = B for a vector ‘X,’ where A is specified by the mxn array U, ‘n’
vector W, and nxn matrix V as returned by svdcmp. ‘m’ and ‘n’ are the
dimensions of ‘A,’ and will be equal for square matrices. ‘B’ is the 1xm input
vector for the right-hand side. ‘X’ is the 1xn output solution vector. All
arrays are of double-floats. No input quantities are destroyed, so the routine
may be called sequentially with different B’s. See the discussion in Numerical
Recipes in C, section 2.6.
This routine assumes that near zero singular values have already been zeroed. It returns no values, storing the result in ‘X.’ It does use some auxiliary storage, which can be passed in as ‘tmp,’ a double-float array of length ‘n,’ if you want to avoid consing.
Solves A X = B for a vector ‘X,’ where A is specified by the mxn array U, ‘n’
vector W, and nxn matrix V as returned by svdcmp. ‘m’ and ‘n’ are the
dimensions of ‘A,’ and will be equal for square matrices. ‘B’ is the 1xm input
vector for the right-hand side. ‘X’ is the 1xn output solution vector. All
arrays are of single-floats. No input quantities are destroyed, so the routine
may be called sequentially with different B’s. See the discussion in Numerical
Recipes in C, section 2.6.
This routine assumes that near zero singular values have already been zeroed. It returns no values, storing the result in ‘X.’ It does use some auxiliary storage, which can be passed in as ‘tmp,’ a single-float array of length ‘n,’ if you want to avoid consing.
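The back-substitution step computes X = V · diag(1/w) · Uᵀ · B, skipping any singular value that has been zeroed. The Python sketch below illustrates that calculation with plain lists. It is not the library's routine: the name `sv_back_substitute_sketch` is hypothetical, and it returns a new list rather than storing into a supplied ‘X.’

```python
def sv_back_substitute_sketch(u, w, v, b):
    """Solve A x = b given the decomposition A = U diag(w) V^T.

    u is m x n, w has length n, v is n x n, b has length m.
    """
    m, n = len(u), len(w)
    tmp = [0.0] * n
    for j in range(n):
        if w[j] != 0.0:
            # (U^T b)_j divided by w_j; zeroed singular values contribute 0.
            tmp[j] = sum(u[i][j] * b[i] for i in range(m)) / w[j]
    # Multiply by V to obtain the solution vector.
    return [sum(v[j][k] * tmp[k] for k in range(n)) for j in range(n)]
```

For instance, with U the identity, w = (2, 4), and V the matrix that swaps the two coordinates, A has rows (0, 2) and (4, 0); for B = (2, 8) the sketch returns (2, 1), which satisfies A X = B.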
Returns the solution vector to Ax=b, where A has been decomposed into ‘u,’ ‘w’ and ‘v’ by ‘singular-value-decomposition.’ This function is just a minor wrapping of ‘svbksb-sf’ and ‘svbksb-df.’
Computes the inverse of a matrix that has been decomposed into ‘u,’ ‘w’ and ‘v’ by singular value decomposition. It assumes the “small” elements of ‘w’ have already been zeroed. It computes the inverse by taking advantage of the known zeros in the full 2-dimensional ‘w’ matrix. It uses the backsubstitution algorithm, only with the B vectors fixed at the columns of the identity matrix, which lets us take advantage of its zeros. It’s about twice as fast as the slow version and conses a lot less. Note that if you are computing the inverse merely to solve one or more systems of equations, you are better off using the decomposition and backsubstitution routines directly.
Computes the inverse of a matrix that has been decomposed into ‘u,’ ‘w’ and ‘v’ by singular value decomposition. It assumes the “small” elements of ‘w’ have already been zeroed. It computes the inverse by taking advantage of the known zeros in the full 2-dimensional ‘w’ matrix. It uses the backsubstitution algorithm, only with the B vectors fixed at the columns of the identity matrix, which lets us take advantage of its zeros. It’s about twice as fast as the slow version and conses a lot less. Note that if you are computing the inverse merely to solve one or more systems of equations, you are better off using the decomposition and backsubstitution routines directly.
Computes the inverse of a matrix that has been decomposed into ‘u,’ ‘w’ and ‘v’ by singular value decomposition. It assumes the “small” elements of ‘w’ have already been zeroed. It computes the inverse by constructing a diagonal matrix ‘w2’ from ‘w’ (which is just a vector of the diagonal elements), and then explicitly multiplying u^t w2 and v. Note that if you are computing the inverse merely to solve one or more systems of equations, you are better off using the decomposition and backsubstitution routines directly.
Computes the inverse of a matrix that has been decomposed into ‘u,’ ‘w’ and ‘v’ by singular value decomposition. It assumes the “small” elements of ‘w’ have already been zeroed. It computes the inverse by constructing a diagonal matrix ‘w2’ from ‘w’ (which is just a vector of the diagonal elements), and then explicitly multiplying u^t w2 and v. Note that if you are computing the inverse merely to solve one or more systems of equations, you are better off using the decomposition and backsubstitution routines directly.
Use singular value decomposition to compute the inverse of ‘A.’ If an exact inverse is not possible, then zero the otherwise infinite inverted singular value and compute the inverse. The inverse is returned; ‘A’ is not destroyed. If you’re using this to solve several systems of equations, you’re better off computing the singular value decomposition and using it several times, because this function computes it anew each time.
Returns solution of linear system matrix * solution = b-vector. Employs the singular value decomposition method. See the discussion in Numerical Recipes in C, section 2.6, especially as to the semantics of ‘threshold.’
If the relative magnitude of an element in ‘w’ compared to the largest element is less than ‘threshold,’ then zero that element. Returns a list of indices of the zeroed elements. This function is just a convenient wrapper for ‘svzero-sf’ and ‘svzero-df.’
Given ‘v’ and ‘w’ as computed by singular value decomposition, computes the covariance matrix among the predictors. Based on Numerical Recipes in C, section 15.4, algorithm ‘svdvar.’ The covariance matrix is returned. It can be supplied as the third argument.
If the relative magnitude of an element in ‘w’ compared to the largest element is less than ‘threshold,’ then zero that element. If ‘report?’ is true, the indices of zeroed elements are printed. Returns a list of the indices of zeroed elements. This routine uses double-floats.
If the relative magnitude of an element in ‘w’ compared to the largest element is less than ‘threshold,’ then zero that element. If ‘report?’ is true, the indices of zeroed elements are printed. Returns a list of indices of the zeroed elements. This routine uses single-floats.
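The thresholding rule can be illustrated with a short Python sketch. It is not the library's routine: the name `sv_zero_sketch` and the default threshold are assumptions, and the ‘report?’ printing is omitted.

```python
def sv_zero_sketch(w, threshold=1e-6):
    """Zero, in place, elements of w that are small relative to the largest.

    Returns the list of indices that were zeroed.
    """
    largest = max(abs(x) for x in w)
    zeroed = []
    for i, x in enumerate(w):
        if abs(x) < threshold * largest:
            w[i] = 0.0
            zeroed.append(i)
    return zeroed
```

With w = (10.0, 1e-9, 5.0) and a threshold of 1e-6, only the middle element falls below 1e-5, so it is zeroed and the returned index list is `[1]`.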
See T-TEST
See T-TEST-MATCHED
See T-TEST-ONE-SAMPLE
See TRIMMED-MEAN
See TUKEY-SUMMARY
See VARIANCE
See Z-TEST-ONE-SAMPLE
composite-statistic
simple-statistic
error
condition
autocorrelation
confidence-interval-proportion
confidence-interval-t
confidence-interval-z
correlation
covariance
cross-correlation
data-length
interquartile-range
maximum
mean
median
minimum
mode
multiple-modes
quantile
range
skewness
standard-deviation
statistical-summary
t-test
t-test-matched
t-test-one-sample
trimmed-mean
variance
z-test-one-sample