# The lhstats Reference Manual

This is the lhstats Reference Manual, version 1.1.1, generated automatically by Declt version 4.0 beta 2 "William Riker" on Wed May 15 05:44:40 2024 GMT+0.

## 2 Systems

The main system appears first, followed by any subsystem dependency.

### 2.1 `lhstats`

Statistical functions by Larry Hunter and Jeff Shrager.

Maintainer

Matt Curtis

Author

Larry Hunter, Jeff Shrager

License

GNU General Public License version 2 (GPLv2)

Version

1.1.1

Source
Child Components

## 3 Files

Files are sorted by type and then listed depth-first from the systems components trees.

### 3.1 Lisp

#### 3.1.1 `lhstats/lhstats.asd`

Source
Parent Component

`lhstats` (system).

ASDF Systems

#### 3.1.2 `lhstats/package.lisp`

Source
Parent Component

`lhstats` (system).

Packages

#### 3.1.3 `lhstats/lhstats.lisp`

Dependency

`package.lisp` (file).

Source
Parent Component

`lhstats` (system).

Public Interface
Internals

## 4 Packages

Packages are listed by definition order.

### 4.1 `statistics`

Statistical functions

Source
Nickname

`stats`

Use List

`common-lisp`.

Public Interface
Internals

## 5 Definitions

Definitions are sorted by export status, category, package, and then by lexicographic order.

### 5.1 Public Interface

#### 5.1.1 Macros

Macro: square (x)
Package
Source
Macro: test-variables (&rest args)
Package
Source

#### 5.1.2 Ordinary functions

Function: bin-and-count (sequence n)

Make N equal width bins and count the number of elements of sequence that belong in each.

Package
Source
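
The equal-width binning described for `bin-and-count` can be sketched in plain Common Lisp as follows. This is a minimal illustration, not the lhstats source: the name `sketch-bin-and-count` is hypothetical, and clamping the maximum element into the last bin is an assumption about edge handling.

```lisp
;; Hypothetical helper, not the lhstats implementation.
(defun sketch-bin-and-count (sequence n)
  "Count the elements of SEQUENCE that fall in each of N equal-width bins."
  (let* ((lo (reduce #'min sequence))
         (hi (reduce #'max sequence))
         (width (/ (- hi lo) n))
         (counts (make-array n :initial-element 0)))
    (map nil
         (lambda (x)
           ;; Assumption: the maximum value is clamped into the last bin,
           ;; and a zero-width range puts everything in bin 0.
           (let ((i (if (zerop width)
                        0
                        (min (1- n) (floor (- x lo) width)))))
             (incf (aref counts i))))
         sequence)
    counts))

(sketch-bin-and-count '(1 2 3 4 5 6 7 8 9 10) 3) ; => #(3 3 4)
```
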
Function: binomial-cumulative-probability (n k p)

P(X<k) for X a binomial random variable with parameters n and p. Binomial expectations for fewer than k events in N trials, each having probability p.

Package
Source
Function: binomial-ge-probability (n k p)

The probability of k or more occurrences in N events, each with probability p.

Package
Source
Function: binomial-probability (n k p)

P(X=k) for X a binomial random variable with parameters n and p. Binomial expectations for seeing k events in N trials, each having probability p. Use the Poisson approximation if N>100 and P<0.01.

Package
Source
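
The three binomial quantities documented above (`binomial-cumulative-probability`, `binomial-ge-probability` and `binomial-probability`) are related by simple sums. The sketch below shows those relationships using exact arithmetic only; all `sketch-` names are hypothetical, and it deliberately omits the Poisson approximation that the docstring mentions for N>100 and P<0.01.

```lisp
;; Hypothetical names; a sketch of the textbook formulas, not the lhstats source.
(defun sketch-choose (n k)
  "Number of ways to choose K items from N, ignoring order."
  (let ((result 1))
    (loop for i from 1 to k
          do (setf result (/ (* result (+ (- n k) i)) i)))
    result))

(defun sketch-binomial-probability (n k p)
  "P(X = k) for X ~ Binomial(n, p)."
  (* (sketch-choose n k) (expt p k) (expt (- 1 p) (- n k))))

(defun sketch-binomial-cumulative-probability (n k p)
  "P(X < k): the sum of P(X = i) for i from 0 to k-1."
  (loop for i from 0 below k
        sum (sketch-binomial-probability n i p)))

(defun sketch-binomial-ge-probability (n k p)
  "P(X >= k), the complement of P(X < k)."
  (- 1 (sketch-binomial-cumulative-probability n k p)))

(sketch-binomial-probability 4 2 1/2)            ; => 3/8
(sketch-binomial-cumulative-probability 4 2 1/2) ; => 5/16
(sketch-binomial-ge-probability 4 2 1/2)         ; => 11/16
```

Passing a rational `p` keeps every result exact, which makes the complement relationship easy to check by hand.
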
Function: binomial-probability-ci (n p alpha &key exact?)

Confidence intervals on a binomial probability. If a binomial probability of p has been observed in N trials, what is the 1-alpha confidence interval around p? Uses the normal theory approximation when npq >= 10 unless told otherwise.

Package
Source
Function: binomial-test-one-sample (p-hat n p &key tails exact?)

The significance of a one sample test for the equality of an observed probability p-hat to an expected probability p under a binomial distribution with N observations. Use the normal theory approximation if n*p*(1-p) > 10 (unless the exact flag is true).

Package
Source
Function: binomial-test-one-sample-sse (p-estimated p-null &key alpha 1-beta tails)

Returns the number of subjects needed to test whether an observed probability is significantly different from a particular binomial null hypothesis with a significance alpha and a power 1-beta.

Package
Source
Function: binomial-test-paired-sse (pd pa &key alpha 1-beta tails)

Sample size estimate for the McNemar (discordant pairs) test. Pd is the projected proportion of discordant pairs among all pairs, and Pa is the projected proportion of type A pairs among discordant pairs. alpha, 1-beta and tails are as in binomial-test-two-sample-sse.

Returns the number of individuals necessary; that is twice the number of matched pairs necessary.

Package
Source
Function: binomial-test-two-sample (p-hat1 n1 p-hat2 n2 &key tails exact?)

Are the observed probabilities of an event (p-hat1 and p-hat2) in N1/N2 trials different? The normal theory method is implemented here. The exact test is Fisher’s contingency table method, below.

Package
Source
Function: binomial-test-two-sample-sse (p1 p2 &key alpha sample-ratio 1-beta tails)

The number of subjects needed to test if two binomial probabilities are different at a given significance alpha and power 1-beta. The sample sizes can be unequal; the p2 sample is sample-ratio * the size of the p1 sample. It can be a one tailed or two tailed test.

Package
Source
Function: chi-square (dof percentile)

Returns the point which is the indicated percentile in the Chi Square distribution with dof degrees of freedom.

Package
Source
Function: chi-square-cdf (x dof)

Computes the left hand tail area under the chi square distribution under dof degrees of freedom up to X. Adopted from CLASP 1.4.3, http://eksl-www.cs.umass.edu/clasp.html

Package
Source
Function: chi-square-test-for-trend (row1-counts row2-counts &optional scores)

This test works on a 2xk table and assesses if there is an increasing or decreasing trend. Arguments are equal-sized lists of counts. Optionally, provide a list of scores, which represent some numeric attribute of the group. If not provided, scores are assumed to be 1 to k.

Package
Source
Function: chi-square-test-one-sample (variance n sigma-squared &key tails)

The significance of a one sample Chi square test for the variance of a normal distribution. Variance is the observed variance, N is the number of observations, and sigma-squared is the test variance.

Package
Source
Function: chi-square-test-rxc (contingency-table)

Takes contingency-table, an RxC array, and returns the significance of the relationship between the row variable and the column variable. Any difference in proportion will cause this test to be significant – consider using the test for trend instead if you are looking for a consistent change.

Package
Source
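
For orientation, the statistic behind an RxC test of independence is the sum of (observed - expected)^2 / expected over all cells, with (R-1)(C-1) degrees of freedom. The sketch below computes only that statistic and the degrees of freedom from a two-dimensional array; it does not compute the significance that `chi-square-test-rxc` returns, and the name `sketch-chi-square-statistic` is hypothetical.

```lisp
;; Hypothetical helper; textbook Pearson chi-square, not the lhstats source.
(defun sketch-chi-square-statistic (table)
  "Return the chi-square statistic and degrees of freedom for a 2D array TABLE."
  (let* ((rows (array-dimension table 0))
         (cols (array-dimension table 1))
         (row-sums (make-array rows :initial-element 0))
         (col-sums (make-array cols :initial-element 0))
         (total 0)
         (statistic 0))
    ;; First pass: marginal totals.
    (dotimes (i rows)
      (dotimes (j cols)
        (let ((observed (aref table i j)))
          (incf (aref row-sums i) observed)
          (incf (aref col-sums j) observed)
          (incf total observed))))
    ;; Second pass: expected count per cell is row-sum * col-sum / total.
    (dotimes (i rows)
      (dotimes (j cols)
        (let ((expected (/ (* (aref row-sums i) (aref col-sums j)) total)))
          (incf statistic (/ (expt (- (aref table i j) expected) 2) expected)))))
    (values statistic (* (1- rows) (1- cols)))))

(sketch-chi-square-statistic #2A((10 20) (20 10))) ; => 20/3, 1
```
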
Function: choose (n k)

How many ways to take n things k at a time, when order doesn’t matter.

Package
Source
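
The two counting functions in this package (`choose` here and `permutations` further down) differ only in whether order matters. A minimal sketch of the relationship, using hypothetical names and exact integer arithmetic:

```lisp
;; Hypothetical names; not the lhstats source.
(defun sketch-permutations (n k)
  "Ordered selections of K items from N: n * (n-1) * ... * (n-k+1)."
  (let ((result 1))
    (loop for i from 0 below k
          do (setf result (* result (- n i))))
    result))

(defun sketch-combinations (n k)
  "Unordered selections: the ordered count divided by k!."
  (/ (sketch-permutations n k) (sketch-permutations k k)))

(sketch-permutations 5 2) ; => 20
(sketch-combinations 5 2) ; => 10
```
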
Function: coefficient-of-variation (sequence)
Package
Source
Function: convert-to-standard-normal (x mu sigma)

Convert X from a Normal distribution with mean mu and variance sigma to standard normal

Package
Source
Function: correlation-coefficient (points)

Just r from linear-regression. Also called the Pearson correlation.

Package
Source
Function: correlation-sse (rho &key alpha 1-beta)

Returns the size of a sample necessary to find a correlation of expected value rho with significance alpha and power 1-beta.

Package
Source
Function: correlation-test-two-sample (r1 n1 r2 n2 &key tails)

Test if two correlation coefficients are different. Uses Fisher’s Z test.

Package
Source
Function: correlation-test-two-sample-on-sequences (points1 points2 &key tails)
Package
Source
Function: f-significance (f-statistic numerator-dof denominator-dof &optional one-tailed-p)

Adopted from CLASP, but changed to handle F < 1 correctly in the one-tailed case. The ‘f-statistic’ must be a positive number. The degrees of freedom arguments must be positive integers. The ‘one-tailed-p’ argument is treated as a boolean.

This implementation follows Numerical Recipes in C, section 6.3 and the ‘ftest’ function in section 13.4.

Package
Source
Function: f-test (variance1 n1 variance2 n2 &key tails)

F test for the equality of two variances

Package
Source
Function: false-discovery-correction (p-values &key rate)

A multiple testing correction that is less conservative than Bonferroni. Takes a list of p-values and a false discovery rate, and returns the number of p-values that are likely to be good enough to reject the null at that rate. Returns a second value which is the p-value cutoff. See

Benjamini Y and Hochberg Y. "Controlling the false discovery rate: a practical and powerful approach to multiple testing." J R Stat Soc Ser B 57: 289-300, 1995.

Package
Source
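
The Benjamini-Hochberg step-up procedure cited for `false-discovery-correction` can be sketched as below. This is an illustration under stated assumptions, not the lhstats code: the function name is hypothetical, the default rate of 0.05 is assumed, and returning the largest qualifying p-value as the cutoff is one plausible reading of "the p-value cutoff".

```lisp
;; Hypothetical helper; a sketch of Benjamini-Hochberg, not the lhstats source.
(defun sketch-false-discovery-correction (p-values &key (rate 0.05))
  "Return the number of rejectable p-values and the p-value cutoff."
  (let* ((sorted (sort (copy-list p-values) #'<))
         (m (length sorted))
         (count 0)
         (cutoff 0))
    ;; Find the largest rank i whose p-value is at most rate * i / m.
    (loop for p in sorted
          for i from 1
          when (<= p (* rate (/ i m)))
            do (setf count i
                     cutoff p))
    (values count cutoff)))
```

With the ten p-values `(0.001 0.008 0.039 0.041 0.042 0.06 0.074 0.205 0.212 0.216)` and a rate of 0.05, only the two smallest fall under their thresholds (0.005 and 0.01), so the sketch reports 2 rejections.
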
Function: fisher-exact-test (contingency-table &key tails)

Fisher’s exact test. Gives a p value for a particular 2x2 contingency table

Package
Source
Function: fisher-z-transform (r)

Transforms the correlation coefficient to an approximately normal distribution.

Package
Source
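
Fisher's z transform is conventionally defined as z = 1/2 ln((1 + r) / (1 - r)), which equals the inverse hyperbolic tangent of r. A minimal sketch, with a hypothetical name and assuming -1 < r < 1:

```lisp
;; Hypothetical helper; the textbook formula, not the lhstats source.
(defun sketch-fisher-z-transform (r)
  "z = 1/2 * ln((1 + r) / (1 - r)), defined for -1 < r < 1."
  (* 1/2 (log (/ (+ 1 r) (- 1 r)))))
```

The result can be cross-checked against the standard function `atanh`, which computes the same quantity.
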
Function: geometric-mean (sequence &optional base)
Package
Source
Function: linear-regression (points)

Computes the regression equation for a least squares fit of a line to a sequence of points (each a list of two numbers, e.g. '((1.0 0.1) (2.0 0.2))) and reports the intercept, slope, correlation coefficient r, R^2, and the significance of the difference of the slope from 0.

Package
Source
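
The least squares fit described for `linear-regression` reduces to three sums of squares. The sketch below computes the intercept, slope, r and R^2 in the order the docstring lists them; it omits the significance value, the name `sketch-linear-regression` is hypothetical, and the return order is an assumption based on the description.

```lisp
;; Hypothetical helper; ordinary least squares, not the lhstats source.
(defun sketch-linear-regression (points)
  "POINTS is a list of (x y) lists. Returns intercept, slope, r and r^2."
  (let* ((n (length points))
         (xs (mapcar #'first points))
         (ys (mapcar #'second points))
         (x-bar (/ (reduce #'+ xs) n))
         (y-bar (/ (reduce #'+ ys) n))
         ;; Sums of squared deviations and cross products.
         (sxx (reduce #'+ (mapcar (lambda (x) (expt (- x x-bar) 2)) xs)))
         (syy (reduce #'+ (mapcar (lambda (y) (expt (- y y-bar) 2)) ys)))
         (sxy (reduce #'+ (mapcar (lambda (x y) (* (- x x-bar) (- y y-bar)))
                                  xs ys)))
         (slope (/ sxy sxx))
         (intercept (- y-bar (* slope x-bar)))
         (r (/ sxy (sqrt (* sxx syy)))))
    (values intercept slope r (* r r))))
```

For the perfectly linear points `'((1 3) (2 5) (3 7))` the sketch yields an intercept of 1, a slope of 2 and a correlation of 1.
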
Function: mcnemars-test (a-discordant-count b-discordant-count &key exact?)

McNemar’s test for correlated proportions, used for longitudinal studies. Look only at the number of discordant pairs (one treatment is effective and the other is not). If the two treatments are A and B, a-discordant-count is the number where A worked and B did not, and b-discordant-count is the number where B worked and A did not.

Package
Source
Function: mean (sequence)
Package
Source
Function: mean-sd-n (sequence)

A combined calculation that is often useful. Takes a sequence and returns three values: mean, standard deviation and N.

Package
Source
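
A combined mean, standard deviation and count calculation might look like the sketch below. The name is hypothetical, and the use of the sample standard deviation (dividing by N-1) is an assumption; the docstring does not say which divisor lhstats uses.

```lisp
;; Hypothetical helper, not the lhstats source.
(defun sketch-mean-sd-n (sequence)
  "Return three values: mean, sample standard deviation and N."
  (let* ((n (length sequence))
         (mean (/ (reduce #'+ sequence) n))
         (sum-sq (reduce #'+ (map 'list
                                  (lambda (x) (expt (- x mean) 2))
                                  sequence)))
         ;; Assumption: sample variance with an N-1 divisor.
         (sd (sqrt (/ sum-sq (1- n)))))
    (values mean sd n)))
```
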
Function: median (sequence)
Package
Source
Function: mode (sequence)

Returns two values: a list of the modes and the number of times they occur.

Package
Source
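
Because a sequence can have several equally frequent values, `mode` returns a list. One way to obtain that behaviour is sketched here with a hash table of counts; the name is hypothetical, and sorting the modes in ascending order is an assumption made so the output is deterministic.

```lisp
;; Hypothetical helper, not the lhstats source. Assumes numeric elements.
(defun sketch-mode (sequence)
  "Return two values: the list of modes and how often each occurs."
  (let ((counts (make-hash-table :test #'equal))
        (best 0))
    ;; Tally each element and track the highest tally seen.
    (map nil
         (lambda (x)
           (setf best (max best (incf (gethash x counts 0)))))
         sequence)
    (values (sort (loop for x being the hash-keys of counts
                          using (hash-value c)
                        when (= c best) collect x)
                  #'<)
            best)))

(sketch-mode '(1 2 2 3 3 4)) ; => (2 3), 2
```
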
Function: normal-mean-ci (mean sd n alpha)

Confidence interval for the mean of a normal distribution

The 1-alpha percent confidence interval on the mean of a normal distribution with parameters mean, sd & n.

Package
Source
Function: normal-mean-ci-on-sequence (sequence alpha)

The 1-alpha confidence interval on the mean of a sequence of numbers drawn from a Normal distribution.

Package
Source
Function: normal-pdf (x mu sigma)

The probability density function (PDF) for a normal distribution with mean mu and variance sigma at point x.

Package
Source
Function: normal-sd-ci (sd n alpha)

As normal-variance-ci-on-sequence, but a confidence interval for the standard deviation.

Package
Source
Function: normal-sd-ci-on-sequence (sequence alpha)
Package
Source
Function: normal-variance-ci (variance n alpha)

The 1-alpha confidence interval on the variance of a sequence of numbers drawn from a Normal distribution.

Package
Source
Function: normal-variance-ci-on-sequence (sequence alpha)
Package
Source
Function: percentile (sequence percent)
Package
Source
Function: permutations (n k)

How many ways to take n things k at a time, when order matters.

Package
Source
Function: phi (x)

The CDF of the standard normal distribution. Adopted from CLASP 1.4.3, see copyright notice at http://eksl-www.cs.umass.edu/clasp.html

Package
Source
Function: poisson-cumulative-probability (mu k)

Probability of seeing fewer than K events over a time period when the expected number of events over that time is mu.

Package
Source
Function: poisson-ge-probability (mu x)

Probability of X or more events when expected is mu.

Package
Source
Function: poisson-mu-ci (x alpha)

Confidence interval for the Poisson parameter mu

Given x observations in a unit of time, what is the 1-alpha confidence interval on the Poisson parameter mu (= lambda*T)?

Since find-critical-value assumes that the function is monotonic increasing, adjust the value we are looking for taking advantage of reflectiveness.

Package
Source
Function: poisson-probability (mu k)

Probability of seeing k events over a time period when the expected number of events over that time is mu.

Package
Source
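
The Poisson point probability is P(X = k) = exp(-mu) mu^k / k!. A direct sketch of that formula follows; the name is hypothetical and no attempt is made to guard against overflow for large k.

```lisp
;; Hypothetical helper; the textbook formula, not the lhstats source.
(defun sketch-poisson-probability (mu k)
  "P(X = k) for X ~ Poisson(mu)."
  (let ((k-factorial 1))
    (loop for i from 1 to k
          do (setf k-factorial (* k-factorial i)))
    (/ (* (exp (- mu)) (expt mu k)) k-factorial)))
```

Summing this function for 0 up to k-1 gives the "fewer than K events" quantity of `poisson-cumulative-probability`, and one minus that sum gives the "X or more" quantity of `poisson-ge-probability`.
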
Function: poisson-test-one-sample (observed mu &key tails approximate?)

The significance of a one sample test for the equality of an observed number of events (observed) and an expected number mu under the poisson distribution. Normal theory approximation is not that great, so don’t use it unless told.

Package
Source
Function: random-normal (&key mean sd)

Returns a normally distributed random number with mean and standard deviation as specified.

Package
Source
Function: random-pick (sequence)

Random selection from sequence

Package
Source
Function: random-sample (n sequence)

Return a random sample of size N from sequence, without replacement. If N is equal to or greater than the length of the sequence, return the entire sequence.

Package
Source
Function: range (sequence)
Package
Source
Function: round-float (x &key precision)

Rounds a floating point number to a specified number of digits of precision.

Package
Source
Function: sd (sequence)
Package
Source
Function: sign-test (plus-count minus-count &key exact? tails)

Really just a special case of the binomial one sample test with p = 1/2. The normal theory version has a correction factor to make it a better approximation.

Package
Source
Function: sign-test-on-sequences (sequence1 sequence2 &key exact? tails)

Same as sign-test, but takes two sequences and tests whether the entries in one are different (greater or less) than the other.

Package
Source
Function: spearman-rank-correlation (points)

Spearman rank correlation computes the relationship between a pair of variables when one or both are either ordinal or have a distribution that is far from normal. It takes a list of points (same format as linear-regression) and returns the spearman rank correlation coefficient and its significance.

Package
Source
Function: standard-deviation (sequence)
Package
Source
Function: standard-error-of-the-mean (sequence)
Package
Source
Function: t-distribution (dof percentile)

Returns the point which is the indicated percentile in the T distribution with dof degrees of freedom. Adopted from CLASP 1.4.3, http://eksl-www.cs.umass.edu/clasp.html

Package
Source
Function: t-significance (t-statistic dof &key tails)

Lookup table in Rosner; this is adopted from CLASP/Numerical Recipes (CLASP 1.4.3), http://eksl-www.cs.umass.edu/clasp.html

Package
Source
Function: t-test-one-sample (x-bar sd n mu &key tails)

The significance of a one sample T test for the mean of a normal distribution with unknown variance. X-bar is the observed mean, sd is the observed standard deviation, N is the number of observations and mu is the test mean.

See also t-test-one-sample-on-sequence

Package
Source
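
The test statistic underlying a one sample t test is t = (x-bar - mu) / (sd / sqrt(N)) with N-1 degrees of freedom. The sketch below computes only the statistic and degrees of freedom, not the significance that `t-test-one-sample` returns; the name is hypothetical.

```lisp
;; Hypothetical helper; the textbook statistic, not the lhstats source.
(defun sketch-t-statistic-one-sample (x-bar sd n mu)
  "Return the t statistic and its degrees of freedom."
  (values (/ (- x-bar mu) (/ sd (sqrt n)))
          (1- n)))
```

For an observed mean of 12, a standard deviation of 4 and 16 observations tested against mu = 10, the standard error is 1, so the statistic is 2 with 15 degrees of freedom.
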
Function: t-test-one-sample-on-sequence (sequence mu &key tails)

As t-test-one-sample, but calculates the observed values from a sequence of numbers.

Package
Source
Function: t-test-one-sample-sse (mu mu-null variance &key alpha 1-beta tails)

Returns the number of subjects needed to test whether the mean of a normally distributed sample mu is different from a null hypothesis mean mu-null and variance variance, with alpha, 1-beta and tails as specified.

Package
Source
Function: t-test-paired (d-bar sd n &key tails)

The significance of a paired t test for the means of two normal distributions in a longitudinal study. D-bar is the mean difference, sd is the standard deviation of the differences, N is the number of pairs.

Package
Source
Function: t-test-paired-on-sequences (before after &key tails)

The significance of a paired t test for means of two normal distributions in a longitudinal study. Before is a sequence of before values, after is the sequence of paired after values (which must be the same length as the before sequence).

Package
Source
Function: t-test-paired-sse (difference-mu difference-variance &key alpha 1-beta tails)

Returns the number of subjects needed to test whether the mean of the paired differences differs from zero, given an expected mean difference-mu and variance difference-variance, with alpha, 1-beta and tails as specified.

Package
Source
Function: t-test-two-sample (x-bar1 sd1 n1 x-bar2 sd2 n2 &key variances-equal? variance-significance-cutoff tails)

The significance of the difference of two means (x-bar1 and x-bar2) with standard deviations sd1 and sd2, and sample sizes n1 and n2 respectively. The form of the two sample t test depends on whether the sample variances are equal or not. If the variable variances-equal? is :test, then we use an F test and the variance-significance-cutoff to determine if they are equal. If the variances are equal, then we use the two sample t test for equal variances. If they are not equal, we use the Satterthwaite method, which has good type I error properties (at the loss of some power).

Package
Source
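
When the variances are not assumed equal, the Satterthwaite method mentioned above uses a per-sample standard error and an approximate number of degrees of freedom. The sketch below shows the textbook form of both quantities; the name is hypothetical, and whether lhstats rounds the degrees of freedom is not stated in the docstring, so the value is returned unrounded.

```lisp
;; Hypothetical helper; textbook unequal-variance t statistic with
;; Satterthwaite degrees of freedom, not the lhstats source.
(defun sketch-unequal-variance-t (x-bar1 sd1 n1 x-bar2 sd2 n2)
  "Return the t statistic and the (unrounded) Satterthwaite degrees of freedom."
  (let* ((a (/ (* sd1 sd1) n1))   ; squared standard error, sample 1
         (b (/ (* sd2 sd2) n2))   ; squared standard error, sample 2
         (statistic (/ (- x-bar1 x-bar2) (sqrt (+ a b))))
         (dof (/ (expt (+ a b) 2)
                 (+ (/ (* a a) (1- n1))
                    (/ (* b b) (1- n2))))))
    (values statistic dof)))
```

With equal standard deviations and equal sample sizes the approximation collapses to n1 + n2 - 2, which is a convenient sanity check.
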
Function: t-test-two-sample-on-sequences (sequence1 sequence2 &key variance-significance-cutoff tails)

Same as t-test-two-sample, but providing the sequences rather than the summaries.

Package
Source
Function: t-test-two-sample-sse (mu1 variance1 mu2 variance2 &key sample-ratio alpha 1-beta tails)

Returns the number of subjects needed to test whether the mean mu1 of a normally distributed sample (with variance variance1) is different from a second sample with mean mu2 and variance variance2, with alpha, 1-beta and tails as specified. It is also possible to set a sample size ratio of sample 1 to sample 2.

Package
Source
Function: variance (sequence)
Package
Source
Function: wilcoxon-signed-rank-test (differences &optional tails)

A test on the ranking of positive and negative differences (are the positive differences significantly larger/smaller than the negative ones). Assumes a continuous and symmetric distribution of differences, although not a normal one. This is the normal theory approximation, which is only valid when N > 15.

This test is completely equivalent to the Mann-Whitney test.

Package
Source
Function: wilcoxon-signed-rank-test-on-sequences (sequence1 sequence2 &optional tails)
Package
Source
Function: z (percentile &key epsilon)

The inverse normal function, P(X<Zu) = u where X is distributed as the standard normal. Uses binary search.

Package
Source
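
Inverting a monotonic increasing CDF by binary search, as `z` is described as doing, can be sketched generically. Everything here is illustrative: the function name and the `lo` and `hi` bracket defaults are hypothetical, and because standard Common Lisp has no built-in normal CDF the usage line substitutes a logistic CDF as a stand-in.

```lisp
;; Hypothetical helper; generic bisection, not the lhstats source.
(defun sketch-inverse-by-bisection (cdf u &key (lo -10d0) (hi 10d0)
                                            (epsilon 1d-9))
  "Find x in [LO, HI] such that (funcall CDF x) is approximately U.
CDF must be monotonic increasing on the bracket."
  (loop while (> (- hi lo) epsilon)
        do (let ((mid (/ (+ lo hi) 2)))
             ;; Keep the half of the bracket that still contains the target.
             (if (< (funcall cdf mid) u)
                 (setf lo mid)
                 (setf hi mid))))
  (/ (+ lo hi) 2))

;; Stand-in CDF (logistic); its median is 0.
(sketch-inverse-by-bisection (lambda (x) (/ 1 (+ 1 (exp (- x))))) 0.5d0)
```

Each iteration halves the bracket, so the number of CDF evaluations grows only with the logarithm of the requested precision.
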
Function: z-test (x-bar n &key mu sigma tails)

The significance of a one sample Z test for the mean of a normal distribution with known variance.

mu is the null hypothesis mean, x-bar is the observed mean, sigma is the standard deviation and N is the number of observations. If tails is :both, the significance of a difference between x-bar and mu. If tails is :positive, the significance of x-bar being greater than mu, and if tails is :negative, the significance of x-bar being less than mu.

Returns a p value.

Package
Source
Function: z-test-on-sequence (sequence &key mu sigma tails)
Package
Source

### 5.2 Internals

#### 5.2.1 Special variables

Special Variable: *critical-values-of-r*
Package
Source
Special Variable: *critical-values-of-r-two-tailed-column-interpretaion*
Package
Source
Special Variable: *f0.05*
Package
Source
Special Variable: *f0.10*
Package
Source
Special Variable: *q-table*
Package
Source
Special Variable: *t-cdf-critical-points-table-for-.05*
Package
Source

#### 5.2.2 Macros

Macro: display (&rest l)
Package
Source
Macro: underflow-goes-to-zero (&body body)

Protects against floating point underflow errors and sets the value to 0.0 instead.

Package
Source
Macro: z/protect (expr testvar)

Macro to protect from division by zero.

Package
Source

#### 5.2.3 Ordinary functions

Function: 2-tailed-correlation-significance (n r)

We use the first line for anything less than 5, and the last line for anything over 500. Otherwise, find the nearest value (maybe we should interpolate ... too much bother!)

Package
Source
Function: all-squares (as bs)
Package
Source
Function: anova1 (d)

One way simple ANOVA, from Neter, et al. p677+. Data is given as a list of lists, each one representing a treatment, and each containing the observations.

Package
Source
Function: anova2 (a1b1 a1b2 a2b1 a2b2)

Two-Way Anova. (From Misanin & Hinderliter, 1991, p. 367-) This is specialized for four groups of equal n, called by their plot location names: left1 left2 right1 right2.

Package
Source
Function: anova2r (g1 g2)

Two way ANOVA with repeated measures on one dimension. From Ferguson & Takane, 1989, p. 359. Data is organized differently for this test. Each group (g1 g2) contains a list of all subjects’ repeated measures, and same for B. So, A: ((t1s1g1 t2s1g1 ...) (t1s2g2 t2s2g2 ...) ...) Have to have the same number of test repeats for each subject, and this assumes the same number of subjects in each group.

Package
Source
Function: average-rank (value sorted-values)

Average rank calculation for non-parametric tests. Ranks are 1 based, but Lisp is 0 based, so add 1!

Package
Source
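
The "add 1" remark refers to converting 0-based positions into 1-based ranks. When a value is tied, its rank is the average of the ranks it spans. A minimal sketch with a hypothetical name, assuming `sorted-values` is a sorted list that contains `value`:

```lisp
;; Hypothetical helper, not the lhstats source.
(defun sketch-average-rank (value sorted-values)
  "1-based rank of VALUE in SORTED-VALUES, averaged over ties."
  (let ((lo (position value sorted-values))
        (hi (position value sorted-values :from-end t)))
    ;; Positions are 0-based, so shift the average by one.
    (+ 1 (/ (+ lo hi) 2))))

(sketch-average-rank 3 '(1 3 3 5)) ; => 5/2
(sketch-average-rank 1 '(1 3 3 5)) ; => 1
```
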
Function: beta-incomplete (a b x)

Adopted from CLASP 1.4.3, http://eksl-www.cs.umass.edu/clasp.html

Package
Source
Function: binomial-le-probability (n k p)
Package
Source
Function: chi-square-1 (expected observed)
Package
Source
Function: chi-square-2 (table)
Package
Source
Function: correlate (x y)

Correlation of two sequences, as in Ferguson & Takane, 1989, p. 125. Assumes NO MISSING VALUES!

Package
Source
Function: cross-mean (l)

Cross mean takes a list of lists, as ((1 2 3) (4 3 2 1) ...) and produces a list with mean and standard error for each VERTICAL entry, so, as: ((2.5 . 1) ...) where the first pair is computed from the nth 1 of all the sublists in the input set, etc. This is useful in some cases of data crunching.

Note that missing data is assumed to be always at the END of lists. If it isn’t, you’ve got to do something previously to interpolate.

Package
Source
Function: dumplot (v &optional show-values)

A dumb terminal way of plotting data.

Package
Source
Function: error-function (x)

Adopted from CLASP 1.4.3, http://eksl-www.cs.umass.edu/clasp.html

Package
Source
Function: error-function-complement (x)

Adopted from CLASP 1.4.3, http://eksl-www.cs.umass.edu/clasp.html

Package
Source
Function: even-power-of-two? (n)
Package
Source
Function: f-score>p-limit? (df1 df2 f-score limits-table)
Package
Source
Function: factorial (number)
Package
Source
Function: find-critical-value (p-function p-value &optional x-tolerance y-tolerance)

Adopted from CLASP 1.4.3, http://eksl-www.cs.umass.edu/clasp.html

Package
Source
Function: gamma-incomplete (a x)

Adopted from CLASP 1.4.3, http://eksl-www.cs.umass.edu/clasp.html

Package
Source
Function: gamma-ln (x)

Adopted from CLASP 1.4.3, http://eksl-www.cs.umass.edu/clasp.html

Package
Source
Function: harmonic-mean (seq)

See: http://mathworld.wolfram.com/HarmonicMean.html

Package
Source
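
The harmonic mean is the count of values divided by the sum of their reciprocals. A minimal sketch with a hypothetical name, assuming all values are non-zero:

```lisp
;; Hypothetical helper; the standard definition, not the lhstats source.
(defun sketch-harmonic-mean (seq)
  "N divided by the sum of reciprocals of the elements of SEQ."
  (/ (length seq)
     (reduce #'+ (map 'list (lambda (x) (/ 1 x)) seq))))

(sketch-harmonic-mean '(1 2 4)) ; => 12/7
```
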
Function: histovalues (v* &key nbins)

Take a set of values and produce a histogram binned into n groups, so that you can get a report of the distribution of values. There’s a large chance for off-by-one errors here!

Package
Source
Function: lmean (ll)

Lmean takes the mean of entries in a list of lists vertically. So: (lmean '((1 2) (5 6))) -> (3 4) The args have to be the same length.

Package
Source
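
A column-wise mean over equal-length sublists can be expressed by applying `mapcar` across all sublists at once. This is a sketch with a hypothetical name, not the lhstats definition:

```lisp
;; Hypothetical helper, not the lhstats source.
(defun sketch-lmean (ll)
  "Mean of each column in LL, a list of equal-length lists."
  (apply #'mapcar
         (lambda (&rest column)
           (/ (reduce #'+ column) (length column)))
         ll))

(sketch-lmean '((1 2) (5 6))) ; => (3 4)
```
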
Function: max* (l &rest ll)
Package
Source
Function: min* (l &rest ll)
Package
Source
Function: n-random (n l)

Select n random sublists from a list, without replacement. This copies the list and then destroys the copy. N better be less than or equal to (length l).

Package
Source
Function: normalize (v)

Normalize a vector by subtracting its min and then dividing through by its range (max-min). If the numbers are all the same, this would screw up, so we check that first and just return a long list of 0.5 if so!

Package
Source
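
The min-max normalization and its constant-input fallback can be sketched as follows. The name is hypothetical and the input is assumed to be a non-empty list of numbers.

```lisp
;; Hypothetical helper, not the lhstats source.
(defun sketch-normalize (v)
  "Rescale the numbers in list V to the interval [0, 1]."
  (let* ((lo (reduce #'min v))
         (hi (reduce #'max v))
         (range (- hi lo)))
    (if (zerop range)
        ;; All values equal: avoid dividing by zero and return 0.5 throughout.
        (make-list (length v) :initial-element 0.5)
        (mapcar (lambda (x) (/ (- x lo) range)) v))))

(sketch-normalize '(2 4 6)) ; => (0 1/2 1)
(sketch-normalize '(3 3 3)) ; => (0.5 0.5 0.5)
```
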
Function: p2 (v)
Package
Source
Function: protected-mean (l)

Computes a mean protected where there will be a divide by zero, and gives us n/a in that case.

Package
Source
Function: pround (n v)

Returns a string that is rounded to the appropriate number of digits, but the only thing you can do with it is print it. It’s just a convenience hack for rounding recursive lists.

Package
Source
Function: regress (x y)

Simple linear regression.

Package
Source
Function: round-up (x)
Package
Source
Function: s2 (l n)
Package
Source
Function: safe-exp (x)

Eliminates floating point underflow for the exponential function. Instead, it just returns 0.0d0

Package
Source
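
Guarding `exp` against underflow can be done with the standard condition system. The sketch below uses a hypothetical name; whether `exp` signals `floating-point-underflow` at all depends on the implementation and its trap settings, so on some systems the handler is never reached and `exp` simply returns zero.

```lisp
;; Hypothetical helper, not the lhstats source.
(defun sketch-safe-exp (x)
  "Like EXP, but returns 0.0d0 if the computation signals an underflow."
  (handler-case (exp x)
    (floating-point-underflow () 0.0d0)))
```
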
Function: sign (x)
Package
Source
Function: sqr (a)
Package
Source
Function: standard-error (sequence)
Package
Source
Function: sum (l)
Package
Source
Function: t-p-value (x df &optional warn?)
Package
Source
Function: t1-test (values target &optional warn?)

One way t-test to see if a group differs from a numerical mean target value. From Misanin & Hinderliter p. 248.

Package
Source
Function: t1-value (values target)
Package
Source
Function: t2-test (l1 l2)

T2-test calculates an UNPAIRED t-test.

From Misanin & Hinderliter p. 268. The t-cdf part is inherent in xlispstat, and I’m not entirely sure that it’s really the right computation since it doesn’t agree entirely with Table 5 of M&H, but it’s close, so I assume that M&H have round-off error.

Package
Source
Function: t2-value (l1 l2)
Package
Source
Function: testanova2 ()
Package
Source
Function: tukey-q (k dfwg)

Finds the Q table for the appropriate K, and then walks BACKWARDS through it (in a kind of ugly way!) to find the appropriate place in the table for the DFwg, and then uses the level (which must be 0.01 or 0.05, indicating the first, or second col of the table) to determine if the Q value reaches significance, and gives us a + or - final result.

Package
Source
Function: wilcoxon-1 (initial-values target)

Nonparametric one-sample (signed) rank test (Wilcoxon).

From http://www.graphpad.com/instatman/HowtheWilcoxonranksumtestworks.htm

Package
Source
Function: x2test ()

Simple Chi-Squares From Clarke & Cooke p. 431; should = ~7.0

Package
Source