A collection of statistical tests and utility functions in Ruby.

Copyright (c) 2006,2007 Josh Myer <josh@joshisanerd.com>

Released under the terms of the GPL v2 and v2 only (without the upgrade provision). See the file COPYING for more information.

RStats is a collection of approximate statistical tests and techniques. It's meant mostly as my notes while learning statistics, but I'm releasing it in the hopes that someone else might find it useful.

Most people will only be interested in `RStats.chi_square_gof` and
`RStats.chi_square_cont`, Ruby implementations of the χ^{2} (chi-squared)
goodness-of-fit and χ^{2} contingency/independence tests.
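For readers new to the goodness-of-fit test, here is a minimal standalone sketch of the statistic it is built on. It is not the RStats implementation and does not use the RStats API; the helper name `chi_square_statistic` and the sample counts are made up for illustration.

```ruby
# Hypothetical helper, NOT part of RStats: Pearson's chi-squared statistic,
# the sum over categories of (observed - expected)^2 / expected.
def chi_square_statistic(observed, expected)
  unless observed.length == expected.length
    raise ArgumentError, "observed and expected must have the same length"
  end
  observed.zip(expected).sum { |o, e| (o - e)**2 / e.to_f }
end

# Made-up counts for three categories with equal expected frequencies.
observed = [18, 22, 20]
expected = [20, 20, 20]

stat = chi_square_statistic(observed, expected)
df = observed.length - 1 # degrees of freedom for a simple goodness-of-fit test
puts "chi-squared = #{stat}, df = #{df}"
```

The statistic is then compared against the χ^{2} distribution with `df` degrees of freedom to obtain a probability.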

You can obtain a copy by cloning the repository with git. This gets you a complete copy of the repo, along with rough instructions on how to make your own purely static distribution of the site and code. It may also be available on GitHub as jbm9/rstats.

All the functions are documented, so your best bet is simply to read the documentation for the function you need. For examples, see the "unit tests" in test/, which are built from the questions at the end of each section of Langley's book.

The following tests are included in the current release:

- The zM test. Langley pp152-9
- The Student's t test (due to Gosset). Langley pp160-5
- Wilcoxon's Sum of Ranks test. Langley pp166-78
- Wilcoxon's Signed Ranks Test. Langley pp179-189
- Wilcoxon's Stratified Test. Langley pp190-198
- Spearman's Correlation Test. Langley pp199-211
- Kruskal and Wallis' Test. Langley pp212-21
- Friedman's Test. Langley pp222-9
- χ^{2} (Chi-Square) contingency test. Bass; Langley pp269-84
- χ^{2} (Chi-Square) goodness-of-fit test. Bass; Langley pp269-84

`Langley, Russell. Practical Statistics Simply Explained.
Dover Publications, Inc., New York, 1971.
ISBN 0486227294
Bass, Issa. A Zest of Non Parametric Testing - The Chi Square Test.
SixSigmaFirst, http://www.sixsigmafirst.com/chi_square_test.htm
Accessed 2006-12-20.`

The primary reference source for this is Russell Langley's excellent
*Practical Statistics Simply Explained*. It's a Dover print,
available as ISBN 0-486-22729-4, for about US$13. The book is a
gentle introduction to statistics, which makes it a good primer, but
not a terribly good reference over the longer term. It focuses on
approximate tests, which give reasonably good results but are easy to
run by hand. Also, it was written before computers were commonly
available, which makes many of the tests described somewhat obsolete.

Unfortunately, Langley's section on χ^{2}
is messy. χ^{2} is used in many different
circumstances, all with different procedures. In Langley, these are
all presented in parallel, which is confusing. To make the uses of
χ^{2} that I rely on most clearer,
I did some quick googling. Since it's such a common test,
there's no shortage of tutorial sites on the web; feel free to find
your own. I like Bass's *A Zest of Non
Parametric Testing - The Chi Square Test*, because it's short
and to the point.
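To illustrate how the contingency procedure differs from goodness-of-fit, here is a small standalone sketch in which the expected counts are derived from the table's own row and column totals. It is not the code behind `RStats.chi_square_cont`; the helper name `contingency_chi_square` and the 2x2 table are invented for this example.

```ruby
# Hypothetical helper, NOT part of RStats: chi-squared statistic and degrees
# of freedom for a contingency table given as an array of equal-length rows.
def contingency_chi_square(table)
  row_totals = table.map(&:sum)
  col_totals = table.transpose.map(&:sum)
  grand_total = row_totals.sum.to_f

  stat = 0.0
  table.each_with_index do |row, i|
    row.each_with_index do |observed, j|
      # Expected count under independence: row total * column total / grand total.
      expected = row_totals[i] * col_totals[j] / grand_total
      stat += (observed - expected)**2 / expected
    end
  end

  df = (table.length - 1) * (table.first.length - 1)
  [stat, df]
end

# Made-up 2x2 table of counts.
table = [[10, 20],
         [20, 10]]
stat, df = contingency_chi_square(table)
puts "chi-squared = #{stat}, df = #{df}"
```

This sketch applies no continuity correction and does not check for small expected counts, both of which matter for small tables.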

Finally, the `pochisq` function is a direct translation of the
C version available in the public domain. The original C was done by
Gary Perlman of Wang Institute, Tyngsboro, MA 01879. With that in
mind, the `pochisq` function may be freely adapted into the
public domain (you'll need to pull in `z_to_prob` as well,
which is a trivial function).
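As a rough illustration of what a function like `pochisq` computes, the sketch below evaluates the upper-tail probability of a χ^{2} value for a whole number of degrees of freedom. It is neither Perlman's code nor the RStats translation: the name `chi_square_upper_tail` is hypothetical, and it leans on Ruby's built-in `Math.erfc` and `Math.gamma` rather than a `z_to_prob` helper.

```ruby
# Hypothetical sketch, NOT the RStats pochisq: probability that a chi-squared
# variable with integer df degrees of freedom exceeds x.
def chi_square_upper_tail(x, df)
  raise ArgumentError, "df must be a positive integer" unless df.is_a?(Integer) && df >= 1
  return 1.0 if x <= 0

  half = x / 2.0
  if df.even?
    prob = Math.exp(-half)            # exact upper tail for df = 2
    n = 2
  else
    prob = Math.erfc(Math.sqrt(half)) # exact upper tail for df = 1
    n = 1
  end

  # Step up two degrees of freedom at a time:
  # Q(x; n + 2) = Q(x; n) + (x/2)^(n/2) * exp(-x/2) / Gamma(n/2 + 1)
  while n < df
    prob += half**(n / 2.0) * Math.exp(-half) / Math.gamma(n / 2.0 + 1)
    n += 2
  end
  prob
end

puts chi_square_upper_tail(3.841458820694124, 1) # close to 0.05
```

`Math.gamma` overflows for large arguments, so this version is only suitable for modest degrees of freedom.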

The author's homepage.