The documentation has no "correlation" operation, but it does list "ppearson" and "spearson". They are mentioned exactly once, each as a "group-by statistical operation." But how exactly are they defined?
Also, when I try to use one, there is an error message, but I don't understand how to fix it. How do you use ppearson or spearson?
$ cat > foo.tsv
1^I2
2^I3
$ cat foo.tsv | datamash ppearson 1,2
datamash: operation ‘ppearson’ requires field pairs
EDIT: This documentation section says:
GNU Datamash is designed to closely follow R project’s (https://www.r-project.org/) statistical functions. See the files/operators.R file for the R equivalent code for each of datamash’s operators. When building datamash from source code on your local computer, operators are compared to known results of the equivalent R functions.
Looking in R, I don't see an spearson:
> ?spearson
No documentation for ‘spearson’ in specified packages and libraries:
you could try ‘??spearson’
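For reference, this is the definition I would expect, written as a minimal awk sketch of the textbook Pearson coefficient, r = cov(x, y) / (sd(x) * sd(y)). It is an assumption on my part that datamash computes either operation this way. The file name pairs.tsv and its values are made up for illustration. If both the covariance and the standard deviations use the same divisor (n for population, n-1 for sample), the divisor cancels, so the two variants would give the same r under this definition. The "field pairs" in the error message also suggest a pair syntax such as 1:2 rather than 1,2, but that is noted only as a guess in the comment below.

```shell
# Hypothetical two-column, tab-separated input (values are illustrative only).
printf '1\t2\n2\t3\n3\t5\n' > pairs.tsv

# One pass over the file: accumulate the sums, then apply
# r = cov(x, y) / sqrt(var(x) * var(y)), using divisor n throughout.
r=$(awk -F'\t' '
  { n++; sx += $1; sy += $2; sxx += $1 * $1; syy += $2 * $2; sxy += $1 * $2 }
  END {
    mx  = sx / n
    my  = sy / n
    cov = sxy / n - mx * my
    vx  = sxx / n - mx * mx
    vy  = syy / n - my * my
    printf "%.4f", cov / sqrt(vx * vy)
  }' pairs.tsv)
echo "$r"

# Unverified guess at the datamash equivalent (pair written with a colon):
#   datamash ppearson 1:2 < pairs.tsv
```

Comparing this number with what datamash prints for the same file would show whether the assumed definition matches.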