What's the minimum sample required for correlation analysis (linear model in R)

Question

I'm trying to perform a correlation analysis with R's linear model function, `lm()`.

What is a reasonable minimum sample size for it? Is there a rule for determining that?

I think if you have two data columns or vectors with minimum two data points then the lm function should work. — Jd Baba, Apr 19 '13 at 03:06
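To illustrate the comment above, here is a minimal sketch with two made-up data points (the values of `x` and `y` are purely illustrative). `lm()` does run, but the line passes through both points exactly, so no residual degrees of freedom remain for estimating error:

```r
# Two made-up points; lm() fits a line through them exactly.
x <- c(1, 2)
y <- c(3, 7)

fit2 <- lm(y ~ x)

# Intercept and slope of the line through (1, 3) and (2, 7)
print(coef(fit2))

# Residual degrees of freedom: n - 2 = 0, so there is nothing left
# from which to estimate the error variance.
print(df.residual(fit2))
```

So "the function runs" and "the result is informative" are different questions: with zero residual degrees of freedom, the fit says nothing about how uncertain the slope is.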

score 9 · Accepted Answer · edited May 23 '17 at 12:09

As a rule of thumb, ~~20, 30, 1000, samples~~ As a rule of thumb, you should be wary of rules of thumb. The one exception, perhaps, is "less is more, except of course for sample size" (Cohen & Cohen, 1983: 169-171).

You could ask your question on https://stats.stackexchange.com/, but the answers there probably won't be the round number you're looking for.

You'll get more useful responses if you edit your question here to include a reproducible example that resembles your actual use case, and then ask for help coding the calculations of relevant measures of error. You might explore the pwr package before you edit your question (for examples, see http://www.statmethods.net/stats/power.html).
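As a starting point, here is a minimal sketch of a power calculation for a correlation using `pwr.r.test()` from the pwr package. It assumes pwr is installed, and the planning values (`expected_r`, `alpha`, `target_power`) are hypothetical placeholders rather than recommendations:

```r
# Assumes the pwr package is installed: install.packages("pwr")
library(pwr)

# Hypothetical planning values; replace them with ones that suit your study.
expected_r   <- 0.3   # correlation you expect, or the smallest one you care about
alpha        <- 0.05  # significance level
target_power <- 0.80  # desired probability of detecting expected_r

# Leaving n unspecified asks pwr.r.test() to solve for the sample size.
result <- pwr.r.test(r = expected_r, sig.level = alpha, power = target_power)
n_needed <- ceiling(result$n)
print(result)

# The required n depends strongly on the effect size you assume.
for (r in c(0.1, 0.3, 0.5)) {
  n <- ceiling(pwr.r.test(r = r, sig.level = alpha, power = target_power)$n)
  cat("r =", r, "-> n =", n, "\n")
}
```

The loop shows why no single round number can answer the question: the sample size needed depends on the effect size, significance level, and power you choose.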

Do a bit of googling to find the names of error measures you think will be useful to you. You might start with these:

Lenth, R. V. (2001). Some practical guidelines for effective sample size determination. The American Statistician, 55, 187-193.

Wheeler, R. E. (1974). Portable power. Technometrics, 16, 193-201.

Cohen, J., & Cohen, P. (1983). Applied multiple regression/correlation analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
