
I would like to compute the beta or standardized coefficient of a linear regression model using standard tools in Python (numpy, pandas, scipy.stats, etc.).

A friend of mine told me that this is done in R with the following command:

lm(scale(y) ~ scale(x))

Currently, I am computing it in Python like this:

from scipy.stats import linregress
from scipy.stats.mstats import zscore

(beta_coeff, intercept, rvalue, pvalue, stderr) = linregress(zscore(x), zscore(y))
print('The Beta Coeff is: %f' % beta_coeff)

Is there a more straightforward function to compute this figure in Python?

ali_m
David
    In case you have more than one independent variable, please see this [post](https://stackoverflow.com/a/54652025/9900084) – steven Feb 12 '19 at 14:17

1 Answer


Python is a general-purpose language, whereas R was designed specifically for statistics. It will almost always take a few more lines of code to achieve the same statistical goal in Python, simply because R is ready to fit regression models (using lm) as soon as you start it.

The short answer to your question is no: your Python code is already pretty straightforward.

That said, I think a closer equivalent to your R code would be

import statsmodels.api as sm
from scipy.stats.mstats import zscore

# No constant term is added: after z-scoring both variables, the intercept is zero.
print(sm.OLS(zscore(y), zscore(x)).fit().summary())
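
If you want to sanity-check the number without statsmodels, here is a minimal sketch using only numpy and scipy. The data is synthetic and the variable names (beta_z, beta_rescaled, raw) are made up for illustration. It relies on two facts about regression with a single predictor: the standardized coefficient equals the raw slope multiplied by std(x)/std(y), and it also equals the Pearson correlation.

```python
import numpy as np
from scipy.stats import linregress, zscore

# Synthetic data, made up purely for illustration.
rng = np.random.default_rng(0)
x = rng.normal(loc=10.0, scale=3.0, size=200)
y = 2.5 * x + rng.normal(scale=4.0, size=200)

# Approach from the question: regress z-scored y on z-scored x.
beta_z = linregress(zscore(x), zscore(y)).slope

# Equivalent: rescale the raw slope by the ratio of standard deviations.
raw = linregress(x, y)
beta_rescaled = raw.slope * np.std(x) / np.std(y)

# With a single predictor, both also equal the Pearson correlation.
print(beta_z, beta_rescaled, raw.rvalue)
```

All three printed values should agree up to floating-point error. This shortcut holds only for one independent variable; with several predictors the standardized coefficients are no longer simple correlations.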
Eoin