I have a dataset containing multiple products and their npi values. I want to fit a linear regression for each product and collect the results in a data frame with one row per product and the columns product, slope, intercept, rvalue and pvalue.

I have managed to write the for loop for the linear regressions, but appending the results keeps throwing errors.

This is my code:

result = pd.DataFrame()

for prod in product_array:
    data_aggr_period_prod_loop = data_aggr_period_prod.loc[data_aggr_period_prod['product']==prod].sort_values('period')
    if len(data_aggr_period_prod_loop) > 1:
        x = np.array([date_map[ix] for ix in data_aggr_period_prod_loop['period']])
        y1 = np.array(data_aggr_period_prod_loop['npi'])
        slope, intercept, rvalue, pvalue, _ = linregress(x, y1)
        result = result.append("product", "slope", "intercept", "rvalue", "pvalue")

The code above raised this error: 'TypeError: append() takes from 2 to 5 positional arguments but 6 were given'

Can someone please tell me how to get the results appended into a dataframe?

  • Does this answer your question? [Insert a row to pandas dataframe](https://stackoverflow.com/questions/24284342/insert-a-row-to-pandas-dataframe) – Yevhen Kuzmovych Feb 10 '23 at 11:13
  • "keeps throwing errors". What errors? – Yevhen Kuzmovych Feb 10 '23 at 11:14
  • the code above threw this error 'TypeError: append() takes from 2 to 5 positional arguments but 6 were given' –  Feb 10 '23 at 11:14
  • "product", "slope", "intercept", "rvalue", "pvalue" are constant strings – Yevhen Kuzmovych Feb 10 '23 at 11:15
  • Read [the docs](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.append.html) please – Yevhen Kuzmovych Feb 10 '23 at 11:16
  • yes they are strings, is that the problem? –  Feb 10 '23 at 11:16
  • Why are you trying to append the same strings over and over? – Yevhen Kuzmovych Feb 10 '23 at 11:17
  • I want a dataframe with every product and its corresponding regression values. That's what I am trying to do –  Feb 10 '23 at 11:18
  • You cannot just supply `append` with a group of values - that's not how it works. As comments say: read the docs. You could for example form a dictionary of the values and use `append`, or better use `concat` to add a row to the DF or else use `.loc[-1]`. Look up what these functions require and see examples. Note that it is inefficient to add rows one at a time - it is better to form a list of rows and add them all once. – user19077881 Feb 10 '23 at 11:37

1 Answer

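As others have said, append does not work that way: it expects a single row-like object (for example a dict, a Series or another DataFrame), not one positional argument per value, which is why you get the TypeError. You are also passing the literal strings "product", "slope" and so on rather than the variables you computed, so even a working call would fill your result dataframe with column names instead of regression values.

I added a columns argument to the DataFrame creation and changed how each row is added to your result df: result.loc[len(result)] assigns a new row at the next integer index.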

result = pd.DataFrame(columns=["product", "slope", "intercept", "rvalue", "pvalue"])

for prod in product_array:
    data_aggr_period_prod_loop = data_aggr_period_prod.loc[data_aggr_period_prod['product']==prod].sort_values('period')
    if len(data_aggr_period_prod_loop) > 1:
        x = np.array([date_map[ix] for ix in data_aggr_period_prod_loop['period']])
        y1 = np.array(data_aggr_period_prod_loop['npi'])
        slope, intercept, rvalue, pvalue, _ = linregress(x, y1)
        # use the loop variable prod; the name product is not defined in this loop
        result.loc[len(result)] = [prod, slope, intercept, rvalue, pvalue]
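
As the comments point out, adding rows one at a time is inefficient, and DataFrame.append was removed in pandas 2.0. A minimal sketch of the alternative is to collect one dict per product in a list and build the dataframe once at the end. The sample data and the contents of date_map below are made up for illustration, since the question does not show them, and I use groupby in place of the product_array loop.

```python
import pandas as pd
from scipy.stats import linregress

# Hypothetical sample data standing in for data_aggr_period_prod
data_aggr_period_prod = pd.DataFrame({
    "product": ["A", "A", "A", "B", "B", "B", "C"],
    "period": ["2022-01", "2022-02", "2022-03",
               "2022-01", "2022-02", "2022-03", "2022-01"],
    "npi": [1.0, 2.0, 4.0, 5.0, 3.0, 2.0, 4.0],
})
# Hypothetical mapping from period label to a numeric x value
date_map = {"2022-01": 0, "2022-02": 1, "2022-03": 2}

columns = ["product", "slope", "intercept", "rvalue", "pvalue"]
rows = []
for prod, group in data_aggr_period_prod.groupby("product"):
    # a regression needs at least two points, so skip single-row products
    if len(group) < 2:
        continue
    group = group.sort_values("period")
    x = group["period"].map(date_map).to_numpy()
    y = group["npi"].to_numpy()
    fit = linregress(x, y)
    rows.append({
        "product": prod,
        "slope": fit.slope,
        "intercept": fit.intercept,
        "rvalue": fit.rvalue,
        "pvalue": fit.pvalue,
    })

# build the dataframe once from the collected rows
result = pd.DataFrame(rows, columns=columns)
print(result)
```

Product C has only one row in this sample, so it is skipped, the same as the len(...) > 1 check in the original loop.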
Hillygoose