
I have a lot of pandas time series that I would like to combine into a dataframe. The resulting dataframe has no meaningful column names (the columns are just labelled 0, 1, 2, ...). Is there some way to reuse the variable names of the series (s_a, s_b, s_c, ...) as column names without having to specify them explicitly?

import pandas as pd
import numpy as np

dates = pd.date_range('2017-01-01', '2017-03-01')
s_a = pd.Series(np.random.randn(60), index = dates)
s_b = pd.Series(np.random.randn(60), index = dates)
s_c = pd.Series(np.random.randn(60), index = dates)
s_d = pd.Series(np.random.randn(60), index = dates)
df_a = pd.concat([s_a, s_b, s_c, s_d], join='outer', axis = 1)

I am hoping for something along these lines:

s_list = [s_a, s_b, s_c, s_d]

and then a hypothetical method s_list.names() applied after the dataframe has been constructed, so that

df_a = pd.concat(s_list, join='outer', axis = 1)
df_a.columns = s_list.names()

produces the desired dataframe.

    There are only hacky solutions available. Have a look [here](https://stackoverflow.com/questions/18425225/getting-the-name-of-a-variable-as-a-string) and [here](https://stackoverflow.com/questions/2553354/how-to-get-a-variable-name-as-a-string-in-python) – languitar Mar 20 '17 at 10:27

1 Answer

Your series have no name set (their name attribute is None), so you have to assign a name to each series. pd.concat then uses those names as the column labels when you concatenate along axis 1.

import pandas as pd
import numpy as np

dates = pd.date_range('2017-01-01', '2017-03-01')
s_a = pd.Series(np.random.randn(60), index = dates,name='s_a')
s_b = pd.Series(np.random.randn(60), index = dates,name='s_b')
s_c = pd.Series(np.random.randn(60), index = dates,name='s_c')
s_d = pd.Series(np.random.randn(60), index = dates,name='s_d')
s_x = pd.Series(np.random.randn(60), index = dates)
df_a = pd.concat([s_a, s_b, s_c, s_d],join='outer', axis = 1)
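
As a minimal sketch of how the names carry over (the two-series list and the labels 'a' and 'b' are only illustrative assumptions), the hypothetical s_list.names() from the question reduces to a list comprehension over the name attribute. If you would rather not name the series themselves, the keys argument of pd.concat can supply the labels instead:

```python
import numpy as np
import pandas as pd

dates = pd.date_range('2017-01-01', '2017-03-01')
s_a = pd.Series(np.random.randn(len(dates)), index=dates, name='s_a')
s_b = pd.Series(np.random.randn(len(dates)), index=dates, name='s_b')
s_list = [s_a, s_b]

# Named series: the name of each series becomes its column label
df_a = pd.concat(s_list, join='outer', axis=1)

# Equivalent of the hypothetical s_list.names()
names = [s.name for s in s_list]

# Alternative: pass the labels explicitly through keys=
df_keys = pd.concat(s_list, join='outer', axis=1, keys=['a', 'b'])

print(list(df_a.columns))
print(list(df_keys.columns))
```

Either way the labels are attached to the data explicitly, so nothing depends on how the variables happen to be named.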

EDIT: Alternate Solution

Based on this answer https://stackoverflow.com/a/18425523/5729272, you can look up the variable names by inspecting the caller's frame, collect them into a list, and assign that list to the columns.

import pandas as pd
import numpy as np
import inspect
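def retrieve_name(variables):
    # Search the caller's local variables for each object and collect the matching names.
    # Caveat: an object bound to several names yields every one of those names.
    callers_local_vars = inspect.currentframe().f_back.f_locals.items()
    return [var_name for var in variables for var_name, var_val in callers_local_vars if var_val is var]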

dates = pd.date_range('2017-01-01', '2017-03-01')
s_a = pd.Series(np.random.randn(60), index = dates,name='s_a')
s_b = pd.Series(np.random.randn(60), index = dates,name='s_b')
s_c = pd.Series(np.random.randn(60), index = dates,name='s_c')
s_d = pd.Series(np.random.randn(60), index = dates,name='s_d')
s_x = pd.Series(np.random.randn(60), index = dates)
df_a = pd.concat([s_a, s_b, s_c, s_d],join='outer', axis = 1)
df_a.columns = retrieve_name([s_a, s_b, s_c,s_d])
df_a.head()
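
Because the frame-inspection lookup depends on how the variables are bound in the calling scope, a less fragile option may be to keep the series in a dict from the start. The sketch below assumes a hypothetical dict called series_by_name; when pd.concat receives a mapping, it uses the dict keys as the column labels along axis 1:

```python
import numpy as np
import pandas as pd

dates = pd.date_range('2017-01-01', '2017-03-01')

# Hypothetical container: each key is the desired column name
series_by_name = {
    's_a': pd.Series(np.random.randn(len(dates)), index=dates),
    's_b': pd.Series(np.random.randn(len(dates)), index=dates),
    's_c': pd.Series(np.random.randn(len(dates)), index=dates),
}

# The dict keys are used as column labels; no name= or frame inspection needed
df_a = pd.concat(series_by_name, join='outer', axis=1)
print(df_a.head())
```

This trades the separate variables s_a, s_b, ... for a single container, which also makes it easy to add further series in a loop.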