
I want to create a new list that combines every element of an existing list with every value in a specific column of a DataFrame. See the example below.

The list I already have:

variables = ['Price_x_x', 'Mkt Cap_x', '24h Volume_x','Contributors_x', 'Commits in Last 4 Weeks_x','Reddit Subscribers_x', 'FB Likes_x','Market Cap Dominance']

The column:

0        BTC
1        ETH
2        BNB
3       USDT
4        DOT
       ...  
995      CPC
996      BLK
997    ROUTE
998      CNN
999     BULL
Name: COIN, Length: 1000, dtype: object

The desired output:

['BTC_Price_x_x', 'BTC_Mkt Cap_x','BTC_24h Volume_x',...,'BULL_FB Likes_x','BULL_Market Cap Dominance']

2 Answers


IIUC, just loop over them:

result = []
for ticker in df['desired_col'].unique():
    for variable in variables:
        result.append(f"{ticker}_{variable}")
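
The nested loop above can also be written with `itertools.product`. This is only a sketch under stated assumptions: the shortened `variables` list and the `tickers` list are toy stand-ins (in the question the tickers would come from the `COIN` column, e.g. `df['COIN'].tolist()`), and `dict.fromkeys` is used here in place of `unique()` to drop repeats while keeping first-seen order.

```python
from itertools import product

# Toy stand-ins for the real data (hypothetical, shortened for illustration)
variables = ['Price_x_x', 'Mkt Cap_x']
tickers = ['BTC', 'ETH', 'BTC']  # assumed to come from the COIN column

# Drop repeated tickers while preserving their first-seen order
unique_tickers = list(dict.fromkeys(tickers))

# product() yields (ticker, variable) pairs with the ticker varying slowest
result = [f"{ticker}_{variable}" for ticker, variable in product(unique_tickers, variables)]
print(result)
# ['BTC_Price_x_x', 'BTC_Mkt Cap_x', 'ETH_Price_x_x', 'ETH_Mkt Cap_x']
```

The output length is the number of unique tickers times the number of variables, so cost grows with the count of distinct labels rather than with the number of rows.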
fsl
  • This will work fine for small datasets, but won't scale. See the [second answer here](https://stackoverflow.com/questions/16476924/how-to-iterate-over-rows-in-a-dataframe-in-pandas) – Umar.H Mar 13 '21 at 21:19
  • I have seen your answer and am going to try it. My question is: why won't this work on large datasets? – Manolo Dominguez Becerra Mar 13 '21 at 21:56
  • @JorgeAlbertoPalacios It will work, but you're working against the pandas API. Use the core methods first; if there is no other solution than resorting to looping in Python then do that, but that's rarely the case. Read the linked post, it has very good insight into where and when looping should be used. – Umar.H Mar 13 '21 at 22:02
  • Right, @Manakin. He never did ask for efficiency though, and it's not really a function of the data set but of the # of unique labels. – fsl Mar 13 '21 at 22:05
  • @FelipeLanza I agree, but we should be pushing people towards best practice. You're still working with a `pandas.DataFrame` object, so why not take advantage of the API? – Umar.H Mar 13 '21 at 22:07
  • Agreed. Not saying your answer isn't more correct btw, just don't quite think it is more readable (given the simplicity of the requested operation I mean). – fsl Mar 13 '21 at 22:14

You can take the Cartesian product of your list and the DataFrame column.

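import pandas as pd
# Cross join on a constant key; assumes df contains only the string column of tickers
product_list = pd.merge(df.assign(var=1),pd.DataFrame(variables).assign(var=1),
         how='inner').drop(columns='var').agg('_'.join,axis=1).tolist()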

['BTC_Price_x_x',
 'BTC_Mkt Cap_x',
 'BTC_24h Volume_x',
 'BTC_Contributors_x',
 'BTC_Commits in Last 4 Weeks_x',
 'BTC_Reddit Subscribers_x',
 'BTC_FB Likes_x',
 'BTC_Market Cap Dominance',
 'ETH_Price_x_x',
 'ETH_Mkt Cap_x',
 'ETH_24h Volume_x',
 'ETH_Contributors_x',
 'ETH_Commits in Last 4 Weeks_x',
 'ETH_Reddit Subscribers_x',
 'ETH_FB Likes_x'...]
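
On pandas 1.2 or later, the same Cartesian product can be sketched with `merge(..., how='cross')`, which avoids the temporary key column. The data below is a toy stand-in for the real frame, and the column name `variable` is a hypothetical label chosen for illustration.

```python
import pandas as pd

# Toy stand-ins for the real data (hypothetical, shortened for illustration)
variables = ['Price_x_x', 'Mkt Cap_x']
df = pd.DataFrame({'COIN': ['BTC', 'ETH']})

# how='cross' pairs every row of the left frame with every row of the right frame
pairs = df[['COIN']].merge(pd.DataFrame({'variable': variables}), how='cross')

# Element-wise string concatenation of the two columns
product_list = (pairs['COIN'] + '_' + pairs['variable']).tolist()
print(product_list)
# ['BTC_Price_x_x', 'BTC_Mkt Cap_x', 'ETH_Price_x_x', 'ETH_Mkt Cap_x']
```

If the ticker column can contain repeats, calling `drop_duplicates()` on `df[['COIN']]` before the merge would keep one block of names per ticker.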
Umar.H