0

I have a dataframe of monthly returns for 1,000 stocks with ids as column names. monthly returns

I need to select only the columns that match the values in another dataframe which includes the ids I want.

permno list

I'm sure this is really quite simple, but I have been struggling for 2 days and if someone has an easy solution it would be so very much appreciated. Thank you.

Rob Roy
  • 1
  • 2
  • Please include [reproducible data](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example), not images of data – Jan Boyer Aug 30 '20 at 17:52
  • Thank you. New contributor and I appreciate the guidance on how to create better posts. – Rob Roy Aug 31 '20 at 14:06

2 Answers2

0

You could convert the single-column permno list dataframe (osr_curr_permnos) into a list, and then use that list to select certain columns from your main dataframe (all_rets).

To convert the osr_curr_permnos column "0" into a list, you can use .to_list()

Then, you can use that list to slice all_rets and .copy() to make a fresh copy of it into a new dataframe.

The python code might look something like:

keep = osr_curr_permnos['0'].to_list() 
selected_rets = all_rets[keep].copy() 

"keep" would be a list, and "selected_rets" would be your new dataframe.

If there's a chance that osr_curr_permnos would have duplicates, you'll want to filter those out:

keep = osr_curr_permnos['0'].drop_duplicates().to_list()
selected_rets = all_rets[keep].copy()
relizt
  • 375
  • 2
  • 11
0

As I expected, the answer was more simple than I was making it. Basically, I needed to take the integer values in my permnos list and recast those as strings.

osr_curr_permnos['0'] = osr_curr_permnos['0'].apply(str)
keep = osr_curr_permnos['0'].values

Then I can use that to select columns from my returns dataframe which had string values as column headers.

all_rets[keep]

It was all just a mismatch of int vs. string.

Rob Roy
  • 1
  • 2