Perhaps a duplicate, but I couldn't find relevant answers on SO. Please consider:
Create df:
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5],
    'country': ['USA', 'USA', 'Canada', 'Canada', 'Greenland'],
    'color': ['Blue', 'Blue', 'Red', 'Green', 'Purple'],
    'level': ['High', 'Low', 'Low', 'High', 'Low'],
    'random': ['df', 'adsf', 'wqer', 'qewr', 'ycxb'],
    'number': [1, 3, 5, 7, 1],
})
df
Out:
   id    country   color  level  random  number
0   1        USA    Blue   High      df       1
1   2        USA    Blue    Low    adsf       3
2   3     Canada     Red    Low    wqer       5
3   4     Canada   Green   High    qewr       7
4   5  Greenland  Purple    Low    ycxb       1
We can create a list of column names and use it to index the df. For example:
numeric = ['id', 'number']
df[numeric]
Out:
   id  number
0   1       1
1   2       3
2   3       5
3   4       7
4   5       1
But now let's say we create a second list of columns:
list2 = ['country', 'level']
And we try to retrieve variables using both lists:
df[[numeric, list2]]
This does not work.
My question: how do you use two (or more) lists together to index the df? I have hundreds of variables, so I cannot type them all out by hand in a single list, for example with df.loc. Is there a simple solution?
Desired output:
   id  number    country  level
0   1       1        USA   High
1   2       3        USA    Low
2   3       5     Canada    Low
3   4       7     Canada   High
4   5       1  Greenland    Low
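For reference, here is a minimal sketch of one way to get this output, assuming numeric and list2 are plain Python lists of column names: join them into a single flat list before indexing. The names groups, flat, result and result_many are hypothetical and only used for illustration.

```python
from itertools import chain

import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5],
    'country': ['USA', 'USA', 'Canada', 'Canada', 'Greenland'],
    'color': ['Blue', 'Blue', 'Red', 'Green', 'Purple'],
    'level': ['High', 'Low', 'Low', 'High', 'Low'],
    'random': ['df', 'adsf', 'wqer', 'qewr', 'ycxb'],
    'number': [1, 3, 5, 7, 1],
})

numeric = ['id', 'number']
list2 = ['country', 'level']

# Two lists: "+" concatenates them into one flat list of column names.
result = df[numeric + list2]

# Any number of lists: flatten a list of lists, then index once.
groups = [numeric, list2]  # hypothetical container for all column lists
flat = list(chain.from_iterable(groups))
result_many = df[flat]

print(result)
```

Both selections keep the columns in the order the lists are joined. If a column name appears in more than one list, it will appear more than once in the result, so duplicates may need to be removed from the flat list first.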
Any help much appreciated.