Perhaps a duplicate, but I couldn't find relevant answers on SO. Please consider:

Create df:

import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5],
    'country': ['USA', 'USA', 'Canada', 'Canada', 'Greenland'],
    'color': ['Blue', 'Blue', 'Red', 'Green', 'Purple'],
    'level': ['High', 'Low', 'Low', 'High', 'Low'],
    'random': ['df', 'adsf', 'wqer', 'qewr', 'ycxb'],
    'number': [1, 3, 5, 7, 1]})
df

Out:

   id   country    color  level random  number
0   1   USA        Blue   High  df      1
1   2   USA        Blue   Low   adsf    3
2   3   Canada     Red    Low   wqer    5
3   4   Canada     Green  High  qewr    7
4   5   Greenland  Purple Low   ycxb    1

We can create a list of columns and use it to index our df. For example:

numeric = ['id', 'number']
df[numeric]

Out:

    id  number
0   1   1
1   2   3
2   3   5
3   4   7
4   5   1

But now let's say we create a second list of columns:

list2 = ['country', 'level']

And we try to retrieve variables using both lists:

df[[numeric, list2]]

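This does not work: pandas raises an error, because the indexer is now a list of lists rather than a flat list of column labels.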

My question: how do you use two (or more) lists to index the df? I have hundreds of variables, so I cannot write them all out by hand as a single list, for example with the traditional df.loc method. Is there a simple solution?

Desired output:

    id  number  country    level
0   1   1       USA        High
1   2   3       USA        Low
2   3   5       Canada     Low
3   4   7       Canada     High
4   5   1       Greenland  Low

Any help much appreciated.
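
One possible approach, offered as a minimal sketch rather than a definitive answer: assuming both `numeric` and `list2` are plain Python lists of column labels, they can be joined into one flat list before indexing. The names `cols`, `groups`, `cols_many`, `result` and `result_many` are made up for this illustration.

```python
from itertools import chain

import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5],
    'country': ['USA', 'USA', 'Canada', 'Canada', 'Greenland'],
    'color': ['Blue', 'Blue', 'Red', 'Green', 'Purple'],
    'level': ['High', 'Low', 'Low', 'High', 'Low'],
    'random': ['df', 'adsf', 'wqer', 'qewr', 'ycxb'],
    'number': [1, 3, 5, 7, 1]})

numeric = ['id', 'number']
list2 = ['country', 'level']

# Two lists: `+` concatenates them into one flat list of labels.
cols = numeric + list2
result = df[cols]

# Many lists (hypothetical `groups` container): flatten them in one pass.
groups = [numeric, list2]
cols_many = list(chain.from_iterable(groups))
result_many = df[cols_many]

print(result)
```

The columns come back in the order in which the lists are joined, so `numeric + list2` should give `id`, `number`, `country`, `level`. If the same label appears in more than one list, the column will be selected more than once, so the combined list may need deduplicating first.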

johnjohn