
I have a dictionary of files stored as pandas DataFrame objects, and I access each one in a for loop to extract its 'Country' column. I want to extract each column into a list and then take the set of the entire list of lists. Here is the code and my predicament:

    country_setter = []
    for file in files_list:
        country_setter.append(all_comps[file]['Country'].tolist())

    uni_country_setter = ?

The resulting output is a list of lists, where each DataFrame's 'Country' column becomes one list inside the parent list. It looks like this:

[['France',
  'United States',
  'Poland',
  'Poland',
  'Poland',
  'Poland',
  'Hungary',
  'Poland',
  'France',
  'United Kingdom',
    ....
  'Namibia',
  'China',
  'China',
  'Ireland'],
 ['Netherlands',
  'Canada',
  'United States',
  'Canada',
  'Canada',
  'United States',
  'Sweden',
  'Sweden',
  'United Kingdom',
   ....
  'Ireland',
  'Netherlands',
  'Netherlands',
  'France',
  'Hong Kong',
  'France',
  'France',
  'United States',
  'France',
  'United States']]

It's a list with 40 individual lists within it. Calling set(country_setter[0]) works fine for getting the unique values of the first list, but I need the unique values across all of the files combined.

Let me know if any of you can help. I've pored through Stack Overflow and only found one slightly similar question, but their goal was to maintain the list structure in the unique extraction, and they used itertools. I want the unique individual values across all of the lists here.

Thank you in advance!

fattmagan

1 Answer


I think you need to flatten the list of lists and then build the unique values with set:

    uni_country_setter = list(set([item for sublist in country_setter for item in sublist]))
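In case the double `for` is hard to read, here is a rough sketch of the same logic written as ordinary nested loops. The two short sample lists are made up for illustration and only stand in for the real 40-list `country_setter`.

```python
# Hypothetical sample data standing in for the real country_setter.
country_setter = [
    ['France', 'Poland', 'Poland'],
    ['Canada', 'France', 'Sweden'],
]

# The comprehension reads left to right like nested for loops:
# the outer loop picks each sublist, the inner loop picks each item in it.
flat = []
for sublist in country_setter:
    for item in sublist:
        flat.append(item)

# set() drops the duplicates; list() turns the result back into a list.
uni_country_setter = list(set(flat))

# Sets are unordered, so sort if a stable order matters.
print(sorted(uni_country_setter))
```

So the first `for` in the comprehension is the outer loop and the second `for` is the inner one.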

EDIT:

The first loop is not necessary; you can build the unique list directly:

    uni_country_setter = list(set([item for file in files_list
                                   for item in all_comps[file]['Country'].tolist()]))
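As a possible alternative to the list comprehension, the standard library can do the flattening too. This is only a sketch on made-up sample lists (not the real data), showing `itertools.chain.from_iterable` and `set().union`, which should both give the same unique values.

```python
from itertools import chain

# Hypothetical sample data standing in for the real country_setter.
country_setter = [
    ['France', 'Poland', 'Poland'],
    ['Canada', 'France', 'Sweden'],
]

# chain.from_iterable walks every sublist in turn without building
# an intermediate flat list; set() then keeps only the unique values.
unique_chain = set(chain.from_iterable(country_setter))

# set().union accepts any number of iterables and merges them all.
unique_union = set().union(*country_setter)

print(sorted(unique_chain))
print(unique_chain == unique_union)
```

Either form avoids the temporary list that the comprehension inside `set([...])` creates, which may matter only for very large inputs.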
jezrael
  • Thank you! I don't think I could have figured that out on my own. Can you explain the logic behind that double "for" call? Are you defining each sublist and then iterating through them? – fattmagan Oct 14 '17 at 19:30
  • Maybe a better explanation of flattening is [here](https://stackoverflow.com/questions/952914/making-a-flat-list-out-of-list-of-lists-in-python). – jezrael Oct 14 '17 at 19:31