I have a DataFrame like this:
import numpy as np
import pandas as pd

df = pd.DataFrame({'number': [['233182801104', '862824274124', '278711320172'], ['072287346459', '278058853506'], ['233182801104', '862824274124'], None, ['123412341234']], 'country': [None, 'France', 'USA', None, 'Germany'], 'c': np.random.randn(5), 'd': np.random.randn(5)})
Which looks like:
                                       number  country          c          d
0  [233182801104, 862824274124, 278711320172]     None   0.177375  -0.226086
1                [072287346459, 278058853506]   France  -0.134511   0.551962
2                [233182801104, 862824274124]      USA   0.490095   0.770992
3                                        None     None  -0.714745   0.807898
4                              [123412341234]  Germany   1.047809   0.523591
I want all unique combinations of the elements of the lists in the number column and the country. An additional problem is that the lists can vary in length, and both number and country can contain None. The desired output is:
code          country_final
233182801104  USA
862824274124  USA
278711320172  None
072287346459  France
278058853506  France
123412341234  Germany
As a first step, I would do this to get separate columns:
df['number'].apply(pd.Series)
After that, I am not sure whether I have to work with groupby or some kind of pivot table.
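For reference, here is a minimal sketch of one direction that seems to produce the desired output without the wide intermediate step. It rests on a few assumptions: it uses DataFrame.explode, which needs pandas 0.25 or newer; it assumes that when a number appears once with None and once with a real country, the real country should win (as in the expected output for 233182801104); and the names long_df and result are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'number': [['233182801104', '862824274124', '278711320172'],
               ['072287346459', '278058853506'],
               ['233182801104', '862824274124'],
               None,
               ['123412341234']],
    'country': [None, 'France', 'USA', None, 'Germany'],
    'c': np.random.randn(5),
    'd': np.random.randn(5),
})

# One row per list element; the row whose number is None stays a single missing value.
long_df = df[['number', 'country']].explode('number')

# Drop rows that had no number at all.
long_df = long_df.dropna(subset=['number'])

# For each number, keep the first non-missing country (assumption: a real
# country is preferred over None). Numbers seen only with None stay missing.
result = (
    long_df.groupby('number', sort=False)['country']
           .first()
           .reset_index()
           .rename(columns={'number': 'code', 'country': 'country_final'})
)

print(result)
```

Here groupby is enough and no pivot table is involved. If instead every distinct (number, country) pair should be kept, including the None pairs, then long_df.drop_duplicates() would be the variant to look at.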