I have many lists exactly like the ones below, provided by a weather station.

How can I "merge" the two daily observations into a single record? (The values present in the first observation of a day are never present in the second one.)

['82294', '04/03/2002', '0000', '', '30.9', '', '', '', '26.1', '93', '1.554', '']
['82294', '04/03/2002', '1200', '24', '', '22', '', '', '', '', '', '']
['82294', '05/03/2002', '0000', '', '29.9', '', '', '', '25.62', '92.5', '0.863333', '']
['82294', '05/03/2002', '1200', '11', '', '23.2', '', '', '', '', '', '']
['82294', '06/03/2002', '0000', '', '31.6', '', '', '', '27.12', '87.5', '1.381333', '']
['82294', '06/03/2002', '1200', '0.2', '', '22.6', '', '', '', '', '', '']
['82294', '07/03/2002', '0000', '', '32.2', '', '', '', '27.6', '90.75', '1.899333', '']
['82294', '07/03/2002', '1200', '2', '', '24.6', '', '', '', '', '', '']
['82294', '08/03/2002', '0000', '', '29.3', '', '', '', '25.66', '95.25', '1.036', '']
['82294', '08/03/2002', '1200', '21', '', '24.4', '', '', '', '', '', '']
['82294', '09/03/2002', '0000', '', '31.5', '', '', '', '26.26', '95.75', '1.899333', '']
['82294', '09/03/2002', '1200', '23', '', '22.8', '', '', '', '', '', '']
['82294', '10/03/2002', '0000', '', '31.7', '', '', '', '26.94', '90.5', '2.072', '']
relima

3 Answers

You can iterate over the records pairwise to group each day's two observations, then zip() each pair item by item and use or to pick whichever value is non-empty:

[[x or y for x, y in zip(item1, item2)] 
 for item1, item2 in zip(data[0::2], data[1::2])]

where data is your input list of lists.

Produces:

[
    ['82294', '04/03/2002', '0000', '24', '30.9', '22', '', '', '26.1', '93', '1.554', ''], 
    ['82294', '05/03/2002', '0000', '11', '29.9', '23.2', '', '', '25.62', '92.5', '0.863333', ''], 
    ['82294', '06/03/2002', '0000', '0.2', '31.6', '22.6', '', '', '27.12', '87.5', '1.381333', ''], 
    ['82294', '07/03/2002', '0000', '2', '32.2', '24.6', '', '', '27.6', '90.75', '1.899333', ''], 
    ['82294', '08/03/2002', '0000', '21', '29.3', '24.4', '', '', '25.66', '95.25', '1.036', ''], 
    ['82294', '09/03/2002', '0000', '23', '31.5', '22.8', '', '', '26.26', '95.75', '1.899333', '']
]

You may also want to handle the time field more carefully: as written, the merged record always keeps 0000, because it is the first non-empty value of the pair.
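The slicing above relies on the records strictly alternating between 0000 and 1200, and zip() silently drops a trailing record that has no partner (the 10/03/2002 observation in the question's data). A minimal sketch of a variant that groups by station and date instead is shown here; the function name merge_daily and the assumption that the first two fields are the station id and the date are mine, not part of the original data description.

```python
def merge_daily(records):
    """Merge records sharing (station id, date), keeping the first non-empty value per field."""
    merged = {}  # plain dicts keep insertion order in Python 3.7+
    for rec in records:
        key = (rec[0], rec[1])  # assumed layout: station id first, date second
        if key in merged:
            # Fill blanks in the stored record with values from the new one.
            merged[key] = [old or new for old, new in zip(merged[key], rec)]
        else:
            merged[key] = list(rec)
    return list(merged.values())


data = [
    ['82294', '09/03/2002', '0000', '', '31.5', '', '', '', '26.26', '95.75', '1.899333', ''],
    ['82294', '09/03/2002', '1200', '23', '', '22.8', '', '', '', '', '', ''],
    ['82294', '10/03/2002', '0000', '', '31.7', '', '', '', '26.94', '90.5', '2.072', ''],
]
result = merge_daily(data)
```

With this input, result holds one merged record for 09/03/2002 and the unpaired 10/03/2002 record unchanged. The time field still keeps whichever value was seen first for that day.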

alecxe

You can also use pandas and its groupby() + apply():

import pandas as pd

df = pd.DataFrame(data, columns=['id', 'date', 'time', 'value1', 'value2', 'value3', 'value4', 'value5', 'value6', 'value7', 'value8', 'value9'])
df = df.groupby('date').apply(lambda x: x.max())

print(df.values.tolist())

Prints:

[
    ['82294', '04/03/2002', '1200', '24', '30.9', '22', '', '', '26.1', '93', '1.554', ''], 
    ['82294', '05/03/2002', '1200', '11', '29.9', '23.2', '', '', '25.62', '92.5', '0.863333', ''], 
    ['82294', '06/03/2002', '1200', '0.2', '31.6', '22.6', '', '', '27.12', '87.5', '1.381333', ''], 
    ['82294', '07/03/2002', '1200', '2', '32.2', '24.6', '', '', '27.6', '90.75', '1.899333', ''], 
    ['82294', '08/03/2002', '1200', '21', '29.3', '24.4', '', '', '25.66', '95.25', '1.036', ''], 
    ['82294', '09/03/2002', '1200', '23', '31.5', '22.8', '', '', '26.26', '95.75', '1.899333', ''], 
    ['82294', '10/03/2002', '0000', '', '31.7', '', '', '', '26.94', '90.5', '2.072', '']
]

Here, max() is applied column by column within each date group, which merges the grouped rows: the maximum of an empty string and a non-empty string is always the non-empty string. (For the same reason, the time column ends up as 1200 for days with both observations.) That said, I feel there should be a better, more appropriate merging function.
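One possible candidate for such a merging function, offered as a sketch rather than a definitive answer, is to turn the empty strings into missing values and let groupby().first() pick the first non-missing value in each column. The column names below are the same placeholder names used above, and grouping on both id and date is an assumption about how records should be matched.

```python
import pandas as pd

columns = ['id', 'date', 'time'] + [f'value{i}' for i in range(1, 10)]
data = [
    ['82294', '04/03/2002', '0000', '', '30.9', '', '', '', '26.1', '93', '1.554', ''],
    ['82294', '04/03/2002', '1200', '24', '', '22', '', '', '', '', '', ''],
    ['82294', '10/03/2002', '0000', '', '31.7', '', '', '', '26.94', '90.5', '2.072', ''],
]
df = pd.DataFrame(data, columns=columns)

merged = (
    df.mask(df == '')                                    # empty strings become missing values
      .groupby(['id', 'date'], sort=False, as_index=False)
      .first()                                           # first non-missing value per column
      .fillna('')                                        # restore '' where nothing was recorded
)
rows = merged.values.tolist()
```

Unlike max(), this does not depend on string ordering, so it keeps the earlier time (0000) and would not silently prefer the lexicographically larger value if both observations ever filled the same field.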

alecxe

Maybe something like this:

list_1 = ['82294', '04/03/2002', '0000', '', '30.9', '', '', '', '26.1', '93', '1.554', '']
list_2 = ['82294', '04/03/2002', '1200', '24', '', '22', '', '', '', '', '', '']
# Note: set() discards both the field order and any repeated values.
merged_list = list(set(list_1 + list_2))

Update

# Non-empty values of the first record, followed by those of the second.
merged_list = [x for x in list_1 if x]
merged_list.extend(x for x in list_2 if x)
DimKoim