I have a dictionary like this:

> { "Country1": {"time": [1/22/20, 1/23/20...], "cases":[0,10,20,...]},
> "Country2": {"time": [1/22/20, 1/23/20...], "cases":[0,10,20,...]},
> .... }

I want to drop the dates later than a given date, along with their respective cases. I've tried the code below, but it fails with IndexError: list index out of range. How would you do it?

for i in (range(len(Country_Region))):
    for j in (range(len(Countries_Dict[i][Country_Region[i]]['time']))):
        if datetime.strptime(Countries_Dict[i][Country_Region[i]]['time'][j], '%m/%d/%y')  > datetime.strptime(date, '%m-%d-%y'):
            Countries_Dict[i][Country_Region[i]]['time'].pop(j)
            Countries_Dict[i][Country_Region[i]]['cases'].pop(j)

Dates are in string format, and the desired output is the same dictionary as before, without the dates later than the given date and their respective cases.

mikrim
  • 3
    are the dates in `string` format? – Linux Geek Apr 16 '21 at 18:57
  • 1
    Those times will become floats, for example `1/22/20` -> `0.0022727272727272726`. They're supposed to be strings, aren't they? – wjandrea Apr 16 '21 at 19:10
  • Related: [Strange result when removing item from a list while iterating over it](https://stackoverflow.com/q/6260089/4518341) – wjandrea Apr 16 '21 at 19:12
  • 1
    It would help if you provided all the details, meaning some complete, valid example input, and the desired output, sort of like a [mre]. BTW, welcome to SO! Check out the [tour], and [ask] if you want tips. – wjandrea Apr 16 '21 at 19:13
  • Yes, the dates are in string format – mikrim Apr 17 '21 at 12:17

1 Answer

Your IndexError may be caused by inconsistent data, where time and cases have different lengths.
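The related question linked in the comments suggests another cause worth checking: calling pop(j) inside a loop over range(len(...)) shrinks the list while the indices keep counting up to the original length. Here is a minimal sketch of that effect; the function name and the sample values are made up for illustration and are not from the question.

```python
def pop_while_indexing(items):
    """Pop every value greater than 2 while looping over the original indices.

    Returns the IndexError message if one is raised, else the resulting list.
    (Hypothetical helper, only meant to reproduce the error.)
    """
    items = list(items)  # work on a copy so the caller's list is untouched
    try:
        for j in range(len(items)):  # the range is fixed at the original length
            if items[j] > 2:
                items.pop(j)  # the list gets shorter, the range does not
    except IndexError as exc:
        return str(exc)
    return items


print(pop_while_indexing([1, 2, 3, 4]))
```

With [1, 2, 3, 4], the pop at index 2 leaves three items, so the next lookup at index 3 fails. Building new lists, as done below, avoids the problem.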

You can use zip to pair up the time and cases lists; it stops at the end of the shorter one.

from datetime import datetime

data = {
    'Country1': {
        'time': ['1/22/20', '1/23/20', '1/24/20', '1/25/20'],
        'cases': [0, 10, 20, 30, 40]   # Inconsistent data
    },
    'Country2': {
        'time': ['1/22/20', '1/23/20', '1/24/20', '1/25/20'],
        'cases': [0, 10, 20, 30]
    }
}

threshold_date = '1/23/20'


def parse_date(date):
    return datetime.strptime(date, '%m/%d/%y')


for country, country_data in data.items():
    pre_threshold = [
        (time, case)
        for time, case in zip(country_data['time'], country_data['cases'])
        if parse_date(time) <= parse_date(threshold_date)
    ]
    country_data['time'] = [t for t, c in pre_threshold]
    country_data['cases'] = [c for t, c in pre_threshold]

This results in:

{'Country1': {'time': ['1/22/20', '1/23/20'], 'cases': [0, 10]},
 'Country2': {'time': ['1/22/20', '1/23/20'], 'cases': [0, 10]}}

Notice that Country1 had 4 entries in time but 5 in cases, and it didn't matter: zip simply dropped the unmatched extra case.

This creates new lists instead of modifying the existing ones. The code could be made more efficient, but I've kept it this way for readability. Optimize it if your use case needs it.
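One possible tightening, as a rough sketch rather than a definitive version: parse the threshold once instead of once per entry, and use slice assignment so the existing list objects are updated in place. The function name filter_until and the fmt parameter are my own choices for illustration.

```python
from datetime import datetime


def filter_until(data, threshold, fmt='%m/%d/%y'):
    """Keep only the (time, case) pairs dated on or before threshold."""
    cutoff = datetime.strptime(threshold, fmt)  # parsed once, not per entry
    for country_data in data.values():
        kept = [
            (t, c)
            for t, c in zip(country_data['time'], country_data['cases'])
            if datetime.strptime(t, fmt) <= cutoff
        ]
        # Slice assignment replaces the contents of the existing lists,
        # so other references to them see the filtered data too.
        country_data['time'][:] = [t for t, _ in kept]
        country_data['cases'][:] = [c for _, c in kept]
    return data


sample = {
    'Country1': {
        'time': ['1/22/20', '1/23/20', '1/24/20'],
        'cases': [0, 10, 20],
    },
}
filter_until(sample, '1/23/20')
print(sample)
```

This still builds a temporary list per country, so the gain is mostly in avoiding repeated parsing of the threshold.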

Hrishikesh