1

I have a dictionary that contains values of dates 6 month period between 2018-01-01 and 2018-06-01. They are stored in datetime format:

{datetime.datetime(2018, 3, 27, 0, 0): 1082721.139,
 datetime.datetime(2018, 1, 31, 0, 0): 819801.852,
 datetime.datetime(2018, 1, 10, 0, 0): 3369110.346,
 datetime.datetime(2018, 3, 29, 0, 0): 850877.5490000001,
 datetime.datetime(2018, 5, 31, 0, 0): 1441286.6970000002,
 datetime.datetime(2018, 2, 21, 0, 0): 1138380.7969999998,
 datetime.datetime(2018, 5, 23, 0, 0): 626211.4509999999,
 datetime.datetime(2018, 4, 30, 0, 0): 687981.8659999999,
 datetime.datetime(2018, 3, 8, 0, 0): 543523.898,
 datetime.datetime(2018, 4, 26, 0, 0): 527085.802,
 datetime.datetime(2018, 1, 4, 0, 0): 4374612.232,
 datetime.datetime(2018, 2, 7, 0, 0): 726471.939,
 datetime.datetime(2018, 4, 19, 0, 0): 619662.2050000001,
 datetime.datetime(2018, 2, 27, 0, 0): 642528.55,
 datetime.datetime(2018, 1, 8, 0, 0): 643899.718,
 datetime.datetime(2018, 5, 2, 0, 0): 714148.8159999999,
 datetime.datetime(2018, 4, 15, 0, 0): 1001.46,
 datetime.datetime(2018, 3, 17, 0, 0): 14950.402,
 datetime.datetime(2018, 3, 24, 0, 0): 8526.3,
 datetime.datetime(2018, 3, 4, 0, 0): 18.18,
 datetime.datetime(2018, 5, 26, 0, 0): 90.0,
 datetime.datetime(2018, 5, 5, 0, 0): 88070.0,
 datetime.datetime(2018, 5, 6, 0, 0): 63936.0}

I want to write a code that would check for missing days, and if there are any, add them to the dictionary give them a value of 0. What's the best way to do that?

I couldn't upload the entire dictionary as StackOverflow doesn't let me upload such a long code

  • 1
    We're happy to help you with homework questions, but the rules say you have to post your own attempt first as part of the question. To get you started, you can call `.weekday()` to convert any datetime to a day of the week, and use `set()` to remove duplicates – jnnnnn Jan 10 '21 at 03:19
  • @jnnnnn why do you think this is homework and where can I read those rules? It seems like you got some decent help with [this question](https://stackoverflow.com/questions/61221485/how-to-convert-json-rows-to-columns) without providing *your* attempt. – Mark Jan 10 '21 at 03:23
  • When you are writing a question, it asks you to `Describe what you’ve tried`. I was lucky with that other one but it wasn't a homework question either. – jnnnnn Jan 10 '21 at 03:26
  • if you like my answer please accept, appreciate thx – MaxYarmolinsky Jan 10 '21 at 03:33
  • If the final intention is to create a dictionary between `2018-01-01` and `2018-06-01`, then the simple thing is to create a generator going from first day thru last day, then add to dictionary with setdefault. That should do the trick. – Joe Ferndz Jan 10 '21 at 04:34

3 Answers3

3

Let's call your initial dictionary, myinitd

from datetime import datetime, timedelta

sdate = datetime(2018, 1, 1)   # start date
edate = datetime(2018, 6, 1)   # end date

delta = edate - sdate       # as timedelta

l = []
for i in range(delta.days + 1):
    day = sdate + timedelta(days=i)
    l.append(day)

for dt in l:
    if dt not in myinitd:
        myinitd[dt] = 0

please note that I got the code to produce the full range of dates from here: Print all day-dates between two dates

MaxYarmolinsky
  • 1,117
  • 1
  • 10
  • 16
2

Because datetime objects represent a sequential collection (each day is the previous day +1), there are strong benefits to using a serialized approach here. (This is especially the case where your dictionary or desired datetime range is greater than 1M observations -- list lookups will quickly add up and present terrible performance.)

import datetime
import numpy as np
import pandas as pd

my_dict = {
 datetime.datetime(2018, 3, 27, 0, 0): 1082721.139,
 datetime.datetime(2018, 1, 31, 0, 0): 819801.852,
 datetime.datetime(2018, 1, 10, 0, 0): 3369110.346,
 datetime.datetime(2018, 3, 29, 0, 0): 850877.5490000001,
 datetime.datetime(2018, 5, 31, 0, 0): 1441286.6970000002,
 datetime.datetime(2018, 2, 21, 0, 0): 1138380.7969999998,
 datetime.datetime(2018, 5, 23, 0, 0): 626211.4509999999,
 datetime.datetime(2018, 4, 30, 0, 0): 687981.8659999999,
 datetime.datetime(2018, 3, 8, 0, 0): 543523.898,
 datetime.datetime(2018, 4, 26, 0, 0): 527085.802,
 datetime.datetime(2018, 1, 4, 0, 0): 4374612.232,
 datetime.datetime(2018, 2, 7, 0, 0): 726471.939,
 datetime.datetime(2018, 4, 19, 0, 0): 619662.2050000001,
 datetime.datetime(2018, 2, 27, 0, 0): 642528.55,
 datetime.datetime(2018, 1, 8, 0, 0): 643899.718,
 datetime.datetime(2018, 5, 2, 0, 0): 714148.8159999999,
 datetime.datetime(2018, 4, 15, 0, 0): 1001.46,
 datetime.datetime(2018, 3, 17, 0, 0): 14950.402,
 datetime.datetime(2018, 3, 24, 0, 0): 8526.3,
 datetime.datetime(2018, 3, 4, 0, 0): 18.18,
 datetime.datetime(2018, 5, 26, 0, 0): 90.0,
 datetime.datetime(2018, 5, 5, 0, 0): 88070.0,
 datetime.datetime(2018, 5, 6, 0, 0): 63936.0
}

 
# Define your range
start_dt = datetime.datetime(year=2018, month=1, day=1)
end_dt = datetime.datetime(year=2018, month=6, day=1)

# Create a series of timestamps with values of 0
desired_dt = pd.date_range(start=start_dt, end=end_dt, freq="D")
desired_dt = pd.Series(index=pd.Series(desired_dt), data=np.zeros(len(desired_dt)))

# Create a series from your existing values
actual_dt = pd.Series(my_dict)

# Combine the two – similar to a left join
res = actual_dt.combine_first(desired_dt)

# Convert the index from datetime index to datetime values
res.index = res.index.to_pydatetime()

# Convert to a dictionary
res_dict = res.to_dict()

# {Timestamp('2018-01-01 00:00:00'): 0.0,
#  Timestamp('2018-01-02 00:00:00'): 0.0,
#  Timestamp('2018-01-03 00:00:00'): 0.0,
#  Timestamp('2018-01-04 00:00:00'): 4374612.232,
#  Timestamp('2018-01-05 00:00:00'): 0.0,
#  Timestamp('2018-01-06 00:00:00'): 0.0,
#  ...
#  }

In sum:

  • convert your observations to a series (a serialized array).
  • create an array with blank values for your desired date range.
  • combine (merge) the two arrays, keeping the left (non zero) values.
  • convert result back to a dictionary.
Yaakov Bressler
  • 9,056
  • 2
  • 45
  • 69
2

You can use the setdefault option in dictionary to load all the data between 1/1/2018 and 6/1/2018.

Python dictionary method setdefault() is similar to get(), but will set dict[key]=default if key is not already in dict. The command is as follows:

dict.setdefault(key, default=None)

key − This is the key to be searched.

default − This is the Value to be returned in case key is not found.

You can get more information about setdefault here.

To do this using setdefault, the code is:

date_val = datetime.datetime(2018,1,1)
while date_val <= datetime.datetime(2018,6,1):
    d.setdefault(date_val,0)
    date_val += timedelta(days=1)

Explanation:

Set the date_val to start date (2018, 1, 1). Then iterate using the while loop until you reach 2018, 6, 1 (your end date).

d.setdefault(date_val,0) will look for key date_val. If found, it will do nothing. If not found, then it will store the key into dictionary d with a value of 0.

date_val += timedelta(days=1) will increment the date by 1 day. timedelta(days=1) can be used to increment by days. In our case we want to increment by each day until we reach 6/1/2018. The while loop takes care of the rest by checking if we reached the limit. If not, it keeps adding new date as key with 0 as value.

The full code is as shown below:

from datetime import timedelta
import datetime

d = {datetime.datetime(2018, 3, 27, 0, 0): 1082721.139,
 datetime.datetime(2018, 1, 31, 0, 0): 819801.852,
 datetime.datetime(2018, 1, 10, 0, 0): 3369110.346,
 datetime.datetime(2018, 3, 29, 0, 0): 850877.5490000001,
 datetime.datetime(2018, 5, 31, 0, 0): 1441286.6970000002,
 datetime.datetime(2018, 2, 21, 0, 0): 1138380.7969999998,
 datetime.datetime(2018, 5, 23, 0, 0): 626211.4509999999,
 datetime.datetime(2018, 4, 30, 0, 0): 687981.8659999999,
 datetime.datetime(2018, 3, 8, 0, 0): 543523.898,
 datetime.datetime(2018, 4, 26, 0, 0): 527085.802,
 datetime.datetime(2018, 1, 4, 0, 0): 4374612.232,
 datetime.datetime(2018, 2, 7, 0, 0): 726471.939,
 datetime.datetime(2018, 4, 19, 0, 0): 619662.2050000001,
 datetime.datetime(2018, 2, 27, 0, 0): 642528.55,
 datetime.datetime(2018, 1, 8, 0, 0): 643899.718,
 datetime.datetime(2018, 5, 2, 0, 0): 714148.8159999999,
 datetime.datetime(2018, 4, 15, 0, 0): 1001.46,
 datetime.datetime(2018, 3, 17, 0, 0): 14950.402,
 datetime.datetime(2018, 3, 24, 0, 0): 8526.3,
 datetime.datetime(2018, 3, 4, 0, 0): 18.18,
 datetime.datetime(2018, 5, 26, 0, 0): 90.0,
 datetime.datetime(2018, 5, 5, 0, 0): 88070.0,
 datetime.datetime(2018, 5, 6, 0, 0): 63936.0}

date_val = datetime.datetime(2018,1,1)
while date_val <= datetime.datetime(2018,6,1):
    d.setdefault(date_val,0)
    date_val += timedelta(days=1)

print (d)
    

The output of this will be:

{datetime.datetime(2018, 3, 27, 0, 0): 1082721.139,
 datetime.datetime(2018, 1, 31, 0, 0): 819801.852,
 datetime.datetime(2018, 1, 10, 0, 0): 3369110.346,
 datetime.datetime(2018, 3, 29, 0, 0): 850877.5490000001, 
 datetime.datetime(2018, 5, 31, 0, 0): 1441286.6970000002, 
 datetime.datetime(2018, 2, 21, 0, 0): 1138380.7969999998, 
 datetime.datetime(2018, 5, 23, 0, 0): 626211.4509999999, 
 datetime.datetime(2018, 4, 30, 0, 0): 687981.8659999999, 
 datetime.datetime(2018, 3, 8, 0, 0): 543523.898, 
 datetime.datetime(2018, 4, 26, 0, 0): 527085.802, 
 datetime.datetime(2018, 1, 4, 0, 0): 4374612.232, 
 datetime.datetime(2018, 2, 7, 0, 0): 726471.939, 
 datetime.datetime(2018, 4, 19, 0, 0): 619662.2050000001, 
 datetime.datetime(2018, 2, 27, 0, 0): 642528.55, 
 datetime.datetime(2018, 1, 8, 0, 0): 643899.718, 
 datetime.datetime(2018, 5, 2, 0, 0): 714148.8159999999, 
 datetime.datetime(2018, 4, 15, 0, 0): 1001.46, 
 datetime.datetime(2018, 3, 17, 0, 0): 14950.402, 
 datetime.datetime(2018, 3, 24, 0, 0): 8526.3, 
 datetime.datetime(2018, 3, 4, 0, 0): 18.18, 
 datetime.datetime(2018, 5, 26, 0, 0): 90.0, 
 datetime.datetime(2018, 5, 5, 0, 0): 88070.0, 
 datetime.datetime(2018, 5, 6, 0, 0): 63936.0, 
 datetime.datetime(2018, 1, 1, 0, 0): 0, 
 datetime.datetime(2018, 1, 2, 0, 0): 0, 
 datetime.datetime(2018, 1, 3, 0, 0): 0, 
 datetime.datetime(2018, 1, 5, 0, 0): 0, 
 datetime.datetime(2018, 1, 6, 0, 0): 0, 
 datetime.datetime(2018, 1, 7, 0, 0): 0, 
 datetime.datetime(2018, 1, 9, 0, 0): 0, 
 datetime.datetime(2018, 1, 11, 0, 0): 0, 
 datetime.datetime(2018, 1, 12, 0, 0): 0, 
 datetime.datetime(2018, 1, 13, 0, 0): 0, 
 datetime.datetime(2018, 1, 14, 0, 0): 0, 
 datetime.datetime(2018, 1, 15, 0, 0): 0, 
 datetime.datetime(2018, 1, 16, 0, 0): 0, 
 datetime.datetime(2018, 1, 17, 0, 0): 0, 
 datetime.datetime(2018, 1, 18, 0, 0): 0, 
 datetime.datetime(2018, 1, 19, 0, 0): 0, datetime.datetime(2018, 1, 20, 0, 0): 0, datetime.datetime(2018, 1, 21, 0, 0): 0, datetime.datetime(2018, 1, 22, 0, 0): 0, datetime.datetime(2018, 1, 23, 0, 0): 0, datetime.datetime(2018, 1, 24, 0, 0): 0, datetime.datetime(2018, 1, 25, 0, 0): 0, datetime.datetime(2018, 1, 26, 0, 0): 0, datetime.datetime(2018, 1, 27, 0, 0): 0, datetime.datetime(2018, 1, 28, 0, 0): 0, datetime.datetime(2018, 1, 29, 0, 0): 0, datetime.datetime(2018, 1, 30, 0, 0): 0, datetime.datetime(2018, 2, 1, 0, 0): 0, datetime.datetime(2018, 2, 2, 0, 0): 0, datetime.datetime(2018, 2, 3, 0, 0): 0, datetime.datetime(2018, 2, 4, 0, 0): 0, datetime.datetime(2018, 2, 5, 0, 0): 0, datetime.datetime(2018, 2, 6, 0, 0): 0, datetime.datetime(2018, 2, 8, 0, 0): 0, datetime.datetime(2018, 2, 9, 0, 0): 0, datetime.datetime(2018, 2, 10, 0, 0): 0, datetime.datetime(2018, 2, 11, 0, 0): 0, datetime.datetime(2018, 2, 12, 0, 0): 0, datetime.datetime(2018, 2, 13, 0, 0): 0, datetime.datetime(2018, 2, 14, 0, 0): 0, datetime.datetime(2018, 2, 15, 0, 0): 0, datetime.datetime(2018, 2, 16, 0, 0): 0, datetime.datetime(2018, 2, 17, 0, 0): 0, datetime.datetime(2018, 2, 18, 0, 0): 0, datetime.datetime(2018, 2, 19, 0, 0): 0, datetime.datetime(2018, 2, 20, 0, 0): 0, datetime.datetime(2018, 2, 22, 0, 0): 0, datetime.datetime(2018, 2, 23, 0, 0): 0, datetime.datetime(2018, 2, 24, 0, 0): 0, datetime.datetime(2018, 2, 25, 0, 0): 0, datetime.datetime(2018, 2, 26, 0, 0): 0, datetime.datetime(2018, 2, 28, 0, 0): 0, datetime.datetime(2018, 3, 1, 0, 0): 0, datetime.datetime(2018, 3, 2, 0, 0): 0, datetime.datetime(2018, 3, 3, 0, 0): 0, datetime.datetime(2018, 3, 5, 0, 0): 0, datetime.datetime(2018, 3, 6, 0, 0): 0, datetime.datetime(2018, 3, 7, 0, 0): 0, datetime.datetime(2018, 3, 9, 0, 0): 0, datetime.datetime(2018, 3, 10, 0, 0): 0, datetime.datetime(2018, 3, 11, 0, 0): 0, datetime.datetime(2018, 3, 12, 0, 0): 0, datetime.datetime(2018, 3, 13, 0, 0): 0, datetime.datetime(2018, 3, 14, 0, 0): 0, datetime.datetime(2018, 3, 15, 0, 0): 0, datetime.datetime(2018, 3, 16, 0, 0): 0, datetime.datetime(2018, 3, 18, 0, 0): 0, datetime.datetime(2018, 3, 19, 0, 0): 0, datetime.datetime(2018, 3, 20, 0, 0): 0, datetime.datetime(2018, 3, 21, 0, 0): 0, datetime.datetime(2018, 3, 22, 0, 0): 0, datetime.datetime(2018, 3, 23, 0, 0): 0, datetime.datetime(2018, 3, 25, 0, 0): 0, datetime.datetime(2018, 3, 26, 0, 0): 0, datetime.datetime(2018, 3, 28, 0, 0): 0, datetime.datetime(2018, 3, 30, 0, 0): 0, datetime.datetime(2018, 3, 31, 0, 0): 0, datetime.datetime(2018, 4, 1, 0, 0): 0, datetime.datetime(2018, 4, 2, 0, 0): 0, datetime.datetime(2018, 4, 3, 0, 0): 0, datetime.datetime(2018, 4, 4, 0, 0): 0, datetime.datetime(2018, 4, 5, 0, 0): 0, datetime.datetime(2018, 4, 6, 0, 0): 0, datetime.datetime(2018, 4, 7, 0, 0): 0, datetime.datetime(2018, 4, 8, 0, 0): 0, datetime.datetime(2018, 4, 9, 0, 0): 0, datetime.datetime(2018, 4, 10, 0, 0): 0, datetime.datetime(2018, 4, 11, 0, 0): 0, datetime.datetime(2018, 4, 12, 0, 0): 0, datetime.datetime(2018, 4, 13, 0, 0): 0, datetime.datetime(2018, 4, 14, 0, 0): 0, datetime.datetime(2018, 4, 16, 0, 0): 0, datetime.datetime(2018, 4, 17, 0, 0): 0, datetime.datetime(2018, 4, 18, 0, 0): 0, datetime.datetime(2018, 4, 20, 0, 0): 0, datetime.datetime(2018, 4, 21, 0, 0): 0, datetime.datetime(2018, 4, 22, 0, 0): 0, datetime.datetime(2018, 4, 23, 0, 0): 0, datetime.datetime(2018, 4, 24, 0, 0): 0, datetime.datetime(2018, 4, 25, 0, 0): 0, datetime.datetime(2018, 4, 27, 0, 0): 0, datetime.datetime(2018, 4, 28, 0, 0): 0, datetime.datetime(2018, 4, 29, 0, 0): 0, datetime.datetime(2018, 5, 1, 0, 0): 0, datetime.datetime(2018, 5, 3, 0, 0): 0, datetime.datetime(2018, 5, 4, 0, 0): 0, datetime.datetime(2018, 5, 7, 0, 0): 0, datetime.datetime(2018, 5, 8, 0, 0): 0, datetime.datetime(2018, 5, 9, 0, 0): 0, datetime.datetime(2018, 5, 10, 0, 0): 0, datetime.datetime(2018, 5, 11, 0, 0): 0, datetime.datetime(2018, 5, 12, 0, 0): 0, datetime.datetime(2018, 5, 13, 0, 0): 0, datetime.datetime(2018, 5, 14, 0, 0): 0, datetime.datetime(2018, 5, 15, 0, 0): 0, datetime.datetime(2018, 5, 16, 0, 0): 0, datetime.datetime(2018, 5, 17, 0, 0): 0, datetime.datetime(2018, 5, 18, 0, 0): 0, datetime.datetime(2018, 5, 19, 0, 0): 0, datetime.datetime(2018, 5, 20, 0, 0): 0, datetime.datetime(2018, 5, 21, 0, 0): 0, datetime.datetime(2018, 5, 22, 0, 0): 0, datetime.datetime(2018, 5, 24, 0, 0): 0, datetime.datetime(2018, 5, 25, 0, 0): 0, datetime.datetime(2018, 5, 27, 0, 0): 0, datetime.datetime(2018, 5, 28, 0, 0): 0, datetime.datetime(2018, 5, 29, 0, 0): 0, datetime.datetime(2018, 5, 30, 0, 0): 0, datetime.datetime(2018, 6, 1, 0, 0): 0}
Joe Ferndz
  • 8,417
  • 2
  • 13
  • 33