Good day, please all I want to expand the date range implied by StartDate
and EndDate
.
import pandas as pd
import datetime
from pandas_datareader import data, wb
import csv
out= open("testfile.csv", "rb")
data = csv.reader(out)
data = [[row[0],row[1] + row[2],row[3] + row[4], row[5],row[6]] for row in data]
out.close()
print data
out=open("data.csv", "wb")
output = csv.writer(out)
for row in data:
output.writerow(row)
out.close()
df = pd.read_csv('data.csv')
for DateDpt, DateAr in df.iteritems():
df.DateDpt = pd.to_datetime(df.DateDpt, format='%Y-%m-%d')
df.DateAr = pd.to_datetime(df.DateAr, format='%Y-%m-%d')
df['DateAr'] = [pd.to_datetime(x, format='%Y-%m-%d') for x in df['DateAr']]
df['DateDpt'] = [pd.to_datetime(x, format='%Y-%m-%d') for x in df['DateDpt']]
df['range'] = df['DateDpt']-df['DateAr']
print df
ID ArCityArCountry DptCityDptCountry EndDate StartDate
1922 ParisFrance NewYorkUnitedState 2008-03-10 2008-12-01
1901 ParisFrance LagosNigeria 2001-03-05 2001-02-02
1922 ParisFrance NewYorkUnitedState 2011-02-03 2008-12-01
1002 ParisFrance CaliforniaUnitedState 2003-03-04 2002-03-04
1099 ParisFrance BeijingChina 2011-02-03 2009-02-04
1901 LosAngelesUnitedState ParisFrance 2001-03-05 2001-02-02
Output:
ID ArCityArCountry DptCityDptCountry EndDate
1922 ParisFrance NewYorkUnitedState 2008-03-10
1002 LosAngelesUnitedState ForidaUnitedState 2008-03-10
1901 ParisFrance LagosNigeria 2001-03-05
1922 ParisFrance NewYorkUnitedState 2011-02-03
1002 ParisFrance CaliforniaUnitedState 2003-03-04
1099 ParisFrance BeijingChina 2011-02-03
1901 LosAngelesUnitedState ParisFrance 2001-03-05
StartDate range
2001-02-02 2593 days
2008-12-01 266 days
2001-02-02 31 days
2008-12-01 794 days
2002-03-04 365 days
2009-02-04 729 days
2001-02-02 31 days
Expected out:
Let's consider row1
, we have 2593 days, I want a situation that from StartDate
i.e. 2001-02-02
to the EndDate
i.e 2008-03-10
, be listed
this should run through all the rows by expanding based on range until the value on StartDate
matches EndDate
.
ID ArCityArCountry DptCityDptCountry StartDate EndDate
1922 ParisFrance NewYorkUnitedState 2004-03-10 2008-12-01
1922 ParisFrance NewYorkUnitedState 2004-03-11 2008-12-01
1922 ParisFrance NewYorkUnitedState 2004-03-12 2008-12-01
until it gets to that date of EndDate which means on both date I should have something like StartDate = EndDate, ie 2008-12-01 on both sides. Considering the csv
1922 ParisFrance NewYorkUnitedState 2008-12-01 2008-12-01
Thank you so much
Another Question on it:
Thanks.. I have another question. I want to create a JSON, considering StartDate (However, wherever there are two dates matching each other, one of the dates would be used while all attributes will be appended. Let me make an example,
{ "2001-02-02" = { ParisFrance (ArCityArCountry): 1922 NewYorkUnitedStates: 1922} }
if we traverse the down csv we might likely to have another 2001-02-02. Instead of creating it, we can append it to initial StartDate. However, the DptCityDptCountry might be different but if another ID matches with the StartDate and DptCityDptCountry, it will be added up i.e.
{"2001-02-02" = {
ParisFrance (ArCityArCountry): 1922, 2212 //these are IDs with same StartDate and ArCityArCountry
NewYorkUnitedStates: 1922, 0029 //these are IDs with same StartDate and DptCityDptCountry}
}