-3

I want to extract some data based on date in an imported excel file in python. I want to be able to give the start date and end date and get the data for that specific period. I tried different ways to install pandas_datareader to use the following code, but I could not.

data = web.DataReader(dataset,start='', end='')

So, here is my code.

import pandas as pd  
import datetime 

data = pd.read_excel('file.xlsx')  
start = datetime.datetime(2009,1,1)  
end = datetime.datetime(2018,1,1)

#reshape based on date  
set_index = data.set_index('date')

How can I fix this problem? :(

Excel file

Nicole
  • 13
  • 3
  • can you share few rows and fields in excel ? – Hari_pb Dec 05 '18 at 20:33
  • @Harry_pb: I've just uploaded a screenshot of my file for you. https://i.stack.imgur.com/y6o3L.png – Nicole Dec 05 '18 at 20:41
  • Possible duplicate of [Sort Pandas Dataframe by Date](https://stackoverflow.com/questions/28161356/sort-pandas-dataframe-by-date) – Evan Dec 05 '18 at 20:58
  • @Evan: Thanks, but I can sort the dates. My problem is that I do not know how can I determine specific date periods. For example, in my data set if I just want to have the points from 2010 to 2018, how can I set this period. (My excel file includes data points from 1997 to 2018). – Nicole Dec 05 '18 at 21:03
  • HI @Nicole, you are asking about subsetting. I recommend `df.loc[]` in this case; I'll post an answer. – Evan Dec 05 '18 at 21:14
  • @Evan: Thanks a lot Evan. Your solution works. – Nicole Dec 05 '18 at 21:34

1 Answers1

0
import pandas as pd  
import datetime 

"""
ticker,date,closeunadj
ZF,2018-11-28,9.22
ZF,2018-11-27,9.16
ZF,2018-11-26,9.23
"""

df = pd.read_clipboard(sep=",", parse_dates=["date"]).set_index("date")
df

Output:

           ticker  closeunadj
date                         
2018-11-28     ZF        9.22
2018-11-27     ZF        9.16
2018-11-26     ZF        9.23

You can then use .loc to subset the index to the date range you like.

df.loc[(df.index < datetime.datetime(2018, 11, 28)) & (df.index > datetime.datetime(2018, 11, 26))]

Output:

           ticker  closeunadj
date                         
2018-11-27     ZF        9.16
Evan
  • 2,121
  • 14
  • 27