
The DataFrame I get from my import (stock prices from Yahoo) is not correct for a specific time period. I want to clear the data for "VAR1.DE" from 2010-01-01 until 2017-10-17 and replace it with empty values or NaN. I have found the pandas function "drop", but this would delete the whole column.

How can I solve the problem?

Here is my code:

from pandas_datareader import data as web
import pandas as pd
import numpy as np
from datetime import datetime

assets = ['1211.HK','BABA','BYND','CAP.DE','JKS','PLUG','QCOM','VAR1.DE']
weights = np.array([0.125,0.125,0.125,0.125,0.125,0.125,0.125,0.125])
stockStartDate='2010-01-01'
today = datetime.today().strftime('%Y-%m-%d')

df = pd.DataFrame()

for stock in assets:
    df[stock] = web.DataReader(stock, data_source='yahoo', start=stockStartDate, end=today)['Adj Close']
    Does this answer your question? [How to delete rows from a pandas DataFrame based on a conditional expression](https://stackoverflow.com/questions/13851535/how-to-delete-rows-from-a-pandas-dataframe-based-on-a-conditional-expression) – IoaTzimas Jan 01 '21 at 00:30

1 Answer


Instead of using a for loop, you can pass the whole list of tickers in a single call:

df = web.DataReader(name=assets, data_source='yahoo', start=stockStartDate, end=today)['Adj Close']

The returned DataFrame is indexed by date (a pd.DatetimeIndex), so you can select the rows with a label-based slice:

df.loc[:'2017-10-17', 'VAR1.DE'] = np.nan

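This sets the 'VAR1.DE' column to NaN for every row up to and including '2017-10-17', since label-based slices with .loc include the end label. The rows themselves are kept, so the other columns are unaffected.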
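If you want to check the behaviour without downloading anything, here is a minimal sketch on made-up data. The prices and the short date range are assumptions for illustration only, not real Yahoo data; the start label '2010-01-01' is written out explicitly to match the period in the question.

```python
import numpy as np
import pandas as pd

# Synthetic business-day prices standing in for the Yahoo download (made-up values).
dates = pd.date_range("2017-10-12", "2017-10-20", freq="B")
df = pd.DataFrame(
    {
        "BABA": np.linspace(100.0, 106.0, len(dates)),
        "VAR1.DE": np.linspace(50.0, 56.0, len(dates)),
    },
    index=dates,
)

# Label-based slicing on a sorted DatetimeIndex includes both endpoints,
# and a start label earlier than the first row is allowed.
df.loc["2010-01-01":"2017-10-17", "VAR1.DE"] = np.nan

print(df)
```

Only the 'VAR1.DE' values inside the slice become NaN; the index and the other column stay as they were.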

ABC
  • Thank you for the code for replacement of the first loop and the correction of the data. Problem solved (Y) – mhol84n Jan 01 '21 at 17:06