
I use Yahoo Finance (the yfinance library in Python) to get the daily open, close, volume, low, and high for crypto pairs.

I figured out that the data resets at 00:00 UTC (close/open time), but the new row only gets added at 02:00 UTC. I need the data available right away at 00:00 for the latest day.

Any way to get around this? I'm thinking I can get minute or hourly data up until now and summarise the volume.

pricedata = pdr.get_data_yahoo("BTC-ETH", start="2020-12-20", end="2020-12-22", interval = "1m")

Am I missing an easier way of doing this within yfinance? Or is there an API that is more up to date? I'd use Alpha Vantage, but they don't offer historical data for crypto pairs.

Update

I can build the entry for the last day (19/12) myself from the minute data:

pricedata = pdr.get_data_yahoo(stock, start="2020-12-19", end="2020-12-20", interval = "1m")

close:

print(pricedata['Close'].iloc[-1])

open:

print(pricedata['Close'].iloc[0])

(very slightly off but close enough)

High:

print(pricedata['High'].max())

Low:

print(pricedata['Low'].min())
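
The four lookups above can also be done in one step with a pandas resample. This is a minimal sketch, assuming the 1-minute frame has the usual Open/High/Low/Close columns and a DatetimeIndex aligned to UTC days. The `to_daily_ohlc` helper and the synthetic `minute_bars` frame are made up for illustration and stand in for the downloaded `pricedata`:

```python
import numpy as np
import pandas as pd


def to_daily_ohlc(intraday: pd.DataFrame) -> pd.DataFrame:
    """Collapse intraday bars into one OHLC row per calendar day of the index."""
    return intraday.resample("1D").agg(
        {"Open": "first", "High": "max", "Low": "min", "Close": "last"}
    )


# Synthetic stand-in for one day of 1-minute bars (not real market data).
idx = pd.date_range("2020-12-19 00:00", periods=1440, freq="1min")
close = pd.Series(np.linspace(100.0, 110.0, len(idx)), index=idx)
minute_bars = pd.DataFrame(
    {
        "Open": close.shift(1).fillna(close.iloc[0]),  # each bar opens at the previous close
        "High": close + 0.5,
        "Low": close - 0.5,
        "Close": close,
    }
)

daily = to_daily_ohlc(minute_bars)
print(daily)
```

Taking the first minute's Open instead of its Close should, in principle, remove the small offset in the open value, though I have not verified that against Yahoo's daily bar.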

BUT Volume does not seem to work.

print(pricedata['Volume'].sum())

gives me 50981730304 whereas the daily reported value is 12830893778. If I get the sum of the hourly values instead of minute values, I get 2111141888. Still far off...
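
To narrow down where the mismatch comes from, it may help to compare a few candidate aggregations of the intraday Volume column against the reported daily figure. The sketch below is only a diagnostic: `compare_volume_aggregations` is a hypothetical helper, the numbers are synthetic, and nothing here confirms how Yahoo actually reports intraday crypto volume.

```python
import pandas as pd


def compare_volume_aggregations(intraday_volume: pd.Series, daily_volume: float) -> pd.DataFrame:
    """Return candidate aggregates and their ratio to the reported daily volume."""
    nonzero = intraday_volume[intraday_volume > 0]
    candidates = {
        "sum": intraday_volume.sum(),      # right if each bar holds its own traded volume
        "max": intraday_volume.max(),      # plausible if bars hold a running total
        "last": intraday_volume.iloc[-1],  # plausible if the last bar holds a snapshot
        "nonzero_mean": nonzero.mean() if len(nonzero) else 0.0,
    }
    out = pd.DataFrame({"value": pd.Series(candidates, dtype="float64")})
    out["ratio_to_daily"] = out["value"] / daily_volume
    return out


# Synthetic example (not real market data): five 1-minute bars, daily volume of 100.
volume = pd.Series(
    [0, 10, 0, 30, 60],
    index=pd.date_range("2020-12-19 00:00", periods=5, freq="1min"),
)
report = compare_volume_aggregations(volume, 100.0)
print(report)
```

A ratio close to 1 for one of the rows would hint at which interpretation of the intraday Volume column is the right one for a given symbol.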

I suppose one 'hack' would be to get the current 24h volume. But if I get that at 01:00 UTC it would already be skewed. Honestly I'd rather use a proper API that is timely with the daily history :)

So far I have implemented the correction below. It could still be improved, particularly the volume, which is only right if retrieved at 00:00 UTC, since I haven't figured out how to fetch it retroactively without the 2h delay:

            # Assumes hist is the daily DataFrame (tz-naive DatetimeIndex) and _end is a 'YYYY-MM-DD' string
            yesterday = datetime.strptime(_end, '%Y-%m-%d').date() - timedelta(days=1)
            yesterdayf = yesterday.strftime('%Y-%m-%d')

            # TODO check if between 8-10:15am UTC

            # TODO what if it exists but is not complete
            if yesterdayf not in hist.index:
                print("not in index")
                # rebuild yesterday's daily bar from 1-minute bars
                finedata = pdr.get_data_yahoo(symbol, start=yesterdayf, end=_end, interval="1m")
                c = finedata['Close'].iloc[-1]
                o = finedata['Close'].iloc[0]  # first minute's close, used as an approximate open
                h = finedata['High'].max()
                l = finedata['Low'].min()
                # rolling 24h volume: only matches the daily bar if fetched at 00:00 UTC
                yticker = yf.Ticker(symbol)
                v = yticker.info['volume24Hr']

                print(c)
                print(o)

                # create a one-row dataframe, indexed by a timestamp like hist
                add = pd.DataFrame({'Open': o, 'High': h, 'Low': l, 'Close': c, 'Adj Close': c, 'Volume': v},
                                   index=pd.to_datetime([yesterdayf]))

                # concatenate
                hist = pd.concat([hist, add])
                print(hist.tail())
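
The "not in index" check can be pulled into a small pure function, which makes the behaviour around midnight UTC easier to test. This is a sketch only: `missing_latest_day` is a hypothetical name, and it assumes the daily frame has a tz-naive DatetimeIndex of UTC dates and that the current time is passed in as UTC.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

import pandas as pd


def missing_latest_day(hist_index: pd.DatetimeIndex, now_utc: datetime) -> Optional[pd.Timestamp]:
    """Return the last fully closed UTC day if the index has no row for it, else None."""
    yesterday = pd.Timestamp((now_utc.date() - timedelta(days=1)).isoformat())
    return None if yesterday in hist_index else yesterday


# Example: five minutes after the 00:00 UTC rollover, the row for 2020-12-19 is still absent.
hist_index = pd.DatetimeIndex(["2020-12-17", "2020-12-18"])
now = datetime(2020, 12, 20, 0, 5, tzinfo=timezone.utc)
print(missing_latest_day(hist_index, now))
```

This does not address the second TODO (a row that exists but is incomplete); that would need an extra check on the row's contents.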
dorien
  • I've had similar questions and experiences myself. Unfortunately I have not found a better solution. Apologies this is not the answer you are looking for, but figured I would comment in. If someone else has an answer, I look forward to reading it. – zerecees Dec 21 '20 at 03:47
  • Yes, I'm just experimenting with summing the volumes per minute, but the total doesn't even seem to correspond to the daily volume... mmmm... Min/max/close I can get though. Just the volume remaining... – dorien Dec 21 '20 at 03:51
