
I have looked at various examples but I am still getting errors. I am using Python 3.5.2 and trying to download Yahoo minute data from the Yahoo chart API with the URL below.

I am getting

ValueError: could not convert string to float:

def read_data(passing_for_url,fp):
    all_features = []
    timestamp_list =[]
    close_list = []
    high_list = []
    low_list = []
    open_price_list =[]
    volume_list = []
    count=0
    if passing_for_url==1:
        datasetname= (urlopen('http://chartapi.finance.yahoo.com/instrument/1.0/GOOG/chartdata;type=quote;range=1d/csv').read().decode('utf-8')).split('\n')
    else:
        datasetname = fp
    for line in datasetname:
        l=line.split(',')
        #print (l)
        if(passing_for_url==1):
            if count > 16:
                fp.write(line)
            else:
                count+=1
                continue
        x = list(l[len(l)-1])
        x = x[0:len(x)-1]
        x = ''.join(x)
        l[len(l)-1]=x
        print (l)
        all_features.append(l)
        timestamp, close, high, low, open_price , volume = l
        timestamp_list.append(int(timestamp))
        close_list.append(float(close))
        high_list.append(float(high))
        low_list.append(float(low))
        open_price_list.append(float(open_price))
        volume_list.append(float(volume))  # <== Getting error here
    return timestamp_list, close_list, high_list, low_list, open_price_list, volume_list

Below is a sample response from the URL:

 uri:/instrument/1.0/GOOG/chartdata;type=quote;range=1d/csv
 ticker:goog
 Company-Name:Alphabet Inc.
 Exchange-Name:NMS
 unit:MIN
 timezone:EST
 currency:USD
 gmtoffset:-18000
 previous_close:835.6700
 Timestamp:1485441000,1485464400
 labels:1485442800,1485446400,1485450000,1485453600,1485457200,1485460800,1485464400
 values:Timestamp,close,high,low,open,volume
 close:827.1602,833.9300
 high:827.4200,834.6201
 low:827.0100,833.9300
 open:827.3400,833.9300
 volume:0,99800
 1485441610,833.9300,833.9300,833.9300,833.9300,99800 <== Need to start here
 1485442196,831.0830,831.0830,831.0830,831.0830,47700
 1485442503,832.3000,832.3000,832.3000,832.3000,60800
 1485442811,832.2100,832.2100,832.2100,832.2100,33000
 1485443111,831.4300,831.4300,831.4300,831.4300,41900
 1485443408,831.0120,831.0120,831.0120,831.0120,34600
 1485443712,831.8400,831.8400,831.8400,831.8400,39600
 1485443997,832.3400,832.3400,832.3400,832.3400,38400
 1485444312,831.7600,831.7600,831.7600,831.7600,36000
 1485444579,831.0001,831.4000,831.0000,831.4000,94700

I only need the data rows containing timestamp, close, high, low, open_price, and volume; the first 17 rows (the metadata) should be omitted.
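For reference, the parsing I am after can be sketched on the sample above (the two rows below are copied verbatim from the response sample):

```python
# Parse only the data rows: unpack the six comma-separated fields per row.
sample = """\
1485441610,833.9300,833.9300,833.9300,833.9300,99800
1485442196,831.0830,831.0830,831.0830,831.0830,47700"""

timestamps, closes = [], []
for row in sample.split('\n'):
    timestamp, close, high, low, open_price, volume = row.split(',')
    timestamps.append(int(timestamp))
    closes.append(float(close))

print(timestamps)  # [1485441610, 1485442196]
```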

But I am getting this error with Python 3.5.2:

Traceback (most recent call last):
  File "google.py", line 207, in <module>
    timestamp_list, close_list, high_list, low_list, open_price_list, volume_list = read_data(choice, fp1)
  File "google.py", line 49, in read_data
    volume_list.append(float(volume))
ValueError: could not convert string to float: 
  • [Relevant](http://stackoverflow.com/questions/8420143/valueerror-could-not-convert-string-to-float-id#8420179) – Himal Jan 27 '17 at 05:33

1 Answer

I do not understand what this piece is for, but it deletes the last character of the volume column:

x = list(l[len(l)-1])
x = x[0:len(x)-1]
x = ''.join(x)
l[len(l)-1]=x

There is a line with the following content:

1485450601,828.5500,828.5500,828.4400,828.4999,0

But, as I mentioned earlier, this removes the last character of the volume column; in other words, it converts the '0' into '', which raises the error when converting to float.

In addition, the trailing newline at the end of the response must be removed so that split('\n') does not produce an empty final line; for this we use strip().
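The failure is easy to reproduce on that line. The snippet below just mirrors the chop from the question: str.split('\n') has already removed the newline, so slicing off the last character eats the volume digit itself:

```python
# A data row as it looks after split('\n') -- note: no trailing '\n' left.
data = "1485450601,828.5500,828.5500,828.4400,828.4999,0"

fields = data.split(',')
chopped = fields[-1][:-1]      # '' -- the '0' is gone, there was no newline to remove
print(repr(chopped))           # ''

try:
    float(chopped)             # this is exactly the ValueError from the traceback
except ValueError as exc:
    print(exc)

volume = float(fields[-1])     # 0.0 -- works once nothing is chopped off
```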

Complete code:

from urllib.request import urlopen

def read_data(passing_for_url,fp):
    all_features = []
    timestamp_list =[]
    close_list = []
    high_list = []
    low_list = []
    open_price_list =[]
    volume_list = []
    count=0
    if passing_for_url==1:
        datasetname= (urlopen('http://chartapi.finance.yahoo.com/instrument/1.0/GOOG/chartdata;type=quote;range=1d/csv')
            .read().decode('utf-8').strip()).split('\n')
    else:
        datasetname = fp
    for line in datasetname:
        l=line.split(',')
        #print (l)
        if(passing_for_url==1):
            if count > 16:
                fp.write(line)
            else:
                count+=1
                continue
        all_features.append(l)
        timestamp, close, high, low, open_price , volume = l
        timestamp_list.append(int(timestamp))
        close_list.append(float(close))
        high_list.append(float(high))
        low_list.append(float(low))
        open_price_list.append(float(open_price))
        volume_list.append(float(volume))
    return timestamp_list, close_list, high_list, low_list, open_price_list, volume_list
  • Fantastic, it works great, thank you. If I wanted to download all the stocks in the S&P 500 list and run a loop for each of them to accomplish the same thing, how would I go about it? – JourneyMan Jan 27 '17 at 16:48
  • I do not understand your question, explain. – eyllanesc Jan 27 '17 at 17:34
  • Currently the above code only downloads real-time minute data for a single stock. I was wondering how to download the minute data for all stocks listed in the S&P 500 (or maybe just 100 of them to start; there are usually about 500 company names on the list). – JourneyMan Jan 27 '17 at 18:01
  • Is the content of the page corresponding to the url updated every time interval? – eyllanesc Jan 27 '17 at 18:09
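For the follow-up question, a minimal sketch of looping the same download over several tickers could look like this. The ticker list and helper names here are hypothetical placeholders; the endpoint is the chart API URL from the question, with the symbol substituted in:

```python
from urllib.request import urlopen

# Hypothetical sample; substitute the full S&P 500 symbol list here.
TICKERS = ['GOOG', 'AAPL', 'MSFT']

URL = ('http://chartapi.finance.yahoo.com/instrument/1.0/'
       '{}/chartdata;type=quote;range=1d/csv')

def download_minute_data(ticker):
    """Fetch one ticker's minute CSV and return its data rows (metadata skipped)."""
    raw = urlopen(URL.format(ticker)).read().decode('utf-8').strip()
    return raw.split('\n')[17:]   # the first 17 rows are metadata, as in the question

def download_all(tickers):
    data = {}
    for ticker in tickers:
        try:
            data[ticker] = download_minute_data(ticker)
        except Exception as exc:   # one failed symbol should not stop the loop
            print('skipping', ticker, ':', exc)
    return data
```

Each ticker's rows can then be fed through the parsing loop in read_data unchanged.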