I'm trying to convert a pandas DataFrame into a list of tuples. However, I'm having difficulty getting the index (which is the date) into each tuple along with the values. My first step was the question linked below, but its answers do not add the index to the tuple.

Pandas convert dataframe to array of tuples

My only problem is accessing the index for each row in the numpy array. I have one solution shown below, but it uses an additional counter indexCounter and it looks sloppy. I feel like there should be a more elegant solution to retrieving an index from a particular numpy array.

def get_Quandl_daily_data(ticker, start, end):
    prices = []
    symbol = format_ticker(ticker)

    try:
        data = quandl.get("WIKI/" + symbol, start_date=start, end_date=end)
    except Exception as e:
        print("Could not download QUANDL data: %s" % e)

    subset = data[['Open','High','Low','Close','Adj. Close','Volume']]

    # Manual counter used to look up the date index for each row
    indexCounter = 0
    for row in subset.values:
        dateIndex = subset.index.values[indexCounter]
        tup = (dateIndex, "%.4f" % row[0], "%.4f" % row[1], "%.4f" % row[2], "%.4f" % row[3], "%.4f" % row[4], row[5])
        prices.append(tup)
        indexCounter += 1

Thanks in advance for any help!

Justin

1 Answer

You can iterate over the result of to_records(index=True).

Say you start with this:

In [6]: df = pd.DataFrame({'a': range(3, 7), 'b': range(1, 5), 'c': range(2, 6)}).set_index('a')

In [7]: df
Out[7]: 
   b  c
a      
3  1  2
4  2  3
5  3  4
6  4  5

then this works, except that it does not include the index (a):

In [8]: [tuple(x) for x in df.to_records(index=False)]
Out[8]: [(1, 2), (2, 3), (3, 4), (4, 5)]

However, if you pass index=True, then it does what you want:

In [9]: [tuple(x) for x in df.to_records(index=True)]
Out[9]: [(3, 1, 2), (4, 2, 3), (5, 3, 4), (6, 4, 5)]
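
Applied to a date-indexed frame like the one in the question, the same idea might look like the sketch below. The sample prices and the helper name rows_with_index are made up for illustration, and it swaps in itertuples(index=True, name=None) instead of to_records so that each date comes back as a pandas Timestamp; treat it as one possible approach rather than the only one.

```python
import pandas as pd

# Hypothetical price data standing in for the Quandl download.
subset = pd.DataFrame(
    {
        "Open": [10.0, 10.5],
        "Close": [10.25, 10.75],
        "Volume": [1000, 1200],
    },
    index=pd.to_datetime(["2016-08-08", "2016-08-09"]),
)


def rows_with_index(frame):
    """Return one tuple per row, with the index value (the date) first."""
    prices = []
    columns = frame[["Open", "Close", "Volume"]]
    # name=None yields plain tuples; index=True puts the index value first.
    for date, open_, close, volume in columns.itertuples(index=True, name=None):
        prices.append((date, "%.4f" % open_, "%.4f" % close, volume))
    return prices


prices = rows_with_index(subset)
```

No separate counter is needed here, because each tuple already carries its own index value.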
Ami Tavory
  • Thanks for your reply, Ami. Your answer is very helpful! I am curious, however, whether I can use the reset_index() function, because the index is not arbitrary. Each index value is the date for the Open, High, Low, Close, and Volume price data for a specific stock, so I would like to keep the existing index values from the 'subset' numpy array. Would this still achieve the functionality I am trying to create? – Justin Aug 10 '16 at 16:04
  • @user3547551 I've shortened the steps a bit so that it doesn't use `reset_index` in any case. – Ami Tavory Aug 10 '16 at 16:08