
I have a list of tuples similar to the below:

[(date1, ticker1, value1),(date1, ticker1, value2),(date1, ticker1, value3)]

I want to convert this to a DataFrame with index=date1, columns=ticker1, and values = values. What is the best way to do this?

EDIT:

My end goal is to create a DataFrame with a datetimeindex equal to date1 with values in a column labeled 'ticker':

df = pd.DataFrame(tuples, index=date1)

Right now the tuple is generated with the following:

tuples=list(zip(*prc_path))

where prc_path is a numpy.ndarray with shape (1000,1)

molivizzy
  • It's conventional to give a small example of input and desired output that people can copy and paste (otherwise someone who wants to demonstrate that his method works needs to spend time inventing one of his own.) – DSM Jan 28 '15 at 19:19
  • Is the revised edit better? – molivizzy Jan 28 '15 at 19:28

1 Answer


I think this is what you want:

>>> import pandas as pd
>>> data = [('2013-01-16', 'AAPL', 1),
            ('2013-01-16', 'GOOG', 1.5),
            ('2013-01-17', 'GOOG', 2),
            ('2013-01-17', 'MSFT', 4),
            ('2013-01-18', 'GOOG', 3),
            ('2013-01-18', 'MSFT', 3)]

>>> df = pd.DataFrame(data, columns=['date', 'ticker', 'value'])
>>> df
         date ticker  value
0  2013-01-16   AAPL    1.0
1  2013-01-16   GOOG    1.5
2  2013-01-17   GOOG    2.0
3  2013-01-17   MSFT    4.0
4  2013-01-18   GOOG    3.0
5  2013-01-18   MSFT    3.0

>>> df.pivot(index='date', columns='ticker', values='value')
ticker      AAPL  GOOG  MSFT
date                        
2013-01-16     1   1.5   NaN
2013-01-17   NaN   2.0     4
2013-01-18   NaN   3.0     3
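
Since the question asks for a DatetimeIndex, here is a minimal sketch of the same pivot with the date strings parsed first. It reuses the sample data above; the names `wide` and `wide_agg` are only illustrative. Note that `pivot` raises a ValueError when a (date, ticker) pair appears more than once, in which case `pivot_table` with an aggregation function is one possible alternative.

```python
import pandas as pd

data = [('2013-01-16', 'AAPL', 1),
        ('2013-01-16', 'GOOG', 1.5),
        ('2013-01-17', 'GOOG', 2),
        ('2013-01-17', 'MSFT', 4),
        ('2013-01-18', 'GOOG', 3),
        ('2013-01-18', 'MSFT', 3)]

df = pd.DataFrame(data, columns=['date', 'ticker', 'value'])

# Parse the date strings so the pivoted index becomes a DatetimeIndex.
df['date'] = pd.to_datetime(df['date'])

# One row per date, one column per ticker; missing pairs become NaN.
wide = df.pivot(index='date', columns='ticker', values='value')

# If (date, ticker) pairs can repeat, pivot_table aggregates them
# instead of raising (here: the mean of the duplicates).
wide_agg = df.pivot_table(index='date', columns='ticker',
                          values='value', aggfunc='mean')
```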
elyase
  • Thanks! The data is set up exactly like your data where tuples = [(date, SPY, price), (date, SPY, price), etc.] But when I create the dataframe I get the following error: AssertionError: 3 columns passed, passed data had 1000 columns. Do I need to reshape and if so, how? – molivizzy Jan 29 '15 at 02:32
  • This solves my problem exactly, but I don't understand why some columns end up as objects rather than strings. Any idea how I can better understand why strings might end up as objects? – bonobo Dec 11 '19 at 18:57
  • This did not work for my problem; however, this did: pd.DataFrame.from_records(data, columns=['date', 'ticker', 'value']) – mcagriardic Jun 22 '20 at 09:52
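
Regarding the AssertionError mentioned in the comments: a likely cause, assuming the tuples were built with `zip(*prc_path)` as in the question, is that unpacking a (1000, 1) array yields a single tuple of length 1000, which pandas reads as one row with 1000 columns. The sketch below uses made-up prices and dates (both are assumptions, not the asker's data) and builds the frame directly from the array instead.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for prc_path: 1000 prices with shape (1000, 1).
prc_path = np.linspace(100.0, 110.0, 1000).reshape(1000, 1)

# zip(*prc_path) receives 1000 one-element rows, so it yields ONE tuple
# of length 1000 rather than 1000 tuples.
tuples = list(zip(*prc_path))
print(len(tuples), len(tuples[0]))  # 1 1000

# Assumed dates: 1000 business days from an arbitrary start date.
dates = pd.date_range('2015-01-01', periods=len(prc_path), freq='B')

# Flatten the array to 1-D and label the single column with the ticker.
df = pd.DataFrame({'SPY': prc_path.ravel()}, index=dates)
```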