
I have a dataframe with 3 columns (a date index, a price and a string symbol). It looks like this:

Date Price Symbol
2019-01-02 39.480000 AAPL
2019-01-02 101.120003 MSFT
2019-01-02 62.023998 TSLA
2019-01-03 35.547501 AAPL
2019-01-03 97.400002 MSFT
2019-01-03 60.071999 TSLA

I'm looking for some pandas/PyTorch/Python syntactic sugar to turn that into a tensor/matrix like this:

[[39.480000, 101.120003, 62.023998], [35.547501, 97.400002, 60.071999]]

The length of the first dimension should be the number of unique dates, and the length of the second should be the number of unique symbols. I'm guaranteed to have exactly 3 symbols per date, and I want every row of the matrix to follow the same column order (e.g. always AAPL, MSFT, TSLA).

Now, this is very easy with some for loops, but I'm looking for something more "pythonic".

user6232472

1 Answer


You can groupby the date column, convert the groups of Price to numpy arrays, and then convert this series to a tensor:

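import numpy as np
import pandas as pd
import torch

# Sort so every date's prices follow the same symbol order (AAPL, MSFT, TSLA)
prices = df.sort_values(['Date', 'Symbol']).groupby('Date')['Price'].apply(np.array)
# Stack the per-date arrays into one 2-D (dates x symbols) array before converting
my_tensor = torch.tensor(np.stack(prices.to_numpy()))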
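
As an alternative to the groupby approach, here is a minimal sketch using a pivot, which makes the column order explicit. The sample frame is rebuilt from the question's data, and the names `wide` and `matrix` are only illustrative. It assumes each (Date, Symbol) pair occurs exactly once, as the question guarantees; `pivot` raises an error on duplicates.

```python
import pandas as pd

# Sample data mirroring the question (Date is the index, as described there).
df = pd.DataFrame(
    {
        "Date": ["2019-01-02"] * 3 + ["2019-01-03"] * 3,
        "Price": [39.48, 101.120003, 62.023998, 35.547501, 97.400002, 60.071999],
        "Symbol": ["AAPL", "MSFT", "TSLA"] * 2,
    }
).set_index("Date")

# One row per date, one column per symbol. pivot sorts both axes, so the
# columns come out as AAPL, MSFT, TSLA for every row.
wide = df.reset_index().pivot(index="Date", columns="Symbol", values="Price")

# Plain 2-D NumPy array of shape (unique dates, unique symbols).
matrix = wide.to_numpy()
```

From there, `torch.tensor(matrix)` should give the tensor, and `wide.columns` records which symbol each column corresponds to.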
iacob