Flatten and join dissimilar (but consistent) Pandas DataFrames

Question

I'm struggling to consolidate several dataframes into a flattened format to use in a Keras machine learning algorithm:

My challenge surrounds the differing format of 2 ancillary tables that are ordered more like lists (they are associated to the main table and to one another via a shared key).

Since my desired output is a single line containing the shared data from the three data frames (and not, for instance, a right-joined table with matches rows)... I'm at a loss for how to do this. It's probable some magical combination of unstack or pivot...

Any thoughts or advice on how I might massage these together (or if I'm asking the wrong question and should simply use them as multiple inputs into the ML algorithm, please set me straight!)

Thanks!

edit: at the (very reasonable) request in the comments, here is some sample data to manipulate:

import pandas as pd

import numpy as np

priceDf = pd.DataFrame(np.arange(30).reshape((6,5))
                   .transpose(), columns=['Key', 'volume', 'high', 'low', 'price', 'price_in_5_minutes'])

tradeBookDf = pd.DataFrame(np.append(np.repeat(np.arange(4), 4),(np.random.randint(5, size=48)))
                .reshape((4,16)).transpose(), columns=['Key','size', 'price', 'bid_ask_bit'])

recentTradesDf = pd.DataFrame(np.append(np.repeat(np.arange(4), 4),(np.random.randint(5, size=48)))
                .reshape((4,16)).transpose(), columns=['Key','price', 'quantity', 'timestamp'])

The goal is to output a single row of every piece of data associated with the key. eg:

| Key | volume | high | low | price | price_in_5_minutes | trade1_price | trade1_quantity | trade1_timestamp | trade2.... | book1_size | book1_price | book1_bid_ask_bit | book2.... |

It would help us help you greatly, if you built some data to manipulate and the expected output so we could verify. See [MVCE](https://stackoverflow.com/help/mcve) Also see. https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples?answertab=oldest#tab-top — Scott Boston, Aug 26 '17 at 21:32
I'm having a hard time trying to understand what you mean by writing "list of values" in a row of a table. — Daniel Möller, Aug 27 '17 at 14:09

Flatten and join dissimilar (but consistent) Pandas DataFrames

0 Answers0