I have a function that parses JSON and loads it into a DataFrame. My original approach was to loop through each file and concat the result onto a DataFrame that accumulated all the entries over time:
HoldingsDF = loadFundETFData(ticker1, fundsDict[ticker1])
holdingsFullDF = pd.concat([holdingsFullDF, HoldingsDF])
The problem is this loads 30k files and it's taking over 8 hours. So I tried this instead:
HoldingsDF = loadFundETFData(ticker1, fundsDict[ticker1])
holdingsFullList.append(HoldingsDF.to_dict())
Then, once the loop is done, I tried to combine everything via:
holdingsFullDF = pd.DataFrame.from_records(holdingsFullList)
(I've also tried loading it with from_dict.) I'm getting this output:
Column1
{0: 'IBM', 1: 'google'..
I expect:
Column1
IBM
Google
..
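I can reproduce the nested-dict output in isolation. Assuming loadFundETFData returns an ordinary DataFrame (stand-in data below), DataFrame.to_dict() with no arguments returns a column-oriented mapping {column: {index: value}}, so from_records ends up storing that inner dict as the cell value:

```python
import pandas as pd

# stand-in for the DataFrame one file's load would produce
df = pd.DataFrame({"Column1": ["IBM"]})

# default orientation is column -> {index: value}
print(df.to_dict())  # {'Column1': {0: 'IBM'}}

# from_records treats each dict as one row, so the cell
# holds the whole {index: value} dict instead of 'IBM'
rebuilt = pd.DataFrame.from_records([df.to_dict()])
print(rebuilt["Column1"].iloc[0])  # {0: 'IBM'}
```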
My concat-per-loop approach worked (but took 8 hours); this new approach finishes within a minute but doesn't load the data correctly (same problem in every column). What am I doing wrong when loading the data?
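For reference, here is the pattern I was aiming for, sketched with stand-in data (the real file loop and loadFundETFData are replaced by placeholders): keep the per-file DataFrames in a list and call pd.concat once at the end, rather than concatenating inside the loop or converting with to_dict():

```python
import pandas as pd

frames = []
for ticker in ["IBM", "GOOG"]:  # placeholder for looping over the 30k files
    # placeholder for loadFundETFData(ticker, fundsDict[ticker])
    df = pd.DataFrame({"Column1": [ticker]})
    frames.append(df)  # append the DataFrame itself, not df.to_dict()

# a single concat at the end avoids re-copying the growing frame every iteration
holdingsFullDF = pd.concat(frames, ignore_index=True)
print(holdingsFullDF)
```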