I have 121 JSON files that I need to analyze. To do that, I need to combine them into a single dataframe first. I could process them in batches, but the data is not sorted across the files, so each batch would be incomplete. What is an efficient way to combine these files into a single dataframe? This is the (inefficient) code I tried:
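import os
import dask.dataframe as dd

# combining the data, one file at a time
combineDf = None
for file in files:
    print("Appending: " + file)
    currentDF = dd.read_json(os.path.join(myPath, file), lines=True)
    # the first file starts the frame; dd.concat stands in for DataFrame.append, which recent pandas and dask releases no longer provide
    combineDf = currentDF if combineDf is None else dd.concat([combineDf, currentDF])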
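
For comparison, here is a minimal sketch of the pattern the loop above does not use: read every file into a list first, then concatenate once at the end. It uses plain pandas instead of dask, which assumes the combined data fits in memory. The names `combine_json_lines`, `pattern`, and `sort_by` are hypothetical, introduced only for illustration.

```python
import glob
import os

import pandas as pd


def combine_json_lines(folder, pattern="*.json", sort_by=None):
    """Read every JSON Lines file in `folder` and concatenate them once.

    `folder`, `pattern`, and `sort_by` are illustrative parameters, not
    names taken from the original code.
    """
    paths = sorted(glob.glob(os.path.join(folder, pattern)))
    if not paths:
        raise FileNotFoundError(f"no files matching {pattern!r} in {folder!r}")

    # Build all per-file frames first, then concatenate a single time,
    # instead of growing one frame inside the loop.
    frames = [pd.read_json(path, lines=True) for path in paths]
    combined = pd.concat(frames, ignore_index=True)

    # The files are not sorted relative to each other, so sort only
    # after everything has been combined.
    if sort_by is not None:
        combined = combined.sort_values(sort_by).reset_index(drop=True)
    return combined
```

With the variables from the snippet above, a call might look like `combineDf = combine_json_lines(myPath, sort_by="some_column")`, where `some_column` is a placeholder for whatever key the analysis needs. If the data does not fit in memory, `dd.read_json` also accepts a list of paths or a glob pattern, which avoids the explicit loop while staying in dask.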