
I am trying to append many data frames to one empty data frame, but it is not working. I am following this tutorial.

I generate each frame inside a loop with this code:

def loop_single_symbol(p1):
    i = 0
    delayedPrice = []
    symbol = [] 
    while i<5 :
        print(p1)
        h = get_symbol_data(p1)
        delayedPrice.append(h['delayedPrice']) 
        symbol.append(h['symbol'])
        i+=1
    df = pd.DataFrame([], columns = []) 
    df["delayedPrice"] = delayedPrice
    df["symbol"] = symbol
    df["time"] = get_nyc_time()
    return df 
    time.sleep(4) 

This code generates a frame like this:

   delayedPrice symbol time
0          30.5    BAC  6:6
1          30.5    BAC  6:6
2          30.5    BAC  6:6
3          30.5    BAC  6:6
4          30.5    BAC  6:6

I then run a loop like this:

length = len(symbol_list())
data = ["BAC","AAPL"]
df = pd.DataFrame([], columns = []) 
for j in range(length): 
    u = data[j]
    if h:
        df_of_single_symbol = loop_single_symbol(u)
        print(df_of_single_symbol)
        df.append(df_of_single_symbol, ignore_index = True)        
print(df)

I am trying to append two or more data frames to one empty data frame, but with the code above I get:

Empty DataFrame
Columns: []
Index: []

And I want a result like this:

   delayedPrice symbol time
0          30.5    BAC  6:6
1          30.5    BAC  6:6
2          30.5    BAC  6:6
3          30.5    BAC  6:6
4          30.5    BAC  6:6
0        209.15   AAPL  6:6
1        209.15   AAPL  6:6
2        209.15   AAPL  6:6
3        209.15   AAPL  6:6
4        209.15   AAPL  6:6

How can I do this using pandas, and what is the best way to do it?

Note: this line

h = get_symbol_data(p1)

fetches some data from an API.

Nilay Singh
  • Unlike `list.append`, `pd.DataFrame.append` is **not** an in-place operation. You need to assign the appended dataframe back to `df`. – Chris May 03 '19 at 10:34
  • Pandas dataframes do not work like lists; they are much more complex data structures, and appending is not really considered the best approach. Why not consider a dictionary, a file, or even better a database to store the API fetches, and then visualise or process the data by converting it into pandas? – qmeeus May 03 '19 at 10:35
  • One approach is to store the API output in a database, then model/update a column before reporting. – MEdwin May 03 '19 at 10:37
  • I can do that, but here I want to append to the data frame. How can I append a new frame that I am creating to an empty data frame? – Nilay Singh May 03 '19 at 10:39
  • see my answer. In short: `join` or `pd.concat` will do – qmeeus May 03 '19 at 10:41

2 Answers


As I mentioned in my comment, repeatedly appending to a pandas dataframe is not considered a good approach, because each append builds a new dataframe and copies the existing rows. Instead, I suggest storing the data in something more appropriate, such as a file, or a database if you need scalability.

Then you can use pandas for what it is built for, i.e. data analysis, by reading the contents of the database or the file into a dataframe.

Now, if you really want to stick with this approach, I suggest pd.concat (or join, if you are adding columns rather than rows) to grow your dataframe as you get more data.

[EDIT]

Example (from one of my scripts):

results = pd.DataFrame()
for result_file in result_files:
    df = parse_results(result_file)
    results = pd.concat([results, df], axis=0).reset_index(drop=True)

parse_results is a function that takes a filename and returns a correctly formatted dataframe; adapt it to fit your needs.
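For the scenario in the question, a common variation is to collect the per-symbol frames in a plain list and call pd.concat once at the end, which avoids re-copying the growing frame on every iteration. The following is only a sketch: fetch_symbol_frame is a hypothetical stand-in for the asker's loop_single_symbol, and the prices are the sample values from the question rather than real API output.

```python
import pandas as pd


def fetch_symbol_frame(symbol):
    """Hypothetical stand-in for loop_single_symbol: returns five made-up rows."""
    sample_prices = {"BAC": 30.5, "AAPL": 209.15}  # sample values from the question
    return pd.DataFrame({
        "delayedPrice": [sample_prices[symbol]] * 5,
        "symbol": [symbol] * 5,
        "time": ["6:6"] * 5,
    })


symbols = ["BAC", "AAPL"]

# Build every frame first, then concatenate them in a single call.
frames = [fetch_symbol_frame(s) for s in symbols]
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

With ignore_index=True the rows are renumbered 0 to 9. Leaving it out keeps each frame's own 0 to 4 index, which matches the desired output shown in the question.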

qmeeus

As the comments stated, your original error is that you didn't assign the result of the df.append call to a variable: it returns a new DataFrame with the rows appended and leaves df itself unchanged.
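The effect of discarding versus keeping the return value can be sketched as follows. This uses pd.concat rather than df.append, because DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0; the single row of data is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame()
new_rows = pd.DataFrame(
    {"delayedPrice": [30.5], "symbol": ["BAC"], "time": ["6:6"]}  # made-up row
)

# Discarding the return value leaves df untouched (the bug in the question).
pd.concat([df, new_rows], ignore_index=True)
still_empty = df.empty

# Assigning the result back to df keeps the new rows.
df = pd.concat([df, new_rows], ignore_index=True)
print(still_empty, len(df))
```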

For anyone else looking to "extend" a DataFrame in place (without an intermediate DB, list or dictionary), here is a hint showing how to do this simply:

Pandas adding rows to df in loop

Basically, start with an empty DataFrame that is already set up with the correct columns, then use df.loc[] indexing to assign the new row of data to the end of the DataFrame, where len(df) points just past the end. It looks like this:

df.loc[len(df)] = ["my", "new", "data", "row"]

More detail in the linked hint.
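A slightly fuller sketch of the same idea, using the column names from the question and made-up rows. It assumes the frame keeps its default 0, 1, 2, ... index; if a label equal to len(df) already exists, the assignment overwrites that row instead of adding one.

```python
import pandas as pd

# Start from an empty frame that already has the right columns.
df = pd.DataFrame(columns=["delayedPrice", "symbol", "time"])

# Made-up rows standing in for the API results.
rows = [
    (30.5, "BAC", "6:6"),
    (209.15, "AAPL", "6:6"),
]

for row in rows:
    # len(df) is the first unused integer label, so this adds a new row.
    df.loc[len(df)] = list(row)

print(df)
```

This is convenient for a handful of rows. Each assignment enlarges the frame, so for many rows it is usually cheaper to collect the data first and build the frame once.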

Demis