8

Hello, I am trying to insert 3 empty rows after each row of my current data using pandas and then export the data. For example, the current data could look like this:

name     profession
Bill      cashier
Sam        stock
Adam      security

Ideally what I want to achieve:

name     profession
Bill      cashier
NaN         NaN
NaN         NaN
NaN         NaN
Sam        stock
NaN         NaN
NaN         NaN
NaN         NaN
Adam      security
NaN         NaN
NaN         NaN
NaN         NaN

I have experimented with itertools; however, I am not sure how to get exactly three empty rows after each row with that approach. Any help, guidance, or sample code would be much appreciated!

olive
  • Here are a few examples. Stack Overflow has many examples of similar requests: [1](https://stackoverflow.com/questions/47148170/pandas-inserting-an-empty-row-after-every-2nd-row-in-a-data-frame), [2](https://stackoverflow.com/questions/39114382/pandas-insert-alternate-blank-rows), [3](https://stackoverflow.com/questions/10715965/create-pandas-dataframe-by-appending-one-row-at-a-time), [4](https://stackoverflow.com/questions/66380156/adding-a-row-of-special-character-after-every-nth-row-n-variable-but-pre-decide/66380501#66380501) – Joe Ferndz Mar 03 '21 at 23:33
  • Does this answer your question? [Create pandas Dataframe by appending one row at a time](https://stackoverflow.com/questions/10715965/create-pandas-dataframe-by-appending-one-row-at-a-time) – Joe Ferndz Mar 03 '21 at 23:34
  • Very neat solution here: [Pandas insert alternate blank rows](https://stackoverflow.com/a/39114978/1609514). Basically, two statements: `df.index = range(0, 4*len(df), 4); df2 = df.reindex(index=range(4*len(df)))` – Bill Mar 04 '21 at 03:26

6 Answers

8

I believe using append on a DataFrame is quite inefficient, because it has to reallocate memory for the entire DataFrame on each call.

DataFrames were designed for analyzing data and for adding columns easily, but not for adding rows.

So I think a good approach is to create a new DataFrame of the correct size and then transfer the data over to it. The easiest way to do that is by using an index.

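import numpy as np
import pandas as pd

# Demonstration data
data = 'name profession Bill cashier Sam stock Adam security'
data = np.array(data.split()).reshape((4, 2))
df = pd.DataFrame(data[1:], columns=data[0])

# Add n blank rows after each row
n = 3
new_index = pd.RangeIndex(len(df) * (n + 1))
# dtype=object so the NaN-filled frame can also hold the string values
new_df = pd.DataFrame(np.nan, index=new_index, columns=df.columns, dtype=object)
ids = np.arange(len(df)) * (n + 1)  # target positions: 0, n+1, 2*(n+1), ...
new_df.loc[ids] = df.values
print(new_df)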

Output:

    name profession
0   Bill    cashier
1    NaN        NaN
2    NaN        NaN
3    NaN        NaN
4    Sam      stock
5    NaN        NaN
6    NaN        NaN
7    NaN        NaN
8   Adam   security
9    NaN        NaN
10   NaN        NaN
11   NaN        NaN
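
The same preallocate-then-fill idea can also be sketched with plain NumPy slicing instead of label-based assignment. This is only a minimal sketch: the names `step`, `values` and `new_df_np` are made up for illustration, and it assumes it is acceptable for every column to end up with object dtype.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["Bill", "Sam", "Adam"],
    "profession": ["cashier", "stock", "security"],
})

n = 3          # blank rows wanted after each original row
step = n + 1   # distance between two original rows in the result

# Preallocate an object array filled with NaN, one slot per output row
values = np.full((len(df) * step, df.shape[1]), np.nan, dtype=object)

# Every step-th row receives one original row; the rest stay NaN
values[::step] = df.to_numpy()

new_df_np = pd.DataFrame(values, columns=df.columns)
print(new_df_np)
```

Because the array is built with `dtype=object`, numeric columns would lose their numeric dtype here, so this variant is probably best suited to data that is about to be exported anyway.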
Bill
  • Efficient solution, and it is also true that you should not use append to add rows, see [Append rows to a pandas DataFrame without making a new copy](https://stackoverflow.com/a/23394424/11154841). Use `df.loc[0] = ...` or even better, do many row additions in just one step as was done here. – questionto42 Dec 06 '21 at 18:23
5
insert_rows = 4   # row spacing: number of blank rows to insert + 1

df.index = range(0, insert_rows * len(df), insert_rows)

# create new_df with the added blank rows
new_df = df.reindex(index=range(insert_rows * len(df)))
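
The reindex idea above can be wrapped in a small helper that takes the number of blank rows directly and leaves the original DataFrame untouched. This is a sketch under assumptions: `insert_blank_rows` and `n_blank` are hypothetical names, and the original index is discarded in favour of a fresh 0..N-1 index.

```python
import pandas as pd

def insert_blank_rows(df, n_blank):
    """Return a new DataFrame with n_blank all-NaN rows after every row of df."""
    step = n_blank + 1                    # spacing between original rows
    spaced = df.reset_index(drop=True)    # new object with a 0..len-1 index
    spaced.index = spaced.index * step    # 0, step, 2*step, ...
    # Labels missing from the spaced index become all-NaN rows
    return spaced.reindex(range(step * len(df)))

df = pd.DataFrame({
    "name": ["Bill", "Sam", "Adam"],
    "profession": ["cashier", "stock", "security"],
})
result = insert_blank_rows(df, 3)
print(result)
```

Working on the result of `reset_index(drop=True)` means the caller's `df.index` is not modified, which the two-statement version does as a side effect.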
PJ_
  • Nice answer. Note that `insert_rows` is actually "row spacing" or "number of rows to insert + 1". – Bill Mar 25 '23 at 21:54
2

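More information would be helpful, but one thing that comes to mind is this command:

df = df.append(pd.Series(dtype=object), ignore_index=True)

This adds one empty row to your DataFrame. Note that append returns a new DataFrame rather than modifying df in place, and that you have to pass ignore_index=True, otherwise appending an unnamed Series raises an error. DataFrame.append was removed in pandas 2.0, so this only runs on older versions.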
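
On pandas versions where DataFrame.append is not available (it was removed in pandas 2.0), roughly the same effect can be sketched with `pd.concat`. The names `blank_row` and `df_extended` are illustrative only, and the small demo frame stands in for your own data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["Bill", "Sam"],
    "profession": ["cashier", "stock"],
})

# One all-NaN row with the same columns as df
blank_row = pd.DataFrame(np.nan, index=[0], columns=df.columns)

# concat returns a new DataFrame; ignore_index=True renumbers the rows 0..n-1
df_extended = pd.concat([df, blank_row], ignore_index=True)
print(df_extended)
```

As with append, calling this repeatedly in a loop copies the data each time, so for many blank rows it is usually better to collect the pieces in a list and concatenate once.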

djvaroli
2

The code below includes a function to add empty rows between the existing rows of a dataframe.

This might not be the best approach for what you want to do; it might be better to add the blank rows when you export the data.

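import numpy as np
import pandas as pd

def add_blank_rows(df, no_rows):
    # Block of no_rows all-NaN rows with the same columns as df
    blank = pd.DataFrame(np.nan, index=range(no_rows), columns=df.columns)
    pieces = []
    for idx in range(len(df)):
        pieces.append(df.iloc[[idx]])  # list.append; DataFrame.append was removed in pandas 2.0
        pieces.append(blank)
    # Concatenate once at the end instead of growing a DataFrame row by row
    return pd.concat(pieces, ignore_index=True) if pieces else df.copy()

df = pd.read_csv('test.csv')

df_with_blank_rows = add_blank_rows(df, 3)

print(df_with_blank_rows)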
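
The alternative mentioned above, adding the blank rows at export time, could look roughly like this for CSV output. It is a sketch using the standard-library `csv` module; `write_csv_with_blank_rows` and its parameters are hypothetical names, and an in-memory `io.StringIO` buffer stands in for a real output file.

```python
import csv
import io

import pandas as pd

def write_csv_with_blank_rows(rows, header, fileobj, n_blank=3):
    """Write header and rows as CSV, adding n_blank empty records after each row."""
    writer = csv.writer(fileobj, lineterminator="\n")
    writer.writerow(header)
    blank = [""] * len(header)        # one empty field per column
    for row in rows:
        writer.writerow(row)
        for _ in range(n_blank):
            writer.writerow(blank)

df = pd.DataFrame({
    "name": ["Bill", "Sam", "Adam"],
    "profession": ["cashier", "stock", "security"],
})

buf = io.StringIO()
write_csv_with_blank_rows(df.itertuples(index=False, name=None), list(df.columns), buf)
csv_text = buf.getvalue()
print(csv_text)
```

With two columns, each blank record is written as a line containing only the separator (`,`), so the DataFrame itself never has to hold the padding rows.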
norie
2

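This works:

import numpy as np
import pandas as pd

# 3 all-NaN rows to place after each original row
blank = pd.DataFrame(np.nan, index=range(3), columns=df.columns)
pieces = []
for i, row in df.iterrows():
    pieces.append(row.to_frame().T)  # the original row as a one-row DataFrame
    pieces.append(blank)
df_new = pd.concat(pieces, ignore_index=True)

Here df is the original DataFrame. pd.concat is used because DataFrame.append was removed in pandas 2.0.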

yakir0
1

Here is a function to do that with one loop:

import numpy as np
import pandas as pd

def NAN_rows(df, n=3):
    # n all-NaN rows with the same columns as df (real NaN, not the string 'nan')
    blank = pd.DataFrame(np.nan, index=range(n), columns=df.columns)
    pieces = []
    for i in range(df.shape[0]):
        pieces.append(df.iloc[[i]])  # original row i as a one-row DataFrame
        pieces.append(blank)
    # ignore_index=True yields a continuous 0..N-1 index
    return pd.concat(pieces, ignore_index=True)

df = pd.DataFrame({
    'name': ['Bill', 'Sam', 'Adam'],
    'profession': ['cashier', 'stock', 'security']
})

print(NAN_rows(df))

# Output:

    name profession
0   Bill    cashier
1    NaN        NaN
2    NaN        NaN
3    NaN        NaN
4    Sam      stock
5    NaN        NaN
6    NaN        NaN
7    NaN        NaN
8   Adam   security
9    NaN        NaN
10   NaN        NaN
11   NaN        NaN

Inputvector