0

So what I want to accomplish is that every time in my dataframe i find a line which is missing the date and the rest of the first five columns, copy the values of the row above it. Example:

0| Date      | Name | Amount | Address
1| 12/04/2018| Pepe | $1.00  | Avenue 1
2| NaT       | NaN  |  NaN   |  NaN (In this line i need the values of the line above)
3| 1/04/2018 | Tito |  $3.00 |  Avenue 2

.

for file in files:
  fileName = os.path.splitext(file)[0]
  if fileName == 'xxxxxxx (copy)':
    df = pd.read_excel(file)
        for index, row in df.iterrows():
          if pd.isna(df['Date'] 'And the rest of the 5 columns'):
            #Copy the values of the line above it
Charlie Clark
  • 18,477
  • 4
  • 49
  • 55
  • `ffill` is what you want? but why do you want this? `ffill` to the dataframe cause severe damage to your df. It will simply remove all `NaN`, which may be you don't want to do. moreover, In your dataframe if you have all rows which is empty, why can't you remove that records from your df, instead of creating a duplicate? `df.dropna(how='all')` – Mohamed Thasin ah Dec 04 '18 at 11:09
  • I need this becuase, every row refers to a receipt that we have paid, so i receive an xlsx with this data, i read it with pandas and if there is an a row that has the first five columns empty (NaN) is another receipt that we paid but to the same customer or provider. As you can see in the example, hope this sounds clear. – Juan Burolleau Dec 04 '18 at 11:57

2 Answers2

1

Use ffill() to carry forward above row.

Example:

# Given
df = pd.DataFrame({'word':['Alpha', np.NaN, 'Charlie'],
                  'Percentage 1':[10, np.NaN, 0],
                  'Percentage 2': [5, np.NaN, 4]})
df
    word    Percentage 1    Percentage 2
0   Alpha   10.0             5.0
1   NaN     NaN              NaN
2   Charlie 0.0              4.0

df = df.ffill()

   word     Percentage 1    Percentage 2
0  Alpha    10.0            5.0
1  Alpha    10.0            5.0
2  Charlie  0.0             4.0
meW
  • 3,832
  • 7
  • 27
0

So i solve it, with using ffill(forward fill) method, here is the code:

for file in files:
  fileName = os.path.splitext(file)[0]
  if fileName == 'File1':
    df = pd.read_excel(file)
    new_df = df.fillna(method="ffill")
    writer = pd.ExcelWriter('File1.xlsx', engine='xlsxwriter')
    new_df.to_excel(writer, 'Sheet 1')
    writer.save()