0

I need to compare two rows of column 1 (title) to see if they have the same content in an Excel file. Eg: rows 1, 2, 5, 6. If the row is the same, we can go to compare the two rows of column 2 (pc) if they have the same content. So how should I implement my code to get it working?

     title  pc  rd/wr           Min Max     Avg Std_dev
   1 Test_1 PC0 Write Latency   88  1838    634 297
   2 Test_1 PC1 Write Latency   92  2363    661 369
   3 Test_2 PC0 Write Latency   90  1524    576 273
   4 Test_2 PC1 Write Latency   94  1526    568 267
   5 Test_1 PC0 Write Latency   90  1850    623 287
   6 Test_1 PC1 Write Latency   89  2403    678 356
   7 Test_2 PC0 Write Latency   88  1530    579 277
   8 Test_2 PC1 Write Latency   87  1500    587 275

My code:

import pandas as pd
df = pd.read_excel('Report_testing.xlsx')
col = []
row = []
for line in df:
    col = line.split()
    if row[i] == col[j]:

Expected Output:

Test_1  PC0 Write Latency   89    1844  628.5   292
Test_1  PC1 Write Latency   90.5  2383  669.5   362.5
Test_2  PC0 Write Latency   89    1527  577.5   275
Test_2  PC1 Write Latency   90.5  1513  577.5   271

The condition that set by me is incorrect. How should I implement my code to achieve my goal? Thanks in advance.

Findy
  • 5
  • 2
  • Does this answer your question? [Get statistics for each group (such as count, mean, etc) using pandas GroupBy?](https://stackoverflow.com/questions/19384532/get-statistics-for-each-group-such-as-count-mean-etc-using-pandas-groupby) – Naveed Sep 09 '22 at 00:51
  • @Naveed, sadly it does not answer my question. – Findy Sep 09 '22 at 01:09
  • are you able to create a dataframe as you have listed in the question? or the question is how to create that dataframe by reading excel? if you have a dataframe then its a simple groupby – Naveed Sep 09 '22 at 01:11

1 Answers1

0

This is my df

title   pc  rd/wr   Min Max Avg Std_dev
0   Test_1  PC0 Write   88  1838    634 297
1   Test_1  PC1 Write   92  2363    661 369
2   Test_2  PC0 Write   90  1524    576 273
3   Test_2  PC1 Write   94  1526    568 267
4   Test_1  PC0 Write   90  1850    623 287
5   Test_1  PC1 Write   89  2403    678 356
6   Test_2  PC0 Write   88  1530    579 277
7   Test_2  PC1 Write   87  1500    587 275

This is what i got using this df.groupby(['title', 'pc']).agg({'rd/wr':'first', 'Min': 'mean', 'Max':'mean', 'Avg':'mean', 'Std_dev':'mean'}).reset_index()

title   pc  rd/wr   Min      Max    Avg    Std_dev
Test_1  PC0 Write   89.0    1844.0  628.5   292.0
Test_1  PC1 Write   90.5    2383.0  669.5   362.5
Test_2  PC0 Write   89.0    1527.0  577.5   275.0
Test_2  PC1 Write   90.5    1513.0  577.5   271.0
XXavier
  • 1,206
  • 1
  • 10
  • 14