How to erase mismatched lines in a CSV using Python: general logic

Question

I'm at the limit of my Python knowledge here. I'm trying to check whether each line of this file (past line 0, which is a different length than the rest of the lines) has the same number of columns as the others. If it doesn't, I want to delete the whole row. For example, I want to compare row 4 to row 3; if row 4 is truncated by 3 columns, as in the picture, I just want to delete the whole row.

import glob
import os
import csv
import io

#path to files -- parent folder is path and the folders below follow parent/*/* format.
path = 'C:/Users/*/Desktop/Current Project/J20102/Test data'

#machine ID minus date -- PS and TD header 
machineID = '14:14:08','21012','223','0','1098','0','031','810','12','01','092','048','0008','02'

#Trend Data and Date
TDID = 'TD','08/24/2021'
trendHeader = 'Date/Time','G120010','M129000','G110100','M119030','G112070','G112080','G111030','G127020','G127030','G120020','G120030','G121020','G111040','G112010','P102000','G112020','G112040','G112090','G110050','G110060','G110070','T111100'
TDmachineID = TDID + machineID

#sorting lists of all csv files in path
TD_files  = glob.glob(os.path.join(path,"*.csv"), recursive=True) # + glob.glob(os.path.join(subpath,"*SMP02_*.csv"), recursive=True) 
      
for f in TD_files:                                              #FOR ALL TREND FILES:
   with io.open(f,newline='',encoding='latin1') as g:           #open file as read
      r = csv.reader((line.replace('\0','') for line in g))     #declare read variable for list while dropping nulls from file
      data = [line for line in r]                               #set list to all data in file 
      for line in r:
        if data[y] != data[y-1]:
            line.strip
      data[0] = TDmachineID                                     #add machine ID line
      data[1] = trendHeader                                     #add trend header line
   with open(f,'w',newline='') as g:                            #open file as write
      w = csv.writer(g)                                         #declare write variable
      w.writerows(data)                                         #write rows

print('Complete!')

[Image: the row to delete]

You can use `len()` to test whether the number of fields in a record is the same as that of another record. You can use slicing to start processing at a row after the first. Your code never sets or increments `y`. Also, you're trying to consume the file twice without seeking back to the beginning before the second time. You should process the file only once if possible. If not, add a `seek(0)`. You're also mixing iterating over the records with iterating over the index of the records. You should choose one. — Dennis Williamson, Sep 16 '22 at 15:48
@DennisWilliamson Thanks for the tips. I'm going to employ them now. How and where do you suggest I nest `len()`? This is what I'm currently working with now... `if len(data[y] != len(data[y-1])):` but I'm just getting this guy to the point where it works. — Kyle Lucas, Sep 16 '22 at 17:19
`line.strip` doesn't do what I think you want. Since there are no parentheses, you're just saying the name of the function with no effect. Note that `strip()` removes leading and trailing white space which you may want, but doesn't delete the line. Note that it's also not a method of lists so you can't use it with `data` either. You never use the value of `line` so that `strip()` would be a wasted operation. So now if you delete items from `data`, the index will get out of whack. See [this](https://thispointer.com/python-remove-elements-from-a-list-while-iterating/) for an explanation. — Dennis Williamson, Sep 16 '22 at 19:15
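
Pulling the comments above together, here is a minimal sketch of one way to do the filtering. It is not the asker's final code. The function name `drop_mismatched_rows` and its `skip` parameter are hypothetical, the sample data is made up, and the expected row length is assumed to be the most common field count among the rows after the header (comparing each row only to the previous one would also reject a good row that follows a truncated one).

```python
import csv
import io
from collections import Counter


def drop_mismatched_rows(rows, skip=1):
    """Return rows with wrong-length records removed.

    The first `skip` rows (for example a header of a different length)
    are kept as they are. Assumption: the expected length is the most
    common field count among the remaining rows.
    """
    head, body = rows[:skip], rows[skip:]   # slicing skips the header rows
    if not body:
        return list(head)
    expected = Counter(len(row) for row in body).most_common(1)[0][0]
    # Build a new list instead of deleting from `body` while looping over it,
    # so no index gets out of step. Note that each len() call closes its own
    # parenthesis before the comparison.
    return head + [row for row in body if len(row) == expected]


# Small in-memory demonstration; these values are invented for illustration.
sample = "ID,x\r\nt,a,b,c\r\n1,2,3,4\r\n5\r\n6,7,8,9\r\n"
rows = list(csv.reader(io.StringIO(sample, newline="")))
cleaned = drop_mismatched_rows(rows, skip=1)
print(cleaned)
```

In the script from the question, a call such as `data = drop_mismatched_rows(data, skip=1)` could go right after `data` is read and before the two header rows are assigned, which keeps the file read to a single pass.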

0 Answers