The Problem
I have a CSV file that contains a large number of items.
The first column can contain either an IP address or random garbage. The only other column I care about is the fourth one.
I wrote the snippet below to check whether the first column is an IP address and, if so, to write it and the contents of the fourth column side by side to another CSV file.
    import csv
    import re

    with open('results.csv','r') as csvresults:
        filecontent = csv.reader(csvresults)
        output = open('formatted_results.csv','w')
        processedcontent = csv.writer(output)
        for row in filecontent:
            first = str(row[0])
            fourth = str(row[3])
            # keep only rows whose first column looks like a dotted-quad IP address
            if re.match(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}', first) is not None:
                # this is the line in question: one pre-formatted string is written
                processedcontent.writerow(["{},{}".format(first,fourth)])
            else:
                continue
        output.close()
This works to an extent. However, when I view the result in Excel, both items are placed in a single cell rather than in two adjacent ones. Opening the file in Notepad shows that each line is wrapped in quotation marks. If I remove them, Excel displays the columns properly.
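The quoting can be reproduced without any files. The sketch below is only an illustration: `render_row` is a hypothetical helper (not part of the snippet above) that writes one row into an in-memory buffer, so the text produced for a single pre-formatted string can be compared with the text produced for two separate fields.

```python
import csv
import io


def render_row(fields):
    """Return the text csv.writer emits for one row (hypothetical helper)."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(fields)
    return buffer.getvalue()


# One field that happens to contain a comma: the writer quotes the whole field
# so that the comma is not mistaken for a delimiter.
single_field = render_row(["{},{}".format("1.2.3.4", "reallyimportantdata")])

# Two separate fields: the writer inserts the delimiter itself, no quotes needed.
two_fields = render_row(["1.2.3.4", "reallyimportantdata"])

print(repr(single_field))
print(repr(two_fields))
```

With the default dialect, the first call appears to be what produces the quoted line seen in Notepad, since the row contains a single field with an embedded comma.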
Example Input
1.2.3.4,rubbish1,rubbish2,reallyimportantdata
Desired Output
1.2.3.4 reallyimportantdata - two separate columns
Actual Output
"1.2.3.4,reallyimportantdata" - single column
The Question
Is there any way to fudge the format part so that it doesn't write out the quotation marks? Alternatively, what would be the best way to achieve what I'm trying to do?
I've tried writing out to another file and stripping the lines but, despite not throwing any errors, the result was the same...
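For reference, here is a minimal sketch of one possible approach, under a few assumptions: the addresses are IPv4 only, rows with fewer than four columns can be skipped, and the standard-library `ipaddress` module is swapped in for the regular expression. The names `is_ipv4` and `extract_ip_rows` are hypothetical and exist only for this illustration. The main difference from the snippet above is that `writerow` receives a list of two fields rather than one pre-joined string.

```python
import csv
import ipaddress


def is_ipv4(text):
    """Return True if text parses as an IPv4 address (hypothetical helper)."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False


def extract_ip_rows(source_path, target_path):
    """Copy column 1 and column 4 of every row whose first column is an IPv4 address.

    Returns the number of rows written. Rows with fewer than four columns
    are skipped (an assumption, since the question does not say how to treat them).
    """
    written = 0
    # newline='' lets the csv module control line endings itself
    with open(source_path, newline='') as source, \
            open(target_path, 'w', newline='') as target:
        reader = csv.reader(source)
        writer = csv.writer(target)
        for row in reader:
            if len(row) >= 4 and is_ipv4(row[0]):
                # two list items become two columns; no manual comma joining
                writer.writerow([row[0], row[3]])
                written += 1
    return written
```

Calling `extract_ip_rows('results.csv', 'formatted_results.csv')` would then be expected to produce lines such as `1.2.3.4,reallyimportantdata`, which Excel should split into two adjacent cells.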