How can you write an if/else statement in Python to process a file that contains both text and numbers?
Let's say I want to replace the values in the third-to-last column of the file below. I want an if/else statement that replaces a value with zero if it is less than 5 or if it is a dot ("."), and, if possible, I want to use the resulting value as an integer in a sum.
A quick-and-dirty solution using awk would look like this, but I'm curious about how to handle this type of data in Python:
awk -F"[ :]" '{if ( (!/^#/) && ($9<5 || $9==".") ) $9="0" ; print }'
So how do you solve this problem?
Thanks
Input file:
##Comment1
#Header
sample1 1 2 3 4 1:0:2:1:.:3
sample2 1 4 3 5 1:3:2:.:3:3
sample3 2 4 6 7 .:0:6:5:4:0
Desired output:
##Comment1
#Header
sample1 1 2 3 4 1:0:2:0:0:3
sample2 1 4 3 5 1:3:2:0:3:3
sample3 2 4 6 7 .:0:6:5:4:0
SUM = 5
Result so far:
['sample1', '1', '2', '3', '4', '1', '0', '2', '0', '0', '3\n']
['sample2', '1', '4', '3', '5', '1', '3', '2', '0', '3', '3\n']
['sample3', '2', '4', '6', '7', '.', '0', '6', '5', '4', '0']
Here's what I have tried so far:
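import re

# Read the file line by line; comment lines starting with "#" are skipped.
with open("inputfile.txt", "r") as data:
    for line in data:
        if not line.startswith("#"):
            # Turn every ":." into ":0" so the field can be parsed as an integer.
            nodots = line.replace(":.", ":0")
            # Split on tabs and colons; index 8 is the third-to-last field.
            final_nodots = re.split('\t|:', nodots)
            if int(final_nodots[8]) < 5:
                final_nodots[8] = "0"
                print(final_nodots)
            else:
                print(final_nodots)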
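For reference, here is a minimal sketch of the rule described in the question, written so that the lines are put back together and the sum is computed. It rests on a few assumptions that are not confirmed by the file itself: the columns are separated by whitespace, the last column holds the colon-separated values, the target is the third-to-last of those values (the same field as `$9` in the awk command), and every target value is either an integer or a dot. The names `clean_line` and `process` are hypothetical helpers made up for this example. As in the awk version, only the target field is changed; dots in other positions are left alone.

```python
import re


def clean_line(line, index=-3, threshold=5):
    """Return (cleaned_line, value) for one line of the file.

    Assumption: the last whitespace-separated column holds the
    colon-separated values, and `index` selects the target value.
    """
    text = line.rstrip()
    # Comment lines and empty lines are passed through unchanged.
    if text.startswith("#") or not text:
        return text, 0
    # Split off the last column while keeping the original separators.
    match = re.fullmatch(r"(.*\s)(\S+)", text)
    if match is None:
        return text, 0
    head, last = match.group(1), match.group(2)
    parts = last.split(":")
    try:
        value = parts[index]
    except IndexError:
        # Fewer colon-separated values than expected: leave the line alone.
        return text, 0
    # The if/else from the question: a dot or a value below 5 becomes "0".
    if value == "." or int(value) < threshold:
        parts[index] = "0"
    return head + ":".join(parts), int(parts[index])


def process(lines):
    """Clean every line and sum the target values of the data lines."""
    cleaned, total = [], 0
    for line in lines:
        new_line, value = clean_line(line)
        cleaned.append(new_line)
        total += value
    return cleaned, total


sample = [
    "##Comment1",
    "#Header",
    "sample1 1 2 3 4 1:0:2:1:.:3",
    "sample2 1 4 3 5 1:3:2:.:3:3",
    "sample3 2 4 6 7 .:0:6:5:4:0",
]
cleaned, total = process(sample)
print("\n".join(cleaned))
print("SUM =", total)  # SUM = 5
```

To run this on a real file, the `sample` list could be swapped for an open file handle, for example `with open("inputfile.txt") as data: cleaned, total = process(data)`. Checking for the dot before calling `int()` matters here, because `int(".")` raises a `ValueError`.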