I have a fixed-width file that I'm working with in Python, but loading it fails with an error because the hex value \x9f cannot be read. Forcing the file to load as latin-1 fixes that, but when I then try to replace the \x9f value, it doesn't work unless I write the result out to another file, which doesn't seem efficient.
Can anyone advise a better way of doing this, please?
import pprint
import re, collections
import platform

#### INPUTS ####
layout = [
    ('ID', 0, 11),
    ('FIN-STATEMENT-IND', 83, 84),
    ('RECENT-FIN-STAT-AGE', 84, 86),
    ('FAILED-TO-FILE-IND', 86, 87),
    ('FIN-STAT-OVDUE-IND', 87, 88),
    ('NET-WORTH', 88, 99),
]
headerdict = {}

#### OPEN ####
with open('uk_dcl_mrg.txt', 'r+', encoding='latin-1') as f:
    for line in f:
        f.write(line.replace('\x9f', '?'))
    ct = 0
    for line in f:
        ct += 1
        #### OUTPUT ####
        for i in layout:  ## Loop to create dictionary
            headerdict[i[0]] = line[i[1]:i[2]]
        print('Sort by keys:')
        for key in sorted(headerdict.keys()):
            print("%s: %s" % (key, headerdict[key]))
        print(headerdict)
        # print(platform.python_version())
        if ct >= 1:
            break
If I add the line below, so that I write to a second file and then create the dictionary from that file, it works fine, but I don't want to create a second file.
with open('uk_dcl_mrg_out.txt', 'r+', encoding='latin-1') as fo:
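
For comparison, here is a minimal sketch of one way the second file might be avoided, assuming the replacement is only needed for building the dictionary and the file on disk can stay unchanged. The helper name `read_records` and its `limit` parameter are hypothetical and not part of the code above. The idea is to open the file read-only as latin-1, replace the character on each line in memory, and slice the cleaned string directly.

```python
# Field layout copied from the question: (name, start, end) slice positions.
LAYOUT = [
    ('ID', 0, 11),
    ('FIN-STATEMENT-IND', 83, 84),
    ('RECENT-FIN-STAT-AGE', 84, 86),
    ('FAILED-TO-FILE-IND', 86, 87),
    ('FIN-STAT-OVDUE-IND', 87, 88),
    ('NET-WORTH', 88, 99),
]


def read_records(path, layout, limit=None):
    """Yield one dict per line; the 0x9f character is replaced in memory only.

    Hypothetical helper: nothing is written back to the file.
    """
    # latin-1 maps every byte value to a code point, so decoding cannot fail.
    with open(path, 'r', encoding='latin-1') as f:
        for count, line in enumerate(f, start=1):
            clean = line.replace('\x9f', '?')
            # Slice each field out of the cleaned line.
            yield {name: clean[start:end] for name, start, end in layout}
            # Stop early once the requested number of records has been read.
            if limit is not None and count >= limit:
                break
```

Iterating over `read_records('uk_dcl_mrg.txt', LAYOUT, limit=1)` would give the same single-record dictionary as the loop in the question, with `?` in place of \x9f. Since the string is cleaned before it is sliced, no `r+` mode or output file is involved.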