I'm using Microsoft SQL Server Management Studio to create an XML file. This file needs at the top to be uploaded properly. I understand that this is fairly normal and I need to figure out how to add that line myself.
To add the line, I'm calling each of my files and modifying them with the following function:
def append_prologue(file, orgID, schema):
timestamp = datetime.today().strftime('%Y%m%d')
new_name = f'{orgID}_000_2022TSDS_{timestamp}1500_' + schema
new_file = file.parent.parent / 'results/with_prologue' / new_name
if new_file.exists():
print(f'{new_file.name} already exists')
with open(file, 'r') as original:
data = original.read()
data = data[3:] #how the original writer dealt with the issue
with open(new_file, 'w+') as modified:
modified.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>" + data)
return
However, this creates a problem. It will write but it adds "\ufeff" which I understand to be a BOM and the XML file can't be read properly. I took over this project for a coworker who left my company and they wrote this code. They addressed the issue by removing the BOM but it doesn't seem to work for me. I also suspect there's probably a more systematic way of doing it.
What am I doing wrong? Is there a way to remove these characters when I write the file? Should I be approaching this differently?