0

My dataframe looks as such:

Dataframe I'm looking to insert each row into a pre-defined text file so that the values have a specific place in the document. This is what I came up with:

for i in range(len(df)):
with open("%s.xml" %index, "w") as f:
    f.write(
     """<?xml version="1.0"?>
<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2" xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ccts="urn:un:unece:uncefact:documentation:2" xsi:schemaLocation="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2 http://docs.oasis-open.org/ubl/os-UBL-2.1/xsd/maindoc/UBL-Invoice-2.1.xsd">
  <cbc:UBLVersionID>2.1</cbc:UBLVersionID>
  <cbc:CustomizationID>urn:www.cenbii.eu:transaction:biitrns010:ver2.0:extended:urn:www.peppol.eu:bis:peppol4a:ver2.0:extended:urn:www.simplerinvoicing.org:si:si-ubl:ver1.1.x</cbc:CustomizationID>
  <cbc:ProfileID>urn:www.cenbii.eu:profile:bii04:ver2.0</cbc:ProfileID>
  <cbc:ID> """df[Factuurdatum[i]]" </cbc:ID>
  <cbc:IssueDate> Totaal </cbc:IssueDate>
  <cbc:DueDate> Factuurdatum[i] </cbc:DueDate>"
  <cbc:InvoiceTypeCode listID="UNCL1001" listAgencyID="6">380</cbc:InvoiceTypeCode>
  <cbc:DocumentCurrencyCode>EUR</cbc:DocumentCurrencyCode>
  <cac:AccountingSupplierParty>

My ideal output would then be for the first row:

<?xml version="1.0"?>
    <Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2" xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ccts="urn:un:unece:uncefact:documentation:2" xsi:schemaLocation="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2 http://docs.oasis-open.org/ubl/os-UBL-2.1/xsd/maindoc/UBL-Invoice-2.1.xsd">
      <cbc:UBLVersionID>2.1</cbc:UBLVersionID>         <cbc:CustomizationID>urn:www.cenbii.eu:transaction:biitrns010:ver2.0:extended:urn:www.peppol.eu:bis:peppol4a:ver2.0:extended:urn:www.simplerinvoicing.org:si:si-ubl:ver1.1.x</cbc:CustomizationID>
      <cbc:ProfileID>urn:www.cenbii.eu:profile:bii04:ver2.0</cbc:ProfileID>
      <cbc:ID> ""0606194584" </cbc:ID>
      <cbc:IssueDate> 12.93 </cbc:IssueDate>
      <cbc:DueDate> 2020-09-18 </cbc:DueDate>"
      <cbc:InvoiceTypeCode listID="UNCL1001" listAgencyID="6">380</cbc:InvoiceTypeCode>
      <cbc:DocumentCurrencyCode>EUR</cbc:DocumentCurrencyCode>
      <cac:AccountingSupplierParty>

My ideal output would then be for the second row:

<?xml version="1.0"?>
    <Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2" xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ccts="urn:un:unece:uncefact:documentation:2" xsi:schemaLocation="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2 http://docs.oasis-open.org/ubl/os-UBL-2.1/xsd/maindoc/UBL-Invoice-2.1.xsd">
      <cbc:UBLVersionID>2.1</cbc:UBLVersionID>         <cbc:CustomizationID>urn:www.cenbii.eu:transaction:biitrns010:ver2.0:extended:urn:www.peppol.eu:bis:peppol4a:ver2.0:extended:urn:www.simplerinvoicing.org:si:si-ubl:ver1.1.x</cbc:CustomizationID>
      <cbc:ProfileID>urn:www.cenbii.eu:profile:bii04:ver2.0</cbc:ProfileID>
      <cbc:ID> ""20200633369" </cbc:ID>
      <cbc:IssueDate> 30.25 </cbc:IssueDate>
      <cbc:DueDate> 2020-06-26 </cbc:DueDate>"
      <cbc:InvoiceTypeCode listID="UNCL1001" listAgencyID="6">380</cbc:InvoiceTypeCode>
      <cbc:DocumentCurrencyCode>EUR</cbc:DocumentCurrencyCode>
      <cac:AccountingSupplierParty>

etcetera for every row. What is a possible way to do this? Can someone help me?

  • Please don't post images of code, data, or Tracebacks. Copy and paste it as text then format it as code (select it and type `ctrl-k`) ... [Discourage screenshots of code and/or errors](https://meta.stackoverflow.com/questions/303812/discourage-screenshots-of-code-and-or-errors). – wwii Dec 08 '20 at 19:29
  • @WWII I thought some guidance or tips would be nice as I am really stuck – Max den Hoed Dec 08 '20 at 19:33
  • Please make sure you read and understand, how https://stackoverflow.com/tour this page works and how we can help you to our best capabilities. @wwii comment is on the contrary, helpful and trying to help you by showing you how to improve your question to get an answer. – Celius Stingher Dec 08 '20 at 19:40
  • 2
    Your question would be easier to work with if you posted a smaller example. Include code that initializes a dataframe with a few rows and columns and a small xml document to be modified. As stands we can't test a solution and your example xml is very messed up and its not easy to spot where the information should be added. I see what you want from the code, but make it easy for future readers to get the gist of the problem. – tdelaney Dec 08 '20 at 19:45
  • [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). – wwii Dec 08 '20 at 19:49

2 Answers2

1

You are almost there. You can use string formatting to insert your values in the string like so:

data = "some data i want to insert"

result = "This is what I want to say: {}".format(data)
# or
result = f"This is what I want to say: {data}"

References:

https://docs.python.org/3/library/stdtypes.html?highlight=format#str.format

https://docs.python.org/3/library/string.html#formatstrings

Luca Angioloni
  • 2,243
  • 2
  • 19
  • 28
1

If you iterate the rows, you get a tuple of (index, series) where series holds the column values of a single row. That series can be expanded into a str.format call that holds a template of the xml you want to generate. As a simple example

>>> df=pd.DataFrame([[1,2,3],[4,5,6]], columns=['A','B','C'])
>>> df
   A  B  C
0  1  2  3
1  4  5  6
>>> template = "<xml>\n  <a>{A}</a>\n  <b>{B}</b>\n  <c>{C}</c>\n</xml>"
>>> for row in df.iterrows():
...     print(template.format(**row[1]))
... 
<xml>
  <a>1</a>
  <b>2</b>
  <c>3</c>
</xml>
<xml>
  <a>4</a>
  <b>5</b>
  <c>6</c>
</xml>

Expanding this example I broke the wanted document into boiler plate for the canned surrounding xml document and a {fac_details} format variable for the unique information. I don't know a good name for this data so I called it "fac" - you'll want something more descriptive. I tried to make the xml more complete and didn't address all of the columns you are interested in.

NOTE: OP does not supply a complete running program so this is untested pseudocode.

# xml document to be expanding with per row details
fac_doc_template = """<?xml version="1.0"?>
<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2" xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ccts="urn:un:unece:uncefact:documentation:2" xsi:schemaLocation="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2 http://docs.oasis-open.org/ubl/os-UBL-2.1/xsd/maindoc/UBL-Invoice-2.1.xsd">
  <cbc:UBLVersionID>2.1</cbc:UBLVersionID>
  <cbc:CustomizationID>urn:www.cenbii.eu:transaction:biitrns010:ver2.0:extended:urn:www.peppol.eu:bis:peppol4a:ver2.0:extended:urn:www.simplerinvoicing.org:si:si-ubl:ver1.1.x</cbc:CustomizationID>
  <cbc:ProfileID>urn:www.cenbii.eu:profile:bii04:ver2.0</cbc:ProfileID>
  {fac_details}
</cbc:CustomizationID>
</Invoice>"""

# per row details
# todo: expand for all of the column values you want
fac_details_xml_template = """
<cbc:ID>{Factuurnumer}</cbc:ID>
<cbc:IssueDate>{Factuurdatum}</cbc:IssueDate>
"""

def series_to_fac_details_xml(s):
    return fac_details_xml_template.format(**s)

for index, row in df.iterrows():
    details = series_to_fac_details_xml(row)
    with open(f"{index}.xml", "w") as f:
        f.write(fac_doc_template.format(fac_details=details))
tdelaney
  • 73,364
  • 6
  • 83
  • 116