
I have an XML file that contains a list of individual <order/> elements. I would like to split this single XML file into multiple files, each of which contains a single order.

Example input file:

<orders>
  <order order-id="123">
    <line-item product-id="ABC">ABC</line-item>
  </order>
  <order order-id="456">
    <line-item product-id="DEF">DEF</line-item>
  </order>
</orders>

Desired outputs:

Order-123.xml:

 <order order-id="123">
   <line-item product-id="ABC">ABC</line-item>
 </order>

Order-456.xml:

 <order order-id="456">
   <line-item product-id="DEF">DEF</line-item>
 </order>

At this stage, I am not concerned with unmarshalling each of the order details into a struct; I merely want an exact copy of each <order/> node in its own file.

I have tried a few variations of this using xml:",innerxml", like this:

type OrdersRaw struct {
  Orders []OrderRaw `xml:"order"`
}

type OrderRaw struct {
  Order string `xml:",innerxml"`
}

The problem in the above code is that each Order value contains the inner XML (starting with <line-item/>) but does not contain the wrapping <order/> tag.

Ideally, if there were an xml:",outerxml" tag, it would serve my purpose.

I am trying to avoid doing hacky things like manually concatenating the inner XML with handwritten XML (e.g. <order order-id="123">).

Chad Gilbert
  • Seems it can be done with just regexp. – Uvelichitel Sep 07 '16 at 16:12
  • Parsing XML with a regular expression? Just because you _can_ doesn't mean you _should_. I'm trying to avoid [hacky solutions like that](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454). – Chad Gilbert Sep 07 '16 at 16:15
  • In simple cases regexp can be faster and more readable than XML Encoder Decoder pipe – Uvelichitel Sep 07 '16 at 16:24
  • Did you find any generic solution to get a copy of the original outerxml when parsing into a struct? – tmm1 Dec 14 '19 at 02:04

1 Answer


Use XMLName on your OrderRaw:

type OrderRaw struct {
    XMLName xml.Name `xml:"order"`
    Order   string   `xml:",innerxml"`
    OrderID string   `xml:"order-id,attr"`
}

Playground: https://play.golang.org/p/onK5FExqzD.
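
For readers who cannot open the playground link, here is a minimal sketch of how the XMLName approach might be used end to end. It assumes the input has the shape shown in the question; splitOrders is a hypothetical helper name, not something from the original answer, and the sketch prints the results instead of writing files. Note that only the order-id attribute survives the round trip.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// OrdersRaw and OrderRaw mirror the structs from the question and the answer.
type OrdersRaw struct {
	Orders []OrderRaw `xml:"order"`
}

type OrderRaw struct {
	XMLName xml.Name `xml:"order"`         // restores the <order> wrapper on Marshal
	Order   string   `xml:",innerxml"`     // raw content between <order> and </order>
	OrderID string   `xml:"order-id,attr"` // the only attribute this version keeps
}

// splitOrders is a hypothetical helper: it maps an output file name to the
// re-marshalled XML of a single order.
func splitOrders(input []byte) (map[string]string, error) {
	var all OrdersRaw
	if err := xml.Unmarshal(input, &all); err != nil {
		return nil, err
	}
	files := make(map[string]string, len(all.Orders))
	for _, o := range all.Orders {
		b, err := xml.Marshal(o)
		if err != nil {
			return nil, err
		}
		files["Order-"+o.OrderID+".xml"] = string(b)
	}
	return files, nil
}

func main() {
	input := []byte(`<orders>
  <order order-id="123">
    <line-item product-id="ABC">ABC</line-item>
  </order>
  <order order-id="456">
    <line-item product-id="DEF">DEF</line-item>
  </order>
</orders>`)

	files, err := splitOrders(input)
	if err != nil {
		panic(err)
	}
	for name, content := range files {
		fmt.Printf("%s:\n%s\n\n", name, content)
	}
}
```

To produce real files, each map entry could be passed to os.WriteFile instead of being printed.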


EDIT: If you want to save all attributes, you'll have to implement the xml.Marshaler and xml.Unmarshaler interfaces:

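// Assumes OrderRaw has an unexported field: attrs map[xml.Name]string.
// Drop the OrderID attr field if you use this, or order-id is written twice.
func (o *OrderRaw) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
    // Remember every attribute found on the <order> start tag.
    o.attrs = map[xml.Name]string{}
    for _, a := range start.Attr {
        o.attrs[a.Name] = a.Value
    }
    // The local type has no methods, so decoding it does not recurse.
    type order OrderRaw
    return d.DecodeElement((*order)(o), &start)
}

func (o OrderRaw) MarshalXML(e *xml.Encoder, start xml.StartElement) error {
    // Re-attach the saved attributes (map iteration order is not fixed).
    for name, attr := range o.attrs {
        start.Attr = append(start.Attr, xml.Attr{Name: name, Value: attr})
    }
    start.Name = xml.Name{Local: "order"}
    type order OrderRaw
    return e.EncodeElement(order(o), start)
}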

Playground: https://play.golang.org/p/XhkwqJFyMd.
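
In case the playground link is unavailable, the following is a self-contained sketch of the same idea under a few assumptions. It stores the attributes in a slice of xml.Attr instead of the map used above, so that the original attribute order is kept; that is a swapped-in detail, not the answer's exact code. The helper name splitOrdersKeepingAttrs and the priority attribute in the sample input are hypothetical and only serve to show an attribute the struct does not know about.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

type OrdersRaw struct {
	Orders []OrderRaw `xml:"order"`
}

type OrderRaw struct {
	XMLName xml.Name `xml:"order"`
	Order   string   `xml:",innerxml"`
	attrs   []xml.Attr // assumption: a slice instead of a map, to keep attribute order
}

// UnmarshalXML saves all attributes of the start tag, then lets the default
// decoder fill the remaining fields via a method-less local type.
func (o *OrderRaw) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
	o.attrs = append([]xml.Attr(nil), start.Attr...)
	type plain OrderRaw
	return d.DecodeElement((*plain)(o), &start)
}

// MarshalXML writes the element back out with the saved attributes attached.
func (o OrderRaw) MarshalXML(e *xml.Encoder, start xml.StartElement) error {
	start.Name = xml.Name{Local: "order"}
	start.Attr = append(start.Attr, o.attrs...)
	type plain OrderRaw
	return e.EncodeElement(plain(o), start)
}

// splitOrdersKeepingAttrs is a hypothetical helper that returns each order
// re-marshalled as its own XML document, in input order.
func splitOrdersKeepingAttrs(input []byte) ([]string, error) {
	var all OrdersRaw
	if err := xml.Unmarshal(input, &all); err != nil {
		return nil, err
	}
	out := make([]string, 0, len(all.Orders))
	for _, o := range all.Orders {
		b, err := xml.Marshal(o)
		if err != nil {
			return nil, err
		}
		out = append(out, string(b))
	}
	return out, nil
}

func main() {
	// "priority" is a made-up attribute that the struct has no field for.
	input := []byte(`<orders>
  <order order-id="123" priority="high">
    <line-item product-id="ABC">ABC</line-item>
  </order>
</orders>`)

	orders, err := splitOrdersKeepingAttrs(input)
	if err != nil {
		panic(err)
	}
	for _, o := range orders {
		fmt.Println(o)
	}
}
```

This sketch keeps unprefixed attributes as they appear in the source; namespaced attributes are not handled specially here and may be written with different prefixes.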

Ainar-G
  • Thanks, that's close, but I'm hoping for a solution that captures the full outer xml of each order, in case an attribute is added (e.g. ``) in the future without my knowledge. – Chad Gilbert Sep 07 '16 at 16:18