0

I created a custom pipeline component that uses the OpenXML method here to extract the XML data from within an Excel workbook. The 'raw' XML from the workbook has this schema (generated from an instance of the extracted XML from the OpenXML code referenced above:

    <?xml version="1.0" encoding="utf-16"?>
<xs:schema xmlns:b="http://schemas.microsoft.com/BizTalk/2003" xmlns:x="http://schemas.openxmlformats.org/spreadsheetml/2006/main" attributeFormDefault="unqualified" elementFormDefault="qualified" targetNamespace="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:import schemaLocation=".\Smart Recruiters_ExcelRawXML.xsd" namespace="http://www.w3.org/XML/1998/namespace" />
  <xs:annotation>
    <xs:appinfo>
      <references xmlns="http://schemas.microsoft.com/BizTalk/2003">
        <reference targetNamespace="http://www.w3.org/XML/1998/namespace" />
      </references>
    </xs:appinfo>
  </xs:annotation>
  <xs:element name="Root">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="dimension">
          <xs:complexType>
            <xs:attribute name="ref" type="xs:string" use="required" />
          </xs:complexType>
        </xs:element>
        <xs:element name="sheetViews">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="sheetView">
                <xs:complexType>
                  <xs:attribute name="tabSelected" type="xs:boolean" use="required" />
                  <xs:attribute name="workbookViewId" type="xs:unsignedByte" use="required" />
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <xs:element name="sheetFormatPr">
          <xs:complexType>
            <xs:attribute name="defaultRowHeight" type="xs:decimal" use="required" />
          </xs:complexType>
        </xs:element>
        <xs:element name="sheetData">
          <xs:complexType>
            <xs:sequence>
              <xs:element maxOccurs="unbounded" name="row">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element maxOccurs="unbounded" name="c">
                      <xs:complexType>
                        <xs:sequence minOccurs="0">
                          <xs:element name="is">
                            <xs:complexType>
                              <xs:sequence>
                                <xs:element name="t">
                                  <xs:complexType>
                                    <xs:simpleContent>
                                      <xs:extension base="xs:string">
                                        <xs:attribute ref="xml:space" use="optional" />
                                      </xs:extension>
                                    </xs:simpleContent>
                                  </xs:complexType>
                                </xs:element>
                              </xs:sequence>
                            </xs:complexType>
                          </xs:element>
                        </xs:sequence>
                        <xs:attribute name="r" type="xs:string" use="required" />
                        <xs:attribute name="s" type="xs:unsignedByte" use="required" />
                        <xs:attribute name="t" type="xs:string" use="optional" />
                      </xs:complexType>
                    </xs:element>
                  </xs:sequence>
                  <xs:attribute name="r" type="xs:unsignedShort" use="required" />
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <xs:element name="pageMargins">
          <xs:complexType>
            <xs:attribute name="left" type="xs:decimal" use="required" />
            <xs:attribute name="right" type="xs:decimal" use="required" />
            <xs:attribute name="top" type="xs:decimal" use="required" />
            <xs:attribute name="bottom" type="xs:decimal" use="required" />
            <xs:attribute name="header" type="xs:decimal" use="required" />
            <xs:attribute name="footer" type="xs:decimal" use="required" />
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

An example of the first element in the first row is:

<x:row r="1"><x:c r="A1" s="1" t="inlineStr"><x:is><x:t>ID</x:t></x:is>

My question is, how can this be used in the mapper to iterate through the column records and apply each row/column value to the correct row/field in the target XML (below). I have searched online but I havent seen anything that can evaluate the row and apply to the correct target row/field. Assume the input will always have the correct number of columns, even though 'c' is stated to be unbounded.

<?xml version="1.0" encoding="utf-16"?>
<xs:schema xmlns="http://Schemas.SmartRecruiters_Joibs_Manual" xmlns:b="http://schemas.microsoft.com/BizTalk/2003" targetNamespace="http://Schemas.SmartRecruiters_Joibs_Manual" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:annotation>
    <xs:appinfo>
      <schemaEditorExtension:schemaInfo namespaceAlias="b" extensionClass="Microsoft.BizTalk.FlatFileExtension.FlatFileExtension" standardName="Flat File" xmlns:schemaEditorExtension="http://schemas.microsoft.com/BizTalk/2003/SchemaEditorExtensions" />
      <b:schemaInfo standard="Flat File" codepage="65001" default_pad_char=" " pad_char_type="char" count_positions_by_byte="false" parser_optimization="speed" lookahead_depth="3" suppress_empty_nodes="false" generate_empty_nodes="true" allow_early_termination="false" early_terminate_optional_fields="false" allow_message_breakup_of_infix_root="false" compile_parse_tables="false" root_reference="Root" />
    </xs:appinfo>
  </xs:annotation>
  <xs:element name="Root">
    <xs:annotation>
      <xs:appinfo>
        <b:recordInfo structure="delimited" child_delimiter_type="hex" child_delimiter="0xD 0xA" child_order="postfix" sequence_number="1" preserve_delimiter_for_empty_data="true" suppress_trailing_delimiters="false" />
      </xs:appinfo>
    </xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:annotation>
          <xs:appinfo>
            <groupInfo sequence_number="0" xmlns="http://schemas.microsoft.com/BizTalk/2003" />
          </xs:appinfo>
        </xs:annotation>
        <xs:element maxOccurs="unbounded" name="Root_Child1">
          <xs:annotation>
            <xs:appinfo>
              <b:recordInfo structure="delimited" child_delimiter_type="char" child_delimiter="," child_order="infix" sequence_number="1" preserve_delimiter_for_empty_data="true" suppress_trailing_delimiters="false" />
            </xs:appinfo>
          </xs:annotation>
          <xs:complexType>
            <xs:sequence>
              <xs:annotation>
                <xs:appinfo>
                  <groupInfo sequence_number="0" xmlns="http://schemas.microsoft.com/BizTalk/2003" />
                </xs:appinfo>
              </xs:annotation>
              <xs:element name="Root_Child1_Child1" type="xs:string">
                <xs:annotation>
                  <xs:appinfo>
                    <b:fieldInfo justification="left" sequence_number="1" />
                  </xs:appinfo>
                </xs:annotation>
              </xs:element>
              <xs:element name="Root_Child1_Child2" type="xs:string">
                <xs:annotation>
                  <xs:appinfo>
                    <b:fieldInfo justification="left" sequence_number="2" />
                  </xs:appinfo>
                </xs:annotation>
              </xs:element>
              <xs:element name="Root_Child1_Child3" type="xs:string">
                <xs:annotation>
                  <xs:appinfo>
                    <b:fieldInfo justification="left" sequence_number="3" />
                  </xs:appinfo>
                </xs:annotation>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
Bee
  • 109
  • 2
  • 11

2 Answers2

1

You need to use Value Mapping for each destination row.

enter image description here

The input as first parameter is the result of validate that the Row and Column of the origin are what are you expect (in this case, r = 1 and c.r = A1).

Second parameter is the value that you want to assign.

felixmondelo
  • 1,324
  • 1
  • 10
  • 19
  • Thanks for the reply. I dont think I explained the issue well enough. I can't evaluate c.r as a whole because the numbers(s) are irrelevant. I need to somehow extract the letters and evaluate those. Problem is, if the excel doc has thousands of records, the c.r ends up looking something like: – Bee Jan 18 '18 at 15:17
  • But on the equal functoid you can add whatever you want, only needs to now what is the column that you want to map and use the equal and the Value Mapping. – felixmondelo Jan 18 '18 at 16:11
  • Sorry, I don't think we're on the same page. So I think what you are suggesting is an equals functoid that evaluates c.r = 'BA2345'. What I need is something like c.r 'BA'.This feels like something I would normally do in .NET with regex, returning all letters left of the first number. – Bee Jan 18 '18 at 17:17
  • For that you can use String Find functoid or Scripting functoid with custom C# code or even invoke a method of a C# class. – felixmondelo Jan 18 '18 at 18:27
1

I think you're overthinking this a bit.

Link <row> to <Root_Child1> through a Looping Functoid. That will loop on the row, the value of doesn't matter...unless you really need it to.

Then, connect <t> through an Index Functoid to select the n-th <c> from where to get the value.

Johns-305
  • 10,908
  • 12
  • 21