I need to create an array with all the subject values in this XML file. Extracting the ISIN list (the value of the first property in each subject) seems to work fine, but extracting the subject values does not.

I would like to end up with an array looking something like this:

$Companys = array ( 0  => array ( "isin" => "DK0010247014","company" => "AAB"),
                    1  => array ( "isin" => "DK0015250344","company" => "ALM BRAND"),
                    2  => array ( "isin" => "DK0015998017","company" => "BAVARIAN NORDI"),
                    3  => array ( "isin" => "DK0010259027","company" => "DFDS"),
                    4  => array ( "isin" => "DK0010234467","company" => "FLSMIDTH & CO"),
                );

This is an example of one of the files I am trying to parse:

<doc>
    <id>123456</id>
    <version>4.0</version>
    <consnr>7861</consnr>
    <doctype>10</doctype>
    <dest>99</dest>
    <created>2013-05-15 14:18:16</created>
    <source>Direkt-DK</source>
    <language>DA</language>
    <texttype>This is a type</texttype>
    <premium>False</premium>
    <header>This is a header</header>
    <text>
        <para format="Text">This is a paragraph</para>
        <para format="Text">This is a paragraph</para>
        <para format="Text">This is a paragraph</para>
        <para format="Text">This is a paragraph</para>
        <para format="Text"/>
        <para format="Text">This is a paragraph</para>
        <para format="Byline"/>
        <para format="Byline">contents og the by line</para>
        <para format="Byline"/>
        <para format="Byline"/>
    </text>
    <subjects>
        <subject value="AAB" weight="Main">
            <property value="DK0010247014" type2="isin" type1="identificator"/>
            <property value="CSE:AAB" type2="ticker" type1="identificator"/>
            <property type1="sector" type2="GICS" type3="1" value="25"/>
            <property type1="sector" type2="GICS" type3="2" value="2530"/>
            <property type1="sector" type2="GICS" type3="3" value="253010"/>
            <property type1="sector" type2="GICS" type3="4" value="25301030"/>
        </subject>
        <subject value="ALM BRAND" weight="Main">
            <property value="DK0015250344" type2="isin" type1="identificator"/>
            <property value="CSE:ALMB" type2="ticker" type1="identificator"/>
            <property type1="sector" type2="GICS" type3="1" value="40"/>
            <property type1="sector" type2="GICS" type3="2" value="4030"/>
            <property type1="sector" type2="GICS" type3="3" value="403010"/>
            <property type1="sector" type2="GICS" type3="4" value="40301040"/>
        </subject>
        <subject value="BAVARIAN NORDI" weight="Main">
            <property value="DK0015998017" type2="isin" type1="identificator"/>
            <property value="CSE:BAVA" type2="ticker" type1="identificator"/>
            <property type1="sector" type2="GICS" type3="1" value="35"/>
            <property type1="sector" type2="GICS" type3="2" value="3520"/>
            <property type1="sector" type2="GICS" type3="3" value="352010"/>
            <property type1="sector" type2="GICS" type3="4" value="35201010"/>
        </subject>
        <subject value="DFDS" weight="Main">
            <property value="DK0010259027" type2="isin" type1="identificator"/>
            <property value="CSE:DFDS" type2="ticker" type1="identificator"/>
            <property type1="sector" type2="GICS" type3="1" value="20"/>
            <property type1="sector" type2="GICS" type3="2" value="2030"/>
            <property type1="sector" type2="GICS" type3="3" value="203030"/>
            <property type1="sector" type2="GICS" type3="4" value="20303010"/>
        </subject>
        <subject value="FLSMIDTH & CO" weight="Main">
            <property value="DK0010234467" type2="isin" type1="identificator"/>
            <property value="CSE:FLS" type2="ticker" type1="identificator"/>
            <property type1="sector" type2="GICS" type3="1" value="20"/>
            <property type1="sector" type2="GICS" type3="2" value="2010"/>
            <property type1="sector" type2="GICS" type3="3" value="201030"/>
            <property type1="sector" type2="GICS" type3="4" value="20103010"/>
        </subject>
    </subjects>
</doc>

Script:

<?
    foreach($xmlObj->subjects->subject as $b ){
        $isin = $b->property;
        $company = $b->attributes();
        #$company = $b->attributes()->value;
        If($isin && $isinlist == 'null') $isinlist = $isin['value'];
        ElseIf ($isin && $isinlist) $isinlist .= ','.$isin['value'];
        If($company && $companylist == 'null') $companylist = $company['value'];
        ElseIf ($company && $companylist) $companylist .= ','.$company['value'];
        var_dump($company->value[0]);
    }
?>
dhavald
KongUnold
  • Please reduce your problem to the *really* minimum needed to demonstrate the issue. To trigger that error you don't need all that XML nor all that PHP code. Keep it compact when you create a question here on site. That way you will probably already find the cause or a solution (but you can still ask about what you do not understand (hint)) and also you will get better answers. – hakre May 24 '13 at 17:02
  • I need the values in each subject, like "FLSMIDTH & CO", to be parsed as a comma-separated string of values. I will edit my question to make it more specific. – KongUnold Jun 04 '13 at 14:00

1 Answer

The main problem is finding a child element by the value of one of its attributes. Because each subject has several children with the same element name (property), the name alone cannot tell them apart.

In your concrete example, you want the property child whose attribute is type2="isin".

This is possible either with XPath (this website already has a lot of Q&A material about that, for example "SimpleXML: Selecting Elements Which Have A Certain Attribute Value") or by extending SimpleXMLElement with a method that does just that:

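class MyElement extends SimpleXMLElement
{
    /**
     * Return the first child element whose attribute $name equals $value,
     * or null if no child matches.
     */
    public function getChildByAttributeValue($name, $value) {
        // Iterating $this walks over the direct child elements.
        foreach($this as $child)
        {
            if ($value === (string) $child[$name]) {
                return $child;
            }
        }
        return null;
    }
}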

You can then tell SimpleXML to create MyElement objects instead of SimpleXMLElement objects:

$xml = simplexml_load_string($buffer, 'MyElement');
                                      ###########

and just map your values to an array:

$map = function(MyElement $subject) {
    return [
        (string) $subject['value'],
        (string) $subject->getChildByAttributeValue('type2', 'isin')['value'],
    ];
};

print_r(array_map($map, $xml->xpath('//subject')));

Given that $buffer contains the XML from your question (with the encoding error removed, presumably the unescaped & in "FLSMIDTH & CO", which has to be written as &amp;), this creates the following output:

Array
(
    [0] => Array
        (
            [0] => AAB
            [1] => DK0010247014
        )

    [1] => Array
        (
            [0] => ALM BRAND
            [1] => DK0015250344
        )

    [2] => Array
        (
            [0] => BAVARIAN NORDI
            [1] => DK0015998017
        )

    [3] => Array
        (
            [0] => DFDS
            [1] => DK0010259027
        )

    [4] => Array
        (
            [0] => FLSMIDTH & CO
            [1] => DK0010234467
        )

)
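If you cannot get the feed fixed at the source, one way to work around the unescaped & is to escape it before parsing. The following is only a rough sketch under that assumption: $raw is a made-up fragment standing in for your file contents, and the regular expression is a heuristic that would also touch a bare & inside CDATA sections or comments, so check it against your real files first.

```php
<?php
// Hypothetical input: a fragment with an unescaped "&", as in the question's XML.
$raw = '<subjects><subject value="FLSMIDTH & CO" weight="Main"/></subjects>';

// Escape every "&" that does not already start an entity reference
// (named such as &amp;, decimal such as &#38;, or hexadecimal such as &#x26;).
$buffer = preg_replace(
    '/&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)/',
    '&amp;',
    $raw
);

// The repaired string is now well-formed and can be parsed.
$xml = simplexml_load_string($buffer);

// SimpleXML decodes &amp; back to "&" when the attribute is read.
echo $xml->subject['value'], "\n";
```

The negative lookahead leaves entity references that are already correct alone, so running the replacement twice does not produce &amp;amp;.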

The full code example (Online Demo):

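class MyElement extends SimpleXMLElement
{
    /**
     * Return the first child element whose attribute $name equals $value,
     * or null if no child matches.
     */
    public function getChildByAttributeValue($name, $value) {
        // Iterating $this walks over the direct child elements.
        foreach($this as $child)
        {
            if ($value === (string) $child[$name]) {
                return $child;
            }
        }
        return null;
    }
}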

$xml = simplexml_load_string($buffer, 'MyElement');

$map = function(MyElement $subject) {
    return [
        (string) $subject['value'],
        (string) $subject->getChildByAttributeValue('type2', 'isin')['value'],
    ];
};

print_r(array_map($map, $xml->xpath('//subject')));
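For comparison, the XPath route mentioned earlier might look roughly like the sketch below. It uses a plain SimpleXMLElement, builds the "isin"/"company" shape from the question, and then joins the values into the comma-separated strings asked for in the comments. The shortened $buffer and the variable name $companies are my own choices for illustration, not something taken from your files.

```php
<?php
// Shortened stand-in for the XML in the question (two subjects only).
$buffer = <<<XML
<doc>
    <subjects>
        <subject value="AAB" weight="Main">
            <property value="DK0010247014" type2="isin" type1="identificator"/>
            <property value="CSE:AAB" type2="ticker" type1="identificator"/>
        </subject>
        <subject value="ALM BRAND" weight="Main">
            <property value="DK0015250344" type2="isin" type1="identificator"/>
            <property value="CSE:ALMB" type2="ticker" type1="identificator"/>
        </subject>
    </subjects>
</doc>
XML;

$xml = simplexml_load_string($buffer);

$companies = [];
foreach ($xml->xpath('//subject') as $subject) {
    // Relative XPath: only <property> children of this subject with type2="isin".
    $isin = $subject->xpath('property[@type2="isin"]');
    $companies[] = [
        // Fall back to null when a subject has no ISIN property.
        'isin'    => $isin ? (string) $isin[0]['value'] : null,
        'company' => (string) $subject['value'],
    ];
}

// Comma-separated lists built from the collected rows.
$isinlist    = implode(',', array_column($companies, 'isin'));
$companylist = implode(',', array_column($companies, 'company'));

print_r($companies);
echo $isinlist, "\n", $companylist, "\n";
```

Collecting the rows first and joining them with implode() afterwards avoids the manual "is the list still empty?" checks in the loop.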
Community
hakre