-4

Possible Duplicate:
Best XML Parser for PHP
How to parse XML file using php

I could not parse this XML. The real XML is here:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<eqlist>
    <earhquake  name="2012.12.31 18:35:13"  lokasyon="CAMONU-AKHISAR (MANİSA)                           İlksel" lat="38.9572"   lng="27.8965"   mag="2.9" Depth="5.0" />
    <earhquake  name="2012.12.31 18:54:09"  lokasyon="VAN GÖLÜ                                          İlksel" lat="38.7273"   lng="43.1598"   mag="2.3" Depth="2.1" />
    <earhquake  name="2012.12.31 21:00:49"  lokasyon="KUCUKESENCE-ERENLER (SAKARYA)                     İlksel" lat="40.7347"   lng="30.4742"   mag="1.9" Depth="4.4" />
</eqlist>

How can I parse it? The problem comes from the first two characters of the XML file, even though the same file works fine in the remote site's Google Maps application. Look at this array:

[0] => ÿþ<?xml version="1.0" encoding="ISO-8859-1" ?>
Burhan Çetin
  • What have you tried? PHP has several extensions for XML parsing - SimpleXML, DOM, SAX... – Maxim Krizhanovsky Jan 01 '13 at 15:57
  • Didn't you dare to google for `php xml`? – moonwave99 Jan 01 '13 at 16:07
  • I think the answer may not be as simple as what you can find with a google for PHP XML. Go look at the original document. The author of this question just happened to choose lines that parse correctly. The entire document does not! – Ray Paseur Jan 01 '13 at 16:23
  • It's not just the XML tag - the entire document has invisible characters. You can detect this by using strlen() on any line. If you count the characters you see, the total will be a lot less than the value returned by strlen(). Recommend you contact the publisher and ask for information about the character encoding. Also, read this: http://www.joelonsoftware.com/articles/Unicode.html – Ray Paseur Jan 01 '13 at 17:41
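
The two characters "ÿþ" shown in the question are the bytes FF FE, which is the byte-order mark of UTF-16LE. That would also explain why strlen() reports more bytes than there are visible characters. A minimal sketch of how the leading bytes could be inspected follows; `detect_bom()` is a hypothetical helper, not a PHP built-in, and `$sample` is a stand-in for the first bytes of the remote file.

```php
<?php
// Hypothetical helper: name the encoding implied by a leading byte-order mark.
function detect_bom(string $bytes): ?string
{
    // Longer signatures come first: the UTF-32LE BOM begins with the UTF-16LE one.
    $signatures = [
        "\x00\x00\xFE\xFF" => 'UTF-32BE',
        "\xFF\xFE\x00\x00" => 'UTF-32LE',
        "\xEF\xBB\xBF"     => 'UTF-8',
        "\xFE\xFF"         => 'UTF-16BE',
        "\xFF\xFE"         => 'UTF-16LE',
    ];
    foreach ($signatures as $bom => $name) {
        if (strncmp($bytes, $bom, strlen($bom)) === 0) {
            return $name;
        }
    }
    return null; // no BOM found
}

// Stand-in for the start of the remote file: FF FE, then "<" and "?" as UTF-16LE.
$sample = "\xFF\xFE<\x00?\x00";

echo bin2hex(substr($sample, 0, 4)), PHP_EOL;   // hex dump of the first four bytes
echo detect_bom($sample) ?? 'no BOM', PHP_EOL;  // encoding name, or 'no BOM'
```

A BOM is only a hint. A file without one can still be UTF-16, so asking the publisher about the encoding remains worthwhile.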

2 Answers

3

You can use SimpleXML_Load_String.

<?php // RAY_temp_burhan.php
error_reporting(E_ALL);
echo '<pre>';

$xml = <<<ENDXML
<?xml version="1.0" encoding="ISO-8859-1" ?>
<eqlist>
    <earhquake  name="2012.12.31 18:35:13"  lokasyon="CAMONU-AKHISAR (MANISA)                           Ilksel" lat="38.9572"   lng="27.8965"   mag="2.9" Depth="5.0" />
    <earhquake  name="2012.12.31 18:54:09"  lokasyon="VAN GÖLÜ                                          Ilksel" lat="38.7273"   lng="43.1598"   mag="2.3" Depth="2.1" />
    <earhquake  name="2012.12.31 21:00:49"  lokasyon="KUCUKESENCE-ERENLER (SAKARYA)                     Ilksel" lat="40.7347"   lng="30.4742"   mag="1.9" Depth="4.4" />
</eqlist>
ENDXML;

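// CONVERT TO AN OBJECT (RETURNS FALSE IF THE XML CANNOT BE PARSED)
$obj = SimpleXML_Load_String($xml);
if ($obj === false)
{
    die('COULD NOT PARSE THE XML');
}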

// PARSE OUT SOME ATTRIBUTES
foreach ($obj as $quake)
{
    // ATTRIBUTE NAMES ARE CASE-SENSITIVE
    $loc = $quake->attributes()->lokasyon;
    $dep = $quake->attributes()->Depth;
    echo PHP_EOL . "$loc $dep";
}
Ray Paseur
  • Now, having looked at the original document, I see that there may be some invalid characters in the XML. So it may be a character-set encoding issue. When I try to use the original directly, it fails, however when I copy it into my text editor, some of the characters are converted (and it works). You might want to try a different character encoding, such as utf-8 – Ray Paseur Jan 01 '13 at 16:14
  • I think you need to go back to the origination of this document. It appears to have something (maybe a byte-order mark) in the top two characters. And Chrome thinks it's in Turkish and wants to translate it ;-) – Ray Paseur Jan 01 '13 at 16:22
  • Thanks Ray, the original document has a BOM (byte-order mark). I am trying to remove it. – Burhan Çetin Jan 01 '13 at 16:52
  • I copied the document into TextPad and saved it. That munged some of the characters. Where did the document originate? – Ray Paseur Jan 01 '13 at 16:55
  • It's Turkish. The real problem is how to parse the original XML file from the remote site. – Burhan Çetin Jan 01 '13 at 17:04
  • The error comes from the first two characters, "ÿþ". Look at the array: [0] => ÿþ – Burhan Çetin Jan 01 '13 at 17:13
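
One possible way to handle such a file is sketched below. It assumes the remote document really is UTF-16LE with a leading FF FE mark, and that the mbstring extension is available. `utf16le_xml_to_utf8()` is a hypothetical helper. `$raw` is built locally as a stand-in; in practice it would hold the bytes fetched from the remote URL, for example with file_get_contents(). The sketch strips the BOM, re-encodes the text as UTF-8, and rewrites the encoding declaration so that it matches the new bytes.

```php
<?php
// Hypothetical helper: turn a UTF-16LE document (with or without BOM)
// into UTF-8 XML that simplexml_load_string() can read.
function utf16le_xml_to_utf8(string $raw): string
{
    // Drop the two BOM bytes (FF FE) if they are present.
    if (strncmp($raw, "\xFF\xFE", 2) === 0) {
        $raw = substr($raw, 2);
    }
    // Re-encode the text itself (assumption: the source is UTF-16LE).
    $utf8 = mb_convert_encoding($raw, 'UTF-8', 'UTF-16LE');
    // The declaration still claims ISO-8859-1; make it match the new bytes.
    return preg_replace('/^(<\?xml[^>]*encoding=")[^"]*(")/', '${1}UTF-8${2}', $utf8, 1);
}

// Stand-in for the remote file: one record, encoded as UTF-16LE with a BOM.
// The \u{...} escapes spell the Turkish letters in "VAN GOLU" with diaereses.
$sampleUtf8 = '<?xml version="1.0" encoding="ISO-8859-1" ?>' . "\n"
    . "<eqlist><earhquake name=\"2012.12.31 18:54:09\" lokasyon=\"VAN G\u{00D6}L\u{00DC}\""
    . ' mag="2.3" Depth="2.1" /></eqlist>';
$raw = "\xFF\xFE" . mb_convert_encoding($sampleUtf8, 'UTF-16LE', 'UTF-8');

$obj = simplexml_load_string(utf16le_xml_to_utf8($raw));
if ($obj === false) {
    exit('Could not parse the converted XML');
}

// Attribute names are case-sensitive, as in the answer above.
foreach ($obj->earhquake as $quake) {
    echo $quake['lokasyon'], ' ', $quake['mag'], PHP_EOL;
}
```

Rewriting the declaration matters here: if it kept saying ISO-8859-1 after the conversion, the parser would read the UTF-8 bytes as Latin-1 and the Turkish letters would come out garbled.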
1

An object-oriented way:

$xml = <<<ENDXML
<?xml version="1.0" encoding="ISO-8859-1" ?>
<eqlist>
    <earhquake  name="2012.12.31 18:35:13" 
                lokasyon="CAMONU-AKHISAR (MANISA) Ilksel"
                lat="38.9572"
                lng="27.8965"   mag="2.9" Depth="5.0" />
    <!-- Etc... -->
</eqlist>
ENDXML;

$dom = new DOMDocument();
$dom->loadXML($xml, LIBXML_NOBLANKS);

Then you can use the various methods defined by DOMDocument. One that is useful for validating the document against an XSD is schemaValidate().
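
As a rough illustration of those methods, the sketch below reads the attributes with getElementsByTagName() and getAttribute(). The array keys `location`, `magnitude` and `depth` are names chosen for this example only, and the two records are copied from the question.

```php
<?php
$xml = <<<ENDXML
<?xml version="1.0" encoding="ISO-8859-1" ?>
<eqlist>
    <earhquake name="2012.12.31 18:35:13" lokasyon="CAMONU-AKHISAR (MANISA) Ilksel" lat="38.9572" lng="27.8965" mag="2.9" Depth="5.0" />
    <earhquake name="2012.12.31 21:00:49" lokasyon="KUCUKESENCE-ERENLER (SAKARYA) Ilksel" lat="40.7347" lng="30.4742" mag="1.9" Depth="4.4" />
</eqlist>
ENDXML;

$dom = new DOMDocument();
// loadXML() returns false when the document cannot be parsed.
if (!$dom->loadXML($xml, LIBXML_NOBLANKS)) {
    exit('Could not parse the XML');
}

// Collect one plain array per element; attribute names are case-sensitive.
$quakes = [];
foreach ($dom->getElementsByTagName('earhquake') as $node) {
    $quakes[] = [
        'location'  => $node->getAttribute('lokasyon'),
        'magnitude' => (float) $node->getAttribute('mag'),
        'depth'     => (float) $node->getAttribute('Depth'),
    ];
}

print_r($quakes);
```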

Ed Heal
  • Try it on the original document linked with the question. The small sample works, but the original document does not. Warning: DOMDocument::loadXML() [domdocument.loadxml]: Start tag expected, '<' not found in Entity, line: 2 in /home/websitet/public_html/RAY_temp_burhan.php on line 65 – Ray Paseur Jan 01 '13 at 16:54
  • @RayPaseur - Works fine for me. As you noted, you need to sort out the encoding issues. – Ed Heal Jan 01 '13 at 17:31
  • Did you use the file at http://www.koeri.boun.edu.tr/sismo/zeqmap/xmlt/son24saat.xml ? How did you get it to work? – Ray Paseur Jan 01 '13 at 17:43
  • I haven't been able to fix it yet using http://www.koeri.boun.edu.tr/sismo/zeqmap/xmlt/son24saat.xml – Burhan Çetin Jan 01 '13 at 19:04