I'm trying to read an XML data file into R and get a data frame. I'm using the 'XML' package and the command xmlToDataFrame("test.xml"). This gives me the following error: xmlParseCharRef: invalid xmlChar value 26.
From my research online, the problem is probably in the XML file itself. I've tried replacing all escaped characters, e.g. &amp; with &. I even replaced Ó with O (it shouldn't make a difference, but just to be sure). It didn't work. The XML data file has over 2 million rows, so it is impossible to go through it line by line.
Does anyone have any idea what other character could be causing the problem?
I should also mention that the encoding declaration in the file was <?xml version="1.0" encoding="UTF-8"?>, but I've also tried <?xml version="1.0" encoding="iso-8859-1"?> and <?xml version="1.0" encoding="ascii"?>. I have no idea what these mean, but people were suggesting them online. Any help would be greatly appreciated!
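Since the file is too large to inspect by hand, one way to narrow this down might be to scan it for numeric character references that XML 1.0 does not allow. The value 26 in the error message suggests a reference such as &#26; or &#x1A; (the SUB control character), but that is an assumption until it is confirmed against the actual file. Below is a minimal sketch in base R; find_bad_char_refs is a hypothetical helper name, not a function from the XML package, and the sample lines are made up for illustration.

```r
# Hypothetical helper (not part of the XML package): report numeric character
# references whose code point lies outside the XML 1.0 Char range.
find_bad_char_refs <- function(lines) {
  # Match decimal (&#26;) and hexadecimal (&#x1A;) character references.
  pattern <- "&#(x[0-9A-Fa-f]+|[0-9]+);"
  hits <- regmatches(lines, gregexpr(pattern, lines, perl = TRUE))
  line_no <- rep(seq_along(lines), lengths(hits))
  refs <- unlist(hits)
  if (length(refs) == 0) {
    return(data.frame(line = integer(0), ref = character(0),
                      stringsAsFactors = FALSE))
  }

  # Strip the "&#" prefix and ";" suffix, then convert to an integer code point.
  body <- sub("^&#(.*);$", "\\1", refs)
  is_hex <- startsWith(body, "x")
  code <- integer(length(body))
  code[is_hex] <- strtoi(substring(body[is_hex], 2), 16L)
  code[!is_hex] <- strtoi(body[!is_hex], 10L)

  # XML 1.0 allows: #x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF.
  # Values that strtoi cannot represent come back as NA and count as invalid.
  valid <- !is.na(code) & (
    code %in% c(9L, 10L, 13L) |
      (code >= 32L & code <= 55295L) |
      (code >= 57344L & code <= 65533L) |
      (code >= 65536L & code <= 1114111L)
  )

  data.frame(line = line_no[!valid], ref = refs[!valid],
             stringsAsFactors = FALSE)
}

# Made-up sample lines; only the references to code point 26 should be reported.
sample_lines <- c(
  '<a value="ok &amp; fine"/>',
  '<b value="bad&#26;char"/>',
  '<c value="tab&#9;and&#x1A;"/>'
)
bad <- find_bad_char_refs(sample_lines)
print(bad)

# For the real file, something like this could be tried:
# bad <- find_bad_char_refs(readLines("test.xml", warn = FALSE))
```

If the scan does turn up such references, removing or replacing them before calling xmlToDataFrame would be the next thing to try. Changing the encoding declaration is unlikely to help in that case, because a character reference names a code point directly, independent of the file's encoding.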
Example of the XML data:
<?xml version="1.0" encoding="UTF-8"?>
<data>
  <new_buildings>
    <new_building>
      <new_building_shipyard_name value="189 (189 COMPANY)"/>
      <new_building_bv_number value="29"/>
      <new_building_ship_type value="boat"/>
      <new_building_commercial_owner_name value="SHIPYARDS"/>
      <new_building_registered_owner_code value="18"/>
      <new_building_keel_laying_date value="2013-08-14"/>
      <new_building_confidentiality_indicator value="N"/>
    </new_building>
    <new_building>