What I want to do
I have a folder containing PDFs, pictures, and other files, plus one XML file.
That XML file holds metadata for each of the other files.
I want to extract that data from the XML and store it in a C# class so I can use it later.
What I already did
I looked for a way to parse the file using LINQ, but I couldn't get it to work the way I want.
I want it to work like this:
I have a list of files stored in my application. I want to loop over each file and get the data for that file from the XML.
What I have
The XML file looks like this:
<?xml version='1.0' encoding='ISO-8859-1' ?>
<FOLDERS Name="XXXXXXX" >
<FOLDER Date="12/15/2015 15:25:04" ByUser="" Name="some folders name" Type="" MemberOf="">
<![CDATA[FOLDERID111]]>
<VISUALFOLDER Date="02/16/2016 14:25:00" ByUser="" Name="some folders name" Type="" StartView="UNKNOWN" ScreenOffset="0"/>
<TABSHEET Date="02/16/2016 14:25:00" Name="Fields" Type="IdxFields">
<![CDATA[TABSHEETID521]]>
<VISUALTABSHEET Date="02/16/2016 14:25:00" Name="Fields" Type="IdxFields"/>
<INDEXFIELD Date="02/16/2016 14:25:00" Name="DocuName">
<![CDATA[Something thats not the documents name]]>
<VISUALINDEXFIELD Date="02/16/2016 14:25:00" Name="DocuName"/>
</INDEXFIELD>
<INDEXFIELD Date="02/16/2016 14:25:00" Name="DocuDate">
<![CDATA[09.12.2015]]>
<VISUALINDEXFIELD Date="02/16/2016 14:25:00" Name="DocuDate"/>
</INDEXFIELD>
<INDEXFIELD Date="02/16/2016 14:25:00" Name="Object">
<![CDATA[OBJECT1]]>
<VISUALINDEXFIELD Date="02/16/2016 14:25:00" Name="Object"/>
</INDEXFIELD>
<INDEXFIELD Date="02/16/2016 14:25:00" Name="Tag">
<![CDATA[LETTER]]>
<VISUALINDEXFIELD Date="02/16/2016 14:25:00" Name="Tag"/>
</INDEXFIELD>
<INDEXFIELD Date="02/16/2016 14:25:00" Name="User">
<![CDATA[USER1]]>
<VISUALINDEXFIELD Date="02/16/2016 14:25:00" Name="User"/>
</INDEXFIELD>
<INDEXFIELD Date="02/16/2016 14:25:00" Name="Note">
<VISUALINDEXFIELD Date="02/16/2016 14:25:00" Name="Note"/>
</INDEXFIELD>
<INDEXFIELD Date="02/16/2016 14:25:00" Name="Barcode">
<VISUALINDEXFIELD Date="02/16/2016 14:25:00" Name="Barcode"/>
</INDEXFIELD>
</TABSHEET>
<TABSHEET Date="02/16/2016 14:25:00" Name="Documents" Type="Documents" Data="" SeqNo="0" Title="" Password="">
<![CDATA[TABSHEETID522]]>
<VISUALTABSHEET Date="02/16/2016 14:25:00" Name="Documents" Type="Documents"/>
<DOCUMENT Date="02/16/2016 14:25:00" Name="Document" Type="" Data="" FileName="C:\ProgramData\Import\file1.pdf" FileOffset="5712054" FileSize="128509" BinaryType="PDF">
<VISUALDOCUMENT Date="02/16/2016 14:25:00" Name="Document" Type="" Height="148" Width="105"/>
</DOCUMENT>
<DOCUMENT Date="02/16/2016 14:25:00" Name="Document" Type="" Data="" FileName="C:\ProgramData\Import\file2.pdf" FileOffset="5840563" FileSize="129847" BinaryType="PDF">
<VISUALDOCUMENT Date="02/16/2016 14:25:00" Name="Document" Type="" Height="148" Width="105"/>
</DOCUMENT>
</TABSHEET>
</FOLDER>
<FOLDER Date="12/30/2015 15:25:04" ByUser="" Name="some other folders name" Type="" MemberOf="">
<![CDATA[FOLDERID111]]>
<VISUALFOLDER Date="02/16/2016 14:25:00" ByUser="" Name="some other folders name" Type="" StartView="UNKNOWN" ScreenOffset="0"/>
<TABSHEET Date="02/16/2016 14:25:00" Name="Fields" Type="IdxFields">
<![CDATA[TABSHEETID521]]>
<VISUALTABSHEET Date="02/16/2016 14:25:00" Name="Fields" Type="IdxFields"/>
<INDEXFIELD Date="02/16/2016 14:25:00" Name="DocuName">
<![CDATA[Something thats not the documents name]]>
<VISUALINDEXFIELD Date="02/16/2016 14:25:00" Name="DocuName"/>
</INDEXFIELD>
<INDEXFIELD Date="02/16/2016 14:25:00" Name="DocuDate">
<![CDATA[09.12.2015]]>
<VISUALINDEXFIELD Date="02/16/2016 14:25:00" Name="DocuDate"/>
</INDEXFIELD>
<INDEXFIELD Date="02/16/2016 14:25:00" Name="Object">
<![CDATA[OBJECT1]]>
<VISUALINDEXFIELD Date="02/16/2016 14:25:00" Name="Object"/>
</INDEXFIELD>
<INDEXFIELD Date="02/16/2016 14:25:00" Name="Tag">
<![CDATA[LETTER]]>
<VISUALINDEXFIELD Date="02/16/2016 14:25:00" Name="Tag"/>
</INDEXFIELD>
<INDEXFIELD Date="02/16/2016 14:25:00" Name="User">
<![CDATA[USER1]]>
<VISUALINDEXFIELD Date="02/16/2016 14:25:00" Name="User"/>
</INDEXFIELD>
<INDEXFIELD Date="02/16/2016 14:25:00" Name="Note">
<VISUALINDEXFIELD Date="02/16/2016 14:25:00" Name="Note"/>
</INDEXFIELD>
<INDEXFIELD Date="02/16/2016 14:25:00" Name="Barcode">
<VISUALINDEXFIELD Date="02/16/2016 14:25:00" Name="Barcode"/>
</INDEXFIELD>
</TABSHEET>
<TABSHEET Date="02/16/2016 14:25:00" Name="Documents" Type="Documents" Data="" SeqNo="0" Title="" Password="">
<![CDATA[TABSHEETID522]]>
<VISUALTABSHEET Date="02/16/2016 14:25:00" Name="Documents" Type="Documents"/>
<DOCUMENT Date="02/16/2016 14:25:00" Name="Document" Type="" Data="" FileName="C:\ProgramData\Import\file3.pdf" FileOffset="5712054" FileSize="128509" BinaryType="PDF">
<VISUALDOCUMENT Date="02/16/2016 14:25:00" Name="Document" Type="" Height="148" Width="105"/>
</DOCUMENT>
</TABSHEET>
</FOLDER>
</FOLDERS>
The XML is generated by another application.
Each FOLDER has two TABSHEET elements: one contains the data (identifiable by the "Name" attribute) and the other contains the file names.
The data is stored in a CDATA block. Some fields have data, some do not. Not every document has a "Barcode".
My Question
What does a LINQ query look like that does what I want?
Update 1
OK, I fixed my query so that it almost does what I want:
var test1 = xdoc
    .Element("FOLDERS")
    .Elements("FOLDER")
    // Keep only the folders that list the current file as a DOCUMENT.
    .Where(xml => xml
        .Elements("TABSHEET")
        .Elements("DOCUMENT")
        .Select(x => x.Attribute("FileName").Value)
        .Contains(file.FilePath)
    )
    // For each matching folder, pick the values of the three fields of interest.
    .Select(xml => xml
        .Elements("TABSHEET")
        .Elements("INDEXFIELD")
        .Where(x =>
            x.Attribute("Name").Value == "DocuName" ||
            x.Attribute("Name").Value == "Note" ||
            x.Attribute("Name").Value == "User")
        .Select(x => x.Value)
    );
The only problem now is how to differentiate the results.
What I mean is this: the query returns an IEnumerable<IEnumerable<string>> containing three values for each file.
But because it is just a sequence of strings, I can't tell whether a given string is the "DocuName", the "Note", or the "User".
Is there a way to get a Dictionary with the right keys out of this query?
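For reference, here is a minimal sketch of the kind of result I am after, not a confirmed solution. The names FolderMetadata and GetIndexFields are hypothetical. It assumes that each INDEXFIELD "Name" is unique within a FOLDER (ToDictionary throws on duplicate keys), that file paths can be compared case-insensitively, and that a C# 6 or later compiler is used.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, for illustration only.
public static class FolderMetadata
{
    // Returns the index fields (Name attribute -> CDATA text) of the first FOLDER
    // that lists filePath as a DOCUMENT, or null when no folder matches.
    public static Dictionary<string, string> GetIndexFields(XDocument xdoc, string filePath)
    {
        var folder = xdoc.Root
            .Elements("FOLDER")
            .FirstOrDefault(f => f
                .Elements("TABSHEET")
                .Elements("DOCUMENT")
                // Assumption: Windows-style paths, so case is ignored.
                .Any(d => string.Equals((string)d.Attribute("FileName"), filePath,
                                        StringComparison.OrdinalIgnoreCase)));

        if (folder == null)
            return null;

        return folder
            .Elements("TABSHEET")
            .Elements("INDEXFIELD")
            .ToDictionary(
                field => (string)field.Attribute("Name"),
                // Read only the CDATA node; fields without one (e.g. "Note") yield "".
                field => field.Nodes().OfType<XCData>().FirstOrDefault()?.Value ?? "");
    }
}
```

Inside my loop I would then call FolderMetadata.GetIndexFields(xdoc, file.FilePath) and read fields["DocuName"], fields["Note"], or fields["User"] by key instead of by position.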