I'm trying to edit an XML document so that it contains only the attributes I need. I've created an array of the attribute IDs I want to keep, but I'm not sure how to filter the document. Here's what I currently have:

var desiredIds = new[] { "fooo1","attribute2", "attribute3" };

var fullAttributeList = xml.Descendants("Value").Attributes("AttributeID");

//Exception list.... 
var rejectThis = rangeProducts.Descendants("Value").Where(y => desiredIds.Contains((string)y.Attribute("AttributeID")));

foreach (var item in rangeProducts.Descendants("Value").Except(rejectThis))
 {
   item.RemoveAttributes(); //nope.... 

   //What now????????
 }

Here's a sample XML document, followed by an example of what I'm trying to achieve.

<Product ID="Sample A" UserTypeID="TYPE_PRD_RANGE">
  <Values AttributeId = "AAAAAA">
    <Value AttributeId = "BBBBBB">Value1</Value>
    <Value AttributeId = "CCCCCC">Value2</Value>
    <Value AttributeId = "DDDDDD">Value3</Value>
    <Value AttributeId = "EEEEE">Value4</Value>
  </Values>
  <Product ID="Sample A_1" UserTypeID="SUB_RANGE">
    <Values AttributeId = "ZZZZZZ">
      <Value AttributeId = "YYYYYY">Value1</Value>
      <Value AttributeId = "CCCCCC">Value2</Value>
      <Value AttributeId = "DDDDDD">Value3</Value>
      <Value AttributeId = "BBBBBB">Value4</Value>
    </Values>
  </Product>
  <Product ID="Sample A_1_1" UserTypeID="ITEM">
    <Values AttributeId = "12345">
      <Value AttributeId = "N12345">Value1</Value>
      <Value AttributeId = "A12345">Value2</Value>
      <Value AttributeId = "C12345">Value3</Value>
      <Value AttributeId = "F12345">Value4</Value>
    </Values>
  </Product>    
</Product>

The XML is nested, and the nested nodes all share the same name, Product. I need some attributes from the first, second and third nodes, so a sample output would be:

<Product ID="Sample A" UserTypeID="TYPE_PRD_RANGE">
  <Value AttributeId = "DDDDDD">Value3</Value>
  <Value AttributeId = "EEEEE">Value4</Value>
  <Product ID="Sample A_1" UserTypeID="SUB_RANGE">
    <Value AttributeId = "BBBBBB">Value4</Value>    
  </Product>
  <Product ID="Sample A_1_1" UserTypeID="ITEM">
    <Value AttributeId = "F12345">Value4</Value>
  </Product>
</Product>
  • Can you post some sample XML, i.e. example input and desired output? – Dr Schizo Jul 06 '15 at 08:44
  • Do you only want to change a string or file on disk? If so you can just use a regex to find and replace all matches with an empty string. If you want to filter the parser for further operations that will not help. – Sebastian Schumann Jul 06 '15 at 08:53
  • @DrSchizo Updated to reflect what I'm trying to achieve. –  Jul 06 '15 at 09:00
  • Your changed XML is not valid. There is a closing tag without a corresponding opening tag. You changed the level of the `<Value>` tags; they're now direct children of `<Product>`. Correct? It still looks like a job for a regex if you only need to change a string or file. – Sebastian Schumann Jul 06 '15 at 09:05
  • @Verarind I made a mistake on the output. I've edited the output now. I don't see how regex is going to help me here as I have over 200 attributes that I need to remove. –  Jul 06 '15 at 09:13
  • Maybe something like this: `var newXml = Regex.Replace(xmlAsString, string.Format(@"(?sm)^\s*.*?", string.Join("|", desiredIds)); newXml = Regex.Replace(xmlAsString, string.Format(@"(?sm)^\s*(.*?)", "$1"), string.Empty);` This is not tested. – Sebastian Schumann Jul 06 '15 at 09:30
  • @PetSerAl. This is intended. It's a nested node where all attributes for `Sample A` are applicable to both `Sample A_1` and `Sample A_1_1` but the reverse isn't the case. In essence, I'm trying to create Tekla 3D models for the products in my company and need these attributes. `Sample A` refers to a range of products, `Sample A_1` refers to a sub-range of these products and `Sample A_1_1` is the product itself. –  Jul 06 '15 at 10:17
  • @PetSerAl, I have an array of `desiredIds`. What I've simply tried to do is create an exception List `var rejectThis = rangeProducts.Descendants("Value").Where(y => desiredIds.Contains((string)y.Attribute("AttributeID")));` which should then be preserved with the `Except` Method in the `foreach` loop. This is what I'm currently struggling with. –  Jul 06 '15 at 11:25
  • @Verarind Please [don't use](http://stackoverflow.com/a/1732454/5045688) regex to parse XML. I'm Russian hacker and I'm pwn your app. – Alexander Petrov Jul 06 '15 at 12:46
  • I don't want to **parse** XML using regex. I only want to search and replace using regex and that's what regexes are designed for. Yes you're right parsing of XML using regex will fail (maybe - I've never done it). – Sebastian Schumann Jul 06 '15 at 12:49

1 Answer

Try this

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

namespace ConsoleApplication34
{
    class Program
    {
        static void Main(string[] args)
        {
            // Sample XML from the question, built as a single string.
            string input =
              "<Product ID=\"Sample A\" UserTypeID=\"TYPE_PRD_RANGE\">" +
                "<Values AttributeId = \"AAAAAA\">" +
                  "<Value AttributeId = \"BBBBBB\">Value1</Value>" +
                  "<Value AttributeId = \"CCCCCC\">Value2</Value>" +
                  "<Value AttributeId = \"DDDDDD\">Value3</Value>" +
                  "<Value AttributeId = \"EEEEE\">Value4</Value>" +
                "</Values>" +
                "<Product ID=\"Sample A_1\" UserTypeID=\"SUB_RANGE\">" +
                  "<Values AttributeId = \"ZZZZZZ\">" +
                    "<Value AttributeId = \"YYYYYY\">Value1</Value>" +
                    "<Value AttributeId = \"CCCCCC\">Value2</Value>" +
                    "<Value AttributeId = \"DDDDDD\">Value3</Value>" +
                    "<Value AttributeId = \"BBBBBB\">Value4</Value>" +
                  "</Values>" +
                "</Product>" +
                "<Product ID=\"Sample A_1_1\" UserTypeID=\"ITEM\">" +
                  "<Values AttributeId = \"12345\">" +
                    "<Value AttributeId = \"N12345\">Value1</Value>" +
                    "<Value AttributeId = \"A12345\">Value2</Value>" +
                    "<Value AttributeId = \"C12345\">Value3</Value>" +
                    "<Value AttributeId = \"F12345\">Value4</Value>" +
                  "</Values>" +
                "</Product>" +
              "</Product>";

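            XDocument doc = XDocument.Parse(input);

            // Project every Product (the root and the nested ones) into an
            // anonymous object holding its ID, the ID of its Values wrapper,
            // and the ID and text of each Value.
            // This only reads the data; it does not modify the document.
            var results = doc.Descendants("Product").Select(x => new {
                id = x.Attribute("ID").Value,
                valuesId = x.Element("Values").Attribute("AttributeId").Value,
                values = x.Element("Values").Descendants("Value").Select(y => new {
                    valueId = y.Attribute("AttributeId").Value,
                    value = (string)y
                }).ToList()
            }).ToList();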
        }
    }
}
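
The query above reads the values out of the document but leaves the document itself unchanged. If the goal is to edit the XML in place, one possible approach is sketched below. It is a minimal sketch under stated assumptions, not the answer's original method: `ValueFilter` and `KeepOnly` are hypothetical names, the same list of IDs is applied to every `Product` (so it will not reproduce the per-level selection shown in the question's sample output), and the attribute is spelled `AttributeId` as in the sample XML rather than `AttributeID` as in the question's code, because XML names are case-sensitive.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration; not part of the original answer.
public static class ValueFilter
{
    // Keeps only the <Value> elements whose AttributeId is in desiredIds,
    // then lifts the survivors out of their <Values> wrapper so that they
    // become direct children of the enclosing <Product>.
    public static void KeepOnly(XElement root, IEnumerable<string> desiredIds)
    {
        var keep = new HashSet<string>(desiredIds);

        // Extensions.Remove() snapshots the matched elements before it
        // removes them, so the tree can be modified safely here.
        root.DescendantsAndSelf("Product")
            .Elements("Values")
            .Elements("Value")
            .Where(v => !keep.Contains((string)v.Attribute("AttributeId")))
            .Remove();

        // Replace each <Values> wrapper with the <Value> children it has left.
        // ToList() takes a snapshot because the loop edits the tree.
        foreach (var values in root.Descendants("Values").ToList())
        {
            values.ReplaceWith(values.Elements("Value"));
        }
    }
}
```

Assuming the `input` string from the program above, this could be called as `var root = XElement.Parse(input); ValueFilter.KeepOnly(root, new[] { "DDDDDD", "EEEEE" });`, after which `root.ToString()` gives the filtered XML. A `Product` whose `Values` contains none of the listed IDs ends up with no `Value` children at all.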
jdweng