
I have text data in a nested format that resembles XML but uses braces {} instead of angle brackets <>. What is the best way to parse it? Should I convert it to XML first? I then want to work with the data, for example display it on a page or save it to a database.

This is what it looks like:

{myItem num="345"}
{subItem num="1"}
My Sub item Texts
{/subItem}
{subItem num="2"}
My Sub item Texts
{/subItem}
{/myItem}

As you can see, it looks like an XML format but it uses brackets.

adiga
Pektus
  • You could replace `{` with `<` and `}` with `>`. (If the text in between can have `}`, it should be escaped with a backslash) – adiga Feb 16 '21 at 14:10

2 Answers


Read all the data into a string, then use the replace function to replace every '{' with '<' and every '}' with '>':

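// Single quotes outside, so the double quotes around attribute values need no escaping.
var str = '{myItem num="345"}{subItem num="1"}My Sub item Texts{/subItem}' +
          '{subItem num="2"}My Sub item Texts{/subItem}{/myItem}';

// A string pattern replaces only the first match, so use a regex with the g flag.
var res = str.replace(/\{/g, "<");
var res2use = res.replace(/\}/g, ">");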
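
If the text between tags can itself contain braces or XML special characters, a plain swap is not enough. The following is only a sketch under stated assumptions: `braceToXml` is a hypothetical helper name, literal braces are assumed to be written as `\{` and `\}` (as suggested in the comment on the question), and `<`, `>` and `&` in the source are assumed to be plain text that must be escaped.

```javascript
// Hypothetical helper: converts the brace markup to XML text in one pass.
// Assumption: literal braces in content are escaped as \{ and \}.
function braceToXml(input) {
  return input.replace(/\\[{}]|[{}<>&]/g, function (m) {
    switch (m) {
      case "{": return "<";      // markup brace becomes an angle bracket
      case "}": return ">";
      case "<": return "&lt;";   // pre-existing XML specials are escaped
      case ">": return "&gt;";
      case "&": return "&amp;";
      default: return m.charAt(1); // "\{" or "\}" becomes a literal brace
    }
  });
}

var xml = braceToXml(
  '{myItem num="345"}{subItem num="1"}My Sub item Texts{/subItem}{/myItem}'
);
```

In a browser, the resulting string could then be handed to `new DOMParser().parseFromString(xml, "application/xml")` to obtain a document you can query. Doing everything in a single pass avoids escaping the angle brackets that the conversion itself just produced.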
Michiel

Unless this "structured" language has a specification, it is not a formal language, and the parsing of it may tend more toward Natural Language Processing (NLP) than markup or programming language parsing due to the lack of a fixed grammar to follow.

Lexically replacing { and } with < and > and calling it XML presumes much, including:

  • Do { and } only appear as markup?
  • Are all other XML well-formedness rules followed?

As a very rough first cut, you might apply the above replacements and then follow the guidance provided by "How to parse invalid (bad / not well-formed) XML?"
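
One of the presumptions listed above, that opening and closing tags nest properly, can be tested before any conversion. This is a rough sketch rather than a real validator: `checkNesting` is a hypothetical name, tag names are assumed to start with a letter or underscore, there are assumed to be no self-closing tags, and any brace that does not look like a tag is simply ignored.

```javascript
// Hypothetical checker: reports whether {tag}...{/tag} pairs nest properly.
// Assumptions: no self-closing tags; braces that do not form a tag are skipped.
function checkNesting(input) {
  var tagPattern = /\{(\/?)([A-Za-z_][\w.-]*)([^{}]*)\}/g;
  var stack = [];
  var match;
  while ((match = tagPattern.exec(input)) !== null) {
    var isClosing = match[1] === "/";
    var name = match[2];
    if (!isClosing) {
      stack.push(name); // remember the open tag
    } else if (stack.pop() !== name) {
      // closing tag does not match the most recently opened tag
      return { ok: false, error: "Unexpected {/" + name + "} at index " + match.index };
    }
  }
  if (stack.length > 0) {
    return { ok: false, error: "Unclosed {" + stack[stack.length - 1] + "}" };
  }
  return { ok: true, error: null };
}

var result = checkNesting(
  '{myItem num="345"}{subItem num="1"}My Sub item Texts{/subItem}{/myItem}'
);
```

If `result.ok` is false, the input needs repair (or a more tolerant parser) before it can be treated as XML.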

kjhughes