0

If I have a string like

var a = '<?xml version="1.0" encoding="Windows-1251"?> .. encoding="utf-8"';

How can I extract, with a regexp, only the value of the first occurrence of the encoding attribute, the one inside the xml tag? The result should be

Windows-1251

kaytrance

1 Answer

2

If you really want to use a regex, as your question asks, you may use

var val = a.match(/<\?xml[^>]+\s+encoding="([^"]*)"/)[1];

Note that parsing the string as XML is usually the simplest solution, especially in a browser.

That is because, if you want to account for all cases, for example the possibility that encoding= appears at the end of another attribute's value, the regex starts to get nasty:

var val = a.match(/<\?xml(?:\s+[^>="]+="[^"]+")*\s+encoding="([^"]*)"/)[1];

Note that if you're not sure the attribute is present, you should check that the returned array isn't null before taking the element at index 1.
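
A minimal sketch of that null check, wrapped in a helper. The name `extractXmlEncoding` is hypothetical and not part of the original answer. The regex uses a non-capturing group for the attributes that precede `encoding`, so the encoding value is the only capture and sits at index 1.

```javascript
// Hypothetical helper (an illustration, not the answer's original code):
// returns the encoding declared in the XML declaration, or null if absent.
function extractXmlEncoding(text) {
  // (?: ... )* skips any attributes before "encoding" without capturing them,
  // so the encoding value is capture group 1.
  var re = /<\?xml(?:\s+[^>="]+="[^"]+")*\s+encoding="([^"]*)"/;
  var m = text.match(re);
  // String.prototype.match returns null when there is no match,
  // so check before reading index 1.
  return m === null ? null : m[1];
}

var a = '<?xml version="1.0" encoding="Windows-1251"?> .. encoding="utf-8"';
console.log(extractXmlEncoding(a));                       // Windows-1251
console.log(extractXmlEncoding('<?xml version="1.0"?>')); // null
```

Returning null rather than throwing leaves it to the caller to decide on a default encoding.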

Denys Séguret
  • 1
    Using regex for XML is strongly discouraged; see [here](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) – Oybek Mar 07 '14 at 12:45
  • Downvote for the straightforward solution. It will not work if I have other attributes with values before encoding; I would then have to use a different index into the array. – kaytrance Mar 07 '14 at 12:47
  • 3
    The downvote is not just because xml and regex are inherently incompatible. Your suggestion may match invalid values, e.g. `` – Oybek Mar 07 '14 at 12:47
  • 1
    @dystroy I'd rather stay away from it anyways – John Dvorak Mar 07 '14 at 12:48