-3

I am querying the Wiktionary API and need to match "===Noun===" together with the text that follows it, up to where "====Translations====" starts. The actual JSON object is much bigger, and I need to match several times, including "===Verb===" etc. On Regexr.com I managed to get it matching, but not in my JavaScript:

    var regex = /(===Verb===|===Noun===|===Adjective===|===Adverb===).*?====/g;
    console.log(jsondata.match(regex));
    console.log(regex.test(jsondata));

Any help is much appreciated!

==English== ===Pronunciation=== * , * , * * * * ===Noun=== # [[feces|Feces]]. ====Synonyms==== * [[BM]] * [[doo-doo]] * [[poo]] * [[poop]] ===Interjection=== # #* '''1995''', Phil Farrand, ''The Nitpicker's Guide for Next Generation Trekkers: Volume 2'' #*: (Ever feel like you've just entered... The Twilight Zone? '''Doo''', doo, doo, doo, doo, doo....) #* '''2006''', Steve Taylor, ''A to X of Alternative Music'' (page 272) #*: the bloke who sang about coloured girls going ''''doo''' de doo de doo doo d'de doo de doo de doo' had once had this thing with the guy who produced the debut albums by the Stooges and Patti Smith. ====Related terms==== * [[doo-wop]] ---- ==Gooniyandi== ===Noun=== # [[cave]] ---- ==Manx== ===Etymology=== From , from , from . ===Adjective=== # [[black]] # [[inky]] ====Synonyms==== * ====Derived terms==== * ===Noun=== # [[ink]] ====Derived terms==== ===Verb=== # to [[ink]] ===Mutation=== [[Category:gv:Colors]] ---- [[Category:Navajo terms with audio links]] ==Navajo== ===Pronunciation=== * ===Particle=== # Part of the [[negative]] correlative: #: '''''doo''' ... da'': #:: # With a nominalizer, forms a negative noun phrase: #: #: #: # Pairing '''doo''' with a verb + [[-góó]] forms a negative conditional: #: ====Derived terms==== ===Pronunciation=== * ===Verb=== # ''it will be'' (abbreviated form of [[dooleeł]]) # paired with [[ńtʼééʼ]], it forms a conditional: #: #: ====See also==== * * ---- ==Portuguese== ===Verb=== # # ---- ==Rohingya== ===Noun=== # [[knife]] ---- ==Scots== ===Etymology=== From (compare woman's [[given name]] ); akin to Old High German , Icelandic , ''[[Dúfa]]'' "[[Dove]]" (woman's [[first name]]), Swedish , Danish and Norwegian . ===Pronunciation=== * ===Noun=== # [[dove]] (bird of the pigeon family, [[Columbidae]]) ====Derived terms==== * [[King of the Doos]] [[es:doo]] [[fr:doo]] [[lt:doo]] [[mg:doo]] [[pl:doo]] [[ru:doo]] [[fi:doo]] [[tr:doo]]

Lukars
  • How exactly is this not working? – Marc B May 28 '15 at 15:13
  • You could at least try to format the question somewhat so people don't get dizzy looking at it. – Craicerjack May 28 '15 at 15:16
  • Better use some XML parser, see http://stackoverflow.com/q/7949752/1682509 – Reeno May 28 '15 at 15:17
  • 1) Why are you saying "JSON"? The posted content seems to be XML. 2) http://stackoverflow.com/a/1732454/1336841 – Forketyfork May 28 '15 at 15:17
  • @MarcB It is not matching anything. If I change my regex to var regex = /(===Verb===|===Noun===|===Adjective===|===Adverb===)(.*?)/g; it matches "===Verb===" but nothing after it. The API returns the above wrapped in a JSON object... – Lukars May 28 '15 at 15:26
  • When I parse the XML, the content will still be MediaWiki markup, so it will still contain "===Noun===" and the problem is the same... – Lukars May 28 '15 at 15:30

3 Answers

0

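You were not capturing the correct part of the string. Use a non-capturing group for the heading, a capturing group for the text after it, and end the pattern at "====Translations====":

    /(?:===Verb===|===Noun===|===Adjective===|===Adverb===)(.*?)====Translations====/g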

https://regex101.com/r/bO3iY4/2
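
Keep in mind that with the g flag, String.prototype.match returns only the full matches, not the captured groups. Here is a minimal sketch of reading the captured text with exec() instead. The sample string and the variable names are made up for illustration, and the heading group is made capturing here so it can be read back. The last two lines also show why a lazy group at the very end of a pattern returns only the heading:

```javascript
// Shortened, made-up sample in the same wikitext style as the question.
var sample = "==English== ===Noun=== # [[ink]] ====Translations==== * x " +
             "===Verb=== # to [[ink]] ====Translations==== * y";

var regex = /(===Verb===|===Noun===|===Adjective===|===Adverb===)(.*?)====Translations====/g;

// exec() with the g flag advances regex.lastIndex on every call,
// so the loop visits each match in turn and exposes its groups.
var sections = [];
var m;
while ((m = regex.exec(sample)) !== null) {
  sections.push({ heading: m[1], body: m[2].trim() });
}
console.log(sections);

// A lazy group at the very end of a pattern matches as little as possible,
// which is the empty string, so only the heading is returned.
var lazyAtEnd = /(===Verb===|===Noun===)(.*?)/g;
console.log(sample.match(lazyAtEnd));
```

If the section you need does not always end with "====Translations====", the closing part of the pattern has to be adjusted accordingly.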

Simone
0

Your regex seems to work in this fiddle:

    var regex = /(===Verb===|===Noun===|===Adjective===|===Adverb===).*?====/g;

https://jsfiddle.net/2n3756aw/1/
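
One possible reason it matches in an online tester but not in your script: in JavaScript the dot does not match line terminators. If the real API response contains line breaks between the sections (an assumption on my part; the sample in the question may have had them collapsed into spaces when it was pasted), then .*? stops at the first newline and the match fails. A sketch with a made-up multi-line sample, using [\s\S] in place of the dot:

```javascript
// Made-up sample with real line breaks between heading and content.
var multiline = "===Noun===\n# [[ink]]\n\n====Derived terms====\n";

// Original pattern: "." cannot cross the "\n" after the heading.
var dotRegex = /(===Verb===|===Noun===|===Adjective===|===Adverb===).*?====/g;

// Same pattern, but [\s\S] matches any character, including newlines.
var anyCharRegex = /(===Verb===|===Noun===|===Adjective===|===Adverb===)[\s\S]*?====/g;

console.log(multiline.match(dotRegex));      // null
console.log(multiline.match(anyCharRegex));  // one match spanning the line breaks
```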

500KD
0

Make sure your JSON is correctly escaped.

In my example, I had to escape the apostrophes in ōʹvə-flō', ō'və-flōʹ, ōʹvər-flō' and ō'vər-flōʹ (giving ōʹvə-flō\', ō\'və-flōʹ, ōʹvər-flō\' and ō\'vər-flōʹ, respectively) so that the single-quoted string literal is valid:

var json = '<root><h level="2" i="1">==English==</h> <template lineStart="1"><title>wikipedia</title><part><name>dab</name>=<value>overflow</value></part><part><name>lang</name>=<value>en</value></part></template> <h level="3" i="2">===Etymology===</h> From <template><title>prefix</title><part><name index="1"/><value>over</value></part><part><name index="2"/><value>flow</value></part><part><name>lang</name>=<value>en</value></part></template>. Literally corresponds to <template><title>term</title><part><name index="1"/><value>superfluous</value></part><part><name>lang</name>=<value>en</value></part></template>, which is from Latin, rather than Germanic. <h level="3" i="3">===Pronunciation===</h> * <template><title>a</title><part><name index="1"/><value>RP</value></part></template> ** <template><title>sense</title><part><name index="1"/><value>noun</value></part></template> <template><title>enPR</title><part><name index="1"/><value>ōʹvə-flō\'</value></part></template>, <template><title>IPA</title><part><name index="1"/><value>/ˈəʊvəˌfləʊ/</value></part><part><name>lang</name>=<value>en</value></part></template> ** <template><title>sense</title><part><name index="1"/><value>verb</value></part></template> <template><title>enPR</title><part><name index="1"/><value>ō\'və-flōʹ</value></part></template>, <template><title>IPA</title><part><name index="1"/><value>/ˌəʊvəˈfləʊ/</value></part><part><name>lang</name>=<value>en</value></part></template> * <template><title>a</title><part><name index="1"/><value>GenAm</value></part></template> ** <template><title>sense</title><part><name index="1"/><value>noun</value></part></template> <template><title>enPR</title><part><name index="1"/><value>ōʹvər-flō\'</value></part></template>, <template><title>IPA</title><part><name index="1"/><value>/ˈoʊvɚˌfloʊ/</value></part><part><name>lang</name>=<value>en</value></part></template> ** <template><title>sense</title><part><name index="1"/><value>verb</value></part></template> 
<template><title>enPR</title><part><name index="1"/><value>ō\'vər-flōʹ</value></part></template>, <template><title>IPA</title><part><name index="1"/><value>/ˌoʊvɚˈfloʊ/</value></part><part><name>lang</name>=<value>en</value></part></template> * <template><title>rhymes</title><part><name index="1"/><value>əʊ</value></part><part><name>lang</name>=<value>en</value></part></template> <h level="3" i="4">===Noun===</h> <template lineStart="1"><title>en-noun</title></template> # The [[spillage]] resultant from overflow; [[excess]]. # [[outlet|Outlet]] for escape of excess material. # <template><title>context</title><part><name index="1"/><value>computing</value></part><part><name>lang</name>=<value>en</value></part></template> The situation where a value exceeds the available [[numeric]] [[range]]. <h level="4" i="5">====Translations====</h> <template lineStart="1"><title>trans-top</title><part><name index="1"/><value>spillage</value>';

var regex = new RegExp(/(===Verb===|===Noun===|===Adjective===|===Adverb===).*?====/g);

document.write(json.match(regex));
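
The manual escaping is only needed because the text is pasted into a single-quoted string literal. If the response is parsed at run time, the apostrophes need no special treatment. A small sketch, assuming a made-up response shape (the text property is hypothetical and not the real Wiktionary API layout):

```javascript
// Build a stand-in for the raw response body; "text" is a hypothetical property name.
var rawResponse = JSON.stringify({
  text: "===Noun=== # ō'və-flōʹ [[excess]] ====Translations===="
});

// Parsing at run time handles the apostrophe without any manual escaping.
var data = JSON.parse(rawResponse);

var sectionRegex = /(===Verb===|===Noun===|===Adjective===|===Adverb===).*?====/g;
var found = data.text.match(sectionRegex);
console.log(found);
```
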
falsarella