I have a large HTML file, containing a lot of content. I want to get a JavaScript variable, named 'a' for example, from the whole file.

Example (most of the actual content removed):

<html>
    <head>
        <script>
            var a = [{'a': 1, 'b': 2}];
        </script>
    </head>
    <body>
        ....
    </body>
</html>

The expected result from the above is:

[{'a': 1, 'b': 2}]
Peter Mortensen
Novak

1 Answer

// Capture everything between "var a = " and the ";" that ends the line.
if (preg_match('#var a = (.*?);\s*$#m', $html, $matches)) {
    echo $matches[1];
}

Explanation:

  • The regex looks for any line containing var a =
  • It then captures everything up to a ; that is followed by any amount of whitespace \s* and the end of the line $
  • The m modifier makes $ match at the end of every line. Without it, $ would only match at the end of the whole string, which would be of little use here

The \s* is there only in case there is trailing whitespace after the definition (e.g. through human error). If you're sure that won't happen, you can remove it.
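To make the behaviour concrete, here is a minimal, self-contained sketch of the same pattern applied to the HTML from the question. The helper name extractVarA() and the sample $html string are only illustrative assumptions, not part of PHP or of the original page.

```php
<?php
// Hypothetical sample input, modelled on the HTML in the question.
$html = <<<'HTML'
<html>
    <head>
        <script>
            var a = [{'a': 1, 'b': 2}];
        </script>
    </head>
    <body>
    </body>
</html>
HTML;

// extractVarA() is an illustrative helper name, not a built-in function.
function extractVarA(string $html): ?string
{
    // m: let $ match at the end of every line, not only at the end of the string.
    if (preg_match('#var a = (.*?);\s*$#m', $html, $matches)) {
        return $matches[1];
    }
    return null; // no "var a = ...;" line was found
}

echo extractVarA($html), "\n";
```

Returning null when nothing matches avoids reading $matches[1] when it was never set. Because the ; must be followed only by whitespace up to the end of the line, a ; in the middle of the value does not end the capture early.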

Note that this doesn't replace a full-blown parser. You will need to make modifications if a is defined over more than one line, if a is defined more than once (think about scope, you can have var a on a global scope, then var a within a function), etc.
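As a rough sketch of one such modification, adding the s modifier lets . match newlines, and preg_match_all collects every definition instead of only the first. The helper name extractAllVarA() and the sample $js string are assumptions made up for this illustration. It still stops at the first ; that ends a line, so it is not a substitute for a parser either.

```php
<?php
// Hypothetical input: "a" is defined twice, once across several lines.
$js = <<<'JS'
var a = 1;
function f() {
    var a = [
        {'a': 1, 'b': 2},
        {'a': 3, 'b': 4}
    ];
}
JS;

// extractAllVarA() is an illustrative helper name, not a built-in function.
function extractAllVarA(string $source): array
{
    // m: $ matches at the end of every line.
    // s: . also matches newlines, so a value may span several lines.
    preg_match_all('#var a = (.*?);\s*$#ms', $source, $matches);
    return $matches[1]; // one entry per definition found, in document order
}

foreach (extractAllVarA($js) as $value) {
    echo $value, "\n---\n";
}
```

It is then up to you to decide which of the returned definitions is the one you want, since the regex knows nothing about scope.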

Jay
  • I found a problem. If for example, one of the values contains '&', the match stops there. – Novak Jul 08 '12 at 06:23
  • `var a = "may also break with a semi-colon ; in the var's value";` - unlikely, but just adding as a note. – Fabrício Matté Jul 08 '12 at 09:59
  • @FabrícioMatté That works fine (notice the condition on the *end of the line*) - I just tested it. – Jay Jul 08 '12 at 10:01
  • Absolutely true, I stand corrected. +1 (it would also match the apostrophes in this case but as OP is looking for an array of objects instead of strings that shouldn't matter). – Fabrício Matté Jul 08 '12 at 10:06