6

I need to fetch some results from a webpage that uses JavaScript code to generate the part I am interested in, like the following:

eval(function(p,a,c,k,e,d){e=function(c){return c};if(!''.replace(/^/,String)){while(c--)d[c]=k[c]||c;k=[function(e){return d[e]}];e=function(){return'\\w+'};c=1;};while(c--)if(k[c])p=p.replace(new RegExp('\\b'+e(c)+'\\b','g'),k[c]);return p;}('5 11=17;5 12=["/3/2/1/0/13.4","/3/2/1/0/15.4","/3/2/1/0/14.4","/3/2/1/0/7.4","/3/2/1/0/6.4","/3/2/1/0/8.4","/3/2/1/0/10.4","/3/2/1/0/9.4","/3/2/1/0/23.4","/3/2/1/0/22.4","/3/2/1/0/24.4","/3/2/1/0/26.4","/3/2/1/0/25.4","/3/2/1/0/18.4","/3/2/1/0/16.4","/3/2/1/0/19.4","/3/2/1/0/21.4"];5 20=0;',10,27,'40769|54|Images|Files|png|var|imanhua_005_140430179|imanhua_004_140430179|imanhua_006_140430226|imanhua_008_140430242|imanhua_007_140430226|len|pic|imanhua_001_140429664|imanhua_003_140430117|imanhua_002_140430070|imanhua_015_140430414||imanhua_014_140430382|imanhua_016_140430414|sid|imanhua_017_140430429|imanhua_010_140430289|imanhua_009_140430242|imanhua_011_140430367|imanhua_013_140430382|imanhua_012_140430367'.split('|'),0,{}))

The result of eval() is what I need. I am writing a Python script; is there a library I can use to run this piece of JavaScript code and get its output?

Thanks

axel22
overboming

6 Answers

9

PyV8 is a set of Python bindings for the V8 JavaScript engine (the one used in Google Chrome).

Henrik Hansen
  • love to see so many choices, between spidermonkey and v8, it's a matter of preferring firefox or chrome, Thanks! – overboming May 02 '10 at 15:18
7

Use a spidermonkey binding

from spidermonkey import Runtime
rt = Runtime()
cx = rt.new_context()
result = cx.eval_script(whatyoupostedabove)
duncan
  • This project seems to have been dead for a long time, and it won't work on Snow Leopard – overboming May 04 '10 at 04:44
  • I use the ruby version on Linux, and it is also kind of dead. But it works. The biggest problem on ruby is not the bindings themselves but to get the right spidermonkey (did not work with 1.9 for example). – duncan May 04 '10 at 07:31
  • looks like there is a continuation at https://github.com/davisp/python-spidermonkey but it has also not been updated in two years at this point. – Jim Garrison Jun 17 '13 at 19:56
  • This library is deprecated now. – Shmalex Aug 07 '19 at 03:35
4

You can use PyQt with the WebKit module :) It has a JS engine and can evaluate JS within the context of an (X)HTML document.

Viet
4

I suppose you have solved the problem by now, but I wanted to share another (in my opinion much more viable) option. When you are interested in evaluating just one known JavaScript function, it may be easier to implement that function in Python than to pull in a huge tool built to parse and run all imaginable JavaScript in the world.

So I would suggest writing a Python version of the JavaScript unpacker function, which solves most of the problem. I did in fact do that, and here is an example. The int2base function is Alex Martelli's implementation, which can be found here.

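import re

def unpack(p, a, c, k, e=None, d=None):
    ''' unpack
    Unpacker for the popular Javascript compression algorithm.

    @param  p  template code
    @param  a  radix for variables in p
    @param  c  number of variables in p
    @param  k  list of c variable substitutions
    @param  e  not used
    @param  d  not used
    @return p  decompressed string
    '''
    # Paul Koppen, 2011
    # Walk the word list from the last index down to 0, like the JavaScript loop.
    # range() works on both Python 2 and 3 (the original used xrange).
    for i in range(c-1, -1, -1):
        if k[i]:
            # A function replacement keeps any backslashes in k[i] literal.
            p = re.sub('\\b' + int2base(i, a) + '\\b', lambda m: k[i], p)
    return p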

Finally you need to do a tiny bit of parsing to extract the four function arguments. Just for the sake of a simple illustration though, I use eval here to let Python do that for me.

s  = '''eval(function(p,a,c,k,e,d){e=function(c){return c};if(!''.replace(/^/,String)){while(c--)d[c]=k[c]||c;k=[function(e){return d[e]}];e=function(){return'\\w+'};c=1;};while(c--)if(k[c])p=p.replace(new RegExp('\\b'+e(c)+'\\b','g'),k[c]);return p;}('5 11=17;5 12=["/3/2/1/0/13.4","/3/2/1/0/15.4","/3/2/1/0/14.4","/3/2/1/0/7.4","/3/2/1/0/6.4","/3/2/1/0/8.4","/3/2/1/0/10.4","/3/2/1/0/9.4","/3/2/1/0/23.4","/3/2/1/0/22.4","/3/2/1/0/24.4","/3/2/1/0/26.4","/3/2/1/0/25.4","/3/2/1/0/18.4","/3/2/1/0/16.4","/3/2/1/0/19.4","/3/2/1/0/21.4"];5 20=0;',10,27,'40769|54|Images|Files|png|var|imanhua_005_140430179|imanhua_004_140430179|imanhua_006_140430226|imanhua_008_140430242|imanhua_007_140430226|len|pic|imanhua_001_140429664|imanhua_003_140430117|imanhua_002_140430070|imanhua_015_140430414||imanhua_014_140430382|imanhua_016_140430414|sid|imanhua_017_140430429|imanhua_010_140430289|imanhua_009_140430242|imanhua_011_140430367|imanhua_013_140430382|imanhua_012_140430367'.split('|'),0,{}))'''
js = eval('unpack' + s[s.find('}(')+1:-1])

Result:

'var len=17;var pic=["/Files/Images/54/40769/imanhua_001_140429664.png","/Files/Images/54/40769/imanhua_002_140430070.png","/Files/Images/54/40769/imanhua_003_140430117.png","/Files/Images/54/40769/imanhua_004_140430179.png","/Files/Images/54/40769/imanhua_005_140430179.png","/Files/Images/54/40769/imanhua_006_140430226.png","/Files/Images/54/40769/imanhua_007_140430226.png","/Files/Images/54/40769/imanhua_008_140430242.png","/Files/Images/54/40769/imanhua_009_140430242.png","/Files/Images/54/40769/imanhua_010_140430289.png","/Files/Images/54/40769/imanhua_011_140430367.png","/Files/Images/54/40769/imanhua_012_140430367.png","/Files/Images/54/40769/imanhua_013_140430382.png","/Files/Images/54/40769/imanhua_014_140430382.png","/Files/Images/54/40769/imanhua_015_140430414.png","/Files/Images/54/40769/imanhua_016_140430414.png","/Files/Images/54/40769/imanhua_017_140430429.png"];var sid=40769;'
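
As an alternative to the eval call above, which executes whatever text the page sends, the four arguments can be pulled out with a regular expression. This is only a rough sketch: the names ARGS_RE and extract_args and the shortened sample string are made up for illustration, and it assumes the payload and word list are simple single-quoted strings that Python's ast.literal_eval can read.

```python
import ast
import re

# Matches the argument list that follows the packer's function body:
#   }('payload',radix,count,'word|word|...'.split('|')
ARGS_RE = re.compile(
    r"\}\(('(?:[^'\\]|\\.)*'),(\d+),(\d+),('(?:[^'\\]|\\.)*')\.split\('\|'\)"
)

def extract_args(packed):
    """Return (p, a, c, k) from a p,a,c,k,e,d packed script (hypothetical helper)."""
    m = ARGS_RE.search(packed)
    if m is None:
        raise ValueError("no packer argument list found")
    p = ast.literal_eval(m.group(1))             # template code
    a = int(m.group(2))                          # radix
    c = int(m.group(3))                          # number of words
    k = ast.literal_eval(m.group(4)).split("|")  # substitution words
    return p, a, c, k

# Shortened, made-up sample with the same shape as the script in the question.
sample = ("eval(function(p,a,c,k,e,d){return p}"
          "('2 0=1;',10,3,'len|17|var'.split('|'),0,{}))")
print(extract_args(sample))
```

The returned tuple can then be passed on as unpack(p, a, c, k).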

Additional note: it was brought to my attention that Alex's int2base function breaks if the radix is greater than 36. The solution is to extend its digit set with the uppercase characters, like so: digs = string.digits + string.ascii_lowercase + string.ascii_uppercase (the older names string.lowercase and string.uppercase exist only in Python 2).
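
For reference, here is a minimal sketch of a base conversion with that 62-character digit set. It is not Alex Martelli's code, just an illustration of the same idea; the constant name DIGITS is my own choice.

```python
import string

# 0-9, then a-z, then A-Z: enough digits for any radix up to 62.
DIGITS = string.digits + string.ascii_lowercase + string.ascii_uppercase

def int2base(x, base):
    """Render the integer x as a string in the given base (2..62)."""
    if not 2 <= base <= len(DIGITS):
        raise ValueError("base must be between 2 and %d" % len(DIGITS))
    if x == 0:
        return DIGITS[0]
    sign = "-" if x < 0 else ""
    x = abs(x)
    out = []
    while x:
        x, r = divmod(x, base)   # peel off the least significant digit
        out.append(DIGITS[r])
    return sign + "".join(reversed(out))

print(int2base(26, 10), int2base(35, 36), int2base(61, 62))
```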

evandrix
Paul
1

This seems to be suitable for my needs: http://code.google.com/p/python-spidermonkey/

overboming
  • I've also found a very simple command line tool built in on Mac OS called jrunscript, this works out of the box. – overboming May 18 '10 at 07:35
0

When importing a JavaScript module is not an option, I use this:

import re

def baseN(num,b,numerals="0123456789abcdefghijklmnopqrstuvwxyz"):
    return ((num == 0) and numerals[0]) or (baseN(num // b, b, numerals).lstrip(numerals[0]) + numerals[num % b])

def unpack(p, a, c, k, e=None, d=None):
    while (c):
        c-=1
        if (k[c]):
            p = re.sub("\\b" + baseN(c, a) + "\\b",  k[c], p)
    return p

encrypted = r'''eval(function(p,a,c,k,e,d){e=function(c){return c};if(!''.replace(/^/,String)){while(c--)d[c]=k[c]||c;k=[function(e){return d[e]}];e=function(){return'\\w+'};c=1;};while(c--)if(k[c])p=p.replace(new RegExp('\\b'+e(c)+'\\b','g'),k[c]);return p;}('5 11=17;5 12=["/3/2/1/0/13.4","/3/2/1/0/15.4","/3/2/1/0/14.4","/3/2/1/0/7.4","/3/2/1/0/6.4","/3/2/1/0/8.4","/3/2/1/0/10.4","/3/2/1/0/9.4","/3/2/1/0/23.4","/3/2/1/0/22.4","/3/2/1/0/24.4","/3/2/1/0/26.4","/3/2/1/0/25.4","/3/2/1/0/18.4","/3/2/1/0/16.4","/3/2/1/0/19.4","/3/2/1/0/21.4"];5 20=0;',10,27,'40769|54|Images|Files|png|var|imanhua_005_140430179|imanhua_004_140430179|imanhua_006_140430226|imanhua_008_140430242|imanhua_007_140430226|len|pic|imanhua_001_140429664|imanhua_003_140430117|imanhua_002_140430070|imanhua_015_140430414||imanhua_014_140430382|imanhua_016_140430414|sid|imanhua_017_140430429|imanhua_010_140430289|imanhua_009_140430242|imanhua_011_140430367|imanhua_013_140430382|imanhua_012_140430367'.split('|'),0,{}))'''

encrypted = encrypted.split('}(')[1][:-1]

# print() with a single argument works on both Python 2 and Python 3
print(eval('unpack(' + encrypted))

output:

var len=17;var pic=["/Files/Images/54/40769/imanhua_001_140429664.png","/Files/Images/54/40769/imanhua_002_140430070.png","/Files/Images/54/40769/imanhua_003_140430117.png","/Files/Images/54/40769/imanhua_004_140430179.png","/Files/Images/54/40769/imanhua_005_140430179.png","/Files/Images/54/40769/imanhua_006_140430226.png","/Files/Images/54/40769/imanhua_007_140430226.png","/Files/Images/54/40769/imanhua_008_140430242.png","/Files/Images/54/40769/imanhua_009_140430242.png","/Files/Images/54/40769/imanhua_010_140430289.png","/Files/Images/54/40769/imanhua_011_140430367.png","/Files/Images/54/40769/imanhua_012_140430367.png","/Files/Images/54/40769/imanhua_013_140430382.png","/Files/Images/54/40769/imanhua_014_140430382.png","/Files/Images/54/40769/imanhua_015_140430414.png","/Files/Images/54/40769/imanhua_016_140430414.png","/Files/Images/54/40769/imanhua_017_140430429.png"];var sid=40769;
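
If the goal is the list of image paths rather than the JavaScript text, the pic array can be read out of the unpacked string. This is a rough sketch that assumes the array is written with double-quoted strings, as in the output above, so that it is also valid JSON. The unpacked value below is a shortened sample with only two entries, not the full output.

```python
import json
import re

# Shortened sample of the unpacked script (two entries only, for illustration).
unpacked = ('var len=2;var pic=["/Files/Images/54/40769/imanhua_001_140429664.png",'
            '"/Files/Images/54/40769/imanhua_002_140430070.png"];var sid=40769;')

# Capture the array literal assigned to pic, up to the first "];".
m = re.search(r"var pic=(\[.*?\]);", unpacked)
pics = json.loads(m.group(1)) if m else []

for path in pics:
    print(path)
```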
uingtea