I fetched some data using Beautiful Soup and Python requests from a secondary URL referenced in the main HTML source of a website (I think this is called dynamic referencing); the reference was a link to a .js file. Using Beautiful Soup I obtained the data (a list of lists), but it is all one string with a length of 16,000+. Every character, comma, etc. is counted as a single entry. I was later able to get the data I needed using Selenium, but is there still a way to convert the string I have into lists?

Here is an example of a page that references such a secondary URL. Let's say this one:

http://www.tennisabstract.com/cgi-bin/player.cgi?p=KeiNishikori

When I look at its HTML source, it references the data from the file below.

<script type="text/javascript" src="http://www.minorleaguesplits.com/tennisabstract/cgi-bin/jsmatches/KeiNishikori.js"></script>

But when I extracted my data from there (I need a var named matchmx), I got something like this:

[["20170102", "Brisbane", "Hard", "A", "L", "5", "3", "", "F", "6-2 2-6 6-3", "3", "Grigor Dimitrov", "17", "7", "", "R", "25.6344969199", "188", "BUL", "0", "108", "4", "0", "69", "49", "36", "9", "12", "2", "5", "7", "2", "77", "52", "41", "12", "13", "5", "7", "1", "20170107-M-Brisbane-F-Grigor_Dimitrov-Kei_Nishikori.html", "", "", "2017-M020-300", "", "", ""],

["20170102", "Brisbane", "Hard", "A", "W", "5", "3", "", "QF", "6-1 6-1", "3", "Jordan Thompson", "79", "", "WC", "R", "22.7049965777", "", "AUS", "0", "61", "3", "0", "34", "19", "18", "10", "7", "0", "0", "1", "2", "47", "28", "15", "5", "7", "3", "8", "2", "", "", "", "2017-M020-295", "", "3", "2"],

... and so on, but all as one single string, giving me a length in the thousands. How can I convert it into a list of lists, or otherwise use it so that I can ultimately load it into a DataFrame?

Mr. Confused

1 Answer

Hi, try the following code:

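import ast
# The data as one string, like the text in the question
p = '[["abcd","abcd"],["abcd","abcd"]]'
parsed = ast.literal_eval(p)  # parse the string as a Python literal
print(parsed)        # [['abcd', 'abcd'], ['abcd', 'abcd']]
print(type(parsed))  # <class 'list'>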

In reference to post
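
Applied to the question's case, a minimal sketch might look like the following. It assumes the downloaded .js text contains a statement of the form `var matchmx = [...];`. The `js_text` sample and the `extract_matchmx` helper are hypothetical stand-ins for illustration, and the real file's layout may differ.

```python
import ast
import re

# Hypothetical, shortened stand-in for the text of the downloaded .js file.
# The surrounding statements are made up; only the matchmx name is from the question.
js_text = '''
var placeholder = "not needed";
var matchmx = [["20170102", "Brisbane", "Hard", "A", "L"],
["20170102", "Brisbane", "Hard", "A", "W"]];
var another = 1;
'''

def extract_matchmx(source):
    """Return the matchmx array found in JavaScript source as a list of lists."""
    # Capture from the opening bracket up to the first "]" that is followed by ";".
    match = re.search(r"var\s+matchmx\s*=\s*(\[.*?\])\s*;", source, re.DOTALL)
    if match is None:
        raise ValueError("matchmx not found in source")
    # The array holds only string literals, so it is also a valid Python literal.
    return ast.literal_eval(match.group(1))

rows = extract_matchmx(js_text)
print(len(rows))
print(rows[0][1])
```

The resulting `rows` is an ordinary list of lists, so it can be passed to `pandas.DataFrame(rows)` if pandas is installed. This approach only works while the array contains plain literals; JavaScript values such as `null` or `true` would need a different parser.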

Pavan Kumar T S
  • Yes, thanks Pavan Kumar. I did it the same way using eval, after spending a lot of time searching online and going through many resources. But I would like to know more about the eval function and what it actually does; I read about it but couldn't make much of it. One of the resources I read said that using it is not recommended. Could you please explain what eval() is doing above? I know it does what we wanted, but how? Thanks for your help. – Mr. Confused Feb 08 '18 at 10:53
  • Yes, it's correct that using eval() is dangerous, but literal_eval() is not. See this post for further understanding: https://stackoverflow.com/a/15197698/7887883 – Pavan Kumar T S Feb 08 '18 at 11:34
  • Use ast.literal_eval() wherever you used eval(). – Pavan Kumar T S Feb 08 '18 at 11:35
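
To illustrate the difference discussed in the comments: eval() evaluates any Python expression, including function calls, while ast.literal_eval() accepts only literals (strings, numbers, lists, dicts and similar) and rejects everything else. A small sketch, where the variable names and sample strings are made up for illustration:

```python
import ast

# A string holding only literals, like the scraped data.
safe_text = '[["a", "1"], ["b", "2"]]'
# A string holding a function call; eval() would execute it.
unsafe_text = "__import__('os').getcwd()"

parsed = ast.literal_eval(safe_text)

try:
    ast.literal_eval(unsafe_text)
    rejected = False
except ValueError:
    # literal_eval refuses anything that is not a plain literal.
    rejected = True

print(parsed)
print(rejected)
```

With text fetched from a website, that refusal is the point: the parser never runs code that happens to be embedded in the string.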