This is probably a basic question, but I'm new to MATLAB and have been struggling with it for a while. I'm trying to import data from a website using `urlread`. I can import the data into a cell, but it is in HTML form and I lose all the table formatting. I thought I'd be able to retrieve it using `textscan`, but I end up with empty cells. The code I'm using is below. I've also tried saving the page and importing it using `uiimport`, with no luck.

data = urlread('http://www.extraskater.com/team/montreal-canadiens/2013/gamelog?sort=game');
readData = textscan(data, '%f %s %f %f %f %f %f %f %f %f %f %f %f %f %f %f', 'delimiter');
  • What are you trying to import from that page? – darthbith Jun 07 '14 at 16:11
  • I'm trying to get the numerical data in columns: TOI, GF, GA, CF, etc. I'd like to get them either as a matrix or individual vectors - either would work. – user3498384 Jun 07 '14 at 16:13
  • I think your best bet is to use a full [html parser](http://stackoverflow.com/a/20552447/2449192). Unfortunately, there isn't one built into MATLAB. Do you have to use MATLAB? Alternately, you could use something like Python to parse the HTML and print out a text file if you need to do some math in MATLAB (but Python can probably do that math as well)... – darthbith Jun 07 '14 at 16:18
  • Thanks, I'll look into that. I don't know much about Python, so I'll probably just copy and paste everything into Excel and import from there; I was just hoping Matlab had an easier way. Thanks for the help. – user3498384 Jun 07 '14 at 16:27
  • No problem, sorry it didn't work out! Good luck! – darthbith Jun 07 '14 at 16:29
  • Calling `urlread` returns the actual HTML source of the webpage. Unfortunately, you won't be able to use `textscan` to get your data that way, because the HTML is not laid out as plain columns of text; the table you see in the browser is produced by a lot of surrounding HTML markup. You could perhaps look into **regular expressions** to search for the tags that delimit the table cells, but an HTML parser (like what @darthbith recommends) is the better approach. Good luck! – rayryeng Jun 08 '14 at 02:18
  • That's a very helpful explanation, thank you! – user3498384 Jun 09 '14 at 01:03
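
The regular-expression idea from the comments can be sketched roughly as follows. This is a minimal illustration, not a robust solution: the `html` string is a hypothetical stand-in for what `urlread` returns, and the real page's markup (attributes, nested tags, extra tables) may well differ. It assumes a simple table built from `<tr>` and `<td>` tags, and any cell that is not numeric (team names, dates) ends up as `NaN`.

```matlab
% Hypothetical HTML fragment standing in for the output of urlread;
% the actual page source is an assumption and may look different.
html = ['<table>' ...
        '<tr><th>Game</th><th>TOI</th><th>GF</th><th>GA</th></tr>' ...
        '<tr><td>1</td><td>48.5</td><td>2</td><td>1</td></tr>' ...
        '<tr><td>2</td><td>50.1</td><td>0</td><td>3</td></tr>' ...
        '</table>'];

% Capture the contents of every table row.
rows = regexp(html, '<tr[^>]*>(.*?)</tr>', 'tokens');

values = [];
for k = 1:numel(rows)
    % Capture the contents of every <td> cell in this row.
    cells = regexp(rows{k}{1}, '<td[^>]*>(.*?)</td>', 'tokens');
    if isempty(cells)
        continue    % row has no <td> cells (e.g. the <th> header row)
    end
    cells = [cells{:}];                        % flatten to a cell array of strings
    cells = regexprep(cells, '<[^>]+>', '');   % strip any tags nested inside a cell
    values(end+1, :) = str2double(cells);      %#ok<AGROW> non-numeric text becomes NaN
end

% Individual columns can then be pulled out as vectors.
TOI = values(:, 2);
GF  = values(:, 3);
```

With the sample fragment above, `values` is a 2-by-4 numeric matrix. The loop assumes every data row has the same number of cells; a page with ragged rows or several tables would need extra handling, which is one reason a real HTML parser is usually the safer choice.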

0 Answers