
I have a number of different HTML files that contain formatted tables, which I would like to combine in a knitr report in R. Unfortunately, I am having trouble loading the HTML files into R and including the tables in my knitr report.

The HTML files were created using the "save as htm" function in MS Excel and the stargazer library. They display flawlessly in any browser. My code is:

```{r, echo=FALSE, results='asis'}
library(XML)
overview.html <- htmlParse("overview.htm")
print(overview.html)
```

When printing `overview.html` in the console I get the correct HTML code. However, when knitting the report, the output document does not contain the HTML and I get the following warning:

Warning message:
XML content does not seem to be XML: 'overview.htm' 

I have tried several variations of the above (using `htmlTreeParse`, using the print `type = "html"` option, etc.) to no avail. It would be great if someone could suggest a way to make this work.

Phil
  • Does `overview.htm` contain a full HTML document (i.e. has complete `<html>…</html>` sections) or just the markup for the tables (i.e. `<table>…</table>`)? – hrbrmstr Jan 28 '15 at 10:33
  • It has a complete section with `<html>`, `<head>`, and `<body>` tags, also a … – Phil Jan 28 '15 at 10:34

2 Answers


If you want to preserve the formatting (and also not bother with XML/HTML churning), you can use an <iframe> to embed your full HTML document in the knitr doc like this:

```{r echo=FALSE, results='asis'}
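# Read the whole HTML file and percent-encode it so it can be used as a data URI
tmp <- URLencode(paste(readLines("/path/to/table.htm"), collapse="\n"))

# Emit the iframe as raw HTML; results='asis' makes knitr write it out unchanged
cat('<iframe src="data:text/html;charset=utf-8,', tmp ,
    '" style="border: none; seamless:seamless; width: 800px; height: 200px"></iframe>')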
```

It won't show up in the RStudio viewer, but it will show up in a real browser.

You'll need to tweak width and height (I could/should have made height a bit less for this example), but you'll have your fully formatted/styled tables in your knitted document this way.

NOTE: this only works if knitting to HTML.
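The same idea can be wrapped in a small function so that several table files can be embedded with different sizes. This is only a sketch: `embed_html_iframe()` is a hypothetical helper name, not something provided by knitr, and the `width`/`height` defaults are simply the values used above. Unlike the chunk above, it passes `reserved = TRUE` and `repeated = TRUE` to `URLencode()` so that characters such as `#` are encoded as well; that is an assumption of this sketch about what is safest inside a data URI, not part of the original answer. It also calls `normalizePath(mustWork = TRUE)` so that a wrong path fails with a clear error before anything is read.

```r
# Hypothetical helper (not part of knitr): build an <iframe> tag that embeds
# a complete HTML file as a data URI.
embed_html_iframe <- function(path, width = 800, height = 200) {
  # Resolve the path to an absolute one and stop early if the file is missing.
  full_path <- normalizePath(path, mustWork = TRUE)

  # Read the whole file into a single string.
  html <- paste(readLines(full_path, warn = FALSE), collapse = "\n")

  # Percent-encode the document. reserved/repeated = TRUE are choices of this
  # sketch so that every non-alphanumeric character is always encoded.
  encoded <- URLencode(html, reserved = TRUE, repeated = TRUE)

  # Assemble the tag without stray spaces between the pieces.
  paste0(
    '<iframe src="data:text/html;charset=utf-8,', encoded,
    '" style="border: none; width: ', width, 'px; height: ', height,
    'px"></iframe>'
  )
}
```

Inside a chunk with `results='asis'`, it would be used as `cat(embed_html_iframe("/path/to/table.htm", height = 300))`.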

hrbrmstr
  • That looks good. Still, when knitting I run into an error (Error in file(con,"r"): cannot open the connection Calls: ... URLencode -> strsplit -> paste -> readLines -> file Execution halted). Is there possibly something wrong with my knitr that keeps these errors appearing? – Phil Jan 28 '15 at 11:11
  • Ah okay. Including the FULL file path (not the relative one from my working directory) fixed that. This might also solve some of the previous errors I ran into. – Phil Jan 28 '15 at 11:14
  • Brilliant. Now it works including the preserved format. Thanks! – Phil Jan 28 '15 at 11:17

Here is my solution. Read the HTML into R and use xtable to output it as an HTML table:

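```{r, echo=FALSE, results='asis'}
library(XML)     # provides readHTMLTable()
library(xtable)
u = "http://en.wikipedia.org/wiki/List_of_airlines_of_Malaysia"
# Take the first table on the page and convert it to a data frame
tables = as.data.frame(readHTMLTable(u)[1])
# Print the data frame as an HTML table, without xtable's comment header
print(xtable(tables),type='html',comment=FALSE)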
```
chee.work.stuff
  • It's very strange. When I apply your solution to my data document it works in the console (as do those I tried before), but when I try it in knitr I get an error (Error in characters | factors : operations are possible only for numeric, logical or complex types Calls: ... eval -> eval -> print -> xtable -> xtable.data.frame) – Phil Jan 28 '15 at 10:49
  • This solution won't preserve the table CSS formatting. – hrbrmstr Jan 28 '15 at 10:58