I get an "incomplete final line found by readTableHeader" warning when using read.delim() to read a tab-delimited text file. The header and content contain Traditional Chinese characters, so I am already specifying a non-default encoding, like this:
kg = read.delim("KG_EDB_20150505.csv",fileEncoding="UTF-16LE")
Warning message:
In read.table(file = file, header = header, sep = sep, quote = quote, :
incomplete final line found by readTableHeader on 'KG_EDB_20150505.csv'
I have read other posts with similar issues, e.g.:
- 'Incomplete final line' warning when trying to read a .csv file into R
- In read.table(): incomplete final line found by readTableHeader
But unfortunately the suggested solutions in these posts cannot solve the problem.
A summary of what I have tried:
1. Pressing ENTER at the end of the last line of the text file: same warning.
2. Trimming the text file down to the header plus 1 single line of data, and making sure there is a new line (ENTER) between the header line and the data line: same warning.
3. Trimming the text file until only the header is left, then copying and pasting the header onto the next line so that it acts as a fake line of data, and adding a new line (ENTER) after that fake line: WORKS! The Chinese is all garbage, but I do not need it anyway.
4. Removing the trailing new line (ENTER) from the file in #3: same warning, but the 1 line of fake data is read into the data.frame.
5. Opening the file in Excel directly: works, but that is not the workflow I want.
So what gives?
Is there a way I can read in such a file?
or
Is there a way to massage the file (preferably in R) and then read it in?
The file is here:
https://dl.dropboxusercontent.com/u/5860015/KG_EDB_20150505.csv
It was from a government webpage here:
http://www1.map.gov.hk/gih3/view/index.jsp
(Map Tools > Data Download > Kindergarten-cum-child Care Centres)
Many thanks in advance!
Update:
By a stroke of luck, I isolated an offending character inside the text file, namely the Chinese character "稚". It may not be the only one, but if I add it to the file from #3, I get the same warning again. I do not know what is special about this character, and I do not need any of the Chinese text in the file anyway.
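To see what might be special about this character, here is a minimal sketch that dumps its bytes in UTF-16LE. The code point U+7A1A for "稚" is my assumption, and the variable names are only for illustration.

```r
# Build the character from its (assumed) code point so the script itself
# does not depend on the encoding of the source file.
ch <- intToUtf8(0x7A1A)

# Convert to UTF-16LE and return the raw bytes instead of a string.
bytes <- iconv(ch, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
print(bytes)
```

If the first byte comes out as 0x1a, that is the ASCII SUB (Ctrl-Z) control character, which some Windows text-mode readers treat as an end-of-file marker. Whether that is what actually trips read.delim() here is only a guess on my part.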
So now there are more questions:
- Is there a way to skip reading this offending character?
or
- Is there a way in R to replace this offending character in the file, before reading in the text file?
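One direction I can think of for "massaging" the file in R is to read it as raw bytes, decode it myself, and only then hand the text to read.delim(). This is a rough sketch, not a confirmed fix: read_utf16le_delim is a hypothetical helper name, and it assumes the file really is UTF-16LE, possibly starting with a byte-order mark.

```r
# Hypothetical helper: read a UTF-16LE delimited file via raw bytes,
# so that no text-mode processing happens before decoding.
read_utf16le_delim <- function(path, ...) {
  bytes <- readBin(path, what = "raw", n = file.size(path))

  # Drop a leading byte-order mark (FF FE) if present.
  if (length(bytes) >= 2 &&
      bytes[1] == as.raw(0xff) && bytes[2] == as.raw(0xfe)) {
    bytes <- bytes[-(1:2)]
  }

  # Decode the whole file into one UTF-8 string.
  txt <- iconv(list(bytes), from = "UTF-16LE", to = "UTF-8")

  # Normalise Windows line endings before parsing.
  txt <- gsub("\r\n", "\n", txt, fixed = TRUE)

  read.delim(text = txt, ...)
}

# Intended use (file name taken from the question):
# kg <- read_utf16le_delim("KG_EDB_20150505.csv")
```

In a non-UTF-8 locale the Chinese text may still come out escaped or garbled, which would be acceptable for my purposes since I only need the non-Chinese columns.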