
I load the data with:

read.table("path.txt", sep = "\t", header=TRUE, fileEncoding="UCS-2")

The file contains three rows:

x         x2
MAKFA   МАКФА
makar   макароны
макар.  макароны

But I get this warning:

incomplete final line found by readTableHeader on 

and the resulting data frame is truncated (only the first row, and that row is cut short):

      x x2
1 MAKFA МА

How can I fix this? (I need to keep working with the .txt file.)

structure(list(x = structure(1L, .Label = "MAKFA", class = "factor"), 
               x2 = structure(1L, .Label = "МА", class = "factor")), .Names = c("x", 
                                                                                "x2"), class = "data.frame", row.names = c(NA, -1L))

The solution from "'Incomplete final line' warning when trying to read a .csv file into R" doesn't work for me.

Here is a link to download the .txt file:

https://dropmefiles.com/FfcC6

d-max
  • to help deal with encoding issues it may be necessary to provide a link that enables downloading of the original file (browsers tend to coalesce everything to UTF-8 in the majority of locales) – hrbrmstr Oct 12 '18 at 13:06
  • @hrbrmstr, i updated post – d-max Oct 12 '18 at 13:35
  • (in the middle of something else but a quick look with the Unix `file` command shows: `23.txt: Little-endian UTF-16 Unicode text, with CRLF line terminators`) – hrbrmstr Oct 12 '18 at 13:46

1 Answer


This may not be optimal, but it works. stringi's reliance on the ICU libraries makes it a great Swiss Army knife for overcoming encoding issues. When I saw that vim could read the file appropriately, I decided to give it a go with stringi:

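library(stringi)
library(docxtractr)

stri_read_lines("23.txt") %>%                  # read the file line by line
  stri_split_fixed("\t", simplify = TRUE) %>%  # split on tabs into a character matrix
  as.data.frame(stringsAsFactors=FALSE) %>%    # keep the columns as character
  docxtractr::assign_colnames(1)               # promote the first row to column names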
##      old      new
## 1  MAKFA    МАКФА
## 2  makar макароны
## 3 макар. макароны
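
If you would rather stay in base R, here is a minimal sketch of an alternative. It assumes the file really is UTF-16LE, as the `file` output in the comments suggests. It reads the raw bytes, converts them to UTF-8 with `iconv()`, and passes the result to `read.table()` through its `text` argument. The sample file, its contents, and the variable names (`sample_text`, `tmp`, `raw_bytes`, `txt`, `df`) are made up for illustration and stand in for 23.txt; I have not run this against the original file.

```r
# Build a small tab-separated sample encoded as UTF-16LE with CRLF line
# endings (a hypothetical stand-in for 23.txt).
sample_text <- paste0(
  "x\tx2\r\n",
  "MAKFA\t\u041c\u0410\u041a\u0424\u0410\r\n",
  "makar\t\u043c\u0430\u043a\u0430\u0440\u043e\u043d\u044b\r\n",
  "\u043c\u0430\u043a\u0430\u0440.\t\u043c\u0430\u043a\u0430\u0440\u043e\u043d\u044b\r\n"
)
tmp <- tempfile(fileext = ".txt")
writeBin(iconv(sample_text, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]], tmp)

# Read the file as raw bytes and convert them to a single UTF-8 string.
raw_bytes <- readBin(tmp, what = "raw", n = file.size(tmp))
txt <- iconv(list(raw_bytes), from = "UTF-16LE", to = "UTF-8")

# Drop a leading byte-order mark if there is one, and normalise CRLF to LF.
txt <- sub("^\ufeff", "", txt)
txt <- gsub("\r\n", "\n", txt, fixed = TRUE)

# Parse the decoded text as a tab-separated table.
df <- read.table(text = txt, sep = "\t", header = TRUE,
                 stringsAsFactors = FALSE)
print(df)
```

For the real file, replace `tmp` with `"23.txt"` and skip the part that builds the sample. Decoding the bytes yourself keeps the encoding step separate from the parsing step, so the result should not depend on how the current locale handles Cyrillic text.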
hrbrmstr