
I am reading an Excel file using the readxl package, specifically its read_excel() function.

The strings in one of my columns end with the "$" symbol. When I check the length of the strings in that column, the number reported by nchar() is one more than the length I can see (8 instead of 7).

This does not happen in the other columns of my Excel file, whose strings do not end with the "$" character.

  • I tried to explicitly format that column as "Text" in Excel, but it did not help.
  • I also tried the trim_ws = TRUE parameter of read_excel(), again without success.

Here are the strings that I read along with the results of nchar:

[Image: the strings as read, with their nchar() results]

Any help would be much appreciated.

stratar
  • You'll need to share a representation of one of those strings that we can examine ourselves. We can't do much just looking at screen shots. `dput()` output would be useful. – joran Jan 29 '19 at 16:50
  • Not sure what dput does to be honest. I cannot share the file as it is forbidden to share it (I am operating under corporate rules here, not at home). Basically the string it parses is **M1USE1$<U+200B>** instead of just **M1USE1$** – stratar Jan 29 '19 at 16:56
  • So your file has hidden zero-width characters in it, which R is faithfully reporting to you. Simply Googling "remove u200b in r" led me to [this](https://stackoverflow.com/q/39993715/324364). – joran Jan 29 '19 at 16:59
  • Proprietary data does not in and of itself prevent [Creating a great reproducible example in R](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Please see the linked question and update yours. – Mako212 Jan 29 '19 at 17:45
  • And `dput` exports an R object in a form where it can be easily reconstructed by someone trying to help you in R, while preserving any idiosyncrasies of the data in question. – Mako212 Jan 29 '19 at 17:46
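
The diagnosis and fix suggested in the comments can be sketched in base R roughly as follows. This is only an illustration: the value `x` is a made-up stand-in for one cell as returned by read_excel(), and it assumes the extra character really is a zero-width space (U+200B). The real file may contain a different invisible character, in which case the code points printed by `utf8ToInt()` would show which one.

```r
# Hypothetical stand-in for one cell value: the visible text "M1USE1$"
# followed by a zero-width space (U+200B). Not taken from the real file.
x <- "M1USE1$\u200b"

nchar(x)        # 8: the invisible character is counted
utf8ToInt(x)    # code points of each character; 8203 is U+200B

# dput() writes the value in a form that can be pasted into a question
# without sharing the original file.
dput(x)

# Assumption: U+200B is the only unwanted character. Remove every
# occurrence, wherever it sits in the string.
clean <- gsub("\u200b", "", x, fixed = TRUE)
nchar(clean)    # 7
```

The same `gsub()` call can be applied to a whole column, since it is vectorised over its input.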

0 Answers