Let me explain this briefly and clearly.

Basically, I have an Excel file named PT.xlsx.

One of the columns, with the header No_Days_In_Location, stores its numbers as text. (I have checked it all again and confirmed that the values are only numbers stored as text, with no alphabetic characters.)

I converted the data in No_Days_In_Location to numeric with the command:

df_PT$No_Days_In_Location <- as.numeric(df_PT$No_Days_In_Location)

After that, I not only received the warning message "NAs introduced by coercion" but, more worryingly, some of the data disappeared.
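One way to see which entries are affected is to compare the original strings with the converted values. The following is only a sketch: the vector `x` holds made-up values standing in for the real column (which is not shown here), and the names `x` and `y` are illustrative.

```r
# Hypothetical values standing in for df_PT$No_Days_In_Location
x <- c("12", "7", "1,075.54", "30")

# Convert as in the question; suppressWarnings() only hides the
# "NAs introduced by coercion" warning, it does not change the result
y <- suppressWarnings(as.numeric(x))

# Original strings that could not be converted to a number
bad <- x[is.na(y)]
print(bad)
```

Running the same comparison on the real column lists exactly the strings that `as.numeric()` rejects, which usually makes the pattern easy to spot.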

Could you give me some help or advice on handling an issue like this?

  • This is difficult to help with without seeing your data -- can you share enough of that column so that the problem is reproduced? But there must be other characters (whitespace, misplaced decimals, commas, other hidden characters, etc.) somewhere in the strings. A simple example: `x = c("1 ", " 2", " 3 ", "4 . ", "5, ") ; y = as.numeric(x); y`. You can look at which values are being made `NA` using `x[is.na(y)]`, which should be helpful. – user20650 May 29 '23 at 11:46
  • We need to see (some of) your data in order to give definitive advice. One possibility that springs to mind is a mismatch of locales. What separators are being used in your data file? Do they correspond to your settings in R? Is there any pattern to the values that “disappear”? – Limey May 29 '23 at 11:46
  • Hi @user20650, your comment is very helpful! I started to notice that the data disappeared for numbers such as "1,075.54", "1,010.46", ... which contain a comma. Can I send you the data by email? Or can you contact me at d2ydx2@hotmail.com? – Gambit May 29 '23 at 12:01
  • @Limey, your comment is also very helpful to me. I just noticed that the values that disappear are the ones containing a comma used as the thousands separator... Can I send you the data by email? – Gambit May 29 '23 at 12:03
  • No, you may not contact me via email. But you now have your answer: the problem is caused by a mismatch in locales. It is easily solved and there are many questions and answers on the topic on SO. The correct solution for you depends on how you are reading your data, and you have given us no information about that. You maximise your chance of getting a useful answer if you provide a minimal reproducible example. [This post](https://stackoverflow.com/help/minimal-reproducible-example) may help. – Limey May 29 '23 at 12:09
  • @Gambit ; see https://stackoverflow.com/questions/1523126/how-to-read-data-when-some-numbers-contain-commas-as-thousand-separator for removing the commas. – user20650 May 29 '23 at 12:25
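
The approach the comments point to, removing the thousands separator before converting, might look like the sketch below. It rests on several assumptions: the data frame here is a made-up stand-in for the one read from PT.xlsx, the comma is always a thousands separator, and the period is the decimal mark. If the file uses a comma as the decimal mark instead, this would give wrong numbers.

```r
# Hypothetical stand-in for the data frame read from PT.xlsx
df_PT <- data.frame(
  No_Days_In_Location = c("12", "1,075.54", "1,010.46"),
  stringsAsFactors = FALSE
)

# Remove every literal comma (assumed to be a thousands separator)
cleaned <- gsub(",", "", df_PT$No_Days_In_Location, fixed = TRUE)

# Convert the cleaned strings to numeric
df_PT$No_Days_In_Location <- as.numeric(cleaned)

# Check that no value was turned into NA by the conversion
print(anyNA(df_PT$No_Days_In_Location))
```

Checking `anyNA()` after the conversion is a cheap way to confirm that no other stray characters remain in the column.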
