Results differ between R 3.6 and R 4.1

My code runs well under R 3.6 on an Ubuntu 18 server, but the same code works very badly under R 4.1 on Ubuntu 20. Look at this capture: Issue with R Version

The purpose of this code is to normalize a column by dividing its values by the column sum.
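
As a minimal sketch of that kind of normalization (the data frame `df` and its columns `all_bins` and `count` are hypothetical stand-ins, since the actual code was only posted as an image):

```r
# Hypothetical data; the real column names are not known from the post.
df <- data.frame(
  all_bins = c("b", "a", "c"),
  count = c(10, 30, 60),
  stringsAsFactors = FALSE
)

# Divide every value by the column total so the column sums to 1.
df$normalized <- df$count / sum(df$count)

print(df)
```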

Thank you all in advance.

krishna
  • In R 3.6, character vectors read into a data frame are interpreted as factors by default. In R 4.1 they are kept as character vectors. You can see the difference in the quotation marks around the elements in the character vector. If you want to keep the column as a factor variable, include `stringsAsFactors = TRUE` in the call that creates the data frame in R 4.1. – Allan Cameron Nov 01 '22 at 08:05
  • Your code is broken and probably gives wrong results under R 3.6. When moving to R 4.0, R actually improved this buggy behaviour so it now gives you NAs instead of misleading, correct-looking but wrong results. – Konrad Rudolph Nov 01 '22 at 08:17
  • Images are not a good way for posting data or code. See [this Meta](https://meta.stackoverflow.com/a/285557/8245406) and a [relevant xkcd](https://xkcd.com/2116/). Post the data and code properly and I'll upvote what is otherwise a good, important question. – Rui Barradas Nov 01 '22 at 08:18

1 Answer

Please don't post code as an image. It is also advised to post a reproducible example.

In any case, in your R 3.6 example, `all_bins` is a factor. In your R 4.1 example, however, `all_bins` is a character vector.

This is because of a change in R 4.0.0:

R now uses a ‘stringsAsFactors = FALSE’ default, and hence by default no longer converts strings to factors in calls to data.frame() and read.table().
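
A minimal sketch of that change, using a small inline CSV as a hypothetical stand-in for your file (the column names are assumptions). Passing the argument explicitly gives the same result on either R version:

```r
# Hypothetical stand-in for the file read in the question.
csv_text <- "all_bins,count\nb,10\na,30\nc,60"

# Explicit FALSE mimics the R >= 4.0.0 default: strings stay character.
as_chr <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Explicit TRUE mimics the R < 4.0.0 default: strings become factors.
as_fct <- read.csv(text = csv_text, stringsAsFactors = TRUE)

print(class(as_chr$all_bins))
print(class(as_fct$all_bins))
print(levels(as_fct$all_bins))
```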

To reproduce the R 3.6 behaviour under R 4.1, add the argument `stringsAsFactors = TRUE` when you read in `bins`, e.g.:

bins <- read.csv("path/to/file", stringsAsFactors = TRUE)

This should solve this particular issue. However, you may run into other differences between R 3.6 and R 4.1 on different machines. I would recommend running the same version of R and packages on both machines, perhaps using renv, if you want to ensure the output is the same.
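
As a rough sketch, you could run something like the following on both machines and compare the output before comparing results. The variable names are illustrative only:

```r
# Report the running R version; compare this output across machines.
r_version <- as.character(getRversion())
cat("R version:", r_version, "\n")

# TRUE when this session is on or after 4.0.0, where the
# stringsAsFactors default changed from TRUE to FALSE.
new_default <- getRversion() >= "4.0.0"
cat("Strings converted to factors by default:", !new_default, "\n")
```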

SamR
  • Note that ‘renv’ won’t actually solve issues around the use of different R versions. It will warn about the different version, but that’s all. – Konrad Rudolph Nov 01 '22 at 08:13
  • Also, note that merely including `stringsAsFactors = TRUE`, while restoring the R 3.6 behaviour, won’t actually fix the **bug in the code**, which was caught by R 4.0. Namely, indexing by factors rarely does the expected thing. It’s highly likely that the code yielded completely wrong results under R 3.6. If the results are correct then that’s purely by chance. – Konrad Rudolph Nov 01 '22 at 08:17
  • @KonradRudolph Yes, I thought the indexing looked very odd and there was a good chance the code was doing something unintended, but from the posted image it's hard to tell what's happening or what is desired. Thanks for the clarification about renv; I haven't actually used it to run different versions before. – SamR Nov 01 '22 at 08:23
  • Thank you very much, @SamR. I solved the problem by converting the column to a factor with `factor()`. Your recommendation seems to be the best; I will apply it when reading the table. Thank you again. – krishna Nov 01 '22 at 09:09
  • Glad it seemed to work, but I would take into account the point made by @KonradRudolph that reproducing your results may simply be reproducing a bug. Are you sure you are getting the results you expect? – SamR Nov 01 '22 at 09:14
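
The point about indexing by factors can be illustrated with a small sketch. This is not the code from the question (which was only posted as an image); `totals` and the bin labels are hypothetical. In R, a factor used as an index is treated as its integer codes, not as its labels:

```r
# Hypothetical named vector, deliberately not in alphabetical order.
totals <- c(c = 60, a = 30, b = 10)

bins_chr <- c("b", "a", "c")
bins_fct <- factor(bins_chr)  # levels: a, b, c; integer codes: 2, 1, 3

# Character index: looks elements up by name, as intended.
by_name <- totals[bins_chr]

# Factor index: silently uses the integer codes 2, 1, 3 as positions.
by_code <- totals[bins_fct]

# Converting back to character restores lookup by name.
by_label <- totals[as.character(bins_fct)]

print(by_name)
print(by_code)
print(by_label)
```

Here the factor index returns values in a different order from the name lookup without any warning, which is the kind of correct-looking but wrong result described in the comments.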