Below is a screenshot of the problem I'm facing when converting a numeric object to a character object in R. The data frame is otherwise correct, but the trailing zeros are missing from the values "46" and "104", which should read "46.0" and "104.0".

[Screenshot of the resulting data frame]

Consider the following MWE:

library(dplyr)

# I first created random data and then created quintiles from the data.
# The random data are listed below:
testdata1 <- structure(list(X = c(62.5229689071269, 145.825042620083, 124.871684774549, 
                                86.2501301893607, 101.433010648648, 144.618979893455, 110.778688415318, 
                                45.9851314727384, 106.411772801465, 56.7832887263229, 162.318035050403, 
                                72.8574239442922, 133.416450070424, 137.670510111283, 107.965525693767, 
                                114.545917853894, 103.963829924899, 123.393869519699, 70.6355172309528, 
                                67.4792934191092), quintiles = structure(c(1L, 5L, 4L, 2L, 2L, 
                                5L, 3L, 1L, 3L, 1L, 5L, 2L, 4L, 5L, 3L, 4L, 3L, 4L, 2L, 1L),
                                .Label = c("[46,70]", "(70,103]", "(103,112]", "(112,134]", "(134,162]"),
                                class = "factor")), row.names = c(NA, 20L), class = "data.frame")

# A new dataframe "testdata2" will show in 4 columns:
# 1) quintiles,
# 2) min. value of X in each quintile,
# 3) max. value of X in each quintile, and
# 4) the range between mins and maxs within the quintiles:

testdata2 <- as.data.frame(levels(testdata1$quintiles))
names(testdata2)[1] <- 'Quintiles'

testdata2$Min <- testdata1 %>% group_by(quintiles) %>% summarise(X = min(X)) %>%
  select(X)  %>% mutate(across(where(is.numeric), round, 1)) %>% as.matrix %>% as.character

testdata2$Max <- testdata1 %>% group_by(quintiles) %>% summarise(X = max(X)) %>%
  select(X)  %>% mutate(across(where(is.numeric), round, 1)) %>% as.matrix %>% as.character

testdata2$Range <- format(paste(testdata2$Min, testdata2$Max, sep="-"))

View(testdata2)

As a side note, I had great difficulty preventing whole vectors (all min values and all max values) from being projected into each individual cell of the data frame. If you remove the two as.matrix calls from the code, you will see what I mean. Is there a more elegant way of achieving the result than using as.matrix?

Any help is greatly appreciated.

jaggedjava
  • Could you create an example that doesn’t require unnecessary external packages? I don’t understand your problem, but I can’t execute your example without first installing the ‘gtools’ package (which, if I understand correctly, is not actually related to your issue). (And even then I don’t think your example actually demonstrates “trailing zeros”, which may be due to the fact that you’re using random numbers without fixing a seed). Please just post an example dataset using [`dput`](https://stackoverflow.com/a/5963610/1968). – Konrad Rudolph Oct 12 '21 at 09:36
  • I apologize, you are absolutely right. As per your advice, I have now rewritten the code using dput. I hope it makes better sense now. – jaggedjava Oct 12 '21 at 11:12

1 Answer

Instead of pasting the strings together, have a go with sprintf, which gives you quite a lot of control over the formatting.

> sprintf("%d.0-%d.0", 71, 100)
[1] "71.0-100.0"
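
To illustrate why the zeros go missing in the first place, here is a minimal base-R sketch. It uses three of the X values from the question; the variable names (`x`, `as_char`, `as_sprintf`, `as_formatc`) are only illustrative. It compares `as.character()` on rounded numbers with two formatting functions that fix the number of decimals.

```r
# Three values taken from the question's data
x <- c(45.9851314727384, 103.963829924899, 70.6355172309528)

# round() still returns numbers; as.character() then writes the shortest
# representation, so a rounded value of 46.0 ends up as "46".
as_char <- as.character(round(x, 1))

# sprintf() with "%.1f" always writes exactly one decimal place.
as_sprintf <- sprintf("%.1f", x)

# formatC() with format = "f" and digits = 1 behaves the same way.
as_formatc <- formatC(x, format = "f", digits = 1)

print(as_char)     # "46"   "104"   "70.6"
print(as_sprintf)  # "46.0" "104.0" "70.6"
print(as_formatc)  # "46.0" "104.0" "70.6"
```

Either formatting function can replace the round-then-as.character step, since both round and convert to character in one go.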

EDIT

The full solution would look something like this.

testdata1 %>% 
  group_by(quintiles) %>% 
  mutate(
    Min = min(X),
    Max = max(X),
    Range = sprintf("%.1f-%.1f", Min, Max)
    ) 

sprintf will sort out the formatting and keep the trailing zeros. Thanks for the correction, Roland; not sure what I was doing above.
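
As discussed in the comments below, swapping `mutate` for `summarise` should return one row per quintile rather than one row per observation. The following is a sketch of that variant, assuming dplyr is installed; the data are rebuilt from the question's `dput` output, and the helper names `codes` and `result` are illustrative only.

```r
library(dplyr)

# X values and quintile codes copied from the question's dput() output
testdata1 <- data.frame(
  X = c(62.5229689071269, 145.825042620083, 124.871684774549,
        86.2501301893607, 101.433010648648, 144.618979893455,
        110.778688415318, 45.9851314727384, 106.411772801465,
        56.7832887263229, 162.318035050403, 72.8574239442922,
        133.416450070424, 137.670510111283, 107.965525693767,
        114.545917853894, 103.963829924899, 123.393869519699,
        70.6355172309528, 67.4792934191092)
)
codes <- c(1, 5, 4, 2, 2, 5, 3, 1, 3, 1, 5, 2, 4, 5, 3, 4, 3, 4, 2, 1)
testdata1$quintiles <- factor(
  codes, levels = 1:5,
  labels = c("[46,70]", "(70,103]", "(103,112]", "(112,134]", "(134,162]")
)

# summarise() collapses each group to a single row; Min and Max stay
# numeric, and only Range is turned into text by sprintf().
result <- testdata1 %>%
  group_by(quintiles) %>%
  summarise(
    Min = min(X),
    Max = max(X),
    Range = sprintf("%.1f-%.1f", Min, Max)
  )

print(result$Range)
# "46.0-67.5" "70.6-101.4" "104.0-110.8" "114.5-133.4" "137.7-162.3"
```

Keeping Min and Max numeric means they can still be used in calculations. If they need to be displayed with one decimal as well, they could be passed through `sprintf("%.1f", ...)` in the same way.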

Quixotic22
  • Yes, `sprintf` can be used here. However, your format string is wrong. It should be something like `sprintf("%.1f-%.1f", 71.1, 100)`. – Roland Oct 12 '21 at 11:18
  • @Quixotic22, can you please be more specific? I rewrote the code in such a way that my original random data may now be used by the commenters. (Before posting the question, I have tried sprintf, but to no avail.) – jaggedjava Oct 12 '21 at 11:28
  • Additional notes added – Quixotic22 Oct 12 '21 at 13:06
  • I confirm that your code solves perfectly not only 1) the issue with disappearing trailing zeros/decimals but also 2) the "as a side note" problem mentioned in my question. Besides, the code got much tidyer (pun unintended). Thanks to both of you and even the user suggesting the usage of dput. – jaggedjava Oct 12 '21 at 14:45
  • @Quixotic22, I forgot one final comment: in order to achieve the result (and only that result, without extra rows) depicted in the screenshot at the beginning of my question, I believe the following snippet has to be added to your code: `%>% filter(X == min(X) | X == max(X)) %>% arrange(X) %>% select(-X) %>% distinct` – jaggedjava Oct 13 '21 at 08:07
  • @jaggedjava, the best method is actually to change the `mutate` verb above to `summarise`. – Quixotic22 Oct 13 '21 at 08:12
  • @Quixotic22, you're absolutely right: the most efficient solution is to use the full work that you posted, but in such a way that `mutate` is substituted with `summarise`. With that simple modification, there is no need for any additional code snippets. – jaggedjava Oct 16 '21 at 08:06