I've grouped some data about the Titanic in the following way:

priceMutate <- mutate(
  titanic,
  PriceGroup = ifelse(Fare < 51, '0 - 50',
               ifelse(Fare >= 51 & Fare < 101, '51-100',
               ifelse(Fare >= 101 & Fare < 151, '101-150',
               ifelse(Fare >= 151 & Fare < 201, '151-200',
               ifelse(Fare >= 201 & Fare < 251, '201-250',
               ifelse(Fare >= 251 & Fare < 301, '251-300',
               ifelse(Fare >= 301 & Fare < 351, '301-350',
               ifelse(Fare >= 351 & Fare < 401, '351-400',
               ifelse(Fare >= 401 & Fare < 451, '401-450',
               ifelse(Fare > 450, '451+', '?'))))))))))
)

"Fare" is the price paid for a ticket for the Titanic. I've chosen steps of $50.

Now here is my problem:

I've made a plot that shows the chance of survival by ticket price group:

  output$ex15 <- renderPlot({
    ggplot(priceMutate, 
           aes(x = PriceGroup,
               fill = Status)) + 
      geom_bar(position = "fill")+
      ggtitle("Überlebenschancen nach Preis des Tickets (gruppiert)")+
      theme(plot.title = element_text(size = 16))+
      scale_fill_manual(values = c("grey24", "snow"))+
      labs(y= "Anzahl")
  })

However, this plot mixes up the order of the groups I made and does not show the "?" group for the not-available data!

Can anyone see a problem/mistake that I've made?

Here is a link to my dataset: https://drive.google.com/file/d/1xsIfkv1464etX23O0J9y35CviK0mKYQl/view?usp=sharing

Thank you a lot :)

    Please check your mutate statement and your data. There is no possible value for your data to take "?". – YBS Jul 03 '22 at 20:22

1 Answer

As @YBS already mentioned in the comments, at least for your example data no observation will be assigned a "?": all values lie in the range 0 to 512 and there are no missing values.
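As a side note, here is a small sketch of what would happen if there were missing fares. The vector `fare` and the shortened two-group labels are made up for illustration and are not taken from your data. Because a comparison with `NA` yields `NA`, `ifelse()` returns `NA` for those rows and never reaches the final "?" branch, so missing values would have to be handled explicitly:

```r
# Hypothetical fares, one of them missing (not from the real dataset)
fare <- c(7.25, 80, NA, 512.33)

# Shortened version of the nested ifelse() with only two groups
grp <- ifelse(fare < 51, "0 - 50",
              ifelse(fare >= 51, "51+", "?"))

# The missing fare stays NA; the "?" branch is never used
print(grp)

# One way to label missing fares explicitly
grp_filled <- ifelse(is.na(fare), "?", grp)
print(grp_filled)
```

Whether a "?" group is useful at all depends on your real data; in the data you shared there is nothing to put into it.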

Concerning your second issue: because you recoded the Fare column as a character, ggplot2 orders your PriceGroups alphabetically by default. Alphabetically, a string starting with a 4, like 451+, comes before a string starting with a 5, like 51-100. If you want the categories in numeric order, you have to convert the column to a factor with the levels set in your desired order. One way to achieve this is the cut function, which makes it easy to recode a numeric variable into intervals and automatically returns a factor. If you do this often, I would also suggest having a look at the santoku package, which makes it even easier to set nice labels.
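To illustrate the ordering issue in isolation, here is a minimal sketch using a handful of the labels from your question. The vector `groups` is just an illustrative stand-in for your `PriceGroup` column:

```r
# A few of the price-group labels, written in the intended order
groups <- c("0 - 50", "51-100", "101-150", "151-200", "451+")

# Default factor levels are sorted as text, so "451+" ends up before "51-100"
default_levels <- levels(factor(groups))
print(default_levels)
#> [1] "0 - 50"  "101-150" "151-200" "451+"    "51-100"

# Setting the levels explicitly keeps the intended order
ordered_groups <- factor(groups, levels = groups)
print(levels(ordered_groups))
#> [1] "0 - 50"  "51-100"  "101-150" "151-200" "451+"
```

If you prefer to keep your nested ifelse, wrapping the result in `factor()` with the full vector of labels as `levels` should give the same ordering on the x axis.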

Finally, instead of using your data, I created a minimal reproducible example with fake random data that mimics your real data:

library(shiny)
library(tidyverse)

# Create fake example data
set.seed(123)

titanic <- data.frame(
  PassengerId = 1:100,
  Survived = sample(0:1, 100, replace = TRUE),
  Fare = runif(100, 0, 512)
)

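# Set breaks and labels. With right = FALSE below, each interval is [lower, upper),
# so the labels read "0-51", "51-101", ..., "401-451", "451+"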
breaks <- c(0, seq(51, 451, 50), Inf)
labels <- paste(breaks[-length(breaks)], breaks[-1], sep = "-")
labels[length(labels)] <- "451+"

priceMutate <- titanic %>%
  mutate(PriceGroup = cut(Fare, breaks = breaks, labels = labels, right = FALSE),
         Status = recode(Survived, "0" = "Dead", "1" = "Survived"))

ui <- fluidPage(
  plotOutput("ex15")
)

server <- function(input, output, session) {
  output$ex15 <- renderPlot({
    ggplot(priceMutate, 
           aes(x = PriceGroup,
               fill = Status)) + 
      geom_bar(position = "fill")+
      ggtitle("Überlebenschancen nach Preis des Tickets (gruppiert)")+
      theme(plot.title = element_text(size = 16))+
      scale_fill_manual(values = c("grey24", "snow"))+
      labs(y= "Anzahl")
  })
}

shinyApp(ui, server)
#> 
#> Listening on http://127.0.0.1:8734

stefan