I have a data set that contains data about employees in companies. You can see the data below:

# Data
output_test <- data.frame(
  Employees = c(1, 2, 3, 10, 15, 122, 143, 150, 250, 300, 500, 1000)
)

The next step is classification. I need to classify Employees by company size: the number of employees determines the size of the company. For example, if the number is below 10, the company is a "micro" company; if the number is greater than 10 but less than or equal to 50, it is a "small" company. A "medium" company has more than 50 but at most 250 employees, and a "large" company has more than 250 employees. To do this, I wrote the following code with an if-else statement:

# Code
library(dplyr)

output_test_final <- output_test %>%
  mutate(
    Size = if (Employees >= 10) {
      "Micro"
    } else {
      if (Employees >= 50) {
        "Small"
      } else {
        if (Employees >= 250) {
          "Medium"
        } else {
          "Large"
        }
      }
    }
  )

The results from this code are not correct. Can anybody help me fix this code so that I get a table like the one below?

[image: expected output table]

silent_hunter

3 Answers

2

if works on a single (scalar) condition, so it cannot classify a whole column at once. For vectors you can use ifelse or, better here, case_when. Also note that your comparisons need to be reversed (<= instead of >=).

library(dplyr)

 output_test %>%
  mutate(Size = case_when(Employees <= 10 ~ "Micro", 
                          Employees <= 50 ~ "Small", 
                          Employees <= 250 ~ "Medium",
                          TRUE ~ "Large"))
#   Employees   Size
#1          1  Micro
#2          2  Micro
#3          3  Micro
#4         10  Micro
#5         15  Small
#6        122 Medium
#7        143 Medium
#8        150 Medium
#9        250 Medium
#10       300  Large
#11       500  Large
#12      1000  Large
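
To illustrate the scalar-versus-vector point, here is a minimal base R sketch. The vector employees and the two-way "Micro"/"Other" split are made up for this illustration and are not part of the original code. The exact behaviour of if on a longer condition depends on the R version, so the sketch only records whether an error or a warning was raised.

```r
# Hypothetical sample values, taken from the question's data
employees <- c(1, 10, 15, 300)

# if() needs a single TRUE/FALSE. A length-4 condition is an error in
# recent R versions; older versions warned and used only the first element.
res <- tryCatch(
  if (employees <= 10) "Micro" else "Other",
  error = function(e) "error",
  warning = function(w) "warning"
)
print(res)

# ifelse() evaluates the condition element by element
sizes <- ifelse(employees <= 10, "Micro", "Other")
print(sizes)
```

Either outcome of the first call shows that if cannot label every row, while ifelse returns one label per element.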

Another option is to use cut specifying breaks and labels.

cut(output_test$Employees, breaks = c(-Inf, 10, 50, 250, Inf), 
          labels = c('Micro', 'Small', 'Medium', 'Large'))
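
As a sketch of how the cut result could be stored as a column without dplyr (the data frame is recreated here so the snippet runs on its own, and right = TRUE is spelled out even though it is the default):

```r
# Same data as in the question
output_test <- data.frame(
  Employees = c(1, 2, 3, 10, 15, 122, 143, 150, 250, 300, 500, 1000)
)

# right = TRUE makes each interval closed on the right: (-Inf, 10], (10, 50], ...
output_test$Size <- cut(
  output_test$Employees,
  breaks = c(-Inf, 10, 50, 250, Inf),
  labels = c("Micro", "Small", "Medium", "Large"),
  right = TRUE
)

# cut() returns a factor whose levels keep the order given in labels
print(table(output_test$Size))

# as.character() converts the factor to plain strings if needed
size_chr <- as.character(output_test$Size)
```

Because the result is a factor, tables and plots keep the Micro, Small, Medium, Large order rather than sorting alphabetically.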
Ronak Shah
1

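Try this one. Because if only handles one value at a time, rowwise() is added so that the condition is evaluated row by row:

library(dplyr)

output_test_final <- output_test %>%
  rowwise() %>%   # evaluate the if/else once per row, on a single value
  mutate(
    Size = if (Employees <= 10) {
      "Micro"
    } else if (Employees >= 11 && Employees <= 50) {
      "Small"
    } else if (Employees >= 51 && Employees <= 250) {
      "Medium"
    } else {
      "Large"
    }
  ) %>%
  ungroup()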
0

We can use ifelse

library(dplyr)
output_test %>%
  mutate(Size = ifelse(Employees <= 10, "Micro",
                ifelse(Employees <= 50, "Small",
                ifelse(Employees <= 250, "Medium", "Large"))))

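Or in base R with findInterval. By default its intervals are closed on the left, which would put 10 in "Small" and 250 in "Large"; left.open = TRUE keeps the thresholds in the lower class, matching the other solutions.

c('Micro', 'Small', 'Medium', 'Large')[findInterval(output_test$Employees, c(10, 50, 250), left.open = TRUE) + 1]
#[1] "Micro"  "Micro"  "Micro"  "Micro"  "Small"  "Medium" "Medium" "Medium" "Medium" "Large"  "Large"  "Large" 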
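
The side on which each interval is closed only matters at the thresholds 10, 50 and 250. The following sketch compares the findInterval default with left.open = TRUE and checks the latter against cut; the names x, thresholds and size_labels are made up for this illustration.

```r
# Hypothetical test values placed on and around each threshold
thresholds  <- c(10, 50, 250)
size_labels <- c("Micro", "Small", "Medium", "Large")
x <- c(9, 10, 11, 50, 51, 250, 251)

# Default: intervals are [a, b), so a value equal to a threshold moves up a class
left_closed <- size_labels[findInterval(x, thresholds) + 1]

# left.open = TRUE: intervals are (a, b], so a threshold value stays in the lower class
right_closed <- size_labels[findInterval(x, thresholds, left.open = TRUE) + 1]

# cut() with its default right = TRUE uses the same (a, b] convention
via_cut <- as.character(
  cut(x, breaks = c(-Inf, thresholds, Inf), labels = size_labels)
)

print(data.frame(x, left_closed, right_closed, via_cut))
```

Only the rows for 10, 50 and 250 differ between the two findInterval variants, so the choice depends on which class the threshold values should belong to.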
akrun