factorize all column by their levels with how many times they occur in the attribute of my data set

Question

For my data set, I want to factorize every attribute (column) of the file and count how often each level occurs. This is my code:

    library(dplyr)
    # read the file
    h_Data <- read.csv(file.choose())
    # keep only the University attribute
    h_Data <- h_Data$University

    # count the occurrences of each factor level
    h_DataDF <- data.frame(h_Data)
    h_dataLevels <- h_DataDF %>%
      group_by(h_Data) %>%
      summarise(no_rows = length(h_Data))
    h_dataLevels

    # number of missing values
    h_DataMissing <- sum(is.na(h_Data))
    h_DataMissing

    # percentage of each factor level
    h_DataPer <- prop.table(table(h_Data)) * 100

    # combine into a table
    h_DataTable <- data.frame(levels_data = h_dataLevels, levels_perc = h_DataPer, missing_data = h_DataMissing)
    h_DataTable

I want to summarize it as:

    levels_University  no.of_timesLevels  Percentage_of_Level  MissingAttributes
    IBA                4                  57.14                0
    KU                 1                  14.28                0
    UIT                2                  28.57                0

Please make this question *reproducible*. This includes sample code (including listing non-base R packages), sample data (e.g., `dput(head(x))`), and expected output. Refs: https://stackoverflow.com/questions/5963269, https://stackoverflow.com/help/mcve, and https://stackoverflow.com/tags/r/info. Since you've mentioned a "file", perhaps including the top "n" lines from the file, where "n" is defined based on balancing relative importance, sufficiency, and compactness. — r2evans, Oct 23 '18 at 19:13
Title should be a very brief summary of the question, not the question itself, to begin with... — FatihAkici, Oct 23 '18 at 21:02

score 0 · Answer 1 · answered Oct 23 '18 at 20:59

It's hard to know exactly what you want without some sample data and desired output, but here is some code that takes a data frame and, for each column that is a factor, returns a data frame listing the number of observations for each factor level.

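    library(dplyr)    # %>%
    library(purrr)    # map(), compact()
    library(forcats)  # fct_count()

    ## dummy data; stringsAsFactors = TRUE is needed in R >= 4.0,
    ## where strings are no longer converted to factors by default
    df <- data.frame(Sex = c("m", "f", "m", "f"),
                     department = c("bs", "el", "bs", "se"),
                     numbers = c(1, 2, 3, 4),
                     stringsAsFactors = TRUE)

    ## function that takes a column of data
    ## and returns factor counts if the column is a factor
    countFactors <- function(col) {
      if (is.factor(col)) {
        fct_count(col)
      } else {
        NULL
      }
    }

    ## use purrr::map to iterate through the columns of the
    ## data frame and apply the function; compact() drops the NULLs
    df %>%
      map(~ countFactors(.)) %>%
      compact()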
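
To get closer to the table layout asked for in the question (count, percentage, and number of missing values per level), something like the following base R sketch might work. The sample data is invented to match the counts in the question, and `summarise_levels()` is a hypothetical helper name, not a function from any package.

```r
## Hypothetical sample data chosen to reproduce the counts in the question
h_Data <- data.frame(
  University = c("IBA", "IBA", "KU", "UIT", "IBA", "UIT", "IBA"),
  stringsAsFactors = TRUE
)

## summarise_levels() is an illustrative helper, not part of any package
summarise_levels <- function(col) {
  counts <- table(col)  # table() excludes NA values by default
  data.frame(
    level      = names(counts),
    n          = as.vector(counts),
    percentage = round(100 * as.vector(counts) / sum(counts), 2),
    missing    = sum(is.na(col))  # same value repeated on every row
  )
}

## Apply the helper to every factor column;
## the result is a named list with one data frame per column
factor_cols <- Filter(is.factor, h_Data)
result <- lapply(factor_cols, summarise_levels)
result$University
```

Percentages here are computed over the non-missing values only. Note that `round()` gives 14.29 for KU, whereas the table in the question shows the truncated value 14.28.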
