*By "cell" I mean a single entry at a coordinate like df[x, y].

I'd like to write a function that takes a data frame and returns another data frame with the same columns but only one row, which contains the number of non-empty cells in each column.

My data frame has 173 columns, so I cannot go through them manually. Some of the cells are empty/NA (I prefer to represent empty cells with "" instead of NA).

So, for example, if I have DF

Name   | Lname  | City         | Email
John   | Marbor | Kalamazoo    |
Mary   | Watson | Grand Rapids | maryW@something.com
Leon   | Ortiz  | Lansing      | Leonie@something.com
Trevor | Brooks |              |

I want the output to be

Name | Lname | City | Email
4    | 4     | 3    | 2

Can this be accomplished with R?

  • A data.frame is just a list of columns, so you could do `sapply(DF, function(x) sum(x != ""))` if you insist on using a zero-length string rather than a proper NA value. – MrFlick Sep 15 '20 at 03:45
  • Or use `colSums`: `colSums(df != '')` – Ronak Shah Sep 15 '20 at 03:46
  • A couple of things: I changed the empty strings to NA, and then I tried both methods and they're both returning several columns as NA rather than the actual counts. I already confirmed the columns indeed have values in them. –  Sep 15 '20 at 04:00
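
A likely cause of the NA counts in the last comment: once the empty strings become NA, `x != ""` evaluates to NA for those cells, and `sum()` returns NA as soon as its input contains one. Below is a minimal sketch, not a definitive solution, that treats both NA and "" as empty and returns a one-row data frame. The name `count_filled` is hypothetical, and the sample data is rebuilt from the table in the question, with the missing emails stored as NA and the missing city as "" to exercise both cases.

```r
# Sample data rebuilt from the question; empty cells are a mix of "" and NA
DF <- data.frame(
  Name  = c("John", "Mary", "Leon", "Trevor"),
  Lname = c("Marbor", "Watson", "Ortiz", "Brooks"),
  City  = c("Kalamazoo", "Grand Rapids", "Lansing", ""),
  Email = c(NA, "maryW@something.com", "Leonie@something.com", NA),
  stringsAsFactors = FALSE
)

# count_filled is a hypothetical helper name, not part of any package
count_filled <- function(df) {
  counts <- vapply(
    df,
    # A cell counts as filled only if it is neither NA nor an empty string;
    # FALSE & NA is FALSE, so NA cells never turn the sum into NA
    function(x) sum(!is.na(x) & as.character(x) != ""),
    integer(1)
  )
  # t() turns the named vector into a 1 x ncol matrix, keeping column names
  as.data.frame(t(counts))
}

result <- count_filled(DF)
print(result)
# Counts: Name = 4, Lname = 4, City = 3, Email = 2
```

If a named vector is enough, `colSums(DF != "", na.rm = TRUE)` should give the same counts, because `na.rm = TRUE` drops the NA comparisons instead of propagating them.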
