I'm writing my master's thesis and need to use R for some statistical calculations. Since statistics in general and R are not familiar to me, I watched how-to YouTube videos and practised on some example Excel data from the internet. After getting the hang of that, I ventured into my own data, with no success. I need to calculate the median of the answers given on a Likert scale for each question. When I try to calculate this in R, I get the error 'need numeric data'. I really don't get what goes wrong. I know that this is basic, but I'm new to statistics and R...

  Sub Groep Behandeling X1.1 X1.2 X1.3 X1.4 X1.5 X1.6
1  21    BE           1    3    5    5    3    5    5
2  31    BE           1    4    5    5    2    5    5
3  33  BEBD           1    4    5    5    4    5    2
4  44    BE           1    3    5    5    3    5    5
5  47    BE           1    5    4    5    1    4    3
6  48    BE           1    5    2    4    1    4    5



library(xlsx)
library(dplyr)
# Read in Excel sheet: 'testTAQ' ----
TestTAQ <- read.xlsx("Questionnaires_Results.xlsx", sheetIndex = 6)
# I want to drop row 1
TAQ <- TestTAQ[2:31, ]
TAQ

# Create a group with only BE and BEBD
TAQ <- TAQ %>% filter(Groep == "BE" | Groep == "BEBD")
TAQ

# Create subgroups: treatment vs no treatment ----
# For treatment vs no treatment, keep only the columns for the questions that were asked in that subgroup
## Treatment
TAQBehandeling <- TAQ %>% select(1:9) %>% filter(Behandeling == 1)
## No treatment
TAQGeenbehandeling <- TAQ %>% select(1, 2, 10:32) %>% filter(Niet_Behandeling == 1)
TAQGeenbehandeling

# Compute the median ----
## TAQBehandeling
summarise(TAQBehandeling, median_X1.1 = median(X1.1))

summarise(TAQBehandeling, median_X1.1 = median(X1.1))
Error: Evaluation error: need numeric data.

str(TestTAQ)
    'data.frame':   41 obs. of  32 variables:
     $ Sub             : Factor w/ 41 levels "1","21","22",..: 41 2 3 4 5 6 7 8 1 9 ...
     $ Groep           : Factor w/ 3 levels "BD","BE","BEBD": NA 2 1 2 2 3 2 2 2 2 ...
     $ Behandeling     : num  0.154 1 0 0 0 ...
     $ X1.1            : Factor w/ 4 levels "3","4","5","NA": 2 1 4 4 4 4 4 4 4 2 ...
     $ X1.2            : Factor w/ 4 levels "2","4","5","NA": 3 3 4 4 4 4 4 4 4 3 ...
     $ X1.3            : Factor w/ 3 levels "4","5","NA": 2 2 3 3 3 3 3 3 3 2 ...
     $ X1.4            : Factor w/ 6 levels "1","2","2.5",..: 3 4 6 6 6 6 6 6 6 2 ...
     $ X1.5            : Factor w/ 3 levels "4","5","NA": 2 2 3 3 3 3 3 3 3 2 ...
     $ X1.6            : Factor w/ 4 levels "2","3","5","NA": 3 3 4 4 4 4 4 4 4 3 ...
     $ Niet_Behandeling: num  0.846 0 1 1 1 ...
     $ X2.1            : Factor w/ 6 levels "1","2","3","4",..: 3 6 5 2 1 2 2 2 2 6 ...
     $ X2.2            : Factor w/ 6 levels "1","2","3","4",..: 2 6 1 4 4 2 1 3 1 6 ...
     $ X2.3            : Factor w/ 6 levels "1","2","3","4",..: 4 6 5 2 2 4 4 4 4 6 ...
     $ X2.4            : Factor w/ 6 levels "1","2","3","4",..: 2 6 1 1 1 2 4 1 2 6 ...
     $ X2.5            : Factor w/ 5 levels "1","2","3","4",..: 1 5 1 1 1 2 2 3 1 5 ...
     $ X2.6            : Factor w/ 5 levels "1","2","3","4",..: 3 5 1 1 1 2 4 4 3 5 ...
     $ X2.7            : Factor w/ 6 levels "1","2","3","4",..: 2 6 1 2 1 4 4 1 3 6 ...
     $ X2.8            : Factor w/ 6 levels "1","2","3","4",..: 3 6 1 4 1 4 4 4 4 6 ...
     $ X2.9            : Factor w/ 5 levels "1","2","3","4",..: 1 5 1 1 1 3 1 2 1 5 ...
     $ X2.10           : Factor w/ 6 levels "1","2","3","4",..: 1 6 1 1 1 3 1 4 1 6 ...
     $ X2.11           : Factor w/ 5 levels "1","2","3","5",..: 1 5 1 1 1 3 4 1 1 5 ...
     $ X2.12           : Factor w/ 4 levels "1","2","3","NA": 1 4 1 1 1 3 1 1 1 4 ...
     $ X2.13           : Factor w/ 5 levels "1","2","3","5",..: 1 5 1 4 1 3 1 1 1 5 ...
     $ X2.14           : Factor w/ 4 levels "1","2","3","NA": 1 4 1 1 1 3 1 1 1 4 ...
     $ X2.15           : Factor w/ 4 levels "1","2","3","NA": 1 4 1 1 1 3 1 1 1 4 ...
     $ X2.16           : Factor w/ 5 levels "1","2","3","4",..: 1 5 1 1 1 3 1 1 3 5 ...
     $ X2.17           : Factor w/ 5 levels "1","2","3","4",..: 1 5 1 1 1 3 1 1 1 5 ...
     $ X2.18           : Factor w/ 6 levels "1","2","3","4",..: 1 6 1 4 4 4 1 1 3 6 ...
     $ X2.19           : Factor w/ 6 levels "1","2","3","4",..: 1 6 1 1 3 4 4 1 4 6 ...
     $ X2.20           : Factor w/ 5 levels "1","2","3","4",..: 1 5 1 1 2 3 3 1 3 5 ...
     $ X2.21           : Factor w/ 6 levels "1","2","3","4",..: 4 6 5 2 1 4 2 2 5 6 ...
     $ X2.22           : Factor w/ 4 levels "1","3","4","NA": 1 4 1 3 1 2 3 1 2 4 ...

and

str(TAQGeenbehandeling)
'data.frame':   14 obs. of  25 variables:
 $ Sub             : Factor w/ 41 levels "1","21","22",..: 4 5 6 7 8 1 10 12 13 14 ...
 $ Groep           : Factor w/ 3 levels "BD","BE","BEBD": 2 2 3 2 2 2 3 2 2 2 ...
 $ Niet_Behandeling: num  1 1 1 1 1 1 1 1 1 1 ...
 $ X2.1            : Factor w/ 6 levels "1","2","3","4",..: 2 1 2 2 2 2 1 2 3 4 ...
 $ X2.2            : Factor w/ 6 levels "1","2","3","4",..: 4 4 2 1 3 1 4 5 3 5 ...
 $ X2.3            : Factor w/ 6 levels "1","2","3","4",..: 2 2 4 4 4 4 2 4 4 1 ...
 $ X2.4            : Factor w/ 6 levels "1","2","3","4",..: 1 1 2 4 1 2 4 4 2 3 ...
 $ X2.5            : Factor w/ 5 levels "1","2","3","4",..: 1 1 2 2 3 1 1 4 1 2 ...
 $ X2.6            : Factor w/ 5 levels "1","2","3","4",..: 1 1 2 4 4 3 4 3 3 4 ...
 $ X2.7            : Factor w/ 6 levels "1","2","3","4",..: 2 1 4 4 1 3 1 4 2 4 ...
 $ X2.8            : Factor w/ 6 levels "1","2","3","4",..: 4 1 4 4 4 4 4 4 2 4 ...
 $ X2.9            : Factor w/ 5 levels "1","2","3","4",..: 1 1 3 1 2 1 1 4 3 1 ...
 $ X2.10           : Factor w/ 6 levels "1","2","3","4",..: 1 1 3 1 4 1 5 3 1 1 ...
 $ X2.11           : Factor w/ 5 levels "1","2","3","5",..: 1 1 3 4 1 1 1 3 1 3 ...
 $ X2.12           : Factor w/ 4 levels "1","2","3","NA": 1 1 3 1 1 1 3 3 1 2 ...
 $ X2.13           : Factor w/ 5 levels "1","2","3","5",..: 4 1 3 1 1 1 1 3 3 1 ...
 $ X2.14           : Factor w/ 4 levels "1","2","3","NA": 1 1 3 1 1 1 1 2 1 1 ...
 $ X2.15           : Factor w/ 4 levels "1","2","3","NA": 1 1 3 1 1 1 1 1 1 1 ...
 $ X2.16           : Factor w/ 5 levels "1","2","3","4",..: 1 1 3 1 1 3 1 3 1 4 ...
 $ X2.17           : Factor w/ 5 levels "1","2","3","4",..: 1 1 3 1 1 1 1 3 1 4 ...
 $ X2.18           : Factor w/ 6 levels "1","2","3","4",..: 4 4 4 1 1 3 1 3 1 5 ...
 $ X2.19           : Factor w/ 6 levels "1","2","3","4",..: 1 3 4 4 1 4 1 3 1 4 ...
 $ X2.20           : Factor w/ 5 levels "1","2","3","4",..: 1 2 3 3 1 3 4 3 1 4 ...
 $ X2.21           : Factor w/ 6 levels "1","2","3","4",..: 2 1 4 2 2 5 2 4 3 1 ...
 $ X2.22           : Factor w/ 4 levels "1","3","4","NA": 3 1 2 3 1 2 1 2 2 1 ...
Marco Sandri
Hanne
  • What do `str(TestTAQ)` and `str(TAQGeenbehandeling)` return? – user20650 May 17 '20 at 17:44
  • Thanks for the edit. You can see that your data is not numeric, as it is being read in as character / factor. I would spend some time trying to find out why this is so; it could be a stray character or comma in the columns of your raw data, for example. More generally, use `str(yourdata)` as soon as you read the data in, to check that it was read correctly and that the variables have the expected class. – user20650 May 17 '20 at 17:59
  • I've pasted the return from those 2 at the bottom of my question :) – Hanne May 17 '20 at 17:59
  • Hi! Thanks for the help! There is no way to 'fix' that so R reads it as numeric? – Hanne May 17 '20 at 19:00
  • Yes, you can set the [`colClasses`](https://stackoverflow.com/questions/18279268/read-xlsx-and-colclasses) parameters or you can [`convert to numeric`](https://stackoverflow.com/questions/3418128/how-to-convert-a-factor-to-integer-numeric-without-loss-of-information) after reading in. I prefer to try to diagnose the import problems though as either of the above methods *may* lead to loss of information. – user20650 May 17 '20 at 19:17
  • So, when I read it in like below, I could lose information? `Questionnaires_Results <- read_excel("Questionnaires_Results.xlsx", sheet = "TAQ", range = "A1:I42", col_types = c("text", "text", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric"), na = "NA")` – Hanne May 17 '20 at 20:01
  • It could, for example, if for some reason the number `1` had been stored as `1,0` (i.e. using a comma instead of a point as the decimal separator). In that case the value would be set to NA, whereas it is likely to be 1 (although this is unlikely in your case, as it seems that integers should be returned); these are decisions you sometimes have to make when preparing data for analysis. Again, I'd try to identify why the variables are being read in as character. – user20650 May 17 '20 at 20:20
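
To make the points in the comments above concrete, here is a minimal sketch of the "convert to numeric after reading in" route. It uses a small made-up data frame (`toy`, with hypothetical values) rather than the real spreadsheet, and the helper name `to_num` is only illustrative. It assumes the columns are factors whose labels are either numbers or the literal string "NA", as the `str()` output in the question suggests.

```r
# Toy data (hypothetical values) mimicking the question: Likert answers read in
# as factors, with the string "NA" stored as one of the levels.
toy <- data.frame(
  X1.1 = factor(c("3", "4", "5", "NA", "5", "4")),
  X1.2 = factor(c("5", "5", "2", "4", "NA", "5"))
)

# median() refuses factors; this reproduces the "need numeric data" error.
res <- try(median(toy$X1.1), silent = TRUE)
inherits(res, "try-error")

# as.numeric() applied directly to a factor returns the internal level codes,
# not the labels, so convert via as.character() first.
to_num <- function(f) as.numeric(as.character(f))

# Convert every question column; the "NA" strings become real missing values.
toy_num <- as.data.frame(lapply(toy, to_num))

# Median per question, ignoring missing answers.
medians <- sapply(toy_num, median, na.rm = TRUE)
medians

# The information-loss case from the comments: a decimal comma turns into NA
# on a plain conversion, unless the separator is replaced first.
raw <- c("1,0", "2", "3,5")
lost  <- suppressWarnings(as.numeric(raw))
fixed <- as.numeric(sub(",", ".", raw, fixed = TRUE))
```

Comparing `lost` with `fixed` shows why it is worth inspecting the raw values (for example with `levels()` on a suspicious column) before forcing a conversion: anything that is not a clean number silently ends up as NA.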

0 Answers