I need to turn a series of character columns into factors. The factors must share levels across columns, so that a given string maps to the same enumerated value in every column when the factors are converted to numeric.

as.numeric(as.factor(characterColumnDataFrame))

This currently factors each column independently, so the same character string gets a different number in different columns.

I want to avoid converting one column first and then looking up and mapping the enums from that column onto the others.
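
A minimal sketch of the mismatch, using a made-up two-column data frame (the name `df` and its values are purely illustrative, not the real data). Factoring each column on its own assigns "b" the code 2 in one column and 1 in the other:

```r
# hypothetical example data: both columns contain the string "b"
df <- data.frame(x = c("a", "b", "c"),
                 y = c("b", "c", "d"),
                 stringsAsFactors = FALSE)

# factor each column independently, then take the numeric codes
codes <- lapply(df, function(col) as.numeric(factor(col)))

codes$x  # 1 2 3  ("b" is coded 2 here)
codes$y  # 1 2 3  ("b" is coded 1 here)
```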

theGreatKatzul

3 Answers

2

Use levels= when creating the factors. DF has character columns whereas DF2 has factor columns all having the same levels, levs.

# test data frame
DF <- as.data.frame(matrix(letters,, 2), stringsAsFactors = FALSE) 

DF2 <- DF
levs <- sort(unique(unlist(DF)))
DF2[] <- lapply(DF2, factor, levels = levs)

This could be written as a one-liner like this:

DF2 <- replace(DF, TRUE, lapply(DF, factor, levels = sort(unique(unlist(DF)))))
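
If the numeric codes themselves are wanted, as in the question, the shared-level factors can be converted column by column. This is only a sketch that rebuilds the same test data; `codes` is an illustrative name, not something from the question:

```r
# same test data as above: V1 holds "a".."m", V2 holds "n".."z"
DF <- as.data.frame(matrix(letters, ncol = 2), stringsAsFactors = FALSE)
levs <- sort(unique(unlist(DF)))

# factor every column against the shared levels, then keep the integer codes
codes <- DF
codes[] <- lapply(DF, function(col) as.integer(factor(col, levels = levs)))

head(codes, 3)
```

Because every column uses the same `levs`, code `k` always stands for `levs[k]`, so `levs[codes$V2]` recovers the original strings.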
G. Grothendieck
  • I appreciate your reply and you showed me functionality I hadn't thought about within `levels`. I have opted for a slightly different solution below. – theGreatKatzul Aug 23 '17 at 12:46
1

The fct_unify() function from Hadley Wickham's forcats package unifies the levels in a list of factors.

# using G. Grothendieck's test data frame
DF <- as.data.frame(matrix(letters,, 2), stringsAsFactors = FALSE)
str(DF)
#> 'data.frame': 13 obs. of  2 variables:
#>  $ V1: chr  "a" "b" "c" "d" ...
#>  $ V2: chr  "n" "o" "p" "q" ...
DF[] <- lapply(DF, factor)
str(DF)
#> 'data.frame': 13 obs. of  2 variables:
#>  $ V1: Factor w/ 13 levels "a","b","c","d",..: 1 2 3 4 5 6 7 8 9 10 ...
#>  $ V2: Factor w/ 13 levels "n","o","p","q",..: 1 2 3 4 5 6 7 8 9 10 ...
DF[] <- forcats::fct_unify(DF)
str(DF)
#> 'data.frame': 13 obs. of  2 variables:
#>  $ V1: Factor w/ 26 levels "a","b","c","d",..: 1 2 3 4 5 6 7 8 9 10 ...
#>  $ V2: Factor w/ 26 levels "a","b","c","d",..: 14 15 16 17 18 19 20 21 22 23 ...

or as a one-liner to produce the numbers of the unified factor levels:

DF[] <- lapply(forcats::fct_unify(lapply(DF, factor)), as.numeric)
DF
#>    V1 V2
#> 1   1 14
#> 2   2 15
#> 3   3 16
#> 4   4 17
#> 5   5 18
#> 6   6 19
#> 7   7 20
#> 8   8 21
#> 9   9 22
#> 10 10 23
#> 11 11 24
#> 12 12 25
#> 13 13 26
Uwe
  • I like one liners, but I am not familiar with the package. Who is this Hadley? – theGreatKatzul Aug 23 '17 at 18:48
  • Hadley is the author and/or mastermind of a number of well-known and widely used packages like `ggplot2`, `dplyr`, `stringr`, `lubridate`, and others which are marketed as `tidyverse`. He also authored a number of books on R which are available on-line (http://hadley.nz). `forcats` is quite new and has not received much attention yet, so I thought mentioning Hadley's name would ring a bell. – Uwe Aug 23 '17 at 20:29
0
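library(zoo)
# placeholder name: an xts object holding character data in many columns
test = xtsCharacterObjectWithManyColumns
# factor the whole object in one call so that all columns share one set of levels
xts::coredata(test) = as.numeric(factor(test, levels = unique(test), ordered = TRUE))
base::storage.mode(test) = "numeric"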
M--
theGreatKatzul