0

I have a dataset with two columns containing the following: an indicator number and a hashcode

The only problem is that the columns have the same name, but the value can switch columns. Now I want to merge the columns and keep the number (I don't care about the hashcode)

I saw this question: Merge two columns into one in r and I tried the coalesce() function, but that is only for having NA values. Which I don't have. I looked at the unite function, but according to the cheat sheet documentation documentation here that doesn't what I'm looking for

My next try was the filter_at and other filter functions from the dplyr package Documentation here

But that only leaves 150 data points while at the start I have 61k data points. Code of filter_at I tried:

data <- filter_at(data,vars("hk","hk_1"),all_vars(.>0))

I assumed that a #-string shall not be greater than 0, which seems to be true, but it removes more than intented.

I would like to keep hk or hk_1 value which is a number. The other one (the hash) can be removed. Then I want a new column which only contains those numbers.


Sample data

My data looks like this:

HK|HK1
190|#SP0839
190|#SP0340
178|#SP2949
#SP8390|177
#SP2240|212

What I would like to see:

HK
190
190
178
177
212

I hope this provides an insight into the data. There are more columns like description, etc which makes that 190 at the start are not doubles.

1 Answers1

0

We can replace all the values that start with "#" to NA and then use coalesce to select non-NA value between HK and HK1.

library(dplyr)

df %>%
  mutate_all(~as.character(replace(., grepl("^#", .), NA))) %>%
  mutate(HK = coalesce(HK, HK1)) %>%
  select(HK)

#     HK
#1   190
#2   190
#3   178
#4   177
#5   212

data

df <- structure(list(HK = structure(c(4L, 4L, 3L, 2L, 1L), .Label = c("#SP2240", 
"#SP8390", "178", "190"), class = "factor"), HK1 = structure(c(2L, 
1L, 3L, 4L, 5L), .Label = c("#SP0340", "#SP0839", "#SP2949", 
"177", "212"), class = "factor")), class = "data.frame", row.names = c(NA, -5L))
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213