Replacing values in a column with values from another column in R

Question

I have two tables with different dimensions. I want to replace the values in datA$swl1 with the values in datB$swl2, matched by id.

datA

id   swl1
 1   0.8
 2   0.7
 3   0.4
 4   0.7
 5   0.0

datB

id   swl2
 1   0.8
 3   0.6
 5   0.7

output

datA (here swl1 is replaced by the new values in swl2; not every id has a new value, and for those that don't, the original value is retained)

How can I do this?

What's the second column in datB called? – tumultous_rooster Aug 27 '15 at 02:39

Matthew Lundberg · Accepted Answer · 2015-08-27T03:22:14.707

You can use merge to match by id, then replace in column swl1 those items from datB which exist:

datC <- merge(datA, datB, all.x=TRUE)
datC
##   id swl1 swl2
## 1  1  0.8  0.8
## 2  2  0.7   NA
## 3  3  0.4  0.6
## 4  4  0.7   NA
## 5  5  0.0  0.7

This matches up the rows. Now to replace those values in column swl1 with the non-NA values from column swl2:

datC$swl1 <- ifelse(is.na(datC$swl2), datC$swl1, datC$swl2)
datC$swl2 <- NULL
datC
##   id swl1
## 1  1  0.8
## 2  2  0.7
## 3  3  0.6
## 4  4  0.7
## 5  5  0.7

RHertel · Answer 2 · 2015-08-28T21:24:03.323

You can obtain this result with one line of code:

datA$swl1[datA$id %in% datB$id] <- datB$swl2
#> datA
#  id swl1
#1  1  0.8
#2  2  0.7
#3  3  0.6
#4  4  0.7
#5  5  0.7

With the %in% operator we select the entries of the column datA$swl1 that belong to rows with the same id as those listed in datB. These values in the column of datA$swl1 are then replaced with the entries of the swl2 column of datB.
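One caveat, offered as a hedged note rather than a correction: the one-liner assigns datB$swl2 in datB's row order, so it appears to rely on datB being sorted the same way as the matching rows of datA, with every id in datB present exactly once in datA. A minimal sketch of an order-independent variant using base R's match() follows. The sample data is reconstructed from the other answers in this thread, and datB is shuffled here purely for illustration.

```r
# Sample data reconstructed from the answers in this thread
datA <- data.frame(id = c(1, 2, 3, 4, 5), swl1 = c(0.8, 0.7, 0.4, 0.7, 0.0))
# datB deliberately listed out of order (an assumption for illustration)
datB <- data.frame(id = c(5, 1, 3), swl2 = c(0.7, 0.8, 0.6))

# For each row of datA, find the position of its id in datB (NA if absent)
idx <- match(datA$id, datB$id)

# Overwrite swl1 only where a matching id exists in datB
hit <- !is.na(idx)
datA$swl1[hit] <- datB$swl2[idx[hit]]

datA$swl1
```

Because each replacement value is looked up by id, the order of rows in datB no longer matters, and ids that appear only in datB are simply ignored.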

edited Aug 28 '15 at 21:24

answered Aug 28 '15 at 15:54

%in% is quite useful! – Lucia Aug 29 '15 at 13:40

Arun · Answer 3 · 2015-08-28T16:55:00.730

IIUC, using data.table v1.9.5:

require(data.table)
setDT(datA)[datB, swl1 := swl2, on = "id"]

datA is updated by reference.
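To illustrate what "updated by reference" means in practice, here is a small sketch, assuming the data.table package is installed and using sample data reconstructed from the other answers. The names alias and snapshot are hypothetical and exist only for this demonstration.

```r
library(data.table)

datA <- data.table(id = c(1, 2, 3, 4, 5), swl1 = c(0.8, 0.7, 0.4, 0.7, 0.0))
datB <- data.table(id = c(1, 3, 5), swl2 = c(0.8, 0.6, 0.7))

alias    <- datA        # a second name for the same table, no copy is made
snapshot <- copy(datA)  # an explicit, independent copy

# Update join: rows of datA whose id appears in datB get swl2 written into swl1.
# No `datA <- ...` assignment is needed because := modifies the table in place.
datA[datB, swl1 := i.swl2, on = "id"]

alias$swl1     # reflects the update, since alias and datA are the same object
snapshot$swl1  # still holds the original values
```

If the original values are still needed afterwards, taking a copy() before the update, as snapshot does here, is one way to keep them.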

edited Aug 28 '15 at 16:55

answered Aug 28 '15 at 16:14

This worked for me. One feature of this approach is that any NA values in swl2 will be copied over to swl1. This was a positive for me as I wanted to retain the NA values. – Bong112 Jan 20 '23 at 11:14
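For readers who want the opposite behaviour, where an NA in swl2 should not overwrite an existing swl1, one possible sketch is to filter the NA rows out of datB before the update join. This assumes data.table is installed; the datB below, with a missing value for id 3, is hypothetical and made up for the example.

```r
library(data.table)

datA <- data.table(id = c(1, 2, 3, 4, 5), swl1 = c(0.8, 0.7, 0.4, 0.7, 0.0))
# Hypothetical datB in which id 3 has no replacement value
datB <- data.table(id = c(1, 3, 5), swl2 = c(0.8, NA, 0.7))

# Drop rows with NA in swl2 first, so only real values take part in the update
datA[datB[!is.na(swl2)], swl1 := i.swl2, on = "id"]

datA$swl1  # id 3 keeps its original 0.4
```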

tumultous_rooster · Answer 4 · 2015-08-27T07:21:50.073

If you'd like to select the largest value, regardless of which column it is in, you could try

library(dplyr)
datA <- data.frame(id=c(1,2,3,4,5), swl1=c(0.8, 0.7, 0.4, 0.7, 0.0))
datB <- data.frame(id=c(1,3,5), somename=c(0.8, 0.6, 0.7))

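# Join on id; ids missing from datB get NA in somename
datC <- full_join(datA, datB, by = "id")
datA <- data.frame(id = 1:5)
# Row-wise maximum of the two value columns, ignoring NA
datA$swl1 <- apply(datC[, c("swl1", "somename")], 1, function(x) max(x, na.rm = TRUE))

> datA
  id swl1
1  1  0.8
2  2  0.7
3  3  0.6
4  4  0.7
5  5  0.7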

edited Aug 27 '15 at 07:21

answered Aug 27 '15 at 03:05

Helpful code for this sort of thing: `datA <- read.table(header=TRUE, file='clipboard')` – Matthew Lundberg Aug 27 '15 at 03:23
But then the answer is not a self-contained solution – tumultous_rooster Aug 27 '15 at 07:32
You can make it self-contained by providing the data via `dput` -- or even editing the question and putting the definitions there. Just trying to save you some time. – Matthew Lundberg Aug 27 '15 at 14:53
