-1

I am trying to add an extra column onto a database df2 based on information from another database df1. I have one column called sequence_annotation in df2 if this column has a numerical value then I want it to copy the information in a column called PoleX and AnX for characters. The column/identifier that is constant for both databases is called CTsite so this is how I am matching up which values from PoleX or AnX should be added.

df2 looks like this:

CTsite  sequence_annotation
OCCAJ01  Lt
OCCAJ01  20
OCCAJ04  Mt
OCCAJ04  40

df1 looks like this:

CTsite   AnX  Pole X    
OCCAJ01  720  2592
OCCAJ04  640  3264

The disired output is:

CTsite sequence_annotation  xres
OCCAJ01  Lt                 720  
OCCAJ01  20                 2592
OCCAJ04  Mt                 640
OCCAJ04  40                 3264

I have tried to create a code base on previous questions and comments but it doesn't work. Would be very grateful if anyone can help. This is the code:

if (mode(df2$sequence_annotation)=="numeric") {
df2$xres <- df1$PoleX[match(df1$CTsite,df2$CTsite)]
} else {
df2$xres <- df1$AnX[match(df1$CTsite, df2$CTsite)]
} 
jay.sf
  • 60,139
  • 8
  • 53
  • 110
Kate
  • 3
  • 3
  • 1
    Welcome to StackOverflow! Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269). This will make it much easier for others to help you. – Cath Jul 09 '18 at 11:38
  • Hi Cath, thanks for the links. I am new to coding and dyslexic so I find it hard to explain my problem. I am not sure how to explain it better perhaps adding an example of my dataframes? – Kate Jul 09 '18 at 12:48
  • Data examples (input and desired output) would make it easier to see what you have and what you want – Cath Jul 09 '18 at 12:50

2 Answers2

0

A different solution with tidyverse. I suspect that your test of being numeric being inadequate : all the values of the column must share the same mode, probably character as in the following :

library(reader)
library(stringr)

df2 <- read_csv("
CTsite  ,sequence_annotation
OCCAJ01 ,Lt
OCCAJ01 ,20
OCCAJ04 ,Mt
OCCAJ04 ,40",skip=1)
df1 <- read_csv("
CTsite   ,AnX  ,PoleX    
OCCAJ01  ,720  ,2592
OCCAJ04  ,640  ,3264",skip=1)

df2 %>% inner_join(df1,by="CTsite") %>%
   mutate(xres=ifelse(str_detect(sequence_annotation,"\\D"),AnX,PoleX))

## A tibble: 4 x 5
#  CTsite  sequence_annotation   AnX PoleX     a
#  <chr>   <chr>               <int> <int> <int>
#1 OCCAJ01 Lt                    720  2592   720
#2 OCCAJ01 20                    720  2592  2592
#3 OCCAJ04 Mt                    640  3264   640
#4 OCCAJ04 40                    640  3264  3264
Nicolas2
  • 2,170
  • 1
  • 6
  • 15
  • Thanks, yes I tried this way before, it does add the column `xres` but all the values are all NA. – Kate Jul 09 '18 at 12:38
  • Perhaps because the test mode(...)=="numeric" returns NA? It is difficult to say without any view on data. – Nicolas2 Jul 09 '18 at 12:54
  • I just ran it on its own and this was the output `(mode(df2$sequence_annotation)=="numeric")` `[1] TRUE` I have edited the question with an example of what the data looks like. – Kate Jul 09 '18 at 13:24
0

The dplyr-package could help. But as I cannot reproduce that, I cannot much help. Here is a cheat sheet

https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf

smurfit89
  • 327
  • 5
  • 17