I am reading numeric data wrapped in parentheses from a text file. The first record looks like jelly (34).

I am told that "(" needs to be escaped as "\(". I presume that means ")" needs to be escaped as "\)". I really don't know what this means. How do I use the escape, and do I need a specific read function to do this?

I am expecting the output to look like jelly 34 where jelly is a character string and 34 is numeric.

Before I deal with the parentheses I need to deal with input of records of unequal length. The code to input name (text) and age (numeric) is given below.

R Code:

# Forward slashes avoid the invalid "\d" escape in "c:\data"
dirdata <- "c:/data"
# file.path() inserts the separator that paste0() left out
d <- read.table(file.path(dirdata, "top.txt"),
header = FALSE, sep = " ",
strip.white = TRUE,
stringsAsFactors = FALSE
#colClasses = c("character", "numeric")
#col.names = paste0("V", 1:14)
)

# data  top.txt
#Jack 1 Ben 25 Hunter 49 Di 73 Miguel 97 Mike 2 Zach 26
#Tammy 50
#Jules 74 Jake 98
# ... unequal record lengths

#Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
#  line 2 did not have 14 elements

Can you help? Thank you. MM

Mary A. Marion
  • Not clear from the question. Can you show expected output and input data? – akrun Nov 16 '19 at 22:33
  • [This](https://softwareengineering.stackexchange.com/questions/112731/what-does-backslash-escape-character-really-escape) is what it means to escape a character. But as @akrun said, please provide more context. For instance, who or what is telling you to escape the parentheses? – Dunois Nov 16 '19 at 23:01
  • Relevant : [How do I deal with special characters like \^$.?*|+()[{ in my regex?](https://stackoverflow.com/questions/27721008/how-do-i-deal-with-special-characters-like-in-my-regex) – Ronak Shah Nov 17 '19 at 01:29

2 Answers

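If you are simply looking to get rid of the parentheses, you can also use the fixed = TRUE option in gsub. It treats the pattern as a literal string rather than a regular expression, so no escaping is needed.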

library(dplyr)
word <- "jelly (34)"
word %>% gsub("(", "", ., fixed = TRUE) %>% gsub(")", "", ., fixed = TRUE) 
Hong
  • Hong, I will be reading in the data as name (number) . For example Jake (58) . Right now to simplify I need to learn how to read in records of variable length. I temporarily removed the parentheses. – Mary A. Marion Nov 17 '19 at 00:40

If I understand correctly, you are trying to split jelly (34) into two objects: the character string jelly and the number 34.

( and ) are special characters in regular expressions, so to match them literally you have to escape them with \\ (for example "\\("). You can also use the rebus package, which provides OPEN_PAREN and CLOSE_PAREN for this.
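To make the escaping concrete, here is a minimal sketch using only base R. The variable names (x, no_parens, parts, name, value) are just for illustration. It shows the escaped regular expression next to the fixed = TRUE alternative, then splits the result into a character name and a numeric value.

```r
x <- "jelly (34)"

# In R source, "\\(" is the two-character regex \( which matches a literal "(".
# The | means "or", so this removes both kinds of parenthesis in one call.
no_parens <- gsub("\\(|\\)", "", x)

# Same result without a regex: fixed = TRUE treats "(" and ")" as plain text.
no_parens_fixed <- gsub(")", "", gsub("(", "", x, fixed = TRUE), fixed = TRUE)

# Split on the space, then convert the second piece to a number.
parts <- strsplit(no_parens, " ", fixed = TRUE)[[1]]
name  <- parts[1]
value <- as.numeric(parts[2])
```

The escape only matters inside the pattern argument of functions such as gsub and grepl; no special read function is needed for it.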

Here is a small piece of code that does what you ask. It is one solution among others; I have no doubt there are alternative ways to achieve this result.

So, using your example and assuming that all your numeric values are written in parentheses, it would be:

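words <- c("Jack (1)", "Ben (25)", "Hunter (49)", "Di (73)", "Miguel (97)", "Mike (2)", "Zach (26)")

# Split each entry on the space: one column of names, one of "(number)" strings
word <- data.frame(do.call(rbind, strsplit(words, " ")), stringsAsFactors = FALSE)

for (i in seq_along(word)) {
  # any() is needed because grepl() returns one value per row
  if (any(grepl("\\(", word[, i]))) {
    # Remove both parentheses, then convert the column to numeric
    word[, i] <- gsub("\\(", "", word[, i])
    word[, i] <- gsub("\\)", "", word[, i])
    word[, i] <- as.numeric(word[, i])
  } else {
    word[, i] <- as.character(word[, i])
  }
}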

So, at the end, you get the following dataframe containing a character vector and a numeric vector.

> str(word)
'data.frame':   7 obs. of  2 variables:
 $ X1: chr  "Jack" "Ben" "Hunter" "Di" ...
 $ X2: num  1 25 49 73 97 2 26

Is this what you are looking for?
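For the file layout shown in the question (name/number pairs, with a varying number of pairs per line), one possible approach is to read all whitespace-separated tokens with scan() and pair them up afterwards. The following is a minimal sketch, not a definitive solution. It assumes the tokens strictly alternate name, number, and it uses an in-memory copy of the sample lines (the hypothetical variable lines) instead of the real file.

```r
# Sample lines copied from the question. With the real file, pass its path
# to scan() as the first argument instead of using `text =`.
lines <- c("Jack 1 Ben 25 Hunter 49 Di 73 Miguel 97 Mike 2 Zach 26",
           "Tammy 50",
           "Jules 74 Jake 98")

# scan() reads whitespace-separated tokens and ignores line boundaries,
# so records of unequal length do not cause an error.
tokens <- scan(text = lines, what = character(), quiet = TRUE)

# Assumption: tokens strictly alternate name, number.
# The gsub() also strips parentheses, so "(34)" style values work too.
people <- data.frame(
  name = tokens[c(TRUE, FALSE)],
  age  = as.numeric(gsub("\\(|\\)", "", tokens[c(FALSE, TRUE)])),
  stringsAsFactors = FALSE
)
```

read.table also accepts fill = TRUE, which pads short rows with blanks instead of stopping with an error, but the result is then a wide table that still has to be reshaped into name/age pairs.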

dc37
  • Yes. Please note the changes I've made in this post particularly the greater detail and input of the data. I need to apply your solution to a vector of words. I'm new at all these functions so it's going to take a bit of study. Please see post. Thank you. – Mary A. Marion Nov 17 '19 at 00:31
  • Hi Mary, I updated my answer to include the manipulation of a vector of words (with parenthesis). Let me know if it is what you are looking for. – dc37 Nov 17 '19 at 00:52