I have a 10 x ~15,000 data frame with salaries in column 9 and I'm trying to remove the $ from the start of each entry in that column.

This is the best version of what I have. I am new to R and far more familiar with other languages. Ideally there would be a way to run an operation on each element of a data frame (like cellfun in MATLAB, or a list comprehension in Python), which would make this far easier. Based on my debugging attempts, it seems like gsub just isn't doing anything, even outside a loop. Any suggestions from a more experienced user would be appreciated. Thanks.

bbdat <- read.csv("C:/Users/musta/Downloads/BBs1.csv", header=TRUE, sep=",", dec=".", stringsAsFactors=FALSE)
i <- 0
for (val in bbdat[,9])
{
  i = i+1
  bbdat[i,9]<- gsub("$","",val)
}
Kyle Hunt
  • 1
    `gsub` is vectorized. You need `gsub("$", "", bbdat[,9], fixed = TRUE)` – akrun Mar 02 '20 at 19:26
  • 1
    Example [here](https://stackoverflow.com/q/34156898/5325862) or [here](https://stackoverflow.com/q/37471921/5325862) – camille Mar 02 '20 at 19:33

1 Answer

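The $ is a regex metacharacter that matches the end of the string. With the default settings, gsub("$", "", val) therefore replaces the empty match at the end of each string with nothing, which is why it appears to do nothing. To match $ literally, either pass fixed = TRUE (the default is FALSE), put it inside a character class ("[$]"), or escape it ("\\$"). Because gsub and sub are vectorized, no loop is required: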

bbdat[,9] <- gsub("$", "", bbdat[,9], fixed = TRUE)
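To make the three literal-matching options concrete, here is a minimal sketch. The `salaries` vector is hypothetical sample data standing in for column 9 of `bbdat`; the real file may be formatted differently.

```r
# Hypothetical sample values standing in for bbdat[, 9]
salaries <- c("$1200000", "$550000", "$75500")

# Unescaped "$" is an end-of-string anchor, so nothing is removed
unescaped_res <- gsub("$", "", salaries)

# Three equivalent ways to match a literal dollar sign
fixed_res   <- gsub("$", "", salaries, fixed = TRUE)  # treat pattern as plain text
bracket_res <- gsub("[$]", "", salaries)              # character class
escaped_res <- gsub("\\$", "", salaries)              # escaped metacharacter

print(unescaped_res)
print(fixed_res)

# Once the "$" is gone the strings can be converted to numbers
salary_num <- as.numeric(fixed_res)
print(salary_num)
```

Each call operates on the whole vector at once, which plays the role that cellfun or a list comprehension would play in other languages.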

If each element contains only a single $, sub is enough: it replaces only the first match, whereas gsub (global substitution) replaces every match.
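The difference between the two functions only shows up when an element contains more than one $. A short sketch, using a made-up vector `x` purely for illustration:

```r
# Hypothetical strings, some with more than one dollar sign
x <- c("$100", "$$200", "$3$00")

# sub replaces only the first literal "$" in each element
first_only <- sub("$", "", x, fixed = TRUE)

# gsub replaces every literal "$" in each element
all_matches <- gsub("$", "", x, fixed = TRUE)

print(first_only)   # "100" "$200" "3$00"
print(all_matches)  # "100" "200" "300"
```

For a salary column with exactly one leading $ per entry, both give the same result.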

akrun
  • 1
    Ah, thank you. I had a feeling there was an easier way to do this, since it's so simple in every other language. Thank you for your help – Kyle Hunt Mar 02 '20 at 19:31