
I would like to strip characters from both the beginning and the end of each value in a data frame column, and return a new data frame in which every cell contains only the part I want to keep. In other words, each cell value contains one specific piece of information that I would like to extract. My cell values consist of upper- and lowercase letters, numbers, and special characters: dashes (-), colons (:), semicolons (;), and quotation marks ("). For example:

1A2b-3c4d5e6:f7g-8h;9i10j"11k12"l13m;IWouldLikeToKeepThis;14n15o16P17q18r19s-20t21U2;2v23w24"x25y-26z-27

should become

IWouldLikeToKeepThis

The number of characters in front of the part I would like to keep is fixed (37 characters). The length of the part I would like to keep is also fixed (20 characters). However, the number of characters after the part I would like to keep is not fixed; it varies from value to value.

NelsonGon
Silhouettes

1 Answer


Since the position and the length of the part to keep are both fixed, you could use substr or substring with start position 38 and end position 57.

string <- '1A2b-3c4d5e6:f7g-8h;9i10j"11k12"l13m;IWouldLikeToKeepThis;14n15o16P17q18r19s-20t21U2;2v23w24"x25y-26z-27'
substr(string,38,57)
#[1] "IWouldLikeToKeepThis"
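
As a small sketch of where 38 and 57 come from, the positions can be derived from the two counts given in the question. The variable names `prefix_len`, `keep_len`, `first_pos`, and `last_pos` are made up for this illustration.

```r
# Counts taken from the question; variable names are hypothetical.
prefix_len <- 37  # characters in front of the part to keep
keep_len   <- 20  # length of the part to keep

first_pos <- prefix_len + 1         # 38
last_pos  <- prefix_len + keep_len  # 57

string <- '1A2b-3c4d5e6:f7g-8h;9i10j"11k12"l13m;IWouldLikeToKeepThis;14n15o16P17q18r19s-20t21U2;2v23w24"x25y-26z-27'
kept <- substr(string, first_pos, last_pos)
kept
#[1] "IWouldLikeToKeepThis"
```

Because only the start and end positions matter, the variable number of trailing characters does not need to be known.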

Or with substring.

substring(string,38,57)
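
Since the question asks about a data frame column, here is a minimal sketch of how this might look there. substr is vectorised, so it can be applied to the whole column at once. The data frame `df`, its column names `id` and `value`, and the second row are invented for illustration; only the first string comes from the question.

```r
# Hypothetical data frame: the first value is the example from the question,
# the second is a made-up value with the same 37-character prefix length.
df <- data.frame(
  id = 1:2,
  value = c(
    '1A2b-3c4d5e6:f7g-8h;9i10j"11k12"l13m;IWouldLikeToKeepThis;14n15o16P17q18r19s-20t21U2;2v23w24"x25y-26z-27',
    paste0(strrep("x", 37), "AnotherTwentyChars!!", ";short-tail")
  ),
  stringsAsFactors = FALSE
)

# Copy the data frame and overwrite the column with characters 38 to 57.
df_new <- df
df_new$value <- substr(df$value, 38, 57)
df_new$value
#[1] "IWouldLikeToKeepThis" "AnotherTwentyChars!!"
```

The original `df` is left unchanged, so `df_new` is the new data frame the question asks for.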
Ronak Shah