I have some data coming in where the spacing is not consistent in one particular column, and as a result I am not able to group by that column.

testdata <- tibble::tribble(
  ~config,       ~construct, ~var,
  1, "This is line 1",   12,
  2, " This is line 2",   15,
  3, "This is line   1 ",   21,
  4, "This  is line 2",   12,
  5, "This is line 3",   12,
  6, "This  is line 4",   11,
  7, " This   is line 3 ",   21,
  8, "This is   line 4",   12
)

As you can see in the df above, I am trying to group by construct, but since the spacing is not consistent I am not sure how to trim it so that the rows group properly.

I have looked into trim, but it appears to remove only leading and trailing spaces and does not take care of the additional spaces in between. How can I remove such spaces in this case?

SNT
  • 3
    Does this answer your question? [Merge Multiple spaces to single space; remove trailing/leading spaces](https://stackoverflow.com/questions/25707647/merge-multiple-spaces-to-single-space-remove-trailing-leading-spaces) –  Jun 30 '20 at 21:40
  • 1
    `tm::stripWhitespace(trimws(testdata$construct))` – d.b Jun 30 '20 at 21:51

1 Answer

trimws (to remove leading/trailing whitespace) combined with gsub (to replace each run of one or more whitespace characters with a single space) seems to work:

gsub("\\s{1,}", " ", trimws(testdata$construct))
#[1] "This is line 1" "This is line 2" "This is line 1" "This is line 2" "This is line 3" "This is line 4" "This is line 3" "This is line 4"
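
To connect this back to the grouping goal in the question, here is a minimal sketch of cleaning the column and then grouping on it. It assumes the dplyr and stringr packages are available; `stringr::str_squish()` is swapped in here as a one-call alternative to the `trimws` + `gsub` combination, and the summary columns `total_var` and `n` are names made up for illustration.

```r
library(dplyr)
library(stringr)

# Same sample data as in the question
testdata <- tibble::tribble(
  ~config, ~construct,           ~var,
  1,       "This is line 1",     12,
  2,       " This is line 2",    15,
  3,       "This is line   1 ",  21,
  4,       "This  is line 2",    12,
  5,       "This is line 3",     12,
  6,       "This  is line 4",    11,
  7,       " This   is line 3 ", 21,
  8,       "This is   line 4",   12
)

# str_squish() trims both ends and collapses internal whitespace runs
# to a single space, so equivalent labels become identical strings.
result <- testdata %>%
  mutate(construct = str_squish(construct)) %>%
  group_by(construct) %>%
  summarise(total_var = sum(var), n = n(), .groups = "drop")

print(result)
```

With the sample data this should leave four groups of two rows each. If you would rather avoid the extra dependency, the `gsub`/`trimws` expression above can be used inside `mutate()` in the same way.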
akrun