
I have some strings from which the spaces were removed by mistake. Luckily, I still have the original strings, and I want to replace the changed strings with the originals. Let's say my data is:

#dts1

Id     changed   Original

1        cfd      a b c
2        abc      cf d

All the spaces in the original data were removed. I want my program to go through the changed column, find, for instance, cfd, recognize that it was originally cf d, and replace it.

So, my ideal output is

#output

  Id  changed 

   1    cf d
   2    a b c

Any tip is really appreciated.

  dts1 <- structure(list(original = c("a b c", "cf d"), changed = c("cfd", 
  "abc")), .Names = c("original", "changed"), row.names = c(NA, 
  -2L), class = "data.frame")
MFR
  • 1
    Not really sure of what you are trying to get done. Can you elaborate and make the question a bit more detailed? – Pj_ Aug 24 '16 at 23:47
  • 2
    Just strip the white space out of `dts1$original` and match it up? `dts1$original[match(dts1$changed, gsub("\\s+","",dts1$original))]` Your question is not clear and the relationship between `dts1$original` and `dts1$changed` is not obvious. – thelatemail Aug 24 '16 at 23:50
  • Sorry for not being clear. Let's say that in one of my files I mistakenly removed all the spaces from my strings. For instance, "a b" became "ab". I want to undo this mistake and change "ab" back to "a b". Please let me know if it is still not clear. – MFR Aug 24 '16 at 23:52
  • Can anyone please explain to me the reason of these downvotes, please? – MFR Aug 24 '16 at 23:57
  • 1
    @Eddie You should provide a reproducible example so others can help you http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – shiny Aug 25 '16 at 00:11
  • I think that your question is fine but it sounds like it was answered by @thelatemail , no? BTW, while I think the current version of your question is good, one important reason your other question did better is that you included `dput()` to make it reproducible. That's very important. – Hack-R Aug 25 '16 at 00:20
  • Unless you are able to provide some knowledge to the processor, it has no way to know whether "cfd" should be "cf d", "c f d", or "c fd", etc. Do you have the correct spacing somewhere that you can restore from? If so, can you simply restore that good copy? Otherwise the only feasible option is to correct the errors manually. – Joshua Briefman Aug 25 '16 at 00:23
  • 1
    You need to demonstrate some effort. Your question reads as, "Here's my data, here's what I want, do my work for me." – nrussell Aug 25 '16 at 00:24
  • Joshua, because we have only cf d in the original data, not, for instance, c f d – MFR Aug 25 '16 at 00:25
  • So in this case, you have specific words like "cf" which previously had a space after them, and now don't. You could use a regex to replace instances of "cf" with "cf " and do the same for other similar words. You could use sed, Python, Perl, or other tools to perform this update. The only thing I would be concerned about is that words will need to be processed carefully so as not to split up words which contain other known words. – Joshua Briefman Aug 25 '16 at 00:28
  • Thanks, Joshua. It is very unlikely that we have this issue in this case, because the strings are pretty long. – MFR Aug 25 '16 at 00:32

1 Answer


Just strip the white space out of dts1$original and match it up?

dts1$original[match(dts1$changed, gsub("\\s+","",dts1$original))]
#[1] "cf d"  "a b c"

Where:

dts1 <- structure(list(original = c("a b c", "cf d"), changed = c("cfd", 
"abc")), .Names = c("original", "changed"), row.names = c(NA, 
-2L), class = "data.frame")
thelatemail
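
The one-liner in the answer above can be unpacked into separate steps, which may make the lookup easier to follow. This is only a sketch under stated assumptions, not part of the original answer: the names `key`, `idx`, and `restored` are hypothetical, and the fallback for unmatched strings and the duplicate-key warning are additions for illustration.

```r
# Data from the question
dts1 <- data.frame(
  original = c("a b c", "cf d"),
  changed  = c("cfd", "abc"),
  stringsAsFactors = FALSE
)

# Step 1: build a lookup key by stripping all whitespace from the originals
key <- gsub("\\s+", "", dts1$original)

# Assumption: each space-free key should map to exactly one original.
# If two originals collapse to the same key, match() silently picks the first.
if (anyDuplicated(key) > 0) {
  warning("some originals are identical once spaces are removed")
}

# Step 2: for each changed string, find the position of the identical key
idx <- match(dts1$changed, key)

# Step 3: take the spaced original; keep the changed string where nothing matched
dts1$restored <- ifelse(is.na(idx), dts1$changed, dts1$original[idx])

print(dts1$restored)
```

Because `match()` returns `NA` for strings with no counterpart, the `ifelse()` step leaves those values untouched instead of overwriting them with `NA`.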