3

First off, I'm still learning about regular expressions. I have googled this, but what I tried still doesn't work.

How do I remove all characters except letters and numbers in a variable with sed? For example, I have this text file:

MytextOnly !@#!text@@32423#@$text#%$#text%#t23432ext$32342%^-_+-=-_++_;:"'][}}{|\/

How do I show only letters and numbers?

Lin
  • Why specify that an answer must "use sed"? Why not ask for an answer that uses either bash or POSIX-standardized tools, and let folks give you the best tool for the job? – Charles Duffy Feb 19 '15 at 22:41
  • Because I didn't know that. Please remember that I'm still learning. – Lin Feb 19 '15 at 22:59
  • That's my point -- since you're still learning, it's best to ask questions in a general enough way that you leave them open to answers that might be outside the realm you initially expect. For instance, if you have `SomeShellVar='abc123def456'`, you can `echo "${SomeShellVar//[^[:alpha:]]/}"` (or `LettersOnly=${SomeShellVar//[^[:alpha:]]/}` if you don't want to `echo` the output) to remove anything that isn't a letter, completely internal to bash. Same thing with `[^[:alnum:]]` to leave only letters and numbers -- far faster than any external tool when you're working with shell variables. – Charles Duffy Feb 19 '15 at 23:07
  • ...granted, that approach is focused on variables, as opposed to files; for working on files, the answers you have now are good (though if you want to do file edits in-place in a way that works on all POSIX platforms, `ex` is another good tool to know). – Charles Duffy Feb 19 '15 at 23:08
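
A minimal sketch of the bash parameter expansion described in the comment above. It assumes bash (the `${var//pattern/}` form is not available in plain POSIX `sh`), and the sample value is hypothetical rather than taken from the question:

```shell
#!/usr/bin/env bash
# Hypothetical sample value, chosen only for illustration.
SomeShellVar='abc 123/def_456!'

# ${var//pattern/} removes every match of the glob pattern from the value.
# [^[:alpha:]] matches any single character that is not a letter.
LettersOnly=${SomeShellVar//[^[:alpha:]]/}

# [^[:alnum:]] matches any single character that is not a letter or digit.
AlnumOnly=${SomeShellVar//[^[:alnum:]]/}

echo "$LettersOnly"   # abcdef
echo "$AlnumOnly"     # abc123def456
```

No external process is started here, which is why this tends to be the cheaper option when the text is already in a shell variable.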

2 Answers

4

You can use:

sed 's/[^[:alnum:]]\+//g' file
MytextOnlytext32423texttextt23432ext32342

The bracket expression [^[:alnum:]] matches any character that is not alphanumeric.
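
To see the bracket expression in isolation, here is a small sketch on a hypothetical input string (not the file from the question). It leaves out `\+` so that it stays within plain POSIX basic regular expressions:

```shell
# Hypothetical input; any string would do.
input='a-b_c 1!2?3'

# [^[:alnum:]] matches one character that is neither a letter nor a digit.
# The g flag repeats the substitution for every match on the line.
cleaned=$(printf '%s\n' "$input" | sed 's/[^[:alnum:]]//g')

printf '%s\n' "$cleaned"   # abc123
```

Note that the underscore is removed too, since `[:alnum:]` covers only letters and digits.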


EDIT: Based on comments below:

sed 's~[^[:alnum:]/]\+~~g' file
MytextOnlytext32423texttextt23432ext32342/
anubhava
  • Thanks, that works. Can you explain what `\+` does? Also, what if I want to add a forward slash as well, so it shows letters, numbers and forward slashes? – Lin Feb 19 '15 at 21:53
  • Because of `/g` global `\+` is not even necessary, I think. – Tiago Lopo Feb 19 '15 at 21:55
  • Check my updated answer for allowing forward slashes. @Tiago: The quantifier `+` is used for efficiency, so that fewer replacements happen. – anubhava Feb 19 '15 at 21:59
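
The point discussed in these comments can be checked with a short sketch: with the `g` flag, matching one character at a time and matching whole runs give the same result, and only the number of substitutions differs. The input is hypothetical. Since `\+` is a GNU extension in basic regular expressions, this sketch uses the POSIX interval `\{1,\}`, which means the same thing ("one or more"):

```shell
# Hypothetical input with runs of unwanted characters.
input='ab!!@@cd##12'

# One substitution per unwanted character (six substitutions here).
one_at_a_time=$(printf '%s\n' "$input" | sed 's/[^[:alnum:]]//g')

# One substitution per run of unwanted characters (two substitutions here).
in_runs=$(printf '%s\n' "$input" | sed 's/[^[:alnum:]]\{1,\}//g')

printf '%s\n' "$one_at_a_time"   # abcd12
printf '%s\n' "$in_runs"         # abcd12
```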
2

Using grep

grep -o '[[:alnum:]]' file

Agreed, this is not the perfect output (each matching character is printed on its own line), but everything is there.
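
One possible way to get the grep result back onto a single line is to delete the newlines afterwards. This is a sketch on a hypothetical input, assuming a grep that supports `-o` (GNU and BSD grep both do):

```shell
# Hypothetical input string.
input='My text!@# 42'

# grep -o prints each matching character on its own line;
# tr -d '\n' then removes the line breaks to join them again.
joined=$(printf '%s\n' "$input" | grep -o '[[:alnum:]]' | tr -d '\n')

printf '%s\n' "$joined"   # Mytext42
```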

Using tr

$ tr -d -c '[:alnum:]' < file
MytextOnlytext32423texttextt23432ext32342

If you also want to keep forward slashes:

$ tr -d -c '[:alnum:]/' < file
MytextOnlytext32423texttextt23432ext32342/
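
The two `tr` commands can also be tried without creating a file. The sketch below reads the sample line from the question into a variable (the variable names are just for illustration) through a quoted here-document, so the shell leaves `$`, quotes and backslashes untouched:

```shell
# Read the question's sample line verbatim; 'EOF' is quoted so nothing is expanded,
# and read -r keeps the backslash as-is.
IFS= read -r sample <<'EOF'
MytextOnly !@#!text@@32423#@$text#%$#text%#t23432ext$32342%^-_+-=-_++_;:"'][}}{|\/
EOF

# -c complements the set, -d deletes: remove everything that is NOT in the set.
alnum_only=$(printf '%s' "$sample" | tr -d -c '[:alnum:]')
with_slash=$(printf '%s' "$sample" | tr -d -c '[:alnum:]/')

printf '%s\n' "$alnum_only"   # MytextOnlytext32423texttextt23432ext32342
printf '%s\n' "$with_slash"   # MytextOnlytext32423texttextt23432ext32342/
```

When run on a file, `tr -d -c '[:alnum:]'` also deletes the newlines, so the output ends without a trailing line break.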

For a Python solution, see https://stackoverflow.com/a/5843560/297323

Fredrik Pihl
  • What if I want to add a forward slash as well, so it shows letters, numbers and forward slashes? – Lin Feb 19 '15 at 21:57
  • Then just add `\/` to the character class. The `-c` flag complements the specified set, so together with `-d` it says: delete everything not matched by the character class. – Fredrik Pihl Feb 19 '15 at 21:59
  • For that Python link, how do I keep spaces and forward slashes, so it shows letters, numbers, spaces and forward slashes? – Lin Feb 20 '15 at 06:43
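
This does not cover the linked Python answer, but with the shell tools from this page the same idea extends naturally: add a literal space and `/` to the set of characters to keep. A sketch on a hypothetical input:

```shell
# Hypothetical input string.
input='path/to my_file (v2)!'

# Keep letters, digits, spaces and forward slashes; delete everything else.
kept_tr=$(printf '%s' "$input" | tr -d -c '[:alnum:] /')

# Same result with sed; ~ is used as the delimiter so / needs no escaping.
kept_sed=$(printf '%s\n' "$input" | sed 's~[^[:alnum:] /]~~g')

printf '%s\n' "$kept_tr"    # path/to myfile v2
printf '%s\n' "$kept_sed"   # path/to myfile v2
```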