1

I've been trying to figure out a regex that outputs only the three-letter words and also deletes the word "not".

What I've tried so far is:

Here's what I need regexed:

bash: line 1: drs: command not found
bash: line 2: tep: command not found
bash: line 3: ldo: command not found
bash: line 4: tep: command not found
bash: line 5: txw: command not found
bash: line 6: tep: command not found
bash: line 7: jfp: command not found
bash: line 8: mys: command not found
bash: line 9: jhf: command not found
bash: line 10: mjw: command not found
bash: line 11: czw: command not found
bash: line 12: txh: command not found
bash: line 13: krn: command not found
bash: line 14: sct: command not found
bash: line 15: jad: command not found

I want it to only output:

drs
tep
ldo
tep
txw
tep
jfp
mys
jhf
mjw
czw
txh
krn
sct
jad

Is there a way I can do this? Please keep in mind I have multiple other three letter combinations, with all letters of the alphabet.

Bam
  • are the letters always in that place? I mean, is always "bash: line xxxx: ABC: ...."? – zon7 Dec 28 '15 at 22:32
  • @zon7 Yes they are always in the same place, I'll edit the post hang on.. – Bam Dec 28 '15 at 22:33
  • Please read "[ask]" and "[mcve]". Is there working code? Is there sample input and your expected output? – the Tin Man Dec 28 '15 at 22:33
  • *WHY* do you want to delete the word "not"? That doesn't make any sense. – the Tin Man Dec 28 '15 at 22:48
  • 2
    **Warning: Do not use `[A-z]` in regexes.** It matches uppercase and lowercase ASCII letters as you expect, but it also matches several punctuation characters whose code points lie between `Z` and `a`. Use `[A-Za-z]` instead, or make use of the case-insensitive flag (e.g., `/[a-z]/i`). – Alan Moore Dec 28 '15 at 22:57
  • @AlanMoore I had no idea it matched punctuation characters, thanks for the information! – Bam Dec 28 '15 at 23:00
  • 1
    You could improve your question in several ways: 1. State what you want to do without reference to the approach that you think should be taken (e.g., use of a regex). 2. When you give an example, ensure each input value is a valid Ruby object. Here it's not clear if your text is a string or an array of strings. You should write `"bash: line 1:...."` or `["bash: line 1:...."]`. 3. Assign a variable to each object that is an input for your example (e.g., `str = "bash: line 1:...."`) so that readers can refer to the variable in comments and answers without having to define it. (cont.) – Cary Swoveland Dec 29 '15 at 03:06
  • ...4. Ensure the example illustrates each element of the question. Here you say that you want to "delete the word 'not'", but that word does not appear in your example. 5. Make the example as brief as possible to make the point. Here 3-5 lines would have been enough. – Cary Swoveland Dec 29 '15 at 03:09
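
The `[A-z]` warning in the comments above can be checked with a small sketch. The helper name `matched_chars` is hypothetical; the six characters listed are the ASCII code points that sit between `Z` and `a`.

```ruby
# Hypothetical helper: returns the characters from `chars` that `pattern` matches.
def matched_chars(pattern, chars)
  chars.select { |c| c.match?(pattern) }
end

# The six ASCII characters whose code points lie between "Z" (90) and "a" (97).
punctuation = ['[', '\\', ']', '^', '_', '`']

p matched_chars(/[A-z]/, punctuation).size    # => 6 (every one of them matches)
p matched_chars(/[A-Za-z]/, punctuation)      # => []
p matched_chars(/[a-z]/i, punctuation)        # => []
```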

4 Answers

3

Why regex? You are overcomplicating your life:

def three_letters_excluding_not(text)
  text
    .split(/\W+/)
    .select { |w| w.length == 3 }
    .reject { |w| w == "not" }
end

Short, easy, readable, enjoy the power of Ruby.
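
As a usage sketch, here is the method called on two lines shaped like the question's input. One caveat worth checking against real data: `\W+` treats digits as word characters, so a three-digit line number such as `100` is also selected. The second method, `three_lowercase_letters_excluding_not`, is a hypothetical variant (not part of the answer above) that keeps only words made of exactly three lowercase letters.

```ruby
# The method from the answer, repeated so the example runs on its own.
def three_letters_excluding_not(text)
  text
    .split(/\W+/)
    .select { |w| w.length == 3 }
    .reject { |w| w == "not" }
end

# Hypothetical stricter variant: keep only words of three lowercase letters.
def three_lowercase_letters_excluding_not(text)
  text
    .split(/\W+/)
    .grep(/\A[a-z]{3}\z/)
    .reject { |w| w == "not" }
end

text = "bash: line 1: drs: command not found\n" \
       "bash: line 100: tep: command not found\n"

p three_letters_excluding_not(text)            # => ["drs", "100", "tep"]
p three_lowercase_letters_excluding_not(text)  # => ["drs", "tep"]
```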

Caridorc
2

This doesn't seem like a good use of regex since you're dealing with fields:

str = "bash: line 14: krn: command not found"
str.split(': ')[2] # => "krn"

Here's a more thorough test:

[
  'bash: line 1: drs: command not found',
  'bash: line 2: tep: command not found',
  'bash: line 3: ldo: command not found',
  'bash: line 4: tep: command not found',
  'bash: line 5: txw: command not found',
  'bash: line 6: tep: command not found',
  'bash: line 7: jfp: command not found',
  'bash: line 8: mys: command not found',
  'bash: line 9: jhf: command not found',
  'bash: line 10: mjw: command not found',
  'bash: line 11: czw: command not found',
  'bash: line 12: txh: command not found',
  'bash: line 13: krn: command not found',
  'bash: line 14: sct: command not found',
  'bash: line 15: jad: command not found',
].each do |str|
  puts str.split(': ')[2]
end
# >> drs
# >> tep
# >> ldo
# >> tep
# >> txw
# >> tep
# >> jfp
# >> mys
# >> jhf
# >> mjw
# >> czw
# >> txh
# >> krn
# >> sct
# >> jad

If you don't know how many spaces will surround the `:` delimiters, use `strip` to remove leading and trailing whitespace from the word captured:

str.split(':')[2].strip
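
A short sketch of that expression on input with irregular spacing. The wrapper name `third_field` and the second sample line are made up for illustration.

```ruby
# Hypothetical helper wrapping the expression above.
def third_field(line)
  line.split(':')[2].strip
end

p third_field("bash: line 14: krn: command not found")        # => "krn"
p third_field("bash :line 14 :   krn   : command not found")  # => "krn"
```

Note that a line with fewer than three `:`-separated fields makes `[2]` return `nil`, so `strip` raises `NoMethodError`; `&.strip` would return `nil` instead if that case can occur.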
the Tin Man
1
str =<<_
bash: line 1: drs: command not found
bash: line 2: tep: command not found
bash: line 3: not: command not found
bash: line 4: tep: command not found
bash: line 5: txw: command not found
_

r = /
    \d:\s+ # match a digit, colon and one or more spaces
    \K     # forget everything matched so far
    .{3}   # match any three characters
    /x     # extended/free-spacing regex definition mode

str.scan r
  #=> ["drs", "tep", "not", "tep", "txw"]

If you don't want "not":

str.scan(r) - ["not"]
  #=> ["drs", "tep", "tep", "txw"] 

If this is not a one-off calculation, consider whether the text format may change in future. If it might, implement a method that you think is least likely to require modification after the change.
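
One possible reading of that advice, as a sketch rather than a recommendation: keep the extraction in a single method and anchor the pattern to the parts of the line that are known today, so lines in another format are skipped instead of producing wrong results. The method name `missing_commands` and the assumption that every relevant line starts with `bash: line N:` are mine, not from the question.

```ruby
# Hypothetical method: returns the three-letter command names, minus "not".
def missing_commands(text)
  text.each_line
      .map { |line| line[/\Abash: line \d+:\s*([a-z]{3}):/, 1] } # capture group 1, or nil
      .compact - ["not"]                                          # drop non-matching lines and "not"
end

sample = <<~TEXT
  bash: line 1: drs: command not found
  bash: line 2: not: command not found
  some unrelated output
  bash: line 3: txw: command not found
TEXT

p missing_commands(sample)  # => ["drs", "txw"]
```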

Cary Swoveland
-1

This should do:

"bash: line.*?: (.*?):"

This matches everything from "bash" up to the ": " after the line number, and captures in a group whatever comes before the next ":" (here, the three letters).

You can test it here http://rubular.com/
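
A sketch of how such a pattern might be used from Ruby, assuming the intended quantifiers are the lazy `.*?` (that reading is an assumption based on the description above). When the pattern contains a group, `scan` returns one array per match, hence the `flatten`. The method name `commands_from` is hypothetical.

```ruby
# Hypothetical helper: collect the captured group from every matching line.
def commands_from(text)
  text.scan(/bash: line.*?: (.*?):/).flatten
end

text = "bash: line 1: drs: command not found\n" \
       "bash: line 10: mjw: command not found\n"

p commands_from(text)  # => ["drs", "mjw"]
```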

zon7