I am trying to write a function that removes a suffix from a string. The possible suffixes are:

agent_pkg
agent
pkg
driver
abs_if
abs_if_pkg
if_pkg
if

Test strings:

test_blah_agent_pkg
test_blah_agent
test_blah_pkg
test_blah_driver
test_blah_abs_if
test_blah_abs_if_pkg
test_blah_if_pkg
test_blah_if

From each of the test strings above, I expect to get test_blah.

I wrote a function like this:

(defun get-base-name (name)
  "Get the base name from string."
  (setq s (substring-no-properties name))
  (string-match "\\(.*\\)_\\(agent_pkg\\|agent\\|driver\\|abs_if\\|if\\|pkg\\)" s)  
  (match-string 1 s))

but it always matches only the shortest candidate. For example, I got test_blah_abs from (get-base-name "test_blah_abs_if").

Enze Chi

1 Answer

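.* is greedy¹: it matches as much text as it can while still letting the whole regex match. In your regex that means the first group swallows test_blah_abs and leaves only the shorter suffix if for the alternation. You want the non-greedy version, which stops at the first position where the rest of the regex can match. Adding ? right after * or + makes that operator non-greedy. Compare: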

(let ((s "abcabcabc"))
  (string-match ".*c" s)
  (match-string 0 s)) ; => "abcabcabc"
(let ((s "abcabcabc"))
  (string-match ".*?c" s)
  (match-string 0 s)) ; => "abc"

.*? is a non-greedy version of .*, so just adding ? makes it work:

(let ((s "test_blah_agent_pkg
test_blah_agent
test_blah_pkg
test_blah_driver
test_blah_abs_if
test_blah_abs_if_pkg
test_blah_if_pkg
test_blah_if"))
  (string-match "\\(.*?\\)_\\(agent_pkg\\|agent\\|driver\\|abs_if\\|if\\|pkg\\)" s)
  (match-string 1 s)) ; => "test_blah"
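Wrapped back into a function, the same idea might look like the sketch below. It goes a little beyond the fix above and rests on some assumptions of mine: the name my-get-base-name is made up, the regex is anchored at both ends with \\` and \\' so that only a trailing suffix is removed, all eight suffixes from the question are listed, and the input is returned unchanged when nothing matches. It also uses let instead of setq so that s stays local.

```elisp
;; A minimal sketch; `my-get-base-name' is a hypothetical name.
(defun my-get-base-name (name)
  "Return NAME with one known trailing suffix removed.
Return NAME unchanged when no known suffix matches."
  ;; `let' keeps S local; `setq' would set a global variable.
  (let ((s (substring-no-properties name)))
    (if (string-match
         ;; \\` and \\' anchor the match to the whole string, so a
         ;; suffix word in the middle of NAME is not treated as the end.
         "\\`\\(.*?\\)_\\(agent_pkg\\|agent\\|driver\\|abs_if_pkg\\|abs_if\\|if_pkg\\|if\\|pkg\\)\\'"
         s)
        (match-string 1 s)
      s)))

(my-get-base-name "test_blah_abs_if_pkg") ; => "test_blah"
(my-get-base-name "test_if_blah_agent")   ; => "test_if_blah"
```

Without the anchors, the non-greedy group would stop at the first suffix word anywhere in the string, so "test_if_blah_agent" would give "test".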

FYI, the third-party string manipulation library s has plenty of string functions that you might find useful instead of relying on regular expressions all the time. E.g. s-shared-start finds the common prefix of 2 strings:

(s-shared-start "test_blah_agent" "test_blah_pkg") ; "test_blah_"

Combined with s-lines, which breaks a string into a list of strings by newline character, and -reduce function from the amazing third-party list manipulation library dash, you can find a prefix that is common for every string:

(let ((s "test_blah_agent_pkg
test_blah_agent
test_blah_pkg
test_blah_driver
test_blah_abs_if
test_blah_abs_if_pkg
test_blah_if_pkg
test_blah_if"))
  (-reduce 's-shared-start (s-lines s))) ; => "test_blah_"
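If you would rather not install extra packages, a similar result can probably be had with built-ins only. The sketch below is my own substitution, not part of s or dash: it relies on try-completion returning the longest common prefix of all candidates when given an empty input string, and on split-string to break the text into lines. The name my-common-prefix is hypothetical.

```elisp
;; A minimal sketch using only built-in functions;
;; `my-common-prefix' is a hypothetical name.
(defun my-common-prefix (text)
  "Return the longest prefix shared by every non-empty line of TEXT."
  ;; Bind `completion-ignore-case' so the comparison is case-sensitive
  ;; regardless of the user's configuration.
  (let ((completion-ignore-case nil))
    ;; With "" as input, every line is a candidate, and
    ;; `try-completion' returns their longest common prefix.
    (try-completion "" (split-string text "\n" t))))

(my-common-prefix "test_blah_agent_pkg
test_blah_agent
test_blah_pkg
test_blah_driver") ; => "test_blah_"
```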

¹ Read under section Greediness to understand this concept.

Mirzhan Irkegulov
  • Hi @sindikat, thanks for your answer. It solved my problem. The other options with `s` may not work in this case, because each time I only have a single string to match against the suffixes. But there's a lot of knowledge in your reply, and I will read up on the packages you listed. Thanks again. – Enze Chi Feb 26 '15 at 10:07