0

I'm building two hybrid expressions in SQL Alchemy where I want to select i) all the digits before and ii) all the digits after a hyphen in a string.

Given the string 6-12(1) I'd like the outputs 6 and 12 respectively.

Following this answer I figured I'd need r"(\d+)-" for i) so:

@digits_before.expression
def digits_before(cls):
    return func.regexp_substr(cls.string_field, r"(\d+)-")

However, this returns 6- for the record in question.

Looking at the SQL that's generated the regex string is reduced to '(d+)-' - could this be why?

What are the correct two regex strings I need to use?

Thanks in advance.

Jossy
  • 589
  • 2
  • 12
  • 36
  • If the backslash is removed, you need to double it in the code. Better, just use `[0-9]` instead of `\d`. – Wiktor Stribiżew Aug 12 '21 at 15:03
  • Thanks. So changed the Python regex to `r"([0-9]+)-"` which produced `"([0-9]+)-"` in SQL however I'm still getting `"6-"` :-( – Jossy Aug 12 '21 at 15:37
  • Correct, `regexp_substr` only returns the first match and returns the whole match, not just the captured part unless you specify the group ID as the third argument. – Wiktor Stribiżew Aug 12 '21 at 15:38
  • Ah ok. When I read through the documentation the third argument is `pos` which is the position in the string to start the search - pretty sure this isn't what I want? The fourth and final argument is `occurence` - is this the group ID you mean? I'm not sure what to enter here? – Jossy Aug 12 '21 at 15:46
  • Ok, if there is no group number to pass, it does not even allow that (it does in some other implementations). – Wiktor Stribiżew Aug 12 '21 at 15:51

1 Answers1

0

Seems like this regexp would capture the two numbers:

(\d+)-(\d+)

This gets just the first number, but verifies that there is a second:

(\d+)-\d+

Vice versa:

\d+-(\d+)
Rick James
  • 135,179
  • 13
  • 127
  • 222