
I am trying to extract a rating from a tweet using a regular expression. For example, for the tweet below, I want to get the user rating (9.75) and the maximum rating (10).

This is Logan, the Chow who lived. He solemnly swears he's up to lots of good. 9.75/10

I used the regex below, but capture groups 1 and 2 contain 75 and 10. I am not sure why only the digits after the decimal point are captured for the user rating.

.*(\d+\.?\d+)\/(\d*\.?\d*)
Wiktor Stribiżew
Pratibha T
  • Well, there is a "greedy" quantifier issue here, remove `.*`. Then, just use `/(\d+(?:\.\d+)?)\/(\d+(?:\.\d+)?)/` – Wiktor Stribiżew Dec 05 '19 at 08:22
  • @PratibhaT [As the asker, you have a special privilege: you may accept the answer that you believe is the best solution to your problem](https://stackoverflow.com/help/someone-answers) – AndreasPizsa Dec 05 '19 at 11:11

2 Answers


If you want both numbers to have optional decimals, put the "one or more" quantifier `+` and the "zero or more" quantifier `*` in the right places: `+` on the mandatory leading digits and `*` on the optional digits after the dot. Dropping the leading `.*` from the question's pattern, replace

(\d+\.?\d+)\/(\d*\.?\d*)

with

(\d+\.?\d*)\/(\d+\.?\d*)

This matches at least one digit, then an optional dot, then zero or more further digits.
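To see the difference, both patterns can be run against the example tweet. This is a minimal sketch using `String.prototype.match`; the variable names are illustrative, and the question's pattern is shown with its leading `.*` to reproduce the reported result.

```javascript
const tweet = "This is Logan, the Chow who lived. He solemnly swears he's up to lots of good. 9.75/10";

// Pattern from the question: the greedy .* consumes as much as it can, so
// group 1 is left with the minimum \d+\.?\d+ can match, which is two digits.
const original = /.*(\d+\.?\d+)\/(\d*\.?\d*)/;

// Corrected pattern: no leading .*, mandatory digits first, optional decimals after.
const corrected = /(\d+\.?\d*)\/(\d+\.?\d*)/;

const before = tweet.match(original);
const after = tweet.match(corrected);

console.log(before[1], before[2]); // 75 10
console.log(after[1], after[2]);   // 9.75 10
```

Because `\.?\d*` makes the dot and the following digits independently optional, this pattern would also accept a trailing dot such as `9./10`.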

Live link: https://regex101.com/r/qc5Zwz/1

Simson

\b(\d+(?:\.\d+)?)\/(\d+)\b
  • \b - expect a word boundary (the position between a word character, such as a digit, and a non-word character, such as a space)
  • ( - start capturing the 'rating'
  • \d+ - integer part
  • (?:\.\d+)? - wrap the decimal part, don’t capture it as a group; make it optional
  • ) - end of 'rating' capturing group
  • \/ - expect a forward slash
  • (\d+) - capture the 'maximum'
  • \b - expect a word boundary again

const text = 'This is Logan, the Chow who lived. He solemnly swears he\'s up to lots of good. 9.75/10'
const pattern = /\b(\d+(?:\.\d+)?)\/(\d+)\b/
console.log(text.match(pattern))
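If the goal is to work with the values as numbers, the match can be wrapped in a small helper. The following is only a sketch: `extractRating` is a hypothetical name, not part of any library, and it assumes the maximum is always an integer, as in the pattern above.

```javascript
// Hypothetical helper: returns { rating, maximum } as numbers, or null when
// the text contains no "<number>/<integer>" rating.
function extractRating(text) {
  const match = text.match(/\b(\d+(?:\.\d+)?)\/(\d+)\b/);
  if (match === null) return null; // match() returns null when nothing matches
  return {
    rating: parseFloat(match[1]),
    maximum: parseInt(match[2], 10),
  };
}

console.log(extractRating("... up to lots of good. 9.75/10")); // { rating: 9.75, maximum: 10 }
console.log(extractRating("No score in this tweet"));          // null
```

Checking for `null` matters because `match` returns `null` rather than an empty array when the tweet has no rating.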

https://regex101.com/r/foO1DF/2

AndreasPizsa