I'm fairly new to coding, and I've looked around and haven't found an answer to this (maybe I'm wording it wrong?). I'm working on code that finds numbers in comments, converts them to floats, and appends them to a list (float_numbers).
Currently, if someone writes "2 Thousand," it gets added to the end of float_numbers as [2, 1000], because the code reads it as two separate numbers. I'd like it to be read as the single number 2000 instead.
I'm currently using the "word2number" module to convert number words to numeric values.
I'll also need to convert numbers formatted as "3k", "4b", etc.
Here's the function in my code that extracts numbers from comments:
import re
from word2number import w2n

def extract_numbers_from_text(text):
    # Skip comments that contain links entirely
    if 'http' in text:
        return []
    # Use regular expressions to search for numeric patterns
    numbers = re.findall(r'\b\d+(?:[.,]\d+)?\b|\b[a-z]+\b', text)
    float_numbers = []
    for number in numbers:
        # Filter out some numbers
        if number in ["infinite", "infinity"]:
            continue
        try:
            # If the token is numeric text, convert it to a float and add it to the list
            float_numbers.append(float(number))
        except ValueError:
            try:
                # Otherwise treat it as a number word, convert it, and add it to the list
                float_numbers.append(float(w2n.word_to_num(number)))
            except ValueError:
                # If the word cannot be converted to a number, ignore it
                pass
    return float_numbers
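
To make the behaviour I'm after concrete, here is a rough sketch of one possible approach, not a tested solution. It assumes a digit followed by a scale word ("2 Thousand") or by a one-letter suffix attached directly to the digits ("3k", "4b") should become one number. The names SCALE_WORDS, SCALE_SUFFIXES, SCALED_NUMBER and extract_scaled_numbers are hypothetical, the suffix meanings (k, m, b, t) are my own assumption, and it deliberately does not use word2number.

```python
import re

# Hypothetical lookup tables (assumed meanings, not taken from word2number).
SCALE_WORDS = {"hundred": 1e2, "thousand": 1e3, "million": 1e6,
               "billion": 1e9, "trillion": 1e12}
SCALE_SUFFIXES = {"k": 1e3, "m": 1e6, "b": 1e9, "t": 1e12}

# A number with an optional decimal point, optionally followed by either
# a scale word (whitespace allowed, "2 thousand") or a one-letter suffix
# attached directly to the digits ("3k"). Comma decimals are not handled.
SCALED_NUMBER = re.compile(
    r"\b(\d+(?:\.\d+)?)"
    r"(?:\s*(hundred|thousand|million|billion|trillion)|([kmbt]))?\b",
    re.IGNORECASE,
)

def extract_scaled_numbers(text):
    """Return each number in text as a float, with its scale applied."""
    results = []
    for match in SCALED_NUMBER.finditer(text):
        value = float(match.group(1))
        word, suffix = match.group(2), match.group(3)
        if word:
            value *= SCALE_WORDS[word.lower()]
        elif suffix:
            value *= SCALE_SUFFIXES[suffix.lower()]
        results.append(value)
    return results

print(extract_scaled_numbers("2 Thousand"))  # [2000.0]
print(extract_scaled_numbers("3k and 4b"))   # [3000.0, 4000000000.0]
```

Because the number and its scale are captured in a single match, "2 Thousand" gives one entry instead of two. A digit followed by an unrelated word, such as "3 bananas", is still read as plain 3.0 because the suffix must be followed by a word boundary.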