I am using Portia to extract info from a page. However, one of the values extracted is not in a format that I can use.

More specifically, I want to extract a numeric value which uses a dot instead of a comma to denote thousands e.g. "1.000" instead of "1,000".

Is it possible to extract and then transform with Portia? I can set a regex to extract numbers but is it possible to replace them too?

Currently, I export the data to CSV and then use sed to replace the numbers in question.

Thanks

George Eracleous

1 Answer

See: How do I use Python to convert a string to a number if it has commas in it as thousands separators? The same approach works for dots if you pick a locale that uses them:

import locale

# Pick a locale whose thousands separator is a dot (it must be installed on the system)
locale.setlocale(locale.LC_ALL, 'de_DE.UTF-8')
locale.atoi('1.000')
# 1000

Basically, it's a string-to-number conversion that uses a locale matching the format of the input.
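If the required locale is not installed, or you would rather not change process-wide locale settings, a locale-free variant could look like the sketch below. It assumes the values are whole numbers with dots used only as thousands separators, and it is meant for a post-processing step on the exported data (in place of sed), not for the Portia UI. The function name `parse_dot_thousands` is made up for illustration.

```python
import re


def parse_dot_thousands(text):
    """Convert a string such as '1.000' or '12.345.678' to an int.

    Assumes dots are thousands separators only (no decimal part).
    """
    cleaned = text.strip()
    # Accept either dot-grouped digits (1-3 digits, then groups of exactly 3)
    # or a plain run of digits with no separators at all.
    if not re.fullmatch(r"\d{1,3}(\.\d{3})*|\d+", cleaned):
        raise ValueError(f"not a dot-grouped integer: {text!r}")
    return int(cleaned.replace(".", ""))


print(parse_dot_thousands("1.000"))
```

Rejecting anything that does not match the pattern (for example "1.5") avoids silently turning a decimal value into the wrong integer.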

Thomas Strub
  • Hey Thomas. Thanks for your answer. However, Portia is a UI which allows me to avoid writing code to crawl a page, so I was wondering if there was a way to do that via the UI itself. There is a way to declare a regex which is used to extract a value from a string, but there isn't a way to replace/transform that value. – George Eracleous Aug 30 '18 at 12:01
  • Probably in the UI there is a convert method as well. Should be a standard function – Thomas Strub Aug 30 '18 at 13:13