1
 PriceStr=group[2]
 price=Decimal(sub(r'[^\d.]', '', PriceStr))

In this piece of code, a money string is being converted into a Decimal. What does the second line actually mean? Why are the 'sub', the 'r', the apostrophes, the '^' etc. needed?

enigma
khushi

2 Answers

2

That line deletes every character from the string except digits and dots.

For example, '$1,346.9 total' is converted to '1346.9'.
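The example above can be checked in a Python shell. This is a minimal sketch; the variable names raw and cleaned are only illustrative and do not come from the original code.

```python
from re import sub

raw = '$1,346.9 total'

# Replace every character that is neither a digit nor a dot with ''.
cleaned = sub(r'[^\d.]', '', raw)

print(cleaned)  # 1346.9
```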

Antony Hatchkins
0

In Python, r'string' is a raw string, i.e. a string in which backslash escape sequences are not processed. Compare for example:

print(r'foo bar\n')
print('foo bar\n')

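In the second case \n is interpreted as a newline character, while in the raw string it is just a backslash followed by the letter n. This matters for regular expressions because they use backslashes themselves (for example \d). Find out more about raw strings for example here.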
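A quick way to see the difference is to compare lengths. This is a small sketch with illustrative variable names:

```python
raw = r'foo bar\n'     # backslash and 'n' are kept as two characters
cooked = 'foo bar\n'   # \n becomes one newline character

print(len(raw))     # 9
print(len(cooked))  # 8

# A raw string is the same as escaping every backslash by hand.
print(raw == 'foo bar\\n')  # True
print(r'[^\d.]' == '[^\\d.]')  # True
```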

The function sub has been imported from the re module. At the top of your code you will likely find this line:

from re import sub # (or `from re import *`)

In my opinion it is better to import re and then access sub as re.sub; that way it is unambiguous where sub comes from.
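Both import styles give you the same function object, so the choice is only about readability. A short sketch (the '$12.50' input is made up for illustration):

```python
import re
from re import sub

# The bare name and the qualified name refer to the same function.
print(sub is re.sub)  # True

# The qualified form makes the origin obvious at the call site.
print(re.sub(r'[^\d.]', '', '$12.50'))  # 12.50
```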

The first argument of sub is a regular expression. Regular expressions (regex) are a big topic; you can find excellent resources to understand them here and here. What this particular regex does:

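  • it defines a character class (everything between the square brackets [ and ]), which matches exactly one character
  • the ^ right after the opening bracket means negation (not), so the class matches any character that is not listed in it
  • the listed characters are \d (a digit, same as 0-9) and . (inside a character class the dot is just a literal dot; only outside of square brackets does it mean "any character")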
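To see what the character class matches, re.findall can list every match. Note that inside square brackets the dot is a literal dot, while outside of them it matches any character. The sample strings below are made up for illustration:

```python
import re

pattern = r'[^\d.]'  # one character that is neither a digit nor a dot

print(re.findall(pattern, 'USD 12.5'))   # ['U', 'S', 'D', ' ']
print(re.findall(pattern, '$1,346.9'))   # ['$', ',']

# Dot outside a character class: any character (except newline).
print(re.findall(r'.', 'a.b'))    # ['a', '.', 'b']
# Dot inside a character class: only a literal dot.
print(re.findall(r'[.]', 'a.b'))  # ['.']
```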

The so-called regex engine will look for matches of this one-character pattern; for example, in abc123 it will find 3 matches: a, b and c. The second argument of sub tells it what to replace these matches with. Here you tell it to replace them with nothing (an empty string, i.e. nothing between the 2 quote marks: ''). The third argument of sub is the string you want to do this operation on. The result is then passed to Decimal, which is most likely the class from Python's standard decimal module (check the imports at the top of your code to confirm); it turns the cleaned string into an exact decimal number.
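Putting the pieces together, the two lines from the question can be run step by step. This sketch assumes that Decimal is decimal.Decimal from the standard library and that group is a sequence whose third element is the price string; the contents of group here are hypothetical.

```python
from decimal import Decimal
from re import sub

# Hypothetical data: only the third element matters for this example.
group = ['widget', '3', '$1,299.50']

price_str = group[2]                         # '$1,299.50'
digits_only = sub(r'[^\d.]', '', price_str)  # strip '$' and ','
price = Decimal(digits_only)                 # exact decimal value

print(digits_only)  # 1299.50
print(price)        # 1299.50
print(price * 2)    # 2599.00
```

Decimal is a common choice for money because, unlike float, it represents values such as 1299.50 exactly.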

Not part of the answer, but some general advice on how to better handle similar cases:

If you want to understand a basic piece of code like this, you can import its functions and try them one by one, or temporarily add print() statements, reload the module and call the functions. For example, let's say your module is called pricecalculator and lives in a directory with the same name or in a file pricecalculator.py. Then go to that directory, open a Python shell, and type:

import importlib  # the older imp module was removed in Python 3.12
import pricecalculator
from pricecalculator import *

Then you can call any function from this module. Let's say you are wondering what the variable group is, and what its third element contains. Add the line print('group: ', group) or print('type of group: ', type(group)) above the lines in your post, and reload the module:

importlib.reload(pricecalculator)
from pricecalculator import *

Then call the function that contains this code, and you will see the content of the variable printed.

Also, if you see a function and are wondering what it does, try to find out where it comes from (which module) and look up its documentation. For example:

import re
# see the documentation:
help(re.sub) # press `q` to return to shell

from re import *
# find the module for a method:
sub.__module__
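The same lookup can be written as a small script rather than typed into the shell. This sketch also uses inspect.getmodule from the standard library, which is not mentioned above, as an alternative way to find the defining module:

```python
import inspect
from re import sub

# The __module__ attribute names the module where the function is defined.
print(sub.__module__)  # re

# inspect.getmodule returns the module object itself.
module = inspect.getmodule(sub)
print(module.__name__)  # re
```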

Also try to experiment with the functions directly in the shell: read the docs, try things and learn to understand the error messages. Here is a guide to what each type of error means.

Finally, before asking a question here, always think: is there a chance that the answers will help someone else? Try to formulate the title and the question accordingly.

deeenes