
In python I can do the following:

who = "tim"
what = "cake"
print("{0} likes {1}".format(who, what))

to yield "tim likes cake".

However, the inverse operation is not as straightforward, since I need to use regular expressions. By inverse I mean parsing a string of known structure, extracting the portions I know it contains, and storing them in my variables. I perform this extraction with:

import re

expression = "([a-z]*) likes ([a-z]*)"
input_line = "tim likes cake"

who, what = re.search(expression, input_line).groups()

which is neat enough for a small number of parameters, but it has two main drawbacks compared to my idea of an "ideal inverse" of format():

  • The extracted parameters are always strings, so converting them to another type such as float takes extra lines. format() handles the needed conversion internally, from any value to string.
  • I need to define different templates for input and output, because the input template in regular-expression form, "([a-z]*) likes ([a-z]*)", cannot be reused for "exporting" the data with the format function.

So my question is: does a function like this exist, one that would automatically parse the string and extract the values the same way we print them to the string, following almost the same syntax, like
"{0} likes {1}".extract(who,what,input_line="tim likes cake")

I am aware I can create my custom "extract" function which behaves as desired, but I don't want to create it if there is already one available.

rmhleo
  • So... Natural language processing? – OneCricketeer Jul 22 '16 at 14:06
  • I think this is much simpler, because the template sentence is given and the info to extract is specified. One option would be splitting on spaces and extracting the portions that are marked in the template with escape characters. But again, I am looking for an existing option before making my own. – rmhleo Jul 22 '16 at 14:11

3 Answers

who = "tim"
what = "cake"
print("{0} likes {1}".format(who, what))

This works because you know exactly where who and what are in the string. If that's the case, you don't need regex. Strings are lists of characters :)

def extract_who_what_from_string(string):
    words = string.split(" ")
    who = words[0]
    what = words[-1] 
    return who, what

Anything more complicated than this is, in fact, natural language processing and would be very much out of my scope.

joaquinlpereyra

Here's an idea.

import re 

template = "{0} likes {1}"
str_re = r"\w+"
re.search(template.format(str_re, str_re), ...) 

Though it seems messy.
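To make the idea concrete, here is a minimal sketch, assuming each placeholder should match a single word and that the template itself contains no regex metacharacters. Wrapping \w+ in parentheses (an addition to the snippet above) turns every placeholder into a capture group, so the same template can serve for both output and extraction. The names word_group, pattern and input_line are only illustrative.

```python
import re

template = "{0} likes {1}"
word_group = r"(\w+)"  # the parentheses make each placeholder a capture group

# Fill the template with regex pieces instead of values.
# Assumption: the literal text of the template has no regex metacharacters.
pattern = template.format(word_group, word_group)

input_line = "tim likes cake"
match = re.search(pattern, input_line)
if match is None:
    raise ValueError("input does not follow the template")

who, what = match.groups()

# The very same template rebuilds the original line.
print(template.format(who, what))
```

The extracted values are still strings, so any type conversion would have to happen afterwards.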

OneCricketeer

There doesn't seem to be a built-in solution beyond splitting the string and casting the components or using re.

Which is a little weird, because format can be used to specify types on input: "{0:03d}_{1:f}".format(12, 3) gives '012_3.000000', so I'm not sure why there's no "012_3.000000".extract("{0:03d}_{1:f}", [a, b]), but maybe only people coming from C want such a thing.
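For what it's worth, a rough version of such an extract can be sketched with the standard library alone. This is only an illustration, not an existing API: the extract function and the _TYPES table are hypothetical, only the d, f and s type codes are handled, and anything else in the format spec (width, padding, precision) is ignored when matching.

```python
import re
from string import Formatter

# Hypothetical table: format type code -> (regex fragment, converter).
_TYPES = {
    "d": (r"[-+]?\d+", int),
    "f": (r"[-+]?\d+\.\d+", float),
    "s": (r".+?", str),
    "": (r".+?", str),  # no type code given: treat the field as a string
}


def extract(template, text):
    """Return the values in text that fill the fields of template, or None."""
    pattern = ""
    converters = []
    for literal, field, spec, _conversion in Formatter().parse(template):
        pattern += re.escape(literal)  # literal text must match exactly
        if field is not None:
            # Keep only a trailing type letter, e.g. "03d" -> "d".
            code = spec[-1] if spec and spec[-1].isalpha() else ""
            regex, convert = _TYPES[code]  # unsupported codes raise KeyError
            pattern += "(" + regex + ")"
            converters.append(convert)
    match = re.fullmatch(pattern, text)
    if match is None:
        return None
    return [convert(group) for convert, group in zip(converters, match.groups())]


number, value = extract("{0:03d}_{1:f}", "012_3.000000")
who, what = extract("{0} likes {1}", "tim likes cake")
```

Here the same template string works for both format and extract, and the type codes drive the conversion back from strings.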

In any case, you may find the parse module useful, as suggested in this answer.

Corey