1

I'm looking for a pythonic and efficient replacement for the following self-explanatory code:

term = "< << >" # turn this into "( < )"
term.replace("<<", "#~").replace(">>", "~#").replace(">", ")").replace("<", "(").replace("#~", "<").replace("~#", ">")

Any ideas?

Daniel F
  • 13,684
  • 11
  • 87
  • 116
  • Is every token separated by a space? – Joel Cornett Apr 27 '12 at 19:12
  • 3
    @MarkRansom To be fair, he is asking about a better way to do this, so he probably realises that. – Gareth Latty Apr 27 '12 at 19:23
  • @JoelCornett no, well it depends. You made me realize that I've got a problem, since "(<" should be a possible result string, which would require an input of <<< where the first one must get transformed to a ( and the following two to a <. Awww, I think I need to rethink and reformulate my requirements... – Daniel F Apr 27 '12 at 20:54
  • @MarkRansom, I did realize that, but I know that that sequence will most probably not occur anytime soon in the input, so that it can be used to test the body of code which actually uses the snippet I'm asking about here. Now I want to focus on this snippet. – Daniel F Apr 27 '12 at 20:56
  • 1
    @DanielF., I put that out there not only for your benefit but for those who would provide a better solution. And unlikely input always has a habit of turning up: http://caterina.net/archive/001011.html – Mark Ransom Apr 27 '12 at 21:03
  • Actually, the original problem is this one: [solved] In Python, how can I query a list of words to match a certain query criteria? http://stackoverflow.com/questions/10281863/in-python-how-can-i-query-a-list-of-words-to-match-a-certain-query-criteria Apparently I may need to be able to get a resulting string of (< as well as <) or >) and )>, which would only be possible to archive with >>> and <<<, respectively, so that's no solution. I think I need to use < for ( and something like \< for < ... – Daniel F Apr 27 '12 at 21:04
  • I see your problem, `'<<<'` is ambiguous, should it be `'<('` or `'(<'`? Using `'\'` for quoting has a long history behind it and would make a good alternative. – Mark Ransom Apr 27 '12 at 21:39

4 Answers4

3

Use regular expressions:

import re
d = {'<': '(', '>': ')'}
replaceFunc = lambda m: m.group('a') or d[m.group('b')]
pattern = r"((?P<a><|>)(?P=a)|(?P<b><|>))"
term = "< << >"

replaced = re.sub(pattern, replaceFunc, term) #returns "( < )"

EDIT per the recommendations of Niklas B.

The above regular expression is the equivalent of matching:

("<<" OR ">>") OR ("<" OR ">")

(?P<a><|>) #tells the re to match either "<" or ">", place the result in group 'a'
(?P=a) #tells the re to match one more of whatever was matched in the group 'a'
(?P<b><|>) #tells the re to match a sing "<" or ">" and place it in group 'b'

Indeed, the lambda function replaceFunc simply repeats this match sequence, but returns the relevant replacement character.

This re matches "largest group first", so "<<< >" will be converted to "<( )".

Joel Cornett
  • 24,192
  • 9
  • 66
  • 88
  • he wants the output to be '( < )'. – Ashwini Chaudhary Apr 27 '12 at 19:37
  • 1
    Nice. Can't really say I immediately understood how it works, though. Maybe you should add an explanation? BTW, that lambda could be shortified a bit to `replaceFunc = lambda m: m.group('a') or d[m.group('b')]` – Niklas B. Apr 27 '12 at 21:24
  • It works solid and is compact. Too bad my requirements have changed. But until I evaluate the new requirements, I'll be using this code. (I may be needing an ouput of (< as well as <( which is not possible with the duplicates.) – Daniel F Apr 27 '12 at 21:27
  • @NiklasB.: Edited. Feel free to add/change anything if you want. – Joel Cornett Apr 28 '12 at 00:58
2

Here's a shorter method than my first answer. It splits the input on the doubled-up character sequence to remove them, then joins those segments back up again with the replacement single character. As before it uses a dictionary to specify the replacements that should be made.

def convert(s, replacements):
    for before, after in replacements.items():
        s = before.join([segment.replace(before, after) for segment in s.split(before + before)])
    return s

>>> convert('< << >', {'<': '(', '>': ')'})
'( < )'
Mark Ransom
  • 299,747
  • 42
  • 398
  • 622
1

I'd put all my replace terms in a list, then iterate over that and replace:

CHARS = [
  ('<<', '#~'),
  ('>>', '~#'),
  ...
]

for replace in CHARS:
   term = term.replace(*replace)

Not sure if it is the most pythonic, but seems pretty clear. You could even factor the forloop that receives the list of chars.

Jj.
  • 3,160
  • 25
  • 31
0

I'm not sure how Pythonic this is, but it works and is extremely flexible.

def convert(s, replacements):
    pending_char = None
    result = []
    for c in s:
        if c in replacements:
            if c == pending_char:
                result.append(c)
                pending_char = None
            else:
                pending_char = c
        else:
            if pending_char:
                result.append(replacements[pending_char])
                pending_char = None
            result.append(c)
    if pending_char:
        result.append(replacements[pending_char])
    return ''.join(result)

>>> convert('< << >', {'<': '(', '>': ')'})
'( < )'
Mark Ransom
  • 299,747
  • 42
  • 398
  • 622
  • @AshwiniChaudhary, I did. Based on the question I assumed `'>>'` should be converted to `'>'` although the example didn't include it. – Mark Ransom Apr 27 '12 at 20:07
  • This errs on convert('<>> >', {'<': '(', '>': ')'}) the result is '> )' instead of (> ) But don't worry, as the requirements of this question seem to have changed. Thanks for the effort. – Daniel F Apr 27 '12 at 21:24