2

I need to get a list of all %-style placeholders in a format string, together with their conversion types.

For example, "There're %(num_items)d items in the %(container)s" should yield (('num_items', 'd'), ('container', 's')).

What I tried:

1) I looked into the CPython source code and found that the function

PyObject *
PyString_Format(PyObject *format, PyObject *args)

performs the % interpolation at the C level.

2) I also searched PyPI and found the parse library, but, like string.Formatter.parse, it only parses {}-style format strings, which is not what I need.
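
To illustrate point 2, here is a minimal sketch of what string.Formatter.parse reports for each style. The helper name brace_field_names is made up for this example; only Formatter().parse comes from the standard library.

```python
from string import Formatter


def brace_field_names(fmt):
    """Return the field names that string.Formatter finds in fmt."""
    # parse() yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for chunks that contain no replacement field.
    return [name for _, name, _, _ in Formatter().parse(fmt) if name is not None]


# A {}-style string: both fields are reported.
brace_fields = brace_field_names("There are {num_items:d} items in the {container}")

# A %-style string: the whole text is treated as a literal, so nothing is found.
percent_fields = brace_field_names("There're %(num_items)d items in the %(container)s")
```

The second call returns an empty list, which is why this route does not help with % placeholders.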

Note: a quick regexp is unlikely to cover the full syntax of % substitution, which is what I need.

Similar question: How can I find all placeholders for str.format in a python string using a regex?

Update

It seems to be solvable pretty well with a reasonably complex regexp, so it will make a nice homework task.

I'll accept this as an answer in two days and I don't anticipate any new answers to the question.

Update2

Is the question so localized that will never be useful to anyone else (except maybe those taking the same class)? If so, vote to close.

(from Please clarify the policy on homework questions)

Antony Hatchkins
  • Yes, that's a very nice behavior to vote to close without leaving a comment – Antony Hatchkins Aug 31 '15 at 18:55
  • It appears you're asking for a library, that should explain the close vote – Tim Aug 31 '15 at 18:58
  • @Tim Castelijns Yes, probably. I've carefully reworded the question to avoid such allegations. – Antony Hatchkins Aug 31 '15 at 19:00
  • I would phrase it like this *A regex is unlikely to cover all syntax of %-substitution, so I'm looking for another solution*, removing anything that might look like you're asking for a library – Tim Aug 31 '15 at 19:03
  • @Tim Castelijns Thanks, fixed – Antony Hatchkins Aug 31 '15 at 19:10
  • I'm voting to close this question as off-topic because it looks like a homework. – Antony Hatchkins Aug 31 '15 at 19:49
  • @Tim Castelijns How did he manage to vote to close without leaving a message when the message is inserted automatically! – Antony Hatchkins Aug 31 '15 at 19:51
  • That is only when someone enters a custom reason for closing, or when voting to close as duplicate. Why do you vote to close your own question? – Tim Aug 31 '15 at 19:56
  • @Tim Castelijns Because I've found a solution and don't need new answers since they would potentially spoil a nice homework task for my students. Yet I don't delete it for maybe someone would suggest a non-regex solution in the comments. Would you please vote for closing too? – Antony Hatchkins Aug 31 '15 at 20:09
  • No, that is not how it works. SO is for everybody, not just for your students. If any answers should arise that are better than the regex solution, they are welcome – Tim Aug 31 '15 at 20:11
  • @Tim Castelijns No, that's not how it works. There's a rule that a question that looks like a homework should not be answered. I respect your opinion but I don't share it. – Antony Hatchkins Aug 31 '15 at 20:24
  • Why would you ask a question, knowing it's a bad one, with the purpose of having it closed once **one** half decent solution is given? Also, *There's a rule that a question that looks like a homework should not be answered* could you redirect me to the page that states that rule? That is new information for me – Tim Aug 31 '15 at 20:30
  • @Tim Castelijns Who told you it's a bad question? It's a good one! At the moment I asked it I didn't know that the syntax of % interpolation is _that_ simple for it seemed to me that 300 lines of c code cannot be represented by regex one-liner. Well maybe it is not a rule but rather a policy. The homework policy is an age-old flaw of SO. Earlier there was a dedicated tag, now it is [deprecated](http://meta.stackexchange.com/questions/147100/the-homework-tag-is-now-officially-deprecated). In update2 I quoted the point that is violated in my opinion. – Antony Hatchkins Aug 31 '15 at 21:00
  • I don't think it's a bad question. You implied *you do* when you voted to close it. Anyway, homework related questions can be good questions, too. They just need to be clear and show some effort where possible (which I think this one does). Also, *Is the question so localized that will never be useful to anyone else* I believe this question can be of value to people who are not taking the class. – Tim Aug 31 '15 at 21:05
  • As far as I understand a closed question is not deleted, it is just closed for new answers, which is what I now want :) The bad thing about answering homework questions is that it not only diminishes the effect from doing a homework of the one who asks, but also of the subsequent readers who got the same homework. Ok, I got your point, don't vote if you don't wish to. – Antony Hatchkins Aug 31 '15 at 21:15

2 Answers

0
import re

s = "There're %(num_items)d items in the %(container)s"
# Captures only the mapping keys; the conversion type is not extracted.
print(re.findall(r'%\((.*?)\)', s))
Cody Bouche
0

I ended up with this regexp:

re.findall(r'%\(([^)]+)\)[0-9]*(?:\.[0-9]*)?([diouxXeEfFgGcrs%])', a)

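as a sensible approximation to the problem. It matches 5 of the 7 components of a conversion specifier ('%', mapping key, minimum width, precision and conversion type) and skips the conversion flags and the length modifier.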
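
For readers who also need the two skipped components, the following is a hedged sketch that extends the same idea. It is not the regexp from this answer: the names PLACEHOLDER and find_placeholders are invented for the example, and it assumes mapping keys contain no closing parenthesis. It also consumes '%%' as a token of its own, so an escaped percent sign in front of '(name)s' is not mistaken for a placeholder.

```python
import re

# One named group per component of a % conversion specifier.
PLACEHOLDER = re.compile(
    r"%"
    r"(?:\((?P<key>[^)]*)\))?"          # optional mapping key, e.g. (num_items)
    r"(?P<flags>[-#0 +]*)"              # conversion flags
    r"(?P<width>\*|\d+)?"               # minimum field width, or '*'
    r"(?:\.(?P<precision>\*|\d*))?"     # precision, or '*'
    r"(?P<length>[hlL])?"               # length modifier (ignored by Python)
    r"(?P<type>[diouxXeEfFgGcrsa%])"    # conversion type; '%' covers '%%'
)


def find_placeholders(fmt):
    """Return (mapping key, conversion type) pairs found in fmt, in order."""
    return [
        (m.group("key"), m.group("type"))
        for m in PLACEHOLDER.finditer(fmt)
        if m.group("type") != "%"       # skip literal percent signs
    ]


simple = find_placeholders("There're %(num_items)d items in the %(container)s")
tricky = find_placeholders("%(x)-12.3lf, 100%% sure, %%(y)s")
```

Here simple holds the two pairs asked for in the question, while tricky holds only ('x', 'f'), because '%%(y)s' is an escaped percent sign followed by plain text. Positional specifiers such as '%s' come back with None as the key.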

Antony Hatchkins
  • What's the extra stuff beyond `%\(([^)]+)\)`? – sln Aug 31 '15 at 20:52
  • @sln That's to match `%(x)12.3f`. But I don't want to match `%(x)12.3f` only. I want to match any kind of stuff capable of being interpolated in a string. Plus I've updated the question a little bit: I found out that type information is also useful for me. – Antony Hatchkins Aug 31 '15 at 21:06
  • Oh, sort of like printf –  Aug 31 '15 at 21:18