Not sure if this is possible... but I need to find (and replace) all commas within strings in a PHP code file. Something like "[^"]+,[^"]+" comes close, except that it also matches on the wrong side of the strings (where the first quote is the end of one string and the last quote is the beginning of the next). I can run it multiple times to get all the commas, if necessary. I'm trying to use the Find-and-Replace feature in Komodo. This is a one-off job.
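
To make the problem concrete, here is a small Python check of that pattern. The sample PHP line is made up for illustration; its only comma sits between two string literals, yet the pattern still matches because it cannot tell an opening quote from a closing one.

```python
import re

# The naive pattern: a quote, non-quote text containing a comma, then a quote.
pattern = re.compile(r'"[^"]+,[^"]+"')

# Hypothetical PHP snippet: the comma is OUTSIDE both string literals.
php_line = 'f("a" , "b")'

match = pattern.search(php_line)
# The match runs from the quote that closes "a" to the quote that opens "b".
print(match.group())  # prints: " , "
```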

Well, here's my script so far, but it isn't working right. It worked on a small test file, but on the full file it's replacing commas outside of strings. Bah.

import sys, re

pattern = ','
replace = '~'

in_str = ''
out_str = ''
quote = None
in_file = open('infile.php', 'r')
out_file = open('outfile.php', 'w')
is_escaped = False # ...

while 1:
    ch = in_file.read(1)
    if not ch: break

    if ch in ('"',"'"):
        if quote is None:
            quote = ch
        elif quote == ch:
            quote = None

            out_file.write(out_str)
            out_file.write(re.sub(pattern,replace,in_str))
            in_str = ''
            out_str = ''

    if ch != quote and quote is not None:
        in_str += ch
    else:
        out_str += ch


out_file.write(out_str)
out_file.write(in_str)

in_file.close()
out_file.close()
mpen
  • A similar [SO question](http://stackoverflow.com/questions/249791/regexp-for-quoted-string-with-escaping-quotes) - this is not an easy task with regex. – nevets1219 Apr 07 '11 at 00:31
  • Sounds like parsing a comma to me. Shouldn't a split() be enough? No, I didn't think so. Oh well, you need a real expert in regular expressions, because your requirements will not stop with this, will they? –  Apr 07 '11 at 01:20
  • @sln: No, split most certainly would not be enough, because that won't search inside string literals only. Do you even understand what I'm asking? I'm actually pretty proficient with regexes, but they're not very well-suited for this kind of problem... – mpen Apr 07 '11 at 06:28

1 Answer

I take it you're trying to find string literals in the PHP code (i.e. places in the code where someone has specified a string between quote marks: $somevar = "somevalue"; )

In this case, it may be easier to write a short piece of parsing code than a regex, since a regex has a hard time distinguishing the quote marks that begin a string literal from the quote marks that end it.

Some pseudocode:

inquote = false
while (!eof)
    c = get_next_character()
    if (c == QUOTE_MARK)
        inquote = !inquote
    if (c == COMMA)
        if (inquote)
            delete_current_character()
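
The pseudocode above could be sketched in Python roughly as follows. This is only an illustration under some assumptions that go beyond the pseudocode: it tracks which quote character opened the literal (so an apostrophe inside a double-quoted string does not toggle the state), it treats a backslash as escaping the next character, and it replaces the comma with `~` as in the question's script rather than deleting it. The function name `replace_in_strings` is made up for this example. It does not understand PHP comments, heredocs, or nowdocs, so a quote mark in a comment would still throw it off.

```python
def replace_in_strings(source, target=",", replacement="~"):
    """Replace `target` only inside '...' or "..." literals of `source`."""
    out = []
    quote = None     # quote character that opened the current literal, or None
    escaped = False  # True if the previous character was an unescaped backslash

    for ch in source:
        if quote is None:
            # Outside any literal: copy verbatim, watch for an opening quote.
            if ch in ('"', "'"):
                quote = ch
            out.append(ch)
        elif escaped:
            # Character after a backslash never closes the literal.
            escaped = False
            out.append(replacement if ch == target else ch)
        elif ch == "\\":
            escaped = True
            out.append(ch)
        elif ch == quote:
            # Matching, unescaped quote: the literal ends here.
            quote = None
            out.append(ch)
        else:
            out.append(replacement if ch == target else ch)

    return "".join(out)


sample = 'f("a,b", $x, \'c,d\');'
print(replace_in_strings(sample))
```

For the one-off job in the question, the same function could be applied to the whole contents of infile.php and the result written to outfile.php.
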
Nick W.
  • Yes, string literals are what I want... bah. Then I have to do file read/write garbage.... maybe I'll write a Python script... – mpen Apr 06 '11 at 23:59
  • @Mark - Time intensive. When are you going to WRITE regexes instead of asking for them, dude? –  Apr 07 '11 at 01:23
  • @sln: What are you talking about? – mpen Apr 07 '11 at 06:23