I need to convert every occurrence of the string "gotcha" in a text file to gotcha[1], gotcha[2], gotcha[3], etc. (in order).

I can do this easily with a simple C++ program but wondered if there's an easier way. Regex-replace in my text editor doesn't appear to be capable. After some surfing, it looks like Perl, sed, or awk might be the right tool, but I'm not familiar with any of those.

anouar.bagari
mathematrucker
  • @user1775603: …and that replace feature utilizes regex for matching (though not necessarily needed here). And he asks *which* language he should use best to do this task :-) – Bergi Apr 27 '13 at 18:51
  • Concur; you can use regex for matching but replacement is not a feature of regular expressions proper, much less generating dynamic replacement strings. – tripleee Apr 27 '13 at 19:05
  • Possible duplicate of http://stackoverflow.com/questions/10892276/regex-replace-with-the-count-of-the-match – Benjamin Toueg Apr 27 '13 at 19:14
  • @btoueg thank you for pointing out that thread. It does ask for the same thing I'm asking for. The responses indicate I'm best off just writing my little C++ program. My main tools besides C++ are Lugaru's Epsilon (a very powerful editor), Bare Bones's BBEdit (strong regex capability), and Apple's FileMaker Pro database app (surprisingly useful for text manipulation). – mathematrucker Apr 27 '13 at 19:36
  • Just figured out a quick way to get what I need with BBEdit: Since the numbers only need to assign a unique key to each match, I first isolate each "gotcha" onto its own line by replacing it with "\rgotcha\r" (\r being carriage return), then apply a BBEdit text factory that inserts line numbers into the file. Then regex-replace to get "gotcha[]" format. Then regex-replace back to the original state of the file. Lastly I need to retrieve the numbers that got used, but that's easily accomplished with a simple find. – mathematrucker Apr 27 '13 at 20:17
  • @Bergi glad I was able to supply you with a chuckle. In my mind the thing that "didn't appear to be capable" wasn't "regex-replace" in general it was "regex-replace in my text editor" which for all I knew might be different from "regex-replace in [Perl, sed, awk]." Regex is so useful and powerful, it still ended up playing an important role in the patchwork solution I ultimately went with, described in my comment above. – mathematrucker Jan 12 '16 at 14:44

4 Answers

In Ruby:

count = 0
# The block runs once per match, left to right; its return value replaces the match.
"gotcha gotcha gotcha".gsub(/gotcha/) { |s| count += 1; "#{s}[#{count}]" }

Output:

 => "gotcha[1] gotcha[2] gotcha[3]"

But this is a very Ruby-specific approach.

Knowing which language you want to use would help in getting a language-specific solution.

vaichidrewar

I don't know if other languages support this, but PHP has the e modifier, which is of course bad practice: it was deprecated in PHP 5.5 and removed in PHP 7.0. So this is only a proof of concept in PHP:

$string = 'gotcha wut gotcha wut gotcha wut gotcha PHP gotcha rocks gotcha !!!'; // the input string
$i = 1; // counter, starting at 1 so the first match becomes gotcha[1]

echo preg_replace('/gotcha/e', '"$0[".$i++."]"', $string);


/*
   + echo --> output the data
         + preg_replace() --> function to replace with a regex
                + /gotcha/e
                   ^^^^^^ ^--- the e modifier (eval)
                   |---------- match "gotcha"

                + '"$0[".$i++."]"'
                  $0   => capturing group 0, which is "gotcha" in this case
                  $i++ => use the current value of $i, then increment it by one
                  Of course, since this is PHP we have to enclose strings
                  in quotes (like in any language :p)
                  and concatenate with a dot:  "$0["   .   $i++   .   "]"

                + $string --> the input string
*/

Online demo.


And of course, since I know there are some haters on SO, I'll show you the right way to do this in PHP without the e modifier: let's use preg_replace_callback!

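$string = 'gotcha wut gotcha wut gotcha wut gotcha PHP gotcha rocks gotcha !!!';
$i = 1; // counter, starting at 1 so the first match becomes gotcha[1]
// This requires PHP 5.3+ (anonymous functions)
echo preg_replace_callback('/gotcha/', function($m) use(&$i){
    // $m[0] is the matched text; $i is captured by reference so it persists across calls
    return $m[0].'['.$i++.']';
}, $string);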

Online demo.

HamZa

In Python, it could be:

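import re

a = "gotcha x gotcha y gotcha z"

g = re.finditer("gotcha", a)

# Work backwards so earlier match offsets stay valid as the string grows.
for i, m in reversed(list(enumerate(g, 1))):  # number matches from 1
    k = m.end()
    a = '{}[{}]{}'.format(a[:k], i, a[k:])

print(a)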

Of course, you can cram it all into a single line (for some higher purpose of saving vertical space).
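
One possible compact form, as a sketch rather than the only way to do it: re.sub accepts a function as the replacement and calls it once per match, left to right, so a counter can supply the numbers. The helper name number_matches and its word parameter are made up here for illustration.

```python
import itertools
import re


def number_matches(text, word="gotcha"):
    """Append [1], [2], ... to each occurrence of `word`, in order.

    `number_matches` and `word` are hypothetical names for this sketch.
    """
    counter = itertools.count(1)  # yields 1, 2, 3, ...
    # re.escape keeps `word` literal; the lambda runs once per match.
    return re.sub(
        re.escape(word),
        lambda m: "{}[{}]".format(m.group(0), next(counter)),
        text,
    )


print(number_matches("gotcha x gotcha y gotcha z"))
# gotcha[1] x gotcha[2] y gotcha[3] z
```

For a text file, the same function could be applied to the file's contents after reading them into a string.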

Jakub M.

In Perl:

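my $text = "gotcha x gotcha y gotcha z";
my $i = 0;  # counter; the first match becomes gotcha[1]

# /g replaces every match; /e evaluates the replacement as Perl code.
$text =~ s/gotcha/$i += 1; "gotcha[$i]"/ge;

print "$text\n";

Jakub M.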