1

I have the following text:

SendNoticeMsg (api.post = "/test/SendNoticeMsg")
GenerateMsg (api.post = "/test/GenerateMsg")
GetUserLastAction (api.post = "/test/GetUserLastAction")

And I want to change the text to be:

SendNoticeMsg (api.post = "/test/send_notice_msg")
GenerateMsg (api.post = "/test/generate_msg")
GetUserLastAction (api.post = "/test/get_user_last_action")

Description: I only want to convert the URL path to underscore (snake_case) style, so the solution shouldn't change any other characters.

I tried this sed script:

sed -E 's/(\/test\/.*)([A-Z]).*\"/\1\2_\L/'

But it doesn't work.

zdim
  • Could you please state all the conditions under which capital letters should be changed to lowercase? It's not clear as of now; kindly add more details to your question for better understanding. – RavinderSingh13 Dec 21 '20 at 07:15
  • Thank you for your suggestions. I just want to change the capital letters in the URL. – user2463906 Dec 21 '20 at 08:00
  • The question quoted to close this one as a dupe was indeed asked (5 hours) before this one -- but that one has 3 (three) answers with 1+1 votes (nothing accepted) while this one has 8 (eight) answers with 2+2+1+3 votes. (This question also implements one request on the previous one, to include more test cases.) It shouldn't have been a new question, I agree, but now that we have them both I don't see that _this one_ should be closed on account of the other one. Voting to reopen. – zdim Dec 21 '20 at 22:40
  • This question is essentially identical to this one except for the joinery character. https://stackoverflow.com/questions/65298102/is-it-possible-to-rename-pascalcase1-wav-to-kebab-case-1-wav-with-a-single-perl/65298186 – lordadmira Dec 22 '20 at 00:58
  • Here is the link to that [other question](https://stackoverflow.com/q/65386733/4653379), the same as this one, with some good answers as well – zdim Dec 22 '20 at 01:37

9 Answers

4
perl -wnE'
    @p = m{(.*/)(.*)"};                       # break up into parts
    @w = $p[-1] =~ /([A-Z][a-z0-9]*)/g;       # extract (PascalCase-ed) words
    $p[-1] = join("_", map { lc } @w).q{")};  # low-case them, join with _ 
    say @p
' input.txt

Or overwrite input "in-place" by changing switches to perl -i.bak -wnE'...'

This assumes that the words, after the initial, can only have [a-z0-9]; adjust if needed.
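
For readers who want to see the same steps outside Perl, here is a minimal Python sketch of the approach above: split off the last path segment, extract the PascalCase words, then lowercase and join them. The function name snake_last_segment is hypothetical, and the sketch makes the same assumption that a word contains only [a-z0-9] after its initial capital.

```python
import re

def snake_last_segment(line):
    """Rewrite the last path segment before a double quote in snake_case."""
    # head: everything through the last "/", name: the segment, tail: the rest
    m = re.match(r'(.*/)([^"]*)(".*)', line, flags=re.DOTALL)
    if m is None:
        return line  # no quoted path on this line; leave it unchanged
    head, name, tail = m.groups()
    # Extract the PascalCase words, lowercase them, join with "_"
    words = re.findall(r'[A-Z][a-z0-9]*', name)
    return head + "_".join(w.lower() for w in words) + tail

print(snake_last_segment('GenerateMsg (api.post = "/test/GenerateMsg")'))
```

Unlike the one-liner, this keeps whatever follows the closing quote instead of hard-coding it.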

zdim
3

Another perl solution:

$ perl -pe '$i=0; s/(?=\/test\/)(\S+)/$s=$1;$s=~s!([A-Z])!$i++?"_".lc($1):lc($1)!ge;$s/ge ' underscore2.txt
SendNoticeMsg (api.post = "/test/send_notice_msg")
GenerateMsg (api.post = "/test/generate_msg")
GetUserLastAction (api.post = "/test/get_user_last_action")
$

Generalizing further, so that any first path segment is accepted instead of the literal test:

$ perl -pe '$i=0; s/(?=\/[^\/]+\/)(\S+)/$s=$1;$s=~s!([A-Z])!$i++?"_".lc($1):lc($1)!ge;$s/ge ' underscore2.txt
SendNoticeMsg (api.post = "/test/send_notice_msg")
GenerateMsg (api.post = "/test/generate_msg")
GetUserLastAction (api.post = "/test/get_user_last_action")
$
stack0114106
  • 3
    Please don't ping others to request a review of something you posted. – tripleee Dec 21 '20 at 07:25
  • @tripleee.. ok.. noted.. is it wrong to get your answers reviewed by experts.. – stack0114106 Dec 21 '20 at 07:31
  • By posting an answer you are already putting yourself up for a review. – tripleee Dec 21 '20 at 07:32
  • @tripleee Good point in principle, but the comment (and my response) is a bit personal between us; we've traded commentary of far removed questions and/or answers of each other a number of times in the past. It's a "hello" of sorts between us – zdim Dec 21 '20 at 07:34
  • 2
    @tripleee "_By posting an answer you are already putting yourself up for a review._" --- Most of the time people actually don't bother to review, and certainly not comment in detail (I suppose out of courtesy really, but still, the comments don't come) – zdim Dec 21 '20 at 07:39
  • @stack0114106 (1) I'd use a generic word (like `[^/]+` or such) instead of literal `test` (2) why do you need the `m` modifier in the end? It's for multi-line strings etc ... can't come up here? (The given example works without it, tested) – zdim Dec 21 '20 at 07:43
  • @zdim.. yes.. I'll change that.. Im also looking at ikegami's answer.. he made it shorter as well.. – stack0114106 Dec 21 '20 at 07:48
  • @tripleee.. I see that you are idealistic.. not realistic.. not all my answers are reviewed by the co-answerers. being a senior in SO you should encourage such knowledge conversations.. – stack0114106 Dec 21 '20 at 07:50
  • 1
    If you are looking for code reviews, there is [a separate site for them.](https://codereview.stackexchange.com) You don't get _immediate_ feedback but if something is seriously wrong someone will comment, and if nothing is seriously wrong you will slowly receive upvotes. – tripleee Dec 21 '20 at 07:51
  • `$i=0;` isn't needed, even with warnings! – ikegami Dec 21 '20 at 08:16
  • `$s = $1; $s =~ s!...!...!ge; $s` => `$1 =~ s!...!...!ger` – ikegami Dec 21 '20 at 08:17
  • `$1 =~ s!([A-Z])!$i++ ? "_".lc($1) : lc($1)!ger` => `lc( $1 =~ s!(?=[A-Z])! $i++ ? "_" : "" !ger )` – ikegami Dec 21 '20 at 08:19
  • @ikegami.. `lc( $1=~s!([A-Z])!$i++?"_":""!ger )` strips the first char.. – stack0114106 Dec 21 '20 at 08:20
3
perl -pe's{/test/\K[^"]*}{ lc( $& =~ s/\w\K(?=\p{Lu})/_/gr ) }e'

or even

perl -pe's{/test/\K[^"]*}{ lc( $& =~ s/\B(?=\p{Lu})/_/gr ) }e'

We extract the path, and replace it with the string constructed by the replacement expression. We construct that string by inserting underscores in the appropriate places (between word characters and uppercase letters), then lowercasing the result.
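
The same idea can be sketched in Python for comparison. This is only an illustration under stated assumptions: the function name snake_path and its prefix parameter are hypothetical, a fixed-width lookbehind stands in for Perl's \K, and [A-Z] stands in for \p{Lu} because Python's re module does not support \p{...}, so only ASCII capitals are handled.

```python
import re

def snake_path(line, prefix="/test/"):
    """Snake-case whatever follows `prefix`, up to the next double quote."""
    # Lookbehind keeps the prefix out of the match (the role \K plays in Perl)
    pattern = "(?<=" + re.escape(prefix) + r')[^"]*'

    def fix(m):
        # Insert "_" at every non-boundary position that precedes a capital,
        # then lowercase the whole path
        return re.sub(r'\B(?=[A-Z])', '_', m.group(0)).lower()

    return re.sub(pattern, fix, line)

print(snake_path('SendNoticeMsg (api.post = "/test/SendNoticeMsg")'))
```

Because \B never matches at the start of the path, no underscore is inserted before the first capital.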

See Specifying file to process to Perl one-liner.

ikegami
  • $& is a read-only variable.. right?.. how you are able to assign it in the inner substitution? – stack0114106 Dec 21 '20 at 07:37
  • @stack0114106, `s///r` returns the resulting string instead of modifying the var. So `perl -E'say "abc" =~ s/b/B/r'` can be used instead of `perl -E'my $tmp = "abc"; $tmp =~ s/b/B/; say $tmp;'` – ikegami Dec 21 '20 at 07:44
  • the "r" was vague to me..able to grasp now.. so `r` can be used only for the first substitution?.. – stack0114106 Dec 21 '20 at 08:01
  • (piece-a-magic:) .. that lookahead is a great solution for where to not-have `_` – zdim Dec 21 '20 at 08:05
  • you posted a lengthy solution in the beginning and I was able to comprehend it.. now it is sort of difficult. – stack0114106 Dec 21 '20 at 08:06
  • @stack0114106, The tricky part is `\w\K(?=\p{Lu})`, which matches a word character followed by an uppercase letter. We `K`eep the word character, resulting in a 0-width match. We then replace that 0-width string between the word char and the uc letter with `_`. – ikegami Dec 21 '20 at 08:12
  • @ikegami (did you add parens for readability? it worked for me without that and I thought `=~` binds strong enough in any case) – zdim Dec 21 '20 at 08:17
  • @zdim, Re "*Did you add parens for readability?*", Yes, precisely. It felt very icky to leave them out from the start. – ikegami Dec 21 '20 at 08:23
  • @stack0114106, Re "*so `r` can be used only for the first substitution?*", Sorry, just noticed this comment. I'm not sure what you're asking. Maybe you should play with it and post a question if you have more problems? – ikegami Dec 21 '20 at 08:30
  • @ikegami.. `lc( $1=~s!(?=\w\k[A-Z])!_!gr )` is throwing error.. – stack0114106 Dec 21 '20 at 08:31
  • @stack0114106 That's not the code I used. It should be an uppercase `K`. Also, you put the `(?=...)` at the wrong place. You want to insert the `_` after the \w, not before. This will also fix the nonsense use of `\K` inside of `(?=...)` – ikegami Dec 21 '20 at 08:32
  • @ikegami.. can you try answering https://stackoverflow.com/questions/65345478/moving-sql-logic-to-backend-bash – stack0114106 Dec 21 '20 at 11:34
  • I don't know awk. If the data is sorted, it's just a question of remembering the last value seen. If the data isn't sorted, then sort it. – ikegami Dec 21 '20 at 11:38
1

I'll add one more sed to the mix, a slight variation on the answer by @tripleee using # as an alternate substitution delimiter, e.g.

sed -E ':a;s#^(.*/)([^A-Z]*)([A-Z])#\1\L\2_\3#;ta;s#/_#/#'

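The ^(.*/) allows a greedy match of all characters from the beginning of each line through the last '/'. Each pass of the loop puts a _ in front of the next capital after that slash and lowercases it; the final s#/_#/# removes the underscore left directly after the slash.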

Example Use/Output

$ sed -E ':a;s#^(.*/)([^A-Z]*)([A-Z])#\1\L\2_\3#;ta;s#/_#/#' file
SendNoticeMsg (api.post = "/test/send_notice_msg")
GenerateMsg (api.post = "/test/generate_msg")
GetUserLastAction (api.post = "/test/get_user_last_action")

(note: if not using GNU sed you will need to replace each ';' with -e to mark the separate expressions)

If for some reason your sed doesn't support ERE, then the "picket fence" like BRE would be:

sed ':a;s#^\(.*/\)\([^A-Z]*\)\([A-Z]\)#\1\L\2_\3#;ta;s#/_#/#'
David C. Rankin
0

Here's a Perl oneliner:

perl -pe 'sub to_snake { my $v = $_[0]; $v =~ s/([a-z])([A-Z])/$1_$2/g; return lc $v; } s@(/test/)(\w+)@$1.to_snake $2@e'

Note that you can change \w here to suit your needs, but it should be sufficient for most purposes.

Breaking it down:

Perl's -p switch loops over the input line by line, runs the program on each line, and prints the line to standard output; -e allows the program to be specified as a string on the command line.

This part does the substitution on each line; it uses the e modifier to execute code:

s@(/test/)(\w+)@$1.to_snake $2@e

This part defines the conversion function; it matches a lowercase letter preceding an uppercase letter, adds an underscore in between them and subsequently maps everything to lowercase using Perl's lc function:

sub to_snake { my $v = $_[0]; $v =~ s/([a-z])([A-Z])/$1_$2/g; return lc $v; }
ikegami
costaparas
  • `my $v = $_[0]; $v =~ s/([a-z])([A-Z])/$1_$2/g; return lc $v;` => `lc( $_[0] =~ s/([a-z])([A-Z])/$1_$2/rg )` => `lc( $_[0] =~ s/[a-z]\K[A-Z]/_$&/rg )` – ikegami Dec 21 '20 at 07:47
0

Your simple sed script performs a single replacement. You want to add a loop to make it match and replace as many times as necessary.

sed -E -e ':a' -e 's/(\/test\/.*)([A-Z])(.*\")/\1_\L\2\3/' -e 'ta' -e 's%/_%/%'

The :a creates a label a, and ta branches back to that label if the latest substitution succeeded. (I'm blindly assuming the other parts of your script are correct for you, though I had to refactor it a bit and add a final substitution to fix up /test/_whatever. Note that the -E option and the \L escape for lowercasing are by no means universal or portable.)
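
To make the label-and-branch loop concrete, here is a rough Python sketch of the same control flow: apply one substitution, and repeat for as long as it changed something. The function name snake_by_looping is hypothetical, and the sketch lowercases only the matched capital because the text after it can contain no further capitals once the greedy prefix has matched.

```python
import re

# One pass handles the LAST remaining capital between /test/ and the final quote
STEP = re.compile(r'(/test/.*)([A-Z])(.*")')

def snake_by_looping(line):
    while True:
        # Equivalent of the s/// command: at most one replacement per pass
        line, changed = STEP.subn(
            lambda m: m.group(1) + "_" + m.group(2).lower() + m.group(3),
            line,
            count=1,
        )
        if changed == 0:  # equivalent of "ta" not branching: nothing was replaced
            break
    # Equivalent of the final s%/_%/%: drop the underscore right after the slash
    return line.replace("/_", "/", 1)

print(snake_by_looping('GenerateMsg (api.post = "/test/GenerateMsg")'))
```

The capitals are processed right to left, which is why the leading capital ends up with an underscore that the last step has to remove.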

tripleee
0

It seems all the answers use perl so here's one using awk

awk -F '"' '{
    gsub(/([^\/][A-Z])/," & ", $2);
    n=split($2,c," ");
    s=""
    for (i=1; i<=n; i++) {
        s = (i%2==1) ? (s c[i]) : (s substr(c[i],1,1) "_" substr(c[i], 2, 1));
    }
    $2=tolower(s);
    print $1 "\"" $2 "\"" $3;
}' input.txt

This assumes that neither capturing groups nor gensub are available, so it separates the groups with spaces and inserts the _ afterwards.

Diego Torres Milano
0

Assuming that you have 0 or 2 " per line and are interested only in altering what is between them, you might use AWK in the following way. Let the content of file.txt be

SendNoticeMsg (api.post = "/test/SendNoticeMsg")
GenerateMsg (api.post = "/test/GenerateMsg")
GetUserLastAction (api.post = "/test/GetUserLastAction")

then

awk 'BEGIN{FS=OFS="\""}{while(match($2,/[a-z][A-Z]/)){$2 = substr($2, 1, RSTART) "_" substr($2, RSTART+1)};$2=tolower($2);print}' file.txt

output

SendNoticeMsg (api.post = "/test/send_notice_msg")
GenerateMsg (api.post = "/test/generate_msg")
GetUserLastAction (api.post = "/test/get_user_last_action")

Explanation: I set both the field separator and the output field separator to ", then alter $2 (the text inside the quotes) as follows: as long as there is a lowercase letter followed by an uppercase letter, insert _ between them. Finally, I change the whole of $2 to lowercase. (Tested in gawk 4.2.1.)
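
The field-splitting idea carries over to other languages as well. Below is a small Python sketch of it, assuming (as above) exactly 0 or 2 double quotes per line; the function name snake_quoted_field is hypothetical. A single re.sub pass replaces the while loop, since matches of a lowercase letter followed by an uppercase letter cannot overlap.

```python
import re

def snake_quoted_field(line):
    """Snake-case only the text between the two double quotes."""
    fields = line.split('"')  # plays the role of FS = "\""
    if len(fields) == 3:      # exactly two quotes on the line
        # Insert "_" between a lowercase and an uppercase letter, then lowercase
        fields[1] = re.sub(r'([a-z])([A-Z])', r'\1_\2', fields[1]).lower()
    return '"'.join(fields)   # plays the role of OFS = "\""

print(snake_quoted_field('GetUserLastAction (api.post = "/test/GetUserLastAction")'))
```

Lines without a quoted field pass through unchanged.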

Daweo
-1

/in before lock

perl -pe '$_ = lc join "_", split /(?<=[^A-Z])(?=[A-Z])/;'

my $s = "Foo1BarXBazX";
print $s =~ s/^([A-Z])|(?<=[^A-Z])([A-Z]+)/defined $1 ? "\L$1" : "_\L$2"; /egr;
"foo1_bar_xbaz_x"


lordadmira