Python: Replace lowercase with uppercase, and vice versa, simultaneously via the regex library

Question

I am opening this question even though a similar one exists for C#, because that one does not contain the solution I need.

I'm trying to replace every lowercase character in a given string with its uppercase counterpart, and vice versa. This should be done simultaneously and with the lowest possible time complexity, because it will be used on a high volume of verbal translations.

The IO:

input: str_1 = "Www.GooGle.com"

output: "wWW.gOOgLE.COM"

The code:

import re                   # import RegEx lib

str_1 = "Www.GooGle.com"    # variable tested

def swapLowerUpper(source):
    # takes source string
    # returns group 0 regex to lower and group 1 regex to upper, by source
    return re.sub(r'([A-Z]+)|([a-z]+)', lambda x: x.group(0).lower(), source)

# check the output
print(swapLowerUpper(str_1))

The Question:

I am having a hard time triggering the second group (whose index is 1) and applying ".upper()" to it. My attempt was to write the replacement as {x: x.group(0).lower(), x: x.group(1).upper()}, which failed.

Andreas Violaris · Accepted Answer · 2023-04-20T21:54:31.327

If your goal is to accomplish the task with the least amount of time complexity, perhaps it would be more efficient to utilize a Python built-in method rather than relying on regular expressions:

str_1 = "Www.GooGle.com"
print(str_1.swapcase())
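
One behavioural difference is worth checking before swapping the regex out: the `[A-Z]` and `[a-z]` classes only touch ASCII letters, whereas `str.swapcase()` follows Unicode case rules. A minimal sketch, where the helper name `swap_ascii_regex` and the sample string "Ärger" are illustrative assumptions rather than part of the question:

```python
import re


def swap_ascii_regex(source):
    # Illustrative helper: swaps case for ASCII letters only.
    return re.sub(
        r'([A-Z]+)|([a-z]+)',
        lambda m: m.group(1).lower() if m.group(1) else m.group(2).upper(),
        source,
    )


ascii_text = "Www.GooGle.com"
mixed_text = "Ärger"  # starts with a non-ASCII uppercase letter

# For pure ASCII input the two approaches agree.
print(swap_ascii_regex(ascii_text) == ascii_text.swapcase())  # True

# For non-ASCII letters they differ: the regex leaves "Ä" untouched.
print(swap_ascii_regex(mixed_text))  # ÄRGER
print(mixed_text.swapcase())         # äRGER
```

If the input is guaranteed to be ASCII, the two are interchangeable; otherwise the choice depends on whether non-ASCII letters should be swapped too.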

Efficiency comparison using a randomly generated mixalpha string with a length of 1000 characters over 5000 iterations:

import timeit
import re
import random
import string


def swapLowerUpper_regex(source):
    return re.sub(r'([A-Z]+)|([a-z]+)', lambda x: x.group(1).lower() if x.group(1) else x.group(2).upper(), source)


def swapLowerUpper_swapcase(source):
    return source.swapcase()


# Generate a random mixalpha string with length 1000
word = ''.join(random.choices(string.ascii_letters, k=1000))

# Define the number of iterations to run
num_iterations = 5000

# Time the execution of the swapLowerUpper() function using regular expressions
elapsed_times_regex = []
for i in range(num_iterations):
    start_time = timeit.default_timer()
    swapLowerUpper_regex(word)
    elapsed_time_regex = timeit.default_timer() - start_time
    elapsed_times_regex.append(elapsed_time_regex)

# Time the execution of the swapcase() method
elapsed_times_swapcase = []
for i in range(num_iterations):
    start_time = timeit.default_timer()
    swapLowerUpper_swapcase(word)
    elapsed_time_swapcase = timeit.default_timer() - start_time
    elapsed_times_swapcase.append(elapsed_time_swapcase)

# Compute the average elapsed time of each method
avg_elapsed_time_regex = sum(elapsed_times_regex) / len(elapsed_times_regex)
avg_elapsed_time_swapcase = sum(elapsed_times_swapcase) / len(elapsed_times_swapcase)

# Print the average elapsed times
print(f"Average elapsed time using swapLowerUpper_regex(): {avg_elapsed_time_regex:.10f} seconds")
print(f"Average elapsed time using swapLowerUpper_swapcase(): {avg_elapsed_time_swapcase:.10f} seconds")

Output:

Average elapsed time using swapLowerUpper_regex(): 0.0002279637 seconds
Average elapsed time using swapLowerUpper_swapcase(): 0.0000108802 seconds
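
The standard library's `timeit.timeit()` can run the same comparison with less bookkeeping, since it handles the loop and the clock itself. A minimal sketch reusing the two functions from the benchmark; the iteration count is reduced here only to keep the run short (an assumption, not the setting used for the figures above), and absolute numbers will vary by machine:

```python
import random
import re
import string
import timeit


def swapLowerUpper_regex(source):
    return re.sub(
        r'([A-Z]+)|([a-z]+)',
        lambda x: x.group(1).lower() if x.group(1) else x.group(2).upper(),
        source,
    )


def swapLowerUpper_swapcase(source):
    return source.swapcase()


word = ''.join(random.choices(string.ascii_letters, k=1000))
num_iterations = 500  # assumption: smaller than the 5000 iterations used above

# timeit.timeit() accepts a callable and returns the total time for `number` calls
total_regex = timeit.timeit(lambda: swapLowerUpper_regex(word), number=num_iterations)
total_swapcase = timeit.timeit(lambda: swapLowerUpper_swapcase(word), number=num_iterations)

print(f"regex:    {total_regex / num_iterations:.10f} seconds per call")
print(f"swapcase: {total_swapcase / num_iterations:.10f} seconds per call")
```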

score 2 · Answer 2 · answered Apr 20 '23 at 20:48

Check whether x.group(1) or x.group(2) was matched, and return the appropriate replacement.

def swapLowerUpper(source):
    return re.sub(r'([A-Z]+)|([a-z]+)', lambda x: x.group(1).lower() if x.group(1) else x.group(2).upper(), source)
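
To see why the check works, it can help to print what each match object holds. In the sketch below (the loop and the `pattern` variable are illustrative, not part of the original answer), `group(0)` is always the whole match, while the capture group that did not take part in the match returns `None`:

```python
import re

pattern = re.compile(r'([A-Z]+)|([a-z]+)')

for m in pattern.finditer("Www.GooGle.com"):
    # group(0): entire match; group(1): uppercase run; group(2): lowercase run
    print(repr(m.group(0)), repr(m.group(1)), repr(m.group(2)), m.lastindex)

# First two lines printed:
# 'W' 'W' None 1
# 'ww' None 'ww' 2
```

`m.lastindex` holds the index of the last capture group that matched, so it is another way to decide which branch to take.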

Barmar


I'm very thankful to you, sir. Your answer resolved my function's issue and also helped me better understand how to incorporate logic in a lambda used with a regular expression. – Mabadai Apr 20 '23 at 20:58
Unless this was an exercise to learn regular expressions, you should accept the answer that uses `swapcase()`; it's the best way to do it. – Barmar Apr 20 '23 at 20:59
It works great for short strings, so I can accept the swapcase() answer, although my concern is about huge amounts of input data. – Mabadai Apr 20 '23 at 21:08
That's the point. `swapcase()` will be most efficient for large amounts. – Barmar Apr 20 '23 at 21:10
I'm sorry, sir, I don't understand. From what I know, swapcase() is a built-in method that iterates over each character in a string and applies the appropriate logic to it. Regex, on the other hand, reduces each condition (lower or upper) to a finite state and handles them simultaneously, with the changes applied. Could you point me to some references where I can read more about the math behind both approaches? Thanks in advance. – Mabadai Apr 20 '23 at 21:19
`swapcase()` is written in simple C code. Regular expressions are more expensive, and the callback function in `re.sub()` adds a lot of overhead. See the performance differences in CristiFati's answer. – Barmar Apr 20 '23 at 21:24
CristiFati has already provided the relevant metrics, but since you expressed concerns about handling large amounts of input data, I have updated my answer to demonstrate how each method performs under such conditions. – Andreas Violaris Apr 20 '23 at 21:58
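
One way to make the callback overhead discussed in these comments visible is to count how often `re.sub()` calls back into Python. A rough sketch, where the `calls` counter, the helper name `counting_swap`, and the test string are assumptions added for illustration:

```python
import re

calls = 0


def counting_swap(m):
    # Runs once per match, i.e. once per run of same-case letters.
    global calls
    calls += 1
    return m.group(1).lower() if m.group(1) else m.group(2).upper()


text = "aBcDeFgH" * 100  # worst case: every letter starts a new run
result = re.sub(r'([A-Z]+)|([a-z]+)', counting_swap, text)

print(calls)                      # 800
print(result == text.swapcase())  # True
```

For this 800-character input the regex version makes 800 Python-level function calls, whereas `text.swapcase()` is a single method call that, in CPython, does its work in C.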

score 1 · Answer 3 · answered Apr 20 '23 at 20:47

Try:

import re

str_1 = "Www.GooGle.com"

str_1 = re.sub(
    r"([a-z]+)|([A-Z]+)",
    lambda g: g.group(1).upper() if g.group(1) else g.group(2).lower(),
    str_1,
)
print(str_1)

Prints:

wWW.gOOgLE.COM
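
A variation on the same idea, offered only as a sketch: since `str.swapcase()` already knows which way to flip each letter, the pattern does not need two groups. The function name `swap_ascii_letters` is hypothetical:

```python
import re


def swap_ascii_letters(source):
    # Match any run of ASCII letters and let swapcase() flip each character.
    return re.sub(r'[A-Za-z]+', lambda m: m.group(0).swapcase(), source)


print(swap_ascii_letters("Www.GooGle.com"))  # wWW.gOOgLE.COM
```

This keeps the substitution restricted to ASCII letters while avoiding the branch inside the lambda. When no such restriction is needed, calling `swapcase()` on the whole string remains simpler.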

Andrej Kesely

