0

I am trying to solve a problem using Python. The problem: Given a string like 'a b c d', replace the space with '_' for all combinations. So here, the expected output would be: 'a_b c d', 'a_b_c d', 'a_b_c_d', 'a b_c d', 'a b_c_d', 'a b c_d'

I am using a sliding window approach for this, here's my code:

ip = 'a b c d'
org = ip
res = []

for i in range(len(ip)):
    if ip[i] == ' ': i += 1
    for j in range(i + 1, len(ip)):
        if ip[j] == ' ':
            ip = ip[:j] + '_' + ip[j+1:]
            res.append(ip)
        j += 1
    i += 1
    ip = org

The problem is, the 2nd for loop is looping twice and appending duplicate results.

Result: ['a_b c d', 'a_b_c d', 'a_b_c_d', 'a b_c d', 'a b_c_d', 'a b_c d', 'a b_c_d', 'a b c_d', 'a b c_d']

I am not able to figure why this is happening, would appreciate any help.

Thanks!

FenderBender
  • 303
  • 3
  • 16
  • 3
    `ip = org` is not creating a copy. You may find it helpful to read through [this](https://nedbatchelder.com/text/names.html). – Axe319 Jan 10 '23 at 13:51
  • You are modifying `ip` in the 2nd loop while looping over it in both loops. This can lead to unexpected behaviour. Ideally use a different variable name here to avoid confusion. – match Jan 10 '23 at 13:51
  • 1
    Unless I'm missing something, isn't `a b c d`, `a_b c d`, `a b_c d`, `a_b_c d`, `a b c_d`, `a_b c_d`, `a b_c_d`, `a_b_c_d` the full list of combinations? – AKX Jan 10 '23 at 13:59
  • @AKX there are duplicates – mozway Jan 10 '23 at 14:01
  • 1
    Also, this `if ip[i] == ' ': i += 1` doesn't advance your `for` loop. What you want is something like `if ip[i] == ' ': continue`. – Axe319 Jan 10 '23 at 14:04
  • 1
    How is your expected output just 6 elements? – Talha Tayyab Jan 10 '23 at 14:05
  • @Axe319, thanks so much! `if ip[i] == ' ': i += 1` was the culprit. – FenderBender Jan 10 '23 at 14:06
  • @mozway What duplicates? Your solution gives the same list as what I posted there. – AKX Jan 10 '23 at 14:08
  • @AKX I meant in the output from OP's code – mozway Jan 10 '23 at 14:09

4 Answers4

1

Beyond itertools.product, there's a neat binary trick you can do here since you're essentially flipping "bits" at given indices in the string:

ip = 'a b c d'

# Figure out where the spaces are.
space_indices = [i for i, x in enumerate(ip) if x == ' ']
# Loop over all numbers from 0 (no bits set) to 2^N-1 (all bits set)
for x in range(1 << len(space_indices)):
    new_ip = list(ip)
    for i, pos in enumerate(space_indices):  # loop over every "bit"
        if x & (1 << i):  # if the i-th bit is set...
            new_ip[pos] = '_'  # ... replace with an underscore
    print("".join(new_ip))

This prints out

a b c d
a_b c d
a b_c d
a_b_c d
a b c_d
a_b c_d
a b_c_d
a_b_c_d

For a reverse order, you can switch to if not.

The loop body could also be

    # Figure out which positions to replace with underscores
    # by looking at which bits are set in `x`.
    to_replace = {pos for i, pos in enumerate(space_indices) if x & (1 << i)}
    # Print out a string with some of the characters replaced.
    print("".join("_" if pos in to_replace else c for pos, c in enumerate(ip)))

if you like a more functional approach.

AKX
  • 152,115
  • 15
  • 115
  • 172
0

I would use itertools.product here:

from itertools import product, chain

ip = 'a b c d'
words = ip.split()

res = [''.join(chain(*zip(words, p), words[-1]))
       for p in product([' ', '_'], repeat=len(words)-1)]

Output:

['a b c d',
 'a b c_d',
 'a b_c d',
 'a b_c_d',
 'a_b c d',
 'a_b c_d',
 'a_b_c d',
 'a_b_c_d']
mozway
  • 194,879
  • 13
  • 39
  • 75
  • the order differs from the expected output – RomanPerekhrest Jan 10 '23 at 14:03
  • @RomanPerekhrest there are several ways to change the order if this is important ([example](https://stackoverflow.com/questions/9903257/python-itertools-product-reorder-the-generation)). I'll try to update with a simple approach – mozway Jan 10 '23 at 14:05
0

Thanks so much for your help everyone, especially @Axe319. The issue was with the second line in the for loop, here's the fixed solution:

ip = 'a b c d'
org = ip
res = []

for i in range(len(ip)):
    if ip[i] == ' ': continue
    for j in range(i + 1, len(ip)):
        if ip[j] == ' ':
            ip = ip[:j] + '_' + ip[j+1:]
            res.append(ip)
        j += 1
    i += 1
    ip = org
FenderBender
  • 303
  • 3
  • 16
0

With str.find and re.sub functions (with specifying the maximum number of pattern occurrences to be replaced):

import re

ip = 'a b c d'
sep_cnt = ip.count(' ')  # count number of spaces
res = []
sep_pos = ip.find(' ')  # get separator position
for i in range(sep_cnt):
    for j in range(sep_cnt - i):  # countdown replacements
        res.append(ip[:sep_pos] + re.sub(r' ', '_', ip[sep_pos:], j + 1))
    sep_pos = ip.find(' ', sep_pos + 1)

print(res)

['a_b c d', 'a_b_c d', 'a_b_c_d', 'a b_c d', 'a b_c_d', 'a b c_d']
RomanPerekhrest
  • 88,541
  • 4
  • 65
  • 105