0

I want to do multiple string replacements in Python.

I have a dictionary:

my_dict = {'Can I have some roti and aloo gobhi ?': 
              {'roti': ['pulka', 'butter kp', 'wheat parota', 'chapati',
                        'gobi parota', 'onion parota', 'paneer parota',
                        'kerala parota', 'aloo parota', 'plain naan',
                        'butter naan', 'garlic naan', 'plain kulcha',
                        'butter kulcha', 'lacha parota', 'tandoori roti',
                        'tandoori butter roti', 'roti'],
               'aloo gobhi': ['paneer butter masala', 'palak paneer', 
                              'kadai paneer', 'hydrabadi paneer', 
                              'kadai gobi', 'aloo gobi', 'aloo mattar', 
                              'mix veg curry', 'baby corn masala', 
                              'dal fry', 'palak dal', 'dal tadka', 
                              'mushroom masala', 'gobi masala', 
                              'paneer tikka masala', 
                              'mushroom tikka masala', 'aloo gobhi']
              }
          }

It has a sentence as its key and another dictionary as its value. In the inner dictionary, each key is a phrase to replace in the sentence, and the corresponding value is a list of possible replacements. I want to construct a new sentence from the key of the main dictionary by replacing 'roti' with any item from its list and 'aloo gobhi' with any item from its list.

For example:

input_string = "Can I have some roti and aloo gobhi ?"
output_string = "Can I have some pulka and paneer butter masala ?"

UPDATE: I have an Excel file (say food_items.xlsx) containing a list of food items grouped into desserts, starters, main course, etc. I have another Excel file (say food_queries.xlsx) containing user queries that order food items present in food_items.xlsx. I'm trying to write a script that covers all the food items in food_items.xlsx with the minimum number of user queries, so that machine learning can be done with as few queries as possible.

import xlrd
import xlsxwriter
import string
import random
import re
import time
import itertools


list_of_items = []
dict_of_names = {}

def createList(filename):
    try:
        book = xlrd.open_workbook(filename)
        sheet = book.sheet_by_name(book.sheet_names()[2])
        for i in xrange(sheet.ncols):
            list_1 = []
            for j in xrange(sheet.nrows):
                cell_value = sheet.cell(j,i).value
                if str(cell_value) in (None,""):
                    j+=1
                    break
                else:
                    list_1.append(str(cell_value).lower())
            dict_of_names[str(list_1[0]).upper()] = list_1[1:]

    except Exception as e:
        print e

def getFile(readFile):
    try:
        list_of_sentences = []
        row = 0
        col = 0
        query_book = xlrd.open_workbook(readFile)
        first_sheet = query_book.sheet_by_index(0)
        for i in xrange(first_sheet.ncols):
            for j in xrange(first_sheet.nrows):
                cell_value = str(first_sheet.cell(j,i).value)
                if cell_value in (None,""," "):
                    j += 1
                    # dict_of_names[keys].remove(value)
                else:
                    list_of_sentences.append(cell_value)
        replaceStrings(list_of_sentences)
    except Exception as e:
        print e



def replaceStrings(list_of_sentences):
    # all_dict = {}
    # for sentence in list_of_sentences:
    #   dict_values = {}
    #   for keys,values in dict_of_names.items():
    #       for val in values:
    #           temp_dict = {}
    #           if val in sentence:
    #               temp_dict[val] = dict_of_names[keys]
    #               dict_values.update(temp_dict)
    #   all_dict[sentence] = dict_values
    # print all_dict

    # for keys,values in all_dict.items() :

    # for b,c in itertools.izip(dict_values,food_item_1[0],food_item_1[1]):
        # print sentence.replace(a,b).replace(a,c)

    for sentence in list_of_sentences:
        dict_values = {}
        for keys,values in dict_of_names.items():
            for val in values:
                temp_dict = {}
                if val in sentence:
                    temp_dict[val] = dict_of_names[keys]
                    dict_values.update(temp_dict)


        keys = dict_values.keys()
        n = len(keys)
        for i in range(n):
            thisKey = keys[i]
            nextKey = keys[(i + 1) % n]
            # print thisKey,nextKey
            for c,a,b in itertools.izip(list_of_sentences, dict_values[thisKey],dict_values[nextKey]):
                new_cell = c.replace(thisKey,a).replace(nextKey,b)
                # del dict_values[a]
                print new_cell


            # for k in existing_names:
                # if k in cell.value:
                #   lines = str(cell.value).replace(k,str(random.choice(new_names_one)))\
                #       .replace(k,str(random.choice(new_names_two)))
                #   worksheet.write(row,col,lines)
                #   row  = row + 1
                # else:
                #   break


if __name__ == "__main__":
    print "starting execution.."
    # workbook = xlsxwriter.Workbook('Query_set_1.xlsx')
    # worksheet = workbook.add_worksheet()
    createList("total food queries.xlsx")
    getFile("total food queries.xlsx")

    # workbook.close()

UPDATE 2:

The basic algorithm I want to implement is :

  1. I need to cover all the food items (each food item can occur only once).

  2. Once all food items are covered, I just stop (even though a few sample queries from the user may still be left).

My main goal is to cover all the food items, not the queries from the user.

Michael Currie
d-coder

3 Answers

1

I would keep the main sentence as its own string, and then replace the words and save a new string.

import random

sentence = 'Can I have some roti and aloo gobhi?'
new_sentence = sentence

replacements = {
'roti': ['pulka', 'butter kp', 'wheat parota', 'chapati', 'gobi parota', 'onion parota', 'paneer parota', 'kerala parota', 'aloo parota', 'plain naan', 'butter naan', 'garlic naan', 'plain kulcha', 'butter kulcha', 'lacha parota', 'tandoori roti', 'tandoori butter roti', 'roti'],
'aloo gobhi': ['paneer butter masala', 'palak paneer', 'kadai paneer', 'hydrabadi paneer', 'kadai gobi', 'aloo gobi', 'aloo mattar', 'mix veg curry', 'baby corn masala', 'dal fry', 'palak dal', 'dal tadka', 'mushroom masala', 'gobi masala', 'paneer tikka masala', 'mushroom tikka masala', 'aloo gobhi']
}

for key in replacements:
    new_sentence = new_sentence.replace(key, random.choice(replacements[key]))

Result:

>>> new_sentence
'Can I have some onion parota and aloo mattar?'
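One caveat with chaining `str.replace` calls: if a substitute happens to contain a key that is processed later, it would be replaced again. That does not appear to happen with the lists above, but a single-pass version avoids the risk. The following is a minimal sketch, not part of the original approach; the helper name `replace_once` and the shortened lists are made up for illustration.

```python
import random
import re

def replace_once(sentence, replacements, rng=random):
    """Replace every key of `replacements` found in `sentence` in one pass."""
    # Try longer keys first so a long phrase wins over a shorter key inside it.
    keys = sorted(replacements, key=len, reverse=True)
    pattern = re.compile('|'.join(re.escape(k) for k in keys))
    # The callback picks a random substitute for whichever key matched.
    # Substituted text is never rescanned, so it cannot be replaced again.
    return pattern.sub(lambda m: rng.choice(replacements[m.group(0)]), sentence)

# Shortened, illustrative lists (assumption: same shape as the question's data)
replacements = {
    'roti': ['pulka', 'chapati', 'tandoori roti'],
    'aloo gobhi': ['aloo gobi', 'dal fry'],
}
print(replace_once('Can I have some roti and aloo gobhi?', replacements))
```

Because `re.sub` scans the original sentence from left to right exactly once, a substitute such as 'tandoori roti' is left alone even though it contains the key 'roti'.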

If you just want a random item for each dish rather than only replacing those particular dishes, you should use string formatting:

import random

sentence = 'Can I have some {} and {}?'

replacements = [
['pulka', 'butter kp', 'wheat parota', 'chapati', 'gobi parota', 'onion parota', 'paneer parota', 'kerala parota', 'aloo parota', 'plain naan', 'butter naan', 'garlic naan', 'plain kulcha', 'butter kulcha', 'lacha parota', 'tandoori roti', 'tandoori butter roti', 'roti'],
['paneer butter masala', 'palak paneer', 'kadai paneer', 'hydrabadi paneer', 'kadai gobi', 'aloo gobi', 'aloo mattar', 'mix veg curry', 'baby corn masala', 'dal fry', 'palak dal', 'dal tadka', 'mushroom masala', 'gobi masala', 'paneer tikka masala', 'mushroom tikka masala', 'aloo gobhi']
]

Result:

>>> new_sentence = sentence.format(*(random.choice(l) for l in replacements))
>>> new_sentence
'Can I have some tandoori roti and mix veg curry?'
>>> new_sentence = sentence.format(*(random.choice(l) for l in replacements))
>>> new_sentence
'Can I have some pulka and paneer butter masala?'
>>> new_sentence = sentence.format(*(random.choice(l) for l in replacements))
>>> new_sentence
'Can I have some lacha parota and palak paneer?'

Based on your updated question and its comments, you're not looking for a random replacement at all; you're looking for the Cartesian product of those two lists. We'll use the product() function in the itertools module, along with string formatting.

import itertools

replacements = [
['pulka', 'butter kp', 'wheat parota', 'chapati', 'gobi parota', 'onion parota', 'paneer parota', 'kerala parota', 'aloo parota', 'plain naan', 'butter naan', 'garlic naan', 'plain kulcha', 'butter kulcha', 'lacha parota', 'tandoori roti', 'tandoori butter roti', 'roti'],
['paneer butter masala', 'palak paneer', 'kadai paneer', 'hydrabadi paneer', 'kadai gobi', 'aloo gobi', 'aloo mattar', 'mix veg curry', 'baby corn masala', 'dal fry', 'palak dal', 'dal tadka', 'mushroom masala', 'gobi masala', 'paneer tikka masala', 'mushroom tikka masala', 'aloo gobhi']
]

all_combos = itertools.product(*replacements)

all_sentences = ['Can I have some {} and {}?'.format(*combo) for combo in all_combos]

Result (only every 30th sentence, not the whole thing):

>>> for sentence in all_sentences[::30]:
...     print(sentence)
...
Can I have some pulka and paneer butter masala?
Can I have some butter kp and gobi masala?
Can I have some chapati and dal fry?
Can I have some onion parota and aloo gobi?
Can I have some kerala parota and palak paneer?
Can I have some aloo parota and paneer tikka masala?
Can I have some butter naan and palak dal?
Can I have some plain kulcha and aloo mattar?
Can I have some lacha parota and kadai paneer?
Can I have some tandoori roti and mushroom tikka masala?
Can I have some roti and dal tadka?
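If the goal is only to mention every food item at least once, rather than to produce every pair, the Cartesian product generates far more sentences than needed. A rough sketch of that alternative follows; it assumes one template with one `{}` slot per list, and the function name `cover_items` and the short lists are hypothetical. When the lists differ in length, the shorter one wraps around, so a few of its items repeat.

```python
def cover_items(template, slots):
    """Return the fewest sentences that mention every item of every slot list."""
    # Each sentence uses one item per slot, so the longest list sets the count.
    n = max(len(items) for items in slots)
    sentences = []
    for i in range(n):
        # Shorter lists wrap around, so some of their items appear twice.
        picks = [items[i % len(items)] for items in slots]
        sentences.append(template.format(*picks))
    return sentences

# Illustrative data only
breads = ['pulka', 'chapati', 'plain naan']
curries = ['dal fry', 'palak paneer']
for s in cover_items('Can I have some {} and {}?', [breads, curries]):
    print(s)
```

With these lists the sketch produces three sentences instead of the six that `itertools.product` would give.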
TigerhawkT3
  • Close enough! I don't want the sentences to be repeated either! The next combinations should go to the next user query in the Excel sheet. – d-coder Aug 12 '15 at 20:26
  • Each of the two lists contains only unique dishes, so each sentence will appear exactly once. There is no overlap between the two lists, so any given pair of dishes will appear exactly once. – TigerhawkT3 Aug 12 '15 at 20:32
0

You could try something like:

import random

if input_string in my_dict:
    output_string = input_string
    for k in my_dict[input_string]:
        new_word = random.choice(my_dict[input_string][k])
        # str.replace returns a new string, so the result must be reassigned
        output_string = output_string.replace(k, new_word)
dingensundso
0

A short solution based on this answer:

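import random
from functools import reduce  # a builtin on Python 2; needs this import on Python 3

# iterate over the dict directly instead of .iterkeys(), which is Python 2 only
replacements = [(key, random.choice(my_dict[input_string][key])) for key in my_dict[input_string]]
output_string = reduce(lambda a, kv: a.replace(*kv), replacements, input_string)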

You basically construct a list of tuples, each containing a word and its replacement. Then you make use of Python's reduce function to execute each replacement.

reduce(function, iterable[, initializer]) Apply function of two arguments cumulatively to the items of iterable, from left to right, so as to reduce the iterable to a single value. [...]

Example output:

Can I have some paneer parota and baby corn masala ?
Can I have some paneer parota and gobi masala ?
Can I have some butter naan and gobi masala ?
Can I have some tandoori butter roti and gobi masala ?
Can I have some onion parota and hydrabadi paneer ?
...
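To make the `reduce` step more concrete, here is a small sketch showing that it behaves like an explicit loop that threads the string through one `replace` call per pair. The fixed `pairs` list is an assumption used in place of the random choices so that the result is predictable.

```python
from functools import reduce  # also available as functools.reduce on Python 2.6+

# Fixed (old, new) pairs instead of random choices, for a predictable result
pairs = [('roti', 'pulka'), ('aloo gobhi', 'dal fry')]
sentence = 'Can I have some roti and aloo gobhi ?'

# reduce passes the running string and the next pair to the lambda
with_reduce = reduce(lambda s, kv: s.replace(*kv), pairs, sentence)

# The same computation written as a plain loop
with_loop = sentence
for old, new in pairs:
    with_loop = with_loop.replace(old, new)

print(with_reduce)  # Can I have some pulka and dal fry ?
```

Both variants give the same string; the loop may be easier to read, while `reduce` keeps it to one expression.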
Community
Falko