
I'm working on a Python 2.6 script to generate poems from small bits of media transcripts.

At the moment I'm puzzling over how to pick a random combination of valid syllable counts so I can compose semi-random lines from pools of strings of known syllable length. The current function is inspired by the powerset() recipe proposed here: http://docs.python.org/release/3.1.3/library/itertools.html

import random
from itertools import chain, ifilter

def valid(self, iterable, target):
    s = list(iterable)
    range_floor = target // max(s)  # fewest units that could sum to the target
    range_ceil = target // min(s)   # most units that could sum to the target
    # range() excludes its upper bound, so +1 keeps range_ceil itself in play
    unfiltered = chain.from_iterable(self.combinations_with_replacement(s, r)
                                     for r in range(range_floor, range_ceil + 1))
    # == rather than 'is': compare values, not object identity
    return random.choice(list(ifilter(lambda x: sum(x) == target, unfiltered)))

For a quick example: if I had two sets of data chunked into 5- and 10-syllable units from which to assemble a 20-syllable line, valid combinations would include (5, 5, 5, 5), (10, 10), (5, 10, 5), and (10, 5, 5).
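
For concreteness, here's a quick standalone sketch of that example using itertools.combinations_with_replacement directly (the 2.7 stdlib version, rather than the backported self.combinations_with_replacement my class calls):

from itertools import chain, combinations_with_replacement

s, target = [5, 10], 20
pool = chain.from_iterable(combinations_with_replacement(s, r)
                           for r in range(target // max(s), target // min(s) + 1))
print [c for c in pool if sum(c) == target]
# -> [(10, 10), (5, 5, 10), (5, 5, 5, 5)]
# combinations_with_replacement yields sorted tuples, so (5, 10, 5) and
# (10, 5, 5) both show up as the single entry (5, 5, 10)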

With large numbers of possible combinations (longer lines, more than two lengths of constituent unit), processing time gets significant; that's my main concern for now. I may ultimately just limit my scope and put hard caps on some of the variables to avoid the more intense options.
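
If I do go the hard-cap route, I'm picturing something as blunt as clamping the number of units a single line may use (MAX_UNITS is just a made-up name here):

MAX_UNITS = 8  # hypothetical ceiling on how many chunks one line may use

def capped_ceil(s, target, cap=MAX_UNITS):
    # clamp the largest combination size the generator will consider
    return min(target // min(s), cap)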

As I write this, it occurs to me that this might be the sort of situation where it makes sense to just stick with the existing algorithm and cache/pickle past results to save myself doing the work more than once, but I don't really have a background in CS or math, so I suspect I might be overlooking something obvious to those who do.
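
To make the caching idea concrete, here's a minimal sketch of what I have in mind (names are hypothetical; it's meant to sit on the same class as valid() and reuse its backported combinations_with_replacement), keyed on the unit lengths plus the target so the expensive enumeration only runs once per distinct set of inputs:

import random
from itertools import chain

_combo_cache = {}  # (unit lengths, target) -> list of valid combinations

def valid_cached(self, iterable, target):
    key = (tuple(sorted(set(iterable))), target)
    if key not in _combo_cache:
        s = list(key[0])
        pool = chain.from_iterable(self.combinations_with_replacement(s, r)
                                   for r in range(target // max(s), target // min(s) + 1))
        _combo_cache[key] = [c for c in pool if sum(c) == target]
    # on cache hits only the random pick happens
    return random.choice(_combo_cache[key])

Pickling _combo_cache at shutdown and loading it at startup would cover the persist-between-runs part.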

