I want to write some code that takes a list of items and concatenates them (separated by commas) into long strings, where each string is no longer than a predefined maximum length. For example, for this list:
colors = ['blue','pink','yellow']
and a max len of 10 chars, the output of the code will be:
Long String 0: blue,pink
Long String 1: yellow
I wrote the code below, but it fails in two cases: when the total length of all the concatenated items is shorter than the maximum length allowed, and when it has already created one or more long strings but the concatenation of the remaining items in the list is shorter than the maximum length.
What I'm really asking is this: in the following code, how would you "stop" the loop when the items run out and yet the concatenation is so short that the "else" clause is never reached?
Many thanks :)
import pyperclip
# Theoretical bug: when a single item is longer than max_length. Will never happen for the intended use of this code.
raw_list = pyperclip.paste()
split_list = raw_list.split()
unique_items_list = list(set(split_list))  # note that sets are unordered collections, so the original order is not maintained. Not crucial for this code as it stands, but worth remembering. See more: http://stackoverflow.com/a/7961390/2594546
print "There are %d items in the list." % len(split_list)
print "There are %d unique items in the list." % len(unique_items_list)
max_length = 10  # Salesforce's filters allow up to 1000 chars, but I didn't want to hard-code that in the rest of the code, just in case.
list_of_long_strs = []
short_list = []  # will hold the items that make up the current long string (at most max_length chars).
total_len = 0
items_processed = [] # will be used for sanity checking
for i in unique_items_list:
    if total_len + len(i) + 1 <= max_length:  # +1 is for the length of the comma
        short_list.append(i)
        total_len += len(i) + 1
        items_processed.append(i)
    elif total_len + len(i) <= max_length:  # if there's no place for another item+comma, it means we're nearing the end of the max_length chars mark. Maybe we can fit just the item without the unneeded comma.
        short_list.append(i)
        total_len += len(i)  # should I end the loop here somehow?
        items_processed.append(i)
    else:
        long_str = ",".join(short_list)
        if long_str[-1] == ",":  # appending the long_str to the list of long strings, while making sure the item can't end with a "," which can affect Salesforce filters.
            list_of_long_strs.append(long_str[:-1])
        else:
            list_of_long_strs.append(long_str)
        del short_list[:]  # in order to empty the list.
        total_len = 0
unique_items_proccessed = list(set(items_processed))
print "Number of items concatenated:", len(unique_items_proccessed)
def sanity_check():
    if len(unique_items_list) == len(unique_items_proccessed):
        print "All items concatenated"
    else:  # the only other option is that len(unique_items_list) > len(unique_items_proccessed)
        print "The following items weren't concatenated:"
        print ",".join(list(set(unique_items_list)-set(unique_items_proccessed)))
sanity_check()
print ",".join(short_list) # for when the loop doesn't end the way it should since < max_length. NEED TO FIND A BETTER WAY TO HANDLE THAT
for index, item in enumerate(list_of_long_strs):  # enumerate avoids looking up each item's index again
    print "Long String %d:" % index
    print item
    print
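
One possible way to handle both cases is sketched below. It is only a minimal sketch, not a definitive fix: instead of trying to "stop" the loop, it closes a string whenever the next item would not fit, starts the next string with that item, and flushes whatever is left after the loop ends. The function name `join_in_chunks` and the `sep` parameter are made up for illustration. The sketch assumes that an item longer than `max_length` simply gets a string of its own, and it avoids the Python 2 print statement so it should run on Python 2 and 3 alike.

```python
def join_in_chunks(items, max_length, sep=","):
    """Greedily pack items into sep-joined strings of at most max_length chars.

    Hypothetical helper: an item longer than max_length still gets its own string.
    """
    chunks = []       # finished long strings
    current = []      # items of the string being built
    current_len = 0   # length of sep.join(current)
    for item in items:
        # Length the current string would have if this item were added.
        if current:
            added_len = current_len + len(sep) + len(item)
        else:
            added_len = len(item)
        if current and added_len > max_length:
            # No room: close the current string and start a new one with this item.
            chunks.append(sep.join(current))
            current = [item]
            current_len = len(item)
        else:
            current.append(item)
            current_len = added_len
    if current:
        # The items ran out: flush whatever is left, however short it is.
        chunks.append(sep.join(current))
    return chunks


colors = ['blue', 'pink', 'yellow']
for index, long_str in enumerate(join_in_chunks(colors, 10)):
    print("Long String %d: %s" % (index, long_str))
# Long String 0: blue,pink
# Long String 1: yellow
```

Because the separator is only counted between items, no string can end with a trailing comma, so the `long_str[-1] == ","` check is not needed in this version.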