41

I am reading in a line from a text file using:

   file = urllib2.urlopen("http://192.168.100.17/test.txt").read().splitlines()

and outputting it to an LCD display, which is 16 characters wide, using a telnetlib write command. If a line is longer than 16 characters, I want to break it into 16-character sections and push each section out after a delay (e.g. 10 seconds). Once a line is complete, the code should move on to the next line of the input file and continue.

I've tried searching for solutions and reading up on itertools etc., but my understanding of Python just isn't sufficient to get anything to work without doing it in a very long-winded way, using a tangled mess of if/then/else statements that's probably going to tie me in knots!

What's the best way for me to do what I want?

LostRob
  • Try `import time` and `time.sleep` for delays. – rlms Sep 17 '13 at 16:05
  • To split into chunks, the `chunks` function [here](http://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks-in-python) should work. – mgilson Sep 17 '13 at 16:09
  • @mgilson I vote to close as a duplicate, because the answer is the same – Marcin Sep 17 '13 at 16:39
  • @mgilson @Marcin - this question is slightly different if you consider that when the input is a string, you can use the `re` module to chunk it with `re.findall('.{%d}' % length, string)`. – carl.anderson Aug 21 '14 at 16:02
  • @carl.anderson -- sure you _could_. I'm not convinced that it would be faster (although maybe ...) and it's definitely not easier to read to my un-re-trained eye. – mgilson Aug 21 '14 at 16:35

7 Answers

79

One solution would be to use this function:

def chunkstring(string, length):
    return (string[0+i:length+i] for i in range(0, len(string), length))

This function returns a generator, built with a generator expression. Each item it yields is the slice of the string from i to i + length, where i steps through the multiples of length (0, length, 2 * length, ...).

You can iterate over the generator like a list, tuple or string - for i in chunkstring(s,n): , or convert it into a list (for instance) with list(generator). Generators are more memory efficient than lists because they generate their elements as they are needed, not all at once. However, they lack certain features, such as indexing.

This generator also contains any smaller chunk at the end:

>>> list(chunkstring("abcdefghijklmnopqrstuvwxyz", 5))
['abcde', 'fghij', 'klmno', 'pqrst', 'uvwxy', 'z']

Example usage:

text = """This is the first line.
           This is the second line.
           The line below is true.
           The line above is false.
           A short line.
           A very very very very very very very very very long line.
           A self-referential line.
           The last line.
        """

lines = (i.strip() for i in text.splitlines())

for line in lines:
    for chunk in chunkstring(line, 16):
        print(chunk)
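
To tie this back to the display in the question, here is a minimal sketch of the same loop with a pause after each chunk. The `show_lines` name and the `send` callback are assumptions made for illustration: `send` stands in for whatever actually writes to the LCD (for example a telnet write), which is not shown here.

```python
import time

def chunkstring(string, length):
    """Yield successive pieces of `string`, each at most `length` characters."""
    return (string[i:i + length] for i in range(0, len(string), length))

def show_lines(lines, send, width=16, delay=10.0, sleep=time.sleep):
    """Send each line in `width`-character pieces, pausing after each piece.

    `send` is a hypothetical callback standing in for the real display write.
    """
    for line in lines:
        for chunk in chunkstring(line, width):
            send(chunk)
            sleep(delay)  # hold this piece on screen before sending the next

# Demo with print as the "display" and no delay.
show_lines(["A short line.", "A very very very long line indeed."], print, delay=0)
```

Because the inner loop finishes before the outer loop advances, all chunks of one line are sent before the next line starts. An empty line produces no chunks, so it is skipped.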
rlms
  • I just about understand what the function is doing but I'm still missing some bits, such as how best to use the generated chunks. For example I have "for line in file" followed by the code to update the display followed by a wait but how should I step through each chunk before moving onto the next line (i.e. how do I know how many chunks I have and refer to them e.g. if I use "for i in chunkstring(s,n):" how do I "print" chunk 1 or chunk 3?) – LostRob Sep 18 '13 at 12:58
  • Never mind, I didn't understand your answer very well. I read [this](http://stackoverflow.com/questions/231767/the-python-yield-keyword-explained) explanation of iterables and generators which helped me realise my mistake. – LostRob Sep 18 '13 at 14:46
  • I hadn't seen your edit. Thank you, that also clarifies it for me! – LostRob Sep 18 '13 at 14:49
  • @LostRob That's unsurprising, I edited it in response to your comment! – rlms Sep 18 '13 at 14:52
  • Great snippet. And despite the name, there's nothing limiting it to strings. – Cerin Jul 09 '14 at 02:45
10

My favorite way to solve this problem is with the re module.

import re

def chunkstring(string, length):
  return re.findall('.{%d}' % length, string)

One caveat here is that re.findall will not return a chunk that is less than the length value, so any remainder is skipped.

However, if you're parsing fixed-width data, this is a great way to do it.

For example, if I want to parse a block of text that I know is made up of fixed 32-character chunks (like a header section), I find this very readable and see no need to generalize it into a separate function (as in chunkstring):

for header in re.findall('.{32}', header_data):
  ProcessHeader(header)
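
If the trailing remainder matters, as it does for the display in the question, one possible variation is to let the pattern match between 1 and `length` characters. This is a sketch rather than part of the original answer; the name `chunkstring_re` is made up here, and `re.DOTALL` is added on the assumption that newlines inside the string should be kept in the chunks.

```python
import re

def chunkstring_re(string, length):
    """Split `string` into pieces of up to `length` characters, keeping the tail."""
    # {1,n} is greedy, so every piece except possibly the last is full length.
    # re.DOTALL lets "." match newline characters as well.
    return re.findall('.{1,%d}' % length, string, flags=re.DOTALL)

print(chunkstring_re("abcdefghijklmnopqrstuvwxyz", 5))
# ['abcde', 'fghij', 'klmno', 'pqrst', 'uvwxy', 'z']
```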
carl.anderson
8

The standard library offers textwrap.wrap:

from textwrap import wrap

s = "some random text that should be splitted into chunks"

print(wrap(s, width=3))

['som', 'e r', 'and', 'om ', 'tex', 't t', 'hat', 'sho', 'uld', 'be ', 'spl', 
 'itt', 'ed ', 'int', 'o c', 'hun', 'ks']
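
One thing to keep in mind is that `textwrap.wrap` is word-aware: with its default settings it prefers to break at whitespace and discards the whitespace it breaks on, so pieces can come out shorter than the requested width. The small comparison below, using a made-up sample string, shows how that differs from plain fixed-width slicing.

```python
from textwrap import wrap

s = "aaaa bbbb"

# Word-aware: breaks at the space and drops it.
print(wrap(s, width=6))                            # ['aaaa', 'bbbb']

# Fixed-width slicing: every piece except the last is exactly 6 characters.
print([s[i:i + 6] for i in range(0, len(s), 6)])   # ['aaaa b', 'bbb']
```

For a display, the word-aware behaviour may actually be preferable, since it avoids splitting words where it can.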
FObersteiner
3

I know it's an oldie, but I'd like to add how to chop up a string into variable-length columns:

def chunkstring(string, lengths):
    return (string[pos:pos+length].strip()
            for idx,length in enumerate(lengths)
            for pos in [sum(map(int, lengths[:idx]))])

column_lengths = [10,19,13,11,7,7,15]
fields = list(chunkstring(line, column_lengths))  # `line` is the fixed-width record to split
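
The same idea can be sketched with `itertools.accumulate`, which computes the running column offsets once instead of re-summing the widths for every field. The function name `split_columns` and the sample record are illustrative assumptions, not part of the answer above.

```python
from itertools import accumulate

def split_columns(string, lengths):
    """Cut `string` into consecutive fields of the given widths, stripped."""
    ends = list(accumulate(lengths))   # running end offset of each column
    starts = [0] + ends[:-1]           # each column starts where the previous one ended
    return [string[a:b].strip() for a, b in zip(starts, ends)]

# Hypothetical fixed-width record with column widths 10, 3 and 3.
record = "alice".ljust(10) + "42".ljust(3) + "NYC"
print(split_columns(record, [10, 3, 3]))  # ['alice', '42', 'NYC']
```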
2

I think this way is easier to read:

string = "when an unknown printer took a galley of type and scrambled it to make a type specimen book."
length = 20
list_of_strings = []
for i in range(0, len(string), length):
    list_of_strings.append(string[i:length+i])
print(list_of_strings)
Jodmoreira
1

Using a list comprehension:

n = "aaabbbcccddd"
k = 3
[n[i:i+k] for i in range(0,len(n),k)]
=> ['aaa', 'bbb', 'ccc', 'ddd']
Shuizid
0

Doing it even more simply:

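str_to_split = "KIMJEONG"  # Your string to split here
n = 4  # Your chunk length here
buf = ""
ourchunks = []

for x, ch in enumerate(str_to_split, start=1):
    buf += ch
    if x % n == 0:  # use n here rather than a hard-coded 4
        ourchunks.append(buf)
        buf = ""

if buf:  # keep any shorter chunk left over at the end
    ourchunks.append(buf)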