How can I split a string such as "Blahblahblahblah" into "Blah", "blah", "blah", "blah" in Python? I've tried the following:

str = "Blahblahblahblah"
for letter[0:3] on str

How can I do it?

2 Answers

Try:

>>> SUBSTR_LEN = 4
>>> string = "bla1bla2bla3bla4"
>>> [string[n:n + SUBSTR_LEN] for n in range(0, len(string), SUBSTR_LEN)]
['bla1', 'bla2', 'bla3', 'bla4']
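
The same slicing idea can be wrapped in a small helper. This is only a sketch: the function name `chunk_string` and the check on `size` are illustrative assumptions, not part of the answer above. It also shows what happens when the length is not a multiple of the chunk size.

```python
def chunk_string(text, size):
    """Split text into consecutive pieces of at most `size` characters."""
    if size <= 0:
        raise ValueError("size must be a positive integer")
    # Slicing past the end of a string is safe, so the last piece is
    # simply shorter when len(text) is not a multiple of size.
    return [text[i:i + size] for i in range(0, len(text), size)]


print(chunk_string("Blahblahblahblah", 4))  # ['Blah', 'blah', 'blah', 'blah']
print(chunk_string("Blahblahbl", 4))        # ['Blah', 'blah', 'bl']
```

Unlike the regex approach, slicing keeps a shorter trailing piece instead of dropping it.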
Erik Kaplun

If you don't mind using the re library, a regex works too. In this example, the pattern .{4} matches any four consecutive characters other than \n.

import re

str = "Blahblahblahblah"
print(re.findall(".{4}", str))

output:

['Blah', 'blah', 'blah', 'blah']
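
Two edge cases are worth knowing about with this pattern: a trailing piece shorter than four characters is silently dropped, and "." does not match "\n" by default. The sketch below uses a made-up input string and the names `strict` and `lenient` purely for illustration.

```python
import re

# Hypothetical input: 13 characters, with a newline and an uneven tail.
text = "Blahblahbl\nah"

# .{4} returns only full four-character matches, and "." skips "\n".
strict = re.findall(r".{4}", text)

# .{1,4} keeps shorter pieces; re.DOTALL lets "." match "\n" as well.
lenient = re.findall(r".{1,4}", text, flags=re.DOTALL)

print(strict)   # ['Blah', 'blah']
print(lenient)  # ['Blah', 'blah', 'bl\na', 'h']
```

Whether dropping the remainder is acceptable depends on the data, so it may be safer to check that the input length is a multiple of four first.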

Note: str is not a good name for a variable, because it shadows Python's built-in str type, which is what you call to convert a value to a string.
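
A minimal sketch of why the name matters, assuming a throwaway function called `shadowed_call` (not from the original answer). Inside the function the name str refers to a string, so calling it fails.

```python
def shadowed_call():
    # Rebinding the name hides the built-in str inside this function.
    str = "Blahblahblahblah"
    try:
        return str(42)
    except TypeError as exc:
        return "TypeError: {}".format(exc)


print(shadowed_call())  # TypeError: 'str' object is not callable
print(str(42))          # the built-in still works outside the function
```

Using a name such as `text` or `sequence` avoids the problem entirely.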

Sabuj Hassan
  • You don't need the parens in the pattern string. – ooga Apr 05 '14 at 13:20
  • @ooga Yes, you are right, as it's only one group here. I am updating the code. – Sabuj Hassan Apr 05 '14 at 13:22
  • I know you're keeping the same name as the OP, but `str` is *not* a good choice :) and `.` means *any* character except a newline (unless told otherwise), not a *non-whitespace* char – Jon Clements Apr 05 '14 at 13:23
  • @JonClements yes, you are right. I didn't notice that OP used `str` while doing copy/paste. I added a `note` for this. – Sabuj Hassan Apr 05 '14 at 13:26
  • I'm actually trying to separate RNA code into codons to analyze the proteins of DNA. – user3501183 Apr 05 '14 at 13:43
  • @Sabuj You are wrong in your use of the word **whitespace**. In the doc of ``re``, whitespace characters are defined as ``blank``, ``\f``, ``\n``, ``\r``, ``\t``, ``\v`` – eyquem Apr 05 '14 at 13:48
  • @user3501183 you may want to look at http://biopython.org then if you're dealing with that sort of data... – Jon Clements Apr 05 '14 at 13:48
  • @eyquem yup... I've already mentioned that (in another way) :) – Jon Clements Apr 05 '14 at 13:49
  • @Jon Excuse me, Jon. I reacted instantly on reading the answer, before reading the comments. My bad. By the way, I never understood whether the commonly accepted meaning of the word **newline** designates only ``\n`` or every kind of end of line, such as ``\n``, ``\r``, and ``\r\n`` – eyquem Apr 05 '14 at 13:51
  • @eyquem all good... just means the OP might decide to change it now :) (update: and they have... fantastic) – Jon Clements Apr 05 '14 at 13:53
  • @eyquem I updated my answer. Basically my previous wording described `\s` and it was wrong. I updated it for `dot` now. I missed the part of an earlier comment that pointed out this issue. – Sabuj Hassan Apr 05 '14 at 13:55
  • @Jon I don't understand what you mean by **(update: and they have... fantastic)** – eyquem Apr 05 '14 at 13:55
  • @Sabuj I have another criticism: what do you mean by **variable name**? _Diverse name_? _Variable's name_? The first means nothing. The second is horrid because there are no variables in Python, only names and objects. – eyquem Apr 05 '14 at 13:58
  • @eyquem lol. I didn't know that there are no variables in Python. I thought it's just like other languages. Can you please lend a hand and update my answer? – Sabuj Hassan Apr 05 '14 at 14:01
  • In languages such as C++ and PHP, there are **variables** in the sense of **chunks of memory whose bit content can change**. In Python, the value of a non-container object can't change. The word **variable** in Python can have only the same sense as **name** and **identifier**. Since we already have the synonyms **name** and **identifier** at our disposal, and the word **variable** is ambiguous and leads to misunderstanding of the _data model_ of Python, I don't see any point in continuing to use it in Python. Objects in Python are not variables; they are complex C structures. – eyquem Apr 05 '14 at 14:26
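
For the codon use case mentioned in the comments, the same fixed-width split can be applied with a width of three. This is a plain-Python sketch rather than anything from Biopython; the function name `split_codons`, the sample sequences, and the choice to read from index 0 and ignore an incomplete trailing codon are all assumptions made for illustration.

```python
def split_codons(rna):
    """Return the complete three-base codons of rna, read from index 0."""
    # Assumption: an incomplete trailing codon is ignored, not returned.
    usable = len(rna) - len(rna) % 3
    return [rna[i:i + 3] for i in range(0, usable, 3)]


print(split_codons("AUGGCCUAA"))  # ['AUG', 'GCC', 'UAA']
print(split_codons("AUGGCCUA"))   # ['AUG', 'GCC']
```

If the reading frame does not start at the first base, the sequence would need to be sliced to the right offset before splitting.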