126

I can't seem to find a substring function in Python.

Say I want to output the first 100 characters in a string, how can I do this?

I also want to do it safely: if the string is only 50 characters long, it shouldn't fail.

micstr
Blankman
  • 2
    The followup to this question is: [Good primer for Python slice notation](http://stackoverflow.com/questions/509211/good-primer-for-python-slice-notation) – Greg Hewgill Aug 15 '10 at 06:05
  • 1
    What do you mean by "characters"? Code points, grapheme clusters or code units? Slicing will count code units, which might not give the desired result. – Philipp Aug 15 '10 at 08:09

8 Answers

198
print my_string[0:100]
icktoofay
  • 8
    It also works for strings shorter than 100 characters, for example `print 'foo'[:100]`. Note that `len('foo')` is 3, so although indexing with `'foo'[100]` raises an IndexError, the slice still works. – Rodrigo Laguna Mar 28 '18 at 19:40
73

From the Python tutorial:

Degenerate slice indices are handled gracefully: an index that is too large is replaced by the string size, an upper bound smaller than the lower bound returns an empty string.

So it is safe to use x[:100].
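As a quick check of the tutorial's claim, the sketch below slices strings that are shorter and longer than the requested length. It is written for Python 3 (`print()` as a function), unlike the Python 2 snippets elsewhere on this page, and the helper name `first_n` is made up for illustration.

```python
def first_n(text, n=100):
    """Return at most the first n characters of text (hypothetical helper)."""
    # A slice bound past the end of the string is clamped, so this never raises.
    return text[:n]


short = "a" * 50
long_ = "b" * 250

print(len(first_n(short)))   # 50
print(len(first_n(long_)))   # 100
print(repr(first_n("")))     # ''
# An upper bound smaller than the lower bound gives an empty string.
print(repr("hello"[3:1]))    # ''
```

Only single-position indexing such as `short[100]` raises an `IndexError`; the slice does not.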

czchen
30

Easy:

print mystring[:100]
Arkady
7

To address Philipp's concern (in the comments), slicing works fine for unicode strings too:

>>> greek=u"αβγδεζηθικλμνξοπρςστυφχψω"
>>> print len(greek)
25
>>> print greek[:10]
αβγδεζηθικ

If you want to run the above code as a script, put this line at the top:

# -*- coding: utf-8 -*-

If your editor doesn't save in UTF-8, substitute the correct encoding.
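For a case where the distinction Philipp raises does matter, here is a small sketch assuming Python 3, where `str` slicing counts code points. The sample word is made up: it uses a combining accent, so one visible character spans two code points and a slice can cut between them. Normalizing to NFC with the standard `unicodedata` module helps in this particular case, though it will not merge every grapheme cluster.

```python
import unicodedata

# 'e' followed by COMBINING ACUTE ACCENT: displays as "étude"
decomposed = "e\u0301tude"
print(len(decomposed))   # 6 code points for 5 visible characters
print(decomposed[:1])    # e  (the accent is left behind)

# NFC composes the pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)
print(len(composed))     # 5
print(composed[:1])      # é
```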

John La Rooy
  • 2
    Not disparaging your answer, but there are only 24 letters in Greek; `ς` and `σ` are the same letter :-) – paxdiablo Aug 15 '10 at 14:07
  • 5
    @paxdiablo, doh! I copied them off the wikipedia page. Lucky I didn't name the variable `greek_alphabet` then :) – John La Rooy Aug 16 '10 at 23:11
4

Slicing of sequences (strings included) is done with [first:last+1], since the end index is exclusive.

One trick I use a lot is to indicate that there is more text with an ellipsis. So, if your field is one hundred characters wide, I would use:

if len(s) <= 100:
    print s
else:
    print "%s..."%(s[:97])

And yes, I know () is superfluous in this case for the % formatting operator, it's just my style.
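The same idea can be wrapped in a reusable function. This is only a sketch for Python 3. The name `truncate` and its parameters are hypothetical, and it assumes `width` is at least as large as the marker.

```python
def truncate(text, width=100, marker="..."):
    """Return text cut to at most width characters, ending in marker if cut.

    Hypothetical helper; assumes width >= len(marker).
    """
    if len(text) <= width:
        return text
    # Reserve room for the marker so the result is exactly `width` long.
    return text[:width - len(marker)] + marker


print(truncate("short"))             # short
print(len(truncate("x" * 150)))      # 100
print(truncate("x" * 150)[-5:])      # xx...
```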

paxdiablo
  • I guess this was meant as food for thought, but in the OP's case I would probably not suggest to do that. The result would be a string that you would have to check for content to further trim or something like that. In this case I would imagine one would either want that number to be variable, and the result to always be correct, or the number to be fixed and the handling either producing something meaningful, or error or return gracefully in case of failure. I can't think of many cases, other than delivering human readable info, where I'd want to add text arbitrarily to a string. –  Aug 15 '10 at 07:56
4

String formatting using % is a great way to handle this. Here are some examples.

The formatting code '%s' converts '12345' to a string, but it's already a string.

>>> '%s' % '12345'

'12345'

'%.3s' specifies to use only the first three characters.

>>> '%.3s' % '12345'

'123'

'%.7s' says to use the first seven characters, but there are only five. No problem.

>>> '%.7s' % '12345'

'12345'

'%7s' pads the result to at least seven characters, filling missing characters with spaces on the left. It does not truncate longer strings.

>>> '%7s' % '12345'

'  12345'

'%-7s' is the same thing, except filling missing characters on the right.

>>> '%-7s' % '12345'

'12345  '

'%5.3s' says to use the first three characters, but pad with spaces on the left to a total of five characters.

>>> '%5.3s' % '12345'

'  123'

Same thing except filling on the right.

>>> '%-5.3s' % '12345'

'123  '

It can handle multiple arguments too!

>>> 'do u no %-4.3sda%3.2s wae' % ('12345', 6789)

'do u no 123 da 67 wae'

If you require even more flexibility, str.format() is available too. Here is documentation for both.
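For comparison, here is a sketch of the same truncation and padding written with `str.format()`, assuming Python 3. In the format spec, the number after the dot is the precision (the maximum number of characters kept) and the number before it is the minimum width.

```python
s = "12345"

print(repr("{:.3}".format(s)))    # '123'      keep the first three characters
print(repr("{:.7}".format(s)))    # '12345'    shorter strings are left alone
print(repr("{:>7}".format(s)))    # '  12345'  pad on the left
print(repr("{:<7}".format(s)))    # '12345  '  pad on the right
print(repr("{:>5.3}".format(s)))  # '  123'    truncate, then pad on the left
print(repr("{:<5.3}".format(s)))  # '123  '    truncate, then pad on the right

# Unlike '%3.2s' % 6789, a precision is rejected for integers,
# so convert to a string first with the !s conversion.
print(repr("{!s:>3.2}".format(6789)))  # ' 67'
```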

JoseOrtiz3
3

Note that slicing does not raise an exception when the string is shorter than the requested length; only indexing a single out-of-range position does.

Another approach is to use 'yourstring'.ljust(100)[:100].strip().

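This gives you the first 100 characters with surrounding whitespace removed. Because strip() removes both leading and trailing whitespace, the result may be shorter if those 100 characters begin or end with spaces.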
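A small comparison, written for Python 3, of the plain slice and the `ljust` variant on a string shorter than 100 characters. The sample text is made up for illustration.

```python
text = "  padded example  "

plain = text[:100]
stripped = text.ljust(100)[:100].strip()

print(repr(plain))     # '  padded example  '
print(repr(stripped))  # 'padded example'
```

The two differ only in the whitespace that `strip()` removes.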

Julien Kieffer
0
[start:stop:step]

So if you want to take only the first 100 characters, use your_string[0:100] or your_string[:100]. If you want to take only the characters at even positions, use your_string[::2]. The default values are 0 for start, the length of the string for stop, and 1 for step. So when you omit one of them and just put ':', its default value is used.
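A short illustration of those defaults, assuming Python 3 and a made-up ten-character string:

```python
s = "0123456789"

print(s[0:4])    # 0123   explicit start and stop
print(s[:4])     # 0123   start defaults to 0
print(s[::2])    # 02468  every second character, starting at index 0
print(s[1::2])   # 13579  every second character, starting at index 1

# Omitting everything is the same as spelling out all three defaults.
print(s[:] == s[0:len(s):1])  # True
```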

Szymek G