How to replace custom tabs with spaces in a string, depending on the size of the tab?

Question

I'm trying to write a Python function, without using any modules, that takes a string containing tabs and replaces each tab with the number of spaces appropriate for a given tabstop size. It can't just replace every tab with n spaces, though, since a tab can stand for 1 to n spaces. I'm really confused, so if anyone could point me in the right direction I'd greatly appreciate it.

For instance, if tabstop is size 4 originally:

123\t123 = 123 123 #one space in between

but changed to tabstop 5:

123\t123 = 123  123 #two spaces in between

I think I need to pad the end of the string with spaces until string%n==0 and then chunk it, but I'm pretty lost at the moment..

It would be a good idea to add a bunch of testcases to your question — John La Rooy, Apr 17 '13 at 06:40
What happens if blocksize is 5 and the string is longer, e.g. 123456\t ? Is the result 1234_56___ ? 1234_6____ ? 123456_? — emigue, Apr 17 '13 at 06:45
I may be missing something, but a tabstop is not size "n". A tabstop is `\t` which is one character and is always size 1. Do you want to replace spaces with tabs, maybe? Or spaces with fewer spaces? — Joel Cornett, Apr 17 '13 at 07:17
Ohhhh. Okay. I see what you mean. You should probably rephrase your question as it's a bit confusing at first. — Joel Cornett, Apr 17 '13 at 07:20
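
For test cases, Python's built-in string method `str.expandtabs` (a method, not a module) pads each tab to the next multiple of the tab size, so it can serve as a reference to compare a hand-written function against. A minimal sketch; the sample strings are only illustrative:

```python
# str.expandtabs(tabsize) pads each tab up to the next multiple of tabsize.
samples = ["123\t123", "\t1", "1234\t1", "123456\t1", "a\t\tb"]

for tabsize in (4, 5):
    for s in samples:
        print(repr(s), tabsize, repr(s.expandtabs(tabsize)))

# The two examples from the question:
assert "123\t123".expandtabs(4) == "123 123"    # one space in between
assert "123\t123".expandtabs(5) == "123  123"   # two spaces in between
```

Any hand-written function can then be checked with `my_function(s, n) == s.expandtabs(n)` over such samples.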

score 5 · Answer 1 · answered Apr 17 '13 at 07:24


For a tab length of 5:

>>> s = "123\t123"
>>> print(''.join('%-5s' % item for item in s.split('\t')))
123  123  
>>>

Joel Cornett


Or: `(5*' ').join(s.split('\t'))` – Basel Shishani Jun 08 '15 at 00:00
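
The split/join idea pads every piece to a fixed width, so it also pads the text after the last tab and misaligns when a piece is as long as the tab size or longer. A minimal sketch of a split-based variant that tracks the output column instead; `expand_by_split` is a hypothetical name, and the string is assumed to contain no newlines:

```python
def expand_by_split(s, tabstop=5):
    """Expand tabs by padding every piece except the last (hypothetical helper)."""
    parts = s.split('\t')
    out = []
    col = 0  # current column in the output
    for part in parts[:-1]:
        col += len(part)
        pad = tabstop - col % tabstop  # 1..tabstop spaces to reach the next stop
        out.append(part + ' ' * pad)
        col += pad
    out.append(parts[-1])  # text after the final tab gets no padding
    return ''.join(out)


print(repr(expand_by_split("123\t123", 5)))    # '123  123'
print(repr(expand_by_split("123456\t1", 5)))   # '123456    1'
```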

Rémi · Accepted Answer · 2013-04-17T06:47:00.553

4

Since you want a Python function that doesn't use any external module, I think you should first design the algorithm of your function...

I would suggest iterating over every character of the string. If the character is a tab, you need to compute how many spaces to insert: if i is the current length of the output, the next "aligned" index is ((i // tabstop) + 1) * tabstop, so you need to insert ((i // tabstop) + 1) * tabstop - i spaces, which is the same as tabstop - (i % tabstop). But an easier way is to insert spaces until you are aligned (i.e. i % tabstop == 0).

def replace_tab(s, tabstop = 4):
  result = str()
  for c in s:
    if c == '\t':
      while (len(result) % tabstop != 0):
        result += ' '
    else:
      result += c
  return result

edited Apr 17 '13 at 06:47

answered Apr 17 '13 at 06:38

Rémi


Thanks everybody for the help. This is exactly what I was looking for I was just having a mental block trying to wrap my mind around the algorithm, so thanks again! – Austin Apr 17 '13 at 13:37
Anybody know how to change this to work with multiple tabs in a row? seems that it only picks up the first one – Austin Apr 17 '13 at 16:16
In the test I ran multiple tab were ok: replace_tab('123\t12\t1\t123456\t1234\t12345678\n') returns '123.12..1...123456..123412345678' (with dots replacing spaces for readability) – Rémi Apr 18 '13 at 07:16
BTW, I think that other answers with list comprehension, split and join are far more elegant... – Rémi Apr 18 '13 at 07:28
For multiple tabs like "uint8_t\t\tvalue;" I inserted "if len(result) % tabstop == 0: result += ' '" before the while loop. – Andris Aug 05 '14 at 09:52
This shouldn't be upvoted as it is incorrect - I doubt it was ever tested. A leading tab simply gets thrown away, replaced by nothing. (And only the first tab is replaced, though that negative feature is documented.) – Tom Swirly Aug 03 '15 at 15:08

andrea.m.piovesana · Answer 3 · 2015-11-17T10:02:23.930

4

I use the .replace function, which is very simple:

line = line.replace('\t', ' ')

edited Nov 17 '15 at 10:02

answered Nov 17 '15 at 09:54

andrea.m.piovesana


This approach simply does not provide what the question asks for ... – Claudio Aug 05 '23 at 22:18

ibi0tux · Answer 4 · 2013-04-17T06:29:36.060

2

Sorry, I misread the question the first time.

This is a recursive version that should work for any number of tabs in the input:

def tabstop(s, tabnum=4):
    if '\t' not in s:
        return s
    l = s.find('\t')
    return s[0:l] + ' ' * (tabnum - l) + tabstop(s[l+1:], tabnum)

edited Apr 17 '13 at 06:29

answered Apr 17 '13 at 06:16

ibi0tux


Nice idea, but this approach does not work when the length of the substrings between the tabs is larger than `tabnum` ( `' '*negativeNumber` gives an empty string). – Claudio Aug 05 '23 at 22:27
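
One possible way to keep the recursive shape while handling long substrings is to pass the current output column down the recursion and take it modulo the tab size. A minimal sketch; `tabstop_rec` and its `col` parameter are hypothetical names, and newlines are not treated specially:

```python
def tabstop_rec(s, tabnum=4, col=0):
    """Recursive tab expansion; `col` is the output column at which `s` starts."""
    pos = s.find('\t')
    if pos == -1:
        return s  # no tab left, nothing to expand
    col += pos                   # column of the tab itself
    pad = tabnum - col % tabnum  # 1..tabnum spaces to the next tab stop
    return s[:pos] + ' ' * pad + tabstop_rec(s[pos + 1:], tabnum, col + pad)


print(repr(tabstop_rec("123456\t1", 4)))  # '123456  1'
```

Each tab costs one level of recursion, so a string with very many tabs could run into Python's recursion limit; an iterative loop avoids that.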

score 2 · Answer 5 · answered Nov 12 '15 at 00:04

I think Remi's answer is the simplest, but it has a bug: it doesn't account for the case when you are already on a "tab stop" column. Tom Swirly pointed this out in the comments. Here's a tested fix to his suggestion:

def replace_tab(s, tabstop = 4):
    result = str()

    for c in s:
        if c == '\t':
            result += ' '
            while ((len(result) % tabstop) != 0):
                result += ' '
        else:
            result += c    

    return result

score 2 · Answer 6 · answered Sep 05 '19 at 17:41

Here is the easiest way:

def replaceTab(text, tabs):
    return text.replace('\t', ' ' * tabs)

answered Sep 05 '19 at 17:41

Vignesh A


This approach simply does not provide what the question asks for ... – Claudio Aug 05 '23 at 22:14

score 1 · Answer 7 · answered Apr 17 '13 at 07:12

This code can help you:

initial_string = "My \tstring \ttest\t"
block_size = "5"
"".join([("{block_value:"+str(block_size)+"s}").format(block_value=block) 
    for block in initial_string.split("\t")])

You will need to study the format, split and join functions and the concept of list comprehensions.

answered Apr 17 '13 at 07:12

emigue


This approach does not work for strings in which the distance between the Tabs is larger than `block_size`. – Claudio Aug 05 '23 at 22:46

Jorge Antonio Galaz · Answer 8 · 2014-06-30T18:02:54.307

This program replaces all the tabs with spaces in a file:

def tab_to_space(line, tab_length=8):
    """Replace every tab ('\\t') in a string with spaces.

    The tab length is 8 by default.
    """
    while '\t' in line:
        first_tab_init_pos = line.find('\t')
        # Next tab stop strictly after the position of the tab.
        first_tab_end_pos = ((first_tab_init_pos // tab_length) + 1) * tab_length
        diff = first_tab_end_pos - first_tab_init_pos  # always 1..tab_length
        line = line.replace('\t', ' ' * diff, 1)
    return line


# The context manager closes both files even if an error occurs.
with open('inputfile.txt', 'r') as inputfile, open('outputfile.txt', 'w') as outputfile:
    for line in inputfile:
        outputfile.write(tab_to_space(line))

score 1 · Answer 9 · answered Apr 11 '18 at 20:35

If your requirement is to insert n spaces in place of each tab, you can simply write the code below. I have shown the implementation using two functions, each solving it in a different way. You can use either function!

For example, let the string be in the variable 'code' and 'x' be the size of the tab:

code = "def add(x, y)\f\treturn x + y"
x=4

def convertTabs(code, x):
    temp=""
    for i in range(0,x):
        temp+=" "
    return code.replace("\t",temp) 

def converTabs1(code,x):
    return code.replace("\t",x*" ")

Both functions above return the same value, but the second one is super awesome!

This approach simply does not provide what the question asks for ... — Claudio, Aug 05 '23 at 22:16
Can you please provide better explanation for your comment ? @Claudio — Ravi Bhanushali, Aug 28 '23 at 20:12

kzar · Answer 10 · 2015-04-29T13:36:11.387

0

I needed something similar, here's what I came up with:

import re

def translate_tabs(tabstop = 8):
  offset = [0]
  def replace(match, offset=offset):
    offset[0] += match.start(0)
    return " " * (tabstop - offset[0] % tabstop)
  return replace

re.sub(r'\t', translate_tabs(4), "123\t123") 
# => '123 123'

re.sub(r'\t', translate_tabs(5), "123\t123")
# => '123  123'

edited Apr 29 '15 at 13:36

answered Aug 20 '14 at 16:39

kzar


This approach does not correctly work for multiple Tabs in the string, because it does not consider the change of the Tab offset depending on the previous replacements not being a single space. – Claudio Aug 05 '23 at 23:27
See my answer for how can you fix it keeping the main idea of this interesting approach. – Claudio Aug 06 '23 at 00:04

score 0 · Answer 11 · answered Sep 07 '16 at 09:20

Using re.sub is enough.

import re

def untabify(s, tabstop=4):
    return re.sub(r'\t', ' ' * tabstop, s)

answered Sep 07 '16 at 09:20

Felix


This approach simply does not provide what the question asks for ... – Claudio Aug 05 '23 at 22:17

score 0 · Answer 12 · answered Aug 07 '20 at 07:15

Fix for @rémi's answer. This implementation honors the leading tab and any consecutive tabs:

def replace_tab(s, tabstop=4):
    result = str()
    for c in s:
        if c == '\t':
            if (len(result) % tabstop == 0):
                result += ' ' * tabstop
            else:
                while (len(result) % tabstop != 0):
                    result += ' '
        else:
            result += c
    return result

answered Aug 07 '20 at 07:15

Juande Manjon


What about ``` if c == '\t': result += ' ' while (len(result) % tabstop != 0): result += ' ' ``` ? – Claudio Aug 05 '23 at 22:12
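
The loop versions in this thread measure the column as `len(result)`, which counts from the start of the whole string, so a tab that follows a newline is aligned relative to the string rather than to its own line. A minimal sketch that keeps a separate column counter and resets it at `'\n'`; `replace_tab_multiline` is a hypothetical name, and only `'\n'` is treated as a line break here:

```python
def replace_tab_multiline(s, tabstop=4):
    """Expand tabs per line: the column restarts at 0 after each newline."""
    out = []
    col = 0  # column within the current line
    for c in s:
        if c == '\t':
            pad = tabstop - col % tabstop  # 1..tabstop spaces to the next stop
            out.append(' ' * pad)
            col += pad
        elif c == '\n':
            out.append(c)
            col = 0  # a new line starts at column 0
        else:
            out.append(c)
            col += 1
    return ''.join(out)


print(repr(replace_tab_multiline("ab\n\tc", 4)))  # 'ab\n    c'
```

Collecting the pieces in a list and joining once at the end also avoids building many intermediate strings with `+=`.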

score 0 · Answer 13 · answered Apr 07 '22 at 17:56

def expand_tabs(text: str, width: int = 8) -> str:
    """
    Expand each tab to one or more spaces
    """
    assert width > 0
    while (i := text.find('\t')) >= 0:
        text = text[:i] + ' ' * (width - i % width) + text[i+1:]
    return text

answered Apr 07 '22 at 17:56

Waxrat


Extremely inefficient approach. On each tab occurrence the string must be rebuilt (expensive), and on each tab occurrence `find` needlessly starts searching for the next tab from the very beginning of the string. – Claudio Aug 05 '23 at 21:55
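
One way to address both points is to resume `find` just after the previous tab and to collect the pieces in a list that is joined once at the end. A minimal sketch under those assumptions; `expand_tabs_linear` is a hypothetical name, and newlines are not treated specially:

```python
def expand_tabs_linear(text, width=8):
    """Expand tabs in a single left-to-right pass over `text`."""
    if width <= 0:
        raise ValueError("width must be positive")
    pieces = []
    col = 0    # current column in the output
    start = 0  # index in `text` where the next search begins
    while True:
        i = text.find('\t', start)  # search resumes after the previous tab
        if i == -1:
            pieces.append(text[start:])  # remainder without tabs
            break
        col += i - start
        pad = width - col % width
        pieces.append(text[start:i])
        pieces.append(' ' * pad)
        col += pad
        start = i + 1
    return ''.join(pieces)


print(repr(expand_tabs_linear("123\t123", 4)))  # '123 123'
```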

Claudio · Answer 14 · 2023-08-06T00:22:46.353

I am posting this as an answer only because it does not fit into a comment on the answer by kzar, who came up with quite an interesting approach (which does not answer the question, because it uses a module) but didn't get it correct:

import re
offsetAddon = 0
def spaces(tabSize=8):
  def replace(match):
    global offsetAddon
    spaceMultipl = (tabSize - (match.start(0) + offsetAddon) % tabSize)
    offsetAddon += (spaceMultipl - 1)
    return " " * spaceMultipl    
  return replace
tab=r'\t'
s="\t1\t12\t123\t1234\t12345\t123456\t1234567\t12345678\t12\t"
print(f'''"{re.sub(tab, spaces(4), s)}"''') # gives:  
# "    1   12  123 1234    12345   123456  1234567 12345678    12  "
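
Because `offsetAddon` is a module-level global that is never reset, a second `re.sub` call in the same session would start from a stale offset. A minimal sketch of the same idea with the counter kept inside the closure, so every call starts fresh; `make_tab_replacer` and `expand_with_re` are hypothetical names, and like the code above this relies on the `re` module:

```python
import re


def make_tab_replacer(tab_size=8):
    """Return a fresh replacement callback for re.sub with its own offset."""
    shift = 0  # how much longer the output is than the input so far

    def replace(match):
        nonlocal shift
        col = match.start() + shift      # column of this tab in the output
        pad = tab_size - col % tab_size  # 1..tab_size spaces to the next stop
        shift += pad - 1                 # one tab character became `pad` spaces
        return ' ' * pad

    return replace


def expand_with_re(s, tab_size=8):
    # A new replacer per call means no state leaks between calls.
    return re.sub(r'\t', make_tab_replacer(tab_size), s)


s = "\t1\t12\t123\t1234\t12345\t123456\t1234567\t12345678\t12\t"
print(f'"{expand_with_re(s, 4)}"')
print(f'"{expand_with_re(s, 4)}"')  # same result on the second call
```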
