-1

I have a file abc.txt which has leading spaces on some lines. Some lines have 4 spaces while others have more than 4. I want to convert the first 4 spaces into a tab, leaving the rest of the spaces as they are. I tried

unexpand -t 4 --first-only abc.txt > efg.txt and also some sed equivalents. They converted all of my spaces to tabs rather than only the first sequence of 4 spaces. How can this be achieved in shell and Ruby?

Stefan
codec

2 Answers

3

You can easily do this with sed:

sed "s/^    /$(printf '\t')/g" abc.txt > efg.txt

For more on why the $(printf '\t') is necessary, check out this answer. As stated in another answer and discussed at length in comments, this pattern could also be expressed as:

sed "s/^ \{4\}/$(printf '\t')/g" abc.txt > efg.txt

Depending on your preference, you might find one or the other more explicit and easier to reason about.
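As a quick check of the behaviour described above, here is a minimal sketch that runs the second pattern on some sample input. The function name leading4_to_tab and the sample lines are made up for illustration. The g flag is left out here because a pattern anchored with ^ can match at most once per line. Piping through sed -n l prints each line unambiguously, so a tab shows up as \t.

```shell
# Hypothetical helper: convert only the first four leading spaces of each
# line into a tab, leaving any further spaces untouched.
tab=$(printf '\t')
leading4_to_tab() {
    sed "s/^ \{4\}/${tab}/"
}

# Sample input (made up): a line with 4 leading spaces, a line with 8,
# and a line whose only run of spaces is in the middle.
printf '    four\n        eight\nno    indent\n' | leading4_to_tab | sed -n l
```

The 8-space line should come out as one tab followed by the remaining four spaces, and the line with spaces only in the middle should be unchanged.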

Matthew Story
  • This worked but it created an 8 space tab instead of 4. – codec Jul 03 '18 at 05:21
  • A tab is a tab. How it is displayed has to do with the configuration of the program you are viewing it with. Most programs default to a tabstop of `8`. Which program are you using to view the file? – Matthew Story Jul 03 '18 at 05:23
  • For context, here's how you set a tabstop of 4 in bash: https://stackoverflow.com/questions/10782699/how-to-set-4-space-tab-in-bash or in vi: https://stackoverflow.com/questions/1878974/redefine-tab-as-4-spaces – Matthew Story Jul 03 '18 at 05:25
  • If the file is not huge and one wants to do the changes inplace, `sponge` (`moreutils`) might be used here. – Aleksei Matiushkin Jul 03 '18 at 05:32
  • sponge is neat! You can do changes in place with sed using the `-i` option as well. I assumed this wasn't required/desirable as the OP listed a different in/out file. In general the `-i` sed option is looked down on in favor of something like: `sed 's/find/replace/' in.txt > in.txt.tmp && mv in.txt.tmp in.txt` ... which is a very standard shell idiom for idempotently replacing files. – Matthew Story Jul 03 '18 at 05:35
  • {4} is not BRE, you must use \{4\} or sed -E – ctac_ Jul 03 '18 at 08:00
  • Good call @ctac_, fixed. – Matthew Story Jul 03 '18 at 15:40
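Two points from the comments above can be sketched in shell: replacing a file through a temporary file, and viewing the result with 4-column tab stops. This is only a sketch. The function names retab_in_place and show_with_tabstop4 are hypothetical, and the .tmp suffix simply follows the naming used in the comment.

```shell
# Hypothetical helper: rewrite a file in place via a temporary file.
# The original is only replaced if sed exits successfully.
retab_in_place() {
    file=$1
    tmp="${file}.tmp"
    tab=$(printf '\t')
    sed "s/^ \{4\}/${tab}/" "$file" > "$tmp" && mv "$tmp" "$file"
}

# Hypothetical helper: print a file with tabs expanded to 4-column tab
# stops, independent of how the terminal or editor is configured.
show_with_tabstop4() {
    expand -t 4 "$1"
}
```

For example, retab_in_place abc.txt followed by show_with_tabstop4 abc.txt should display the file with the same indentation width as before the conversion, even though the first four spaces of each indented line are now stored as a single tab character.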
1

In ruby:

'/path/to/file'.tap do |file|
  File.write(file, File.read(file).gsub(/^ {4}/, "\t"))
end

To avoid loading the whole file into memory, process it line by line instead (for example with File.foreach or IO#each_line) and write the result to a separate file.

Aleksei Matiushkin
  • This looks like it reads the whole file into memory, subs, and then writes the whole file back to disk rather than operating line-by-line? Also I find the ` {4}` pattern funny as both four literal spaces and the pattern ` {4}` are exactly four spaces wide. – Matthew Story Jul 03 '18 at 05:28
  • @MatthewStory yes, that’s not the most efficient version and in the real life, I would go with `sed` _if_ the size of the file is critically huge. – Aleksei Matiushkin Jul 03 '18 at 05:29
  • I'd also rather be told "this is 4 spaces" (i.e., ` {4}`), even if it is the same character length, than have to count and figure out how many spaces that is on my own... and it makes it clear "yes, I intended 4 spaces", not "my cat sat on my space key, and we ended up with 4"... also, it's easier to change if they decide they want 6 spaces or whatnot. – Simple Lime Jul 03 '18 at 05:35
  • @SimpleLime yeah ... I could go either way on that. I considered going with ` {4}` in my solution as well and just thought it was kind of funny that they were the same length. Definitely not criticizing using that style over four literal spaces, just sharing an observation. – Matthew Story Jul 03 '18 at 05:38
  • Ah, yeah, on my initial reading I read your comment on it as more of a passive-aggressive attack on doing it that way, and I kicked into argumentative mode. Rereading it not even 5 minutes later, it definitely isn't as passive-aggressive as I thought. – Simple Lime Jul 03 '18 at 05:43
  • @SimpleLime c’mon :) – Aleksei Matiushkin Jul 03 '18 at 05:44