I need to remove white spacing for lines that are separated by white spacing, in a .txt file, from Ubuntu

Question

Hello!!

As the title says, I need to remove the blank lines that separate two lines of text by only a single blank line. As you can see in the image, the blank lines marked with a green line are the ones I need to remove: each of them is a single blank line between two lines. The runs of several blank lines marked with a red line must not be removed. I do not know whether AWK, SED, or CUT will work; the problem is that I do not know how to do it. Thank you for your help.

I tried to do it with SED and with AWK as follows, but neither attempt had any effect:

awk -F, '{gsub("\n","",$1); print}' archivo.txt

sed 's/ //g' input.txt > no-spaces.txt

Can you add actual example input text and the expected result to the question (instead of a screenshot)? — Benjamin W., Aug 01 '23 at 00:21
Please post code, data, and results as text, not screenshots ([how to format code in posts](https://stackoverflow.com/help/formatting)). [Why should I not upload images of code/data/errors?](https://meta.stackoverflow.com/questions/285551/why-should-i-not-upload-images-of-code-data-errors) http://idownvotedbecau.se/imageofcode — Barmar, Aug 01 '23 at 00:22

score 2 · Accepted Answer · answered Aug 01 '23 at 02:27

Assumptions:

The input file has "\n" (not "\r\n") line endings.
A non-empty line contains at least two characters.
We don't have to care about an empty line at the beginning or the ending of the file.

If GNU sed, which supports the -z (slurp) option and the \n notation, is available, would you please try:

sed -Ez "s/([^\n]\n)\n([^\n])/\1\2/g" input.txt > no-spaces.txt

Example of input.txt:

line1
line2 # following blank line should be removed

line3 # following blank lines should be kept



line4

Output:

line1
line2 # following blank line should be removed
line3 # following blank lines should be kept



line4

Sed normally processes the input line by line, which is why a pattern cannot match across multiple lines. The -z option changes this behavior by setting the input line separator to the NUL character, so a text file that contains no NUL bytes is read as a single record.
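To make the difference concrete, here is a minimal sketch that runs the same three-line input through sed in its default mode and with -z. It assumes GNU sed; the variable names line_mode and slurp_mode are only for illustration.

```shell
# Default mode: sed strips the trailing newline before matching,
# so a pattern containing \n has nothing to match and the input is unchanged.
line_mode=$(printf 'a\n\nb\n' | sed 's/\n//g')

# GNU sed -z: the whole input is one record, newlines included,
# so the doubled newline can be matched and replaced.
slurp_mode=$(printf 'a\n\nb\n' | sed -z 's/\n\n/\n/g')

# Print both results, separated by a marker line.
printf '%s\n---\n%s\n' "$line_mode" "$slurp_mode"
```

This is also why a substitution that targets "\n" in line-by-line mode appears to do nothing.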

([^\n]\n) matches the last character of a non-blank line together with its newline. \1 is set as a backreference.
\n is the blank line in between (to be removed).
([^\n]) matches the first character of the following non-blank line. \2 is set as a backreference.
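The "at least two characters" assumption follows from how the two groups consume text. As a small sketch (GNU sed assumed; the wrapper name remove_single_blank is made up for this example), the same command behaves differently on one-character lines:

```shell
# Hypothetical wrapper around the command from this answer (needs GNU sed for -z).
remove_single_blank() {
  sed -Ez 's/([^\n]\n)\n([^\n])/\1\2/g' "$@"
}

# Lines of two or more characters: both lone blank lines are removed.
printf 'ab\n\ncd\n\nef\n' | remove_single_blank

# One-character lines: "b" is consumed by the first match as group 2,
# so it cannot start the next match and the second blank line survives.
printf 'a\n\nb\n\nc\n' | remove_single_blank
```

With longer lines the first character of a line (group 2) and its last character (start of the next group 1) are different characters, so consecutive matches never overlap.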

Btw, the following will work with any POSIX-compliant sed with the help of bash:

#!/bin/bash

# define newline character for replacement
NL=$'\\\n'

sed -E '
:l
N
$!b l
# first slurp all lines in the pattern space
# and perform the replacements over the lines
s/([^'"$NL"']'"$NL"')'"$NL"'([^'"$NL"'])/\1\2/g
' input.txt > no-spaces.txt

score 0 · Answer 2 · edited Aug 01 '23 at 18:40

With any POSIX awk:

awk '/^[[:space:]]*$/{n++; p=$0; next} {if(n==1) print p; print; n=0}' file

If the current line contains only spaces (/^[[:space:]]*$/) increment variable n, store current line in variable p, and move to next line (next). Else (the current line contains non-space characters), if n==1 print the previous empty line stored in variable p, then print the current line and reset n.

Note: if the last line contains only spaces and is preceded by a line containing non-space characters, it is not printed. If it must be printed try:

awk '/^[[:space:]]*$/{n++; p=$0; next} {if(n==1) print p; print; n=0}
  END {if(n==1) print p}' file

Note: if you want to treat only empty lines this way, replace /^[[:space:]]*$/ with /^$/.
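For the behaviour described in the question (drop a blank line that stands alone between two lines, keep runs of two or more), one possible variant buffers each run of blank lines and decides what to do with it when the next non-blank line arrives. This is only a sketch, not the command above: the function name drop_lone_blank_lines and the variables buf and seen are made up for illustration, "blank" is assumed to mean empty or whitespace-only, and blank lines at the very start or end of the file are assumed to be kept.

```shell
# Hypothetical helper: drop a blank line only when it stands alone
# between two non-blank lines; runs of two or more are kept intact.
drop_lone_blank_lines() {
  awk '
    /^[[:space:]]*$/ {      # blank line: buffer it, do not print yet
      buf[++n] = $0
      next
    }
    {
      # Flush the buffered run unless it is a single blank line
      # that follows an earlier non-blank line.
      if (n > 1 || (n == 1 && !seen))
        for (i = 1; i <= n; i++) print buf[i]
      n = 0
      seen = 1
      print
    }
    END {                   # blank lines at the end of the file are kept
      for (i = 1; i <= n; i++) print buf[i]
    }
  ' "$@"
}

# Usage: read from a file or from standard input.
printf 'line1\n\nline2\n\n\nline3\n' | drop_lone_blank_lines
```

Buffering the whole run, rather than only the last blank line, is what allows a run of two or more to be printed back unchanged.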

score 0 · Answer 3 · answered Aug 01 '23 at 05:33

This might work for you (GNU sed):

sed -E '1N;:a;N;/(\S.*\n)\n(.*\S)/{s//\1\2/;N;ba};P;D' file

Open a 3 line window.

If two non-empty lines sandwich an empty line, remove the empty line and maintain the 3 line window.

Otherwise, print/delete the first line and repeat.

N.B. 1N ensures the 3 line window is created, the N following the substitution ensures this likewise.
