Bash + sed/awk/cut to delete nth character

Question

I am trying to delete the 6th, 7th and 8th characters of each line.

Below is the text file.

Current content:

#cat test 
18:40:12,172.16.70.217,UP
18:42:15,172.16.70.218,DOWN

Expected output after formatting:

#cat test
18:40,172.16.70.217,UP
18:42,172.16.70.218,DOWN

I tried the following, with no luck:

#awk -F ":" '{print $1":"$2","$3}' test
18:40,12,172.16.70.217,UP

#sed 's/^\(.\{7\}\).\(.*\)/\1\2/' test  { Here I can remove only one character }
18:40:1,172.16.70.217,UP

cut also failed:

#cut -d ":" -f1,2,3 test
18:40:12,172.16.70.217,UP

I need to delete the 6th, 7th and 8th characters of each line.

Suggestions, please.

The sed command is very near. To remove multiple characters, you'll need to match up to three characters between the capture groups: `s/^\(.\{5\}\).\{1,3\}\(.*\)/\1\2/` or more simply, with `sed -E`, just `s/^(.....)..?.?/\1/` – Toby Speight, Mar 14 '18 at 16:54

Tom Fenech · Answer 1 · 2018-03-14T14:41:05.910

With GNU cut you can use the --complement switch to remove characters 6 to 8:

cut --complement -c6-8 file

Otherwise, you can just select the rest of the characters yourself:

cut -c1-5,9- file

i.e. characters 1 to 5, then 9 to the end of each line.
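To make the arithmetic behind that character list explicit, here is a minimal sketch that builds the `cut -c` list from a first and last position. The function name `drop_chars` is made up for illustration, and the sketch assumes single-byte (ASCII) input and a first position of at least 2.

```shell
#!/bin/sh
# drop_chars FIRST LAST (hypothetical helper): delete characters FIRST..LAST
# (1-based, inclusive) from every line read on stdin.
# Assumes FIRST >= 2 and single-byte characters.
drop_chars() {
    first=$1
    last=$2
    # Keep everything before FIRST and everything after LAST.
    cut -c "1-$((first - 1)),$((last + 1))-"
}

printf '%s\n' '18:40:12,172.16.70.217,UP' '18:42:15,172.16.70.218,DOWN' |
    drop_chars 6 8
# prints:
# 18:40,172.16.70.217,UP
# 18:42,172.16.70.218,DOWN
```

With 6 and 8 as arguments the list expands to `1-5,9-`, the same list as in the command above.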

With awk you could use substrings:

awk '{ print substr($0, 1, 5) substr($0, 9) }' file

Or you could write a regular expression, but the result will be more complex.

For example, to remove the last three characters from the first comma-separated field:

awk -F, -v OFS=, '{ sub(/...$/, "", $1) } 1' file

Or, using sed with a capture group:

sed -E 's/(.{5}).{3}/\1/' file

Capture the first 5 characters and use them in the replacement, dropping the next 3.
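The same capture-group idea can be parameterised. The following is only a sketch: `drop_chars_sed` is a hypothetical name, and it assumes a sed that accepts `-E` (GNU and BSD sed both do). Lines too short to contain the whole span are left unchanged, because the pattern then fails to match.

```shell
#!/bin/sh
# drop_chars_sed SKIP COUNT (hypothetical helper): keep the first SKIP
# characters of each line and drop the COUNT characters that follow them.
drop_chars_sed() {
    skip=$1
    count=$2
    # Double quotes let the shell expand $skip and $count; \1 reaches sed as is.
    sed -E "s/^(.{$skip}).{$count}/\1/"
}

printf '%s\n' '18:40:12,172.16.70.217,UP' | drop_chars_sed 5 3
# prints: 18:40,172.16.70.217,UP
```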

Why would you want to, if `cut` does the job for you? – Tom Fenech Mar 14 '18 at 14:32

kvantour · Answer 2 · 2018-03-14T16:33:06.823

The solutions below are generic and assume no knowledge of the line format. They simply delete characters 6, 7 and 8 of every line.

sed:

sed 's/.//8;s/.//7;s/.//6' <file>   # from high to low
sed 's/.//6;s/.//6;s/.//6' <file>   # from low to high (subtract 1)
sed 's/\(.....\).../\1/'   <file>
sed 's/\(.\{5\}\).../\1/'  <file>   # BRE intervals need escaped braces

s/BRE/replacement/n :: substitute nth occurrence of BRE with replacement
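A small illustration of why the order of the numbered substitutions matters, using a made-up ten-letter line so the positions are easy to follow. Each deletion shifts the remaining characters one place to the left, so a naive ascending 6, 7, 8 removes the wrong characters.

```shell
#!/bin/sh
line='abcdefghij'

# Highest position first: earlier positions are not disturbed.
high_to_low=$(printf '%s\n' "$line" | sed 's/.//8;s/.//7;s/.//6')

# Position 6 three times: the next character slides into slot 6 each time.
low_to_high=$(printf '%s\n' "$line" | sed 's/.//6;s/.//6;s/.//6')

# Naive ascending order: removes the original 6th, 8th and 10th characters.
naive=$(printf '%s\n' "$line" | sed 's/.//6;s/.//7;s/.//8')

printf '%s\n' "$high_to_low" "$low_to_high" "$naive"
# prints:
# abcdeij
# abcdeij
# abcdegi
```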

awk:

awk 'BEGIN{OFS=FS=""}{$6=$7=$8="";print $0}' <file>
awk -F "" '{OFS=$6=$7=$8="";print}'          <file>
awk -F "" '{OFS=$6=$7=$8=""}1'               <file>

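These three commands are equivalent. Setting the field separator FS to the empty string makes awk treat every character as a separate field (a common extension, supported by gawk among others, rather than behaviour guaranteed by POSIX). We empty fields 6, 7 and 8 and reprint the line with an empty output field separator OFS.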
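If relying on an empty FS is a concern, a sketch that avoids it is to slice the line with `substr` and pass the positions in as awk variables. The helper name `drop_chars_awk` and the variable names `first` and `last` are assumptions made for this example.

```shell
#!/bin/sh
# drop_chars_awk FIRST LAST (hypothetical helper): delete characters
# FIRST..LAST (1-based, inclusive) from every line read on stdin.
drop_chars_awk() {
    awk -v first="$1" -v last="$2" '{
        # Everything before FIRST, followed by everything after LAST.
        print substr($0, 1, first - 1) substr($0, last + 1)
    }'
}

printf '%s\n' '18:40:12,172.16.70.217,UP' | drop_chars_awk 6 8
# prints: 18:40,172.16.70.217,UP
```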

cut:

cut -c -5,9- <file>
cut --complement -c 6-8 <file>

score 3 · Answer 3 · answered Mar 14 '18 at 16:01

It's structured text, so why count the characters if you can describe them?

$ awk '{sub(":..,",",")}1' file

18:40,172.16.70.217,UP
18:42,172.16.70.218,DOWN

This removes the seconds.
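A slightly stricter variation on the same idea, as a sketch: only touch the first comma-separated field, and only when it looks like `HH:MM:SS`. The function name `strip_seconds` is made up, and the assumption that the timestamp is always the first field comes from the sample data in the question.

```shell
#!/bin/sh
# strip_seconds (hypothetical helper): drop the ":SS" part of an HH:MM:SS
# timestamp in the first comma-separated field; other lines pass through.
strip_seconds() {
    awk -F, -v OFS=, '
        $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ { sub(/:[0-9][0-9]$/, "", $1) }
        { print }
    '
}

printf '%s\n' '18:40:12,172.16.70.217,UP' '18:42:15,172.16.70.218,DOWN' |
    strip_seconds
# prints:
# 18:40,172.16.70.217,UP
# 18:42,172.16.70.218,DOWN
```

Unlike a fixed character range, this leaves a line alone if its first field is already `HH:MM` or is not a timestamp at all.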

karakfa

score 2 · Answer 4 · answered Mar 14 '18 at 16:19

Just for fun: Perl, where you can assign to a substring:

perl -pe 'substr($_,5,3)=""' file

glenn jackman

score 0 · Answer 5 · answered Mar 14 '18 at 14:43

With awk:

echo "18:40:12,172.16.70.217,UP" | awk '{ $0 = ( substr($0,1,5) substr($0,9) ) ; print $0}'

Regards!

Matias Barrios

score 0 · Answer 6 · answered Mar 14 '18 at 15:50

If you are running bash, you can use its built-in string manipulation instead of calling awk, sed, cut or another external binary:

while read STRING
do
  # bash offsets are zero-based, so offset 8 is the 9th character
  echo ${STRING:0:5}${STRING:8}
done < myfile.txt

${STRING:0:5} represents the first five characters of your string; ${STRING:8} skips the first eight characters, so it represents the 9th character and all remaining characters until the end of the line. This way you cut out characters 6, 7 and 8.

Heiko Gerstung

You're thinking about this backwards. awk, sed, cut or whatever binaries are what you're **supposed** to call to manipulate text when you're using a shell. They're much faster, more robust, and more portable than trying to use shell loops, etc. since shell wasn't designed to manipulate text, it was designed to call those tools to manipulate text when necessary. The code you wrote, for example, contains multiple bugs and is slow. See https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice for a discussion of some of the related issues. – Ed Morton Mar 14 '18 at 18:07
I respectfully disagree. I know that my proposed solution only works in bash (that is, it is not a POSIX-compliant solution), but this is what the original poster said she/he is working with. And using the built-in shell functions is much faster than starting two or three processes when using sed and awk. – Heiko Gerstung Mar 15 '18 at 07:13
I would appreciate a hint why the code I posted has several bugs in it. And I accept that on desktop and server systems, starting multiple processes to achieve the same outcome is as fast or sometimes a little bit faster than my suggestion. And thanks for the link, interesting discussion. – Heiko Gerstung Mar 15 '18 at 07:24
Some potential improvements to your solution would be: `while read -r string` (use `-r` to prevent the shell from trying to be smart with backslashes, and avoid uppercase variable names), `echo "${string:0:5}${string:8}"` (quotes to prevent possible glob expansion and white space collapsing; offset 8 because bash offsets are zero-based). Bear in mind that this would still do something weird if the line started with `-e` or `-n` because `echo` would think that it was a switch. – Tom Fenech Mar 15 '18 at 09:39
... and that "something weird" would depend on the version of echo you're using, as would what happens to characters preceded by backslashes when they get to the echo, even if they got past the read. You should also set `IFS=` on the loop by default. It might not matter in this case given the OP's input, but that's always what you do by default, like quoting vars. @HeikoGerstung - again, see https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice for why your shell loop would be an order of magnitude slower than, say, just calling awk. – Ed Morton Mar 15 '18 at 16:32
Just to demonstrate, I created a file that's 100,000 lines long and ran both your shell loop script and one of Tom's awk scripts (`'{ print substr($0, 1, 5) substr($0, 9) }'`) on it. Here are the 3rd-run timing results: awk = `real 0m0.187s, user 0m0.156s, sys 0m0.015s` and shell = `real 0m25.383s, user 0m4.851s, sys 0m20.389s`. So about **0.2 seconds** for awk vs **25 seconds** (yes, 25, not 2.5 nor 0.25) for the shell loop. – Ed Morton Mar 15 '18 at 16:47
I forgot to say earlier - you should be using `printf '%s\n' "${STRING:0:5}${STRING:8}"` (offset 8, since bash offsets are zero-based) instead of an unquoted `echo` to deal with some of the weirdness, errors that Tom mentioned, non-portability, etc. It'll still be extremely slow though. – Ed Morton Mar 15 '18 at 17:14
I couldn't put the output of the cut... variants into variables, but this worked. Good for fewer than 10,000 iterations, I suppose. – Harshiv Jul 01 '19 at 05:30
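Pulling the suggestions from this comment thread together, a bash-only loop might look like the sketch below. The function name `drop_6_to_8` is made up for illustration. It uses offset 8 so that the 9th character is kept (bash offsets are zero-based), and it is still expected to be much slower than cut, sed or awk on large files.

```shell
#!/bin/bash
# drop_6_to_8 (hypothetical helper): delete characters 6, 7 and 8 from each
# line on stdin using only bash built-ins.
drop_6_to_8() {
    # IFS= preserves leading/trailing whitespace; -r keeps backslashes literal.
    # The "|| [ -n ... ]" test also handles a last line with no trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # printf avoids echo's handling of -n, -e and backslash escapes.
        printf '%s\n' "${line:0:5}${line:8}"
    done
}

printf '%s\n' '18:40:12,172.16.70.217,UP' | drop_6_to_8
# prints: 18:40,172.16.70.217,UP
```

To store the result for a single line in a variable, command substitution works as usual, for example `short=$(printf '%s\n' "$line" | drop_6_to_8)`.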
