0

How can I cut the leading zeros in the third field so it will only be 6 characters?

 xxx,aaa,00000000cc
 rrr,ttt,0000000yhh

Desired output:

  xxx,aaa,0000cc
  rrr,ttt,000yhh
starpark

3 Answers

3

Here's a solution using awk:

 echo " xxx,aaa,00000000cc
 rrr,ttt,0000000yhh"|awk -F, -v OFS=, '{sub(/^0000/, "", $3)}1'

output

 xxx,aaa,0000cc
 rrr,ttt,000yhh

awk uses -F (or the FS variable) to set the input field separator, and you must also set OFS, the output field separator, so that the modified line is rebuilt with commas.

sub(/srchtarget/, "replacementstring", stringToFix) uses a regular expression to look for four 0s at the front (^) of the third field ($3) and replaces them with an empty string.
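Because the third argument to sub() is $3, the ^ anchor refers to the start of that field, not the start of the line, so leading zeros in other fields should be left alone. A minimal sketch of that, where the function name strip_third and the sample data (with zeros in field 2) are made up for illustration:

```shell
# Hypothetical helper: strip exactly four leading zeros from field 3 only.
strip_third() {
  awk -F, -v OFS=, '{ sub(/^0000/, "", $3) } 1'
}

# Field 2 of the first line also starts with 0000; it is not touched.
printf '%s\n' 'xxx,0000aaa,00000000cc' 'rrr,ttt,0000000yhh' | strip_third
```

With this input the sketch prints xxx,0000aaa,0000cc and rrr,ttt,000yhh.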

The 1 is a shorthand for the print statement. A longhand version of the script would be

echo " xxx,aaa,00000000cc
 rrr,ttt,0000000yhh"|awk -F, -v OFS=, '{sub(/^0000/, "", $3);print}'
 # ---------------------------------------------------------^^^^^^

It's all related to awk's /pattern/{action} idiom.
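As a rough sketch of that idiom: a rule can be a pattern alone (the default action prints the line), an action alone (it runs for every line), or both. The function name rule_demo and the sample input below are invented for the demo:

```shell
# Hypothetical demo of the rule shapes; the sample data is made up.
rule_demo() {
  awk -F, '
    $3 ~ /^0/                        # pattern only: default action prints the line
    NR == 1 { print "fields: " NF }  # pattern and action
    { n++ }                          # action only: runs for every line
    END { print "lines: " n }        # special pattern: runs after the last line
  '
}

printf '%s\n' 'xxx,aaa,00000000cc' 'rrr,ttt,yhh' | rule_demo
```

Only the first line has a third field starting with 0, so the output is that line, then "fields: 3", then "lines: 2". The trailing 1 in the answer is the pattern-only case with a pattern that is always true.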

IHTH

shellter
  • Much better answer than mine. Would you mind explaining what that `1` at the end does? – Andrew Magee Mar 04 '15 at 03:34
  • @AndrewMagee : Better, well, shorter, anyway. Thanks. I've added a short explanation. If you read `awk` postings for a week or so, you'll find fuller definitions of how the `/pattern/{action}` works. Good luck to all. – shellter Mar 04 '15 at 03:48
  • @AndrewMagee: `1` is shorthand for: print the (potentially modified) line at hand unconditionally. Technically, `1` serves as a _pattern_ that always evaluates to true (non-negative numbers in awk are considered true in a Boolean context). awk programs come in pattern-action pairs: if the pattern matches, the associated action (`{...}`) is executed. Patterns that do not have an associated action default to printing the line at hand. In `awk`, much is about what's left _unsaid_; the cleverly designed default behavior allows for very terse programs. – mklement0 Mar 04 '15 at 03:55
  • @mklement0 : Well put! Thanks for that. Good luck to all. Wow! Love your answer to http://stackoverflow.com/questions/12882611/how-to-get-bc-to-handle-numbers-in-scientific-aka-exponential-notation/28846040#28846040 . – shellter Mar 04 '15 at 04:09
  • @shellter: thanks - almost: of course, I meant to say: _nonzero_ numbers in awk are considered true in a Boolean context -- and thanks for the compliment on the linked answer. – mklement0 Mar 04 '15 at 04:12
1

If you can assume there are always three fields and you want to strip off the first four zeros in the third field you could use a monstrosity like this:

$ cat data
xxx,0000aaa,00000000cc
rrr,0000ttt,0000000yhh

$ cat data |sed 's/\([^,]\+\),\([^,]\+\),0000\([^,]\+\)/\1,\2,\3/'
xxx,0000aaa,0000cc
rrr,0000ttt,000yhh

Another more flexible solution if you don't mind piping into Python:

cat data | python -c '
import sys
for line in sys.stdin:
  print(",".join([f[4:] if i == 2 else f for i, f in enumerate(line.strip().split(","))]))
'

This says "remove the first four characters of the third field but leave all other fields unchanged".

Andrew Magee
  • how if I also have 4 leading zeros at another column that i don't need to cut? I just need the 3rd column.. thanks – starpark Mar 04 '15 at 01:28
  • I may consider this one. But how about if I have more than 15 columns and I will only remove particular leading zeros in column 3, isn't it tedious? – starpark Mar 04 '15 at 02:40
  • It is a little bit tedious. Though it doesn't matter how many columns you have *after* the one you want to modify as you can just match a `.*` at the end. – Andrew Magee Mar 04 '15 at 02:41
  • Actually it will already work regardless of how many columns you have after the one you want to modify; they'll just remain unchanged. If you wanted to modify the 15th column, though, this solution would be a bit ridiculous. – Andrew Magee Mar 04 '15 at 02:43
0

Using awk's substr should also work:

awk -F, -v OFS=, '{$3=substr($3,5,6)}1' file
xxx,aaa,0000cc
rrr,ttt,000yhh

It just takes 6 characters starting at position 5 of field 3 and assigns them back to field 3.
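The fixed start position assumes every value in field 3 is exactly 10 characters long. If the lengths vary, one possible variation is to drop leading zeros one at a time until the field is down to 6 characters. This is only a sketch: the name trim_to6 and the extra sample lines are assumptions, not part of the question.

```shell
# Hypothetical helper: remove leading zeros from field 3 until it is
# at most 6 characters long; non-zero characters are never removed.
trim_to6() {
  awk -F, -v OFS=, '{
    while (length($3) > 6 && $3 ~ /^0/)
      $3 = substr($3, 2)   # drop one leading zero
  } 1'
}

printf '%s\n' 'xxx,aaa,00000000cc' 'rrr,ttt,000yhh' 'sss,uuu,0012345678' | trim_to6
```

Here the first line becomes xxx,aaa,0000cc, the second is already 6 characters and stays as is, and the third stops at 12345678 because no leading zeros remain.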

Jotne