Why does Awk's print act crazy when formatting two simple piped fields?

Question

Good Evening.

I ran into a strange phenomenon with awk's last field that I want to share with you.
I have a log file for social networks whose fields are separated by |. The fields themselves are not important, IMHO, but they appear in this format:
id|name|lastname|...|Social_Media_Used(nothing)
There are 9 separate fields.

Every row describes one user, e.g. ^random_numbers|Aris|something|...|Facebook$

The goal is to compute a total for every social medium used. I have done this with the code below.

grep -v '^#' $3 | awk -F\| '{print $9}' | sort | uniq -c | awk '{print $1$2}'

The first command removes the lines that start with #, which are treated as comments.

The second command prints field 9, which corresponds to the Social_Media_Used field. This is the last field, so I guess it will have \n at the end.

After that I sort and count the values, and the last awk prints the output like this:

884Blogger  
1105Facebook  
1326Flickr  
1104Google+  
1105Instagram  
1105LinkedIn  
1325Twitter  
1546Youtube

If I use this as the last command instead:
awk '{print $2$1}'
then something strange happens.
If I store the output in a file, it looks like this:

Blogger  
 884  
Facebook  
 1105  
Flickr  
 1326  
Google+  
 1104  
Instagram  
 1105
LinkedIn  
 1105  
Twitter  
 1325  
Youtube  
 1546

If, however, I view the output in the terminal, I see this:

884gger  
1105book  
1326kr  
1104le+  
1105agram  
1105edIn  
1325ter  
1546ube

DESIRED OUTPUT IS:
Blogger 884
Facebook 1105
Flickr 1326
Google+ 1104
Instagram 1105
LinkedIn 1105
Twitter 1325
Youtube 1546

I searched everything about sed and awk's RS, ORS or FRS, and I tried both print and printf, but I couldn't find anything that came even close to putting word-space-number on the same line, no matter how I print these lines. However, when I process a dummy file of 20 lines that I copy-pasted from the main file, everything goes smoothly. Everything also goes smoothly if I print field 8 or 7.

Where does the problem lie? In the length of the file (9500 lines)? Or in the fact that there is a newline after the word? What do you think?

Please add sample input (multiple rows) and your desired output for that sample input to your question. — Cyrus, Dec 19 '18 at 19:03

score 1 · Answer 1 · answered Dec 19 '18 at 19:16


Your data most likely includes \r\n (DOS/Windows) line endings. First run dos2unix file
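A small sketch of why a stray \r would produce exactly the symptoms in the question. The sample line is made up, in the shape `uniq -c` emits. With a \r\n ending, the \r stays attached to the last field, so `$2$1` becomes name, carriage return, count. A terminal moves the cursor back to column 0 on \r, and the count then overwrites the first characters of the name (hence `1105book`).

```shell
# Made-up sample line in the shape `uniq -c` produces; a DOS ending is added below.
line='   1105 Facebook'

# Count the carriage returns that survive into the output of the last awk stage.
cr_count=$(printf '%s\r\n' "$line" | awk '{print $2$1}' | tr -cd '\r' | wc -c | tr -d ' ')
echo "carriage returns in output: $cr_count"

# Make the hidden byte visible: od -c shows it as \r between the name and the count.
printf '%s\r\n' "$line" | awk '{print $2$1}' | od -c

# Stripping the \r first (roughly what dos2unix does to a file) gives the intended line.
fixed=$(printf '%s\r\n' "$line" | tr -d '\r' | awk '{print $2, $1}')
echo "$fixed"
```

Running `od -c` (or `cat -A` where available) on the real log file is a quick way to confirm whether the line endings are the cause before converting anything.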

You can also eliminate most of the pipeline with this:

$ awk -F\| '!/^#/{a[$9]++} END{for(k in a) print k,a[k]}' file | sort
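If running dos2unix on the file is not an option, the same one-pass count could strip the carriage return inside awk instead. This is only a sketch: `make_sample` and its rows are hypothetical stand-ins for the real 9-field log, and the `sub()` call is an addition to the command above, not part of it.

```shell
# Hypothetical sample: one comment line plus three 9-field records, all with \r\n endings.
make_sample() {
  printf '%s\r\n' \
    '# a comment line' \
    '1|Aris|a|b|c|d|e|f|Facebook' \
    '2|Bob|a|b|c|d|e|f|Twitter' \
    '3|Eve|a|b|c|d|e|f|Facebook'
}

# Drop a trailing \r from every record, skip comment lines, then count field 9.
counts=$(make_sample |
  awk -F'|' '{ sub(/\r$/, "") } !/^#/ { a[$9]++ } END { for (k in a) print k, a[k] }' |
  sort)
echo "$counts"
```

Because `sub()` modifies `$0`, awk re-splits the record, so `$9` no longer carries the \r when it is used as the array key.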

answered Dec 19 '18 at 19:16

karakfa


`!/^#/` does remove one pipe, it's true. Thank you! – Aris Barlos Dec 19 '18 at 19:43

Cyrus · Answer 2 · 2018-12-19T19:25:37.797


With GNU awk, replace

awk '{print $2$1}'

with

awk -v RS='\r*\n' '{print $2$1}'

to handle Unix and DOS/Windows line endings.
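A regular expression in RS is an extension that POSIX awk does not guarantee, which is presumably why the answer names GNU awk. As a hedged alternative for other awks, the last stage could remove the trailing \r itself. The sketch below runs the question's pipeline on made-up data (`make_log` and its rows are hypothetical) and uses a comma between the fields to get the space the asker wanted.

```shell
# Hypothetical stand-in for the log file: 9 fields per row, \r\n line endings.
make_log() {
  printf '%s\r\n' \
    '# a comment line' \
    '1|Aris|a|b|c|d|e|f|Facebook' \
    '2|Bob|a|b|c|d|e|f|Twitter' \
    '3|Eve|a|b|c|d|e|f|Facebook'
}

# Same stages as in the question; only the last awk changes:
# it strips the trailing \r, then prints name and count separated by OFS (a space).
result=$(make_log | grep -v '^#' | awk -F'|' '{print $9}' | sort | uniq -c |
  awk '{ sub(/\r$/, ""); print $2, $1 }')
echo "$result"
```

On input that already has Unix line endings the `sub()` simply matches nothing, so the same command handles both cases.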

edited Dec 19 '18 at 19:25

answered Dec 19 '18 at 19:20

Cyrus


Thank you very much! My code (after adding a comma in between) works perfectly. – Aris Barlos Dec 19 '18 at 19:39
A small detail improvement: replace `*` with `{0,1}` – Cyrus Dec 19 '18 at 19:45
