I am trying to remove newline characters that occur within quoted fields in a file.
I am able to achieve that using the code below:
awk -F"\"" '!length($NF){print;next}{printf("%s ", $0)}' filename.txt>filenamenew.txt
Note that I am creating a new file, filenamenew.txt. Is this avoidable? Can I run the command in place? I ask because the files are huge.
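To make the "in place" idea concrete, here is a minimal sketch that writes to a temporary file beside the original and then renames it over the original. The function name join_in_place and the file demo_inplace.txt are hypothetical, chosen only for illustration. This still needs room for a second copy while it runs, and as far as I know tools such as sed -i work the same way internally, so it tidies up the workflow rather than avoiding the extra write.

```shell
# Hypothetical helper: rewrite a file "in place" via a temporary file beside it.
join_in_place() {
  file=$1
  # mktemp creates the temporary file in the same directory as the original,
  # so the final mv is a rename rather than a copy across filesystems.
  tmp=$(mktemp "$file.XXXXXX") || return 1
  if awk -F'"' '!length($NF){print; next} {printf("%s ", $0)}' "$file" > "$tmp"; then
    mv "$tmp" "$file"      # replace the original only if awk succeeded
  else
    rm -f "$tmp"           # leave the original untouched on failure
    return 1
  fi
}

# Small demonstration file with the same shape as the first sample input.
printf '"id"|"name"\n"1"|"john\ndoe"\n' > demo_inplace.txt
join_in_place demo_inplace.txt
cat demo_inplace.txt
```

Note that the replaced file takes on the permissions of the temporary file, so those may need restoring afterwards.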
My file is pipe-delimited.
Sample input file:
"id"|"name"
"1"|"john
doe"
"2"|"second
name
in the list"
Using the above code I get the following output:
"id"|"name"
"1"|"john doe"
"2"|"second name in the list"
However, my files are huge, and some of the lines have a ^M character between the quotes, for example:
Second sample input file:
"id"|"name"
"1"|"john
doe"
"^M2"|"second^M^M
name
in the list"
Output using the above code:
"id"|"name"
"1"|"john doe"
name in the list"
So basically, if there is a ^M in a line, that string is not printed. I read online that ^M is equivalent to \r, so I used
tr -d '\r' < filename.txt
I also tried
awk -F"|" '{sub(/^M/,"")}1' filename.txt
but it did not remove those characters (^M).
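For reference, a minimal sketch of the combined pipeline, assuming the ^M characters really are carriage returns (\r) and that lines end in plain \n rather than \r\n. Two things seem worth noting. In an awk regex, a typed caret followed by M matches a letter M at the start of the line, not a carriage return, so /\r/ is the pattern to use. Also, the "missing" text in the output above is likely only a display effect: the terminal moves the cursor back to column 0 at each \r and overwrites what was already printed. The file names sample_cr.txt and sample_cr_joined.txt are hypothetical.

```shell
# Recreate the second sample input; printf turns \r into a real carriage return.
printf '"id"|"name"\n"1"|"john\ndoe"\n"\r2"|"second\r\r\nname\nin the list"\n' > sample_cr.txt

# Delete every carriage return first, then join lines that end inside an open quote.
tr -d '\r' < sample_cr.txt |
  awk -F'"' '!length($NF){print; next} {printf("%s ", $0)}' > sample_cr_joined.txt

cat sample_cr_joined.txt
```

The same removal can be done inside awk with gsub(/\r/, "") as the first action, which avoids the extra tr process.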
A little background on why I am doing this: I am extracting data from a relational table, loading it into a flat file, and checking that the row counts of the table and the file match. Because some columns contain \n, count(*) on the table and wc -l on the file do not match.
Final resolution:
In the long run I don't want to delete these unprintable characters. I want to replace them with some placeholder character or value (so that the counts between table and file match). Then, when I load the data back into a table, I want to replace the placeholder with the \n or ^M that was originally present, so that I do not tamper with the data.
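The round trip I have in mind could be sketched roughly as follows. The placeholder tokens <NL> and <CR>, the function names encode and decode, and the file names are all hypothetical. The sketch assumes the tokens never occur in the real data, that lines end in plain \n (not \r\n), and that a record is complete when the line ends with a closing quote.

```shell
# Replace embedded carriage returns and in-quote newlines with placeholder tokens.
encode() {
  awk -F'"' '
    { gsub(/\r/, "<CR>") }           # modifying $0 makes awk re-split the fields
    !length($NF) { print; next }     # line ends with a quote: record is complete
    { printf "%s<NL>", $0 }          # still inside a quoted field: mark the newline
  ' "$1"
}

# Put the original characters back before loading into the table.
decode() {
  awk '{ gsub(/<NL>/, "\n"); gsub(/<CR>/, "\r"); print }' "$1"
}

# Demonstration on the second sample input.
printf '"id"|"name"\n"1"|"john\ndoe"\n"\r2"|"second\r\r\nname\nin the list"\n' > original.txt
encode original.txt > encoded.txt
decode encoded.txt > restored.txt

wc -l < encoded.txt                 # one line per record, comparable with count(*) plus header
cmp original.txt restored.txt && echo "round trip OK"
```

With this sample the encoded file has three lines (the header plus two records), and the restored file is byte-for-byte identical to the original.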
Any suggestions are appreciated.
Thanks.