I am trying to parse some text files from the command line. Part of this involves rejoining words that were split across lines in some badly formatted emails. An example:
9,650 330,765.0 16.38% NYSE (000) 1,707,915 272,099.0 18.95% Commodit=
ies Close Change % Change Crude Oil (Feb) 19.62 0.32 1.66% Heating Oil (Ja=
I want to get 'Commodities' back as one word. I'm using this sed workaround to get the job done.
I'm using Mac OS X 10.7 and GNU sed version 4.2.1. If I enter this at the command line,
sed ':a;N;$!ba;s/=\r\n//g' ./filename
sed works correctly. However, if I run this bash script:
#!/bin/bash
sed ':a;N;$!ba;s/=\r\n//g' filename
sed doesn't work. The same script does work from Ubuntu's command line, though, and gives:
9,650 330,765.0 16.38% NYSE (000) 1,707,915 272,099.0 18.95% Commodities Close Change % Change Crude Oil (Feb) 19.62 0.32 1.66% Heating Oil (Jan)
On my Mac, the simpler script
#!/bin/bash
sed 's/=//g' filename
successfully removes all the equal signs. I've tried escaping different combinations of characters, but without much success. Any hints as to what the Mac terminal doesn't like?
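In case it helps anyone reproduce this, here is a minimal sketch of a test script. It assumes the file has CRLF line endings (which the \r\n in my pattern implies) and that the sed being run is GNU sed. The file name sample.txt and its contents are made up for illustration. It also prints which sed the script resolves, since one possibility I have not confirmed is that the script sees a different PATH, and so a different sed, than my interactive shell.

```shell
#!/bin/bash
# Hypothetical reproduction: sample.txt and its contents are invented for this test.
# Assumption: lines end in CRLF and a trailing "=" marks a broken word.
printf 'Commodit=\r\nies Close\r\n' > sample.txt

# Show which sed this script actually resolves.
command -v sed
# GNU sed prints a version line here; a sed without --version prints nothing.
sed --version 2>/dev/null | head -n 1 || true

# Same command as in the question: slurp all lines, then delete every "=" + CRLF.
joined=$(sed ':a;N;$!ba;s/=\r\n//g' sample.txt)
printf '%s\n' "$joined"
```

If the path printed inside the script differs from what `command -v sed` shows at the prompt, the two runs are not using the same sed.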