26

I have this awk script that I use to filter genes that are differentially expressed. I have a csv file that was created in R.

# Command to get DE genes
awk -F '\t' '$14 < 0.05 && $10 < -1 && $7 > 1 { print > "Genes-Down.csv" }
             $14 < 0.05 && $10 > +1 && $8 > 1 { print > "Genes-Up.csv" }' Results-RPKMs.csv

I have started doing all my analyses on Mac OS, and the same command does not work there. It does not print any error message either: it runs and nothing happens. I had similar problems with some sed commands, but those were easy to replace with awk.


Update: The Mac OS X awk is version 20070501, whereas the Ubuntu machine has mawk 1.3.3. The command awk --version did not work; I had to use awk -W --version. I think that difference is why the command works on Ubuntu but was not working on Mac OS X. I downloaded mawk, installed it using fink, and now the command works on Mac OS X. Thanks for your help.

Update 2: Actually, the problem was not awk. I usually create the CSV files in R and then just run the script to do the filtering. It turns out that if I open the CSV files in Excel, or save an Excel file in CSV format, the script does not work (I tried several times with different delimiters). Apparently, if you save a spreadsheet as .csv on Mac OS X (Excel 2011) and try to open it again in Excel, it says that it is a SYLK file. There is a description of this on the Microsoft website. If I use OpenOffice, it works just fine.
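
A quick way to check a re-saved file before running the filter is to look at how many records and tab-separated fields awk actually sees. The sketch below rests on an assumption that the post does not confirm: that the spreadsheet export changed the line endings (CRLF or bare CR) or the delimiter. The helper names `normalise_eol` and `count_fields` and the sample data are made up for illustration.

```shell
# Hypothetical helper: print the file with CRLF or bare-CR line endings converted to LF.
# Note: it also drops empty lines, which is acceptable for a results table.
normalise_eol() {
    tr '\r' '\n' < "$1" | awk 'length($0) > 0'
}

# Hypothetical helper: report how many tab-separated fields awk sees on the first line of stdin.
# The filter reads $14, so anything below 14 means the delimiter or line endings are off.
count_fields() {
    awk -F '\t' 'NR == 1 { print NF; exit }'
}

# Demo on a small stand-in file with CR-only line endings.
sample=$(mktemp)
printf 'gene\tvalue\rg1\t0.5\rg2\t2.0\r' > "$sample"
awk 'END { print NR }' "$sample"                   # the whole file is read as one record
normalise_eol "$sample" | awk 'END { print NR }'   # three records after conversion
normalise_eol "$sample" | count_fields             # fields on the header line
rm -f "$sample"
```

If the field count on the real file is 1, the file is probably no longer tab-delimited, and `-F '\t'` would need to change to match whatever delimiter the export used.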

codeforester
degopwn
  • try grouping the `&&` as `($14 < 0.05 && $10 < -1) && $7 > 1` – Avinash Raj Jun 20 '14 at 17:50
  • `cat -vet Results-RPKMs.csv | head -10`. Do you see `^M$` at the end of each line? If so, then `dos2unix Results-RPKMs.csv`. Else edit your question to include results of `awk --version` from both machines. Good luck. – shellter Jun 20 '14 at 18:12
  • I just tried your script with `BSD awk` version 20070501 on MacOS X and it worked without a hitch, as did `mawk` and `gawk` – Scrutinizer Jun 21 '14 at 10:28
  • FYI mawk is a minimally-featured awk, stripped down to help it run a bit faster than some other awks. You'd be much better off installing the feature-rich, POSIX-superset gawk. – Ed Morton Jun 21 '14 at 11:19

2 Answers

39

I had the same problem. Installing gawk on OS X 10.11.2 through brew solved my issue.

~$ brew install gawk
~$ gawk --version | head -n 1
GNU Awk 4.1.4, API: 1.1 (GNU MPFR 3.1.4-p1, GNU MP 6.1.1)
~$
vikas027
29

The same command name does not mean the same program. Most basic commands have more than one implementation. AWK is one example, and almost all of the GNU core utilities have BSD-licensed equivalents. Be careful with GNU sed versus BSD sed too; that is another pitfall.
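
One well-known instance of the sed pitfall is in-place editing: GNU sed accepts `sed -i 's/a/b/' file`, while BSD sed expects a backup-suffix argument after `-i` (for example `sed -i '' 's/a/b/' file`). A minimal sketch that avoids `-i` altogether, so the same script runs on both, could look like this. The helper name `sed_inplace` and the sample data are made up for illustration.

```shell
# Hypothetical helper: sed_inplace SCRIPT FILE
# Edits FILE in place without relying on -i, whose syntax differs between GNU and BSD sed.
sed_inplace() {
    tmp=$(mktemp) || return 1
    if sed -e "$1" "$2" > "$tmp"; then
        cat "$tmp" > "$2"    # overwrite the contents, keeping the original file's permissions
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Demo on a throwaway file.
demo=$(mktemp)
printf 'padj=NA\n' > "$demo"
sed_inplace 's/NA/1/' "$demo"
cat "$demo"
rm -f "$demo"
```

Writing to a temporary file first means the original is only overwritten if sed succeeded.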

In practice, Linux distributions generally use gawk or mawk:

$ man awk
mawk - pattern scanning and text processing language

Mac OS generally uses nawk:

$ man awk
awk - pattern-directed scanning and processing language

See this page for more information about AWK implementations.
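
To find out which implementation a given machine runs without reading the man page, a small sketch like the following can help. It assumes that gawk and the BSD awk answer `--version` and that older mawk releases answer `-W version`; other awks may answer neither, in which case the function falls back to "unknown awk". The helper name `awk_flavour` is made up for illustration.

```shell
# Hypothetical helper: print the first line of whatever version banner the given awk offers.
awk_flavour() {
    cmd=${1:-awk}
    # stdin comes from /dev/null so an awk that misreads a flag as a program cannot hang.
    for flag in '--version' '-W version' '-version'; do
        # $flag is left unquoted on purpose so that '-W version' splits into two arguments.
        banner=$($cmd $flag 2>/dev/null </dev/null | head -n 1)
        if [ -n "$banner" ]; then
            printf '%s\n' "$banner"
            return 0
        fi
    done
    printf 'unknown awk\n'
}

awk_flavour awk
```

Calling `awk_flavour gawk` or `awk_flavour mawk` checks a specific binary instead of whatever `awk` resolves to on the PATH.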

Zulu