I have a tab limited data that reads
1 0 0 1 1 Black Swan
0 0 1 0 0 Golden Duck
1 0 0 1 0 Brown Eagle
0 0 1 0 1 Golden Duck
1 0 0 1 0 Black Swan
1 0 1 0 0 Golden Duck
1 0 0 1 1 Sparrow
The last column is a combination of one or more words separated by space. I want to count the number of unique values in the last column and replace that with a number which is unique to that group. I know I can count the and list the numbers using
awk -F '\t' '{print $NF}' infile | sort | uniq | wc -l
But how do I replace with numbers? For example, replace all Black Swan by 1, replace all Golden Duck by 2 etc. I want the result to be :
1 0 0 1 1 1
0 0 1 0 0 2
1 0 0 1 0 3
0 0 1 0 1 2
1 0 0 1 0 1
1 0 1 0 0 2
1 0 0 1 1 4
and I also want to generate the list of numbers given to specific values like
Black Swan 1
Golden Duck 2
Brown Eagle 3
Sparrow 4