We sometimes end up with a huge /root/.ssh/authorized_keys file on our Linux machines, because the file contains many duplicate lines, like the following:

ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC6yaJuzX2QldXj9jI/IYbJQuYDTUf232IbkefUDG4sZxkkScbiqC4skJs9bC58iovYxMVLB7YijIHDri7ONfKzixooSfpf8x18JdmSTkEl7WVTPm3TI/fPVP7DDOoBbqpTeZzS6cFVRMceve3ecFp/Z
D02RfLy6FHu3Y9o55g4Hlm+IgRq+QflsSoY3khZhaxofyzYIchg9NI1RzEZJQEBIMlQZMd+bRiBoAtzqI2BtKd5YmnBmxGHhnZLswSeo7hz+2cAPe+Ng37V91cSuygQJyKf20f1DmhSKHvHEDU3EXDPbjO8H0LNz6OEhsjwUj+G5dcJA04wY0Y1+qCfRz
kR root@server1.com
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC6yaJuzX2QldXj9jI/IYbJQuYDTUf232IbkefUDG4sZxkkScbiqC4skJs9bC58iovYxMVLB7YijIHDri7ONfKzixooSfpf8x18JdmSTkEl7WVTPm3TI/fPVP7DDOoBbqpTeZzS6cFVRMceve3ecFp/Z
D02RfLy6FHu3Y9o55g4Hlm+IgRq+QflsSoY3khZhaxofyzYIchg9NI1RzEZJQEBIMlQZMd+bRiBoAtzqI2BtKd5YmnBmxGHhnZLswSeo7hz+2cAPe+Ng37V91cSuygQJyKf20f1DmhSKHvHEDU3EXDPbjO8H0LNz6OEhsjwUj+G5dcJA04wY0Y1+qCfRz
kR root@server1.com
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC6yaJuzX2QldXj9jI/IYbJQuYDTUf232IbkefUDG4sZxkkScbiqC4skJs9bC58iovYxMVLB7YijIHDri7ONfKzixooSfpf8x18JdmSTkEl7WVTPm3TI/fPVP7DDOoBbqpTeZzS6cFVRMceve3ecFp/Z
D02RfLy6FHu3Y9o55g4Hlm+IgRq+QflsSoY3khZhaxofyzYIchg9NI1RzEZJQEBIMlQZMd+bRiBoAtzqI2BtKd5YmnBmxGHhnZLswSeo7hz+2cAPe+Ng37V91cSuygQJyKf20f1DmhSKHvHEDU3EXDPbjO8H0LNz6OEhsjwUj+G5dcJA04wY0Y1+qCfRz
kR root@server1.com

What is the best approach to remove these duplicate lines so that only unique lines remain?

We need to remove the duplicate lines via ssh on remote machines.

brian d foy
jessica
  • Why not something 'simple' like: `cd ~/.ssh; mv authorized_keys authorized_keys.bkup; sort -u authorized_keys.bkup > authorized_keys; chmod 600 authorized_keys`? Perhaps add a date/time stamp to the backup file if you want to keep a few copies, just in case. I'd also want to track down what process is generating all of the duplicate entries and fix it (so that it doesn't generate duplicate entries). – markp-fuso Nov 03 '20 at 20:17
  • I'd read each line of that file as a key of a hash. Then, rewrite the file with the keys of that hash. – Miguel Prz Nov 03 '20 at 20:20
  • [This](https://unix.stackexchange.com/q/30173/74329) and [this](https://stackoverflow.com/q/16529716/3776858) might help. – Cyrus Nov 03 '20 at 20:23
  • How "_huge_" is that file? Is it feasible to read it all into memory? – zdim Nov 03 '20 at 22:32

2 Answers

Commands can be run on the remote server with the ssh client:

ssh hostname '
    cd /root/.ssh &&
    cp -a authorized_keys authorized_keys.orig &&
    sort -u authorized_keys -o authorized_keys
'

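This is written as multi-line code for readability. The whole thing can also go on one line, but it must stay inside the quotes: otherwise the local shell splits the command list itself and runs everything after the first separator on the local machine instead of on the server.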

If the backup (.orig) is definitely not needed, just remove the cp ... line.

Doing it this way changes the order of the lines, since the file ends up sorted. If that is a problem, you can run a script or a one-liner that preserves the order.
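If the original order matters, a common alternative to sort -u is the awk idiom `!seen[$0]++`, which prints a line only the first time it appears. The sketch below is not part of the answer's own method; the function name `dedupe_keep_order` is made up for illustration, and the demo runs against a scratch file rather than the real authorized_keys.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): remove duplicate lines from a
# file, keeping the first occurrence of each line and the original order.
dedupe_keep_order() {
    file=$1
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    # seen[$0]++ evaluates to 0 the first time a line is seen, so
    # !seen[$0]++ is true exactly once per distinct line.
    # Write back with cat (not mv) so the file keeps its owner and mode.
    awk '!seen[$0]++' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Demo on a scratch file, not on the real authorized_keys
demo=$(mktemp)
printf '%s\n' 'key-b' 'key-a' 'key-b' 'key-a' 'key-c' > "$demo"
dedupe_keep_order "$demo"
cat "$demo"
```

Unlike sort -u, this keeps key-b ahead of key-a because that is the order in which they first appeared. To use it on a server, the awk command could be sent through a quoted heredoc so that the local shell leaves `$0` alone.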


If there is more to be done, one way to run a one-liner on the remote server is

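ssh hostname << 'CMD'
cd /root/.ssh/ || exit
perl -i.orig -wne'$uniq{$_} = 1; print keys %uniq if eof' authorized_keys
CMD

The if eof condition is true on the last line of the file, so the unique lines are printed from inside the loop, while output is still directed to the file being edited in place. Printing from an END block (or after the }{ trick) does not work with -i: by then Perl has switched output back to STDOUT, and the file is left empty. If a backup is definitely not needed, remove .orig and leave only the -i switch (to change the file in place).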

The above still merely removes duplicates, but the code inside '' can be replaced or extended with whatever other Perl code is needed. For example, to preserve the order of the unique lines, use this Perl command-line program ("one-liner") instead of the one above:

perl -MList::Util=uniq -i.orig -wne'
    push @lines, $_; print uniq @lines if eof' authorized_keys

Here we use uniq from the core module List::Util (version 1.45 or later), which returns the unique elements of its input list. It keeps the first of any repeated elements and maintains the order. The print runs under if eof, that is, on the last line of the file, because with -i anything printed from an END block goes to STDOUT rather than into the file. I assume that the authorized_keys file is not too large to be read into memory in full.

See "Command switches" in perlrun for more about one-liners.

Note that the keyword in the "heredoc" syntax used above is quoted ('CMD'). This stops the local shell from expanding variables such as $uniq and $_ before the text is sent to the server.
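The effect of quoting the delimiter can be checked locally, without any remote host. In this sketch a plain `sh` child process stands in for the remote shell (an assumption made only for the demonstration), and the variable name `where` is made up.

```shell
#!/bin/sh
where=local

# Unquoted delimiter: the local shell expands $where inside the heredoc
# before the child shell ever sees the text.
unquoted=$(sh <<CMD
where=remote
echo "$where"
CMD
)

# Quoted delimiter: the heredoc is passed through untouched, so the child
# shell expands $where itself.
quoted=$(sh <<'CMD'
where=remote
echo "$where"
CMD
)

echo "unquoted delimiter: $unquoted"
echo "quoted delimiter:   $quoted"
```

With the unquoted delimiter the child prints the local value; with the quoted one it prints its own. For the Perl one-liners the difference matters even more, since an unquoted heredoc would let the local shell rewrite $_ and $uniq before Perl runs.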

zdim

I would opt for something simple like:

ssh ... "cd ~/.ssh; mv authorized_keys authorized_keys.bkup; sort -u authorized_keys.bkup > authorized_keys; chmod 600 authorized_keys"

Assuming it would be a major PITA to reconstruct authorized_keys from scratch, I've opted to keep a copy of the current file, just in case a typo screws up the contents of authorized_keys. You then have a backup copy to recover from (once you can successfully log into the remote host).

I'd also want to track down what process is generating all of the duplicate entries and fix it (so that it doesn't generate duplicate entries).
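One possible source of such duplicates is a provisioning script that appends the same key with >> on every run; that is a guess, not something the question states. A minimal sketch of an append that is safe to repeat follows. The function name `add_key_once` and the sample key string are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: append a public key line to a file only if that
# exact line is not already present.
add_key_once() {
    key=$1
    file=$2
    # -x: match whole lines only, -F: fixed string (no regex), -q: no output.
    # If the file does not exist yet, grep fails and the key is appended.
    grep -qxF -e "$key" "$file" 2>/dev/null || printf '%s\n' "$key" >> "$file"
}

# Demo on a scratch file; the key below is a placeholder, not a real key
demo=$(mktemp)
add_key_once 'ssh-ed25519 EXAMPLEKEY user@example' "$demo"
add_key_once 'ssh-ed25519 EXAMPLEKEY user@example' "$demo"   # second call adds nothing
cat "$demo"
```

The check compares whole lines, so the same key with a different trailing comment would still be added a second time.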

markp-fuso
  • Can you post another solution without the backup file? We do not need it. – jessica Nov 03 '20 at 21:09
  • keep the same code and add `;rm authorized_keys.bkup` on the end; alternatively, wait until you've verified the changes are good (eg, you can successfully initiate a new ssh session) and then issue a separate `ssh ... "cd ~/.ssh ; rm authorized_keys.bkup"` – markp-fuso Nov 03 '20 at 22:05
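
Before deleting the backup, it may also be worth checking that the cleaned file still holds the same set of keys. The following is only a sketch of one way to do that; the function name `same_line_set` is made up, and the demo uses scratch files in place of authorized_keys.bkup and authorized_keys.

```shell
#!/bin/sh
# Hypothetical helper: succeed if both files contain the same set of
# distinct lines, ignoring order and repetition.
same_line_set() {
    a=$(mktemp) || return 1
    b=$(mktemp) || { rm -f "$a"; return 1; }
    sort -u "$1" > "$a"
    sort -u "$2" > "$b"
    cmp -s "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return $status
}

# Demo: "before" has duplicates, "after" is the deduplicated version
before=$(mktemp)
after=$(mktemp)
printf '%s\n' 'key-a' 'key-b' 'key-a' > "$before"
sort -u "$before" > "$after"

if same_line_set "$before" "$after"; then
    echo "same keys, safe to remove the backup"
    rm -f "$before"
else
    echo "key sets differ, keeping the backup"
fi
```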