4

As per the title, I have a replica set with 1 primary, 1 secondary and 1 arbiter. I restored a big DB on the primary, which is a much faster instance than the secondary. Now the secondary is lagging a lot (hours) and has been in recovery status for hours. Can I do something? Can I know the recovery progress?

michelem
  • http://stackoverflow.com/questions/19675117/mongodb-how-to-check-secondary-is-synced-now-or-not/19862521#19862521 – zero323 Apr 25 '15 at 10:49
  • 1
    As a side note: Having replica set members which have a vast difference in their performance is a Very Bad Idea™. There will be a point of usage at which the replication to the slower member will start to fall behind and won't have the chance to catch up, assuming the usage stays the same. Always have data bearing replica set members which roughly have the same performance. The best is to have exact duplicates. – Markus W Mahlberg Apr 25 '15 at 11:05

1 Answer

1

I'll answer your second question first. "Can I know the recovery progress?" Yes: connect to the replica set primary and run the rs.status() command to see the status of each member in the replica set. Refer to the stateStr field of the output, which gives the friendly name of the member's status code (for example RECOVERING or SECONDARY). This is the closest thing to an indicator of the progress of the recovery.
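As a rough sketch of that check from a script rather than the shell: rs.status() wraps the replSetGetStatus command, so a driver can fetch the same document and read stateStr per member. The host names and the sample document below are made up for illustration; with pymongo you would get the real one from `client.admin.command("replSetGetStatus")`.

```python
def member_states(status):
    """Map each member's name to its stateStr from a replSetGetStatus document."""
    return {m["name"]: m["stateStr"] for m in status["members"]}


# Hypothetical sample shaped like rs.status() output (only the fields used here).
sample_status = {
    "set": "rs0",
    "members": [
        {"name": "db1.example.net:27017", "stateStr": "PRIMARY"},
        {"name": "db2.example.net:27017", "stateStr": "RECOVERING"},
        {"name": "arb.example.net:27017", "stateStr": "ARBITER"},
    ],
}

states = member_states(sample_status)
for name, state in states.items():
    print(f"{name}: {state}")
```

A member that has caught up moves from RECOVERING to SECONDARY, so polling this every so often tells you when the recovery is over, just not how far along it is.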

In your title however you asked if you can know when it will end. That's a lot harder. There is no way to know "exactly" when a secondary will finish syncing with another member.

Regarding "Can I do something?": yes, but nothing will tell you exactly when the replication will end. In the rs.status() output, check the optime field and compare its value for the secondary with that of the member it is synchronizing from, which in your case is the primary. This gives only "some" insight into how far apart the two servers are. On its own it is not very exact, and other factors can affect the actual time it takes to catch up, so it will not tell you that the sync will be done at a particular time.
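To make the optime comparison concrete, here is a minimal sketch that subtracts each data-bearing member's optimeDate from the primary's. It assumes the status document carries optimeDate as a datetime (which is how pymongo returns it); the host names and timestamps are invented sample values, not output from a real set.

```python
from datetime import datetime


def replication_lag_seconds(status):
    """Return {member name: seconds behind the primary} for non-primary,
    data-bearing members of a replSetGetStatus document."""
    members = status["members"]
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    lags = {}
    for m in members:
        # Skip the primary itself and arbiters, which hold no data.
        if m["stateStr"] in ("PRIMARY", "ARBITER") or "optimeDate" not in m:
            continue
        delta = primary["optimeDate"] - m["optimeDate"]
        lags[m["name"]] = delta.total_seconds()
    return lags


# Hypothetical sample shaped like rs.status() output.
sample_status = {
    "members": [
        {"name": "db1.example.net:27017", "stateStr": "PRIMARY",
         "optimeDate": datetime(2015, 4, 25, 12, 0, 0)},
        {"name": "db2.example.net:27017", "stateStr": "RECOVERING",
         "optimeDate": datetime(2015, 4, 25, 9, 30, 0)},
        {"name": "arb.example.net:27017", "stateStr": "ARBITER"},
    ],
}

lags = replication_lag_seconds(sample_status)
for name, seconds in lags.items():
    print(f"{name} is about {seconds / 3600:.1f} hours behind the primary")
```

This only measures the gap between the last operations applied on each side. If the number shrinks between two readings the secondary is catching up; if it keeps growing, it is not. In the mongo shell, rs.printSlaveReplicationInfo() prints a similar figure.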

Also, I would heed Markus Mahlberg's advice if the servers in your use case are not of equal quality. This and many other factors can contribute to replication lag, including disk I/O issues and network latency, for example across data centers. There is no clear-cut answer here.

Ikar Pohorský
SDillon
  • Yep, I saw, thanks for the answer. I ended up doing a secondary shutdown, remove, delete and re-add, and everything went well after a full re-sync. I think the big gap in resources between the instances made the difference (and caused the problems). – michelem Apr 28 '15 at 08:21