We discovered that our Hg repository got corrupted several weeks ago. The corruption seems to have propagated: all clones (the central repo and the user repos) are corrupted, pretty badly, in the same way. I think we could have prevented it if we had run verification at the time.

Is there some Hg setting that would cause verification on each push, and prevent push in case of verification failure? I know I could implement it as a hook in Python, but is there maybe a simpler solution?

Is it also possible to do the opposite: make sure the remote repository is verified before pulling?

FWIW, I am on Windows 10 and we are using TortoiseHg.

Update: I've tried creating hooks as suggested by Jordi. Hg now hangs waiting for locks. Here is what I see:

c:\Users\username\test-hook>hg init
c:\Users\username\test-hook>cd ..
c:\Users\username>hg clone test-hook test-hook-clone
updating to branch default
0 files updated, 0 files merged, 0 files removed, 0 files unresolved

# At this point I edited clone repository settings to include
# [hooks]
# preoutgoing = hg verify
#
# Then I created a test.txt file and "added" it via TortoiseHg context menu.

c:\Users\username\test-hook-clone>hg commit
c:\Users\username\test-hook-clone>hg status

c:\Users\username\test-hook-clone>hg outgoing
comparing with c:\Users\username\test-hook
searching for changes
changeset:   0:a61d33af6cdb
tag:         tip
user:        username
date:        Mon May 06 20:32:54 2019 +0200
summary:     test file added

c:\Users\username\test-hook-clone>hg push -verbose
pushing to c:\Users\username\test-hook
searching for changes
running hook preoutgoing: hg verify
waiting for lock on repository c:\Users\username\test-hook-clone held by process '16840' on host 'LT407233'
texnic
  • I don't know the direct answer, but you could also schedule a verify call on any remote / central repos. Could run hourly, even. This could be better than nothing. – StayOnTarget Apr 16 '19 at 11:16
  • @DaveInCaz, sounds like a good idea. I am trying to make Task Scheduler work, not very successfully so far. – texnic Apr 26 '19 at 06:58
  • What process does process '16840' correspond to? You could do, for example `ps aux | grep 16840` or whatever the number is. – Faheem Mitha May 06 '19 at 18:48
  • @FaheemMitha, I am on Windows, but the process is `hg.exe`. I added -verbose to hg push call. It's "running hook preoutgoing: hg verify". – texnic May 06 '19 at 18:57
  • I'm not a Windows user. I'm also not a Python expert, but I suspect that one way to proceed with that is to attach a Python debugger to that process, assuming it is a Python process. You could post a separate question about this. Or you could ask on the Mercurial user mailing list or IRC. I'll post this on #mercurial and see if anyone has comments. – Faheem Mitha May 06 '19 at 21:39

2 Answers

To answer your question, the hook does not have to be written in Python. In the appropriate server's hgrc (either at the repository level or the system level), simply set

[hooks]
preoutgoing = hg verify
preincoming = hg verify

This may significantly slow down all pull and push operations, but perhaps you are willing to sacrifice speed for correctness.

This will result in output like this when a client tries to pull from a corrupt repo:

$ hg clone http://localhost:9000 sample-repo
requesting all changes
remote: abort: preoutgoing hook exited with status 1

and in your server logs, you should see output similar to

127.0.0.1 - - [18/Apr/2019 12:41:09] "GET /?cmd=batch HTTP/1.1" 200 - x-hgarg-1:cmds=lheads+%3Bknown+nodes%3D x-hgproto-1:0.1 0.2 comp=zstd,zlib,none,bzip2 partial-pull
checking changesets
checking manifests
crosschecking files in changesets and manifests
checking files
 a@0: broken revlog! (index data/a.i is corrupted)
warning: orphan data file 'data/a.i'
checked 2 changesets with 1 changes to 2 files
1 warnings encountered!
1 integrity errors encountered!
(first damaged changeset appears to be 0)
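
If running hg verify on every push and pull turns out to be too slow, the scheduled check suggested in the comments could be sketched roughly as below. This is only a minimal sketch, not a tested deployment: the function names (verify_repo, failed_repos, main) are made up for illustration, it assumes hg is on the PATH of whatever account runs the task, and it relies on hg verify exiting with a non-zero status when it finds integrity errors.

```python
import subprocess


def verify_repo(path, run=subprocess.run):
    """Return True if `hg verify` exits with status 0 for the repo at `path`.

    `run` is injectable so the logic can be exercised without a real
    Mercurial installation; by default it is subprocess.run.
    """
    result = run(
        ["hg", "verify", "-R", path],  # -R selects the repository to check
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    return result.returncode == 0


def failed_repos(paths, run=subprocess.run):
    """Return the subset of `paths` whose verification failed."""
    return [p for p in paths if not verify_repo(p, run=run)]


def main(argv, run=subprocess.run):
    """Verify every repository path in `argv`; return 1 if any failed, else 0."""
    bad = failed_repos(argv, run=run)
    for path in bad:
        print("verification FAILED:", path)
    return 1 if bad else 0
```

To use it from a scheduler, one option is to end the file with `import sys` and `sys.exit(main(sys.argv[1:]))`, then pass the repository paths as arguments; the non-zero exit code can be used to trigger whatever alerting the scheduler supports.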
0

You can enable a server-side option that performs more validation on all incoming content: set the server.validate=yes option on your server.

The simplest way is to enable it in your server's global hgrc or in the repository's .hg/hgrc file by adding the following two lines:

[server]
validate = yes

This is a server option, but you can also use it on the client. It should validate pulls too.

(By the way, what kind of corruption are you seeing?)