In my case the symbols section of the xxx,v file was corrupted. The expected format of each entry is tag_name:tag_rev, but there were instances of:
- Missing :tag_rev, e.g. tag_name
  Fixed by deleting the line.
- Multiple tag names, e.g. tag_name1:tag_name2:tag_rev
  Fixed by removing the second tag name (which one you remove probably depends on what they are).
- Invalid name/revision delimiter, e.g. tag_nameztag_rev
  In my case the invalid character was always 'z' (there is only a 1-bit difference between ASCII ':', 0x3A, and 'z', 0x7A).
  Fixed by replacing the 'z' with ':'.
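The three cases above can also be found without running cvs2svn. The following is only a rough sketch, not part of cvs2svn: the function names and the regular expressions are my own, and it assumes each entry is written as name:revision with no whitespace inside it, which is how CVS normally writes the symbols section.

```python
import re

# One entry as CVS normally writes it: a name without RCS special
# characters, a colon, then a dotted revision number such as 1.2.0.4.
ENTRY_RE = re.compile(r'^[^\s$,.:;@]+:\d+(\.\d+)*$')

# The symbols section runs from the "symbols" keyword to the next ';'.
SYMBOLS_RE = re.compile(r'^symbols\b(.*?);', re.DOTALL | re.MULTILINE)


def find_bad_symbols(header_text):
    """Return the symbols entries that do not look like name:revision."""
    match = SYMBOLS_RE.search(header_text)
    if match is None:
        return []
    return [entry for entry in match.group(1).split()
            if not ENTRY_RE.match(entry)]


def find_bad_symbols_in_file(path):
    """Hypothetical helper: run the check on a ,v file on disk."""
    # ,v files are not guaranteed to be valid UTF-8; latin-1 maps every byte.
    with open(path, 'rb') as f:
        return find_bad_symbols(f.read().decode('latin-1'))
```

Anything it returns still needs to be checked by hand before editing the file, since the pattern is stricter than the parser and will also flag entries that are unusual but valid.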
To help during my investigation I added a print line to cvs2svn_rcsparse\common.py. If parsing the symbols fails, the last tag printed is the cause.
    def _parse_admin_symbols(self, token):
        while 1:
            tag_name = self.ts.get()
            # WileCau print the token and tag_name
            print 'token=|%s| tag_name=|%s|' % (token, tag_name)
            if tag_name == ';':
                break
            self.ts.match(':')
            tag_rev = self.ts.get()
            self.sink.define_tag(tag_name, tag_rev)
The additional print adds quite a lot of noise to the output, so it might be nicer to print only if an exception happens, but this was good enough for my needs.
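For anyone who wants the quieter variant, here is a minimal sketch of the idea. I did not test this inside cvs2svn: FakeTokenStream is a made-up stand-in for the parser's token stream so that the example runs on its own, and parse_symbols just mirrors the loop above with the last tag remembered and printed only when something goes wrong.

```python
class FakeTokenStream:
    """Hypothetical stand-in for the parser's token stream (self.ts)."""

    def __init__(self, tokens):
        self._tokens = iter(tokens)

    def get(self):
        return next(self._tokens)

    def match(self, expected):
        token = self.get()
        if token != expected:
            raise ValueError('expected %r, got %r' % (expected, token))


def parse_symbols(ts, define_tag):
    """Same loop as above, but it prints only if parsing fails."""
    tag_name = None
    try:
        while True:
            tag_name = ts.get()
            if tag_name == ';':
                break
            ts.match(':')
            tag_rev = ts.get()
            define_tag(tag_name, tag_rev)
    except Exception:
        # Report the tag being parsed, then let the original error propagate.
        print('symbols parse failed near tag_name=|%s|' % tag_name)
        raise
```

In the real method the same try/except would wrap the existing while loop, using self.ts and self.sink.define_tag instead of the stand-ins.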
I also found this link which turned out to not be my problem but may help someone else. Credit to Christian Haarmann for documenting it.
http://tigris-scm.10930.n7.nabble.com/suggestions-for-cvs2svn-fix-for-error-quot-myfile-txt-v-is-not-a-valid-v-file-quot-td54240.html
In case the link becomes invalid, the summary is that someone had edited the xxx,v file and their editor had replaced 0x0A (LF) with 0x0D/0x0A (CR/LF), and the additional character caused the parser to think the file was corrupt.
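A quick way to check for that situation is to look for CR/LF pairs in the raw bytes. This is only a sketch with function names of my own choosing. Note that a ,v file can legitimately contain CR/LF inside the stored file contents, so a non-zero count is a hint rather than proof; a CR at the end of the very first line (the head line) is the more suspicious sign.

```python
def count_crlf(data):
    """Count CR/LF pairs in the raw bytes of a ,v file."""
    return data.count(b'\r\n')


def first_line_has_crlf(data):
    """True if the first line of the file ends with CR/LF."""
    first_line = data.split(b'\n', 1)[0]
    return first_line.endswith(b'\r')


def check_file(path):
    """Hypothetical helper: return (CR/LF count, first line has CR/LF)."""
    with open(path, 'rb') as f:
        data = f.read()
    return count_crlf(data), first_line_has_crlf(data)
```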