I have been tasked with finding a way to convert a large number of .docx files to DocBook 5. Currently, we open each file in OpenOffice and save it as DocBook. This is time-consuming, and I am confident there is a better way. The files will then be processed further into our custom RELAX NG schema, so the conversion does not need to be flawless. I have looked around and will continue to investigate some leads, but I have not found anything useful so far.
The answers to "Convert doc/docx to semantic HTML" suggest upCast, but that does not seem appropriate for my needs.
I am looking for something freely available that I can use from the command line. Ultimately, I would like to batch process our files. I have included the linux, python, and java tags because these are the environments I am most comfortable with, but I would be willing to bend for the right solution. I am trying to do some research before I go out and reinvent the wheel.
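
To make the batch requirement concrete, here is a minimal sketch of the kind of driver script I mean. It assumes pandoc is installed and on the PATH, and that its docx reader and docbook5 writer produce output good enough for our later processing; neither assumption has been checked against our files. The names `build_command` and `convert_tree` are hypothetical, and any other command-line converter could be swapped in by changing `build_command`.

```python
import subprocess
from pathlib import Path


def build_command(src, dest):
    """Return the converter command line for one file.

    Assumption: pandoc is the converter. -f/-t select the input and
    output formats, -s asks for a standalone document, -o names the output.
    """
    return ["pandoc", "-f", "docx", "-t", "docbook5", "-s",
            "-o", str(dest), str(src)]


def convert_tree(in_dir, out_dir):
    """Convert every .docx under in_dir; return the files that failed."""
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    failed = []
    for src in sorted(in_dir.rglob("*.docx")):
        # Output files are flattened into out_dir, so files with the same
        # name in different subdirectories would overwrite each other.
        dest = out_dir / (src.stem + ".xml")
        result = subprocess.run(build_command(src, dest))
        if result.returncode != 0:
            failed.append(src)
    return failed
```

Calling `convert_tree("docx_in", "docbook_out")` (both directory names are placeholders) would convert the whole tree and return the list of files the converter rejected, so that those can be handled by hand.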