6

I am working on an ASP.NET project that needs to fill in a Word document. My client provides a Word template with fields for last name, first name, birth date, etc. I have all of that information in a SQL database, and the client wants users of the application to be able to download the Word document filled in with the information from the database.

What's the best way to achieve this? Basically, I need to identify the "fillable spots" in the Word document and fill in the information when the application user clicks the download button.

J.W.

8 Answers

8

If you can use Office 2007, the way to go is to use the Open XML API to format the documents: http://support.microsoft.com/kb/257757. The reason you have to go that route is that you can't really use Word Automation in a server environment. (You CAN, but it's a huge pain to get working properly, and it can EASILY break.)

If you can't go the 2007 route, I've actually had pretty good success with just opening up a Word template as a stream, finding and replacing the tokens, and serving the result to the user. This has worked surprisingly well in my experience, and it's REALLY simple to implement.
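For the Office 2007 route, a .docx file is a ZIP package whose main body lives in word/document.xml, so the same find-and-replace idea can be applied there. Below is a minimal sketch using System.IO.Compression (.NET 4.5 or later) rather than the Open XML SDK. The class name DocxTokenFiller and the {TOKEN} placeholder syntax are assumptions for illustration, not something the answer specifies. Also note that Word sometimes splits typed text across several runs, so a placeholder may not appear contiguously in the XML; this sketch only handles placeholders that do.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Security;
using System.Text;

public static class DocxTokenFiller
{
    // Copies a .docx template into a new in-memory stream and replaces
    // {TOKEN} placeholders found in word/document.xml.
    public static MemoryStream Fill(Stream template, IDictionary<string, string> values)
    {
        var output = new MemoryStream();
        template.CopyTo(output);

        using (var zip = new ZipArchive(output, ZipArchiveMode.Update, leaveOpen: true))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml not found in template.");

            string xml;
            using (var reader = new StreamReader(entry.Open(), Encoding.UTF8))
                xml = reader.ReadToEnd();

            // XML-escape each value so characters such as & or < cannot break the document.
            foreach (var pair in values)
                xml = xml.Replace("{" + pair.Key + "}", SecurityElement.Escape(pair.Value ?? ""));

            // Replace the part with the filled-in XML (written as UTF-8 without a BOM).
            entry.Delete();
            ZipArchiveEntry replacement = zip.CreateEntry("word/document.xml");
            using (var writer = new StreamWriter(replacement.Open(), new UTF8Encoding(false)))
                writer.Write(xml);
        }

        output.Position = 0;
        return output;
    }
}
```

The returned stream is rewound, so it can be copied straight to the HTTP response for the download.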

aquinas
6

I'm not sure about some of the ASP.NET aspects, but I am working on something similar and you might want to look into using an RTF file instead. You can use pattern replacement in the RTF. For example, you can add a tag like {USER_FIRST_NAME} to the RTF document. When the user clicks the download button, your application can take the information from the database and replace every instance of {USER_FIRST_NAME} with the corresponding value. I am currently doing this with PHP and it works great. Word will open the RTF without a problem, which is another reason I chose this method.
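A rough C# sketch of that RTF replacement might look like the following. The class and method names are hypothetical. Two RTF details are worth handling: values must have backslashes and braces escaped (and non-ASCII characters written as \uN?), and when a tag is typed in Word and saved as RTF, the braces are stored escaped as \{USER_FIRST_NAME\}, so the sketch accepts both forms. It assumes the tag is stored contiguously, which Word does not always guarantee.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

public static class RtfTokenFiller
{
    // Matches {NAME} as well as the escaped form \{NAME\} that Word writes.
    private static readonly Regex Token = new Regex(@"\\?\{([A-Z0-9_]+)\\?\}");

    // Escapes a value so it can be spliced safely into RTF text.
    public static string EscapeRtf(string value)
    {
        var sb = new StringBuilder();
        foreach (char c in value ?? "")
        {
            if (c == '\\' || c == '{' || c == '}')
                sb.Append('\\').Append(c);
            else if (c > 127)
                // RTF Unicode escape: signed 16-bit decimal, then a fallback character.
                sb.Append("\\u").Append(unchecked((short)c)).Append('?');
            else
                sb.Append(c);
        }
        return sb.ToString();
    }

    // Replaces every known tag in a single pass; unknown tags are left untouched.
    public static string Fill(string rtf, IDictionary<string, string> values)
    {
        return Token.Replace(rtf, match =>
        {
            string value;
            return values.TryGetValue(match.Groups[1].Value, out value)
                ? EscapeRtf(value)
                : match.Value;
        });
    }
}
```

Doing the replacement in one regex pass avoids a substituted value being mistaken for another tag on a later pass.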

Mark
2

I have used Aspose.Words for .NET. It's a little on the pricey side, but it works extremely well and the API is fairly intuitive for something that is potentially very complex.

If you want to pre-design your documents (or allow others to do that for you), anyone can put fields into the document. Aspose can open the document, find and fill the fields, and save a new filled-out copy for download.

Rex M
2

Aspose works okay, but again: it's pricey.

Definitely avoid Office Automation in web apps as much as possible. It just doesn't scale well.

My preferred solution for this kind of problem is XML: specifically, I recommend WordprocessingML here. You create an XML document according to the schema, put a .doc extension on it, and MS Word will open it as if it were native in any version as far back as Office XP. This supports most Word features, and this way you can safely reduce the problem to replacing tokens in a text stream.

Be careful when googling for more information on this: there's a lot of confusion between this and the new XML-based format for Office 2007. They're not the same thing.
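As a hedged illustration of that token replacement, here is a small sketch that assumes the template was saved from Word as a Word 2003 XML document and contains placeholders such as {LAST_NAME} (the placeholder syntax and the class name are made up for this example). It works on the w:t text elements through LINQ to XML rather than on the raw text stream, so values are XML-escaped automatically when the document is saved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class WordMlTokenFiller
{
    // Namespace of the Word 2003 XML format (not the Office 2007 Open XML one).
    private static readonly XNamespace W =
        "http://schemas.microsoft.com/office/word/2003/wordml";

    // Replaces {TOKEN} placeholders inside every w:t element of the template.
    public static XDocument Fill(XDocument template, IDictionary<string, string> values)
    {
        // Materialize the list first so the tree is not modified while being enumerated.
        foreach (XElement text in template.Descendants(W + "t").ToList())
        {
            string content = text.Value;
            foreach (var pair in values)
                content = content.Replace("{" + pair.Key + "}", pair.Value ?? "");
            text.Value = content;
        }
        return template;
    }
}
```

Typical use would be XDocument.Load on the template, Fill with the values from the database, then Save to the response stream with a .doc file name.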

Joel Coehoorn
1

This code works for WordprocessingML text boxes and checkboxes. It's index-based, so just pass in an array of strings for all text boxes and an array of bools for all checkboxes.

public void FillInFields(
    Stream sourceStream,
    Stream destinationStream,
    bool[] pageCheckboxFields,
    string[] pageTextFields
    ) {

    StreamUtil.Copy(sourceStream, destinationStream);
    sourceStream.Close();

    destinationStream.Seek(0, SeekOrigin.Begin);

    Package package = Package.Open(destinationStream, FileMode.Open, FileAccess.ReadWrite);
    Uri uri = new Uri("/word/document.xml", UriKind.Relative);

    PackagePart packagePart = package.GetPart(uri);
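    // Load the main document part and close the read stream before the part is rewritten below.
    XDocument xdocument;
    using (Stream documentPart = packagePart.GetStream(FileMode.Open, FileAccess.Read))
    using (XmlReader xmlReader = XmlReader.Create(documentPart)) {
        xdocument = XDocument.Load(xmlReader);
    }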

    List<XElement> textBookmarksList = xdocument
        .Descendants(w + "fldChar")
        .Where(e => (e.AttributeOrDefault(w + "fldCharType") ?? "") == "separate")
        .ToList();

    var textBookmarks = textBookmarksList.Select(e => new WordMlTextField(w, e, textBookmarksList.IndexOf(e)));

    List<XElement> checkboxBookmarksList = xdocument
        .Descendants(w + "checkBox")
        .ToList();

    IEnumerable<WordMlCheckboxField> checkboxBookmarks = checkboxBookmarksList
        .Select(e => new WordMlCheckboxField(w, e, checkboxBookmarksList.IndexOf(e)));

    for (int i = 0; i < pageTextFields.Length; i++) {
        string value = pageTextFields[i];
        if (!String.IsNullOrEmpty(value))
            SetWordMlElement(textBookmarks, i, value);
    }

    for (int i = 0; i < pageCheckboxFields.Length; i++) {
        bool value = pageCheckboxFields[i];
        SetWordMlElement(checkboxBookmarks, i, value);
    }

    // Overwrite the document part with the modified XML, then flush the package to the destination stream.
    using (StreamWriter streamWriter = new StreamWriter(packagePart.GetStream(FileMode.Create, FileAccess.Write)))
    using (XmlWriter xmlWriter = XmlWriter.Create(streamWriter)) {
        xdocument.Save(xmlWriter);
    }
    package.Flush();

    destinationStream.Seek(0, SeekOrigin.Begin);
}

private class WordMlTextField {
    public int? Index { get; set; }
    public XElement TextElement { get; set; }

    public WordMlTextField(XNamespace ns, XObject element, int index) {
        Index = index;

        XElement parent = element.Parent;
        if (parent == null) throw new NicException("fldChar must have a parent.");
        if (parent.Name != ns + "r") {
            log.Warn("Expected parent of fldChar to be a run for fldChar at position '" + Index + "'");
            return;
        }
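        // FirstOrDefault avoids an exception when the fldChar run is the last element in its parent.
        var nextSibling = parent.ElementsAfterSelf().FirstOrDefault();

        if (nextSibling == null || nextSibling.Name != ns + "r") {
            log.Warn("Expected a 'r' element after the parent of fldChar at position = " + Index);
            return;
        }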

        var text = nextSibling.Element(ns + "t");
        if (text == null) {
            log.Warn("Expected a 't' element inside the 'r' element after the parent of fldChar at position = " + Index);
        }

        TextElement = text;
    }
}

private class WordMlCheckboxField {
    public int? Index { get; set; }
    public XElement CheckedElement { get; set; }
    public readonly XNamespace _ns;

    public WordMlCheckboxField(XNamespace ns, XContainer checkBoxElement, int index) {
        _ns = ns;
        Index = index;

        XElement checkedElement = checkBoxElement.Elements(ns + "checked").FirstOrDefault();
        if (checkedElement == null) {
            checkedElement = new XElement(ns + "checked", new XAttribute(ns + "val", "0"));
            checkBoxElement.Add(checkedElement);
        }

        CheckedElement = checkedElement;
    }

}

// Helper called above as StreamUtil.Copy.
public static class StreamUtil {
    public static void Copy(Stream readStream, Stream writeStream) {
        const int Length = 256;
        byte[] buffer = new byte[Length];
        int bytesRead;
        // Copy until the source stream is exhausted.
        while ((bytesRead = readStream.Read(buffer, 0, Length)) > 0) {
            writeStream.Write(buffer, 0, bytesRead);
        }
        writeStream.Flush();
    }
}

Lee Richardson
  • Perhaps you could add some context to the sample? – Joel Coehoorn Apr 13 '09 at 14:40
  • What kind of context did you have in mind? I basically just finished implementing the exact functionality gisresearch was asking for (assuming WordML) and thought it might help. Basically just call FillInFields(...). – Lee Richardson Apr 15 '09 at 21:16
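
The snippet in this answer relies on a few members that are not shown (w, AttributeOrDefault, SetWordMlElement). As a rough, self-contained sketch of the same index-based idea, and not the author's actual helpers, the following locates the n-th legacy form field in word/document.xml and sets its value. The names FormFieldSketch, SetTextField and SetCheckbox are hypothetical; the namespace is the standard WordprocessingML one used by .docx files.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class FormFieldSketch
{
    public static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Sets the displayed text of the index-th text form field.
    // The field's result text sits in the run right after the "separate" fldChar run.
    public static bool SetTextField(XDocument doc, int index, string value)
    {
        XElement separator = doc.Descendants(W + "fldChar")
            .Where(e => (string)e.Attribute(W + "fldCharType") == "separate")
            .ElementAtOrDefault(index);
        if (separator == null || separator.Parent == null) return false;

        XElement nextRun = separator.Parent.ElementsAfterSelf().FirstOrDefault();
        if (nextRun == null || nextRun.Name != W + "r") return false;

        XElement text = nextRun.Element(W + "t");
        if (text == null) return false;

        text.Value = value;
        return true;
    }

    // Checks or unchecks the index-th checkbox form field.
    public static bool SetCheckbox(XDocument doc, int index, bool isChecked)
    {
        XElement box = doc.Descendants(W + "checkBox").ElementAtOrDefault(index);
        if (box == null) return false;

        XElement state = box.Element(W + "checked");
        if (state == null)
        {
            state = new XElement(W + "checked");
            box.Add(state);
        }
        state.SetAttributeValue(W + "val", isChecked ? "1" : "0");
        return true;
    }
}
```

Both methods return false instead of throwing when a field with the given index cannot be found, which makes a mismatch between the template and the value arrays easy to log.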
0

In general, you will want to avoid doing Office automation on a server; Microsoft itself has stated that it is a bad idea. The technique I generally use is the Office Open XML format that aquinas mentioned. It takes a bit of time to learn your way around the format, but it is well worth it, since you don't have to worry about some of the issues involved with Office automation (e.g. processes hanging).

A while back I answered a similar question that you might find useful; you can find it here.

rjzii
  • There is a bit more nuance than "MS says it's bad". MS specifically states that the older 2003 APIs are not recommended, but the newer Open XML stuff is perfect for this task. – John Farrell Jul 22 '10 at 15:06
  • @jfar - There is indeed more nuance than that, but it is a quick way of explaining things. Also, you will note that I said I'm using the Office Open XML format. – rjzii Jul 22 '10 at 15:29
0

If you need to do this in DOC files (as opposed to DOCX), then the OpenXML SDK won't help you.

Also, just want to add another +1 about the danger of automating the Office apps on servers. You will run into problems with scale - I guarantee it.

To add another reference to a third-party tool that can be used to solve your problem:

http://www.officewriter.com

OfficeWriter lets you control docs with a full API, or a template-based approach (like what your requirement is) that basically lets you open, bind, and save DOC and DOCX in scenarios like this with little code.

Eisbaer
-3

Could you not use Microsoft's own Interop framework to utilise Word functionality?

See Here

Dean
  • It's a royal pain to get this working and keep it working in a server environment. Not recommended. – Knobloch Apr 13 '09 at 14:39