4

I have the following situation:

I have set up a series of Cron jobs on an internal company server to run various PHP scripts designed to check data integrity. Each PHP script queries a company database, formats the returned query data into an HTML file containing one or more <tables>, and then mails the HTML file to several client emails as an attachment. From my experience, most of the PHP scripts generate HTML files with only a few tables, however there are a few PHP scripts the create HTML files with around 30 tables. HTML files have been chosen as the distribution format of these scans because HTML makes it easy to view many tables at once in a browser window.

I would like to add the functionality for the clients to download a table in the HTML file as a CSV file. I anticipate clients using this feature when they suspect a data integrity issue based on the table data. It would be ideal for them to be able to take the table in question, export the data out to a CSV file, and then study it further.

Because need for exporting the data to CSV format is at the discretion of the client, unpredictable as to what table will be under scrutiny, and intermittently used I do not want to create CSV files for every table.

Normally creating a CSV file wouldn't be too difficult, using JavaScript/jQuery to preform DOM traversal and generate the CSV file data into a string utilizing a server call or flash library to facilitate the download process; but I have one limiting constraint: The HTML file needs to be "portable."

I would like the clients to be able to take their HTML file and preform analysis of the data outside the company intranet. Also it is likely these HTML files will be archived, so making the export functionality "self contained" in the HTML files is a highly desirable feature for the two previous reasons.

The "portable" constraint of CSV file generation from a HTML file means:

  1. I cannot make a server call. This means ALL the file generation must be done client-side.

  2. I want the single HTML file attached to the email to contain all the resources to generate the CSV file. This means I cannot use jQuery or flash libraries to generate the file.

I understand, for obvious security reasons, that writing out files to disk using JavaScript isn't supported by any browser. I don't want to create a file without the user knowledge; I would like to generate the file using JavaScript in memory and then prompt the user the "download" the file from memory.

I have looked into generating the CSV file as a URI however, according to my research and testing, this approach has a few problems:

  • URIs for files are not supported by IE (See Here)

  • URIs in FireFox saves the file with a random file name and as a .part file

As much as it pains me, I can accept the fact the IE<=v9 won't create a URI for files. I would like to create a semi-cross-browser solution in which Chrome, Firefox, and Safari create a URI to download the CSV file after JavaScript DOM traversal compiles the data.

My Example Table:

<table>
    <thead class="resulttitle">
        <tr>
            <th style="text-align:center;" colspan="3">
            NameOfTheTable</th>
        </tr>
    </thead>
    <tbody>
        <tr class="resultheader">
            <td>VEN_PK</td>
            <td>VEN_CompanyName</td>
            <td>VEN_Order</td>
        </tr>
        <tr>
            <td class='resultfield'>1</td>
            <td class='resultfield'>Brander Ranch</td>
            <td class='resultfield'>Beef</td>
        </tr>
        <tr>
            <td class='resultfield'>2</td>
            <td class='resultfield'>Super Tree Produce</td>
            <td class='resultfield'>Apples</td>
        </tr>
        <tr>
            <td class='resultfield'>3</td>
            <td class='resultfield'>John's Distilery</td>
            <td class='resultfield'>Beer</td>
        </tr>
    </tbody>
    <tfoot>
        <tr>
          <td colspan="3" style="text-align:right;">
          <button onclick="doSomething(this);">Export to CSV File</button></td>
        </tr>
    </tfoot>
</table>

My Example JavaScript:

<script type="text/javascript">
  function doSomething(inButton) {

    /* locate elements */
    var table = inButton.parentNode.parentNode.parentNode.parentNode;
    var name  = table.rows[0].cells[0].textContent;
    var tbody = table.tBodies[0];

    /* create CSV String through DOM traversal */
    var rows  = tbody.rows;
    var csvStr = "";
    for (var i=0; i < rows.length; i++) {
      for (var j=0; j < rows[i].cells.length; j++) {
        csvStr += rows[i].cells[j].textContent +",";
      }
      csvStr += "\n";
    }

    /* temporary proof DOM traversal was successful */
    alert("Table Name:\t" + name + "\nCSV String:\n" + csvStr);

    /* Create URI Here!
     * (code I am missing)
     */

    /* Approach #1 : Auto-Download From Browser
     * Attempts to redirect the browser to the URI
     * and have the browser download the data as a file
     *
     * Approach does downloads CSV data but:
     * In FireFox downloads as randomCharacers.part instead of name.csv
     * In Chrome downloads without prompting the user and without correct file name
     * In Safari opens the files in browser (textfile)
     */
    /* Approach #1 Code:
       var hrefData = "data:text/csv;charset=US-ASCII," + encodeURIComponent(csvStr);
       document.location.href = hrefData;
     */

    /* Approach #2 : Right-Click Save As Link...
     * Attempts to remove "Download Button"
     * and replace the it with an anchor tag <a>
     * that the user can use to download the data
     *
     * Approach sort of works better:
     * When clicking on the link:
     * In FireFox downloads as randomCharacers.part instead of name.csv
     * In Chrome downloads without prompting the user (uses correct name) 
     * In Safari opens the files in browser (textfile)
     * 
     * When right-click "Save As" on the link:
     * In FireFox prompts the user with filename: randomCharacers.part 
     * In Chrome  prompts the user with filename: "download"
     * In Safari  prompts the user with filename: "Unknown"
     */
    /* Approach #2 Code:
       var hrefData = "data:text/csv;charset=US-ASCII," + encodeURIComponent(csvStr);
       var fileLink = document.createElement("a");
       fileLink.href = hrefData;
       fileLink.type  = "text/csv";
       fileLink.innerHTML = "CSV Download Link";
       fileLink.setAttribute("download",name+".csv"); // for Chrome
       parentTD = inButton.parentNode;
       parentTD.appendChild(fileLink);
       parentTD.removeChild(inButton);
     */
  }
</script>

I am looking for an example solution in which the above example table can be downloaded as a CSV file:

  • using a URI
  • using a <button> or a <a>
  • code works as described in modern versions of FireFox, Safari, & Chrome
  • AND ( 1. OR 2. ):
      • the user is prompted to save the file
      • user does not need to "Right Click Save As"
      • automatically saves the file to default browser save directory
      • the default filename is the name of the table with the .csv file extension

I have added a <script> tag with the DOM traversal function doSomething(). The real help I need is with formatting the URI to what I want within the doSomething() function. Approach #2 (see my code) seem most promising.

I am willing to accept that my desired functionality is impossible to achieve using only HTML & JavaScript if the statement is presented with proof; such as browser documentation, DOM standards, JavaScript/ECMAscript documentation, or a logical counter example demonstrating the impossibility.

That being said, I think a solution is possible from someone with a deeper background/more experience with URIs, even if the solution is a little bit of a hack.

recursion.ninja
  • 5,377
  • 7
  • 46
  • 78
  • 1
    why not just attach both the html file and a csv file to the email and send it to them? – dqhendricks Jul 30 '12 at 22:21
  • @dqhendricks A valid suggestion and something I considered. These reports are run quite frequently and can be quite large. The HTML file most often contains several tables and each table can have over a hundred entries. Attaching the HTML file along with 1-30 CSV files (one for each table) is excessive when all the data is already contained in the HTML file. I choose the HTML file format to house all the tables because it is relatively light-weight and much more appealing to look at and organize the several tables. I want to give the client the option to export a single table to a CSV file. – recursion.ninja Jul 30 '12 at 23:22
  • I see. Unfortunately as far as I know, you cannot initialize a download without it being a download. You could however use javascript to initialize a download, and on error explain that they need to be connected to the internet to use that particular feature. Other option might be to zip all the files and make that your attachment. – dqhendricks Jul 30 '12 at 23:35
  • you also cannot write to local files. as you mention, it is a security violation. – dqhendricks Jul 30 '12 at 23:37
  • CSV and HTML should compress really well, why not create a zip file attachment with the HTML file and the CSV inside. This would allow you to have only one attachment still and reduce the size of it at the same time whilst providing html and csv. Expanding on this you could create lots of csv files named appropriately (one for each table) and add them into the zip file so you end up with a CSV for each of your tables and your html document. If the user uncompresses the zip to a folder I think you could even use relative links in the HTML to point to the appropriate csv file. Just a thought. – Luke Baughan Jul 31 '12 at 10:47
  • The zip option is functional but not the intuitive solution I am seeking. I shouldn't _NEED_ to create separate HTML and CSV files. The HTML contains all the data necessary data to generate the CSV files. Also my users won't always want the CSV files for every table. They would only want to export a table in which the see/suspect a data integrity issue. Ideally, that would be almost never and when it does happen it is like to be only 1 or 2 tables (random in the sense that I can predetermine which will be needed). I'm going to edit my OP soon to include more info and probably sample code. – recursion.ninja Aug 06 '12 at 20:58
  • Is it okay for them to have to set the filename correctly? I have an idea I'm about to test, but it probably won't allow you to specify the filename... – Stobor Sep 14 '12 at 03:02
  • It just occurred to me that maybe there is an option to send the zip of a folder which is basically a .mht (or however it is called, the web archive) with the index page containing the HTMl and links to CSV's in a subfolder. Name it properly and you even won't have to confuse all users :) – Zlatko Sep 14 '12 at 06:41

3 Answers3

3

As long as you don't mind only supporting IE10+ when it comes to IE, use FileSaver.js. You can generate the CSV and append it to a Blob, and then saveAs() the blob. You will only be able to preserve the filename of the saved file in browsers that support <a download>, like Chrome.

Eli Grey
  • 35,104
  • 14
  • 75
  • 93
  • I am unfamiliar with how to use Blobs in JavaScript. Some _sample code_ using my example table and `doSomething()` function including the `Blob` object and `FileSaver.js` library demonstrating the functionality would be **greatly** appreciated. – recursion.ninja Sep 14 '12 at 20:02
  • Your answer was the most on point and least pessimistic. Enjoy the bounty! – recursion.ninja Sep 14 '12 at 20:08
  • 1
    I got it working in FireFox & Chrome with using [this guide](http://hackworthy.blogspot.com/2012/05/savedownload-data-generated-in.html). Your answer about using Blobs was the point in the right direction I needed! – recursion.ninja Sep 14 '12 at 21:48
1

You state in the comments that you "shouldn't NEED to create separate HTML and CSV files," but in this case your fundamental need is to provide data in two formats - originating two copies of the data is not an unreasonable constraint. To repeat, what you're asking for is not feasible.

The closest solution I can imagine to your problem would be to package the CSV data in the HTML file (via a display: none div or similar) and populating it to HTML tables dynamically using JavaScript at loadtime. This is going to be slow if you have large recordsets, but it'd be simple to add a button to show and select the CSV text for copy-paste. Inelegant? Yeah, sorry. There's a reason why HTML isn't an interchange format.

That being said, you could package your tabular data in a PDF and include the CSVs inside it as attachments.

Winfield Trail
  • 5,535
  • 2
  • 27
  • 43
  • You admit that it is possible to recreate the CSV data string from the HTML table (which I have done) but you don't think it is possible to somehow get that string from memory to disk using only HTML and JavaScript and I instead need to create separate CSV files for each table. I have, using Approach #2 in my code above, gotten the CSV data string from memory to disk using only HTML and JavaScript. I am simply looking for a slightly more elegant and structured way of doing this from someone with more experience with URIs that would meet my desired functionality. – recursion.ninja Sep 02 '12 at 19:05
  • Given the caveats - Firefox loses the filename, and it won't work at all in IE - I stand by my statement that this isn't feasible. Whether or not there's a JavaScript library that does this - see Eli's answer - this is browser behavior that does not appear to be documented or encouraged for general use. I'd be very surprised if this method didn't go away eventually - with or without a different, approved API materializing in a future spec. – Winfield Trail Sep 14 '12 at 04:24
  • And that aside, I also believe that my other points - namely that needing to ship duplicate data isn't unreasonable and that HTML isn't a good fit for this goal - are correct. I also question the wisdom of making a CSV downloader delivery method for clients who can't be trusted with a ZIP file, when the delivery method you're trying to develop doesn't work on the dullard browser of choice. ;) – Winfield Trail Sep 14 '12 at 04:28
-1

I want the single HTML file attached to the email to contain all the resources to generate the CSV file. This means I cannot use jQuery

This is a false statement. If you want to use the jQuery functionality in a single HTML file with no server call, simply copy/paste the minified jQuery library into the HTML itself. It will make each HTML file 32K bigger, which is not terrible, unless you're emailing hundreds at a time.

So if you're familiar with jQuery and are comfortable using it to refactor the data on the page, feel free to use it. I'd suggest your auto-created HTML have the data in a javascript array, and upon page load it creates the table(s). Then if the user clicks a CSV option, it can clear and refactor it.

var data = {
  title: 'NameOfTable',
  headers: ['VEN_PK', 'VEN_CompanyName', 'VEN_Order'],
  data: [
    ['1', 'Brander Ranch' 'Beef'],
    ['2', 'Super Tree Produce' 'Apples'],
    ['3', 'John\'s Distilery' 'Beer']
  ] 
}
$(document).ready(function() {
  var html = '<table>';
  html += '<thead>';
  html += '<tr><th style="text-align:center;" colspan="' + data.headers.length +'">' + data.title + '</th></tr>';
  html += '</thead><tbody>';
  html += '<tr class="resultheader">';
  for (i in data.headers) {
    html += '<td>'+data.headers[i]+'</td>';
  }
  html += '</tr>';
  for (i in data.data) {
    var row = data.data[i];
    html += '<tr>';
    for (j in row) {
      html += '<td class="resultfield">'+row[j]+'</td>';
    }
    html += '</tr>';
  }
  html += '</table>';
  $('#container').html(html);
});

function showCSV() {
   var csv = '';
   for (i in data.headers) {
     csv += data.headers[i] + ',';
   }
   csv += "\n";
   for (i in data.data) {
     var row = data.data[i];
     for (j in row) {
       csv += row[j] + ',';
     }
     csv += "\n";
   }
   $('#container').html(csv);
}
MidnightLightning
  • 6,715
  • 5
  • 44
  • 68
  • How the data is encoded on the page is irreverent to the OP. Data stored as `` and converted to CSV strings via JavaScript for download is equivalent to being stored as CSV strings in JavaScript and generated into `` on page load. I chose the former because it less computationally intensive client side. – recursion.ninja Sep 14 '12 at 20:06