6

Is it possible to output HTML strings to CSV?

I am trying to export data from a CMS to CSV and then to Excel. Each piece of HTML could include commas and anything else really.

E.g. <p class="myclass">This is an example, of the string</p>

The import is broken in Excel: the wrong data appears in the wrong columns, although the first few rows are correct.

I want to achieve this sort of format

col1,col2,col3
"1","<p class="myclass">This is an example, of the string</p>","and more html here"

I have tried this sort of thing: I am iterating over a content item in the CMS and outputting each property as a separate CSV value, enclosed in quotes and separated by commas.

foreach (var prop in offer.Properties) //.Where(x=>x.Alias != "Id"))
{

    var @propValue = prop.Value.ToString().Replace("\"", "'");

    // Append comma except last
    sb.Append(prop != offer.Properties.Last()
        ? "\"" + propValue + "\","
        : "\"" + propValue + "\"");
}
sb.Append(Environment.NewLine);

UPDATE: In truth this task proved fraught with difficulty. The original goal was to quickly export a set of nodes and their properties from the Umbraco CMS to an Excel file. I learned that CSV is probably not the right format for this type of data, all of which is stored as XML and includes encoded HTML snippets.

In our case the best way to achieve what we wanted was to output the exported data as an HTML table, which Excel understands and which keeps an editor-friendly format rather than encoded HTML snippets.

wingyip
  • What do you want to achieve? You want to parse `<p class="myclass">This is an example, of the string</p>` to what exactly? Give us some example of the output!
    – blogprogramisty.net Jun 21 '16 at 12:20
  • Just added more explanation of format required – wingyip Jun 21 '16 at 12:28
  • Why don't you create a real Excel file with EPPlus? Generating a sheet could be as easy as `ws.Cells["A1"].LoadFromDataTable(someTable, true);` or `ws.Cells["A1"].LoadFromCollection(someList);`. Apart from that, trying to put HTML in a CSV is simply asking for trouble. You can't simply replace or encode all quotes, as the HTML snippet may *already* contain encoded strings. You could try using some really unexpected characters as column and line separators, e.g. ¤ and ¶ – Panagiotis Kanavos Jun 21 '16 at 12:35
  • @wingyip what you ask is probably impractical without severe restrictions on the data, e.g. no newlines, no escaped quotes, only single quotes for attributes etc. What is the *real* problem you are trying to solve? Why do you think that exporting HTML in a CSV is the solution? – Panagiotis Kanavos Jun 21 '16 at 12:37
  • Thanks for making me think of alternatives @PanagiotisKanavos – wingyip Jun 21 '16 at 22:15

2 Answers

0

When encoding or decoding CSV I'd rather go with a library. There are some nasty cases that have bitten me when I tried to do it myself (how to write decimals depending on the locale, uneven data, escaping characters and such). I use a tweaked version of CsvHelper, but you can find plenty of others online.

Regarding your update: what I'd suggest is, instead of filling your CSV directly with HTML, just fill it with the actual values. Keep your view logic away from your model logic. Let's do a quick example.
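For reference, the quoting rule that CSV libraries apply (and that Excel follows on import) is small: wrap the field in double quotes and double any double quote inside it. Below is a minimal sketch of just that rule. The names `CsvSketch`, `QuoteField` and `ToCsvLine` are made up for illustration and are not part of CsvHelper or any other library, and locale-dependent formatting is not handled at all.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CsvSketch
{
    // Hypothetical helper: wrap one field in quotes and double any embedded
    // double quote, so commas and quotes inside the HTML stay in one column.
    public static string QuoteField(string value)
    {
        return "\"" + (value ?? string.Empty).Replace("\"", "\"\"") + "\"";
    }

    // Hypothetical helper: join already-stringified values into one CSV record.
    public static string ToCsvLine(IEnumerable<string> values)
    {
        return string.Join(",", values.Select(QuoteField));
    }
}
```

With the question's snippet, `CsvSketch.ToCsvLine(new[] { "1", "<p class=\"myclass\">This is an example, of the string</p>" })` gives `"1","<p class=""myclass"">This is an example, of the string</p>"`. Unlike replacing `"` with `'`, this keeps the original markup intact.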

<table>
  <tr>
    <th>A</th>
    <th>B</th> 
    <th>C</th>
  </tr>
  <tr>
    <td>1</td>
    <td>2</td> 
    <td>3</td>
  </tr>
</table>

If I gave you the data in this format:

A B C
1 2 3

Or even in this format:

A,B,C
1,2,3

You could pretty easily recreate the HTML table from this data, or create a diagram, or a Word document, or present the model to a user in any other way you want.

Doing it the other way around, with a set of data in the form

 <th>A</th>,    <th>B</th> ,    <th>C</th>
 <td>1</td>,    <td>2</td> ,    <td>3</td>

will force you to parse it every time you use the data in a context other than an HTML one. Keeping the view and the model in different places will make your job easier.
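To illustrate the point, here is a rough sketch of going from plain values to the HTML table shown earlier, which is the kind of view you could also hand to Excel. `TableView` and `ToHtmlTable` are hypothetical names, and I'm assuming the cells are plain text, so they are encoded with `System.Net.WebUtility.HtmlEncode` before being wrapped in tags.

```csharp
using System.Collections.Generic;
using System.Net;
using System.Text;

public static class TableView
{
    // Hypothetical helper: render plain header and cell values as an HTML table.
    // The markup lives here (the view); the data itself stays free of tags.
    public static string ToHtmlTable(IEnumerable<string> headers,
                                     IEnumerable<IEnumerable<string>> rows)
    {
        var sb = new StringBuilder();
        sb.Append("<table><tr>");
        foreach (var header in headers)
        {
            sb.Append("<th>").Append(WebUtility.HtmlEncode(header)).Append("</th>");
        }
        sb.Append("</tr>");

        foreach (var row in rows)
        {
            sb.Append("<tr>");
            foreach (var cell in row)
            {
                sb.Append("<td>").Append(WebUtility.HtmlEncode(cell)).Append("</td>");
            }
            sb.Append("</tr>");
        }

        sb.Append("</table>");
        return sb.ToString();
    }
}
```

Calling `TableView.ToHtmlTable(new[] { "A", "B", "C" }, new[] { new[] { "1", "2", "3" } })` produces the same table as above, just without the whitespace. The same values could equally be fed to a CSV writer or a chart.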

Mekap
-1

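You could HtmlEncode the strings, which replaces the double quotes (") with &quot; so they no longer clash with the quotes around each CSV value.

string data = "<p class=\"myclass\">This is an example, of the string</p>";
string encoded = Server.HtmlEncode(data); // use the returned string; data itself is not modified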

https://msdn.microsoft.com/en-us/library/w3te6wfz(v=vs.110).aspx

EDIT: The encoded values then look like this in the CSV:

"&lt;a href=&quot;http://www.example.com&quot;&gt;link&lt;/a&gt;","&lt;b&gt;more html&lt;/b&gt;"
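A small sketch of the round trip, assuming you are outside a web request and so use `System.Net.WebUtility` instead of `Server.HtmlEncode` (that swap is mine). `EncodedField`, `ToCsvField` and `FromCsvField` are hypothetical names. Note that whoever opens the file in Excel will see the encoded text, and it has to be decoded again on the way back in.

```csharp
using System.Net;

public static class EncodedField
{
    // Hypothetical helper: HTML-encode the snippet, then wrap it in quotes.
    // After encoding, the value contains no raw double quotes.
    public static string ToCsvField(string html)
    {
        return "\"" + WebUtility.HtmlEncode(html) + "\"";
    }

    // Hypothetical helper: strip the surrounding quotes and decode back to HTML.
    public static string FromCsvField(string field)
    {
        return WebUtility.HtmlDecode(field.Trim('"'));
    }
}
```

For the question's example, `EncodedField.ToCsvField("<p class=\"myclass\">This is an example, of the string</p>")` returns `"&lt;p class=&quot;myclass&quot;&gt;This is an example, of the string&lt;/p&gt;"`, and `FromCsvField` turns that back into the original snippet.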

Danny Cullen