I'm getting a string from a list of items. The string is currently displayed as "item.ItemDescription" (the 9th row of the markup below).

I want to strip all HTML out of this string and then limit it to 250 characters. Is there a simple way of doing this? I saw a post suggesting the HTML Agility Pack, but I was looking for something simpler.

EDIT:
It does not always contain HTML. If the client wanted to add a bold or italic tag to an item's name in the description, it would show up as <"strong">Item Name<"/strong">, for instance. I want to strip out all HTML no matter what is entered.

<tbody>
    @foreach (var item in Model.itemList)
    {
        <tr id="@("__filterItem_" + item.EntityId + "_" + item.EntityTypeId)">
            <td>
                @Html.ActionLink(item.ItemName, "Details", "Item", new { id = item.EntityId }, null)
            </td>
            <td>
                @item.ItemDescription
            </td>
            <td>
                @if (Model.IsOwner)
                {
                    <a class="btnDelete" title="Delete" itemid="@(item.EntityId)" entitytype="@item.EntityTypeId" filterid="@Model.Id">Delete</a>
                }
            </td>

        </tr>
    }
</tbody>
Veda99817
  • you are saying that `item.Description` contains a value like ``? – T McKeown Dec 30 '15 at 22:07
  • Uh It would appear as "blah blah blah" essentially, but it would contain the value yes. – Veda99817 Dec 30 '15 at 22:08
  • @Veda99817 you could get the string out of `item.ItemDescription` and apply `maxlength` property to element or set this property on backend where you get this string generated. – pratikpawar Dec 30 '15 at 22:11
  • you already have a `` in your code... sorry, but why are you doing it this way? It seems like an awful way to render. – T McKeown Dec 30 '15 at 22:12
  • My current apps are using Angular, but unfortunately this is an older project for a client, so I have to make do with what I've got. – Veda99817 Dec 30 '15 at 22:14
  • The max length would work, or I could even apply a class to the element and limit the character length with CSS. But the main problem is: how would I strip out the HTML? – Veda99817 Dec 30 '15 at 22:16
  • Don't strip out the HTML; if it's always a `` then go with it and keep it simple. Remove your `` from your Razor code and just use the HTML you've already got. – T McKeown Dec 30 '15 at 22:17
  • @Veda99817 you need to keep in mind if string is limited to 250; if such values are updated in DB you might overwrite values with new values. Is it desired? e.g DB description value is 300 a's as per code you will display only 250 a's. Now if updated in db new value will be 250 a's. – pratikpawar Dec 30 '15 at 22:21
  • Well that's the thing, the item description where i'm getting the strings from can have HTML added to it. For Example:

    ItemName

    $15.00

    Example item description would go here

    – Veda99817 Dec 30 '15 at 22:22
  • @Veda99817 Please update the question with more information. Does `item.desc` always contain `html` or not? Are you trying to display only values out of those HTML? Or you need to display whole HTML but with text limit on values inside those elements? Also if update issue mentioned above is acceptable or not. – pratikpawar Dec 30 '15 at 22:26
  • It does not always contain HTML. If the client wanted to add a bold or italic tag to an item's name in the description, it would show up as Item Name, for instance. I want to strip out **all** HTML no matter what. – Veda99817 Dec 30 '15 at 22:28
  • @Veda99817 then you could simply use a regex to replace all the HTML, as in the answer below. To apply the 250-character limit, you could either use the `substring` part of Tim's answer or combine it with CSS on the `td`s, as McKeown pointed out. – pratikpawar Dec 30 '15 at 22:31
  • or <"strong"> you can try regex as shown in my answer – joordan831 Dec 30 '15 at 22:34

3 Answers


This regex will match any HTML tag (including ones with double quotes, such as <"strong">):

<[^>]*>

Look here: http://regexr.com/3cge4

Using C# regular expressions to remove HTML tags

From there, you can simply check the string length and truncate it if needed.

// Requires: using System.Text.RegularExpressions;
var itemDescriptionStripped = Regex.Replace(item.ItemDescription, @"<[^>]*>", String.Empty);
// Substring returns a new string, so the result has to be assigned back.
if (itemDescriptionStripped.Length > 250)
    itemDescriptionStripped = itemDescriptionStripped.Substring(0, 250);
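
To keep this logic out of the view, the same two steps could be wrapped in a small helper. This is only a sketch: the class name `DescriptionFormatter`, the method name `StripAndTruncate`, and the `maxLength` parameter are hypothetical and not part of the original code, and the regex is the same naive tag pattern as above rather than a real HTML parser.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DescriptionFormatter
{
    // Hypothetical helper: removes anything that looks like a tag,
    // then caps the result at maxLength characters.
    public static string StripAndTruncate(string html, int maxLength = 250)
    {
        if (string.IsNullOrEmpty(html))
            return string.Empty;

        // "<", then any run of characters that are not ">", then ">".
        string text = Regex.Replace(html, @"<[^>]*>", string.Empty);

        // Substring throws if the requested length runs past the end,
        // so only truncate when the text is actually longer than the limit.
        return text.Length > maxLength ? text.Substring(0, maxLength) : text;
    }
}
```

In the Razor view this would then be called as `@DescriptionFormatter.StripAndTruncate(item.ItemDescription)`, assuming the class is reachable from the view's namespace.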
joordan831

Your best option, IMO, is to not get into a parsing nightmare with all the possible values. Why not simply inject a class=someCssClassName into the <td> as an attribute? Then control the length, color, or whatever else with CSS.

An even better idea is to assign a class to the containing <tr class=trClass> and then have the CSS apply lengths to child <td> elements.

T McKeown
  • Applying CSS to `td tr` won't work. `item.desc` contains an HTML string with `input` elements. I believe OP is using `table` for layout. The class needs to be added on the `input` elements. – pratikpawar Dec 30 '15 at 22:18

You could do something like this to remove all tags (opening, closing, and self-closing) from the string, but it may have the unintended consequence of removing things the user entered that weren't meant to be html tags:

text = Regex.Replace(text, @"</?[^>]*/?>", String.Empty);
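
To make the "unintended consequence" concrete, here is a small sketch that wraps the same pattern in a method so it can be tried on plain text. The class name `TagStripDemo` and method name `StripTags` are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagStripDemo
{
    // Applies the tag-stripping pattern from above to any input string.
    public static string StripTags(string text)
    {
        // Verbatim string (@"...") so the slashes need no escaping in C#.
        return Regex.Replace(text, @"</?[^>]*/?>", string.Empty);
    }
}
```

Calling `TagStripDemo.StripTags("<em>Sale</em> today")` leaves `Sale today`, as intended. But for an input such as `if a < b and c > d`, everything from the `<` through the `>` is treated as a tag and removed, even though the user never meant it as HTML.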

Instead, I would recommend something like this and letting the user know html isn't supported:

text = text.Replace("<", "&lt;");
text = text.Replace(">", "&gt;");

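Just remember to apply your 250 character limit before the conversion, and only when the text is actually longer than that (Substring throws if the string is shorter than the requested length):

if (text.Length > 250)
    text = text.Substring(0, 250);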
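
Put together, the truncate-then-encode order might look like the sketch below. Note that it swaps the two manual `Replace` calls for `System.Net.WebUtility.HtmlEncode`, which also encodes `&` and quote characters; the class name `DescriptionEncoder` and method name `TruncateThenEncode` are hypothetical. Truncating first matters because cutting an already-encoded string could split an entity such as `&lt;` in half.

```csharp
using System;
using System.Net;

public static class DescriptionEncoder
{
    // Hypothetical helper: limits the raw text first, then HTML-encodes it.
    public static string TruncateThenEncode(string text, int maxLength = 250)
    {
        if (string.IsNullOrEmpty(text))
            return string.Empty;

        // Guard the Substring call so short strings pass through unchanged.
        string limited = text.Length > maxLength
            ? text.Substring(0, maxLength)
            : text;

        // Encodes <, >, &, and quotes so any markup is shown as literal text.
        return WebUtility.HtmlEncode(limited);
    }
}
```

Razor's `@` expressions already HTML-encode string values by default, so inside a view the length check alone may be enough; an explicit encode like this is mainly useful when the string is emitted some other way.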
Tim