
I want to replace part of the text on the clipboard. The problem is that the text is HTML-formatted, and I am unable to modify it with the C# code given below. Any solutions?

Steps to reproduce:

1- Copy an entry from Cambridge Advanced Learner's Dictionary, 4th Edition (or any other HTML-formatted text) to the clipboard.
2- Use this C# code in a Windows Forms application to modify and replace the text while keeping its HTML formatting:

private void button1_Click(object sender, EventArgs e)
{
    string myStr = Clipboard.GetText(TextDataFormat.Html);
    myStr.Replace("Cambridge Advanced Learner's Dictionary - 4th Edition", "******************************");
    Clipboard.SetText(myStr, TextDataFormat.Html);
}

But it seems that it does not work at all!

NOTE: I want to keep the HTML formatting; I don't want to strip the HTML from the string.


I used Regex and it seems to work when I use:

myStr = Regex.Replace(myStr, "Cambridge Advanced Learner's Dictionary - 4th Edition", "");

but when I want to use:

myStr = Regex.Replace(myStr, "Cambridge Advanced Learner's Dictionary - 4th Edition<br /><br />", "");

it does not work! Is there any way to remove those HTML tags (<br /><br />)?

acman123

3 Answers


Using Regex solved the problem to some extent:

private void button1_Click(object sender, EventArgs e)
{
    string myStr = Clipboard.GetText(TextDataFormat.Html);
    myStr = Regex.Replace(myStr, "Cambridge Advanced Learner's Dictionary - 4th Edition", "");
    Clipboard.SetText(myStr, TextDataFormat.Html);
}

However, I am still unable to remove HTML tags like <br /><br /> from the clipboard.
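A likely reason the Regex version had an effect where the original code did not is the assignment rather than Regex itself: .NET strings are immutable, so string.Replace returns a new string and leaves the original untouched. The sketch below shows the plain-string variant; the class and method names (ClipboardTextDemo, RemoveTitle) are made up for illustration.

```csharp
using System;

static class ClipboardTextDemo
{
    // Hypothetical helper: returns 'html' with every occurrence of 'title' removed.
    public static string RemoveTitle(string html, string title)
    {
        // string.Replace does not change 'html' in place; it returns a new string,
        // so the result has to be returned or assigned by the caller.
        return html.Replace(title, string.Empty);
    }
}
```

In the button handler this would be used as myStr = ClipboardTextDemo.RemoveTitle(myStr, "Cambridge Advanced Learner's Dictionary - 4th Edition"); before calling Clipboard.SetText.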

acman123

You must format the text in a special HTML Clipboard Format (link to description).

It looks like this (a working example, unlike the example given in the link, which has wrong Start- and End- numbers):

Version:1.0
StartHTML:00085
EndHTML:00287
StartFragment:00105
EndFragment:00269
<!--StartFragment--><HTML><HEAD><META HTTP-EQUIV="Content-Type" CONTENT="text/html;charset=UTF-8" /><TITLE></TITLE></HEAD><BODY>YOUR <B>HTML FORMATTED</B> TEXT GOES HERE!</BODY></HTML><!--EndFragment-->

Also make sure to fill in the right Start- and End- numbers in the top section. More specifically, you must adapt EndHTML, EndFragment and (if the header contains it) EndSelection to reflect the change in the length of your text. Replacing the text alone won't work.
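Rather than adjusting the numbers by hand, the header can be recomputed after every change. The following is a minimal sketch, assuming the offsets are UTF-8 byte counts measured from the start of the data and that a bare <html><body> wrapper is acceptable; the class name HtmlClipboardFormat and the method Wrap are hypothetical, not part of any library.

```csharp
using System;
using System.Globalization;
using System.Text;

static class HtmlClipboardFormat
{
    // Fixed-width (10-digit) offset fields, so the header length is known
    // before the real offsets are filled in.
    private const string HeaderTemplate =
        "Version:1.0\r\n" +
        "StartHTML:{0:D10}\r\n" +
        "EndHTML:{1:D10}\r\n" +
        "StartFragment:{2:D10}\r\n" +
        "EndFragment:{3:D10}\r\n";

    // Assumed minimal wrapper around the fragment.
    private const string Prefix = "<html><body><!--StartFragment-->";
    private const string Suffix = "<!--EndFragment--></body></html>";

    // Hypothetical helper: wraps an HTML fragment in a header with freshly
    // computed offsets.
    public static string Wrap(string fragment)
    {
        Encoding utf8 = Encoding.UTF8;
        CultureInfo inv = CultureInfo.InvariantCulture;

        // Format once with zeros just to measure the header's byte length.
        int headerLength = utf8.GetByteCount(string.Format(inv, HeaderTemplate, 0, 0, 0, 0));

        int startHtml = headerLength;
        int startFragment = startHtml + utf8.GetByteCount(Prefix);
        int endFragment = startFragment + utf8.GetByteCount(fragment);
        int endHtml = endFragment + utf8.GetByteCount(Suffix);

        return string.Format(inv, HeaderTemplate, startHtml, endHtml, startFragment, endFragment)
               + Prefix + fragment + Suffix;
    }
}
```

In a Windows Forms handler, the result could then be passed to Clipboard.SetText(HtmlClipboardFormat.Wrap(modifiedFragment), TextDataFormat.Html), where modifiedFragment is the edited HTML without the old header.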

Olivier Jacot-Descombes

Since HTML input can be arbitrary, here are the steps I suggest:

  1. Assuming you have a way to detect that the clipboard content is indeed HTML, tidy it using a C# library of your choice (for example, this). This lets the app work with "sanitized" content, i.e., variants of the HTML break such as <br> and <br /> will be tidied to a standard <br/>, which you can then omit or replace.

  2. Instead of using a "one-off" Regex replacement like the one you have for handling HTML breaks, try to make your code a bit more flexible by anticipating future additions to the list of offending HTML elements you need to replace, i.e., use groups (for example, this). You will then be able to give the users of your forms app a way to configure which elements to omit.
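The second step could be sketched roughly as follows. This is an illustration under assumptions, not a full HTML parser: the names TagStripper and RemoveTags are hypothetical, and the pattern assumes that attribute values never contain a ">" character.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class TagStripper
{
    // Hypothetical helper: removes every tag whose name is in 'tagNames',
    // leaving the text between the tags in place.
    public static string RemoveTags(string html, IEnumerable<string> tagNames)
    {
        // Build one alternation group from the configurable list, e.g. "br|hr".
        string alternation = string.Join("|", tagNames.Select(Regex.Escape));
        if (alternation.Length == 0)
        {
            return html; // nothing configured, nothing to remove
        }

        // Matches opening, closing and self-closing forms such as
        // <br>, <br/>, <br />, <BR class="x"> and </br>.
        // The \b keeps "b" from matching the start of "br".
        string pattern = @"</?(?:" + alternation + @")\b[^>]*>";

        return Regex.Replace(html, pattern, string.Empty, RegexOptions.IgnoreCase);
    }
}
```

For the case in the question, a call like TagStripper.RemoveTags(myStr, new[] { "br" }) would drop the breaks regardless of how they are spelled; the list of names could come from a settings dialog.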

Code Maverick