-5

I have this string:

 "<figure><img
 src='http://myphotos.net/image.ashx?type=2&image=Images\\2\\9\\11\\12\\3\\8\\4\\7\\685621455625.jpg'
 href='JavaScript:void(0);' onclick='return takeImg(this)'
 tabindex='1' class='myclass' width='55' height='66' alt=\"myalt\"></figure>"

How can I retrieve this link?

http://myphotos.net/image.ashx?type=2&image=Images\\2\\9\\11\\12\\3\\8\\4\\7\\685621455625.jpg

All the strings have the same format, so I need to get the substring between src= and href, but I don't know how to do that. Thanks.

jason
  • 4
    You can use HtmlAgilityPack (https://htmlagilitypack.codeplex.com/); it does a good job of parsing HTML and is usually more robust than matching with a regex. – mortb Jan 18 '17 at 10:42
  • 2
    Possible duplicate of [Find a string between 2 known values](http://stackoverflow.com/questions/1717611/find-a-string-between-2-known-values) – DevelopmentIsMyPassion Jan 18 '17 at 10:43
  • char quote = '\''; string url = (thesourcestring + quote).Split(quote)[1]; – Graffito Jan 18 '17 at 10:49

6 Answers

3

If you parse HTML, don't use string methods; use a real HTML parser such as HtmlAgilityPack:

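// requires: using System.Collections.Generic; using System.Linq;
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);  // html is your string
// SelectNodes yields the matching element nodes and returns null when nothing matches.
var images = doc.DocumentNode.SelectNodes("//img[@src]");
var allSrcList = images == null
    ? new List<string>()
    : images.Select(node => node.GetAttributeValue("src", "[src not found]")).ToList();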
Tim Schmelter
2

You can use regex:

var src = Regex.Match("the string", "<img.+?src=[\"'](.+?)[\"'].*?>", RegexOptions.IgnoreCase).Groups[1].Value;
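One caveat: by default `.` does not match line breaks, so the pattern above fails when the markup spans several lines. A minimal sketch of one way around that, assuming you are free to pass `RegexOptions.Singleline`; the class and method names (`ImgSrcRegex`, `Extract`) are made up for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

public static class ImgSrcRegex
{
    // Hypothetical helper: returns the first img src value, or "" when there is none.
    public static string Extract(string html)
    {
        // Singleline makes '.' match '\n' as well, so attributes may span several lines.
        Match m = Regex.Match(
            html,
            "<img.+?src=[\"'](.+?)[\"'].*?>",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
        return m.Success ? m.Groups[1].Value : "";
    }

    public static void Main()
    {
        // Same shape as the string in the question, with real line breaks in it.
        string html = "<figure><img\n"
            + " src='http://myphotos.net/image.ashx?type=2&image=Images\\2\\9\\11\\12\\3\\8\\4\\7\\685621455625.jpg'\n"
            + " href='JavaScript:void(0);' onclick='return takeImg(this)'\n"
            + " tabindex='1' class='myclass' width='55' height='66' alt=\"myalt\"></figure>";
        Console.WriteLine(Extract(html));
    }
}
```

Without the `Singleline` flag the same call returns an empty string for this input, because `.+?` cannot cross the line break after `<img`.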
janhartmann
  • With this method you have to remove the line breaks first. [Check this](https://regex101.com/r/gGFBtO/1) – mrogal.ski Jan 18 '17 at 10:53
  • @m.rogalski What do you mean by that? I didn't understand. – jason Jan 18 '17 at 11:03
  • I meant that if your string contains line breaks, as it does in the question, this method will not work. You can check this by clicking the link I posted. This should be specified in the answer. – mrogal.ski Jan 18 '17 at 11:06
2

In general, you should use an HTML/XML parser when parsing a value from HTML code, but with a limited string like this, Regex would be fine.

string url = Regex.Match(htmlString, @"src='(.*?)'").Groups[1].Value;
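If the quoting or spacing around `src` is not guaranteed, a slightly more tolerant pattern may help. This is only a sketch under that assumption; `SrcAttribute` and `Extract` are hypothetical names, and the pattern is still no substitute for a real parser:

```csharp
using System;
using System.Text.RegularExpressions;

public static class SrcAttribute
{
    // Hypothetical helper: first src value in the fragment, or "" when none is found.
    public static string Extract(string html)
    {
        // Group 1 remembers the opening quote; \1 requires the same quote to close the value.
        Match m = Regex.Match(html, @"\bsrc\s*=\s*(['""])(.*?)\1", RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[2].Value : "";
    }

    public static void Main()
    {
        Console.WriteLine(Extract("<img src='a.jpg' alt=\"x\">"));   // a.jpg
        Console.WriteLine(Extract("<img src = \"b's.jpg\" >"));      // b's.jpg
        Console.WriteLine(Extract("<p>no image</p>") == "");         // True
    }
}
```

Checking `Match.Success` avoids silently treating a missing attribute as a valid result.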
Abion47
1

If your string is always in the same format, you can do it like this:

string input =  "<figure><img src='http://myphotos.net/image.ashx?type=2&image=Images\\2\\9\\11\\12\\3\\8\\4\\7\\685621455625.jpg' href='JavaScript:void(0);' onclick='return takeImg(this)' tabindex='1' class='myclass' width='55' height='66' alt=\"myalt\"></figure>";
// The link sits between the first pair of ' characters:
int start = input.IndexOf('\'') + 1;   // first character after the opening quote
int end = input.IndexOf('\'', start);  // position of the closing quote
input = input.Substring(start, end - start);
// input now contains only the URL (the value of src)

return input;
mrogal.ski
1
string str = "<figure><img src='http://myphotos.net/image.ashx?type=2&image=Images\\2\\9\\11\\12\\3\\8\\4\\7\\685621455625.jpg' href='JavaScript:void(0);' onclick='return takeImg(this)' tabindex='1' class='myclass' width='55' height='66' alt=\"myalt\"></figure>";

// Start right after the opening marker and stop at the quote that precedes href.
int pFrom = str.IndexOf("src='") + "src='".Length;
int pTo = str.IndexOf("' href", pFrom);

string url = str.Substring(pFrom, pTo - pFrom);

Source :

Get string between two strings in a string

Erik Šťastný
0

Here q is your string. I look for the index of the attribute you want (src = '), remove its first few characters (7, including the spaces), and then find where the text ends by looking for the next '.

Rather than hard-coding how many characters to remove, you could compute that number (for example with .IndexOf or the marker's length).

        string q =
            "<figure><img src = 'http://myphotos.net/image.ashx?type=2&image=Images\\2\\9\\11\\12\\3\\8\\4\\7\\685621455625.jpg' href = 'JavaScript:void(0);' onclick = 'return takeImg(this)'" +
            "tabindex = '1' class='myclass' width='55' height='66' alt=\"myalt\"></figure>";
        string z = q.Substring(q.IndexOf("src = '"));
        z = z.Substring(7);
        z = z.Substring(0, z.IndexOf("'"));
        MessageBox.Show(z);

This is certainly not the most elegant way (look at the other answers for that :)).
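
The same steps can be wrapped in a small helper so that no character count is hard-coded. This is a sketch only; `StringBetween.Extract` is a hypothetical name, and it assumes the markers appear in the string exactly as written:

```csharp
using System;

public static class StringBetween
{
    // Hypothetical helper: text between the first startMarker and the next endMarker,
    // or "" when either marker is missing.
    public static string Extract(string source, string startMarker, string endMarker)
    {
        int start = source.IndexOf(startMarker, StringComparison.Ordinal);
        if (start < 0) return "";
        start += startMarker.Length;   // skip the marker itself instead of a magic number
        int end = source.IndexOf(endMarker, start, StringComparison.Ordinal);
        if (end < 0) return "";
        return source.Substring(start, end - start);
    }

    public static void Main()
    {
        string q = "<figure><img src = 'http://myphotos.net/image.ashx?type=2&image=Images\\2\\9\\11\\12\\3\\8\\4\\7\\685621455625.jpg' href = 'JavaScript:void(0);'></figure>";
        Console.WriteLine(Extract(q, "src = '", "'"));
    }
}
```

Returning an empty string for a missing marker avoids the ArgumentOutOfRangeException that Substring throws when IndexOf returns -1.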

EpicKip