-3

I prefer to use regex not any HTML Parser.

Best way to extract base64 image from a HTMl that string is like:

 "<p>This is test </p>
  <p><img src=\"data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/4gKgSUNDX1BST0ZJTEUA....+tzPaXLlstlSjpcxKPEqV/zH//2Q==\"></p>"

I need this line so I can have access to base 64 image:

/9j/4AAQSkZJRgABAQAAAQABAAD/4gKgSUNDX1BST0ZJTEUA....+tzPaXLlstlSjpcxKPEqV/zH//2Q==
Alma
  • 3,780
  • 11
  • 42
  • 78

1 Answers1

1

If there is an adequate HTML parser for this use case as suggested by others in the comments, go for that...

But, if that doesn't work, regular expressions to the rescue! This is using a positive lookbehind assertion and is matching everything until the first double quote. Should work -- adjust if it doesn't...

var val = "<p>This is test </p><p><img src=\"data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/4gKgSUNDX1BST0ZJTEUA....+tzPaXLlstlSjpcxKPEqV/zH//2Q==";
var match = Regex.Match(val, "(?<=data:image/jpeg;base64,)[^\"]*");
Console.WriteLine(match.Value);

// output: /9j/4AAQSkZJRgABAQAAAQABAAD/4gKgSUNDX1BST0ZJTEUA....+tzPaXLlstlSjpcxKPEqV/zH//2Q==

Igor Pashchuk
  • 2,455
  • 2
  • 22
  • 29