How can I retrieve all URLs from all href attributes? I don't want to use HTML Agility Pack or similar; the code must be clean and very short.
using System;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

class Program
{
    HttpClient client = new HttpClient();

    static async Task Main(string[] args)
    {
        Program program = new Program();
        await program.GetTodoItems();
        await program.Function(); // Function() is not shown here
        Console.WriteLine("Hello World!");
    }

    private async Task GetTodoItems()
    {
        string ResponseHtml = await client.GetStringAsync("https://example.com");
        // Matches any URL-like text anywhere in the page, not only href values
        var LinkParser = new Regex(@"\b(?:https?://|www\.)\S+\b", RegexOptions.Compiled | RegexOptions.IgnoreCase);
        foreach (Match m in LinkParser.Matches(ResponseHtml))
        {
            Console.WriteLine(m.Value);
        }
    }
}
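For comparison, here is a minimal sketch of one possible direction: match only the quoted href value inside `<a>` tags instead of any URL-like text, resolve each value against the page address, and keep only distinct http/https results. The names `LinkExtractor`, `ExtractLinks` and `baseUri` are made up for illustration, and a regex is not a full HTML parser, so unquoted href values and unusual markup are simply skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original code.
public static class LinkExtractor
{
    // Captures the quoted href value of <a ...> tags only, so
    // <script src=...> and <link href=...> are never matched.
    private static readonly Regex AnchorHref = new Regex(
        @"<a\s+(?:[^>]*?\s)?href\s*=\s*(?:""(?<url>[^""]*)""|'(?<url>[^']*)')",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    public static List<string> ExtractLinks(string html, Uri baseUri)
    {
        var seen = new HashSet<string>();   // tracks URLs already returned
        var links = new List<string>();     // keeps first-seen order

        foreach (Match m in AnchorHref.Matches(html))
        {
            // Decode entities such as &amp; that appear inside attribute values.
            string raw = WebUtility.HtmlDecode(m.Groups["url"].Value.Trim());

            // Resolve relative links against the page address.
            if (!Uri.TryCreate(baseUri, raw, out Uri absolute))
                continue;

            // Drop mailto:, javascript: and other non-web schemes.
            if (absolute.Scheme != Uri.UriSchemeHttp &&
                absolute.Scheme != Uri.UriSchemeHttps)
                continue;

            // HashSet.Add returns false for a URL that was already seen.
            if (seen.Add(absolute.AbsoluteUri))
                links.Add(absolute.AbsoluteUri);
        }

        return links;
    }
}
```

Under these assumptions it could be called as `LinkExtractor.ExtractLinks(ResponseHtml, new Uri("https://example.com"))` in place of the loop over `LinkParser.Matches`.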
I expect clean URLs, without duplicates, and only links to web pages, not to scripts. This code shows me some links with extra tags and characters, like this one: