Is there a way to extract titles and URLs from a Yahoo search results page using the HTML Agility Pack?
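using System;
using HtmlAgilityPack;

HtmlWeb web = new HtmlWeb();
string queryText = "your_search_query_here";
// URL-encode the query so spaces and special characters survive in the URL.
string searchResults = "https://en-maktoob.search.yahoo.com/search?p=" + Uri.EscapeDataString(queryText);
var document = web.Load(searchResults);
// Select every anchor that carries both a cite and an href attribute.
var nodes = document.DocumentNode.SelectNodes("//a[@cite and @href]");
// SelectNodes returns null when nothing matches.
if (nodes != null)
{
    foreach (var node in nodes)
    {
        string title = node.Attributes["title"]?.Value;
        string url = node.Attributes["href"]?.Value;
    }
}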
This code successfully retrieves titles and URLs from the Yahoo search results; however, it also picks up ad links and other unwanted URLs. How can I filter out these irrelevant links so that only the actual search results remain?
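One possible starting point is to filter on the URL itself, independently of the page markup. This is a minimal sketch, not a verified solution. The `LinkFilter` class, its method names, and the `yahoo.com` block list are all hypothetical. It assumes that the unwanted links point back to Yahoo-owned hosts, which I have not checked against the live page. If Yahoo wraps the real results in its own redirect URLs, this filter would drop them as well, so the block list would need adjusting after inspecting the actual `href` values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinkFilter
{
    // Hypothetical block list (an assumption, not taken from Yahoo's markup):
    // any host equal to, or a subdomain of, one of these entries is rejected.
    private static readonly string[] BlockedHostSuffixes = { "yahoo.com" };

    // Returns true only for absolute http/https URLs whose host is not blocked.
    public static bool IsExternalLink(string href)
    {
        // Rejects null, empty, and relative hrefs.
        if (!Uri.TryCreate(href, UriKind.Absolute, out Uri uri))
            return false;

        // Rejects javascript:, mailto:, file: and similar schemes.
        if (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps)
            return false;

        string host = uri.Host;
        return !BlockedHostSuffixes.Any(suffix =>
            host.Equals(suffix, StringComparison.OrdinalIgnoreCase) ||
            host.EndsWith("." + suffix, StringComparison.OrdinalIgnoreCase));
    }

    // Keeps the links that pass IsExternalLink and drops duplicates.
    public static List<string> FilterLinks(IEnumerable<string> hrefs)
    {
        return hrefs
            .Where(IsExternalLink)
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```

Inside the existing `foreach` loop, this would amount to checking `LinkFilter.IsExternalLink(url)` and skipping the node when it returns false.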