Add HTML Agility Pack to your application using NuGet. To install it in your project, run the following command in the Package Manager Console.
> Install-Package HtmlAgilityPack
After adding the package via NuGet, import its namespace in your code file using the following.
> using HtmlAgilityPack;
The function below converts an HTML table on a web page into a list of rows, where each row is a list of cell strings. You only need to pass the table's class name and the page URL.
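// Also requires: using System.Collections.Generic; using System.Linq;
public List<List<string>> ScrapHtmlTable(string className, string pageURL)
{
    // Download and parse the page.
    HtmlWeb web = new HtmlWeb();
    HtmlDocument document = web.Load(pageURL);

    // The XPath compares the whole class attribute, so className must match it exactly.
    HtmlNode table = document.DocumentNode.SelectSingleNode("//table[@class='" + className + "']");
    if (table == null)
        return new List<List<string>>(); // SelectSingleNode returns null when nothing matches

    List<List<string>> parsedTbl = table
        .Descendants("tr")
        .Skip(1) // skip the table header row
        .Where(tr => tr.Elements("td").Count() > 1) // keep rows with at least two cells
        .Select(tr => tr.Elements("td").Select(td => td.InnerText.Trim()).ToList())
        .ToList();
    return parsedTbl;
}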
Example call (the class name must be the table's full class attribute value, here two classes):
> ScrapHtmlTable("className1 className2", "https://www.abc.xz");
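To try the same query without hitting a live site, here is a minimal sketch that runs it against an HTML string instead of a URL. The class name TableDemo, the helper ParseTable, and the sample markup are all made up for illustration. Only HtmlDocument.LoadHtml and the calls already used above come from HTML Agility Pack.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class TableDemo
{
    // Hypothetical helper: the same query as ScrapHtmlTable, applied to an HTML string.
    public static List<List<string>> ParseTable(string html, string className)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        HtmlNode table = document.DocumentNode
            .SelectSingleNode("//table[@class='" + className + "']");
        if (table == null)
            return new List<List<string>>(); // no table with that class attribute

        return table.Descendants("tr")
            .Skip(1) // skip the header row
            .Where(tr => tr.Elements("td").Count() > 1)
            .Select(tr => tr.Elements("td").Select(td => td.InnerText.Trim()).ToList())
            .ToList();
    }

    public static void Main()
    {
        // Made-up sample markup, used only for illustration.
        string html =
            "<table class='prices'>" +
            "<tr><th>Item</th><th>Price</th></tr>" +
            "<tr><td>Tea</td><td>2.50</td></tr>" +
            "<tr><td>Coffee</td><td>3.00</td></tr>" +
            "</table>";

        // Each row is a list of trimmed cell strings.
        foreach (List<string> row in ParseTable(html, "prices"))
            Console.WriteLine(string.Join(" | ", row));
    }
}
```

The header row is skipped, so only the Tea and Coffee rows are returned. A class name that matches no table gives an empty list rather than an exception.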