I'm using Selenium WebDriver to iterate over the rows of a table and create an instance of class T for each row, setting properties on the object based on the data in the row:

public override void RefreshElements()
{
    base.RefreshElements();

    var browseTableRows = Driver.FindElements(By.CssSelector("table.browse>tbody>tr"));
    ItemsList = new List<T>(browseTableRows.Count);
    ItemsById = new Dictionary<int, T>(browseTableRows.Count);

    foreach (var tr in browseTableRows) {
        T item = new T() {
            ID = int.Parse(tr.FindElement(By.XPath("td[2]")).Text),
            Name = tr.FindElement(By.XPath("td[3]")).Text,
            Description = tr.FindElement(By.XPath("td[4]")).Text
        };
        ItemsList.Add(item);
        ItemsById.Add(item.ID, item);
    }
}

This code is quite slow. Any suggestions on how I can speed up this code?

Just to be clear, class T doesn't do anything elaborate:

public class T
{
    public int ID { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
}

In case it's useful, I'm using version 2.29.1 of Selenium, .NET 4.0 and I'm running the Internet Explorer driver.

Dan Stevens
  • What's the reason behind needing a class instance for each individual row? Also, *how slow*? Which *bit* is slow? The actual `.FindElements` call? The iterating through the elements? – Arran Jan 24 '13 at 16:18
  • I'm fairly certain the slowness is due to the `FindElements` method, or the choice of parameters passed to the method. I've added the contents of class T so that this is clear. The table I'm testing with has 34 rows and took 52 seconds to process. – Dan Stevens Jan 24 '13 at 16:39
  • Does it make a difference if you switch to a different driver? – Arran Jan 24 '13 at 17:28
  • Yes, the FirefoxDriver is much faster. – Dan Stevens Jan 25 '13 at 09:15
  • `By.XPath` is very slow on IE. We had a discussion [here](http://stackoverflow.com/a/14165197/1167879) – Alex Okrushko Jan 25 '13 at 17:54

1 Answer

A few things come to mind. First, you're calling `FindElement()` for each cell in the row. You'd likely be better off calling `row.FindElements(By.TagName("td"))` once per row and indexing into the collection it returns.
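As a rough sketch of that first suggestion, the loop from the question could look something like the following. The `BrowseTableReader` class and `ReadRows` method are hypothetical names used only for illustration, and the code assumes the rows contain no nested tables (otherwise `By.TagName("td")` would also pick up the inner cells). The XPath positions `td[2]`, `td[3]` and `td[4]` are 1-based, so they become indexes 1, 2 and 3.

```csharp
using System.Collections.Generic;
using OpenQA.Selenium;

// Same shape as the class T shown in the question.
public class T
{
    public int ID { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
}

// Hypothetical helper, not part of the question's code.
public static class BrowseTableReader
{
    public static List<T> ReadRows(IWebDriver driver)
    {
        var rows = driver.FindElements(By.CssSelector("table.browse>tbody>tr"));
        var items = new List<T>(rows.Count);

        foreach (var tr in rows)
        {
            // One element lookup per row instead of three XPath lookups.
            // Assumes no nested tables inside the row.
            var cells = tr.FindElements(By.TagName("td"));

            items.Add(new T
            {
                ID = int.Parse(cells[1].Text),   // was td[2]
                Name = cells[2].Text,            // was td[3]
                Description = cells[3].Text      // was td[4]
            });
        }

        return items;
    }
}
```

This still calls `.Text` three times per row, so it only removes the lookup cost, not the cost of reading the text.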

Also, getting the text of an element is one of the most expensive operations in WebDriver, since the driver has to walk the DOM (up and down) to determine visibility of parent and child nodes due to CSS styling. If you're sure there's no styling in the table cell that you need to be careful of, you could use a JavaScript call instead to get the inner text of the element, which doesn't care about styling.
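A minimal sketch of the JavaScript route, assuming the driver instance implements `IJavaScriptExecutor` (the standard browser drivers do). The `ElementText` class and `GetInnerText` method are hypothetical names for illustration. Note that `innerText` ignores the visibility checks that `.Text` performs, so whitespace and hidden content may come back differently.

```csharp
using OpenQA.Selenium;

// Hypothetical helper, not part of the question's code.
public static class ElementText
{
    // Reads the element's text through JavaScript rather than the .Text property.
    public static string GetInnerText(IWebDriver driver, IWebElement element)
    {
        var js = (IJavaScriptExecutor)driver;

        // The element is passed to the script as arguments[0].
        return (string)js.ExecuteScript("return arguments[0].innerText;", element);
    }
}
```

In the loop from the question, a call such as `ElementText.GetInnerText(Driver, cell)` would then stand in for `cell.Text`.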

Finally, iterating over an entire table as you're doing here is going to be much less efficient than only getting the information you need from the page on-demand. I would reexamine my approach so it doesn't depend on iterating over the entire table at once.
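One way the on-demand approach might look is sketched below. The `BrowseTable` class and its `GetItemByIndex` method are illustrative assumptions rather than anyone's actual implementation; the sketch assumes a zero-based row index and the same column layout as in the question.

```csharp
using System;
using OpenQA.Selenium;

// Same shape as the class T shown in the question.
public class T
{
    public int ID { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
}

// Hypothetical page-object fragment, for illustration only.
public class BrowseTable
{
    private readonly IWebDriver driver;

    public BrowseTable(IWebDriver driver)
    {
        this.driver = driver;
    }

    // Reads a single row (zero-based index) instead of the whole table.
    public T GetItemByIndex(int index)
    {
        var rows = driver.FindElements(By.CssSelector("table.browse>tbody>tr"));
        if (index < 0 || index >= rows.Count)
        {
            throw new ArgumentOutOfRangeException("index");
        }

        // Only the requested row pays the cost of the cell lookup and text reads.
        var cells = rows[index].FindElements(By.TagName("td"));
        return new T
        {
            ID = int.Parse(cells[1].Text),
            Name = cells[2].Text,
            Description = cells[3].Text
        };
    }
}
```

The row lookup still touches every `tr`, but the expensive text reads are limited to the three cells that are actually needed.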

JimEvans
  • Thanks for the great suggestions. In the end I opted for your last suggestion: instead of iterating the whole table, I created two new methods to replace the `RefreshElements` method - `GetItemByIndex(int index)` and `GetItemByID(int id)`, which only attempt to retrieve a single row. – Dan Stevens Jan 25 '13 at 12:22
  • I would suggest using JS or an HTML parser and passing the whole table to it. As a comparison, `execute_script('return arguments[0].innerText', cell_webel)` took 2 seconds where `cell_webel.text` took 6 seconds, which is a significant difference. – Michal Nov 25 '21 at 16:58