1

I wrote the function using js+jQuery.

Here is a link to a jsFiddle which shows my problem: http://jsfiddle.net/anarnold/LpBaW/

The goal of this function is to scan a table and check for rows that have certain field (td) values matching. Rows then get assigned a class denoting whether or not they are unique, and the number of matching rows are printed into the final field (td) of each row.

It basically uses a nested loop of this structure:

For each row... scan the whole table for matches..

The way I identify rows is to concatenate the field (td) texts from each row into a rowid attribute on the final field (td) for each row.

The current funciton works fine, but it gets very slow with large tables ~2000 rows.

There must be a more efficient and elegant way to accomplish this. Any help would be greatly appreciated!

Brian Tompsett - 汤莱恩
  • 5,753
  • 72
  • 57
  • 129
anarnold
  • 103
  • 13

6 Answers6

2

Here is an example of using an associative array to store the results and then iterate on that:

http://jsfiddle.net/AdDMk/

var rowIdCnt = {};

function unqOrMsgTest() {
    // Store counts here
    var rowIdCnt = {};

    // loop through check tds
    $("#binning tr td[col=check]").each(function() {

        // grab row identifer to check against other rows        
        var rowId = $(this).attr("rowid");

        if (rowId in rowIdCnt) {
            rowIdCnt[rowId] ++;
        } else {
            rowIdCnt[rowId] = 1;
        }

    });

    // Now iterate over each count and update the table appropriately:
    $.each(rowIdCnt, function(rowId, cnt) {
        //this bit of logic picks a class to assign rows        
        var resultClass = "notUnique";
        if (cnt < 2) {
            resultClass = "unique";
        }

        //apply the row class and print the redundancy number into td
        $('#binning tr td[rowid='+rowId+']').text(cnt).parent().addClass(resultClass);

    });

}
Jeff B
  • 29,943
  • 7
  • 61
  • 90
1

Here is a better preforming method of doing this. I have removed as many redundant DOM calls as I could as well as corrected the invalid attributes (HTML tags can only support certain attributes.. custom attributes need to be prefixed with data-)

$(document).ready(function(){ //this is just to fire the function
    $("#unqOrMsgTestFire").click(function(){
        unqOrMsgTest();
    });
});

function check_unique(row, collection) {
    var unique = true, rowid = $(row).children('td[data-col=check]')[0].getAttribute('data-rowid');
    collection.each(function() {
        if( $(this).children('td[data-col=check]')[0].getAttribute('data-rowid') == rowid ) {
            unique = false; 
        }
    });
    return unique;
}

function unqOrMsgTest() { 

    var collection = $("#binning tbody").children('tr');

    collection.each(function(i, el){
        el.className += check_unique( el, collection.not(el) ) ? ' unique' : 'notUnique';
    });            

}​

http://jsfiddle.net/rlemon/LpBaW/41/ <- they all fail, but that is expected.

rlemon
  • 17,518
  • 14
  • 92
  • 123
  • I've just tested mine with 2000 rows and it fails horribly. – rlemon Apr 06 '12 at 22:21
  • I respect the honesty, and I will definitely implement your feedback re. illegal attributes... (This is just for an internal testing tool, so it doesn't need to be standard compliant and semantically perfect, but its good practice to do so anyway =) – anarnold Apr 06 '12 at 22:40
  • @rlemon: while this code is more readable, it is still executing a nested loop, so the performance is not going to be much different, hence failing horribly. – Jeff B Apr 06 '12 at 22:45
0

Create hash values for each table entry and use hash tables or sort them before checking for duplicates, so you'll only need to compare neighbours.

worenga
  • 5,776
  • 2
  • 28
  • 50
  • This seems like a far better approach! Working on it now. My js fluency is still pretty low. Can anyone drop a few lines of code showing how to use and sort hashes in js? – anarnold Apr 06 '12 at 22:15
  • @Jeff B gave a nice hash table example, the sorting approach really depends on your table and what fields to compare – worenga Apr 07 '12 at 01:32
0

More elegant way is to do this on the data level before you build this table.

Aleksandar Vucetic
  • 14,715
  • 9
  • 53
  • 56
  • Thanks, vucetica. The table is dynamic; it's repeatedly appended via ajax calls. No direct access to data, so it needs to come from the table. @mightyuhu suggested extracting data into a hash then working with it there. Trying his approach now. – anarnold Apr 06 '12 at 22:11
  • If you are appending data through ajax, then it is even easier. Before appending, add just rows that don't exist in your table. That way (for your example), you will pass 2000 times through your table and 4 000 000 times through your javascript array, which is much faster than passing through your table 4 000 000 times. – Aleksandar Vucetic Apr 07 '12 at 01:38
0

Here's my recommended solution:

http://jsfiddle.net/j9RXR/29/

function unqOrMsgTest() { 
    var rows = $("#binning tbody").children('tr');
    var totalRows = rows.length;
    var idLookup = {};
    var i, rowId, resultClass, checkColumn, rowCount, row;

    // loops through all rows, convert to jQuery objects and track the IDs
    for (i = 0; i < totalRows; i++)
    {
        row = $(rows[i]);
        rowId = row.children('td[col="check"]').attr("rowid");
        rows[i] = row;

        idLookup[rowId] = (rowId in idLookup) ? idLookup[rowId] + 1 : 1;
    }

    // loop through each row and check them for redundancy
    for (var i = 0; i < totalRows; i++ )
    {
        // grab row identifer to check against the id lookup
        row = rows[i];
        checkColumn = row.children('td[col="check"]');
        rowId = checkColumn.attr("rowid");     

        //this bit of logic picks a class to assign rows        
        rowCount = idLookup[rowId];
        resultClass = rowCount < 2 ? "unique" : "notUnique";

        //apply the row class and print the redundancy number into td
        checkColumn.text(rowCount);
        row.attr("class", resultClass);        
    };            
}​

Similar to the above answers that suggest using an associative array (or hash) to store the id's and counts, I've also removed all calls to list.each( function() {...} ) and minimized the number of conversions from dom elements to jQuery objects.

The reason I removed the usage of each is because a new anonymous function is created each iteration, also redundant conversions from this to $(this) were called, not to mention the stack thrashing. Suffice to say a simple for loop is all that's needed and is much faster.

For more on jQuery pitfalls check out jQuery pitfalls to avoid

Community
  • 1
  • 1
Marc Gagne
  • 819
  • 5
  • 7
  • That is interesting. However, when I benchmark your code against 5000 items I get 1390 milliseconds runtime, and when I benchmark my code with `.each()`, I get a runtime of 194 milliseconds. I wonder if your extra jQuery calls add overhead? – Jeff B Apr 06 '12 at 23:08
  • Interesting indeed :) What browser are you using? I cannot seem to get close to 194 milliseconds even with your solution. I've updated the sample here: http://jsfiddle.net/j9RXR/35/ to include both our solutions and on Win7 with Chrome I'm getting your solution coming in around twice as slow :/ I'll have to check into this a bit more! – Marc Gagne Apr 07 '12 at 14:40
  • Also of note I found decent performance gains by hard coding (I know, I know...) columns. For example instead of `checkColumn = row.children('td[col="check"]');` This appears to be much quicker `checkColumn = row.children()[5];` – Marc Gagne Apr 07 '12 at 14:42
  • Now this has me intrigued :) Ran our solutions on Safari on the Macbook and got the same results as I did using Chrome on Windows, however when I tested with FireFox (on the Macbook) your solutions is WAY faster coming in at 227ms for yours vs 1277ms for mine. – Marc Gagne Apr 07 '12 at 14:46
  • That is interesting. I was running Firefox on Windows. I guess it depends on the optimizations of the individual browsers. Still, that seems like a wide range. – Jeff B Apr 07 '12 at 20:24
  • Decided to distill the for loop vs each loop to the basics here: http://jsfiddle.net/FxjxM/2/ Tested using IE, Chrome, FireFox & Safari, all browsers are much faster with a basic for loop vs using jQuery $.each – Marc Gagne Apr 08 '12 at 00:01
  • Thanks, you guys are awesome. Up and running with @Jeff B approach. Massively improved performance. Time to invest some time in actually learning instead of just hack+check. – anarnold Apr 09 '12 at 19:34
0

http://jsfiddle.net/LpBaW/57/

$(function(){ //this is just to fire the function
    $("#unqOrMsgTestFire").click(unqOrMsgTest);
    $("#spawn").click(function(){
        var blueprint = $("#binning tbody tr").first();
        for(var i = 0; i < 1000; i++)
            blueprint.clone().appendTo("#binning tbody").find('td[rowid]').attr('rowid', Math.floor(Math.random()*500));
    });
});

function unqOrMsgTest() {
    console.profile();
    var toCheck = $("#binning > tbody > tr > td[col=check]"),
        ignore = {},
        curId,
        currentlyProcessing,
        rowsAmount,
        i;

    i = toCheck.length - 1;
    while( i >= 0 ) {
        curId = toCheck.eq(i).attr("rowid");
        if( !(curId in ignore) ) {
            ignore[curId] = undefined;

            currentlyProcessing = $("#binning > tbody > tr > td[rowid=" + curId  + "]");

            rowsAmount = currentlyProcessing.length;

            currentlyProcessing
                .text( rowsAmount )
                .parent().attr('class', rowsAmount > 1 ? "notUnique" : "unique");
        }
        i--;
    }
    console.profileEnd();
}​

        rowsAmount = currentlyProcessing.length;

        currentlyProcessing
            .text( rowsAmount )
            .parent().attr('class', rowsAmount > 1 ? "notUnique" : "unique");

        toCheck = toCheck.not( currentlyProcessing );
    }
}
Razor
  • 27,418
  • 8
  • 53
  • 76