3

Here's an interesting task that I faced today and I cannot think of any easy way to achieve the desired result.

Let's suppose we have a database with the following fields (columns): A,B,C,D,E,F,G but we don't know the names nor the count of the fields.

We receive a set of records from this database in the following format: {A:value1, B:value2, ...}.

If a value is not set for the current record the key will be missing too. This means I can receive {A:value} or {C:value1, D:value2} as valid records. The order of the keys will always stay the same. This means {D:value, C:value} is not a valid record.

I'm trying to recover the field names based on the returned records and keep the order of the keys.


For example I can receive records with the following keys:

  • A,C,D,E,F
  • D,F,G
  • A,B,F

From the example above I should be able to restore the original sequence which is A,B,C,D,E,F,G.

  • The first record gives us A,C,D,E,F.
  • The second one tells us that G is after F so now we have A,C,D,E,F,G
  • The third record gives us that B is after A so now we have A,B,C,D,E,F,G

If the order cannot be determined for sure we can use alphabetical order. Example for this is:

  • A,B
  • A,C

In the example above we cannot determine if the original order is A,B,C or A,C,B.

Any ideas how to implement this to work in the general case?

I will be implementing this algorithm using JavaScript but PHP, C++ or Java are also welcome.


EDIT: Do not think of the objects as standart JSON objects. In the real environment the structure is much more complex and the language is not pure JavaScript, but a modified version of ECMAScript. If it will be easier to understand - think only of the keys as an array of values ['A','B','C',...] and try to merge them, keeping the order.


EDIT 2: After struggling for some time and reading some ideas I came with the following solution:

  • Create an object that holds all relations - which column comes after which from each database record.
  • Create a relation between each a->b, b->c => a->c (inspired by Floyd–Warshall where each distance is considered as 1 if exists).
  • Create a sorting function (comparator) that will check if two elements can be compared. If not - alphabetical order will be used.
  • Get only the unique column names and sort them using the comparator function.

You can find the source-code attached below:

var allComparators = {};
var knownObjects = ['A,C,D,E,F','D,F,G','A,B,F'];
var allFields = knownObjects.join(',').split(',');

for (var i in knownObjects) {
    var arr = knownObjects[i].split(',');
    for (var i = 0; i < arr.length; i++) {
        for (var j = i + 1; j < arr.length; j++) {
            allComparators[arr[i]+'_'+arr[j]] = 1;
        }
    }
}

allFields = allFields.filter(function(value, index, self) { 
    return self.indexOf(value) === index;
});

for (var i in allFields) {
    for (var j in allFields) {
        for (var k in allFields) {
            if (allComparators[allFields[i]+'_'+allFields[j]] && allComparators[allFields[j]+'_'+allFields[k]]) {
                allComparators[allFields[i]+'_'+allFields[k]] = 1;
            }
        }
    }
}

allFields.sort(function(a, b) {
    if (typeof allComparators[a + '_' + b] != 'undefined') {
        return -1;
    }
    if (typeof allComparators[b + '_' + a] != 'undefined') {
        return 1;
    }
    return a > b;
});

console.log(allFields);
  • an object(dictionary) has no order! – Amine Hajyoussef Mar 04 '14 at 09:38
  • Are you talking about an array or an object? – Christopher Chiche Mar 04 '14 at 09:39
  • You can see the edit I made – Дамян Станчев Mar 04 '14 at 09:52
  • 1
    You should post the solution you came up with as an answer, not part of the question. Don't forget to give credit if any of the other answers were helpful (or to whatever other resources you used). – Tibos Mar 04 '14 at 12:15
  • This is a candidate-solution as it is not fully tested and I still don't know if it's 100% correct. I will give credits and accept an answer once I'm sure everything is clear. – Дамян Станчев Mar 04 '14 at 12:32
  • First,do you know that A,B,C,D,E,F,G is not the only possible answer for your test case? – Mohsen Kamrani Mar 04 '14 at 13:56
  • 1
    Indeed you can place B every where after A and before F, cause nothing says it must be before c, unless you force a new constraint into your problem. – Mohsen Kamrani Mar 04 '14 at 13:58
  • There is a restriction - *If the order cannot be determined for sure we can use alphabetical order*. In the allComparators array you can see the decision tree and indeed it is not on fixed place. It's only lesser than F and G. Indeed I haven't seen this problem in the test case, but I don't think it's a problem :) – Дамян Станчев Mar 04 '14 at 14:15
  • In your example, the third record tells you that `B` comes after `A` and before `F`. You can't assume that the order is `ABCDEFG`. Given the information you have, it could just as well be `ACDEBFG`. – Jim Mischel Mar 04 '14 at 14:16

3 Answers3

1

I give you the algorithm in a very direct and understandable way but the code! please try yourself and ask for help if required. I express myself in two ways

In technical terms :

  1. Generate a precedence graph (that is a directed graph)
  2. Topological sort it

In more details :

Graph : Map(String, ArrayList< String >) = [Map(key,value)]
each key in the map corresponds to an element (A,B,C,...)
each value contains the elements that should place after the key,e.g for A it is {B,C,D,...}
How to fill the graph :

for each row:
 for each element inside the row:
  if the element is already as a key in the map
      just add its immediate next item to the list*
  else
      add the element to the map and set the value to immediate next element of it**

* if the element is the last one in the row don't add anything to the map
** if the element is the last one in the row use {}, an empty list, as the value

Topological sort:

List sortedList;
for each key in the map:
  if value.size() == 0 {
    remove key from the map
    add it the key to the sortedList
    for each key' in the map:
      if value'.contains(key)
        value'.remove(key) (and update the map)
  }
invert the sortedList

Test case :

the map for your first input will be:
{ A:{C,B} , C:{D} , D:{E,F} , E:{F} , F:{G} , G:{} , B:{F} }

Sort : 
1 - G -> sortedList, map = { A:{C,B} , C:{D} , D:{E,F} , E:{F} , F:{} , B:{F} }
2 - F -> sortedList, map = { A:{C,B} , C:{D} , D:{E} , E:{} , B:{} }
3 - E -> sortedList, map = { A:{C,B} , C:{D} , D:{} }
4 - D -> sortedList, map = { A:{C,B} , C:{} }
5 - C -> sortedList, map = { A:{B} , B:{} }
6 - B -> sortedList, map = { A:{} }
6 - A -> sortedList, map = { }
sortedList = {G,F,E,D,C,B,A}
Invert - > {A,B,C,D,E,F,G} 
Mohsen Kamrani
  • 7,177
  • 5
  • 42
  • 66
  • Hi mok, as I see you have strong knowledge in algorithms. Can you check the solution I have just added in the original question and tell me if you can spot any weak-spots in it? I'm creating a similar to your graph and using it as a comparator for the JS sorting. Thank you. – Дамян Станчев Mar 04 '14 at 11:54
  • 1
    Thank you.I'll do it, but do you have any problem in it or just want to improve it? – Mohsen Kamrani Mar 04 '14 at 12:50
  • I'm just not sure that the 3 nested for-loops will cover all cases and I might fail on some test like a>b, b>c, c>d when comparing **a** with **d**. On this example it's working fine, but I cannot say it will work on 100% in the general case. This is my only concern right now. I found an optimization and updated my first post - if you filter the variables before the 3 nested loops you get much better performance. – Дамян Станчев Mar 04 '14 at 13:12
  • 1
    @mok Awesome! Saved me hours of banging my head against the wall. Beautiful solution and clearly demonstrated - thanks! – randomsock Jun 02 '16 at 20:17
1

do you think something like this would work?

var oMergedList = [];

function indexOfColumn(sColumnName)
{
    for(var i = 0 ; i < oMergedList.length;i++)
        if(oMergedList[i]==sColumnName)
            return i;
    return -1;
}
function getOrdinalIndex(sColumnName)
{
    var i = 0;
    for( ; i < oMergedList.length;i++)
        if(oMergedList[i]>sColumnName)
            break;
    return i;
}

function merge(oPartial)
{
    var nPreviousColumnPosition = -1;
    for(var i = 0 ; i < oPartial.length;i++)
    {
        var sColumnName =  oPartial[i] ;
        var nColumnPosition = indexOfColumn(sColumnName);
        if(nColumnPosition>=0)//already contained
        {
            if(nPreviousColumnPosition>=0 && nColumnPosition!=(nPreviousColumnPosition+1))//but inserted on wrong place
            {
                oMergedList.splice(nColumnPosition, 1);
                nColumnPosition = nPreviousColumnPosition
                 oMergedList.splice(nColumnPosition, 0, sColumnName);
            }
            nPreviousColumnPosition = nColumnPosition;
        }
        else //new
        {
            if(nPreviousColumnPosition<0)//no reference column
            {
                nPreviousColumnPosition = getOrdinalIndex(sColumnName);
            }
            else// insert after previous column
                nPreviousColumnPosition++;
            oMergedList.splice(nPreviousColumnPosition, 0, sColumnName);
        }

    }
}
/* latest sample
merge(['A','C','E','G']);
merge(['A','D']);
merge(['C','D']);
*/
/* default sample
merge(['A','C','D','E','F']);
merge(['D','F','G']);
merge(['A','B','F']);
*/
/* fix order
merge(['A','B']);
merge(['A','C']);
merge(['A','B','C']);
*/
/* insert alphabetically
merge(['B']);
merge(['A']);
merge(['C']);
*/
document.body.innerHTML = oMergedList.join(',');

the only "undefined" parts are where to insert if you have no previous columns (I putted in firt position) and second in the case A,B.. A,C the columns will be inserted when first seen

means A,B..A,C will give A,C,B .. and means A,C..A,B will give A,B,C


edited to use the current array position to fix previous addition so if you add [A,C][A,B] you will get [A,C,B] but if you then pass [A,B,C] the array will be fixed to reflect the new order

also when new columns appears and there is no reference column appends in alphabetical order


fixed the column correctioning par.. should now give you the correct result..

CaldasGSM
  • 3,032
  • 16
  • 26
0

As described by JSON.org there is not such thing as a Json ordered keys:

An object is an unordered set of name/value pairs.

That being said, it becomes quite easy to merge objects as you don't need the order.

for (var attrname in obj2) { obj1[attrname] = obj2[attrname]; }

Source: How can I merge properties of two JavaScript objects dynamically?

Community
  • 1
  • 1
Christopher Chiche
  • 15,075
  • 9
  • 59
  • 98