Here's an interesting task that I faced today and I cannot think of any easy way to achieve the desired result.
Let's suppose we have a database with the following fields (columns): A,B,C,D,E,F,G but we don't know the names nor the count of the fields.
We receive a set of records from this database in the following format: {A:value1, B:value2, ...}.
If a value is not set for the current record the key will be missing too. This means I can receive {A:value} or {C:value1, D:value2} as valid records. The order of the keys will always stay the same. This means {D:value, C:value} is not a valid record.
I'm trying to recover the field names based on the returned records and keep the order of the keys.
For example I can receive records with the following keys:
- A,C,D,E,F
- D,F,G
- A,B,F
From the example above I should be able to restore the original sequence which is A,B,C,D,E,F,G.
- The first record gives us A,C,D,E,F.
- The second one tells us that G is after F so now we have A,C,D,E,F,G
- The third record gives us that B is after A so now we have A,B,C,D,E,F,G
If the order cannot be determined for sure we can use alphabetical order. Example for this is:
- A,B
- A,C
In the example above we cannot determine if the original order is A,B,C or A,C,B.
Any ideas how to implement this to work in the general case?
I will be implementing this algorithm using JavaScript but PHP, C++ or Java are also welcome.
EDIT: Do not think of the objects as standart JSON objects. In the real environment the structure is much more complex and the language is not pure JavaScript, but a modified version of ECMAScript. If it will be easier to understand - think only of the keys as an array of values ['A','B','C',...] and try to merge them, keeping the order.
EDIT 2: After struggling for some time and reading some ideas I came with the following solution:
- Create an object that holds all relations - which column comes after which from each database record.
- Create a relation between each a->b, b->c => a->c (inspired by Floyd–Warshall where each distance is considered as 1 if exists).
- Create a sorting function (comparator) that will check if two elements can be compared. If not - alphabetical order will be used.
- Get only the unique column names and sort them using the comparator function.
You can find the source-code attached below:
var allComparators = {};
var knownObjects = ['A,C,D,E,F','D,F,G','A,B,F'];
var allFields = knownObjects.join(',').split(',');
for (var i in knownObjects) {
var arr = knownObjects[i].split(',');
for (var i = 0; i < arr.length; i++) {
for (var j = i + 1; j < arr.length; j++) {
allComparators[arr[i]+'_'+arr[j]] = 1;
}
}
}
allFields = allFields.filter(function(value, index, self) {
return self.indexOf(value) === index;
});
for (var i in allFields) {
for (var j in allFields) {
for (var k in allFields) {
if (allComparators[allFields[i]+'_'+allFields[j]] && allComparators[allFields[j]+'_'+allFields[k]]) {
allComparators[allFields[i]+'_'+allFields[k]] = 1;
}
}
}
}
allFields.sort(function(a, b) {
if (typeof allComparators[a + '_' + b] != 'undefined') {
return -1;
}
if (typeof allComparators[b + '_' + a] != 'undefined') {
return 1;
}
return a > b;
});
console.log(allFields);