2

I have an array that looks like this (but can be very big, and with longer strings):

var treeOfStrings = [
  'abc',
  'def',
  [
    'ghij',
    [
      'k',
      'l'
    ],
    'mnop'
  ],
  'qrst',
  'uv',
  'wxyz'
];

I want to turn it into a single string, but since it is big and this is in a performance sensitive place, I'd like to do it in a very efficient way.

The current approach (below) is to first make a single array (by recursively walking the children and using push() on a single output array), and then doing a join()....but it seems like there could be a more efficient way, since this one is pretty heavy on the usage of push() so it could do lots of little allocations.

function treeJoin(tree) {
  var out = [];
  flattenArray(tree);
  return out.join('');

  function flattenArray(branch) {
    var len = branch.length;
    var item;
    for(var i=0; i<len; i++) {
      item = branch[i];
      if(typeof item === "string")
        out.push(item);
      else
        flattenArray(item);
    }
  }
}

(the solution will need to work in all reasonably modern browsers as well as node.js)

Edit:

I timed these functions, and interestingly, the simplest way of doing it (well, simplest to me) is actually the fastest (see below). I would have thought doing "+=" on a string over and over was slow, but nope, it's screaming fast (in Chrome). Over ten times faster than the method below that uses map() and join(), and about 6 times faster than the method that overrides Array.toString(), and twice as fast as my method that flattens the tree into a single array and then uses a single join(). Crazy.

treeJoin = function (tree) {
  var s = ''
  var len = tree.length;
  var item;
  for(var i=0; i<len; i++) {
    item = tree[i];
    if(item.push)
      s += treeJoin(item);
    else
      s += item;
  }
  return s
}
rob
  • 9,933
  • 7
  • 42
  • 73

3 Answers3

2

Sometimes cheating works:

var result = treeOfStrings.toString().replace(/,/g,"");

Converting arrays to strings joins their elements with commas, then strip out the commas and you're good to go.

Doesn't behave quite as hoped if any of the pieces already contained commas.

Niet the Dark Absol
  • 320,036
  • 81
  • 464
  • 592
  • Yes it could contain commas in the strings to that won't work. Clever though. – rob Aug 21 '14 at 22:06
  • Alternatively, using `JSON.stringify()`, something like: `JSON.stringify(treeOfStrings).replace(/","|",\["?|"\],"?|^\["|"\]$/g, '')` which could be a bit more resistant to commas. – kamituel Aug 21 '14 at 22:08
  • @kamituel sorry yours doesn't come up with the right result, even with all alphanumeric strings. (its also the slowest of the bunch). But thanks! – rob Aug 21 '14 at 23:52
2

You could do it a little more concisely (though it'd be algorithmically the same) with .map() and .join():

var flatTree = tree.map(function flattener(elem) {
  return (typeof elem) === "string" ? elem : elem.map(flattener).join("");
}).join("");

That'll work if the "tree" is just strings and sub-arrays.

Pointy
  • 405,095
  • 59
  • 585
  • 614
  • Interesting. I don't think its algorithmically the same since yours does lots of joins() while mine only does one at the very end. I will test it on some of our real world examples and see if it is as fast or faster than my current solution....it might be. (I'll wait a day or two to accept an answer, to see if anyone comes up with a super-crazy-fast one, but right now yours is the best one :) ) – rob Aug 21 '14 at 22:22
  • @rob yes there are more joins, but the same number of strings are involved. Unless you've got hundreds of thousands of elements in you tree, I wouldn't expect this to take much time. Dynamic languages like JavaScript are heavily optimized to make stuff like that go pretty fast. – Pointy Aug 21 '14 at 22:23
  • yeah I don't doubt it's quite fast, and very likely faster than mine because of all the pushes in mine. I've never really used map() so it's nice to have a good example I can relate to. – rob Aug 21 '14 at 22:30
  • I know it isnt the same language, but this SO on array_map vs foreach in php probably still accurately accounts for the slowness of using map vs other methods, tl; dr because of how the memory manager handles the lambda (anonymous function) http://stackoverflow.com/questions/18144782/performance-of-foreach-array-map-with-lambda-and-array-map-with-static-function – chiliNUT Aug 22 '14 at 07:09
  • @chiliNUT I doubt JavaScript would show similar differences. – Pointy Aug 22 '14 at 13:04
1

Join is not recursive

var str=treeOfStrings.join("");
console.log(str); //"abc,def,ghij,k,l,mnop,qrst,uv,wxyz"

So @Pointy's method uses map to do a recursive join.

toString on an array will make a call to join instead, which will use join's default delimiter, a comma. (See here for reference)

For the nested arrays, it will automatically join recursively, even if you override toString:

Array.prototype.toString=function() {
return this.join("");
};

then

var str=treeOfStrings.toString(); 
console.log(str); //"abcdefghijklmnopqrstuvwxyz"

Since the toString is done recursively to the nested array elements, so is the join. JS handles the recursion for you, so I would assume it is a fast algorithm, but I haven't benchmarked.

Overriding toString may not be desirable in all situations, but it is just another option.

chiliNUT
  • 18,989
  • 14
  • 66
  • 106
  • Very creative! Results so far: yours is the fastest of the answers, although my original solution is actually a tad faster, believe it or not. Mine averages about 10 milliseconds, yours about 12, Pointy's about 23 and the one using JSON.stringify() 37 (except it came up with a different result so it is disqualified). I'm using randomly generated trees of random strings, each string averaging about 50 characters, the final string being about 67k, and I run it 100 times in a row to time it. Obviously the results probably would differ depending on the input data. – rob Aug 21 '14 at 23:50
  • Ok, well I had to pick one and so I picked yours. I learned the most from it (so weird how it does join recursively...not sure I understand how that works). Still check out my results....the most naive recursive solution is far and away the fastest. – rob Aug 22 '14 at 02:22
  • Sweet! I didnt understand either, but my guess is that when you call tostring, on a nested array, the javascript interpreter is recursively calling tostring on each nested array. – chiliNUT Aug 22 '14 at 03:37
  • 1
    Also @rob fwiw I would guess your algorithm would be a bit faster if instead of pushing to an array, you instead concated to a string, which would both alleviate the need to reallocate memory for all the pushes, and also you wouldnt need to join – chiliNUT Aug 22 '14 at 07:15
  • You are right that concatting strings seems to be way faster than push() and join(), but I had thought the opposite. – rob Aug 22 '14 at 07:27