
I would like to store links in a Firebase table. I want them to be unique. These links will also be referred to by other tables, so I'd prefer not to have to store the entire long URL string of a link every time I refer to it. I'm having trouble finding a way to enforce the uniqueness of links while also using a relatively short key to refer to them.

For example, given the following schema:

{
  "comments" : {
    "-JYC6EkXz5DZt7s5jFMT" : {
      "content" : "This is the first comment.",
      "createdAt" : 1412190501922,
      "link" : "http---testing-com-1-some-article",
      "userId" : 0
    },
    "-JYC6EmzCoKfYol1Ybyo" : {
      "content" : "This is a reply to the first.",
      "createdAt" : 1412190502079,
      "link" : "http---testing-com-1-some-article",
      "replyToCommentId" : "-JYC6EkXz5DZt7s5jFMT",
      "userId" : 1
    },
    "-JYC6Ep9lwdAwQbZmdYH" : {
      "content" : "This is a reply to the second.",
      "createdAt" : 1412190502218,
      "link" : "http---testing-com-1-some-article",
      "replyToCommentId" : "-JYC6EmzCoKfYol1Ybyo",
      "userId" : 0
    }
  },
  "links" : {
    "http---testing-com-1-some-article" : {
      "comments" : {
        "-JYC6EkXz5DZt7s5jFMT" : true,
        "-JYC6EmzCoKfYol1Ybyo" : true,
        "-JYC6Ep9lwdAwQbZmdYH" : true
      },
      "createdAt" : 1412190501880,
      "url" : "http://testing.com/1/some_article"
    }
  },
  "users" : [ {
    "comments" : {
      "-JYC6EkXz5DZt7s5jFMT" : true,
      "-JYC6Ep9lwdAwQbZmdYH" : true
    },
    "createdAt" : 1412190501881,
    "name" : "Joe Blow"
  }, {
    "comments" : {
      "-JYC6EmzCoKfYol1Ybyo" : true
    },
    "createdAt" : 1412190501881,
    "name" : "Jack Black"
  } ]
}

As you can see, each comment must include a long key for the link it belongs to. Is there a good way to shorten these keys while keeping uniqueness?

cayblood
  • You can probably hash the URL to get something shorter that is reasonably likely to be unique. Or just keep a counter somewhere in your Firebase and grab the next value each time a new one is needed (within a `transaction`, of course). – Frank van Puffelen Oct 01 '14 at 20:21

1 Answer

Are you having storage issues? If not, I wouldn't worry about optimizing just yet, as the added complexity is probably not worth the perceived (and possibly intangible) reward.

To answer the question, one simple, foolproof approach would be to just assign each URL an id and use an index table for lookups.

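var ref = new Firebase(URL);
var indexRef = ref.child('url_index');

// Firebase keys may not contain . $ # [ ] or /, and encodeURI() leaves
// several of those intact, so escape every component and then the dots.
function urlToKey(url) {
   return encodeURIComponent(url).replace(/\./g, '%2E');
}

function assignId(url) {
   var key = urlToKey(url);
   // create a new, unique id
   var uniqueId = indexRef.push().name();
   // store the id by URL, keeping any id that was already assigned
   indexRef.child(key).transaction(function(currentId) {
      // returning undefined aborts the transaction and leaves the old id in place
      return currentId === null ? uniqueId : undefined;
   });
}

function lookupId(url, callback) {
   var key = urlToKey(url);
   indexRef.child(key).once('value', function(snap) {
      // returns the unique id for this URL, or null if it does not exist yet
      callback(snap.val());
   });
}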
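
A variation on the id approach, along the lines of the counter suggested in the comments on the question, is to draw ids from an incrementing counter inside a transaction. The following is only a sketch: `nextCounterValue`, `assignLinkId`, the `link` prefix, and `makeFakeCounter` are hypothetical names, and the fake counter is an in-memory stand-in so the snippet runs without a Firebase connection. It assumes the legacy `transaction(updateFunction, onComplete)` signature, where `onComplete` receives `(error, committed, snapshot)`.

```javascript
// Update function for a transaction: receives the counter's current value
// (null when the counter does not exist yet) and returns the next value.
function nextCounterValue(current) {
  return (current === null || current === undefined ? 0 : current) + 1;
}

// Ask the counter for the next value and turn it into a short link id.
// `counter` is assumed to expose transaction(updateFn, onComplete).
function assignLinkId(counter, callback) {
  counter.transaction(nextCounterValue, function (error, committed, snapshot) {
    if (error || !committed) {
      return callback(error || new Error('transaction not committed'));
    }
    callback(null, 'link' + snapshot.val());
  });
}

// Hypothetical in-memory stand-in for a Firebase reference (illustration only).
// It applies the update function immediately and reports success.
function makeFakeCounter() {
  var value = null;
  return {
    transaction: function (updateFn, onComplete) {
      value = updateFn(value);
      var committedValue = value;
      onComplete(null, true, { val: function () { return committedValue; } });
    }
  };
}
```

With a real database, `counter` would presumably be a reference such as `ref.child('link_counter')` (an assumed path), and the returned id would then be stored in the index under the URL's key.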

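A simpler approach would be to hash each URL and use the hash as its key. This is not foolproof, because two URLs can hash to the same value, but with a sufficiently long hash it's plenty unique at the scale of human usage (i.e. fewer than billions of records). The 32-bit hash below is much weaker: collisions become likely after tens of thousands of URLs.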

The nice thing here is you don't need to do a lookup when you have the URL. You can just hash it to obtain its key and then perform whatever ops you want from it (including checking to see if it is unique or fetching the actual URL).

// taken from: http://stackoverflow.com/questions/7616461/generate-a-hash-from-string-in-javascript-jquery
function hashCode(string) {
  var hash = 0, i, chr, len;
  if (string.length == 0) return hash;
  for (i = 0, len = string.length; i < len; i++) {
    chr   = string.charCodeAt(i);
    hash  = ((hash << 5) - hash) + chr;
    hash |= 0; // Convert to 32bit integer
  }
  return hash;
};
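
To get a feel for how far a 32-bit hash like the one above stretches, the usual birthday-bound approximation can be evaluated directly. This is a rough sketch, not part of the original answer: `collisionProbability` is a hypothetical helper, and it assumes the hash spreads URLs uniformly over its output range, which a simple string hash only approximates.

```javascript
// Approximate probability that at least two of `n` items collide when hashed
// uniformly into 2^bits buckets: p ~= 1 - exp(-n(n-1) / (2 * 2^bits)).
function collisionProbability(n, bits) {
  var space = Math.pow(2, bits);
  // -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x
  return -Math.expm1(-(n * (n - 1)) / (2 * space));
}

console.log(collisionProbability(77163, 32));   // roughly even odds
console.log(collisionProbability(1000000, 64)); // tiny for a 64-bit hash
```

Under that approximation, a 32-bit hash reaches roughly even odds of a collision at around 77,000 URLs, so a longer hash is the safer choice once the table grows.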

var ref = new Firebase(URL);
var indexRef = ref.child('url_index');

function storeUrl(url) {
   // child() expects a string path, so convert the numeric hash
   var key = String(hashCode(url));
   // store the URL under its hash
   indexRef.child(key).set(url);
   return key;
}

function getUrl(key, callback) {
   indexRef.child(key).once('value', function(snap) {
      // returns the url for a given hash code
      callback(snap.val());
   });
}
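
If collisions are a concern, a longer hash can be swapped in for `hashCode`. The sketch below assumes a Node.js environment and uses its built-in `crypto` module; `urlKey` and the default length of 16 hex characters (64 bits) are illustrative choices, not something from the original answer. Hex digits contain none of the characters Firebase forbids in keys.

```javascript
var crypto = require('crypto');

// Hash a URL with SHA-256 and keep the first `length` hex characters
// (16 by default, i.e. 64 bits) as a short, fixed-length key.
function urlKey(url, length) {
  var hex = crypto.createHash('sha256').update(url, 'utf8').digest('hex');
  return hex.slice(0, length || 16);
}

// The same URL always maps to the same key, so no lookup table is needed.
var key = urlKey('http://testing.com/1/some_article');
console.log(key.length); // 16
```

Truncating trades key length against collision resistance; passing `64` as the second argument keeps the full digest.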
Kato
  • Thanks @kato. I chose to use an atomically incremented counter for link ids so as to avoid potential hash collisions. – cayblood Oct 03 '14 at 00:16