0

I've got a title string that comes in from a form my users will submit. I'd like to take that title, in whatever way it is written, and format it to title case, which I believe to be the most read-friendly way.

alexis Sanchez -> Alexis Sanchez
sanchez's ball -> Sanchez's Ball
sanchez and co -> Sanchez and Co

I'd also like to take words like "and" or "or" and have them as lowercase, just as shown above. Is this possible?

Also, on some occasions, I'd like to take this title and filter it so it ignores any special characters like apostrophes and separates it with dashes. This will be for URLs (i.e. .com/sanchezs-ball).

alexis Sanchez -> alexis-sanchez
ALEXIS Sanchez -> alexis-sanchez
sanchez's ball -> sanchezs-ball

I've created two functions that almost do this. I just need some advice on how to tweak them to get my desired format.

This one takes my title and turns it into a URL-likeable string. Can this be improved/shortened?

function hashifyString(dirtyString) {
  var title = dirtyString
  .replace(/[^a-z0-9\s]/gi, '')
  .replace(/\b\w+/g,function(s){return s.substr(0).toLowerCase();})
  .replace(/[_\s]/g, '-');
  return title;
}

I need some advice on how to improve this one so that it looks out for "and" or "or" and keeps those two words lowercase while keeping the rest of the string in title case.

function beautifyString(dirtyString) {
  var string = dirtyString
  .replace(/[^a-z0-9\s]/gi, '')
  .replace(/\b\w+/g,function(s){return s.charAt(0).toUpperCase() + s.substr(1).toLowerCase();})
  return string;
}

I hope this makes sense. Any help is appreciated. Thanks in advance!

realph
  • You said you want `.com/Sanchezs-Ball` (notice the Capitalization?) but your `hashifyString` does not do that! – Roko C. Buljan Dec 05 '14 at 02:50
  • Sorry, that was a typo. Fixed! – realph Dec 05 '14 at 02:51
  • Have you thought about what happens if one of your users has diacritics in their name? E.g. you'd get `franois-font` instead of `francois-fonte`, not to mention Germans, they have ümlauts all over the place :) – Roko C. Buljan Dec 05 '14 at 02:58
  • @RokoC.Buljan I didn't actually think about that. Is there a way to account for this? – realph Dec 05 '14 at 03:01
  • Sure there is! http://stackoverflow.com/questions/990904/javascript-remove-accents-diacritics-in-strings and you still need to cover most of the expected utf-8 unicode characters that are not present in those examples. – Roko C. Buljan Dec 05 '14 at 03:21
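
As a rough illustration of the diacritics point raised in these comments, here is a minimal sketch that strips accents before building the slug. The names `stripDiacritics` and `slugify` are hypothetical and do not come from the thread, and the approach assumes an environment that supports `String.prototype.normalize` (ES2015 or later). Letters with no accent decomposition (for example ß or ø) are simply dropped by the special-character step, so this is not a complete transliteration.

```javascript
// Hypothetical helper (not from the thread): remove accents from a string.
function stripDiacritics(str) {
  // NFD splits "é" into "e" plus a combining accent (U+0301);
  // the regex then removes combining marks in the range U+0300..U+036F.
  return str.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

// Hypothetical helper: build a URL-friendly slug from a title.
function slugify(title) {
  return stripDiacritics(title)
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, '') // drop apostrophes and other special characters
    .trim()
    .replace(/\s+/g, '-');       // collapse whitespace runs into single dashes
}

console.log(slugify('François Fonté')); // francois-fonte
console.log(slugify("sanchez's ball")); // sanchezs-ball
```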

2 Answers

1

First, let's create a simple inArray function that reports whether an array contains a given value.

function inArray(val, ary){
  for(var i=0,l=ary.length; i<l; i++){
    if(ary[i] === val){
      return true;
    }
  }
  return false;
}

Next, we need a function that lowercases the entire string so that only the first letter of each word gets capitalized. Split the lowercased string on ' ', then loop over the resulting words and check whether each one is in the exceptions Array. If it is, leave it lowercase; otherwise capitalize its first letter. Finally, .join(' ') the words back together on the same ' ' and return the result.

function ucFirstWord(str, exceptions){
  var st = str.toLowerCase().split(' '), r = [];
  for(var i=0,l=st.length; i<l; i++){
    var s = st[i];
    var x = inArray(s, exceptions) ? s : s.charAt(0).toUpperCase()+s.substring(1);
    r.push(x);
  }
  return r.join(' ');
}

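This function lowercases the string, strips special characters such as apostrophes, and replaces each run of whitespace with a single hyphen.

function lcHyphenize(str){
  return str.toLowerCase()
    .replace(/[^a-z0-9\s]/g, '')  // drop apostrophes and other special characters
    .replace(/\s+/g, '-');        // a string pattern like ' ' would only replace the first space
}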

Usage:

var ucString = ucFirstWord(yourStringVar, ['the', 'and', 'then', 'but', 'or', 'yet']);
var hzString = lcHyphenize(yourStringVar);
StackSlave
  • That is the `exceptions` Array, used with my `inArray` function, to make sure those words do not become capitalized. – StackSlave Dec 05 '14 at 03:27
  • Code only answers aren't liked. You should explain what it does and why it fixes the OPs issue. And it means you are less likely to get comments like "what does do?". – RobG Dec 05 '14 at 03:29
  • @robG :) it's just me that did not read the OP's request about `I'd also like to take words like "and" or "or" and have them as lowercase` my bad (I thought at first that it was all about Names Surnames), but yes, you're right. – Roko C. Buljan Dec 05 '14 at 03:34
  • @PHPglue This is great, and thanks for the explanation! Do you know if there's a way to make that `ucFirstWord` function work without the exceptions, without making a second function to do just that? Is there a way to make a multi-use function, if that makes sense? – realph Dec 05 '14 at 16:24
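
One possible way to get the multi-use function asked about in the last comment is to make the exceptions list optional. The sketch below is not the answerer's code: `titleCase` is a hypothetical name, and it uses the built-in `Array.prototype.indexOf` and `Array.prototype.map` in place of the hand-written `inArray` loop.

```javascript
// Hypothetical variant of ucFirstWord in which `exceptions` is optional.
function titleCase(str, exceptions) {
  var skip = exceptions || []; // no list given: capitalize every word
  return str.toLowerCase().split(' ').map(function (word) {
    return skip.indexOf(word) !== -1
      ? word // listed words stay lowercase
      : word.charAt(0).toUpperCase() + word.substring(1);
  }).join(' ');
}

console.log(titleCase('sanchez and co'));          // Sanchez And Co
console.log(titleCase('sanchez and co', ['and'])); // Sanchez and Co
console.log(titleCase("ALEXIS sanchez's ball"));   // Alexis Sanchez's Ball
```

Calling it with one argument capitalizes every word; passing a list keeps those words lowercase, so a second function is not needed.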
0
((?:^|\s)[a-z])(?!nd\b|r\b)

You can try this regex. Replace each match with its uppercased form ($1.upper() or the equivalent in your language). It will not match the first letter of "and" or "or".

See demo.

https://regex101.com/r/yR3mM3/59
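
In JavaScript, the replacement could be sketched as follows. The function name `regexTitleCase` is hypothetical, and the global flag and the initial `toLowerCase()` call are additions assumed here so that every word is processed and uppercase input is handled. Note that the lookahead is crude: it also skips other words that end the same way, such as "end".

```javascript
// The pattern from this answer, with the global flag added (an assumption)
// so that every word in the string is processed.
var pattern = /((?:^|\s)[a-z])(?!nd\b|r\b)/g;

// Hypothetical wrapper: title-case a string using the pattern above.
function regexTitleCase(str) {
  // Lowercase first: the pattern only matches lowercase letters at word starts.
  return str.toLowerCase().replace(pattern, function (match) {
    return match.toUpperCase(); // uppercasing the leading whitespace is a no-op
  });
}

console.log(regexTitleCase('sanchez and co')); // Sanchez and Co
console.log(regexTitleCase('ALEXIS Sanchez')); // Alexis Sanchez
console.log(regexTitleCase('end game'));       // end Game (the caveat above)
```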

vks