14

How do I humanize a string based on the following criteria?

  • Deletes leading underscores, if any.
  • Replaces underscores with spaces, if any.
  • Capitalizes the first word.

For example:

this is a test -> This is a test
foo Bar Baz    -> Foo bar baz
foo_bar        -> Foo bar
foo_bar_baz    -> Foo bar baz
foo-bar        -> Foo-bar
fooBarBaz      -> FooBarBaz
Christian Fazzini
  • I guess if it's CamelCase, it should be left alone? The same principle with words with dashes. But the first char should always be capitalized. I made an edit. – Christian Fazzini Feb 05 '15 at 04:20

5 Answers

14

The best approach is indeed to use a few regular expressions:

^[\s_]+|[\s_]+$ catches 1 or more white-space characters or underscores either at the very beginning (^) or at the very end ($) of the string. Note that this also catches new-line characters. Replace them with an empty string.

[_\s]+ again catches 1 or more white-space characters or underscores. Since the ones at the beginning and end of the string are already gone, replace each remaining run with 1 space.

^[a-z] Catch a lowercase letter at the beginning of the string. Replace with the uppercase version of the match (you need a callback function for that).

Combined:

function humanize(str) {
  return str
      .replace(/^[\s_]+|[\s_]+$/g, '')
      .replace(/[_\s]+/g, ' ')
      .replace(/^[a-z]/, function(m) { return m.toUpperCase(); });
}

document.getElementById('out').value = [
  '    this is a test',
  'foo Bar Baz',
  'foo_bar',
  'foo-bar',
  'fooBarBaz',
  '_fooBarBaz____',
  '_alpha',
  'hello_ _world,   how    are________you?  '
].map(humanize).join('\n');
textarea { width:100%; }
<textarea id="out" rows="10"></textarea>
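The snippet above writes its results into a textarea, so it needs a browser. As a minimal sketch for running the same check elsewhere (for example in Node), the function can be paired with a table of inputs and the outputs it produces. The names cases and failures are illustrative only. Note that inner capitals are kept as they are, so 'foo Bar Baz' comes out as 'Foo Bar Baz', whereas the question's table shows 'Foo bar baz'.

```javascript
// Same three replacements as humanize above, repeated so the sketch runs on its own.
function humanize(str) {
  return str
      .replace(/^[\s_]+|[\s_]+$/g, '')   // strip leading/trailing whitespace and underscores
      .replace(/[_\s]+/g, ' ')           // collapse inner runs into one space
      .replace(/^[a-z]/, function(m) { return m.toUpperCase(); });
}

// [input, output of humanize] pairs (illustrative table, not part of the answer).
var cases = [
  ['    this is a test', 'This is a test'],
  ['foo Bar Baz', 'Foo Bar Baz'],        // inner capitals are left untouched
  ['foo_bar', 'Foo bar'],
  ['foo-bar', 'Foo-bar'],
  ['fooBarBaz', 'FooBarBaz'],
  ['_fooBarBaz____', 'FooBarBaz'],
  ['_alpha', 'Alpha'],
  ['hello_ _world,   how    are________you?  ', 'Hello world, how are you?']
];

// Collect every pair whose actual output differs from the listed one.
var failures = cases.filter(function(c) { return humanize(c[0]) !== c[1]; });
console.log(failures.length === 0 ? 'all cases pass' : failures);
```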
asontu
  • This is getting closer to the one liner... :) – istos Feb 05 '15 at 09:25
  • @istos One-liner Shmone-liner **:P** I put it in my bio a long time ago that regular expressions are a tool, not a solution. It's possible to make one regex that catches everything, then in the callback examine the match and decide what treatment it needs (Delete? Replace with space? Capitalize?). But that code would be harder to read and maintain. If your dataset is so large that the performance of a single regex-replace call is significantly better, then you shouldn't be handling that data with JavaScript on the client to begin with **;)** – asontu Feb 05 '15 at 09:34
  • Good points, especially this one: "One-liner Shmone-liner" :) – istos Feb 05 '15 at 09:46
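The single-regex idea from the comments can be sketched roughly as follows. This is only one possible reading of it, not asontu's code: the name humanizeOnePass and the exact pattern are assumptions made for illustration. The callback inspects each match and decides whether to capitalize, delete, or replace it with a space.

```javascript
// Hypothetical one-pass variant: a single regex, with the callback choosing the treatment.
function humanizeOnePass(str) {
  // Alternative 1: leading whitespace/underscores plus an optional first lowercase letter.
  // Alternative 2: trailing whitespace/underscores.
  // Alternative 3: any other run of whitespace/underscores.
  var pattern = /^[\s_]*([a-z]?)|[\s_]+$|[\s_]+/g;
  return str.replace(pattern, function(m, first, offset, whole) {
    if (first !== undefined) {
      // Start of string: drop the leading run, capitalize the letter (if any).
      return first.toUpperCase();
    }
    if (offset + m.length === whole.length) {
      // Run that reaches the end of the string: delete it.
      return '';
    }
    // Run in the middle of the string: collapse it to one space.
    return ' ';
  });
}

console.log(humanizeOnePass('_fooBarBaz____'));
console.log(humanizeOnePass('hello_ _world,   how    are________you?  '));
```

As the comment predicts, the branching inside the callback is harder to follow than three separate replace calls, which is the trade-off being pointed out.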
8

This covers all your cases:

var tests = [
  'this is a test',
  'foo Bar Baz',
  ...
]

var res = tests.map(function(test) {
  return test
    .replace(/_/g, ' ')
    .trim()
    .replace(/\b[A-Z][a-z]+\b/g, function(word) {
      return word.toLowerCase()
    })
    .replace(/^[a-z]/g, function(first) {
      return first.toUpperCase()
    })
})

console.log(res)
/*
[ 'This is a test',
  'Foo bar baz',
  'Foo bar',
  'Foo-bar',
  'FooBarBaz' ]
*/
elclanrs
3

Lodash has _.startCase, which is good for humanising object keys: it turns underscores, dashes, and camel-case boundaries into spaces.

In your case you want to capitalise but maintain camel case. This question was asked a while ago. My current preference would be to create a class that handles the mutations. It's easier to test and maintain. So if in the future you need to support transformations like "1Item" into "First item", you can write one function with a single responsibility.

The approach below is more computationally expensive but more maintainable. There is one clear method, toHumanString, which can easily be understood and modified.

export class HumanizableString extends String {
  // Upper-case the first character and leave the rest as-is.
  capitalizeFirstLetter() {
    const transformed = this.charAt(0).toUpperCase() + this.slice(1);
    return new HumanizableString(transformed);
  }

  // Lower-case everything after the first character.
  lowerCaseExceptFirst() {
    const transformed = this.charAt(0) + this.slice(1).toLowerCase();
    return new HumanizableString(transformed);
  }

  // Insert a space before every capital letter: "fooBar" -> "foo Bar".
  camelCaseToSpaces() {
    const camelMatch = /([A-Z])/g;
    return new HumanizableString(this.replace(camelMatch, " $1"));
  }

  underscoresToSpaces() {
    const underscoreMatch = /_/g;
    return new HumanizableString(this.replace(underscoreMatch, " "));
  }

  // Chain the single-purpose steps and return a plain string.
  toHumanString() {
    return this.camelCaseToSpaces()
      .underscoresToSpaces()
      .capitalizeFirstLetter()
      .lowerCaseExceptFirst()
      .toString();
  }
}

At the very least you should name your regular expressions to make them more readable.

export const humanise = (value) => {
  const camelMatch = /([A-Z])/g;
  const underscoreMatch = /_/g;

  const camelCaseToSpaces = value.replace(camelMatch, " $1");
  const underscoresToSpaces = camelCaseToSpaces.replace(underscoreMatch, " ");
  const caseCorrected =
    underscoresToSpaces.charAt(0).toUpperCase() +
    underscoresToSpaces.slice(1).toLowerCase();

  return caseCorrected;
};
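One thing to watch with the function above: an input that already starts with a capital, such as "FooBar", gets a space inserted in front of it, so the result begins with a space and is never capitalised. A minimal sketch of a variant that also trims and collapses whitespace might look like this. The name humaniseTrimmed and the separatorMatch pattern are assumptions added for illustration, not part of the answer's code.

```javascript
// Hypothetical variant of humanise: same steps, plus whitespace cleanup.
const humaniseTrimmed = (value) => {
  const camelMatch = /([A-Z])/g;
  const separatorMatch = /[_\s]+/g;

  const spaced = value
    .replace(camelMatch, " $1")     // "fooBar" -> "foo Bar", "FooBar" -> " Foo Bar"
    .replace(separatorMatch, " ")   // underscores and whitespace runs -> one space
    .trim();                        // drop the space added before a leading capital

  // Capitalise the first character and lower-case the rest.
  return spaced.charAt(0).toUpperCase() + spaced.slice(1).toLowerCase();
};

console.log(humaniseTrimmed("FooBar"));        // "Foo bar"
console.log(humaniseTrimmed("foo_Bar"));       // "Foo bar"
console.log(humaniseTrimmed("_foo_bar_baz"));  // "Foo bar baz"
```

Like the original, this still splits camel case into separate words, so it does not keep "fooBarBaz" intact the way the question asks.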
Lex
2

A regex expert could probably do this in a one-liner, but personally I would do something like this:

function humanize(str) {
  return str.trim().split(/\s+/).map(function(str) {
    return str.replace(/_/g, ' ').replace(/\s+/g, ' ').trim();
  }).join(' ').toLowerCase().replace(/^./, function(m) {
    return m.toUpperCase();
  });
}

Tests:

[
  '    this is a test',
  'foo Bar Baz',
  'foo_bar',
  'foo-bar',
  'fooBarBaz',
  '_fooBarBaz____',
  '_alpha',
  'hello_ _world,   how    are________you?  '
].map(humanize);

/* Result:
   [
     "This is a test", 
     "Foo bar baz", 
     "Foo bar", 
     "Foo-bar", 
     "Foobarbaz", 
     "Foobarbaz", 
     "Alpha", 
     "Hello world, how are you?"
   ]
 */
istos
1

Another option, which also replaces hyphens with spaces:

const humanize = (s) => {
  if (typeof s !== 'string') return s
  return s
      .replace(/^[\s_]+|[\s_]+$/g, '')
      .replace(/[_\s]+/g, ' ')
      .replace(/\-/g, ' ')
      .replace(/^[a-z]/, function(m) { return m.toUpperCase(); });
}
Jeremy Lynch