9

What method of capitalizing is better?

mine:

char[] charArray = string.toCharArray();
charArray[0] = Character.toUpperCase(charArray[0]);
return new String(charArray);

or

commons lang - StringUtils.capitalize:

return new StringBuffer(strLen)
            .append(Character.toTitleCase(str.charAt(0)))
            .append(str.substring(1))
            .toString();

I think mine is better, but I would rather ask.

IAdapter
  • 9
    Counter-question: is String capitalization really the bottleneck in your application? – Joachim Sauer Oct 08 '09 at 08:07
  • I understand that it doesn't matter that much, but if I would write any library I would try to make it perform as good as possible. – IAdapter Oct 08 '09 at 08:19
  • 6
    Funny. If *I* would write a library I would try to make it *work* as good as possible. – Bombe Oct 08 '09 at 08:25
  • 2
    http://www.codinghorror.com/blog/archives/001218.html Profile, then optimize. If you are writing a library, make it easy to use, hard to abuse, then worry about the speed. As long as you don't use silly algorithms, it will run pretty well. – Calyth Oct 08 '09 at 23:13
  • 1
    In the words of Kent Beck - "make it work, make it right, make it fast". Developers usually guess their bottlenecks wrong anyway. – studgeek Jan 30 '13 at 18:36

11 Answers

8

I guess your version will be a little bit more performant, since it does not allocate as many temporary String objects.

I'd go for this (assuming the string is not empty):

StringBuilder strBuilder = new StringBuilder(string);
strBuilder.setCharAt(0, Character.toUpperCase(strBuilder.charAt(0)));
return strBuilder.toString();

However, note that they are not equivalent in that one uses toUpperCase() and the other uses toTitleCase().

From a forum post:

Titlecase <> uppercase
Unicode defines three kinds of case mapping: lowercase, uppercase, and titlecase. The difference between uppercasing and titlecasing a character or character sequence can be seen in compound characters (that is, a single character that represents a compound of two characters).

For example, in Unicode, character U+01F3 is LATIN SMALL LETTER DZ. (Let us write this compound character using ASCII as "dz".) This character
uppercases to character U+01F1, LATIN CAPITAL LETTER DZ. (Which is
basically "DZ".) But it titlecases to character U+01F2, LATIN CAPITAL
LETTER D WITH SMALL LETTER Z. (Which we can write "Dz".)

character uppercase titlecase
--------- --------- ---------
dz        DZ        Dz
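
The mapping in the table can be checked directly against the JDK. The following is a minimal sketch; the class name `CaseMappingDemo` and the `hex` helper are made up for illustration, and the code points are printed in `U+XXXX` form so the result does not depend on console fonts:

```java
// Illustrative only: CaseMappingDemo and hex() are not part of any library.
final class CaseMappingDemo {

    /** Formats a code point as U+XXXX. */
    static String hex(int codePoint) {
        return String.format("U+%04X", codePoint);
    }

    public static void main(String[] args) {
        char dz = '\u01F3'; // LATIN SMALL LETTER DZ

        // Uppercase and titlecase differ for this compound character.
        System.out.println("upper: " + hex(Character.toUpperCase(dz)));
        System.out.println("title: " + hex(Character.toTitleCase(dz)));

        // For a plain ASCII letter the two mappings agree.
        System.out.println("same for 'a': "
                + (Character.toUpperCase('a') == Character.toTitleCase('a')));
    }
}
```

For ordinary Latin letters the two calls give the same result, so the choice only matters for the handful of characters that have a distinct titlecase form.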
Lucero
  • Could you please provide more detail on the difference between toUpperCase() and toTitleCase()? – Tim Büthe Oct 08 '09 at 08:21
  • 1
    The Apache code was probably written for 1.4 or before. In Sun's implementation back then the Apache code would not create any temporary `char[]` arrays (both `String.substring` and (initially) `StringBuffer.toString` share backing arrays). So the Apache code would have been, before 2004, faster for large strings. – Tom Hawtin - tackline Oct 08 '09 at 11:45
3

If I were to write a library, I'd try to make sure I got my Unicode right before worrying about performance. Off the top of my head:

int len = str.length();
if (len == 0) {
    return str;
}
int head = Character.toUpperCase(str.codePointAt(0));
String tail = str.substring(str.offsetByCodePoints(0, 1));
return new String(new int[] { head }, 0, 1).concat(tail);

(I'd probably also look up the difference between title and upper case before I committed.)
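
To illustrate why the code point matters, here is a hedged sketch of a code-point-aware variant. It is not the Commons Lang implementation; the class name `CodePointCapitalize` is hypothetical, and it uses `Character.toTitleCase(int)` rather than `toUpperCase`, which is an assumption about which mapping is wanted:

```java
// Illustrative only: a code-point-aware capitalize, not a library method.
final class CodePointCapitalize {

    /** Titlecases the first code point (not merely the first char) of str. */
    static String capitalize(String str) {
        if (str == null || str.isEmpty()) {
            return str;
        }
        int first = str.codePointAt(0);
        int titled = Character.toTitleCase(first);
        if (titled == first) {
            return str; // nothing to change, so reuse the original instance
        }
        // A supplementary code point occupies two chars, so skip 1 or 2.
        int tailStart = Character.charCount(first);
        return new StringBuilder(str.length())
                .appendCodePoint(titled)
                .append(str, tailStart, str.length())
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(capitalize("hello"));

        // U+10428 (DESERET SMALL LETTER LONG I) lies outside the BMP, so a
        // charAt(0)-based approach would only see a surrogate and change nothing.
        String deseret = new String(Character.toChars(0x10428)) + "x";
        System.out.println(Integer.toHexString(capitalize(deseret).codePointAt(0)));
    }
}
```

The early return when the first code point is already capitalized also avoids allocating anything for strings that need no change.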

Tom Hawtin - tackline
2

Performance is equal.

Your code copies the char[] twice: once in string.toCharArray() and once in new String(charArray).

The Apache code also copies it twice: in buffer.append(str.substring(1)) and in buffer.toString(). The Apache code has an extra String instance holding the char[1, length] content, but that content is not copied when the String instance is created.

Thomas Jung
1

StringBuffer is declared to be thread safe, so it might be less efficient to use (but one shouldn't bet on that before actually running some practical tests).

Grzegorz Oledzki
1

StringBuilder (from Java 5 onwards) is faster than StringBuffer if you don't need it to be thread safe, but as others have said, you need to test whether it is better than your solution in your case.

Chris R
0

Have you timed both?

Honestly, they're equivalent, so the one that performs better for you is the better one :)

warren
  • 2
    Beware that benchmarking language features is very difficult in Java, see this very good article by Brian Goetz: http://www.ibm.com/developerworks/java/library/j-jtp12214/index.html?S_TACT=105AGX02&S_CMP=EDU – Jesper Oct 08 '09 at 08:07
  • 2
    Also note that the results may vary depending on the string length. – Lucero Oct 08 '09 at 08:12
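
For anyone who wants to time the two variants anyway, a rough harness might look like the sketch below. It assumes Java 8 or later, the class and method names are invented for illustration, and `viaBuilder` swaps in `StringBuilder` for the `StringBuffer` in the Apache snippet. As the comments above warn, numbers from a hand-rolled loop like this are only indicative; JIT warm-up, garbage collection, and string length all affect them, and a dedicated benchmarking tool is the safer choice.

```java
import java.util.function.UnaryOperator;

// Illustrative only: a naive timing loop, not a rigorous benchmark.
final class CapitalizeTiming {

    /** The char[] approach from the question. */
    static String viaCharArray(String s) {
        char[] chars = s.toCharArray();
        chars[0] = Character.toUpperCase(chars[0]);
        return new String(chars);
    }

    /** The builder approach, using StringBuilder instead of StringBuffer. */
    static String viaBuilder(String s) {
        return new StringBuilder(s.length())
                .append(Character.toTitleCase(s.charAt(0)))
                .append(s, 1, s.length())
                .toString();
    }

    /** Runs fn on input the given number of times; returns elapsed nanoseconds. */
    static long time(UnaryOperator<String> fn, String input, int iterations) {
        int sink = 0; // consume results so the calls are not trivially dead code
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += fn.apply(input).length();
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) {
            System.out.println(sink); // never true here; keeps sink observable
        }
        return elapsed;
    }

    public static void main(String[] args) {
        String input = "capitalize me";
        // Several rounds: the early ones mostly measure warm-up.
        for (int round = 0; round < 5; round++) {
            System.out.println("char[]:  " + time(CapitalizeTiming::viaCharArray, input, 1_000_000) + " ns");
            System.out.println("builder: " + time(CapitalizeTiming::viaBuilder, input, 1_000_000) + " ns");
        }
    }
}
```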
0

Not sure what the difference between toUpperCase and toTitleCase is, but it looks as if your solution requires one less instantiation of the String class, while the commons lang implementation requires two (substring and toString create new Strings I assume, since String is immutable).

Whether that's "better" (I guess you mean faster) I don't know. Why don't you profile both solutions?

mxk
0

look at this question titlecase-conversion . apache FTW.

Nico
0
/**
 * Capitalizes the first letter of a string.
 *
 * @param s the string to capitalize; may be null
 * @return s with its first character upper-cased, or "" if s is null or empty
 */
public static String capitalizeFirst(String s) {
    if (s == null || s.length() == 0) {
        return "";
    }
    char first = s.charAt(0);
    if (Character.isUpperCase(first)) {
        return s; // already capitalized, so no new String is needed
    }
    return Character.toUpperCase(first) + s.substring(1);
}
ahmed_khan_89
0

If you only capitalize a limited set of words, you had better cache the results.

@Test
public void testCase()
{
    String all = "At its base, a shell is simply a macro processor that executes commands. The term macro processor means functionality where text and symbols are expanded to create larger expressions.\n" +
            "\n" +
            "A Unix shell is both a command interpreter and a programming language. As a command interpreter, the shell provides the user interface to the rich set of GNU utilities. The programming language features allow these utilities to be combined. Files containing commands can be created, and become commands themselves. These new commands have the same status as system commands in directories such as /bin, allowing users or groups to establish custom environments to automate their common tasks.\n" +
            "\n" +
            "Shells may be used interactively or non-interactively. In interactive mode, they accept input typed from the keyboard. When executing non-interactively, shells execute commands read from a file.\n" +
            "\n" +
            "A shell allows execution of GNU commands, both synchronously and asynchronously. The shell waits for synchronous commands to complete before accepting more input; asynchronous commands continue to execute in parallel with the shell while it reads and executes additional commands. The redirection constructs permit fine-grained control of the input and output of those commands. Moreover, the shell allows control over the contents of commands’ environments.\n" +
            "\n" +
            "Shells also provide a small set of built-in commands (builtins) implementing functionality impossible or inconvenient to obtain via separate utilities. For example, cd, break, continue, and exec cannot be implemented outside of the shell because they directly manipulate the shell itself. The history, getopts, kill, or pwd builtins, among others, could be implemented in separate utilities, but they are more convenient to use as builtin commands. All of the shell builtins are described in subsequent sections.\n" +
            "\n" +
            "While executing commands is essential, most of the power (and complexity) of shells is due to their embedded programming languages. Like any high-level language, the shell provides variables, flow control constructs, quoting, and functions.\n" +
            "\n" +
            "Shells offer features geared specifically for interactive use rather than to augment the programming language. These interactive features include job control, command line editing, command history and aliases. Each of these features is described in this manual.";
    String[] split = all.split("[\\W]");

    // 10000000
    // upper Used 606
    // hash Used 114

    // 100000000
    // upper Used 5765
    // hash Used 1101

    HashMap<String, String> cache = Maps.newHashMap();

    long start = System.currentTimeMillis();
    for (int i = 0; i < 100000000; i++)
    {

        String upper = split[i % split.length].toUpperCase();

//            String s = split[i % split.length];
//            String upper = cache.get(s);
//            if (upper == null)
//            {
//                cache.put(s, upper = s.toUpperCase());
// 
//            }
    }
    System.out.println("Used " + (System.currentTimeMillis() - start));
}

The text is picked from here.

Currently, I need to upper-case table and column names many, many times, but the set of names is limited. Using a HashMap as a cache works better in that case.
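
The caching idea can be sketched as a small wrapper around a map. This is only an illustration under a few assumptions: Java 8 or later for `computeIfAbsent`, a hypothetical class name `UpperCaseCache`, single-threaded use (a plain `HashMap` is not thread safe), and `Locale.ROOT` instead of the default locale used in the test above, so that identifiers upper-case the same way on every machine.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Illustrative only: caches upper-cased names for a small, bounded set of inputs.
final class UpperCaseCache {

    private final Map<String, String> cache = new HashMap<>();

    /** Returns the upper-cased name, computing it only on the first request. */
    String upper(String name) {
        return cache.computeIfAbsent(name, n -> n.toUpperCase(Locale.ROOT));
    }

    /** Number of distinct names cached so far. */
    int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        UpperCaseCache names = new UpperCaseCache();
        System.out.println(names.upper("user_id"));
        System.out.println(names.upper("user_id")); // served from the cache
        System.out.println(names.size());
    }
}
```

This only pays off when the set of inputs really is bounded; with arbitrary input the map would grow without limit.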

:-)

wener
-1

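Use this method to capitalize a string. It works without any bugs. Note that it capitalizes the first letter of every space-separated word, not only the first letter of the string.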

public String capitalizeString(String value)
{
    // Build the result in a StringBuilder instead of concatenating Strings in the loop.
    StringBuilder capitalized = new StringBuilder(value.length());
    for (int i = 0; i < value.length(); i++)
    {
        char ch = value.charAt(i);
        // Upper-case the first character and any character that follows a space.
        if (i == 0 || value.charAt(i - 1) == ' ')
            ch = Character.toUpperCase(ch);
        capitalized.append(ch);
    }
    return capitalized.toString();
}