How can I best check whether an input string would be a valid Java variable name? I'm sure I'm not the first person who wants to do this, but maybe I'm missing the right keyword to find something useful.

The best approach would probably be a regex that checks that the string:

  • starts with a letter

  • otherwise contains only digits and letters

  • may also contain some special characters, like '_' (which ones?)

  • does not contain whitespace
membersound

2 Answers

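public static boolean isValidJavaIdentifier(String s) {
    // An empty string is never an identifier.
    if (s.isEmpty()) {
        return false;
    }
    // The first character must be a letter, a currency symbol such as '$',
    // or connecting punctuation such as '_'.
    if (!Character.isJavaIdentifierStart(s.charAt(0))) {
        return false;
    }
    // Every following character may additionally be a digit
    // (plus a few other Unicode categories accepted by the JDK).
    for (int i = 1; i < s.length(); i++) {
        if (!Character.isJavaIdentifierPart(s.charAt(i))) {
            return false;
        }
    }
    return true;
}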

EDIT: and, as @Joey indicates, you should also filter out keywords and reserved words.
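
Since the question asks for a regex, the same two Character checks can also be written as a pattern: java.util.regex.Pattern exposes Character.isJavaIdentifierStart and Character.isJavaIdentifierPart as the properties \p{javaJavaIdentifierStart} and \p{javaJavaIdentifierPart}. The following is only a sketch under some assumptions: the class name IdentifierRegex is made up for illustration, Set.of needs Java 9+, and the reserved-word set is copied by hand from section 3.9 of the Java SE 7 language specification, so it knows nothing about words reserved in later versions (for example `_`).

```java
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
public class IdentifierRegex {
    // One identifier-start character followed by any number of identifier-part characters.
    private static final Pattern IDENTIFIER =
            Pattern.compile("\\p{javaJavaIdentifierStart}\\p{javaJavaIdentifierPart}*");

    // Keywords from JLS 7, section 3.9, plus the literals true, false and null.
    private static final Set<String> RESERVED = Set.of(
            "abstract", "continue", "for", "new", "switch",
            "assert", "default", "if", "package", "synchronized",
            "boolean", "do", "goto", "private", "this",
            "break", "double", "implements", "protected", "throw",
            "byte", "else", "import", "public", "throws",
            "case", "enum", "instanceof", "return", "transient",
            "catch", "extends", "int", "short", "try",
            "char", "final", "interface", "static", "void",
            "class", "finally", "long", "strictfp", "volatile",
            "const", "float", "native", "super", "while",
            "true", "false", "null");

    public static boolean isValidVariableName(String s) {
        // matches() requires the whole string to fit the pattern, so "" and "my var" fail.
        return IDENTIFIER.matcher(s).matches() && !RESERVED.contains(s);
    }

    public static void main(String[] args) {
        System.out.println(isValidVariableName("my_var1")); // true
        System.out.println(isValidVariableName("2e"));      // false
        System.out.println(isValidVariableName("class"));   // false
    }
}
```

As in the method above, passing null throws a NullPointerException rather than returning false.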

JB Nizet
  • You're missing keywords, `true`, `false` and `null`. They have to be disallowed. But yes, otherwise that's exactly what the spec suggests ;-) – Joey Mar 15 '13 at 16:53
  • Oh yes, that's true. There are plenty of other keywords as well. – JB Nizet Mar 15 '13 at 16:53
  • Not that many: http://docs.oracle.com/javase/specs/jls/se7/html/jls-3.html#jls-3.9 – assylias Mar 15 '13 at 16:54
  • might want to have a null check and an extra `)` on the first if. also a `_` in the string would trip the code up as it doesn't have a second character – eis Mar 15 '13 at 17:04
  • That gets me started in a nice direction. Keyword checks are not so important for my case right now, but they are probably easy to add with the link provided. – membersound Mar 15 '13 at 17:07
  • @eis: I consider it a bug to pass null to such a method, so an NPE is what I want here. I don't understand the second part of your comment: _ is a valid Java identifier. – JB Nizet Mar 15 '13 at 17:11
  • @JBNizet yes, was just thinking that the for loop looks for char at index 1 on the first iteration, which wouldn't exist on `_`. But I guess for loop condition check is done before the first iteration, too. – eis Mar 15 '13 at 17:32

Java 6+

Use

import javax.lang.model.SourceVersion;

boolean isValidVariableName(CharSequence name) {
    return SourceVersion.isIdentifier(name) && !SourceVersion.isKeyword(name);
}

if you need to check whether a string is a valid Java variable name in the latest source version known to the running JDK, or

import javax.lang.model.SourceVersion;

boolean isValidVariableNameInVersion(CharSequence name, SourceVersion version) {
    return SourceVersion.isIdentifier(name) && !SourceVersion.isKeyword(name, version);
}

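if you need to check whether a string is a valid Java variable name in a specific Java version. Note that the two-argument isKeyword(CharSequence, SourceVersion) overload was only added in Java 9, so this variant needs Java 9+.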

For example, the underscore became a keyword in Java 9, so isValidVariableNameInVersion("_", SourceVersion.RELEASE_9) returns false while isValidVariableNameInVersion("_", SourceVersion.RELEASE_8) returns true.
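
The version dependence can be tried out with a small stand-alone program. This is just a sketch that wraps the method above in a class; the class name VersionedNameCheck is made up for illustration, and it assumes a JDK 9 or newer so that SourceVersion.RELEASE_9 and the two-argument isKeyword exist.

```java
import javax.lang.model.SourceVersion;

// Hypothetical demo class wrapping the method from the answer.
public class VersionedNameCheck {
    static boolean isValidVariableNameInVersion(CharSequence name, SourceVersion version) {
        return SourceVersion.isIdentifier(name) && !SourceVersion.isKeyword(name, version);
    }

    public static void main(String[] args) {
        // "_" is an ordinary identifier up to Java 8 and a keyword from Java 9 on.
        System.out.println(isValidVariableNameInVersion("_", SourceVersion.RELEASE_8)); // true
        System.out.println(isValidVariableNameInVersion("_", SourceVersion.RELEASE_9)); // false

        // "enum" became a keyword in Java 5.
        System.out.println(isValidVariableNameInVersion("enum", SourceVersion.RELEASE_4)); // true
        System.out.println(isValidVariableNameInVersion("enum", SourceVersion.RELEASE_5)); // false
    }
}
```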

How it works

SourceVersion.isIdentifier(CharSequence name) checks whether name is a syntactically valid identifier (simple name) or keyword in the latest source version, so on its own it also accepts keywords. SourceVersion.isKeyword(name) returns true for keywords as well as for the literals true, false and null, so !SourceVersion.isKeyword(name) rejects them. As a result, SourceVersion.isIdentifier(name) && !SourceVersion.isKeyword(name) returns true for valid identifiers and only for them.
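
To see why both calls are needed, it may help to print the two results side by side. The snippet below is only an illustration: the class name KeywordVsIdentifier and the sample strings are made up, and nothing beyond the two SourceVersion methods discussed above is used.

```java
import javax.lang.model.SourceVersion;

// Hypothetical demo class; only the two SourceVersion calls come from the answer.
public class KeywordVsIdentifier {
    static boolean isValidVariableName(CharSequence name) {
        return SourceVersion.isIdentifier(name) && !SourceVersion.isKeyword(name);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"count", "int", "null", "my var"}) {
            // Prints the input, then isIdentifier, isKeyword and the combined check.
            System.out.println(s
                    + " -> isIdentifier=" + SourceVersion.isIdentifier(s)
                    + ", isKeyword=" + SourceVersion.isKeyword(s)
                    + ", valid=" + isValidVariableName(s));
        }
    }
}
```

Only "count" passes the combined check: "int" and "null" pass isIdentifier but are caught by isKeyword, and "my var" already fails isIdentifier.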

The same approach is used in the built-in method SourceVersion.isName(CharSequence name, SourceVersion version) that checks whether name is a syntactically valid qualified name, which means that it will return true for strings like "apple.color":

public static boolean isName(CharSequence name, SourceVersion version) {
    String id = name.toString();

    for(String s : id.split("\\.", -1)) {
        if (!isIdentifier(s) || isKeyword(s, version))
            return false;
    }
    return true;
}
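
The difference between a qualified name and a simple variable name can be sketched as follows. The class and method names here are made up for illustration; the code assumes Java 9+ for the two-argument isName and uses SourceVersion.latest() as the version.

```java
import javax.lang.model.SourceVersion;

// Hypothetical demo class contrasting isName with the variable-name check.
public class QualifiedNameDemo {
    // Accepts dotted names whose every segment is a non-keyword identifier.
    static boolean isQualifiedName(CharSequence name) {
        return SourceVersion.isName(name, SourceVersion.latest());
    }

    // Accepts a single non-keyword identifier only.
    static boolean isValidVariableName(CharSequence name) {
        return SourceVersion.isIdentifier(name) && !SourceVersion.isKeyword(name);
    }

    public static void main(String[] args) {
        System.out.println(isQualifiedName("apple.color"));     // true
        System.out.println(isValidVariableName("apple.color")); // false
        System.out.println(isQualifiedName("apple.class"));     // false
        System.out.println(isQualifiedName("apple..color"));    // false
    }
}
```

So isName is the wrong tool for validating a single variable name, because it lets dots through.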

Test

import org.junit.jupiter.api.Test;

import javax.lang.model.SourceVersion;

import static org.assertj.core.api.Assertions.assertThat;

public class ValidVariableNameTest {
    boolean isValidVariableName(CharSequence name) {
        return isValidVariableNameInVersion(name, SourceVersion.RELEASE_8);
    }

    boolean isValidVariableNameInVersion(CharSequence name, SourceVersion version) {
        return SourceVersion.isIdentifier(name) && !SourceVersion.isKeyword(name, version);
    }

    @Test
    void variableNamesCanBeginWithLetters() {
        assertThat(isValidVariableName("test")).isTrue();
        assertThat(isValidVariableName("e2")).isTrue();
        assertThat(isValidVariableName("w")).isTrue();
        assertThat(isValidVariableName("привет")).isTrue();
    }

    @Test
    void variableNamesCanBeginWithDollarSign() {
        assertThat(isValidVariableName("$test")).isTrue();
        assertThat(isValidVariableName("$e2")).isTrue();
        assertThat(isValidVariableName("$w")).isTrue();
        assertThat(isValidVariableName("$привет")).isTrue();
        assertThat(isValidVariableName("$")).isTrue();
        assertThat(isValidVariableName("$55")).isTrue();
    }

    @Test
    void variableNamesCanBeginWithUnderscore() {
        assertThat(isValidVariableName("_test")).isTrue();
        assertThat(isValidVariableName("_e2")).isTrue();
        assertThat(isValidVariableName("_w")).isTrue();
        assertThat(isValidVariableName("_привет")).isTrue();
        assertThat(isValidVariableName("_55")).isTrue();
    }

    @Test
    void variableNamesCannotContainCharactersThatAreNotLettersOrDigits() {
        assertThat(isValidVariableName("apple.color")).isFalse();
        assertThat(isValidVariableName("my var")).isFalse();
        assertThat(isValidVariableName(" ")).isFalse();
        assertThat(isValidVariableName("apple%color")).isFalse();
        assertThat(isValidVariableName("apple,color")).isFalse();
        assertThat(isValidVariableName(",applecolor")).isFalse();
    }

    @Test
    void variableNamesCannotStartWithDigit() {
        assertThat(isValidVariableName("2e")).isFalse();
        assertThat(isValidVariableName("5")).isFalse();
        assertThat(isValidVariableName("123test")).isFalse();
    }


    @Test
    void differentSourceVersionsAreHandledCorrectly() {
        assertThat(isValidVariableNameInVersion("_", SourceVersion.RELEASE_9)).isFalse();
        assertThat(isValidVariableNameInVersion("_", SourceVersion.RELEASE_8)).isTrue();

        assertThat(isValidVariableNameInVersion("enum", SourceVersion.RELEASE_9)).isFalse();
        assertThat(isValidVariableNameInVersion("enum", SourceVersion.RELEASE_4)).isTrue();
    }

    @Test
    void keywordsCannotBeUsedAsVariableNames() {
        assertThat(isValidVariableName("strictfp")).isFalse();
        assertThat(isValidVariableName("assert")).isFalse();
        assertThat(isValidVariableName("enum")).isFalse();

        // Modifiers
        assertThat(isValidVariableName("public")).isFalse();
        assertThat(isValidVariableName("protected")).isFalse();
        assertThat(isValidVariableName("private")).isFalse();

        assertThat(isValidVariableName("abstract")).isFalse();
        assertThat(isValidVariableName("static")).isFalse();
        assertThat(isValidVariableName("final")).isFalse();

        assertThat(isValidVariableName("transient")).isFalse();
        assertThat(isValidVariableName("volatile")).isFalse();
        assertThat(isValidVariableName("synchronized")).isFalse();

        assertThat(isValidVariableName("native")).isFalse();

        // Declarations
        assertThat(isValidVariableName("class")).isFalse();
        assertThat(isValidVariableName("interface")).isFalse();
        assertThat(isValidVariableName("extends")).isFalse();
        assertThat(isValidVariableName("package")).isFalse();
        assertThat(isValidVariableName("throws")).isFalse();
        assertThat(isValidVariableName("implements")).isFalse();

        // Primitive types and void
        assertThat(isValidVariableName("boolean")).isFalse();
        assertThat(isValidVariableName("byte")).isFalse();
        assertThat(isValidVariableName("char")).isFalse();
        assertThat(isValidVariableName("short")).isFalse();
        assertThat(isValidVariableName("int")).isFalse();
        assertThat(isValidVariableName("long")).isFalse();
        assertThat(isValidVariableName("float")).isFalse();
        assertThat(isValidVariableName("double")).isFalse();
        assertThat(isValidVariableName("void")).isFalse();

        // Control flow
        assertThat(isValidVariableName("if")).isFalse();
        assertThat(isValidVariableName("else")).isFalse();

        assertThat(isValidVariableName("try")).isFalse();
        assertThat(isValidVariableName("catch")).isFalse();
        assertThat(isValidVariableName("finally")).isFalse();

        assertThat(isValidVariableName("do")).isFalse();
        assertThat(isValidVariableName("while")).isFalse();
        assertThat(isValidVariableName("for")).isFalse();
        assertThat(isValidVariableName("continue")).isFalse();

        assertThat(isValidVariableName("switch")).isFalse();
        assertThat(isValidVariableName("case")).isFalse();
        assertThat(isValidVariableName("default")).isFalse();
        assertThat(isValidVariableName("break")).isFalse();
        assertThat(isValidVariableName("throw")).isFalse();

        assertThat(isValidVariableName("return")).isFalse();

        // Other keywords
        assertThat(isValidVariableName("this")).isFalse();
        assertThat(isValidVariableName("new")).isFalse();
        assertThat(isValidVariableName("super")).isFalse();
        assertThat(isValidVariableName("import")).isFalse();
        assertThat(isValidVariableName("instanceof")).isFalse();

        // Reserved keywords
        assertThat(isValidVariableName("goto")).isFalse();
        assertThat(isValidVariableName("const")).isFalse();
    }

    @Test
    void literalsCannotBeUsedAsVariableNames() {
        assertThat(isValidVariableName("null")).isFalse();
        assertThat(isValidVariableName("true")).isFalse();
        assertThat(isValidVariableName("false")).isFalse();
    }
}
Denis Stafichuk
  • Wouldn't the SourceVersion.isIdentifier() method be more adequate here? – Sebastian Mar 08 '21 at 16:54
  • Agreed, it looks like "SourceVersion.isIdentifier()" behaves much better for strings such as "my.var" – Christophe Moine Nov 18 '21 at 08:11
  • @ChristopheMoine @Sebastian thanks a lot for pointing out a bug in my answer. I've fixed it and significantly expanded the answer. Note, that `SourceVersion.isIdentifier()` alone cannot be used to check whether a string is a valid Java variable because it returns `true` for keywords like `null`, `int`, `true`, `false`, that cannot be used as Java variable names – Denis Stafichuk Feb 13 '22 at 13:05