Currently, I can write the following in VS2017:

var какаяТоНепонятнаяПеременная = "some variable value here";

and VS2017 compiles it successfully. I want to allow only letters from the English alphabet in variable names.

marsze
Andrey Ravkov
  • Define "English". What if the variable name is `ghrborpf`? Is that "English"? – David Jan 29 '19 at 20:06
  • @David, yes. I want to allow only characters from the English alphabet. This question is only about characters; var ghrborpf123 is fine. – Andrey Ravkov Jan 29 '19 at 20:31
  • In theory, `roslyn` could do this, but I don't know any details. – Eris Jan 29 '19 at 21:16
  • They are valid unicode characters, so I don't think you can prevent the compiler from using them. – Spinnaker Jan 29 '19 at 21:18
  • You'll probably want to allow _, 0-9, and maybe $, too. And, if you use any kind of code generation or third-party libraries, allow whatever they use, at least where you might create new variables with them. One area of code generation is application settings, but you can probably keep control of that. Maybe a spell-checker approach would work, but you'd have to allow approving many jargon words and acronyms. – Tom Blodget Jan 30 '19 at 10:54

1 Answer

I do not know how to quickly inject custom code into the compiler process to force a build failure, although that is theoretically feasible. What I can suggest is a workaround: unit tests based on Roslyn.

The starting point is to install the Microsoft.Build, Microsoft.CodeAnalysis.Analyzers, and Microsoft.CodeAnalysis.Workspaces.MSBuild NuGet packages. The idea is to load a solution, pick the project you want to scan (using the MSBuildWorkspace API), and iterate through all of its Documents (files).

You asked about validating variable names, so you need to detect IdentifierNameSyntax items in the SyntaxTree. (IdentifierNameSyntax nodes are references to a name, so a variable that is declared but never used would need VariableDeclaratorSyntax as well.) That is not the only thing you can detect, though: MethodDeclarationSyntax, ClassDeclarationSyntax, etc. are detectable too. The sample code is below:

    // Assumes NUnit plus System.Linq, System.Threading.Tasks and
    // Microsoft.CodeAnalysis.CSharp.Syntax are imported in the test class.
    [Test]
    public async Task Verify_ProjectDoesNotHaveNonASCIICharacters()
    {
        // "workspace" is the MSBuildWorkspace with the solution already opened
        var project = workspace.CurrentSolution.Projects.Single(p => p.Name == "csproj_name");

        foreach (var document in project.Documents)
        {
            // Only the syntax tree is inspected, so the semantic model is not needed
            var root = await document.GetSyntaxRootAsync();

            foreach (var item in root.DescendantNodes())
            {
                switch (item)
                {
                    // you may catch other Syntax types for methods, class names for example
                    case IdentifierNameSyntax identifierName:
                        Assert.IsFalse(
                            ContainsUnicodeCharacter(identifierName.Identifier.Text),
                            $"Variable {identifierName.Identifier.Text} in {document.Name} contains non ASCII characters");
                        break;
                }
            }
        }
    }

The ASCII character check can be improved, but I used an existing snippet for the sake of time:

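    private bool ContainsUnicodeCharacter(string input)
    {
        // 255 is the upper bound of ISO 8859-1 (Latin-1), not of ASCII;
        // lower it to 127 to reject everything outside ASCII.
        const int MaxAnsiCode = 255;
        return input.Any(c => c > MaxAnsiCode);
    }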
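
If the goal is strictly "letters from the English alphabet" plus digits and underscore, as discussed in the comments on the question, a tighter check could look like the sketch below. The class and method names (`IdentifierRules`, `IsEnglishIdentifier`) are hypothetical and not part of Roslyn or the code above; the allowed character set is an assumption you may want to widen.

```csharp
using System.Linq;

public static class IdentifierRules
{
    // Hypothetical helper: true when the name is non-empty and every
    // character is an ASCII letter, an ASCII digit, or an underscore.
    public static bool IsEnglishIdentifier(string name)
    {
        return name.Length > 0 && name.All(IsAllowedChar);
    }

    // Explicit range checks are used instead of char.IsLetter, because
    // char.IsLetter also accepts non-English letters such as Cyrillic.
    private static bool IsAllowedChar(char c)
    {
        return (c >= 'a' && c <= 'z')
            || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9')
            || c == '_';
    }
}
```

When calling it from a Roslyn-based test, passing `Identifier.ValueText` instead of `Identifier.Text` keeps verbatim identifiers such as `@class` from being rejected because of the leading `@`.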

Some sample code to set up MSBuildWorkspace:

    var workspace = MSBuildWorkspace.Create();
    await workspace.OpenSolutionAsync("...your_path/solution.sln");
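
If loading the solution through MSBuildWorkspace is not an option, the same idea can be sketched on raw source text with only the Microsoft.CodeAnalysis.CSharp package. This is a minimal sketch under assumptions, not a drop-in replacement: `IdentifierScanner` and `FindNonAsciiIdentifiers` are hypothetical names, and the caller is assumed to read the .cs files itself. It scans identifier tokens rather than IdentifierNameSyntax nodes, so declarations are covered as well as usages.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

public static class IdentifierScanner
{
    // Hypothetical helper: returns the distinct identifiers in the given
    // C# source text that contain a UTF-16 code unit above 127 (non-ASCII).
    public static IReadOnlyList<string> FindNonAsciiIdentifiers(string sourceText)
    {
        // Parsing alone is enough; no compilation or project is required.
        SyntaxNode root = CSharpSyntaxTree.ParseText(sourceText).GetRoot();

        return root.DescendantTokens()
            // Identifier tokens cover variables, methods, classes, parameters, etc.
            .Where(token => token.IsKind(SyntaxKind.IdentifierToken))
            // ValueText omits the '@' prefix of verbatim identifiers.
            .Select(token => token.ValueText)
            .Where(name => name.Any(c => c > 127))
            .Distinct()
            .ToList();
    }
}
```

A test could read each source file with `File.ReadAllText`, pass the text to `FindNonAsciiIdentifiers`, and assert that the returned list is empty. The same token loop could in principle be hosted in a Roslyn `DiagnosticAnalyzer` to report violations at build time, but that is not shown here.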
Artem
  • Please just call it by the various types of subsets of Unicode because all C# code is Unicode characters. ASCII does not go to 255 and ANSI is not ASCII. (Since you are testing a `char` [UTF-16 code unit] <= 255, the ANSI character set you must be referring to is ISO 8859-1). The question now says "English alphabet" (contrast with letter characters in English text like é and æ), so the range is the letters in the [C0 Controls and Basic Latin](http://www.unicode.org/charts/nameslist/index.html) block, or, broadly, like in your code, by .NET Regex `!Regex.IsMatch(input, @"^\p{IsBasicLatin}*$")` – Tom Blodget Jan 30 '19 at 11:09
  • Hey @TomBlodget. I agree and I mentioned in the post that validation should be improved. I feel like the question is more about the mechanism of validation rather than about how to filter out characters – Artem Jan 30 '19 at 11:27
  • @Artem, the solution works fine, but it didn't work for .NET Core applications, because the Microsoft.CodeAnalysis.Workspaces.MSBuild NuGet package does not work with .netCoreApp :( Thank you!!! – Andrey Ravkov Jan 30 '19 at 12:50