Questions tagged [case-folding]

Questions related to the case-insensitive comparison and use of strings. This is not as simple as merely lowercasing or uppercasing strings.

26 questions
19
votes
1 answer

Golang complex fold grüßen

I'm trying to get case folding to be consistent between three languages (C++, Python and Golang) because I need to be able to check if a string matches the one saved no matter the language. An example problematic word is the German word "grüßen"…
Shawn Blakesley
  • 1,743
  • 1
  • 17
  • 33
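As a rough illustration of the mismatch this question describes, here is a minimal Python sketch. It covers only the Python side; whether a given C++ or Go library yields the same folded form is not shown here and would need to be checked separately. The name `fold_key` is a hypothetical helper, not part of any library.

```python
def fold_key(text: str) -> str:
    """Hypothetical helper: build a comparison key with full Unicode case folding."""
    return text.casefold()

words = ["grüßen", "GRÜSSEN", "Grüssen"]

# Full case folding maps "ß" to "ss", so all three spellings share one key.
folded_keys = {fold_key(w) for w in words}

# Plain lowercasing keeps "ß", so the spellings do not all agree.
lowered_keys = {w.lower() for w in words}
```

With this sketch `folded_keys` holds a single entry while `lowered_keys` holds two, which is the kind of difference that shows up when one language folds and another only lowercases.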
18
votes
2 answers

Assuming Unicode and case-insensitivity, should the pattern ".." match "FfIsS"?

It sounds like a joke, but I can sort of prove it. Assumptions: Dot matches any single character. A case-insensitive pattern matches s if and only if it matches s.toUpperCase(). All of the following is pretty logical and holds in…
maaartinus
  • 44,714
  • 32
  • 161
  • 320
18
votes
3 answers

How to create a case insensitive map in Go?

I want to use a case-insensitive string as a map key. Is this supported by the language, or do I have to create it myself? Edit: What I am looking for is a way to make this the default instead of having to remember to convert the keys every time I…
Santiago Corredoira
  • 47,267
  • 10
  • 52
  • 56
10
votes
2 answers

Unicode case folding to upper case

I'm trying to implement a library for reading Microsoft CFB (Compound File Binary) Format files, according to the official specification of that format. The specification is available from this site. In a nutshell - some of the structures of the…
Daniel Kamil Kozar
  • 18,476
  • 5
  • 50
  • 64
9
votes
1 answer

Should I use Python casefold?

I've recently been reading about casefold and string comparisons that ignore case. I've read that the MSDN recommendation is to use InvariantCulture and to avoid toLowercase. However, from what I have read, casefold is like a more aggressive toLowercase.…
FlyingLightning
  • 169
  • 1
  • 7
8
votes
3 answers

python: lower() german umlauts

I have a problem converting uppercase letters with umlauts to lowercase ones. print("ÄÖÜAOU".lower()) The A, O and U get converted properly, but the Ä, Ö and Ü stay uppercase. Any ideas? The first problem is fixed with .decode('utf-8') but…
user2104634
  • 83
  • 1
  • 4
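The symptom in this question looks like lowercasing applied to encoded bytes rather than to text, though that is an assumption, since the excerpt is truncated. A minimal Python 3 sketch of the difference, using only built-in `bytes` and `str` methods:

```python
raw = "ÄÖÜAOU".encode("utf-8")  # encoded bytes, e.g. as read from a file

# bytes.lower() only changes ASCII letters, so the umlaut bytes are untouched.
ascii_only = raw.lower().decode("utf-8")      # "ÄÖÜaou"

# Decoding first yields a str, whose lower() is Unicode-aware.
unicode_aware = raw.decode("utf-8").lower()   # "äöüaou"
```

Decoding to text before any case operation is the usual design choice; case mapping on raw bytes cannot see multi-byte characters.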
6
votes
2 answers

Why is upper casing not enough for case-insensitive comparison?

To compare two strings case-insensitively, one correct way is to case fold them first. How is this better than upper casing or lower casing? I found examples online where lower casing doesn't work correctly. For example "σ" and "ς" (two forms of "Σ")…
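The sigma example from this question can be checked directly. The following is a small Python sketch using only built-in `str` methods; the variable names are illustrative.

```python
small_sigma = "\u03c3"    # σ GREEK SMALL LETTER SIGMA
final_sigma = "\u03c2"    # ς GREEK SMALL LETTER FINAL SIGMA
capital_sigma = "\u03a3"  # Σ GREEK CAPITAL LETTER SIGMA

# Both forms are already lowercase, so lower() leaves them distinct.
lower_equal = small_sigma.lower() == final_sigma.lower()  # False

# Case folding maps all three forms to the same code point.
fold_equal = (
    small_sigma.casefold()
    == final_sigma.casefold()
    == capital_sigma.casefold()
)  # True
```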
6
votes
2 answers

Folding case to speed up comparisons

"strasse".Equals("STRAße",StringComparison.InvariantCultureIgnoreCase) This returns true, which is correct. Unfortunately, when I store one of these in Postgres, it treats them as different when doing a case-insensitive match (for example, with…
Tanktalus
  • 21,664
  • 5
  • 41
  • 68
4
votes
2 answers

While using casefold(), I am getting the error "AttributeError: 'str' object has no attribute 'casefold'"

vowels = 'aeiou' # take input from the user ip_str = raw_input("Enter a string: ") # make it suitable for caseless comparisons ip_str = ip_str.casefold() # make a dictionary with each vowel a key and value 0 count = {}.fromkeys(vowels,0) #…
Nikhil Kadam
  • 65
  • 1
  • 1
  • 8
3
votes
0 answers

Detecting Normalization Breaking Changes in Unicode via the UCD

Unicode emphasizes that software should be as forward compatible as possible, by defaulting to treating unassigned characters as if they were a private use code point. This works well in most cases, as most new characters do not change when…
3
votes
3 answers

Case folding variation to variable so input matches one of assigned variables

I'm creating a guess-the-word game in which secret_word can be entered in any case variation. How do I write the program so that every variation of secret_word is recognized? In this case the secret word is "Korea"; how can I unify all variations, or do I have to…
J Min
  • 55
  • 4
3
votes
1 answer

How do I make toLowerCase() and toUpperCase() consistent across browsers

Are there JavaScript polyfill implementations of String.toLowerCase() and String.toUpperCase(), or other methods in JavaScript that can work with Unicode characters and are consistent across browsers? Background info Performing the following will…
Dan McGrath
  • 41,220
  • 11
  • 99
  • 130
3
votes
3 answers

Regex Pattern with Unicode doesn't do case folding

In C# it appears that Grüsse and Grüße are considered equal in most circumstances as is explained by this nice webpage. I'm trying to find a similar behavior in Java - obviously not in java.lang.String. I thought I was in luck with…
geert3
  • 7,086
  • 1
  • 33
  • 49
3
votes
1 answer

Maximum length of a string after performing unicode casefolding

I need to perform casefolding on a set of strings, and must ensure beforehand that they will not exceed a given length after this is done (to hard-code the needed buffer size). The problem is that a string length (in code points) may change after…
michaelmeyer
  • 7,985
  • 7
  • 30
  • 36
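One way to explore the length question is to measure the worst-case expansion empirically. The sketch below is Python and assumes that `str.casefold()` folds each code point independently of its neighbours, so a per-code-point maximum gives a bound for whole strings. The result depends on the Unicode version bundled with the interpreter, and it counts code points, not bytes. Both function names are hypothetical.

```python
import sys

def max_fold_expansion() -> int:
    """Largest number of code points that any single code point folds to."""
    return max(len(chr(cp).casefold()) for cp in range(sys.maxunicode + 1))

def folded_length_bound(n_code_points: int) -> int:
    """Upper bound on the folded length of a string of n_code_points."""
    return n_code_points * max_fold_expansion()

# Two known expanding cases: "ß" folds to "ss", and the "ffi" ligature to "ffi".
sharp_s_len = len("\u00df".casefold())   # 2
ligature_len = len("\ufb03".casefold())  # 3
```

A buffer sized from `folded_length_bound` is conservative; if the storage is UTF-8 or UTF-16, the bound has to be recomputed in code units rather than code points.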
3
votes
2 answers

Normalization needed after case folding

Given an NFC-normalized string, if I apply full case folding to it, can I assume that the result is NFC-normalized too? I don't understand what the Unicode standard is trying to tell me in this quote: Normalization also interacts with case…
dalle
  • 18,057
  • 5
  • 57
  • 81
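A single counterexample is enough to show that the assumption in this question does not hold in general. The Python sketch below uses U+01F0, which is already in NFC but whose full case folding is a base letter plus a combining mark. It relies only on `str.casefold()` and the standard `unicodedata` module.

```python
import unicodedata

original = "\u01f0"  # ǰ LATIN SMALL LETTER J WITH CARON
original_is_nfc = unicodedata.normalize("NFC", original) == original  # True

# Full case folding yields "j" followed by U+030C COMBINING CARON.
folded = original.casefold()

# NFC recomposes that sequence, so the folded string was not in NFC.
renormalized = unicodedata.normalize("NFC", folded)
folded_is_nfc = folded == renormalized  # False
```

So a pipeline that needs both properties would, under this reading, normalize again after folding rather than rely on the input having been normalized.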