1

I have a Tableau data source (original source is MariaDB) containing a number of email addresses. When doing some quality checks on the data source I wanted to identify a variety of possible data entry issues. One of these is whether the string is stored in all lower case.

While emails are not case sensitive, the data entry should ensure they are stored in a standardised way and not just as free text. But historic data has not imposed such checks. So I wanted to identify where emails have upper-case characters which is one (small) signal that the data entry has not been careful.

But with Tableau 2020.2 this seems hard to do (either directly from the source table or as extracts). For example testing whether lower([Email])=[Email] simply returns true for all emails regardless of capitalisation.

Is there any easy way to force case sensitivity in comparisons inside Tableau or to check for upper-case characters in a string?

NB: this question is about Tableau not about solving the problem in simple DB queries. I use Tableau connected to a variety of data sources and would prefer a generic solution inside Tableau so details of DB setups are irrelevant.

matt_black
  • 1,290
  • 1
  • 12
  • 17
  • Does this answer your question? [MySQL case insensitive select](https://stackoverflow.com/questions/3936967/mysql-case-insensitive-select) – kmoser Sep 25 '20 at 17:07
  • @kmoser That answers the question for people writing their own DB SQL queries. My question was whether Tableau contains any obvious way to achieve this (whatever the DB defaults are). – matt_black Sep 25 '20 at 17:12
  • Apparently you might be able to get Tableau to treat fields in a case-sensitive manner if you use a CSV data source: https://kb.tableau.com/articles/issue/excel-data-is-treated-in-case-insensitive-way – kmoser Sep 25 '20 at 17:17
  • Nobody has suggested a solution using regular expressions yet. Which is good as I already have a problem. ;-) – matt_black Sep 25 '20 at 17:22
  • Are Tableau's string functions case-sensitive, or are they also dependent on the data source? I'm thinking `CONTAINS([email],LOWER([email]))` might work? – kmoser Sep 25 '20 at 18:29
  • @kmoser As far as I can tell tableau string functions *preserve* case but vary in whether they are case *sensitive* (eg when doing comparisons) depending on the data source. Hence on reason why I asked whether there were ways *inside Tableau* to break that dependence as relying on changes to the data source is usually not practical for BI developers. – matt_black Sep 26 '20 at 20:53
  • If you're looking for any uppercase chars in the email, try `REGEX_MATCH([email],'[A-Z]')`. I'm not sure if that function operates independently of the data source but I guess the only way to find out is to try it. – kmoser Sep 26 '20 at 20:59
  • @kmoser Yes that does work. Make it an answer and I will accept it (though some other regex expressions *are* very data source dependent in Tableau, this one does work). – matt_black Sep 26 '20 at 21:03
  • Great, I've posted my answer. Glad it worked for you! – kmoser Sep 27 '20 at 02:44

2 Answers2

3

REGEX_MATCH([email],'[A-Z]') should be case-sensitive regardless of the data source, since by definition the pattern [A-Z] matches only uppercase letters.

kmoser
  • 8,780
  • 3
  • 24
  • 40
1

Is there any easy way to force case sensitivity in comparisons or to check for upper-case characters in a string?

Basically, you want binary. This conditions evaluates as true if the string has any upper case character:

where lower(email) <> binary email

You could also use a regex:

where email rlike binary '[A-Z]'

Your question indicates that you are storing your data in case-insensitive character set / collation. If the case is meaningful, then you should use case-sensitive settings, which would remove the need for binary, hence making your queries more straight-forward, less error-prone, and potentially more efficient.

GMB
  • 216,147
  • 25
  • 84
  • 135
  • Were I dealing directly with the DB setup, then I could do some of this. But I'm working inside Tableau and trying to find something quick that works without having to go directly to the DB setting or writing SQL queries. – matt_black Sep 25 '20 at 17:10