Is there any feature on the BigQuery roadmap to support dynamic data masking, for example displaying masked data based on the user's role? I have explored DLP, which helps with storing masked data in BigQuery, but with that approach one has to create two versions of the same table: masked and unmasked. Please refer to the following link as an example for additional context. (Example Link)
5 Answers
As noted by Guillaume, the correct workaround at the moment is to use BigQuery Column-level security for controlling access to specific table columns.
As for the specific Data Masking feature where the column data is returned but masked, this is indeed on the BigQuery roadmap and is expected to be released as part of BigQuery Column-level security. However, there isn't any ETA on the release yet.
You may refer to Google's BigQuery release notes to stay in the loop on the latest BigQuery updates and feature releases.

There isn't an exact equivalent of that feature. And, indeed, you have to store both forms of the data, masked and unmasked.
However, there is a new feature named CLS: Column Level Security. With this feature you can allow a user to see, or not see, specific columns. In your use case, you can restrict the unmasked column to authorized users and show everyone else only the masked column.

For anyone seeing this: just use an authorized view. You can hash the data if you need something deterministic, or you can use a string/regex function to mask the data.
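A minimal sketch of what such a view could look like, assuming a source table with an `emailAddress` column. The project, dataset, table and view names are hypothetical, and the pepper is a placeholder. The view still has to be authorized on the source dataset separately (through that dataset's sharing settings), and end users get access only to the dataset that holds the view.

```sql
-- Hypothetical names: `my_project.private.users` is the source table,
-- `my_project.shared.users_masked` is the view exposed to end users.
CREATE OR REPLACE VIEW `my_project.shared.users_masked` AS
SELECT
  -- Deterministic hash: the same email always gives the same value,
  -- so COUNT(DISTINCT ...) and joins on the hash still work.
  TO_HEX(SHA256(CONCAT('some-pepper-value', emailAddress))) AS emailAddressHash,
  -- Regex mask: hides the local part and keeps the domain,
  -- e.g. 'alice@example.com' becomes '*****@example.com'.
  REGEXP_REPLACE(emailAddress, r'^[^@]+', '*****') AS emailAddressMasked
FROM `my_project.private.users`;
```

Whether this works for a column that also carries a policy tag is a separate question; another answer in this thread reports that BigQuery blocks such a derived column for users who lack access to the tagged column.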

Just an FYI for those who are considering Column Level Security (i.e. use of policy tags). I surfed on in here because we are currently hitting some limitations of policy tags.
We have columns that we can't expose to end users; emailAddress is a great example, hence we have a policy tag on it that prevents access to it. However, emailAddress is still a very useful column for end users to answer questions like:
How many distinct users visited our site?
For this reason we considered putting views on top of the tables that do this:
select SHA256(CONCAT("some-pepper-value", emailAddress)) AS emailAddressHash
This would enable end users to use an obfuscated identifier (and enable them to join tables together on emailAddressHash, which is also an important thing to be able to do). Unfortunately it doesn't work because BigQuery realises that column emailAddress is still being referenced and thus blocks access to emailAddressHash. Hence I've been googling around for "dynamic data masking in BigQuery", which landed me in here.

The feature arrived yesterday: https://cloud.google.com/bigquery/docs/column-data-masking-intro
I'm not sure whether it covers your requirements fully, but it should definitely work for simple cases.
