Hero Backgroud Elements 2
Reading Time: 15 Min

Data Masking – a walk in the park with SAP Data Warehouse Cloud

January 6, 2021

Blog
Author
Klaus-Peter Sauer Klaus-Peter Sauer
Klaus-Peter Sauer is Senior Director of Data Warehousing in the SAP Product Management.

The masking or pseudonymization of data offers an additional level of access control that can be applied to tables and views. Masking a column protects sensitive or confidential data in a particular column of a table or view by transforming the data in such a way that it is only partially visible or completely meaningless to a non-privileged user and yet appears real and consistent.

SAP HANA Cloud offers such a masking function, but only a user who also has the UNMASKED object privilege for a table or view can display the original data.
Let me show you how you can use Data Masking in the Data Warehouse Cloud with the following two options.

Prerequisites

You need a working Data Warehouse Cloud tenant and sufficient roles/privileges to create (or use) a database user within a space. Additional tools or other requirements are not needed for this.

Option 1 – Data stored in an open SQL schema

Create a database user and open SQL schema
Create a database user and therefore a Open SQL schema with rights for Data Ingestion and Data Consumption in the space. Alternatively, you can use an existing one you might already have available.

Please note that exactly this database user has the UNMASKED object privilege for all tables and views that you create in this schema using a MASK statement.

Select this user and open the Database Explorer from this screen using the  button above the database user table. In the connect dialog you use the generated user and password from your database user.

Create a table with a MASK statement

For this example, we are using a simple table with four columns and the following CREATE TABLE statement. Just replace <YOUR_SCHEMA_NAME> with your created Database User Name.

In the example this is a customer table, where the third column represents a credit card number (cc) and the fourth column the social security number (ssn). Copy the SQL statement to your SQL console in DB Explorer and run it.

Run CREATE TABLE statement in DB-Explorer

Use some test data by executing the following INSERT statements and display the data as shown in the screenshot.

The cc field is provided with a MASK directly in the CREATE TABLE statement. The data preview shows the values unmasked because this user used has the UNMASKED object privilege.

You can also apply a masking function subsequently to already existing tables. For example, if you created the table using an external ETL tool in the open SQL schema. To demonstrate this, we use an ALTER TABLE statement to add the masking also to the social security number (ssn) in our example. To show you also another option, we keep the last two digits from the right remain visible.

The following ALTER TABLE statement does exactly that:

Use the masked table in Data Warehouse Cloud

In Data Warehouse Cloud you can use this table as a remote table by defining a view based on the table. In the Data Builder you create a new Graphical View. Drag and drop the created table "Customer_OpenSQLSchema" from the open SQL schema under the SOURCES tab on the left. In the upcoming dialog box click on to generate the remote table.

Option 2 – Data stored in the space schema of Data Warehouse Cloud

Create a table in Data Warehouse Cloud
In the second example, the table with data is located within the space schema of the Data Warehouse Cloud and is exposed to the open SQL schema via a graphical view.
First, you create the table "Customer_SpaceSchema" in the table editor and deploy it:

Upload the same sample data set via a CSV file. For your convenience you can get that sample data file here: https://github.com/kp-sauer/Data-Masking

In the graphical view editor you can now create a new view named "Customer_SpaceSchema_View" based on the table you just created.

Make sure you enable the "Expose for Consumption" setting with this view and deploy it. If you use the data preview you see the unmasked data.

Create a view with a MASK statement

In this case you also use the database user in the open SQL schema to create the masking via the Database Explorer – this time via a database view.

You can simply do that using a CREATE VIEW statement similar to the CREATE TABLE statement we have used before, but you can also use a database function just to show you another variant. The advantage is that it can be reused in different use cases using the same logic.

For example, define a function like this:

The credit card number is masked, where the first four and last two digits are shown unmasked.

Now the view named "Customer_SpaceSchema_Masked" can be created using the database function and based on the space schema view.

Use the masked view in Data Warehouse Cloud 

Back in the view builder in Data Warehouse Cloud you can use the just created view from the open SQL schema to display the masking result similar to option 1 before.

Data masking in the data preview

In the view editor you can now carry out further modeling based on this new view and you will receive the credit card number masked.

More details on the data masking feature of SAP HANA Cloud can be obtained from the SAP help pages.