For as long as confidential and sensitive information has existed, there have been parties trying to access that data and use it for their own nefarious purposes.
In this ongoing game of cat and mouse, numerous data masking methods have been developed to make the task of accessing sensitive information futile. Each method has its own strengths and weaknesses, and each is usually best suited to a certain type of data.
The key data masking methods, with their advantages and disadvantages, are as follows:
Substitution – randomly replacing the contents of a column with data that looks similar but is completely unrelated.
Advantage – effectively preserves look and feel of existing data.
Disadvantage – too cumbersome when dealing with vast amounts of data as it may be too difficult to find such large quantities of relevant data to substitute.
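As a minimal sketch of substitution, assuming rows are held as Python dictionaries: the `substitute_column` helper and the `SUBSTITUTE_SURNAMES` pool below are hypothetical names invented for illustration, and a real pool would need to be far larger, which is exactly the difficulty noted above.

```python
import random

# Hypothetical pool of realistic-looking replacement values (an assumption
# for this sketch; real systems need a much larger pool).
SUBSTITUTE_SURNAMES = ["Archer", "Bennett", "Castillo", "Dawson", "Ellis"]


def substitute_column(rows, column, pool, seed=None):
    """Return copies of the rows with `column` replaced by random pool values."""
    rng = random.Random(seed)
    return [{**row, column: rng.choice(pool)} for row in rows]


customers = [
    {"id": 1, "surname": "Smith"},
    {"id": 2, "surname": "Jones"},
]
masked_customers = substitute_column(customers, "surname", SUBSTITUTE_SURNAMES, seed=42)
```

The masked rows keep their shape and still look like customer records, but the surnames no longer have any relationship to the originals.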
Shuffling – like substitution, but the substitute data is taken from the column itself. The values in a column are randomly shuffled between rows until they no longer correlate with the remaining information in each row.
Advantages – effectively preserves the look and feel of existing data, and deals with large amounts of data quickly and efficiently.
Disadvantages – ineffective when dealing with small amounts of data. Also, since the original data is still present, it may be "unshuffled" if the algorithm used was not sufficiently sophisticated.
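Shuffling can be sketched as follows, again assuming rows are Python dictionaries. The `shuffle_column` helper is a hypothetical name, and a plain random shuffle is used here only for illustration; it is not a claim about how any particular masking product works.

```python
import random


def shuffle_column(rows, column, seed=None):
    """Return copies of the rows with the values of `column` shuffled between rows."""
    rng = random.Random(seed)
    values = [row[column] for row in rows]
    # A plain shuffle can leave some values in their original row,
    # which is one reason small data sets are poorly protected.
    rng.shuffle(values)
    return [{**row, column: value} for row, value in zip(rows, values)]


employees = [
    {"id": 1, "salary": 42000},
    {"id": 2, "salary": 58000},
    {"id": 3, "salary": 61000},
    {"id": 4, "salary": 75000},
]
shuffled_employees = shuffle_column(employees, "salary", seed=7)
```

Every original salary is still present in the column, only attached to a different row, which is why the original data can in principle be recovered if the shuffle is predictable.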
Number and Date Variance – each number or date value in a column is algorithmically modified by some random percentage of its real value.
Advantage – can reasonably mask the data while still keeping the range and distribution of values within existing limits.
Disadvantage – only applicable to numeric and date data.
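A minimal sketch of variance is shown below. The function names, the 10% limit, and the 30-day window are assumptions chosen for illustration; in particular, shifting a date by a bounded number of days is one possible reading of "random percentage" for date values.

```python
import random
from datetime import date, timedelta


def vary_number(value, max_pct=0.10, rng=random):
    """Scale a number by a random factor within +/- max_pct of its real value."""
    factor = 1 + rng.uniform(-max_pct, max_pct)
    return value * factor


def vary_date(value, max_days=30, rng=random):
    """Shift a date by a random whole number of days within +/- max_days."""
    return value + timedelta(days=rng.randint(-max_days, max_days))


rng = random.Random(0)  # seeded only so the sketch is repeatable
masked_salary = vary_number(50000, max_pct=0.10, rng=rng)
masked_birthday = vary_date(date(1980, 6, 15), max_days=30, rng=rng)
```

Because each value moves only a bounded distance from its real value, totals and distributions stay roughly realistic while individual values are obscured.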
Encryption – data is algorithmically scrambled, and only those with access to the appropriate key can view the original data.
Advantage – masks the data so that it is unreadable without the key.
Disadvantages – encryption destroys the formatting as well as the look and feel of the data. Consequently, it is easy to see when data has been encrypted. Also, with enough effort, encryption may eventually be broken. Finally, in test or development databases, anyone with the appropriate key can access the data, which renders the encryption useless.
Nulling Out/Truncating/Deletion – removal of the sensitive data.
Advantage – useful in circumstances where the data is not required.
Disadvantage – not appropriate for test database environments, where the data, or at least a realistic approximation of it, is required by the test teams.
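The three removal variants can be sketched as follows, assuming rows are Python dictionaries. The helper names are hypothetical, and keeping the first few characters is only one way to interpret truncation.

```python
def null_out(rows, column):
    """Replace every value in `column` with None."""
    return [{**row, column: None} for row in rows]


def truncate(rows, column, keep=4):
    """Keep only the first `keep` characters of each value in `column`."""
    return [{**row, column: str(row[column])[:keep]} for row in rows]


def delete_column(rows, column):
    """Drop `column` from every row."""
    return [{k: v for k, v in row.items() if k != column} for row in rows]


accounts = [{"id": 1, "card": "4111111111111111"}]
nulled = null_out(accounts, "card")
truncated = truncate(accounts, "card", keep=4)
deleted = delete_column(accounts, "card")
```

In each case the sensitive value is gone or unusable, which also means a test team working with these rows has nothing realistic to test against.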