Shield Platform Encryption and Queries

If you’ve started working with Shield Platform Encryption, you’ve probably encountered some of the particular ways that it limits how you can interact with data in encrypted fields. A significant limitation is that you cannot use encrypted fields for filtering or sorting in queries– either directly through SOQL or in reports or list views.  After you stop throwing your computer at a wall and think about it, of course, it makes sense: Salesforce Shield Platform Encryption uses probabilistic encryption, which means that the ciphertext for a field is different every time, even if the plaintext value is the same.  Queries and reports are accessing the data layer that is still encrypted, which means there’s no way to identify records based on the decrypted values.  (note: there is a deterministic encryption beta currently available, which does support filtering and sorting on encrypted fields because the ciphertext will be the same for the same plaintext values.  You can contact Salesforce support to get added to the beta).

Not being able to query on encrypted fields can be a problem for many organizations, especially if they need to look up records based on data which must be encrypted for compliance reasons.  For instance, an organization tracking visit data with HIPAA requirements may need to update existing admission records with a discharge date, or insert a new visit if no matching admission record already exists.  HIPAA mandates that visit dates must be encrypted at rest, which means developers and architects have to get creative to create a query for the right record.

As I was puzzling on the problem, I realized that if the data was initialized in an Apex string variable, I’d be able to work with the plaintext value, and that beyond that, I could create a synthetic key using the plaintext values in a map for locating the correct record.  While I wouldn’t want to do this on anything running asynchronously (which could result in the plaintext values being stored, somewhere, decrypted at rest), it should be a secure solution for classes being executed immediately by a trigger.

Here’s the rough code that I put together for the proof of concept.  It isn’t production ready (method is directly in the trigger, needs try/catch blocks, test code coverage, etc), but I hope that it’s enough to give an idea of how to approach this:

trigger ImportStagingTrigger on Import_Staging_Object__c (before insert) {
    map<string, Import_Staging_Object__c> isomap = new Map<string, Import_Staging_Object__c>();
    Clinical_Event__c clinevent = new Clinical_Event__c();
    string cekey;
    string newcekey;
    list cetoupdate = new list();
    list cetoinsert = new list();
    map<string, Clinical_Event__c> cemap = new Map<string, Clinical_Event__c>();
    list celist = new list(
        [SELECT Admission_Date__c, Discharge_Date__c, Attending_Physician__c, Department_of_Service__c 
         FROM Clinical_Event__c 
         WHERE Discharge_Date_populated__c = FALSE]);
    for(Clinical_Event__c ce : celist){
        If(ce.Admission_Date__c != null && ce.Attending_Physician__c != null && ce.Department_of_Service__c != null){
        Integer dayofyear = ce.Admission_Date__c.dayofYear();
        cekey = ce.Attending_Physician__c+ce.Department_of_Service__c+dayofyear;
        system.debug('the clinical event key is '+cekey );
        cemap.put(cekey, ce);
    for(Import_Staging_Object__c isoce :{
        Integer isodayofyear = isoce.Admission_Date__c.dayofYear();
        newcekey = isoce.Attending_Physician__c + isoce.Department_of_Service__c +isodayofyear;
        system.debug('new ce key is '+newcekey);
            system.debug('Clinical event found');
            clinevent = cemap.get(newcekey);
            clinevent.Discharge_Date__c = isoce.Discharge_Date__c;
            clinevent.Discharge_Date_populated__c = TRUE;
        } else {
            system.debug('no clinical events found');
            clinevent.Admission_Date__c = isoce.Admission_Date__c;
            clinevent.Attending_Physician__c = isoce.Attending_Physician__c;
            clinevent.Department_of_Service__c = isoce.Department_of_Service__c;
                 if(isoce.Discharge_Date__c != NULL){
                      clinevent.Discharge_Date__c = isoce.Discharge_Date__c;
                      clinevent.Discharge_Date_populated__c = TRUE;
            cemap.put(newcekey, clinevent);
            update cetoupdate;
            insert cetoinsert;

Because we cannot query at all on encrypted fields (not even, for instance, Discharge_Date__c = NULL), I recommend using a field on the record that isn’t encrypted to limit your query, lest at some point a large data set results in heap size problems for your map.  In my case, I created a checkbox field, Discharge_Date_populated__c which my class sets to TRUE whenever it populates the Discharge_Date__c on a record.  I’m doing this within the class, because declarative automation cannot be triggered based on encrypted fields.     You might also be able to limit records based on created date or other values that do not require encryption.

Keep in mind that if you’re creating a synthetic key string for data in your map, you will also need to create a synthetic key using the corresponding values for the data you are inserting, in order to match the new insert/update to existing records in your map.

What strategies and techniques are you using to work with probabilistically encrypted data?


1 thought on “Shield Platform Encryption and Queries”

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s