The Right To Be Forgotten In Your Application

You’ve probably heard about “the right to be forgotten”, under which Google has to delete search results about you if you ask them to.

According to the new General Data Protection Regulation of the EU, the right to be forgotten means that a data subject (user) can request the deletion of their data from any data controller (which includes web sites), and the data controller must delete the data without delay. That applies whether the data is a social network profile, items sold in online shops/auctions, location data, properties being offered for sale, or even forum comments.

Of course, it is not quite that straightforward: it depends on the contractual obligations of the user (even with implicit contracts), and the Regulation lists a number of grounds, some of them very broad, on which the right to be forgotten applies. But the bottom line is that such functionality has to be supported.

I’ll quote the relevant bits from the Regulation:

Article 17
1. The data subject shall have the right to obtain from the controller the erasure of personal data concerning him or her without undue delay and the controller shall have the obligation to erase personal data without undue delay where one of the following grounds applies:
(a) the personal data are no longer necessary in relation to the purposes for which they were collected or otherwise processed;
(b) the data subject withdraws consent on which the processing is based according to point (a) of Article 6(1), or point (a) of Article 9(2), and where there is no other legal ground for the processing;
….
2. Where the controller has made the personal data public and is obliged pursuant to paragraph 1 to erase the personal data, the controller, taking account of available technology and the cost of implementation, shall take reasonable steps, including technical measures, to inform controllers which are processing the personal data that the data subject has requested the erasure by such controllers of any links to, or copy or replication of, those personal data.

This “legalese” basically means that your application should have a “forget me” functionality that deletes the user data entirely. No “deleted” or “hidden” flags, no “but our business is based on your data”, no “this will break our application”.

Note also that if your application pushes data to a 3rd party service (uploads videos to YouTube, pictures to Imgur, syncs data with Salesforce, etc.), you have to send deletion requests to these services as well.

The Regulation will apply from 2018, but it’s probably a good idea to have it in mind earlier. That is not just because there’s a Directive in force and courts have already issued decisions in that direction, but also because you have to keep that functionality working as you build your system. And since in most cases all data in your database is linked to a user, it is most likely considered “personal data” under the broad definition in the Regulation.

Technically, this can be achieved by ON DELETE CASCADE in relational databases, by cascade=ALL in ORMs, or by manual deletion in the application layer. The manual deletion needs maintaining and extending whenever you add new entities/tables, but it is safer than a generic cascade deletion. And as mentioned above, that may not be enough: your third party integrations should also support deletion. Most 3rd party APIs have that functionality, so your “/forget-me” endpoint handler would probably look something like this:

@Transactional
public void forgetUser(UUID userId) {
   User user = userDao.find(userId);
   // if cascading is considered unsafe, delete entity by entity
   fooDao.deleteFoosByUser(user);
   barDao.deleteBarsByUser(user);
   userDao.deleteUser(user);
   // propagate the deletion to every third party service that holds a copy
   thirdPartyService1.deleteUser(user.getThirdPartyService1Token());
   thirdPartyService2.deleteUser(user.getThirdPartyService2Token());
   // if some components of your deployment rely on processing events
   eventQueue.publishEvent(new UserDeletionEvent(user));
}

The code above may be executed asynchronously. It can also be run as part of a “forgetting” scheduled job: users submit their deletion requests and the job picks them up. The Regulation is not strict about real-time deletion, but it should happen “without undue delay”. So “once a month” is not an option.
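To make the scheduled-job variant concrete, here is a minimal sketch using only the JDK. Everything in it is my own assumption for illustration: the ForgettingJob class, the in-memory queue, and the Consumer that stands in for the forgetUser handler. A real deployment would most likely persist the requests in a table rather than keep them in memory, so that they survive a restart.

```java
import java.util.Queue;
import java.util.UUID;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical "forgetting" job: collects deletion requests and processes them in batches. */
class ForgettingJob {
    // In-memory queue for illustration only; requests are lost on restart.
    private final Queue<UUID> pending = new ConcurrentLinkedQueue<>();
    // Stand-in for the real handler, e.g. forgetMeService::forgetUser.
    private final Consumer<UUID> forgetUser;

    ForgettingJob(Consumer<UUID> forgetUser) {
        this.forgetUser = forgetUser;
    }

    /** Called by the "/forget-me" endpoint: only records the request. */
    void submit(UUID userId) {
        pending.add(userId);
    }

    /** Processes the requests queued so far; returns how many users were forgotten. */
    int runOnce() {
        int forgotten = 0;
        int batchSize = pending.size();
        for (int i = 0; i < batchSize; i++) {
            UUID userId = pending.poll();
            if (userId == null) {
                break;
            }
            try {
                forgetUser.accept(userId);
                forgotten++;
            } catch (RuntimeException e) {
                // Keep the request so that the next run retries it.
                pending.add(userId);
            }
        }
        return forgotten;
    }

    /** Schedules runOnce() repeatedly; the caller picks a period that is not an "undue delay". */
    ScheduledExecutorService start(long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(this::runOnce, period, period, unit);
        return scheduler;
    }
}
```

The endpoint would then only call job.submit(userId), and something like job.start(1, TimeUnit.HOURS) at startup would do the actual work. The hourly period is just an example value, not something the Regulation prescribes.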

My point is – you should think of that feature when designing your application, so that it doesn’t turn out that it’s impossible to delete data without breaking everything.

And you should think about that not (just) because some EU regulation says so, but because users’ data is not your property. Yes, the user has decided to just push stuff to your database, but you don’t own it. So if a user decides they no longer want you to hold any data about them, you are ethically (and now legally) bound to comply. You can bury the feature ten screens and two password forms deep, but it had better be there.

Whether that’s practical – I think it is. It comes in handy for acceptance tests, for example, which you can run against production (without relying on hardcoded user profiles). It is also not that hard to support a deletion feature, and it allows you to have a flexible data model.
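One way to keep the manual, entity-by-entity deletion from rotting as new tables are added is to make every module that stores user data register an “eraser”. The sketch below is only an illustration under my own assumptions: UserDataEraser, InMemoryStore and ForgetMeService are hypothetical names, and the in-memory map stands in for a real DAO. The holdsDataFor check is what an acceptance test could call after deleting its test user.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Hypothetical contract: every module that stores user data implements this. */
interface UserDataEraser {
    /** Deletes everything this module holds about the user. */
    void eraseFor(UUID userId);

    /** Returns true if this module still holds data about the user. */
    boolean holdsDataFor(UUID userId);
}

/** In-memory stand-in for a table keyed by user; a real one would wrap a DAO. */
class InMemoryStore implements UserDataEraser {
    private final Map<UUID, List<String>> rowsByUser = new HashMap<>();

    void add(UUID userId, String row) {
        rowsByUser.computeIfAbsent(userId, id -> new ArrayList<>()).add(row);
    }

    @Override
    public void eraseFor(UUID userId) {
        rowsByUser.remove(userId);
    }

    @Override
    public boolean holdsDataFor(UUID userId) {
        return rowsByUser.containsKey(userId);
    }
}

/** Runs every registered eraser, so a new table only needs a new eraser, not a handler change. */
class ForgetMeService {
    private final List<UserDataEraser> erasers;

    ForgetMeService(List<? extends UserDataEraser> erasers) {
        this.erasers = new ArrayList<>(erasers);
    }

    void forgetUser(UUID userId) {
        for (UserDataEraser eraser : erasers) {
            eraser.eraseFor(userId);
        }
    }

    /** True once no registered module holds data about the user any more. */
    boolean isForgotten(UUID userId) {
        return erasers.stream().noneMatch(eraser -> eraser.holdsDataFor(userId));
    }
}
```

An acceptance test could then create a throwaway user, exercise the application, call forgetUser, and assert isForgotten at the end. That cleans up after the test and verifies the deletion feature at the same time.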

Whether the Regulation will be properly applied depends on many factors, but the technical one may be significant with legacy systems. And as every system becomes “legacy” after six months, we should be talking about it.