Tracking Cookies and GDPR

GDPR is the new data protection regulation, as you probably already know. I’ve given detailed practical advice on what it means for developers (and product owners). However, there’s one thing missing there – cookies. The elephant in the room. Previously I’ve stated that cookies are subject to another piece of legislation – the ePrivacy directive, which is getting updated and whose new version will be in force a few years from now. And while that’s technically correct, cookies seem to be affected by GDPR as well. In a way I’ve underestimated that effect. When you do a Google search on “GDPR cookies”, you’ll pretty quickly realize that a) there’s not too much information and b) there’s not much technical understanding of the issue. What appears to be the consensus is that GDPR does change the way cookies are handled. More specifically – tracking cookies. Here’s recital 30: “(30) Natural persons may be associated with online identifiers provided by their devices, applications, tools and protocols, such as internet protocol addresses, cookie identifiers or other identifiers such as radio frequency identification tags. This may leave traces which, in particular when combined with unique identifiers and other information received by the servers, may be used to create profiles of the natural persons and identify them.” How tracking cookies work: a 3rd party (usually an ad network) gives you a code snippet that you place on your website, for example to display ads. That code snippet, however, calls “home” (makes a request to the 3rd party domain). If the 3rd party has previously been used on your computer, it has created...
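To make the mechanism above a bit more concrete, here is a toy simulation rather than real ad-network code. It assumes the commonly described flow in which the 3rd party issues an identifier cookie on first contact and receives it back on every later request from any site that embeds its snippet. The class names, the `ads.example` domain and the site names are all made up for illustration.

```python
import uuid
from collections import defaultdict
from typing import Dict, List, Optional


class ThirdPartyTracker:
    """Toy stand-in for the 3rd party's server (hypothetical, not a real ad network)."""

    def __init__(self) -> None:
        # cookie identifier -> list of sites on which that identifier was seen
        self.profiles: Dict[str, List[str]] = defaultdict(list)

    def handle_request(self, cookie_id: Optional[str], embedding_site: str) -> str:
        # First contact from this browser: no cookie yet, so issue a new identifier.
        if cookie_id is None:
            cookie_id = uuid.uuid4().hex
        # Record which site embedded the snippet; this is the "profile" of recital 30.
        self.profiles[cookie_id].append(embedding_site)
        # In a real HTTP exchange this value would travel back in a Set-Cookie header.
        return cookie_id


class Browser:
    """Toy browser that keeps one cookie per 3rd party domain."""

    def __init__(self) -> None:
        self.cookie_jar: Dict[str, str] = {}

    def visit(self, site: str, tracker: ThirdPartyTracker,
              tracker_domain: str = "ads.example") -> None:
        # The embedded snippet "calls home"; the browser attaches the cookie
        # it already holds for the tracker's domain, if any.
        sent = self.cookie_jar.get(tracker_domain)
        self.cookie_jar[tracker_domain] = tracker.handle_request(sent, site)


tracker = ThirdPartyTracker()
browser = Browser()
browser.visit("news.example", tracker)
browser.visit("shop.example", tracker)
```

After the two visits the tracker holds a single identifier that links both sites to the same browser, which is the kind of online identifier recital 30 refers to.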

Setting Up Cassandra With Priam

I’ve previously explained how to set up Cassandra in AWS. The described setup works, but in some cases it may not be sufficient. E.g. it doesn’t give you an easy way to make and restore backups, and adding new nodes relies on a custom Python script that randomly selects a seed. So now I’m going to explain how to set up Priam, a Cassandra helper tool by Netflix. My main reason for setting it up is the backup/restore functionality that it offers. All other ways to do backups are very tedious, and Priam happens to have implemented the important bits – the snapshotting and the incremental backups. Priam is a bit tricky to get running, though. The setup guide is not too detailed and not easy to find (it’s the last, not immediately visible item in the wiki). First, it has one branch per Cassandra version, so you have to check out the proper branch and build it. I immediately hit an issue there, as their naming doesn’t allow Eclipse to import the Gradle project. Within 24 hours I reported 3 issues, which isn’t ideal. Priam doesn’t support dynamic SimpleDB names, and doesn’t let you override bundled properties via the command line. I hope there aren’t bigger issues. I fixed the ones that I encountered and made a pull request. What does the setup look like? Append a javaagent to the JVM options. Run the Priam web: it automatically replaces most of cassandra.yaml, including the seed provider (i.e. how the node finds other nodes in the cluster). Run Cassandra: it fetches seed information (which is stored in AWS SimpleDB) and connects...

Using JWT For Sessions

The topic has been discussed many times, on Hacker News, Reddit and blogs. And the consensus is – DON’T USE JWT. And I largely agree with the criticism of the typical arguments for JWT, the typical “but I can make it work…” explanations and the flaws of the JWT standard. I won’t repeat everything here, so please go and read those articles. You can really shoot yourself in the foot with JWT, it’s complex to get to know it well and it has few benefits for most of the usecases. That said, I’ve gone against what the articles above recommend, and use JWT, navigating through their arguments and claiming I’m in a sweet spot. I can very well be wrong. I store the user ID in a JWT token stored as a cookie. Not local storage, as that’s problematic. Not the whole state, as I don’t need that and it may lead to problems (pointed out in the linked articles). What I want to avoid in my setup is sharing sessions across nodes. And this is a very compelling reason to not use the session mechanism of your web server/framework. No, you don’t need to have millions of users in order to need your application to run on more than one node. In fact, it should almost always run on (at least) two nodes, because nodes die and you don’t want downtime. Sticky sessions at the load balancer are a solution to that problem but you are just outsourcing the centralized session storage to the load balancer (and some load balancers might not support it). Shared session cache (e.g. memcached, elasticache, hazelcast)...

Adding Visible Electronic Signatures To PDFs

I’m aware this is going to be a very niche topic. Electronically signing PDFs is far from a mainstream usecase. However, I’ll write about it for two reasons – first, I think it will be very useful for those few who actually need it, and second, I think it will become more and more common as the eIDAS regulation gains popularity – it basically says that electronic signatures are recognized everywhere in Europe (now, it’s not exactly true, because of some boring legal details, but anyway). So, what is the usecase – first, you have to electronically sign the PDF with a digital signature (the legal term is “electronic signature”, so I’ll use them interchangeably, although they don’t fully match – e.g. any electronic data applied to other data can be seen as an electronic signature, whereas a digital signature is the PKI-based signature). Second, you may want to actually display the signature on the pages, rather than have the PDF reader recognize it and show it in some side-panel. Why is that? Because people are used to seeing signatures on pages and some (lawyers *cough*) may insist on having the signature visible (true story – I’ve got a comment that a detached signature “is not a REAL electronic signature, because it’s not visible on the page”). Now, notice that I wrote “pages”, not “page”. Yes, an electronic document doesn’t have pages – it’s a stream of bytes. So having the signature just on the last page is okay. But, again, people are used to signing all pages, so they’d prefer the electronic signature to be visible on all...

Integration With Zapier

Integration is boring. And also inevitable. But I won’t be writing about enterprise integration patterns. Instead, I’ll explain how to create an app for integration with Zapier. What is Zapier? It is a service that allows you to connect two (or more) otherwise unconnected services via their APIs (or protocols). You can do stuff like “Create a Trello task from an Evernote note”, “publish new RSS items to Facebook”, “append new emails to a spreadsheet”, “post approaching calendar meeting to Slack”, “Save big email attachments to Dropbox”, “tweet all instagrams above a certain likes threshold”, and so on. In fact, it looks to cover mostly the same usecases as another famous service that I really like – IFTTT (if this then that), with my favourite use-case “Get a notification when the international space station passes over your house”. And all of those interactions can be configured via a UI. Now that’s good for end users but what does it have to do with software development and integration? Zapier (unlike IFTTT, unfortunately) allows custom 3rd party services to be included. So if you have a service of your own, you can create an “app” and allow users to integrate your service with all the other 3rd party services. IFTTT offers a way to invoke web endpoints (including RESTful services), but it doesn’t allow setting headers, so that makes it quite limited for actual APIs. In this post I’ll briefly explain how to write a custom Zapier app and then will discuss where services like Zapier stand from an architecture perspective. The thing that I needed it for – to be...

GDPR for Developers [presentation]

At a recent meetup in Amsterdam I talked about GDPR from a technical point of view, effectively turning my “GDPR – a practical guide for developers” article into a talk. You can see the slides here: If you’re interested, you can also join a webinar on the same topic, organized by AxonIQ, where I will join Frans Vanbuul. You can find more information about the webinar here. The interesting thing that I can share after the meetup and after meeting with potential clients is that everyone (maybe unsurprisingly) has a very specific question that doesn’t get an immediate answer even after you follow the general guidelines. That is maybe a problem on the Regulation’s side, as it has not brought sufficient clarity to businesses. As I said during the presentation – in technology we’re used to binary questions. In law and legal compliance an answer is somewhere on a scale between 1 and 10. “Do I have to encrypt my data at rest?” Well, I guess yes, but in terms of compliance I’d say “6 out of 10”, as it is not strict, and depends on multiple people’s interpretation of the sensitivity of the data and on other factors like access control. So the communication between legal and technical people is key to understanding exactly what implementation changes are needed. The post GDPR for Developers [presentation] appeared first on Bozho's tech...