Electronic Signature Using The WebCrypto API

Sometimes we need to let users sign something electronically. Often people understand that as placing your handwritten signature on the screen somehow. Depending on the jurisdiction, that may be fine, or it may not be sufficient to just store the image. In Europe, for example, there’s the Regulation 910/2014 which defines what electronic signature are. As it can be expected from a legal text, the definition is rather vague: ‘electronic signature’ means data in electronic form which is attached to or logically associated with other data in electronic form and which is used by the signatory to sign; Yes, read it a few more times, say “wat” a few more times, and let’s discuss what that means. And it can bean basically anything. It is technically acceptable to just attach an image of the drawn signature (e.g. using an html canvas) to the data and that may still count. But when we get to the more specific types of electronic signature – the advanced and qualified electronic signatures, things get a little better: An advanced electronic signature shall meet the following requirements: (a) it is uniquely linked to the signatory; (b) it is capable of identifying the signatory; (c) it is created using electronic signature creation data that the signatory can, with a high level of confidence, use under his sole control; and (d) it is linked to the data signed therewith in such a way that any subsequent change in the data is detectable. That looks like a proper “digital signature” in the technical sense – e.g. using a private key to sign and a public key to...

“Architect” Should Be a Role, Not a Position

What happens when a senior developer becomes…more senior? It often happens that they get promoted to “architect”. Sometimes an architect doesn’t have to have been a developer, if they see “the bigger picture”. In the end, there’s often a person that holds the position of “architect”; a person who makes decisions about the architecture of the system or systems being developed. In bigger companies there are “architect councils”, where the designated architects of each team gather and decide wise things… But I think it’s a bad idea to have a position of “architect”. Architect is a position in construction – and it makes sense there, as you can’t change and tweak the architecture mid-project. But software architecture is flexible and should not be defined strictly upfront. And development and architecture are so intertwined, it doesn’t make much sense to have someone who “says what’s to be done” and others who “do it”. It creates all sorts of problems, mainly coming from the fact that the architect doesn’t fully imagine how the implementation will play out. If the architect hasn’t written code for a long time, they tend to disregard “implementation details” and go for just the abstraction. However, abstractions leak all the time, and it’s rarely a workable solution to just think of the abstraction without the particular implementation. That’s my first claim – you cannot be a good architect without knowing exactly how to write the whole code underneath. And no, too often it’s not “simple coding”. And if you have been an architect for years, and so you haven’t written code in years, you are almost certainly...

Overview of Message Queues [slides]

Yesterday I gave a talk that went through all the aspects of using messages queues. I’ve previously written that “you probably don’t need a message queue” – now the conclusion is a bit more nuanced, but I still stand by the simplicity argument. The talk goes through the various benefits and use cases of using message queues, and discusses alternatives of the typical “message queue broker” architecture. The slides are available here One maybe strange approach that I propose is to use distributed locks (e.g. with Hazelcast) for distributed batch processing – you lock on a particular id (possibly organization id / client id, rather than an individual record id) thus allowing multiple processors to run in parallel without stepping on each other’s toes (one node picks the first entry, the other one tries the first, but fails, and picks the second one). Something I missed as a benefit of brokers like RabbitMQ is the available tooling – you can monitor and debug your queues pretty easily. That isn’t true for simpler solutions. I wasn’t able to focus much on any of the concepts – e.g. how does brokerless work, or how to deploy a broker (e.g. RabbitMQ) in multiple data centers (availability zones), or how akka (and akka cluster) fits in the the “message queue” landscape. But I hope it’s a good overview that lets everyone have a clear picture of the options, which then they can analyze and assess in more details. The post Overview of Message Queues [slides] appeared first on Bozho's tech...

Event Logs

Most system have some sort of event logs – i.e. what has happened in the system and who did it. And sometimes it has a dual existence – once as an “audit log”, and once as event log, which is used to replay what has happened. These are actually two separate concepts: the audit log is the trace that every action leaves in the system so that the system can later be audited. It’s preferable that this log is somehow secured (will discuss that another time) the event log is a crucial part of the event-sourcing model where the database only stores modifications, rather than the current state. The current state is the obtained after applying all the stored modifications until the present moment. This allows seeing the state of the data at any moment in the past. There are a bunch of ways to get that functionality. For audit logs there is Hibernate Envers, which stores all modifications in a separate table. You can also have a custom solution using spring aspects or JPA listeners (that store modifications in an audit log table whenever a change happens). You can store the changes in multiple ways – as key/value rows (one row per modified field), or as objects serialized to JSON. Event-sourcing can be achieved by always inserting a new record instead of updating or deleting (and incrementing a version and/or setting a “current” flag). There some event-sourcing-native databases – Datomic and Event Store. (Note: “event-sourcing” isn’t equal to “insert-only”, but the approach is very similar) They still seem pretty similar – both track what has happened on the...

Spring Boot, @EnableWebMvc And Common Use-Cases

It turns out that Spring Boot doesn’t mix well with the standard Spring MVC @EnableWebMvc. What happens when you add the annotation is that spring boot autoconfiguration is disabled. The bad part (that wasted me a few hours) is that in no guide you can find that explicitly stated. In this guide it says that Spring Boot adds it automatically, but doesn’t say what happens if you follow your previous experience and just put the annotation. In fact, people that are having issues stemming from this automatically disabled autoconfiguration, are trying to address it in various ways. Most often – by keeping @EnableWebMvc, but also extending Spring Boot’s WebMvcAutoConfiguration. Like here, here and somewhat here. I found them after I got the idea and implemented it that way. Then realized doing it is redundant, after going through Spring Boot’s code and seeing that an inner class in the autoconfiguration class has a single-line javadoc stating Configuration equivalent to {@code @EnableWebMvc}. That answered my question whether spring boot autoconfiguration misses some of the EnableWebMvc “features”. And it’s good that they extended the class that provides EnableWebMvc, rather than mirroring the functionality (which is obvious, I guess). What should you do when you want to customize your beans? As usual, extend WebMvcConfigurerAdapter (annotate the new class with @Component) and do your customizations. So, bottom line of the particular problem: don’t use @EnableWebMvc in spring boot, just include spring-web as a maven/gradle dependency and it will be autoconfigured. The bigger picture here resulted in me adding a comment in the main configuration class detailing why @EnableWebMvc should not be put there. So...

Distributed Cache – Overview

What’s a distributed cache? A solution that is “deployed” in an application (typically a web application) and that makes sure data is loaded from memory, rather than from disk (which is much slower), in order to improve performance and response time. That looks easy if the cache is to be used on a single machine – you just load your most active data from the database in memory (e.g. a Guava Cache instance), and serve it from there. It becomes a bit more complicated when this has to work in a cluster – e.g. 5 application nodes serving requests to users in a round-robin fashion. You have to update the in-memory cache on all machines each time a piece of data is updated by a request to one of the machines. If you just load all the data in memory and don’t invalidate it, the cache won’t be “coherent” – it will have stale values and requests to different application nodes will have different results, which you most certainly want to avoid. Or you can have a single big cache server with tons of memory, but it can die – and that may disrupt the smooth operation, so you’d want to have at least 2 machines in a cluster. You can get a distributed cache in different ways. To list a few: Infinispan (which I’ve covered previously), Terracotta/Ehcache, Hazelcast, Memcached, Redis, Cassandra, Elasticache(by Amazon). The former three are Java-specific (both JCache compliant), but the rest can be used in any setup. Cassandra wasn’t initially meant to be cache solution, but it can easily be used as such. All of...