Enabling Two-Factor Authentication For Your Web Application

It’s almost always a good idea to support two-factor authentication (2FA), especially for back-office systems. 2FA comes in many different forms, some of which include SMS, TOTP, or even hardware tokens. Enabling them requires a similar flow: The user goes to their profile page (skip this if you want to force 2fa upon registration) Clicks “Enable two-factor authentication” Enters some data to enable the particular 2FA method (phone number, TOTP verification code, etc.) Next time they login, in addition to the username and password, the login form requests the 2nd factor (verification code) and sends that along with the credentials I will focus on Google Authenticator, which uses a TOTP (Time-based one-time password) for generating a sequence of verification codes. The ideas is that the server and the client application share a secret key. Based on that key and on the current time, both come up with the same code. Of course, clocks are not perfectly synced, so there’s a window of a few codes that the server accepts as valid. Note that if you don’t trust Google’s app, you can implement your own client app using the same library below (though you can see the source code to make sure no shenanigans happen). How to implement that with Java (on the server)? Using the GoogleAuth library. The flow is as follows: The user goes to their profile page Clicks “Enable two-factor authentication” The server generates a secret key, stores it as part of the user profile and returns a URL to a QR code The user scans the QR code with their Google Authenticator app thus creating a...

Setting Up Cassandra Cluster in AWS

Apache Cassandra is a NoSQL database that allows for easy horizontal scaling, using the consistent hashing mechanism. Seven years ago I tried it and decided not use it for a side-project of mine because it was too new. Things are different now, Cassandra is well established, there’s a company behind it (DataStax), there are a lot more tools, documentation and community support. So once again, I decided to try Cassandra. This time I need it to run in a cluster on AWS, so I went on to setup such a cluster. Googling how to do it gives several interesting results, like this, this and this, but they are either incomplete, or outdates, or have too many irrelevant details. So they are only of moderate help. My goal is to use CloudFormation (or Terraform potentially) to launch a stack which has a Cassandra auto-scaling group (in a single region) that can grow as easily as increasing the number of nodes in the group. Also, in order to have the web application connect to Cassandra without hardcoding the node IPs, I wanted to have a load balancer in front of all Cassandra nodes that does the round-robin for me. The alternative for that would be to have a client-side round-robin, but that would mean some extra complexity on the client which seems avoidable with a load balancer in front of the Cassandra auto-scaling group. The relevant bits from my CloudFormation JSON can be seen here. What it does: Sets up 3 private subnet (1 per availability zone in the eu-west region) Creates a security group which allows incoming and outgoing ports...

SecureLogin For Java Web Applications

No, there is not a missing whitespace in the title. It’s not about any secure login, it’s about the SecureLogin protocol developed by Egor Homakov, a security consultant, who became famous for committing to master in the Rails project without having permissions. The SecureLogin protocol is very interesting, as it does not rely on any central party (e.g. OAuth providers like Facebook and Twitter), thus avoiding all the pitfalls of OAuth (which Homakov has often criticized). It is not a password manager either. It is just a client-side software that performs a bit of crypto in order to prove to the server that it is indeed the right user. For that to work, two parts are key: Using a master password to generate a private key. It uses a key-derivation function, which guarantees that the produced private key has sufficient entropy. That way, using the same master password and the same email, you will get the same private key everytime you use the password, and therefore the same public key. And you are the only one who can prove this public key is yours, by signing a message with your private key. Service providers (websites) identify you by your public key by storing it in the database when you register and then looking it up on each subsequent login The client-side part is performed ideally by a native client – a browser plugin (one is available for Chrome) or a OS-specific application (including mobile ones). That may sound tedious, but it’s actually quick and easy and a one-time event (and is easier than password managers). I have to admit...

Five Must-Watch Software Engineering Talks

We’ve all watched dozens of talks online. And we probably don’t remember many of them. But some do stick in our heads and we eventually watch them again (and again) because we know they are good and we want to remember the things that were said there. So I decided to compile a small list of talks that I find very insightful, useful and that have, in a way, shaped my software engineering practice or expanded my understanding of the software world. 1. How To Design A Good API and Why it Matters by Joshua Bloch – this is a must-watch (well, obviously all are). And don’t skip it because “you are not writing APIs” – everyone is writing APIs. Maybe not used by hundreds of other developers, but used by at least several, and that’s a good enough reason. Having watched this talk I ended up buying and reading one of the few software books that I have actually read end-to-end – “Effective Java” (the talk uses Java as an example, but the principles aren’t limited to Java) 2. How to write clean, testable code by Miško Hevery. Maybe there are tons of talks about testing code, maybe Uncle Bob has a more popular one, but I found this one particularly practical and the the point – that writing testable code is a skill, and that testable code is good code. (By the way, the speaker then wrote AngularJS) 3. Back to basics: the mess we’ve made of our fundamental data types by Jon Skeet. The title says it all, and it’s nice to be reminded of how...

Stubbing Key-Value Stores

Every project that has a database has dilemma: how to test database-dependent code. There are several options (not mutually exclusive): Use mocks – use only unit tests and mock the data-access layer, assuming the DAO-to-database communication works Use an embedded database that each test starts and shuts down. This can also be viewed as unit-testing Use a real database deployed somewhere (either locally or on a test environment). The hard part is making sure it’s always in a clean state. Use end-to-end/functional tests/bdd/UI tests after deploying the application on a test server (which has a proper database). None of the above is without problems. Unit tests with mocked DAOs can’t really test more complex interactions that rely on a database state. Embedded databases are not always available (e.g. if you are using a non-relational database, or if you rely on RDBMS-specific functionality, HSQLDB won’t do), or they can be slow to start and this your tests may take too long supporting. A real database installation complicates setup and keeping it clean is not always easy. The coverage of end-to-end tests can’t be easily measured and they don’t necessarily cover all the edge cases, as they are harder to maintain than unit and integration tests. I’ve recently tried a strange approach that is working pretty well so far – stubbing the database. It is applicable more to key-value stores and less to relational databases. In my case, even though there is embedded cassandra, it was slow to start, wasn’t easy to setup and had subtle issues. That’s why I replaced the whole thing with an in-memory ConcurrentHashMap. Since I’m using...

How To Send Ethereum Transactions With Java

After I’ve expressed my concerns about the blockchain technology, let’s get a bit more practical with the blockchain. In particular, with Ethereum. I needed to send a transaction with Java, so I looked at EthereumJ. You have three options: Full node – you enable syncing, which means the whole blockchain gets downloaded. It takes a lot of time, so I abandoned that approach “Light” node – you disable syncing, so you just become part of the network, but don’t fetch any parts of the chain. Not entirely sure, but I think this corresponds to the “light” mode of geth (the ethereum CLI). You are able to send messages (e.g. transaction messages) to other peers to process and store on the blockchain, but you yourself do not have the blockchain. Offline (no node) – just create and sign the transaction, compute its raw representation (in the ethereum RLP format) and push it to the blockchain via a centralized API, e.g. the etherscan.io API. Etherscan is itself a node on the network and it can perform all of the operations (so it serves as a proxy) Before going further, maybe it’s worth pointing out a few general properties of the blockchain (the ethereum one and popular cryptocurrencies at least) – it is a distributed database, relying on a peer-to-peer (overlay) network, formed by whoever has a client software running (wallet or otherwise). Transactions are in the form of “I (private key owner) want to send this amount to that address”. Transactions can have additional data stored inside them, e.g. representing what they are about. Transactions then get verified by peers (currently...