7 Questions To Ask Yourself About Your Code

I was thinking the other days – why writing good code is so hard? Why the industry still hasn’t got to producing quality software, despite years of efforts, best practices, methodologies, tools. And the answer to those questions is anything but simple. It involves economic incentives, market realities, deadlines, formal education, industry standards, insufficient number of developers on the market, etc. etc. As an organization, in order to produce quality software, you have to do a lot. Setup processes, get your recruitment right, be able to charge the overhead of quality to your customers, and actually care about that. But even with all the measures taken, you can’t guarantee quality code. First, because that’s subjective, but second, because it always comes down to the individual developers. And not simply whether they are capable of writing quality software, but also whether they are actually doing it. And as a developer, you may fit the process and still produce mediocre code. This is why my thoughts took me to the code from the eyes of the developer, but in the context of software as a whole. Tools can automatically catch code styles issues, cyclomatic complexity, large methods, too many method parameters, circular dependencies, etc. etc. But even if you cover those, you are still not guaranteed to have produced quality software. So I came up with seven questions that we as developers should ask ourselves each time we commit code. Is it correct? – does the code implement the specification. If there is no clear specification, did you do a sufficient effort to find out the expected behaviour. And is that...

How to create secure software? Don’t blink! [talk]

Last week Acronis (famous for their TrueImage) organized a conference in Sofia about cybersecurity for developers and I was invited to give a talk. It’s always hard to pick a topic for a talk on a developer conference that is not too narrowly focused – if you choose something too high level, you can be uesless to the audience and seen as a “bullshitter”; if you pick something too specific, half of the audience may be bored because it is not their area of work. So I chose the middle ground – an overview talk with as much specifics as possible in it. I tried to tell interesting stories of security vulnerabilities to illustrate my points. The talk is split in several parts: Purpose of attacks Front-end vulnerabilities (examples and best practices) Back-end vulnerabilities (examples and best practices) Infrastructure vulnerabilities (examples and best practices) Human factor vulnerabilities (examples and best practices) Thoughts on how this fits into the bigger picture of software security You can watch the 30 minutes video here: If you would like to download my slides, click here. or view them at SlideShare: The point is – security is hard and there are a million things to watch for and a million things that can go wrong. You should minimize risk by knowing and following as much best practices as possible. And you should not assume you are secure, as even the best companies make rookie mistakes. The security mindset, which is partly formalized by secure coding practices, is at the core of having a secure software. Asking yourself constantly “what could go wrong” will make...

Avoid Lists in Cassandra

Apache Cassandra is fast and scalable database which over the years became almost as easy to use as a traditional SQL database. At least on the surface. You an use SQL-like queries, but they have a lot of limitations; you have a schema, but it’s not as flexible to modify it as in a SQL database; you have the same tabular structure with a primary key, but it’s more complicated due to the differentiation between partition key and sorting key. And there are a lot of underlying details that seem irrelevant at first, but are crucial for performance and data consistency, like tombstones, SSTable compaction and so on. But I want to discuss the “list” column type, as recently we’ve had a very elusive issue with it. We are in the business of guaranteeing data integrity, and that’s why our records are not updated, ever. This is a good fit for Cassandra, as updates are tricky to get right. But on one of our deployments we noticed something strange – very rarely, the hash of the data in a particular entry out of millions would not match upon comparison with the indexed data. Upon investigation, we noticed that a column of type “list” got duplicate values. It was not an issue with the code, because in this particular case the code was always using Collections.singletonList(..) It appears that Cassandra is trying to be clever and when it sees identical entries in a batch insert, instead of overriding one with the other, it tries to merge them, resulting in a list with duplicate entries. Accounts of the issue are reported...

Certificate Transparency Verification in Java

So I had this naive idea that it would be easy to do certificate transparency verification as part of each request in addition to certificate validity checks (in Java). With half of the weekend sacrificed, I can attest it’s not that trivial. But what is certificate transparency? In short – it’s a publicly available log of all TLS certificates in the world (which are still called SSL certificates even though SSL is obsolete). You can check if a log is published in that log and if it’s not, then something is suspicious, as CAs have to push all of their issued certificates to the log. There are other use-cases, for example registering for notifications for new certificates for your domains to detect potentially hijacked DNS admin panels or CAs (Facebook offers such a tool for free). What I wanted to do is the former – make each request from a Java application verify the other side’s certificate in the certificate transparency log. It seems that this is not available out of the box (if it is, I couldn’t find it. In one discussion about JEP 244 it seems that the TLS extension related to certificate transparency was discussed, but I couldn’t find whether it’s supported in the end). I started by thinking you could simply get the certificate, and check its inclusion in the log by the fingerprint of the certificate. That would’ve been too easy – the logs to allow for checking by hash, however it’s not the fingerprint of a certificate, but instead a signed certificate timestamp – a signature issued by the log prior to inclusion....

Integrating Applications As Heroku Add-Ons

Heroku is a popular Platform-as-a-Service provider and it offers vendors the option to be provided as add-ons. Add-ons can be used by Heroku customers in different ways, but a typical scenario would be “Start a database”, “Start an MQ”, or “Start a logging solution”. After you add the add-on to your account, you can connect to the chosen database, MQ, logging solution or whetaver. Integrating as Heroku add-on is allegedly simple, and Heroku provides good documentation on how to do it. However, there are some pitfalls and so I’d like to share my experience in providing our services (Sentinel Trails and SentinelDB) as Heroku add-ons. Both are SaaS (one is a logging solution, the other one – a cloud datastore), and so when a Heroku customer wants to add it to their account, we have to just create an account for them on our end. In order to integrate with Heroku, you need to implement several endpoints: provisioning – the initial creation of the resources (= account)plan change – since Heroku supports multiple subscription plans, this should also be reflected on your enddeprovisioning – if a user stops using your service, you may want to free some resourcesSSO – allows users to log in your service by clicking an icon in the Heroku console. Implementing these endpoints following the tutorial should be straightforward, but it isn’t exactly. Hence I’m sharing our Spring MVC controller that handles it – you can check it here. A few important bits: You may choose not to obtain a token if you don’t plan to interact with the Heroku API further.We are registering the...

Types of Data Breaches and How To Prevent Them

Data breaches happen practically every day. Personal, including financial and medical data leak to cyber criminals as well as intelligence agencies. Some notable breaches include the Equifax breach, where dozens of personal data fields were leaked, and the recent Marriott breach, where passports, credit cards and locations of people at a given time were breached. I’ve been doing some data protection consultancy as well as working on a data protection product and decided to classify the types of data breaches and give recommendations on how they can be addressed. We don’t always get to know how exactly the breaches happen, but from what is published in news articles and post-mortems, we can have a good overview on the breach landscape. Control over target server – if an attacker is able to connect to a target server and gains full or partial control on it, they can do anything, including running SELECT * FROM ... , copying files, etc. How do attackers gain such control? In many ways, most notably RCE (remote code execution) vulnerabilities and weak admin authentication. How to prevent it? Follow best security practices – regularly update libraries and software to get security patches, do not run native commands from within the application layer, open only necessary ports (80 and 443) to the outside world, configure 2-factor authentication for administrator login. Aim at having an intrusion detection / prevention system. Encrypt your data, and make the encryption as granular as possible for the most sensitive data (e.g. for SentinelDB we utilize per-record encryption) to avoid SELECT * breaches. SQL injections – this is a rookie mistake that...