My Advice To Developers About Working With Databases: Make It Secure

Last month Ben Brumm asked me for the one advice I’d like to give to developers that are working with databases (in reality – almost all of us). He published mine as well as many others’ answers here, but I’d like to share it with my readers as well. If I had to give developers working with databases one advice, it would be: make it secure. Every other thing you’ll figure in time – how to structure your tables, how to use ORM, how to optimize queries, how to use indexes, how to do multitenancy. But security may not be on the list of requirements and it may be too late when the need becomes obvious. So I’d focus on several things: Prevent SQL injections – make sure you use an ORM or prepared statements rather than building queries with string concatenation. Otherwise a malicious actor can inject anything in your queries and turn them into a DROP DATABASE query, or worse – one that exfiltrates all the data. Support encryption in transit – this often has to be supported by the application’s driver configuration, e.g. by trusting a particular server certificate. Unencrypted communication, even within the same datacenter, is a significant risk and that’s why databases support encryption in transit. (You should also think about encryption at rest, but that’s more of an Ops task) Have an audit log at the application level – “who did what” is a very important question from a security and compliance point of view. And no native database functionality can consistently answer the question “who” – it’s the application that manages users....

Discovering an OSSEC/Wazuh Encryption Issue

I’m trying to get the Wazuh agent (a fork of OSSEC, one of the most popular open source security tools, used for intrusion detection) to talk to our custom backend (namely, our LogSentinel SIEM Collector) to allow us to reuse the powerful Wazuh/OSSEC functionalities for customers that want to install an agent on each endpoint rather than just one collector that “agentlessly” reaches out to multiple sources. But even though there’s a good documentation on the message format and encryption, I couldn’t get to successfully decrypt the messages. (I’ll refer to both Wazuh and OSSEC, as the functionality is almost identical in both, with the distinction that Wazuh added AES support in addition to blowfish) That lead me to a two-day investigation on possible reasons. The first side-discovery was the undocumented OpenSSL auto-padding of keys and IVs described in my previous article. Then it lead me to actually writing C code (an copying the relevant Wazuh/OSSEC pieces) in order to debug the issue. With Wazuh/OSSEC I was generating one ciphertext and with Java and openssl CLI – a different one. I made sure the key, key size, IV and mode (CBC) are identical. That they are equally padded and that OpenSSL’s EVP API is correctly used. All of that was confirmed and yet there was a mismatch, and therefore I could not decrypt the Wazuh/OSSEC message on the other end. After discovering the 0-padding, I also discovered a mistake in the documentation, which used a static IV of FEDCA9876543210 rather than the one found in the code, where the 0 preceded 9 – FEDCA0987654321. But that didn’t fix the...

OpenSSL Key and IV Padding

OpenSSL is an omnipresent tool when it comes to encryption. While in Java we are used to the native Java implementations of cryptographic primitives, most other languages rely on OpenSSL. Yesterday I was investigating the encryption used by one open source tool written in C, and two things looked strange: they were using a 192 bit key for AES 256, and they were using a 64-bit IV (initialization vector) instead of the required 128 bits (in fact, it was even a 56-bit IV). But somehow, magically, OpenSSL didn’t complain the way my Java implementation did, and encryption worked. So, I figured, OpenSSL is doing some padding of the key and IV. But what? Is it prepending zeroes, is it appending zeroes, is it doing PKCS padding or ISO/IEC 7816-4 padding, or any of the other alternatives. I had to know if I wanted to make my Java counterpart supply the correct key and IV. It was straightforward to test with the following commands: # First generate the ciphertext by encrypting input.dat which contains "testtesttesttesttesttest" $ openssl enc -aes-256-cbc -nosalt -e -a -A -in input.dat -K '7c07f68ea8494b2f8b9fea297119350d78708afa69c1c76' -iv 'FEDCBA987654321' -out input-test.enc # Then test decryption with the same key and IV $ openssl enc -aes-256-cbc -nosalt -d -a -A -in input-test.enc -K '7c07f68ea8494b2f8b9fea297119350d78708afa69c1c76' -iv 'FEDCBA987654321' testtesttesttesttesttest # Then test decryption with different paddings $ openssl enc -aes-256-cbc -nosalt -d -a -A -in input-test.enc -K '7c07f68ea8494b2f8b9fea297119350d78708afa69c1c76' -iv 'FEDCBA9876543210' testtesttesttesttesttest $ openssl enc -aes-256-cbc -nosalt -d -a -A -in input-test.enc -K '7c07f68ea8494b2f8b9fea297119350d78708afa69c1c760' -iv 'FEDCBA987654321' testtesttesttesttesttest $ openssl enc -aes-256-cbc -nosalt -d -a -A -in input-test.enc -K '7c07f68ea8494b2f8b9fea297119350d78708afa69c1c76000' -iv 'FEDCBA987654321' testtesttesttesttesttest $ openssl...

ElasticSearch Multitenancy With Routing

Elasticsearch is great, but optimizing it for high load is always tricky. This won’t be yet another “Tips and tricks for optimizing Elasticsearch” article – there are many great ones out there. I’m going to focus on one narrow use-case – multitenant systems, i.e. those that support multiple customers/users (tenants). You can build a multitenant search engine in three different ways: Cluster per tenant – this is the hardest to manage and requires a lot of devops automation. Depending on the types of customers it may be worth it to completely isolate them, but that’s rarely the case Index per tenant – this can be fine initially, and requires little additional coding (you just parameterize the “index” parameter in the URL of the queries), but it’s likely to cause problems as the customer base grows. Also, supporting consistent mappings and settings across indexes may be trickier than it sounds (e.g. some may reject an update and others may not depending on what’s indexed). Moving data to colder indexes also becomes more complex. Tenant-based routing – this means you put everything in one cluster but you configure your search routing to be tenant-specific, which allows you to logically isolate data within a single index. The last one seems to be the preferred option in general. What is routing? The Elasticsearch blog has a good overview and documentation. The idea lies in the way Elasticsearch handles indexing and searching – it splits data into shards (each shard is a separate Lucene index and can be replicated on more than one node). A shard is a logical grouping within a single Elasticsearch...

Is It Really Two-Factor Authentication?

Terminology-wise, there is a clear distinction between two-factor authentication (multi-factor authentication) and two-step verification (authentication), as this article explains. 2FA/MFA is authentication using more than one factors, i.e. “something you know” (password), “something you have” (token, card) and “something you are” (biometrics). Two-step verification is basically using two passwords – one permanent and another one that is short-lived and one-time. At least that’s the theory. In practice it’s more complicated to say which authentication methods belongs to which category (“something you X”). Let me illustrate that with a few emamples: An OTP hardware token is considered “something you have”. But it uses a shared symmetric secret with the server so that both can generate the same code at the same time (if using TOTP), or the same sequence. This means the secret is effectively “something you know”, because someone may steal it from the server, even though the hardware token is protected. Unless, of course, the server stores the shared secret in an HSM and does the OTP comparison on the HSM itself (some support that). And there’s still a theoretical possibility for the keys to leak prior to being stored on hardware. So is a hardware token “something you have” or “something you know”? For practical purposes it can be considered “something you have” Smartphone OTP is often not considered as secure as a hardware token, but it should be, due to the secure storage of modern phones. The secret is shared once during enrollment (usually with on-screen scanning), so it should be “something you have” as much as a hardware token SMS is not considered secure and...