A Caveat With AWS Shared Resources

Recently I was releasing a new build, as usual using a blue-green deployment: switching the DNS record to point to the load balancer of the previously “spare” group. Before I switched the DNS, I checked the logs of the newly launched version and noticed something strange: continuous HTTP errors from our web framework (Spring MVC) saying that a certain endpoint does not support the HTTP method. The odd thing was that I didn’t have such an endpoint at all. I enabled further logging, and it turned out that the request URL did not belong to my domain at all. The spare group, which did not yet have traffic directed at it, was receiving requests intended for a completely different domain, one I didn’t own.

I messaged the domain owner, as well as AWS, to inform them of the issue. The domain owner said they had no idea what it was and that they didn’t have any unused or forgotten AWS resources. AWS, however, responded as follows:

The ELB service scales dynamically as traffic demand changes, therefore when scaling occurs, the ELB service will take IP addresses from the AWS unused public IP address pool and assign them to the ELB nodes that are provisioned for you. The foreign domain name you see here in your case, likely belongs to another AWS customer who’s AWS resource is no longer using one of the IP addresses that your ELB node now has as it was released to the AWS unused IP pool at some stage, their web clients are very likely excessively caching DNS for these DNS names (not respecting DNS TTL),...

Writing Big JSON Files With Jackson

Sometimes you need to export a lot of data to a JSON file. Maybe it’s “export all data to JSON”, or the GDPR “Right to portability”, where you effectively need to do the same. And as with any big dataset, you can’t just fit it all in memory and write it to a file. The export takes a while and reads a lot of entries from the database, so you need to be careful that it neither overloads the entire system nor runs out of memory.

Luckily, it’s fairly straightforward to do that with the help of Jackson’s SequenceWriter and, optionally, piped streams. Here’s what it would look like:

    private ObjectMapper jsonMapper = new ObjectMapper();
    private ExecutorService executorService = Executors.newFixedThreadPool(5);

    @Async
    public ListenableFuture<Boolean> export(UUID customerId) {
        try (PipedInputStream in = new PipedInputStream();
                PipedOutputStream pipedOut = new PipedOutputStream(in);
                GZIPOutputStream out = new GZIPOutputStream(pipedOut)) {

            Stopwatch stopwatch = Stopwatch.createStarted();
            ObjectWriter writer = jsonMapper.writer().withDefaultPrettyPrinter();
            Future<?> storageFuture;

            try (SequenceWriter sequenceWriter = writer.writeValues(out)) {
                sequenceWriter.init(true);
                // the storage provider reads from the other end of the pipe in a separate thread
                storageFuture = executorService.submit(() ->
                        storageProvider.storeFile(getFilePath(customerId), in));

                int batchCounter = 0;
                while (true) {
                    List<Record> batch = readDatabaseBatch(batchCounter++);
                    if (batch.isEmpty()) {
                        break; // no more records to export
                    }
                    for (Record record : batch) {
                        sequenceWriter.write(record);
                    }
                }
            }

            // wait for storing to complete; the sequence writer is closed first so that
            // (with Jackson's default settings) the reading side sees the end of the stream
            storageFuture.get();

            logger.info("Exporting took {} seconds", stopwatch.stop().elapsed(TimeUnit.SECONDS));
            return AsyncResult.forValue(true);
        } catch (Exception ex) {
            logger.error("Failed to export data", ex);
            return AsyncResult.forValue(false);
        }
    }

The code does a few things:

- Uses a SequenceWriter to continuously write records. It is initialized with an OutputStream, to which everything is written. This could be a simple FileOutputStream, or a piped stream as discussed below.
Note that the naming here is a bit misleading – writeValues(out) sounds like you are instructing...
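The piped-stream arrangement used in the export method can be tried in isolation. Below is a minimal sketch that uses only the JDK (no Jackson). The class `PipedExportSketch`, the in-memory `store` method standing in for a storage provider, and the hand-written JSON lines are all assumptions made for illustration; none of them come from the original code. It shows one thread gzip-compressing records into a pipe while a second thread consumes them, and why the writing side has to be closed before waiting for the consumer.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class PipedExportSketch {

    // Hypothetical stand-in for a storage provider: drains the stream into memory.
    static byte[] store(InputStream in) throws IOException {
        return in.readAllBytes();
    }

    // Writes recordCount JSON lines, gzip-compressed, through a pipe to the "storage" thread.
    static byte[] export(int recordCount) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try (PipedInputStream in = new PipedInputStream()) {
            PipedOutputStream pipedOut = new PipedOutputStream(in);
            // The consumer starts reading immediately, so the small pipe buffer never fills up for long.
            Future<byte[]> stored = executor.submit(() -> store(in));

            try (GZIPOutputStream out = new GZIPOutputStream(pipedOut)) {
                for (int i = 0; i < recordCount; i++) {
                    out.write(("{\"id\":" + i + "}\n").getBytes(StandardCharsets.UTF_8));
                }
            }
            // Closing the gzip stream wrote the trailer and closed the pipe, so the
            // consumer reaches end-of-stream. Waiting before closing would block forever.
            return stored.get();
        } finally {
            executor.shutdown();
        }
    }

    // Decompresses the stored bytes back into text, to check what was written.
    static String gunzip(byte[] compressed) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    // Convenience wrapper: export, then read the result back as a string.
    static String exportAndReadBack(int recordCount) {
        try {
            return gunzip(export(recordCount));
        } catch (Exception ex) {
            throw new IllegalStateException("Export failed", ex);
        }
    }

    public static void main(String[] args) {
        System.out.print(exportAndReadBack(3));
    }
}
```

In this sketch the whole result ends up in a byte array only to keep the example self-contained; a real storage provider would presumably stream the bytes onward instead of buffering them.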

Proving Digital Events (Without Blockchain)

Recently technical and non-technical people alike started to believe that the best (and only) way to prove that something has happened in an information system is to use a blockchain. But there are other ways to achieve that which are arguably better and cheaper. Of course, blockchain can be used to do that, and it will do it well, but it is far from the only solution to this problem.

The way blockchain proves that some event has occurred is by putting it into a tamper-evident data structure (a hash chain of the roots of Merkle trees of transactions) and distributing that data structure across multiple independent actors so that “tamper-evident” becomes “tamper-proof” (sort of). So if an event is stored on a blockchain, and the chain is intact (and others have confirmed it’s intact), this is a technical guarantee that it had indeed happened and was neither back-dated nor modified.

An important note here: I’m stressing “digital” events, because no physical event can be truly guaranteed electronically. The fact that someone has to enter the physical event into a digital system makes this process error-prone, and the question becomes “was the event correctly recorded” rather than “was it modified once it was recorded”. And yes, you can have “certified” / “approved” recording devices that automate transferring physical events to the digital realm, e.g. certified speed cameras, but the certification process is a separate topic. So we’ll stay purely in the digital realm (and ignore all provenance use cases).

There are two aspects to proving digital events – technical and legal. Once you get in court, it’s unlikely to...
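To make “tamper-evident” more concrete, here is a minimal sketch of a plain hash chain in Java. It covers only the chaining part of what is described above (no Merkle trees, no distribution across independent actors), and the class `HashChainSketch`, its `Entry` type and the sample event strings are assumptions chosen for illustration. Each entry’s hash covers the previous entry’s hash plus the event itself, so changing a recorded event breaks verification.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

class HashChainSketch {

    // One chain entry: the event plus the hash that links it to its predecessor.
    static final class Entry {
        final String event;
        final byte[] hash;

        Entry(String event, byte[] hash) {
            this.event = event;
            this.hash = hash;
        }
    }

    // Hash of the first entry's (non-existent) predecessor: 32 zero bytes.
    static final byte[] GENESIS = new byte[32];

    // SHA-256 over the previous hash followed by the event text.
    static byte[] link(byte[] previousHash, String event) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            digest.update(previousHash);
            digest.update(event.getBytes(StandardCharsets.UTF_8));
            return digest.digest();
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException("SHA-256 not available", ex);
        }
    }

    // Appends an event, chaining it to the hash of the last entry.
    static void append(List<Entry> chain, String event) {
        byte[] previous = chain.isEmpty() ? GENESIS : chain.get(chain.size() - 1).hash;
        chain.add(new Entry(event, link(previous, event)));
    }

    // Recomputes every link; returns false if any event or hash no longer matches.
    static boolean verify(List<Entry> chain) {
        byte[] previous = GENESIS;
        for (Entry entry : chain) {
            if (!MessageDigest.isEqual(entry.hash, link(previous, entry.event))) {
                return false;
            }
            previous = entry.hash;
        }
        return true;
    }

    public static void main(String[] args) {
        List<Entry> chain = new ArrayList<>();
        append(chain, "user 1 logged in");
        append(chain, "user 1 signed document 42");
        System.out.println("intact: " + verify(chain));

        // Modify a recorded event without touching its stored hash.
        chain.set(1, new Entry("user 1 signed document 43", chain.get(1).hash));
        System.out.println("after tampering: " + verify(chain));
    }
}
```

Note that someone with write access to the whole chain could recompute every hash after a change, and the result would verify again. The chain alone is therefore only tamper-evident to a party that holds a trusted copy of the latest hash, which is the gap that distributing the structure across independent actors is meant to close.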

Implementing White-Labelling

Sometimes (very often in my experience) you need to support white-labelling of your application. You may normally run it in a SaaS fashion, but some important or high-profile clients may want either a dedicated deployment, or an on-premise deployment, or simply “their corner” on your cloud deployment. White-labelling normally includes different CSS, different logos and other images, and different header and footer texts. The rest of the product stays the same. So how do we support white-labelling in the least invasive way possible? (I will use Spring MVC in my examples, but it’s pretty straightforward to port the logic to other frameworks.)

First, let’s outline the three different ways white-labelling can be supported. You can (and probably should) implement all of them, as they are useful in different scenarios, and have much overlap.

- White-labelled installation – change the styles of the whole deployment. Useful for on-premise or managed installations.
- White-labelled subdomain – allow different styling when the service is accessed through a particular subdomain
- White-labelled client(s) – allow specific customers, after logging in, to see the customized styles

To implement a full white-labelled installation, we have to configure a path on the filesystem where the customized CSS files and images will be placed, as well as the customized texts. Here’s an example from a .properties file passed to the application on startup:

    styling.dir=/var/config/whitelabel
    styling.footer=&copy;2018 Your Company
    styling.logo=/images/logsentinel-logo.png
    styling.css=/css/custom.css
    styling.title=Your Company

In Spring / Spring Boot, you can serve files from the file system if a certain URL pattern is matched. For example:

    @Component
    @Configuration
    public class WebMvcCustomization implements WebMvcConfigurer {

        @Value("${styling.dir}")
        private String whiteLabelDir;

        @Override
        public void addResourceHandlers(ResourceHandlerRegistry...
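For the white-labelled subdomain variant listed above, the core of the logic is mapping the request’s host name to a set of styling values and falling back to the default styling otherwise. The following is a framework-independent sketch of that lookup. The class `WhiteLabelResolverSketch`, the `Styling` type, the `example.com` base domain and the `acme` subdomain are hypothetical names used for illustration only; the fields mirror the `styling.css`, `styling.logo` and `styling.title` properties.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class WhiteLabelResolverSketch {

    // Styling values for one white-labelled variant (mirrors the styling.* properties).
    static final class Styling {
        final String css;
        final String logo;
        final String title;

        Styling(String css, String logo, String title) {
            this.css = css;
            this.logo = logo;
            this.title = title;
        }
    }

    // Picks the styling for the subdomain in the Host header, or the fallback if
    // the host is missing, is not under baseDomain, or has no custom styling.
    static Styling resolve(String hostHeader, String baseDomain,
                           Map<String, Styling> bySubdomain, Styling fallback) {
        if (hostHeader == null) {
            return fallback;
        }
        String host = hostHeader.toLowerCase(Locale.ROOT);
        int colon = host.indexOf(':'); // strip an optional port (IPv6 literals are not handled)
        if (colon >= 0) {
            host = host.substring(0, colon);
        }
        String suffix = "." + baseDomain.toLowerCase(Locale.ROOT);
        if (!host.endsWith(suffix)) {
            return fallback;
        }
        String subdomain = host.substring(0, host.length() - suffix.length());
        return bySubdomain.getOrDefault(subdomain, fallback);
    }

    public static void main(String[] args) {
        Styling standard = new Styling("/css/main.css", "/images/logo.png", "Your Company");
        Map<String, Styling> bySubdomain = new HashMap<>();
        bySubdomain.put("acme", new Styling("/css/acme.css", "/images/acme-logo.png", "ACME"));

        System.out.println(resolve("acme.example.com:8443", "example.com", bySubdomain, standard).title);
        System.out.println(resolve("example.com", "example.com", bySubdomain, standard).title);
    }
}
```

In a Spring MVC application, a lookup like this could plausibly be called from an interceptor or a controller advice that puts the resolved values into the model, but that wiring is left out here.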

5 Features Eclipse Should Copy From IntelliJ IDEA

Eclipse Photon was released a few days ago, and I decided to do yet another comparison with IntelliJ IDEA. Last time I explained why I still prefer Eclipse, but because my current project initially had problems with Java 9 in Eclipse, I’ve been using IntelliJ IDEA for the past half a year. (I’m still using Eclipse for everything else, partly because of the lack of “multiple projects in one workspace” in IDEA.) This time, though, the comparison will be the other way around: which IDEA features I’d really like to have in Eclipse, features that make work much easier and way more efficient. (Btw, what’s the proper short version to use – IntelliJ? IDEA?)

Isn’t that a departure from my stance “Eclipse is better”? No. I don’t believe there’s a perfect IDE (or perfect anything, for that matter), so any product can try to get the best aspects of the competition. Here I’ll focus on five features of IDEA where Eclipse lags behind.

First, the “Find in path” dialog. The interactivity of the dialog, the fact that you see all the results while typing, and being able to navigate the results with the arrow keys are huge. Compare that to Eclipse’s clunky Search dialog, which (while pretty powerful) has a million tabs (rarely focused on the one you need), and then you actually have to click “Search” to get a list of results in a search panel, where you double-click in order to see the context… it’s just bad compared to IDEA.

Second is suggesting static imports. Static imports are not used too often, except in tests. Mockito, Hamcrest, test utility...