by Bozho | Aug 17, 2018 | Aggregated, Developer tips, jackson, JSON, performance
Sometimes you need to export a lot of data to a JSON file. Maybe it’s “export all data to JSON”, or the GDPR “Right to portability”, where you effectively need to do the same.

And as with any big dataset, you can’t just fit it all in memory and write it to a file. The export takes a while, it reads a lot of entries from the database, and you need to be careful that such exports don’t overload the entire system or run out of memory.

Luckily, it’s fairly straightforward to do that with the help of Jackson’s SequenceWriter and, optionally, piped streams. Here’s what it would look like:

    import java.io.PipedInputStream;
    import java.io.PipedOutputStream;
    import java.util.List;
    import java.util.UUID;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import java.util.concurrent.TimeUnit;
    import java.util.zip.GZIPOutputStream;
    import com.fasterxml.jackson.databind.ObjectMapper;
    import com.fasterxml.jackson.databind.ObjectWriter;
    import com.fasterxml.jackson.databind.SequenceWriter;
    import com.google.common.base.Stopwatch;
    import org.springframework.scheduling.annotation.Async;
    import org.springframework.scheduling.annotation.AsyncResult;
    import org.springframework.util.concurrent.ListenableFuture;

    private ObjectMapper jsonMapper = new ObjectMapper();
    private ExecutorService executorService = Executors.newFixedThreadPool(5);

    @Async
    public ListenableFuture<Boolean> export(UUID customerId) {
        try (PipedInputStream in = new PipedInputStream();
                PipedOutputStream pipedOut = new PipedOutputStream(in);
                GZIPOutputStream out = new GZIPOutputStream(pipedOut)) {

            Stopwatch stopwatch = Stopwatch.createStarted();
            ObjectWriter writer = jsonMapper.writer().withDefaultPrettyPrinter();

            // start the consumer first, so it drains the pipe while we write
            Future<?> storageFuture = executorService.submit(() ->
                    storageProvider.storeFile(getFilePath(customerId), in));

            try (SequenceWriter sequenceWriter = writer.writeValues(out)) {
                // wrap the written values in a JSON array
                sequenceWriter.init(true);

                int batchCounter = 0;
                while (true) {
                    List<Record> batch = readDatabaseBatch(batchCounter++);
                    // assumes an empty batch means there is no more data
                    if (batch.isEmpty()) {
                        break;
                    }
                    for (Record record : batch) {
                        sequenceWriter.write(record);
                    }
                }
            }

            // the writer is closed by now, so the consumer sees end-of-stream;
            // wait for storing to complete
            storageFuture.get();

            logger.info("Exporting took {} seconds", stopwatch.stop().elapsed(TimeUnit.SECONDS));
            return AsyncResult.forValue(true);
        } catch (Exception ex) {
            logger.error("Failed to export data", ex);
            return AsyncResult.forValue(false);
        }
    }

The code does a few things:

It uses a SequenceWriter to continuously write records. It is initialized with an OutputStream, to which everything is written. This could be a simple FileOutputStream, or a piped stream as discussed below.
Note that the naming here is a bit misleading – writeValues(out) sounds like you are instructing...
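To see the piped-stream part in isolation, here is a minimal sketch that uses only the JDK (Java 9+ for readAllBytes). It is not the export code itself: the class name PipedExportSketch, the methods exportThroughPipe and gunzip, and the in-memory sink that stands in for storageProvider.storeFile are all assumptions made up for this example, and plain text lines take the place of JSON records.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class PipedExportSketch {

    /** Writes lineCount gzipped lines into a pipe and returns the bytes the consumer collected. */
    public static byte[] exportThroughPipe(int lineCount) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try (PipedInputStream in = new PipedInputStream()) {
            PipedOutputStream pipedOut = new PipedOutputStream(in);

            // Consumer thread: a hypothetical stand-in for a storage provider
            // that reads the stream until end-of-stream.
            Future<byte[]> stored = executor.submit(() -> {
                ByteArrayOutputStream sink = new ByteArrayOutputStream();
                byte[] buffer = new byte[512];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    sink.write(buffer, 0, read);
                }
                return sink.toByteArray();
            });

            // Producer: the current thread writes compressed data into the pipe.
            try (GZIPOutputStream out = new GZIPOutputStream(pipedOut)) {
                for (int i = 0; i < lineCount; i++) {
                    out.write(("record-" + i + "\n").getBytes(StandardCharsets.UTF_8));
                }
            }
            // Closing the gzip stream also closed the pipe, so the consumer
            // reaches end-of-stream and the future can complete.
            return stored.get();
        } finally {
            executor.shutdown();
        }
    }

    /** Decompresses gzip bytes back into a UTF-8 string. */
    public static String gunzip(byte[] compressed) throws IOException {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(gz.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.print(gunzip(exportThroughPipe(3)));
    }
}
```

The ordering is the important bit: the consumer is started first, the compressing stream is closed before waiting on the future, and only then is the result collected. A PipedInputStream has a small fixed buffer (1024 bytes by default), so the writer blocks whenever the reader falls behind, and that is what keeps memory usage flat no matter how many records go through.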