
I'm trying to find the most efficient way to parse a large file and save the results in a database.

The file in question is a 12,500,000-line text file with logs from a server.

A log line looks like this:

[notice] 2021-03-10T16:19:26.102551Z couchdb@127.0.0.1 <0.8999.68> 351ac014dd 87.92.211.148:5984 125.129.113.37 user1 GET /userdb 200 ok 8

I'm parsing the database name (userdb) and the verb (GET).

while ((line = reader.readLine()) != null) {
    String[] data = line.split("\\s+");
    // data[8] = verb ("GET"); data[9] = "/userdb", substring(1) drops the slash
    service.save(new Request(data[8], data[9].substring(1)));
}

The parsing time (under 1 ms) is insignificant compared to the time it takes to save each record to the database (78.6 ms).

I wanted to do this asynchronously, but from what I understand, you can't save records to a database asynchronously.

Any idea which way to go to make it faster?
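For what it's worth, the save can be moved off the reading thread. A minimal producer/consumer sketch with a `BlockingQueue`, where `save` is a hypothetical stand-in for the real `service.save(new Request(...))` call:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class AsyncSaver {
    private static final String POISON = "__EOF__";          // sentinel: no more lines
    private static final AtomicInteger saved = new AtomicInteger();

    // Hypothetical stand-in for service.save(new Request(verb, db)).
    static void save(String verb, String db) {
        saved.incrementAndGet();
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(10_000);

        // Writer thread: parses and saves while the main thread keeps reading.
        Thread writer = new Thread(() -> {
            try {
                for (String line; !(line = queue.take()).equals(POISON); ) {
                    String[] data = line.split("\\s+");
                    save(data[8], data[9].substring(1));     // verb, db without "/"
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();

        // Reader side: in the real program this loop is reader.readLine().
        String sample = "[notice] 2021-03-10T16:19:26.102551Z couchdb@127.0.0.1 "
                + "<0.8999.68> 351ac014dd 87.92.211.148:5984 125.129.113.37 "
                + "user1 GET /userdb 200 ok 8";
        for (int i = 0; i < 5; i++) queue.put(sample);
        queue.put(POISON);

        writer.join();
        System.out.println(saved.get() + " rows saved");
    }
}
```

The bounded queue also gives back-pressure: if the database falls behind, the reader blocks instead of filling the heap with unparsed lines.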

kryzystof
  • I don't know how you can do this in Java, but with MySQL you can insert many rows of data at once, which speeds up inserting data enormously. – KIKO Software Mar 13 '21 at 22:35
  • Try batching your DB operations: https://stackoverflow.com/questions/3784197/efficient-way-to-do-batch-inserts-with-jdbc – tgdavies Mar 13 '21 at 22:36
  • Get parallelism by running multiple threads: one for reading and another for interacting with the DB. – Amit Mar 13 '21 at 23:23
  • You can use Akka (Akka Streams) or Java streams to do that; I have done something like this using Akka in Scala, and it works perfectly! – AminMal Mar 13 '21 at 23:42
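The batching suggested in the comments above can be sketched without the JDBC plumbing. `flush` below is a hypothetical stand-in for one database round trip — a real version would `addBatch()` each row on a `PreparedStatement` and call `executeBatch()` once per flush:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchInsert {
    static final int BATCH_SIZE = 1000;
    static int flushes = 0;

    // Hypothetical stand-in: one flush = one addBatch()/executeBatch() round trip.
    static void flush(List<String[]> rows) {
        flushes++;
        rows.clear();
    }

    public static void main(String[] args) {
        List<String[]> batch = new ArrayList<>();
        String line = "[notice] 2021-03-10T16:19:26.102551Z couchdb@127.0.0.1 "
                + "<0.8999.68> 351ac014dd 87.92.211.148:5984 125.129.113.37 "
                + "user1 GET /userdb 200 ok 8";
        for (int i = 0; i < 2500; i++) {           // pretend we read 2500 log lines
            String[] data = line.split("\\s+");
            batch.add(new String[] { data[8], data[9].substring(1) }); // verb, db name
            if (batch.size() == BATCH_SIZE) flush(batch);
        }
        if (!batch.isEmpty()) flush(batch);        // flush the remainder
        System.out.println(flushes + " flushes");
    }
}
```

At 1000 rows per round trip, 12,500,000 single saves collapse into 12,500 batch executions, which is where the bulk of the 78.6 ms-per-save overhead goes away.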

0 Answers