
Usually I try to dig out information on my problems from the Internet without asking questions, but this time I'm stuck. The problem is that I have a few huge tables containing JSON objects in their rows. The final output is a single JSON string with all of the objects, which is sent to our headquarters as an operational report. One report contains approximately 15 GB of data in a very specific structure:

{header1:value,
 header2:value,
 [...]
 sub_objects:{
  sub_object1:[{row1tab1},{row2tab1}, ... (6 GB of data, 14 million rows)],
  sub_object2:[{row1tab2},{row2tab2}, ... (another 5 GB of data, 6 million rows)],
  sub_object3:[...]
 }}

Now, the problem is that my query builds the whole, complete 15 GB string on the database side, and then it's unloaded to a text file by a Java console app that I wrote some time ago. It's nothing fancy: a simple query, fetching the data as a binary stream directly to a file.

The above process works fine, but the time is killing me (and soon my boss will want to kill me too). It takes approximately 5 hours to build the complete string. What's funny is that if I store the string in the database first, the unloading process takes only 5 minutes. But storing the string in the database still takes 5 hours.

I'm trying to find a different angle here: fetching the data while the query executes on the server side, so that I build the complete string directly in the file rather than on the database side.
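To sketch the idea of assembling the string in the file rather than in the database: each sub-object query could return its rows as already-serialized JSON objects, and the Java app could glue them into an array while writing, holding only one row in memory at a time. The class and method names below are my own, not from the asker's app; in the real tool the `Iterator` would wrap a JDBC `ResultSet` with a generous fetch size:

```java
import java.io.IOException;
import java.io.Writer;
import java.util.Iterator;

public class JsonArrayStreamer {
    // Joins pre-serialized JSON object strings into one JSON array on the fly.
    // Only the current row is held in memory, so a multi-GB sub-object can be
    // written to a file without ever building the whole string first.
    public static void writeArray(Iterator<String> rows, Writer out) throws IOException {
        out.write('[');
        boolean first = true;
        while (rows.hasNext()) {
            if (!first) out.write(',');
            out.write(rows.next());
            first = false;
        }
        out.write(']');
    }
}
```

With a `BufferedWriter` over the target file and `Statement.setFetchSize` on the query, the rows stream from the database straight to disk and the 5-hour server-side string build is skipped entirely.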

Any help would be much appreciated! If you need any extra info on this, don't hesitate to ask; I'll provide as much as I can.

Thank you in advance! Tom

  • If I understand this correctly, you already have the data in the database, correct? If so, I would definitely try to use existing database infrastructure to generate the file instead of writing something on your own. Ask Tom has a couple of good suggestions here: https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:9536328100346697722 – Erik Finnman Feb 15 '21 at 09:37
  • Hi Erik and thank you for the response. Unfortunately, we have very strict policies regarding server resources and I can't create a file on the server side :( That's why I'm forced to use my own tool. – T.Fruba Feb 15 '21 at 10:05
  • Okay, I've managed to get the data out of the database and unload it to multiple files (sub_objects1.json, sub_objects2.json, etc.). I also have a main header object in a file (header.json). Now the task is to merge those files while keeping the report's structure. In the header file I have %SUBOBJECT1% etc. in the places where the contents of those files should go. Any ideas? – T.Fruba Feb 24 '21 at 09:17
  • 1
    Sounds like you need to read each file and the header, and then do a token replacement where you replace %SUBOBJECT1% with the contents of your files. The Scanner class is one way to do it. – Erik Finnman Feb 24 '21 at 12:23
  • @ErikFinnman, thank you for the advice. I have one doubt: wouldn't using Scanner to replace the token load the whole file content into memory? My files are quite big, and I'd really love to avoid OOMs etc. And BTW, those files are single-line. – T.Fruba Feb 25 '21 at 08:44
  • No, I think the Scanner uses a buffer to avoid loading the entire file into memory. See e.g. https://stackoverflow.com/questions/2231369/scanner-vs-bufferedreader – Erik Finnman Feb 25 '21 at 09:01
  • Great! Now I just need to figure out how to replace the token with the contents of another file without loading it into memory. And to be honest, I have no clue where to start ;) – T.Fruba Feb 25 '21 at 10:46
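One memory-safe sketch of the merge step discussed in the comments above: only the header file is small, so read it as a template, then stream each huge sub-object file straight into the output wherever its %SUBOBJECT...% token appears. The class name and token-to-file map are illustrative assumptions; `Files.copy` transfers the part files in buffered chunks, so they are never loaded as strings:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReportMerger {
    // Matches placeholders like %SUBOBJECT1% in the header template.
    private static final Pattern TOKEN = Pattern.compile("%([A-Z0-9_]+)%");

    // header: the small template file containing the tokens.
    // parts: maps token names (without the % signs) to the big part files.
    public static void merge(Path header, Map<String, Path> parts, OutputStream out)
            throws IOException {
        String template = Files.readString(header, StandardCharsets.UTF_8);
        Matcher m = TOKEN.matcher(template);
        int last = 0;
        while (m.find()) {
            // Emit the template text before the token.
            out.write(template.substring(last, m.start()).getBytes(StandardCharsets.UTF_8));
            Path part = parts.get(m.group(1));
            if (part != null) {
                Files.copy(part, out); // streams the part file; no full read into memory
            } else {
                // Unknown token: keep it as-is rather than drop content.
                out.write(m.group().getBytes(StandardCharsets.UTF_8));
            }
            last = m.end();
        }
        out.write(template.substring(last).getBytes(StandardCharsets.UTF_8));
    }
}
```

Wrapping `out` in a `BufferedOutputStream` over the final report file keeps memory usage flat regardless of how large the sub-object files are.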

0 Answers