
I have a website that accepts files of up to 10 MB from users. It sends information about each file in two WebSocket messages: the first contains the name and time of sending, the second the file's ArrayBuffer. This all works well in JavaScript.

const reader = new FileReader();
let inFiles = document.getElementById("inFiles");
let socket = new WebSocket("ws://" + document.location.host + "/speaker");

...

function goodDate(date) {
    return date.getUTCFullYear() + "-" +
        ("0" + (date.getUTCMonth() + 1)).slice(-2) + "-" +
        ("0" + date.getUTCDate()).slice(-2) + "T" +
        ("0" + date.getUTCHours()).slice(-2) + ":" +
        ("0" + date.getUTCMinutes()).slice(-2) + ":" +
        ("0" + date.getUTCSeconds()).slice(-2) + "." +
        (date.getUTCMilliseconds()+"00").slice(-3) + "+0000";
}

function send() {
    ...

    function readFile(i) {
        let file = inFiles.files[i];
        let gd = goodDate(new Date());
        if (file.size < 10485760) {
            addMessage({ code: 1, text: "File " + file.name + " is being sent, do not close the page until a message with its name appears", time: gd }); // Displays information about the file being sent on the user's page
            socket.send(JSON.stringify({ code: 2, text: file.name, time: gd }));
            reader.onload = function(e) {
                socket.send(e.target.result);
                if (i + 1 < inFiles.files.length) { // was `i < inFiles.files.length`, which called readFile past the last index
                    readFile(i + 1);
                }
            };
            reader.readAsArrayBuffer(file); // start the read after the handler is attached
        } else {
            addMessage({code: 0, text: "Error sending the file, exceeded the maximum size of 10 MB", time: gd});
        }
    }

    if (inFiles.files.length > 0){
        readFile(0);
    }

    ...
}

I also have a server running Ubuntu 22.04, with only 1 core and 1 GB of RAM. The server code is written in Go. It receives these messages, builds a query from them, and sends it to the Cassandra database. This also works fine.

type Message struct {
    Code int    `json:"code"`
    Text string `json:"text"`
    Time string `json:"time"`
}

func save(chatID string, mess Message, seconddata []byte) {
    // mess holds the file's name and send time; seconddata is the file's ArrayBuffer as received in Go (a []byte)

    mtype := false // false means a text message
    if mess.Code == 2 {
        mtype = true // true means a file
    }
    thetime, err := time.Parse(timeLayout, mess.Time)
    if err != nil {
        log.Panic(err)
    }
    err = ExecuteQuery("INSERT INTO messes (chatID, type, mess, date, seconddata) VALUES (?,?,?,?,?)", chatID, mtype, mess.Text, thetime, seconddata)
    if err != nil {
        log.Panic(err)
    }
    byteMess, err := json.Marshal(mess)
    if err != nil {
        log.Panic(err)
    }
    WriteMessageToAll(chatID, byteMess) // Send all users in chat name and date of message
}

But Cassandra crashes after large requests. Its systemd status shows Active: failed (Result: oom-kill). In the Go server's logs it looks like this:

2023/06/28 15:17:08 Client: ip1.ip2.ip3.ip4:port, chat: dne12. Received file name: Video.mp4
2023/06/28 15:17:13 gocql: unable to dial control conn 127.0.0.1:9042: dial tcp 127.0.0.1:9042: connect: connection refused
2023/06/28 15:17:13 gocql: control unable to register events: dial tcp 127.0.0.1:9042: connect: connection refused
2023/06/28 15:17:13 gocql: no hosts available in the pool
2023/06/28 15:17:13 http: panic serving ip1.ip2.ip3.ip4:port: gocql: no hosts available in the pool

I tried to change cassandra-env.sh; here are my changes:

system_memory_in_mb="1024"
system_cpu_cores="1"

MAX_HEAP_SIZE="512M"
max_sensible_yg_per_core_in_mb="128"

I calculated these values according to the formulas written in that file. How can I fix this? I would think this database can store a 10 MB file, can't it?

1 Answer


But Cassandra crashes after large requests.

I've seen folks run Cassandra on small amounts of system resources, and this is always the problem.

I counted these values according to the formulas that are written in that file.

First, there's more going into the Java heap than your 10 MB records. In fact, the 10 MB records are going into the new generation area of the heap, which is probably not nearly as large as it needs to be.

Next, the advice given in that file was written in 2011, and is fairly out of date. Especially this part:

# ...go with
# 100 MB per physical CPU core.

The ticket CASSANDRA-8150 was an attempt to rectify some of this guidance. It was never resolved but remains as a trove of information on ways to get the most out of CMS GC.

I should note that I'm assuming that you're using CMS GC here. G1 wouldn't run very well on a 1/2 GB heap.

Out of that ticket came the recommendation to allocate anywhere from 1/3 to 1/2 of the Java heap to the new generation. By default, the new generation (HEAP_NEWSIZE) is computed as 25% of the heap (MAX_HEAP_SIZE). In your case, that comes out to 128 MB, which is just not enough.

Given the current hardware resources, I'd say you should double that by explicitly setting it to 256 MB.
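In cassandra-env.sh terms, that would mean setting the new gen size explicitly alongside your existing heap setting (HEAP_NEWSIZE only applies under CMS GC; G1 ignores it):

```shell
MAX_HEAP_SIZE="512M"
HEAP_NEWSIZE="256M"
```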

What's probably happening here, is that the Cassandra node's tiny memory is getting overwhelmed before garbage collection can even be run. The easiest way to solve this is to add more RAM. A 1 GB instance with a 512 MB heap is seriously underpowered for Cassandra.

Anyway, give that ticket a read and see if there are any additional improvements that you can make. Make sure that you read it thoroughly, though. Applying bits and pieces of settings here and there is probably going to cause more trouble. In your case, I'd suspect that a larger new gen with a short tenuring threshold should help get things through GC quicker, but then you'll probably be susceptible to GC pauses.

Or... increase the system RAM to 8 GB and bump MAX_HEAP_SIZE to at least 4 GB. Even then, I'd still go with a CMS GC HEAP_NEWSIZE of half of that (2 GB).
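With that upgrade, the corresponding cassandra-env.sh settings would be (again assuming CMS GC):

```shell
MAX_HEAP_SIZE="4G"
HEAP_NEWSIZE="2G"
```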

Aaron