Apache Cassandra with go server on low resources

Question
I have a website that accepts files up to 10 MB from users. It sends information about each file over a WebSocket in two messages - the first contains the name and send time, the second the file's ArrayBuffer. This all works well in JavaScript:
const reader = new FileReader();
let inFiles = document.getElementById("inFiles");
let socket = new WebSocket("ws://" + document.location.host + "/speaker");
...
// Formats a Date as "YYYY-MM-DDTHH:MM:SS.mmm+0000" (UTC)
function goodDate(date) {
    return date.getUTCFullYear() + "-" +
        ("0" + (date.getUTCMonth() + 1)).slice(-2) + "-" +
        ("0" + date.getUTCDate()).slice(-2) + "T" +
        ("0" + date.getUTCHours()).slice(-2) + ":" +
        ("0" + date.getUTCMinutes()).slice(-2) + ":" +
        ("0" + date.getUTCSeconds()).slice(-2) + "." +
        ("00" + date.getUTCMilliseconds()).slice(-3) + "+0000";
}
function send() {
    ...
    function readFile(i) {
        let file = inFiles.files[i];
        let gd = goodDate(new Date());
        if (file.size < 10485760) {
            // Shows a notice on the user's page that the file is being sent
            addMessage({ code: 1, text: "File " + file.name + " sending, do not close the page until a message with its name appears", time: gd });
            socket.send(JSON.stringify({ code: 2, text: file.name, time: gd }));
            // Assign the handler before starting the read
            reader.onload = function (e) {
                socket.send(e.target.result);
                if (i + 1 < inFiles.files.length) {
                    readFile(i + 1);
                }
            };
            reader.readAsArrayBuffer(file);
        } else {
            addMessage({ code: 0, text: "Error sending the file, exceeded the maximum size of 10 MB", time: gd });
        }
    }
    if (inFiles.files.length > 0) {
        readFile(0);
    }
    ...
}
I also have a server on Ubuntu 22.04 with only 1 core and 1 GB of RAM. The server code is written in Go. It receives these messages, combines them into a single query, and sends it to the Cassandra database. This also works fine:
type Message struct {
    Code int    `json:"code"`
    Text string `json:"text"`
    Time string `json:"time"`
}

// mess carries the file name and send time; seconddata is the file's
// ArrayBuffer as received in Go (i.e. a []byte)
func save(chatID string, mess Message, seconddata []byte) {
    mtype := false // message is plain text, not a file
    if mess.Code == 2 {
        mtype = true // message is a file
    }
    thetime, err := time.Parse(timeLayout, mess.Time)
    if err != nil {
        log.Panic(err)
    }
    err = ExecuteQuery("INSERT INTO messes (chatID, type, mess, date, seconddata) VALUES (?,?,?,?,?)", chatID, mtype, mess.Text, thetime, seconddata)
    if err != nil {
        log.Panic(err)
    }
    byteMess, err := json.Marshal(mess)
    if err != nil {
        log.Panic(err)
    }
    WriteMessageToAll(chatID, byteMess) // broadcast the message name and date to all users in the chat
}
But Cassandra crashes after large requests. systemctl shows it as Active: failed (Result: oom-kill). In the Go server's logs it looks like this:
2023/06/28 15:17:08 Client: ip1.ip2.ip3.ip4:port, chat: dne12. Received file name: Video.mp4
2023/06/28 15:17:13 gocql: unable to dial control conn 127.0.0.1:9042: dial tcp 127.0.0.1:9042: connect: connection refused
2023/06/28 15:17:13 gocql: control unable to register events: dial tcp 127.0.0.1:9042: connect: connection refused
2023/06/28 15:17:13 gocql: no hosts available in the pool
2023/06/28 15:17:13 http: panic serving ip1.ip2.ip3.ip4:port: gocql: no hosts available in the pool
I tried changing cassandra-env.sh; here are my changes:
system_memory_in_mb="1024"
system_cpu_cores="1"
MAX_HEAP_SIZE="512M"
max_sensible_yg_per_core_in_mb="128"
I calculated these values according to the formulas written in that file.
How can I fix this? This database should be able to store a 10 MB file, shouldn't it?
Answer 1
Score: 0
> But Cassandra crashes after large requests.
I've seen folks run Cassandra on small amounts of system resources, and this is always the problem.
> I counted these values according to the formulas that are written in that file.
First, there's more going into the Java heap than your 10 MB records. In fact, the 10 MB records are going into the new generation area of the heap, which is probably not nearly as large as it needs to be.
Next, the advice given in that file was written in 2011, and is fairly out of date. Especially this part:
# ...go with
# 100 MB per physical CPU core.
The ticket CASSANDRA-8150 was an attempt to rectify some of this guidance. It was never resolved, but it remains a trove of information on ways to get the most out of the CMS GC.
I should note that I'm assuming that you're using CMS GC here. G1 wouldn't run very well on a 1/2 GB heap.
In that ticket came the recommendation of allocating anywhere from 1/3 to 1/2 of the Java heap to the new generation. By default, the new generation (HEAP_NEWSIZE) is computed as 25% of the heap (MAX_HEAP_SIZE). In your case, that comes out to 128 MB, which is just not enough.
Given the current hardware resources, I'd say you should double that by explicitly setting it to 256 MB.
What's probably happening here, is that the Cassandra node's tiny memory is getting overwhelmed before garbage collection can even be run. The easiest way to solve this is to add more RAM. A 1 GB instance with a 512 MB heap is seriously underpowered for Cassandra.
Anyway, give that ticket a read and see if there are any additional improvements you can make. Make sure you read it thoroughly, though; applying bits and pieces of settings here and there is likely to cause more trouble. In your case, I'd suspect that a larger new gen with a short tenuring threshold should help get things through GC quicker, but then you'll probably be susceptible to GC pauses.
Or...increase the system RAM to 8 GB and bump the heap MAX_HEAP_SIZE to at least 4 GB. And even then, I'd still go with a CMS GC HEAP_NEWSIZE of half of it (2 GB).
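Concretely, the two options above would look something like this in cassandra-env.sh (a sketch; set the variables in your own copy rather than pasting this verbatim):

```shell
# Option 1: stay on the current 1 GB box, but double the new generation
MAX_HEAP_SIZE="512M"
HEAP_NEWSIZE="256M"

# Option 2: after upgrading the box to 8 GB of RAM
# MAX_HEAP_SIZE="4G"
# HEAP_NEWSIZE="2G"
```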