High Heap usage after a while

Hi, I have built a cluster with 3 nodes of 8 cores and 8 GB each; the CRATE_HEAP_SIZE is set to 2G.

I have used the cluster for some days for ingesting and retrieving JSONs, and now the cluster reports:

SQL Error [XX000]: ERROR: out_of_memory_error: Java heap space

The heap shown in the GUI is still full, even though I am not using the cluster any more.

What is this heap used for?
How can I prevent this situation? Is 2 GB of heap too low?
Is there a way to empty the heap?

Regards,
S.

What is this heap used for?

Some state data for tables, intermediate results, and various other objects.

How can I prevent this situation? Is 2 GB of heap too low?

It depends. With the right setup you can store and index quite a lot of data.

Is there a way to empty the heap?

This is done automatically by the JVM garbage collector, as long as the unused objects can be freed.
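If you want to check heap usage from SQL rather than the GUI, CrateDB's sys.nodes table exposes per-node heap figures; a minimal example:

SELECT name,
       heap['used'] AS heap_used_bytes,
       heap['max'] AS heap_max_bytes
FROM sys.nodes;

If heap['used'] stays close to heap['max'] even while the cluster is idle, the retained objects are not eligible for collection.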


This seems a bit like an overload situation.

Could you give some context on how much data you ingested and what the table schemas look like?
How many shards did you create?
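You can check the shard count directly, for example via the sys.shards table:

SELECT schema_name, table_name, count(*) AS num_shards, sum(num_docs) AS total_docs
FROM sys.shards
GROUP BY schema_name, table_name;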

You could increase the heap to ~50% of the node memory.
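For example, on a package-based install (an assumption about your setup) the heap size is set via the CRATE_HEAP_SIZE environment variable, typically in /etc/default/crate, followed by a node restart:

# ~50% of an 8 GB node
CRATE_HEAP_SIZE=4G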

Hi,

Yes, of course. I have 3 nodes, each with 3 x 1 TB SSD disks.
I have 3 tables created in this way:

create table stats (
	metric object(dynamic) as (
		"eventTimestamp" timestamp with time zone
	)
) clustered into 9 shards
with (number_of_replicas = 1);

create table test (
	metric object(dynamic) as (
		"eventTimestamp" timestamp with time zone
	)
) clustered into 9 shards
with (number_of_replicas = 1);

Every object has more or less 20 fields.
The tables have about 3.5 million rows, but we expect to reach something like 50 million.

What do you think?

Regards,
S.

Could you run the following query:

SELECT count(*) FROM information_schema.columns
WHERE table_name IN ('test', 'stats');

As mentioned above, I would increase the heap to 50% of system memory (or increase the system memory).
There is a certain minimum amount of heap needed to keep information about Lucene indexes / tables in memory; however, this varies largely depending on the schemas used.
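As a side note: with object(dynamic), every new key that shows up in the ingested JSON becomes a typed schema column, so the column count (and the per-column metadata held on the heap) can grow with the data. If the query above returns a very large count, one option to consider, sketched here with a hypothetical table name, is declaring the payload as object(ignored), so only explicitly defined sub-columns are added to the schema:

create table stats_v2 (
	metric object(ignored) as (
		-- "eventTimestamp" stays typed and indexed
		"eventTimestamp" timestamp with time zone
	)
) clustered into 9 shards
with (number_of_replicas = 1);

Keys inside metric other than "eventTimestamp" are still stored and queryable, but no longer create new schema columns.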