Table reindexing after version upgrade

Today we updated our CrateDB cluster from version 4.8 to 5.0.1 and a new cluster check has appeared, telling us “The following tables need to be recreated for compatibility with future major versions of CrateDB”. The link in the message points to the documentation (System information — CrateDB: Reference), which says I need to reindex all tables to maintain compatibility with future versions.
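
For reference, the unresolved cluster checks can also be listed via the sys.checks system table; a query along these lines (a sketch, using the documented sys.checks columns) shows them:

SELECT id, severity, description FROM sys.checks WHERE passed = false;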

Following the instructions, I first created a new temporary table (demo.ts_real_raw2) using the output of SHOW CREATE TABLE on the current table:

CREATE TABLE IF NOT EXISTS "demo"."ts_real_raw" (
   "real_id" BIGINT NOT NULL,
   "date" TIMESTAMP WITHOUT TIME ZONE NOT NULL,
   "value" DOUBLE PRECISION,
   "value_text" TEXT,
   "modification_date" TIMESTAMP WITHOUT TIME ZONE GENERATED ALWAYS AS _cast(_cast(current_timestamp, 'timestamp without time zone'), 'timestamp without time zone'),
   "year" INTEGER GENERATED ALWAYS AS EXTRACT(YEAR FROM "date") NOT NULL,
   PRIMARY KEY ("real_id", "date", "year")
)
CLUSTERED INTO 4 SHARDS
WITH (
   "allocation.max_retries" = 5,
   "blocks.metadata" = false,
   "blocks.read" = false,
   "blocks.read_only" = false,
   "blocks.read_only_allow_delete" = false,
   "blocks.write" = false,
   codec = 'default',
   column_policy = 'dynamic',
   "mapping.total_fields.limit" = 1000,
   max_ngram_diff = 1,
   max_shingle_diff = 3,
   number_of_replicas = '0-1',
   "routing.allocation.enable" = 'all',
   "routing.allocation.total_shards_per_node" = -1,
   "store.type" = 'fs',
   "translog.durability" = 'REQUEST',
   "translog.flush_threshold_size" = 536870912,
   "translog.sync_interval" = 5000,
   "unassigned.node_left.delayed_timeout" = 60000,
   "write.wait_for_active_shards" = '1'
)

Then I changed the order of the steps from the documentation, since we are ingesting data live: first I swapped the tables and then tried to copy the data.

alter cluster swap table demo.ts_real_raw2 to demo.ts_real_raw;

And then executed the insert query:

insert into demo.ts_real_raw (real_id, date, value, value_text) (select real_id, date, value, value_text from demo.ts_real_raw2);

I have omitted the modification_date and year columns from the INSERT since they are generated automatically.
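
As a quick sanity check that the generated columns really are filled in automatically in the new table, a few copied rows can be inspected with something like (sketch):

SELECT real_id, date, year, modification_date FROM demo.ts_real_raw LIMIT 5;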

After executing the insert query, around 800 records were copied, but the task has now been stuck for several hours. The cluster activity (disk, memory and CPU) is at normal working levels (no overload), so it seems to be doing nothing.

Listing active jobs shows the insertion job:

select * from sys.jobs;
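
The low-level operations belonging to the job can be inspected as well (sketch; '<job_id>' is a placeholder for the id shown in sys.jobs):

SELECT name, started, used_bytes FROM sys.operations WHERE job_id = '<job_id>';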

The original table has 83.1 million records and is about 4.3 GiB. There is no error in the CrateDB logs. I have tried running the query several times and all attempts appear in the job list, but the record count does not increase.

How can I debug what is happening?

Best regards,

More info: I am using the web interface to execute the queries. KILL commands do not terminate the jobs; after restarting the cluster and trying again the same thing happens, only a few records are copied and the job seems stuck.
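
The kill attempts were of this form (sketch; '<job_id>' is a placeholder for the id shown in sys.jobs):

KILL '<job_id>';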

I have been doing more tests, this time with COPY TO and COPY FROM. I am able to export correctly to JSON files using:

COPY demo.ts_real_raw2 TO DIRECTORY '/snapshot_repo/export/' with (wait_for_completion=false);

Then I tried an import using:

COPY demo.ts_real_raw3 FROM 'file:///snapshot_repo/export/ts_real_raw2_0_.json' with (validation=false) return summary;

demo.ts_real_raw3 is a new, empty table. I tried importing only one of the 4 files the export generates (the original table has 4 shards), but the destination table still contains zero records and the query does not finish after a couple of hours, so I can't get the summary. I can see it stuck in the sys.jobs table.

All these tests were done using the Admin UI and the crate user.

Best regards,

Hey @iames,
could you please add fail_fast=true (COPY FROM — CrateDB: Reference) as an option to the COPY FROM statement to see whether it stops the query with an error?
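
For example, based on the earlier statement (a sketch, adjust the path as needed):

COPY demo.ts_real_raw3 FROM 'file:///snapshot_repo/export/ts_real_raw2_0_.json' WITH (fail_fast = true, validation = false) RETURN SUMMARY;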
It may be related to the overload protection, which can be observed in the log file by adding the following entries to your log4j2.yml file:

logger.indexing.name = io.crate.execution.engine.indexing
logger.indexing.level = debug

Or by adjusting the logger at runtime:

SET GLOBAL TRANSIENT "logger.io.crate.execution.engine.indexing" = 'DEBUG';

Log entries like:
.. reached maximum concurrent operations for node ...
should appear if the overload protection is hit.
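
The currently effective limits can be inspected on sys.cluster, assuming the overload_protection.dml.* settings are exposed there like other cluster settings (sketch):

SELECT settings['overload_protection']['dml'] FROM sys.cluster;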

After adding those lines to log4j2.yml I can see the following logging information:

[2022-09-29T14:41:01,217][DEBUG][i.c.e.e.i.ShardingUpsertExecutor] [crate-dev-bemp-0] Creating smaller bulk requests because shardedRequests is using too much memory. ramBytesUsed=65856 itemsByShard=1 itemSize=65856 maxBytesUsableByShardedRequests=65536
[2022-09-29T14:41:01,218][DEBUG][i.c.e.e.i.ShardingUpsertExecutor] [crate-dev-bemp-0] Creating smaller bulk requests because shardedRequests is using too much memory. ramBytesUsed=65856 itemsByShard=1 itemSize=65856 maxBytesUsableByShardedRequests=65536
[2022-09-29T14:41:01,219][DEBUG][i.c.e.e.i.BatchIteratorBackpressureExecutor] [crate-dev-bemp-0] Pausing consumption jobId=0df66786-3c7e-1af4-a9e0-b0d90672b8db delayInMs=10
[2022-09-29T14:41:01,230][DEBUG][i.c.e.e.i.ShardingUpsertExecutor] [crate-dev-bemp-0] Creating smaller bulk requests because shardedRequests is using too much memory. ramBytesUsed=65856 itemsByShard=1 itemSize=65856 maxBytesUsableByShardedRequests=65536
[2022-09-29T14:41:01,241][DEBUG][i.c.e.e.i.ShardingUpsertExecutor] [crate-dev-bemp-0] Creating smaller bulk requests because shardedRequests is using too much memory. ramBytesUsed=65856 itemsByShard=1 itemSize=65856 maxBytesUsableByShardedRequests=65536
[2022-09-29T14:41:01,252][DEBUG][i.c.e.e.i.ShardingUpsertExecutor] [crate-dev-bemp-0] Creating smaller bulk requests because shardedRequests is using too much memory. ramBytesUsed=65856 itemsByShard=1 itemSize=65856 maxBytesUsableByShardedRequests=65536

The last message is repeated indefinitely at high speed. I don't know what it means or how to solve it. Any clue?

Best regards,

These log messages indicate that the COPY operation will slow down (using fewer items per batched write request to the target shard) because less heap memory is available, as most of it is already in use by other (ongoing read and write) operations.

CrateDB uses circuit breakers to account for the memory currently in use and to bail out if too much is used, in order to avoid crashing the process with out-of-memory errors.
Most operations use the QUERY circuit breaker, see https://crate.io/docs/crate/reference/en/5.0/config/cluster.html#query-circuit-breaker.
If monitoring shows that, despite these logs, the heap usage of your nodes is not completely saturated, you could increase indices.breaker.query.limit at your own risk.
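
Heap usage per node can be checked with a query along these lines (sketch):

SELECT name, heap['used'], heap['max'] FROM sys.nodes;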

Instead of adjusting the breakers, I'd suggest one of the following:

  • increase your heap, if possible and if it is not already > 25% (or at the very most > 50%) of the available RAM
  • scale up your cluster to distribute the shards across more nodes and thus lower the heap pressure per node
  • lower the read/write load until the import is done
  • or just wait until the records are imported.

The cluster is a single-node CrateDB instance running on Kubernetes with 4 GB of heap memory configured. I have increased it to 25 GB and I am still unable to load the exported data. The problem is that more than one hour after issuing the COPY FROM command the table still contains zero records. It is not a simple slowdown.

Currently 565 shards are started and the read/write load is very light, only a few queries per minute.

For example, I launched the insert query (in this case an INSERT ... SELECT) yesterday and it is still running after more than 17 hours.

The record count (after issuing a refresh table query) is:

Update:

I have been doing more tests and I found that indices.breaker.query.limit was configured as -1. According to the documentation the default value is 60%, so I reconfigured it using:

SET GLOBAL PERSISTENT "indices.breaker.query.limit" = '60%';
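
To double-check, the effective value can be read back from the cluster settings (sketch):

SELECT settings['indices']['breaker']['query']['limit'] FROM sys.cluster;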

Now both queries (COPY FROM and INSERT ... SELECT) work. Thank you @smu, I wouldn't have been able to fix it without your help.


@iames Great to hear that you've solved the issue!
It also highlights a bug: as per our documentation, -1 should disable breaking while still accounting for the memory. It seems -1 was treated in the wrong way inside the code. We are working on a fix, see Fix an issue wheras a `-1` breaker limit was interpreted wrongly by seut · Pull Request #13096 · crate/crate · GitHub.
Thank you for bringing this up.
