Help understanding the cluster aspects

I’m testing CrateDB 4.5.0 as a solution for our data analysis needs. I’m running 3 servers, all on a private VLAN. server-01 is the master right now and is importing a large dataset. server-02 and server-03 are part of the cluster, but are not importing anything. I upgraded the kernel on server-03 and figured that, since it is part of a cluster, I could just reboot it to load the new kernel. server-01 stopped the import because server-03 left the cluster. Here’s the error message:

COPY test_client_connections FROM 'file:///opt/data/working/*client_connections.csv.gz' WITH (format = 'csv', compression = 'gzip') RETURN SUMMARY;
JobKilledException[Job killed. Participating node=server-03 disconnected.]

I found this in the docs, but it doesn’t seem to work as I expected. I have to shut down cratedb on the node first, which causes the above error on the other nodes.

Is there some doc I’m missing about how to do this?


What do you mean by there being no import running on server-02 and server-03?
If they formed a cluster, data is most likely also being written to server-02 and server-03.
Also, the COPY statement, unless nodes are specifically excluded, will run on all nodes in the cluster [1] :wink:
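
As a rough sketch of what "specifically excluded" can look like: COPY FROM accepts a `node_filters` option that restricts which nodes try to read the source files. The statement below reuses the table and path from the question and assumes the node's name in the cluster really is `server-01` (the filter value is treated as a regular expression on the node name). Note that this only limits where the files are read; the imported rows are still written to shards on whichever nodes hold them, so losing a node mid-import can still kill the job.

```sql
-- Sketch: read the source files on server-01 only.
-- Assumption: the node name matches 'server-01'.
COPY test_client_connections
FROM 'file:///opt/data/working/*client_connections.csv.gz'
WITH (
    format = 'csv',
    compression = 'gzip',
    node_filters = {name = 'server-01'}
)
RETURN SUMMARY;
```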

If you want to remove a node from the cluster, it is always best to use a graceful stop, e.g. with the DECOMMISSION statement [2].
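
A minimal sketch of such a graceful stop, assuming the node to reboot is named `server-03` and that you want all of its shards moved elsewhere before it goes down. The `cluster.graceful_stop.*` values shown are illustrative choices, not recommendations, and it is probably safest to wait for a running import to finish first, since this thread does not confirm how a COPY in progress behaves during a decommission.

```sql
-- Assumption: keep all shards available while the node drains
-- (other documented values are 'primaries' and 'none').
SET GLOBAL TRANSIENT "cluster.graceful_stop.min_availability" = 'full';

-- Assumption: give shard relocation up to 2 hours before giving up.
SET GLOBAL TRANSIENT "cluster.graceful_stop.timeout" = '2h';

-- Gracefully stop the node; accepts a node name or node id.
ALTER CLUSTER DECOMMISSION 'server-03';
```

Once the node has stopped, it can be rebooted and CrateDB started again so that it rejoins the cluster.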


[1] COPY FROM — CrateDB: Reference
[2] ALTER CLUSTER — CrateDB: Reference

Thanks for the pointers. The import is literally only running on server-01, via crash inside tmux. The gzipped source CSV files only exist on server-01. I can see the tables/shards spread across the cluster. I just expected the loss of 1 of 3 servers to be handled gracefully. As in, “1 of 3 is gone, OK, carry on until it’s back”.

I’ll run the ALTER CLUSTER statement before rebooting cluster members.