Is it possible to have zero-downtime major version upgrades?

Emre_Sevinc · September 14, 2021, 8:01am

Hello,

In your documentation at Full restart upgrade — CrateDB: How-Tos I see that upgrading to a new major version (feature version) requires stopping all the CrateDB nodes, upgrading each one and restarting them.

I wonder if it’s possible to do a zero-downtime upgrade, e.g. by first having another cluster as a replica of the current cluster, redirecting traffic to it via a load balancer, and then, once the ‘main’ cluster has been upgraded, redirecting the traffic back to it and making it catch up with the other cluster (in case new data arrived while the ‘main’ cluster was down).

Do you have any plans to support something like this out of the box?

proddata · September 15, 2021, 9:24am

Complete cluster restarts are typically only needed with changes in the underlying storage engine. This is typically done together with a “Major” version (i.e. the next major would be CrateDB 5.0.0). The last major version upgrade 4.0.0 happened in June 2019. The next major version is planned for 2022.

Upgrades to “Minor” versions (e.g. 4.1, 4.2, …) that include new features can be done as rolling upgrades.

Logical Replication (similar to Postgres) will be part of CrateDB 4.7. However, I am not quite sure how this would help with upgrades.

What we sometimes see (especially with cloud infrastructure) is that customers start a second cluster and restore a recent snapshot there. With a message queue like Kafka/Eventhub/Kinesis, one would create a second consumer group that feeds data into the second cluster. Together with a switch in the load balancer, this basically allows a zero-downtime major upgrade.
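The second-cluster approach described above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions, not CrateDB or Kafka client code: `EventLog` stands in for a topic, `Consumer` for a consumer group with its own offset, and the two dicts for the old and new clusters. All of these names are hypothetical. It also assumes that writes are idempotent upserts by key, so replaying a few events that are already in the restored snapshot does no harm.

```python
from dataclasses import dataclass, field


@dataclass
class EventLog:
    """Hypothetical stand-in for a message queue topic (append-only)."""
    events: list = field(default_factory=list)

    def append(self, event):
        self.events.append(event)


@dataclass
class Consumer:
    """Hypothetical stand-in for one consumer group with its own offset."""
    log: EventLog
    sink: dict       # stand-in for a table in one cluster, keyed by event id
    offset: int = 0

    def poll(self, max_events=100):
        batch = self.log.events[self.offset:self.offset + max_events]
        for event in batch:
            # Upsert by key: replaying an event leaves the same row behind.
            self.sink[event["id"]] = event
        self.offset += len(batch)
        return len(batch)

    def lag(self):
        return len(self.log.events) - self.offset


log = EventLog()
old_cluster, new_cluster = {}, {}
old_group = Consumer(log, old_cluster)

# Normal operation: only the old cluster is being fed.
for i in range(5):
    log.append({"id": i, "value": i * 10})
old_group.poll()

# "Restore a recent snapshot" into the new cluster (here: a plain copy).
new_cluster.update(old_cluster)

# The second consumer group starts a little before the snapshot point.
# The overlap is replayed, which is safe because writes are upserts.
new_group = Consumer(log, new_cluster, offset=3)

# New data keeps arriving while both clusters run side by side.
for i in range(5, 8):
    log.append({"id": i, "value": i * 10})
old_group.poll()
new_group.poll()

# Switch the load balancer only once the new cluster has caught up.
active = new_cluster if new_group.lag() == 0 else old_cluster
```

The part that matters is the check before the switch: traffic only moves once the second consumer group reports no lag, so both clusters hold the same data at cutover. In a real setup the lag would come from the message queue's own tooling and the copy would be an actual snapshot restore.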

Do you have any plans to support something like this out of the box?

We don’t really see a short downtime every 2-3 years as a big issue right now, and as mentioned above, for use cases that really need it, zero downtime can already be achieved with a second cluster for many workloads.

mfussenegger · September 21, 2021, 1:37pm

To add to this - a rolling upgrade from 4.x to 5.x will likely be supported.

We were missing some internal infrastructure to support it before 4.0.
