Cluster checks failing: Recreate tables

I recently upgraded our Crate cluster from v2.3.11 to v3.3.5. As expected, many (but not all) of my tables are failing the cluster checks “The following tables need to be recreated for compatibility with future major versions of CrateDB:”.
In preparation for the next upgrade (v4.5+), I need to go ahead and recreate these tables. I am using this command “OPTIMIZE TABLE xxx WITH (upgrade_segments=true);” as recommended in the documentation (Optimization — CrateDB: Reference).
However, after running this against any particular table in my schema, the table still appears in the list of tables needing upgrade. When I check the table's schema (information_schema.tables), the version[‘created’] for that table is still “cratedb : 2.3.11”.
What have I missed here? I have almost a terabyte of data to upgrade, and copying it to new tables would cause some extreme downtime (unacceptable!). Upgrade_segments looked like a dream come true, but it doesn’t do what it is specifically designed to do, namely “Don’t optimize but instead upgrade all segments of the table/partition to the current version of storage engine.”.
I would appreciate any help and advice you can give!
Thanks,
Dennis

Hi @denniswi,

unfortunately, upgrading segments does not help in your case.

Segments are only a subset of the CrateDB/Lucene data store. Upgrading the segments does not address other major index structure changes. To apply those, tables have to (very occasionally) be re-created.

Given you have enough storage available in your CrateDB cluster, you should be able to re-create the table without downtime though. You will need to account for newly modified data (new inserts, updates, deletes) during re-creation until the switch-over to the new table can be done.
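
To illustrate, a minimal sketch of the re-create-and-backfill approach (doc.metrics, doc.metrics_new, and the columns are placeholders; check SHOW CREATE TABLE for your actual schema):

    -- inspect the current definition so the new table can use the same schema
    SHOW CREATE TABLE doc.metrics;

    -- create the replacement table under the new CrateDB version
    CREATE TABLE doc.metrics_new (
        ts TIMESTAMP,
        week TIMESTAMP,
        payload OBJECT
    ) PARTITIONED BY (week);

    -- backfill the existing data
    INSERT INTO doc.metrics_new (SELECT * FROM doc.metrics);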

If you don’t have enough storage to duplicate your table, you can potentially do a segmented migration: re-create one partition after another and, after every migration step, drop that partition in the old table.
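
A rough sketch of one such migration step, assuming a weekly partition column called week (names and values are placeholders):

    -- copy a single partition into the new table
    INSERT INTO doc.metrics_new (SELECT * FROM doc.metrics WHERE week = '2019-07-01');

    -- then free the storage in the old table; a DELETE that matches the partition
    -- column exactly drops the whole partition rather than deleting row by row
    DELETE FROM doc.metrics WHERE week = '2019-07-01';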

Please reach out if you have further questions
Best regards,
Johannes

@jayeff, Thank you for your quick response. May I suggest that the documentation I linked to be clarified with your explanation?

Also,

Can you give a little more detail here? I have about 200 tables. Some are small and pretty static; I’m not worried about them. Others have billions of rows, organized in weekly partitions, with tens of thousands of records inserted per minute.
It is these large tables that concern me - by the time the latest partition is copied, I may have missed hundreds of thousands of inserts. What do you suggest for this scenario?

Thank you again for your help!

Dennis

Hi @denniswi,

every migration story can be a bit different, but I hope I can give you some pointers.

Can I assume that the newly inserted records are all written to the latest partition? In that case you can safely migrate all old partitions to a new table. This process can run in parallel while new records are being written to the old table.
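
To see which partitions exist (and therefore which ones are already closed and safe to copy), you can query information_schema.table_partitions; the table name is a placeholder:

    -- list all partitions of the old table with their partition key values
    SELECT table_name, partition_ident, "values"
    FROM information_schema.table_partitions
    WHERE table_name = 'metrics';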

Once all old records are backfilled, you are left with only the most recent partition to migrate. Here you can opt for a short downtime to migrate the last partition and switch over the tables. If you want to further reduce the required downtime, do this on a Monday morning, when you have a fresh partition with just a small number of records.
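
As a sketch of that cut-over (ALTER CLUSTER SWAP TABLE should be available on your 3.3 cluster; all names and the partition value are placeholders):

    -- during the short downtime: copy the still-active partition
    INSERT INTO doc.metrics_new (SELECT * FROM doc.metrics WHERE week = '2019-07-08');

    -- exchange the two table names in a single operation
    ALTER CLUSTER SWAP TABLE doc.metrics_new TO doc.metrics;

    -- after the swap, the old data lives under doc.metrics_new; drop it once you are confident
    DROP TABLE doc.metrics_new;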

You also have an option for a zero-downtime table swap, but this will require some additional planning and work. For example, you could deploy a code change which writes new records to both the new and the old table. In parallel, backfill the new table from the old table. Once you reach record parity, you have the chance to run some checks (completeness, performance, etc.) on the new table. If everything looks good, do a switch-over to the new table.
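
A completeness check could be as simple as comparing per-partition row counts between both tables (again, table and column names are placeholders):

    -- run against both tables and compare the results
    SELECT week, count(*) AS cnt FROM doc.metrics GROUP BY week ORDER BY week;
    SELECT week, count(*) AS cnt FROM doc.metrics_new GROUP BY week ORDER BY week;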

Hope this helps

Are there plans to address this shortcoming in future upgrades? This is a huge pain point and a very difficult workflow when you have a lot of data.

Hi @denniswi,

I understand your point.

There are no near-future plans to have something built in to solve this though, as there is no one-size-fits-all approach. As mentioned, this is an operation which is only necessary very occasionally. We are aware of the overhead that a structure change implies and only require it when there is no way around it.