Migrating to a different table schema

Let’s assume you have a table with some data inserted already and you what like to organize your data with a different table structure. Maybe your first thought is to write a simple script that reads the data from the old table, mangles your data as you want, and re-inserts it into a new table with the new schema.

For most problems this is overkill and you could use the power of SQL for a simpler solution to this problem.

Setup

First, let’s create a sample table in CrateDB with some records which we use throughout this tutorial:

CREATE TABLE testdata (
  id BIGINT,
  first_name TEXT,
  last_name TEXT,
  street TEXT,
  zip_code INT,
  city TEXT,
  country TEXT
);

Next, we add some sample records to this table:

INSERT INTO testdata (id, first_name, last_name, street, zip_code, city, country)
VALUES
  (1, 'Jane', 'Doe', 'London Road 1', 12345, 'Manchester', 'UK'),
  (2, 'Franz', 'Müller', 'Lindenstraße 23', 10115, 'Berlin', 'DE'),
  (3, 'Anna', 'Maier', 'Birkenweg 1', 10115, 'Berlin', 'DE');

With this, we are ready to go.

Object construction

If we decide to store the address in an object we can use object literals to construct our object.

The new table structure is as follows:

CREATE TABLE addresses (
  id INT,
  address OBJECT AS (
    street TEXT,
    postal_code INT,
    city TEXT,
    country TEXT
  )
);

Now we use an INSERT INTO SELECT to move the data from testdata to our new table:

INSERT INTO addresses (id, address)
  SELECT 
    id, {street = street, postal_code = zip_code, city = city, country = country}
  FROM testdata;

Using functions

We can use your own user-defined functions or build-in functions (like concat) to change data.

In this example, we store the full name in a TEXT field

CREATE TABLE names (
  id INT,
  full_name TEXT
);
INSERT INTO names (id, full_name)
  SELECT id, concat(first_name, ' ', last_name) as full_name
  FROM testdata;

Using aggregation to build up arrays

Maybe we want to combine multiple records into a single row. We can use array_agg to achieve this.

In this example we build up a table with all people living in a specific zip code:

CREATE TABLE zip_codes (
  zip_code INT,
  person_ids ARRAY(INT)
)
INSERT INTO zip_codes (zip_code, person_ids)
  SELECT zip_code, array_agg(id)
  FROM testdata
  GROUP by zip_code;

Summary

In this tutorial, we did show a couple of methods for how you can change your schema and migrate your existing records to this new schema with the power of plain SQL.

4 Likes