Importing data to CrateDB Cloud clusters

This tutorial demonstrates three different methods of importing data into your CrateDB Cloud cluster.

Two of them, Import from URL, and Import data from a file, are most convenient, as they are available directly from the CrateDB Cloud Console. The last method, Manual import, is using SQL statements in the Admin UI.

Import from URL

This is the most user-friendly method of all the options. To import data from a URL, navigate to the Import tab of your cluster:

Here, you can choose between Import from URL and Import from file. To import from a URL, click the Import from a URL button:

To import data, fill out the URL, name of the table which will be created and populated with your data, data format, and whether it is compressed.

When a table with the chosen name does not exist, it will be automatically created. Please note that in this case, the importer will attempt to guess the data types of each column, so please make sure to verify the result.

Import from URL supports CSV, JSON, and Parquet files. Gzip compression is also supported.

Import from file

Uploading directly from your computer offers more control over your data. From the security point of view, you don’t have to share the data on the internet just to be able to import it to your cluster. You also have more control about who has access to your data.

As with the URL import, CSV, JSON, and Parquet files are supported. There is also a limit to file size, currently 1GB.

Please see our documentation for the current limitations of the Data Import feature.

Manual import

This is the default way of importing data to your database that uses the COPY FROM statement.

Create a table

Before you import the data, the table that will be filled with the data must be created. Once logged in to the Admin panel, navigate to the console.

Example statement:

CREATE TABLE “taxi” (
“passenger_count” INTEGER,
“pickup_datetime” TIMESTAMP WITH TIME ZONE,
“trip_distance” REAL,
“vendorid” INTEGER)
WITH ("column_policy" = 'dynamic',
"number_of_replicas" = '0',
"refresh_interval" = 10000 );

This statement will create a table called taxi with the following columns:

  • passenger_count
  • pickup_datetime
  • trip_distance
  • vendor_id

Import data

Now that you created a table, you can fill it with data. In this case, we use a COPY FROM statement. CSV and JSON file formats are accepted when using COPY FROM in CrateDB. In this case, a JSON file is used.

Example statement:

COPY taxi
FROM ‘https://raw.githubusercontent.com/crate/cloud-howtos/master/docs/_extra/example-dataset.json’
RETURN SUMMARY;

As with the URL import method, your URL must be publicly accessible.

This statement will import a small example dataset hosted in our GitHub repository.
Once submitted, you should get a response like this:

Conclusion

This tutorial shows three methods of importing data to your CrateDB Cloud cluster:

  • Import from URL
  • Import from file
  • Manual import in the Admin UI

We hope it will help you decide which one works best for you.

1 Like