3 Node Cluster not forming after install (Azure, Ubuntu VMs)

I tried to install CrateDB on Azure VMs. Each individual node starts initially, but the nodes don't form a cluster after I change the configuration (crate.yml).

Steps

  1. Create 3 Azure VMs (Ubuntu 18 LTS)

  2. Install crate (sudo apt-get install crate) --> crate service is running

  3. Stop crate service (sudo service crate stop)

  4. Change crate.yml to

    network.host: _local_, _site_
    cluster.name: az-cluster
    discovery.seed_hosts:
        - 10.0.0.4:4300
        - 10.0.0.5:4300
        - 10.0.0.6:4300
    cluster.initial_master_nodes:
        - 10.0.0.4
        - 10.0.0.5
        - 10.0.0.6
    gateway.expected_nodes: 3
    gateway.recover_after_nodes: 2
    bootstrap.memory_lock: true
    
    
  5. Restart crate service
    -> Only malfunctioning single-node clusters are formed

Only when I delete all node data with
sudo rm -rf /usr/share/crate/data/nodes
does the cluster form correctly.
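
A quick way to check whether the cluster really formed is to ask a node how many nodes it sees. The following is only a sketch: it assumes curl is installed, that the HTTP endpoint listens on the default port 4200, and that it is run on one of the nodes (CRATE_HOST, build_payload and count_nodes are names made up for this example).

```shell
# Host to query; assumption: run on a node itself, override for another node.
CRATE_HOST="${CRATE_HOST:-localhost}"

build_payload() {
    # Wrap a SQL statement in the JSON body expected by the /_sql endpoint.
    printf '{"stmt": "%s"}' "$1"
}

count_nodes() {
    # Ask the node how many rows sys.nodes holds, i.e. how many
    # nodes are members of the cluster this node belongs to.
    curl -sS -H 'Content-Type: application/json' \
        -X POST "http://${CRATE_HOST}:4200/_sql" \
        -d "$(build_payload 'SELECT count(*) FROM sys.nodes')"
}
```

Calling count_nodes on each VM should return a result row containing 3 once the cluster from the config above has formed; a node stuck in its own single-node cluster would report 1.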

Is this a bug or intended?


Hi @proddata,

Welcome to the community.

The CrateDB nodes were installed and started with the default configuration file, so each one formed its own cluster and did not look for further nodes (in the absence of cluster.initial_master_nodes).

Discovery.
Discovery settings.

cluster.initial_master_nodes

... By default this is not set, meaning it expects this node to join an already 
formed cluster. 
In development mode, with no discovery settings configured, this step is 
performed by the nodes themselves, but this auto-bootstrapping is designed 
to aid development and is not safe for production. 
In production you must explicitly list the names or IP addresses of the 
master-eligible nodes whose votes should be counted in the very first
election.

You can either delete the data as you did (including cluster state), or you can use the crate-node CLI tool to detach each node from their current cluster, to enable them to form a new one on start (new config file).

For the next install, you could first place your config file in /etc/crate/crate.yml and run:

sudo apt-get -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold" install crate

which will bypass the prompt about the existing configuration file, as per this article.

So I suppose this is intended.

Kind regards,

Thank you for this post. I ran into the same thing with 3 physical servers on the same VLAN, disconnected from the internet. Stop crate on all nodes, rm -rf /var/lib/crate/nodes on all nodes, then start crate on all nodes. That did the trick: the cluster congeals and all 3 nodes see each other.
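
For anyone repeating this, the reset can be sketched as a small script. It is only an illustration under a few assumptions: NODES_DIR defaults to /var/lib/crate/nodes, but the first post used /usr/share/crate/data/nodes, so check where your install keeps its data. DRY_RUN, run, wipe_node and start_node are names invented for this example. Note that this deletes all data and cluster state on the node.

```shell
# WARNING: wiping NODES_DIR removes all data and cluster state on this node.
# Assumption: the data path; adjust it to match your installation.
NODES_DIR="${NODES_DIR:-/var/lib/crate/nodes}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

wipe_node() {
    # Phase 1: stop the service and delete the node data.
    run sudo service crate stop
    run sudo rm -rf "$NODES_DIR"
}

start_node() {
    # Phase 2: start the service with the new crate.yml in place.
    run sudo service crate start
}

# Usage: pass "wipe" or "start" as the first argument.
case "${1:-}" in
    wipe)  wipe_node ;;
    start) start_node ;;
esac
```

Run the wipe phase on all nodes first and only then the start phase on all nodes, matching the order above. With the default DRY_RUN=1 the script only prints what it would do.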