Cluster IP addresses change

Hi, I have a 4-node cluster (Crate.io v4.5.0) whose nodes currently connect over public IPs. The config of each node is as below:

network.host: 8.8.8.1
node.name: "crate-de1"

path.logs: /mnt/crate-storage/var/log/crate
path.data: /mnt/crate-storage/var/lib/crate

discovery.seed_hosts:
    - 8.8.8.1
    - 8.8.8.2
    - 8.8.8.3
    - 8.8.8.4

cluster.name: level7sip

cluster.initial_master_nodes:
    - 8.8.8.1
    - 8.8.8.2

gateway.expected_nodes: 4
gateway.recover_after_nodes: 3

stats.enabled: false

Recently we deployed a LAN switch, stopped the cluster, and changed the config to use private IPs as below:

network.bind_host: 0.0.0.0
network.publish_host: 192.168.15.1

node.name: "crate-de1"

path.logs: /var/log/crate
path.data: /mnt/crate_db/data
blobs.path: /mnt/crate_fs/blobs

discovery.seed_hosts:
    - 192.168.15.1
    - 192.168.15.2
    - 192.168.15.3
    - 192.168.15.4

cluster.name: level7sip

cluster.initial_master_nodes:
    - 192.168.15.1
    - 192.168.15.2

gateway.expected_nodes: 4
gateway.recover_after_nodes: 3

stats.enabled: false

We then started the cluster, but unfortunately the nodes are not able to discover each other and join the cluster.

Would anyone be able to suggest why this happens and what needs to be done to make the cluster work on the new IPs?

Regards,
Chris

Hi Chris,

Could you share the logs from a node start?

Hi, yes, sure. As I am new here I get the error “Sorry, new users can not upload attachments.”, so please see the attachments here. Files ending with -public.log are from the original working configuration, while the nodes were on public IPs. Files ending with -private.log are from our attempt to move to private IPs, when the cluster failed to start correctly.

Please let me know if I can provide any additional information to help troubleshoot this.

Best regards,
Chris

Hm, it seems the nodes don’t see each other.
Do all the nodes allow incoming connections on port 4300 from 192.168.15.0/24?
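
A quick way to answer that from each node is to try opening a TCP connection to port 4300 on every peer. The following is only a minimal Python sketch: `can_connect` and `check_peers` are hypothetical helper names, not part of CrateDB, and the host list is simply copied from `discovery.seed_hosts` in the new config above.

```python
import socket

# Peer addresses copied from discovery.seed_hosts in the private-IP config.
SEED_HOSTS = ["192.168.15.1", "192.168.15.2", "192.168.15.3", "192.168.15.4"]


def can_connect(host: str, port: int = 4300, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def check_peers(hosts=SEED_HOSTS, port: int = 4300) -> dict:
    """Map each peer address to whether its transport port is reachable."""
    return {host: can_connect(host, port) for host in hosts}
```

Calling `print(check_peers())` on each node while CrateDB is running on the others shows which peers accept transport connections. A `False` entry only says the TCP connection could not be opened; it does not say whether a firewall, routing, or a stopped service is the reason.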

[2022-04-01T10:19:13,710][INFO ][o.e.n.Node               ] [crate-de3] starting ...
[2022-04-01T10:19:13,791][INFO ][o.e.h.n.Netty4HttpServerTransport] [crate-de3] publish_address {195.168.15.3:4200}, bound_addresses {0.0.0.0:4200}
[2022-04-01T10:19:13,796][INFO ][o.e.t.TransportService   ] [crate-de3] publish_address {195.168.15.3:4300}, bound_addresses {0.0.0.0:4300}
[2022-04-01T10:19:13,821][INFO ][o.e.b.BootstrapChecks    ] [crate-de3] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2022-04-01T10:19:13,822][INFO ][o.e.c.c.Coordinator      ] [crate-de3] cluster UUID [xKTdi9ikRJu6q7Wy_9tR1g]
[2022-04-01T10:19:23,831][WARN ][o.e.c.c.ClusterFormationFailureHelper] [crate-de3] master not discovered or elected yet, an election requires at least 2 nodes with ids from [t5z5I-JNS5aVcGIsp3zAuA, DLoku-EGTzKrog99gaIzag, -uyy-ijCSPmwJ-ucDVmMcQ], have discovered [{crate-de3}{-uyy-ijCSPmwJ-ucDVmMcQ}{6N4ZZDHySFWfQVyUz18FnA}{195.168.15.3}{195.168.15.3:4300}{http_address=195.168.15.3:4200}] which is not a quorum; discovery will continue using [195.168.15.1:4300, 195.168.15.2:4300, 195.168.15.4:4300] from hosts providers and [{crate-de3}{-uyy-ijCSPmwJ-ucDVmMcQ}{6N4ZZDHySFWfQVyUz18FnA}{195.168.15.3}{195.168.15.3:4300}{http_address=195.168.15.3:4200}] from last-known cluster state; node term 7, last-accepted version 80437 in term 7
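
One detail in the log above may be worth double-checking: the node publishes 195.168.15.3, whereas the config and the question refer to 192.168.15.x. It is not clear from this thread whether that is a typo in the node's config or only in the pasted log, so treat it as a hint rather than a diagnosis. A minimal Python sketch of the comparison, using only the standard library (`publish_ip` is a hypothetical helper name):

```python
import ipaddress
import re

# Transport line copied from the crate-de3 log above.
LOG_LINE = (
    "[2022-04-01T10:19:13,796][INFO ][o.e.t.TransportService   ] [crate-de3] "
    "publish_address {195.168.15.3:4300}, bound_addresses {0.0.0.0:4300}"
)

# Subnet the nodes are expected to use, per the question above.
EXPECTED_NET = ipaddress.ip_network("192.168.15.0/24")


def publish_ip(line: str) -> ipaddress.IPv4Address:
    """Extract the IPv4 address from a 'publish_address {ip:port}' log entry."""
    match = re.search(r"publish_address \{(\d+\.\d+\.\d+\.\d+):\d+\}", line)
    if match is None:
        raise ValueError("no publish_address found in line")
    return ipaddress.ip_address(match.group(1))


ip = publish_ip(LOG_LINE)
# Prints: 195.168.15.3 in 192.168.15.0/24 -> False
print(ip, "in", EXPECTED_NET, "->", ip in EXPECTED_NET)
```

If the same check against `network.publish_host` in each node's config file also returns `False`, the nodes would be advertising an address their peers cannot reach on the LAN.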

Yes, all nodes can definitely communicate over 192.168.15.0/24. The only difference between the public and private networks is that the public one is set to MTU 1500 and the private one to MTU 1400. Could that cause an issue?

Regards,
Chris

I think there have been issues with differing MTU values before (crate cluster instable (not joining, not loading license) · Issue #8916 · crate/crate · GitHub);
however, I have no real insight into that.
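
To see whether full-size packets actually pass between the nodes, one option is a ping with fragmentation prohibited. The largest ICMP payload that fits into one packet is the MTU minus 20 bytes of IPv4 header (without options) and 8 bytes of ICMP header. The sketch below only computes that size and prints a command to run by hand; it assumes Linux with the iputils `ping`, where `-M do` prohibits fragmentation, and uses 192.168.15.2 purely as an example peer.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in a single packet of the given MTU."""
    return mtu - IPV4_HEADER - ICMP_HEADER


for mtu in (1400, 1500):
    size = max_ping_payload(mtu)
    # Assumption: iputils ping on Linux; 192.168.15.2 is just an example peer.
    print(f"MTU {mtu}: ping -M do -s {size} -c 3 192.168.15.2")
```

On the private network, a payload of 1372 bytes should pass and anything larger should be rejected if the MTU really is 1400 end to end. That confirms the path MTU, but on its own it does not show that the MTU is what stops the nodes from joining.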

Thanks for the info. We have MTU 1400 because of the VLAN setup. I will get the NOC to install a physical switch that allows MTU 1500 and will report back in a few days on whether that was indeed the cause of the issue.
Regards,
Chris
