Master-slave deployment on AWS EKS

Hello,
I'm trying to deploy a master-slave deployment on an AWS EKS cluster, following the link below:

After deploying the StatefulSet ‘crate’, the pods crate-0, crate-1, and crate-2 are running, but their logs show this error:

@shailendra could you please share the YAMLs of the StatefulSet, services, and ConfigMaps you are using.

Thank you very much in advance.
regards,
Walter

kind: Service
apiVersion: v1
metadata:
  name: crate-internal-service
  namespace: my-nm
  labels:
    app: crate
spec:
  # A static IP address is assigned to this service. This IP address is
  # only reachable from within the Kubernetes cluster.
  type: ClusterIP
  ports:
    # Port 4300 for inter-node communication.
  - port: 4300
    name: crate-internal
  selector:
    # Apply this to all nodes with the `app:crate` label.
    app: crate
---
kind: Service
apiVersion: v1
metadata:
  name: crate-external-service
  namespace: my-nm
  labels:
    app: crate
spec:
  # Create an externally reachable load balancer.
  type: LoadBalancer
  ports:
    # Port 4200 for HTTP clients.
  - port: 4200
    name: crate-web
    # Port 5432 for PostgreSQL wire protocol clients.
  - port: 5432
    name: postgres
  selector:
    # Apply this to all nodes with the `app:crate` label.
    app: crate
---
kind: StatefulSet
apiVersion: "apps/v1"
metadata:
  # This is the name used as a prefix for all pods in the set.
  name: crate
  namespace: my-nm
spec:
  serviceName: "crate-set"
  # Our cluster has three nodes.
  replicas: 3
  selector:
    matchLabels:
      # The pods in this cluster have the `app:crate` app label.
      app: crate
  template:
    metadata:
      labels:
        app: crate
    spec:
      # InitContainers run before the main containers of a pod are
      # started, and they must terminate before the primary containers
      # are initialized. Here, we use one to set the correct memory
      # map limit.
      initContainers:
      - name: init-sysctl
        image: busybox
        imagePullPolicy: IfNotPresent
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      # This final section is the core of the StatefulSet configuration.
      # It defines the container to run in each pod.
      containers:
      - name: crate
        # Use the CrateDB 4.2.4 Docker image.
        image: crate:4.2.4
        # Pass in configuration to CrateDB via command-line options.
        # We are setting the node names explicitly, which is
        # needed to determine the initial master nodes. These are set to
        # the name of the pod.
        # We are using the SRV records provided by Kubernetes to discover
        # nodes within the cluster.
        command:
          # - -Cnode.name=${crate-pod}
          - -Ccluster.name=${CLUSTER_NAME}
          - -Ccluster.initial_master_nodes=crate-0
          - -Cdiscovery.seed_providers=srv
          - -Cdiscovery.srv.query=_crate-internal._tcp.crate-internal-service.${NAMESPACE}.svc.cluster.local
          - -Cgateway.recover_after_nodes=2
          - -Cgateway.expected_nodes=${EXPECTED_NODES}
          - -Cpath.data=/data
        volumeMounts:
            # Mount the `/data` directory as a volume named `data`.
            - mountPath: /data
              name: data
        resources:
          limits:
            # How much memory each pod gets.
            memory: 512Mi
        ports:
          # Port 4300 for inter-node communication.
        - containerPort: 4300
          name: crate-internal
          # Port 4200 for HTTP clients.
        - containerPort: 4200
          name: crate-web
          # Port 5432 for PostgreSQL wire protocol clients.
        - containerPort: 5432
          name: postgres
        # Environment variables passed through to the container.
        env:
          # This variable is detected by CrateDB.
        - name: CRATE_HEAP_SIZE
          value: "256m"
          # The rest of these variables are used in the command-line
          # options.
        - name: EXPECTED_NODES
          value: "3"
        - name: CLUSTER_NAME
          value: "gfiware-poc"
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
  volumeClaimTemplates:
    # Use persistent storage.
    - metadata:
        name: data
      spec:
        accessModes:
        - ReadWriteOnce
        storageClassName: storage-class-ebs
        resources:
          requests:
            storage: 1Gi

      - command:
        - /docker-entrypoint.sh
        - -Cstats.enabled=true
        - -Ccluster.name=<<CLUSTER_NAME>>
        - -Cnode.name=${POD_NAME}
        - -Ccluster.initial_master_nodes=crate-0,crate-1,crate-2
        - -Cdiscovery.seed_providers=srv
        - -Cdiscovery.srv.query=_cluster._tcp.crate-discovery.${NAMESPACE}.svc.cluster.local
        - -Cgateway.recover_after_nodes=2
        - -Cgateway.expected_nodes=3
        - -Cpath.data=/data
        - -Cssl.http.enabled=true
        - -Cssl.psql.enabled=true
        - -Cprocessors=4
        env:
        - name: CRATE_HEAP_SIZE
          valueFrom:
            configMapKeyRef:
              key: crate.heap_size
              name: crate
        - name: CRATE_JAVA_OPTS
          valueFrom:
            configMapKeyRef:
              key: crate.java_opts
              name: crate

Could you check whether something like this works for you? You may have to check the DNS name used for the srv.query, as the one shown is just an example. I see you have been using something like crate-internal-service, which of course is also fine :slight_smile: the query just needs to be changed to match.
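As a side note, you can verify from inside the cluster whether the SRV record for your internal service actually resolves before starting CrateDB. A minimal sketch, assuming the service and port names from your first manifest (crate-internal-service, port crate-internal, namespace my-nm); tutum/dnsutils is just one example of an image that ships nslookup:

# Run a throwaway pod with DNS tools and query the SRV record
kubectl run -it --rm dnsutils --image=tutum/dnsutils --restart=Never -n my-nm -- \
  nslookup -type=SRV _crate-internal._tcp.crate-internal-service.my-nm.svc.cluster.local

If that record does not resolve, the discovery.srv.query setting will not work either.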

Hello @Walter_Behmann,

I did the same as you suggested. I removed -Cnode.name=${POD_NAME} because with it, the pods failed with the error node.name must not be empty.
After removing the node name, I'm getting this error:

and the final template is:

kind: Service
apiVersion: v1
metadata:
  name: crate-internal-service
  namespace: gfiware
  labels:
    app: crate
spec:
  # A static IP address is assigned to this service. This IP address is
  # only reachable from within the Kubernetes cluster.
  type: ClusterIP
  ports:
    # Port 4300 for inter-node communication.
  - port: 4300
    name: crate-internal
  selector:
    # Apply this to all nodes with the `app:crate` label.
    app: crate
---
kind: Service
apiVersion: v1
metadata:
  name: crate-external-service
  namespace: gfiware
  labels:
    app: crate
spec:
  # Create an externally reachable load balancer.
  type: LoadBalancer
  ports:
    # Port 4200 for HTTP clients.
  - port: 4200
    name: crate-web
    # Port 5432 for PostgreSQL wire protocol clients.
  - port: 5432
    name: postgres
  selector:
    # Apply this to all nodes with the `app:crate` label.
    app: crate
---
kind: StatefulSet
apiVersion: "apps/v1"
metadata:
  # This is the name used as a prefix for all pods in the set.
  name: crate
  namespace: gfiware
spec:
  serviceName: "crate-set"
  # Our cluster has three nodes.
  replicas: 3
  selector:
    matchLabels:
      # The pods in this cluster have the `app:crate` app label.
      app: crate
  template:
    metadata:
      labels:
        app: crate
    spec:
      # InitContainers run before the main containers of a pod are
      # started, and they must terminate before the primary containers
      # are initialized. Here, we use one to set the correct memory
      # map limit.
      initContainers:
      - name: init-sysctl
        image: busybox
        imagePullPolicy: IfNotPresent
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      # This final section is the core of the StatefulSet configuration.
      # It defines the container to run in each pod.
      containers:
      - name: crate
        # Use the CrateDB 4.2.4 Docker image.
        image: crate:4.2.4
        # Pass in configuration to CrateDB via command-line options.
        # We are setting the node names explicitly, which is
        # needed to determine the initial master nodes. These are set to
        # the name of the pod.
        # We are using the SRV records provided by Kubernetes to discover
        # nodes within the cluster.
        command:
          - /docker-entrypoint.sh
          - -Cstats.enabled=true
          - -Ccluster.name=jfiware-poc
          - -Cnode.name=${POD_NAME}
          - -Ccluster.initial_master_nodes=crate-0,crate-1,crate-2
          - -Cdiscovery.seed_providers=srv
          - -Cdiscovery.srv.query=_cluster._tcp.crate-discovery.${NAMESPACE}.svc.cluster.local
          - -Cgateway.recover_after_nodes=2
          - -Cgateway.expected_nodes=3
          - -Cpath.data=/data
          - -Cssl.http.enabled=true
          - -Cssl.psql.enabled=true
          - -Cprocessors=4
        volumeMounts:
            # Mount the `/data` directory as a volume named `data`.
            - mountPath: /data
              name: data
        resources:
          limits:
            # How much memory each pod gets.
            memory: 512Mi
        ports:
          # Port 4300 for inter-node communication.
        - containerPort: 4300
          name: crate-internal
          # Port 4200 for HTTP clients.
        - containerPort: 4200
          name: crate-web
          # Port 5432 for PostgreSQL wire protocol clients.
        - containerPort: 5432
          name: postgres
        # Environment variables passed through to the container.
        env:
          # This variable is detected by CrateDB.
        - name: CRATE_HEAP_SIZE
          value: "256m"
          # The rest of these variables are used in the command-line
          # options.
        - name: EXPECTED_NODES
          value: "3"
        - name: CLUSTER_NAME
          value: "jfiware-poc"
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
  volumeClaimTemplates:
    # Use persistent storage.
    - metadata:
        name: data
      spec:
        accessModes:
        - ReadWriteOnce
        storageClassName: gfiwire-storage-class-ebs
        resources:
          requests:
            storage: 1Gi

The DNS lookup fails because of the wrong DNS domain in the query above; it should rather be:
- -Cdiscovery.srv.query=_cluster._tcp.crate-internal-service.${NAMESPACE}.svc.cluster.local

And yes, -Cnode.name=${POD_NAME} needs to be set.

I hope this works for you. If you want me to send you all the YAML files needed to set it up, let me know. But once the discovery service name is set to the right DNS FQDN, I guess you are good.
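If the cluster still does not form after that, the node logs usually show what the SRV lookup returned. A quick way to check, using standard kubectl (pod and namespace names as in the manifests above):

# Look for discovery-related messages in the first node's log
kubectl logs crate-0 -n gfiware | grep -i -E 'discovery|srv|master'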

regards,
Walter

Hello @Walter_Behmann,

I updated -Cdiscovery.srv.query to _cluster._tcp.crate-internal-service.${NAMESPACE}.svc.cluster.local, but I'm getting the same DNS resolution error.

I applied this to an EKS cluster. It should work with copy & paste:

kind: Namespace
apiVersion: v1
metadata:
  name: gfiware

---
kind: Service
apiVersion: v1
metadata:
  name: crate-internal
  namespace: gfiware
  labels:
    app: crate
spec:
  type: ClusterIP
  ports:
  - port: 4300
    name: cluster
    targetPort: 4300
    protocol: TCP
  selector:
    app: crate


---

kind: Service
apiVersion: v1
metadata:
  name: crate-external-service
  namespace: gfiware
  labels:
    app: crate
spec:
  # Create an externally reachable load balancer.
  type: LoadBalancer
  ports:
    # Port 4200 for HTTP clients.
  - port: 4200
    name: crate-web
  - port: 5432
    name: postgres
  selector:
    app: crate

---

kind: StatefulSet
apiVersion: "apps/v1"
metadata:
  name: crate
  namespace: gfiware
spec:
  serviceName: "crate-set"
  replicas: 3
  selector:
    matchLabels:
      app: crate
  template:
    metadata:
      labels:
        app: crate
    spec:
      initContainers:
      - name: init-sysctl
        image: busybox
        imagePullPolicy: IfNotPresent
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      containers:
      - name: crate
        image: crate:4.2.4
        command:
          - /docker-entrypoint.sh
          - -Cstats.enabled=true
          - -Ccluster.name=gfiware-poc
          - -Cnode.name=${POD_NAME}
          - -Ccluster.initial_master_nodes=crate-0,crate-1,crate-2
          - -Cdiscovery.seed_providers=srv
          - -Cdiscovery.srv.query=_cluster._tcp.crate-internal.${NAMESPACE}.svc.cluster.local
          - -Cgateway.recover_after_nodes=2
          - -Cgateway.expected_nodes=3
          - -Cpath.data=/data
          - -Cssl.http.enabled=false
          - -Cssl.psql.enabled=false
          - -Cprocessors=4
        volumeMounts:
            - mountPath: /data
              name: data
        resources:
          limits:
            memory: 512Mi
        ports:
        - containerPort: 4300
          name: crate-internal
        - containerPort: 4200
          name: crate-web
        - containerPort: 5432
          name: postgres
        env:
        - name: CRATE_HEAP_SIZE
          value: "256m"
        - name: EXPECTED_NODES
          value: "3"
        - name: CLUSTER_NAME
          value: "gfiware-poc"
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes:
        - ReadWriteOnce
        storageClassName: gp2
        resources:
          requests:
            storage: 10Gi

- -Cdiscovery.srv.query=_cluster._tcp.crate-internal.${NAMESPACE}.svc.cluster.local

_cluster = the name of the port defined for 4300
crate-internal = the name of the service used for service discovery
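The general pattern Kubernetes uses for these SRV names is worth spelling out; the jsonpath check below is just a hedged example against the service from the manifest above:

# SRV naming scheme for a named service port:
#   _<port-name>._<protocol>.<service-name>.<namespace>.svc.cluster.local
# For the manifest above this becomes:
#   _cluster._tcp.crate-internal.gfiware.svc.cluster.local

# Quick sanity check that the port is really named "cluster":
kubectl get svc crate-internal -n gfiware -o jsonpath='{.spec.ports[0].name}'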

        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name

This makes the pod name available inside the StatefulSet's containers via the downward API.
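To confirm that wiring, you can echo the variable from a running pod. A minimal check, assuming the pod names from the StatefulSet above:

# Should print "crate-0"
kubectl exec crate-0 -n gfiware -- sh -c 'echo $POD_NAME'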

Please be advised that such small EBS volumes are very limited in the throughput they provide; they are not going to be suitable for a real test.


Thanks @Walter_Behmann,
Now it works.
:+1:

Hello @Walter_Behmann, one more thing I want to know: is it possible to auto-scale the pods from 3 to 5?

If it is just for a test, you can easily scale it up with kubectl scale sts crate --replicas=5; that should do it. You will see a warning in the admin UI after that. Eventually you will have to adjust

         - -Cgateway.recover_after_nodes=2
         - -Cgateway.expected_nodes=3
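
As a rough sketch of the whole procedure (standard kubectl commands; the gateway values shown are just examples for a five-node cluster, and since these are node settings, changing them means editing the StatefulSet's command so the pods restart with the new values):

# Scale the StatefulSet from 3 to 5 pods
kubectl scale sts crate -n gfiware --replicas=5

# Then adjust the gateway settings in the pod template, for example:
#   -Cgateway.recover_after_nodes=3
#   -Cgateway.expected_nodes=5
kubectl edit sts crate -n gfiware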

We also have a k8s operator for CrateDB, but it might be a little overkill for a simple test.

Regards,
Walter