How can I fix a "Too many open files" error on a 3-node cluster with almost no data?

The Admin UI is essentially just another SQL client: it uses the _sql HTTP endpoint to run the queries that populate the UI. So even though you are not actively querying in the console, the Admin UI itself sends quite a few SQL statements to the cluster.
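
For example, a query like the following goes through the same _sql endpoint the Admin UI uses (a minimal sketch, assuming the default HTTP port 4200 on localhost):

$ curl -sS -H 'Content-Type: application/json' -X POST 'http://localhost:4200/_sql' -d '{"stmt": "SELECT name, hostname FROM sys.nodes"}'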

Are you using any sort of load balancer / reverse proxy?


The amount of data does not matter too much; what matters is the number of Lucene segments, which is only loosely related to the amount of data.
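
To get a feeling for that, the number of Lucene segments per node can be checked with a query like the one below (a rough sketch, assuming a CrateDB version that already exposes the sys.segments table, run against any node over the same HTTP endpoint):

$ curl -sS -H 'Content-Type: application/json' -X POST 'http://localhost:4200/_sql' -d "{\"stmt\": \"SELECT node['name'], count(*) FROM sys.segments GROUP BY node['name']\"}"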

By the way, this might not have anything to do with the actual data at all; it could just as well be a networking problem.

There’s an HAProxy load balancer, but for this development cluster we connected directly to the nodes most of the time. For example, all of the web-based UI connections were made directly to the IP address of a particular node, not via the load balancer.

I ran a similar experiment this morning:

  • Every few seconds, check the value of process['open_file_descriptors'] for node 01 and see that it stays constant at 2902 (see the sketch after this list for how the check was done).
  • Connect via Chrome directly to the web-based administrative UI running on node 01.
  • Start clicking on some items on the web UI to generate some SQL queries to the back-end CrateDB server.
  • Observe the value of process['open_file_descriptors'] quickly increasing to 2944.
  • Exit the Chrome browser.
  • The value of process['open_file_descriptors'] drops, but not by much: it goes down to 2942 and stays there.
  • The same thing happens with process['open_file_descriptors'] on all 3 nodes. They all keep increasing, on and on…
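
For reference, the check in the first step was roughly the following loop (a sketch; <node-01-ip> is a placeholder for the node's address, and the query reads the same sys.nodes column quoted above):

$ while true; do curl -sS -H 'Content-Type: application/json' -X POST "http://<node-01-ip>:4200/_sql" -d "{\"stmt\": \"SELECT name, process['open_file_descriptors'] FROM sys.nodes\"}"; echo; sleep 5; done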

Meanwhile, on our production cluster, I did something similar: connected directly to the first node with Chrome, clicked around, exited. There, process['open_file_descriptors'] stays at around 500.

In other words, it is almost as if the following holds: if I keep connecting with Chrome to one of the nodes on our development cluster, click on a few items, exit, and repeat, the value of process['open_file_descriptors'] will keep increasing until I start getting the /usr/share/crate/lib/site/index.html: Too many open files error again.

What I’m trying to understand:

  • why does the value of process['open_file_descriptors'] continuously increase?

Normally, I’d expect it to stabilize around some value on a cluster that’s not doing much at all. I’d expect that number to increase while I’m connected with Chrome, but then decrease after I exit Chrome.

Maybe you have some insight? Or some method to debug this strange case further?

I decided to look at the most recently opened file descriptors to see whether there are obvious differences between node 01 of the production cluster and node 01 of the development cluster.

On production, node 01:

$ sudo ls -lt /proc/218658/fd | head -n40
total 0
lrwx------ 1 crate crate 64 Jan 13 13:55 436 -> socket:[783194]
lrwx------ 1 crate crate 64 Jan 13 13:55 499 -> socket:[783184]
lrwx------ 1 crate crate 64 Jan 13 13:55 506 -> socket:[783192]
lr-x------ 1 crate crate 64 Jan 13 13:53 303 -> /data/nodes/0/indices/fKxtpGYgR4WrlgKifnUa6w/0/index/_4k1.fdt
l-wx------ 1 crate crate 64 Jan 13 13:53 438 -> /data/nodes/0/indices/SSM93J75RCWoR0g1ynfEuw/2/index/_2i394.fdt
l-wx------ 1 crate crate 64 Jan 13 13:53 439 -> /data/nodes/0/indices/SSM93J75RCWoR0g1ynfEuw/2/index/_2i394_Lucene85FieldsIndex-doc_ids_2if7e.tmp
l-wx------ 1 crate crate 64 Jan 13 13:53 443 -> /data/nodes/0/indices/SSM93J75RCWoR0g1ynfEuw/2/index/_2i394_Lucene85FieldsIndexfile_pointers_2if7f.tmp
lrwx------ 1 crate crate 64 Jan 13 13:53 465 -> /data/nodes/0/indices/fKxtpGYgR4WrlgKifnUa6w/0/translog/translog-521.tlog
lrwx------ 1 crate crate 64 Jan 13 13:53 477 -> /data/nodes/0/indices/SSM93J75RCWoR0g1ynfEuw/4/translog/translog-44.tlog
lrwx------ 1 crate crate 64 Jan 13 10:12 484 -> /data/nodes/0/indices/SSM93J75RCWoR0g1ynfEuw/3/translog/translog-45.tlog
l-wx------ 1 crate crate 64 Jan 13 10:12 489 -> /data/nodes/0/indices/SSM93J75RCWoR0g1ynfEuw/4/index/_2i2nq.fdm
lrwx------ 1 crate crate 64 Jan 13 10:06 462 -> /data/nodes/0/indices/QZ08ZovzTYin1OKmU9GvfA/0/translog/translog-1439.tlog
lrwx------ 1 crate crate 64 Dec 27 12:59 556 -> /data/nodes/0/indices/SSM93J75RCWoR0g1ynfEuw/3/translog/translog-46.tlog
lrwx------ 1 crate crate 64 Dec 27 09:25 550 -> /data/nodes/0/indices/QJhmCWHnTAik-1hYASbtqA/5/translog/translog-3.tlog
lr-x------ 1 crate crate 64 Dec 23 00:19 543 -> /data/nodes/0/indices/5VMxfFDOTg-VaBI3x50s6w/4/index/_e0v.fdt
lr-x------ 1 crate crate 64 Dec 21 22:40 535 -> /data/nodes/0/indices/5VMxfFDOTg-VaBI3x50s6w/2/index/_ihp.fdx
lrwx------ 1 crate crate 64 Dec 21 21:46 540 -> socket:[471822]
lrwx------ 1 crate crate 64 Dec 21 21:46 542 -> /data/nodes/0/indices/SSM93J75RCWoR0g1ynfEuw/5/translog/translog-43.tlog
lrwx------ 1 crate crate 64 Dec 21 20:42 533 -> /data/nodes/0/indices/SSM93J75RCWoR0g1ynfEuw/4/translog/translog-42.tlog
lrwx------ 1 crate crate 64 Dec 21 20:42 534 -> /data/nodes/0/indices/mM6WFguiTXO8ozwRnWX9RQ/3/translog/translog-2678.tlog
lrwx------ 1 crate crate 64 Dec 21 20:42 532 -> /data/nodes/0/indices/mM6WFguiTXO8ozwRnWX9RQ/5/translog/translog-2669.tlog
l-wx------ 1 crate crate 64 Dec 21 20:27 485 -> /data/nodes/0/indices/IYTZhsKBTmOEE5FZXXTlqA/4/index/_23j.fdm
lrwx------ 1 crate crate 64 Dec 21 20:27 486 -> /data/nodes/0/indices/TfdwP1MXTiODrdJYgM_vnw/3/translog/translog-4.tlog
lr-x------ 1 crate crate 64 Dec 21 20:27 487 -> /data/nodes/0/indices/QZ08ZovzTYin1OKmU9GvfA/0/index/_d3f.fdx
l-wx------ 1 crate crate 64 Dec 21 20:27 488 -> /data/nodes/0/indices/QJhmCWHnTAik-1hYASbtqA/5/index/write.lock
l-wx------ 1 crate crate 64 Dec 21 20:27 490 -> /data/nodes/0/indices/IYTZhsKBTmOEE5FZXXTlqA/4/index/_23j.fdt
l-wx------ 1 crate crate 64 Dec 21 20:27 491 -> /data/nodes/0/indices/QJhmCWHnTAik-1hYASbtqA/4/index/write.lock
lrwx------ 1 crate crate 64 Dec 21 20:27 492 -> /data/nodes/0/indices/IYTZhsKBTmOEE5FZXXTlqA/5/translog/translog-651.tlog
lr-x------ 1 crate crate 64 Dec 21 20:27 497 -> /data/nodes/0/_state/_lt.cfs
lrwx------ 1 crate crate 64 Dec 21 20:27 498 -> /data/nodes/0/indices/5VMxfFDOTg-VaBI3x50s6w/3/translog/translog-1343.tlog
l-wx------ 1 crate crate 64 Dec 21 20:27 500 -> /data/nodes/0/indices/IYTZhsKBTmOEE5FZXXTlqA/2/index/_20z_Lucene85FieldsIndexfile_pointers_113.tmp
l-wx------ 1 crate crate 64 Dec 21 20:27 513 -> /data/nodes/0/indices/TfdwP1MXTiODrdJYgM_vnw/5/index/write.lock
lrwx------ 1 crate crate 64 Dec 21 20:27 514 -> /data/nodes/0/indices/SSM93J75RCWoR0g1ynfEuw/3/translog/translog-41.tlog
lr-x------ 1 crate crate 64 Dec 21 20:27 517 -> /data/nodes/0/indices/QZ08ZovzTYin1OKmU9GvfA/4/index/_cxs.fdx
lr-x------ 1 crate crate 64 Dec 21 20:27 518 -> /data/nodes/0/indices/5VMxfFDOTg-VaBI3x50s6w/5/index/_c5s.fdt
lr-x------ 1 crate crate 64 Dec 21 20:27 520 -> /data/nodes/0/indices/5VMxfFDOTg-VaBI3x50s6w/5/index/_c5s.fdx
lr-x------ 1 crate crate 64 Dec 21 14:45 545 -> /data/nodes/0/indices/5VMxfFDOTg-VaBI3x50s6w/4/index/_e0v.fdx
lr-x------ 1 crate crate 64 Dec 21 13:29 525 -> /data/nodes/0/indices/fKxtpGYgR4WrlgKifnUa6w/2/index/_4hw.fdx
lrwx------ 1 crate crate 64 Dec 21 13:29 527 -> /data/nodes/0/indices/mM6WFguiTXO8ozwRnWX9RQ/1/translog/translog-2721.tlog

On development, node 01:

$ sudo ls -lt /proc/585/fd | head -n40
total 0
lr-x------ 1 crate crate 64 Jan 14 08:32 2993 -> /sys/fs/cgroup/sys-fs-fuse-connections.mount
lr-x------ 1 crate crate 64 Jan 14 08:32 2994 -> /sys/fs/cgroup/sys-fs-fuse-connections.mount
lr-x------ 1 crate crate 64 Jan 14 08:32 2989 -> /sys/fs/cgroup/sys-fs-fuse-connections.mount
lr-x------ 1 crate crate 64 Jan 14 08:32 2990 -> /sys/fs/cgroup/sys-fs-fuse-connections.mount
lr-x------ 1 crate crate 64 Jan 14 08:32 2991 -> /sys/fs/cgroup
lr-x------ 1 crate crate 64 Jan 14 08:32 2992 -> /sys/fs/cgroup
lr-x------ 1 crate crate 64 Jan 14 08:32 2985 -> /sys/fs/cgroup/sys-fs-fuse-connections.mount
lr-x------ 1 crate crate 64 Jan 14 08:32 2986 -> /sys/fs/cgroup/sys-fs-fuse-connections.mount
lr-x------ 1 crate crate 64 Jan 14 08:32 2987 -> /sys/fs/cgroup
lr-x------ 1 crate crate 64 Jan 14 08:32 2988 -> /sys/fs/cgroup
lr-x------ 1 crate crate 64 Jan 14 08:32 2981 -> /sys/fs/cgroup/sys-fs-fuse-connections.mount
lr-x------ 1 crate crate 64 Jan 14 08:32 2982 -> /sys/fs/cgroup/sys-fs-fuse-connections.mount
lr-x------ 1 crate crate 64 Jan 14 08:32 2983 -> /sys/fs/cgroup
lr-x------ 1 crate crate 64 Jan 14 08:32 2984 -> /sys/fs/cgroup
lr-x------ 1 crate crate 64 Jan 14 08:31 2977 -> /sys/fs/cgroup/sys-fs-fuse-connections.mount
lr-x------ 1 crate crate 64 Jan 14 08:31 2978 -> /sys/fs/cgroup/sys-fs-fuse-connections.mount
lr-x------ 1 crate crate 64 Jan 14 08:31 2979 -> /sys/fs/cgroup
lr-x------ 1 crate crate 64 Jan 14 08:31 2980 -> /sys/fs/cgroup
lr-x------ 1 crate crate 64 Jan 14 08:31 2973 -> /sys/fs/cgroup/sys-fs-fuse-connections.mount
lr-x------ 1 crate crate 64 Jan 14 08:31 2974 -> /sys/fs/cgroup/sys-fs-fuse-connections.mount
lr-x------ 1 crate crate 64 Jan 14 08:31 2975 -> /sys/fs/cgroup
lr-x------ 1 crate crate 64 Jan 14 08:31 2976 -> /sys/fs/cgroup
lr-x------ 1 crate crate 64 Jan 14 08:31 2969 -> /sys/fs/cgroup/sys-fs-fuse-connections.mount
lr-x------ 1 crate crate 64 Jan 14 08:31 2970 -> /sys/fs/cgroup/sys-fs-fuse-connections.mount
lr-x------ 1 crate crate 64 Jan 14 08:31 2971 -> /sys/fs/cgroup
lr-x------ 1 crate crate 64 Jan 14 08:31 2972 -> /sys/fs/cgroup
lr-x------ 1 crate crate 64 Jan 14 08:31 2965 -> /sys/fs/cgroup/sys-fs-fuse-connections.mount
lr-x------ 1 crate crate 64 Jan 14 08:31 2966 -> /sys/fs/cgroup/sys-fs-fuse-connections.mount
lr-x------ 1 crate crate 64 Jan 14 08:31 2967 -> /sys/fs/cgroup
lr-x------ 1 crate crate 64 Jan 14 08:31 2968 -> /sys/fs/cgroup
lr-x------ 1 crate crate 64 Jan 14 08:31 2961 -> /sys/fs/cgroup/sys-fs-fuse-connections.mount
lr-x------ 1 crate crate 64 Jan 14 08:31 2962 -> /sys/fs/cgroup/sys-fs-fuse-connections.mount
lr-x------ 1 crate crate 64 Jan 14 08:31 2963 -> /sys/fs/cgroup
lr-x------ 1 crate crate 64 Jan 14 08:31 2964 -> /sys/fs/cgroup
lr-x------ 1 crate crate 64 Jan 14 08:31 2957 -> /sys/fs/cgroup/sys-fs-fuse-connections.mount
lr-x------ 1 crate crate 64 Jan 14 08:31 2958 -> /sys/fs/cgroup/sys-fs-fuse-connections.mount
lr-x------ 1 crate crate 64 Jan 14 08:31 2959 -> /sys/fs/cgroup
lr-x------ 1 crate crate 64 Jan 14 08:31 2960 -> /sys/fs/cgroup
lr-x------ 1 crate crate 64 Jan 14 08:31 2953 -> /sys/fs/cgroup/sys-fs-fuse-connections.mount
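
Instead of eyeballing the two listings, the difference also shows up if the descriptor targets are grouped and counted, for example with something like this (just an illustrative pipeline; 585 is the PID from the listing above):

$ sudo ls -l /proc/585/fd | tail -n +2 | awk '{print $NF}' | sed -e 's/socket:\[[0-9]*\]/socket/' -e 's/pipe:\[[0-9]*\]/pipe/' | cut -d/ -f1-5 | sort | uniq -c | sort -rn | head

On the development node this should put /sys/fs/cgroup and /sys/fs/cgroup/sys-fs-fuse-connections.mount right at the top, while on the production node sockets and /data/nodes/0/indices paths dominate.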

Another difference between the production cluster and the development cluster:

  • I installed sshfs on the development cluster nodes around 17 December, but I didn’t configure anything for sshfs, didn’t create any network mounts, etc.

Development cluster, node 01:

$ dpkg -l | egrep "(fuse)|(sshf)"
ii  fuse3                      3.10.3-2                       amd64        Filesystem in Userspace (3.x version)
ii  libfuse2:amd64             2.9.9-5                        amd64        Filesystem in Userspace (library)
ii  libfuse3-3:amd64           3.10.3-2                       amd64        Filesystem in Userspace (library) (3.x version)
ii  sshfs                      3.7.1+repack-2                 amd64        filesystem client based on SSH File Transfer Protocol

Production cluster, node 01:

$ dpkg -l | egrep "(fuse)|(sshf)"
ii  libfuse2:amd64                2.9.9-1+deb10u1              amd64        Filesystem in Userspace (library)

Could it be that this fuse3 package and its associated programs are creating thousands of file descriptors in the fd directory of the Java process? Because:

On the development cluster (sshfs installed, but not configured or running), node 01:

$ sudo ls -lt /proc/585/fd | grep cgroup | wc -l
2708

On the production cluster, node 01:

$ sudo ls -lt /proc/218658/fd | grep cgroup | wc -l
0

But why? And how to fix this?
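
One thing I could still try, to see which code path actually opens these cgroup entries, is to attach strace to the crate process for a short while and watch the openat calls while clicking around in the UI (a sketch, assuming strace is installed on the node; attach only briefly, since tracing slows the process down):

$ sudo strace -f -tt -e trace=openat -p 585 2>&1 | grep -E '(cgroup|fuse)'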

Another experiment, this time monitoring the cgroup/fuse-related file descriptors of the Java process as I start Chrome, click on a few items in the web-based administrative UI, and exit Chrome:

$ while true; do sudo ls -lt /proc/585/fd | egrep "(cgroup)|(fuse)" | wc -l ; sleep 3; done
2708
2712 <-- Start Chrome
2712 <-- ...
2716 <-- ...
2716
2720
2724 <-- click on UI items, generate SQL to the back-end
2724 <-- ...
2728 <-- ...
2728
2732
2736
2736
2740
2748
2752  <-- Exit Chrome
2752  <-- it's like once those file descriptors are created, they are always there
2752  ...
2752  ...
2752  <-- it's like once those file descriptors are created, they are always there
2752
2752
2752
2752 <-- but why? Some kind of resource leakage?

Hi @Emre_Sevinc,

The Core DB team did a quick analysis but did not find an obvious cause for this. I created a card on our board for us to try to reproduce this behaviour and dig deeper.

Thanks for reporting this and looking into it so far :+1:
