Connection Refused error Infrequently

@Harshit_Anand

You should not skip CrateDB minor versions when upgrading.
Starting from 3.0.7, the suggested route would be:
3.0.7 → 3.1.6 → 3.2.8 → 3.3.6 → 4.0.12 → 4.1.8 → 4.2.7 → 4.4.3 → 4.5.1

For Ubuntu 18.04 (bionic), you can find the packages here:
https://cdn.crate.io/downloads/deb/stable/pool/main/c/crate/

and can install them with apt after downloading:

sudo apt install ./crate_4.2.2-1~focal_amd64.deb

(Substitute the package matching your Ubuntu release, e.g. ~bionic instead of ~focal for Ubuntu 18.04.)

Also, please be aware of the 4.0.0 release notes before crossing that major version boundary:

https://crate.io/docs/crate/reference/en/4.5/appendices/release-notes/4.0.0.html

Hey @amotl, we upgraded CrateDB from 3.0.7 to 3.3.6, and I am again sharing the metrics from SELECT * FROM sys.nodes;

Dear Harshit,

thank you for your responses. I will try to separate my answer into three topics.

Number of TCP connections

13 open HTTP connections feels like a reasonable number and does not immediately indicate an overcommitment situation in this regard. 246 total connections over the lifetime of the node does not indicate high usage either, although you might have just restarted the node/cluster?

However, a sensible assessment on this matter can only be made by monitoring and recording those metrics over a period of time, because a single sample does not reveal the access patterns of your applications. This is important because overcommitment situations might happen only infrequently.
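To give an idea, here is a minimal sketch of such a sampler in PHP, submitting the query through CrateDB's HTTP endpoint (/_sql). The host/port, the polling interval, and the exact layout of the connections column of sys.nodes are assumptions on my side; please adjust them to your setup.

    <?php
    // Minimal sketch: periodically sample HTTP connection metrics from
    // sys.nodes through CrateDB's HTTP endpoint. Host/port, interval, and
    // the layout of the `connections` column are assumptions.
    $statement = "SELECT name, connections['http']['open'], connections['http']['total'] FROM sys.nodes";

    $ch = curl_init('http://localhost:4200/_sql');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
    curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode(['stmt' => $statement]));

    while (true) {
        $response = curl_exec($ch);
        if ($response === false) {
            fwrite(STDERR, 'sampling failed: ' . curl_error($ch) . "\n");
        } else {
            $result = json_decode($response, true);
            foreach ($result['rows'] as [$node, $open, $total]) {
                printf("%s node=%s http_open=%d http_total=%d\n", date('c'), $node, $open, $total);
            }
        }
        sleep(60);  // one sample per minute; tune as needed
    }

Feeding those samples into whatever monitoring system you already use would make trends and short-lived spikes visible.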

TCP connections dying

On the other hand, you might also be tripped up by quite the opposite: database connections might be dying silently, without the driver being able to recognize it. This situation can lead to similar errors in userspace.
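One client-side countermeasure, in case intermediate network devices silently drop idle connections, is to enable TCP keepalive on the HTTP client. Here is a minimal sketch using PHP's libcurl binding; the endpoint and the timing values are assumptions to be tuned for your environment.

    <?php
    // Minimal sketch: enable TCP keepalive on a curl handle, so idle
    // connections are probed instead of dying silently. Requires
    // PHP >= 5.5 with libcurl >= 7.25. Endpoint and timings are assumptions.
    $ch = curl_init('http://localhost:4200/');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TCP_KEEPALIVE, 1);   // turn keepalive probing on
    curl_setopt($ch, CURLOPT_TCP_KEEPIDLE, 60);   // idle seconds before the first probe
    curl_setopt($ch, CURLOPT_TCP_KEEPINTVL, 15);  // seconds between subsequent probes
    $response = curl_exec($ch);

Whether this actually helps depends on what exactly is terminating the connections; the GitHub issue linked below discusses this for different cloud environments.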

In order to find out more about this, may I humbly ask you again about the topics I elaborated on at Connection Refused error Infrequently - #9 by amotl? In particular, I would like to point out Networking robustness and resiliency on Azure and beyond (AWS, GCP, AliCloud) · Issue #10779 · crate/crate · GitHub in this context. To be more specific, my questions are:

  • What does your hosting environment look like? Are you running on bare metal, virtual machines, or some form of containers? Which one?
  • What does your network look like? Are you running everything on the same host, or do you use multiple hosts? If so, how are they connected to each other?
  • What does your application container look like? Are you running Apache, PHP-FPM, uWSGI, or some other service for hosting your PHP application as a web application? Or do you invoke batch jobs against the database and don’t run any web application at all?

Answers to those questions will help us gain reasonable insight into the system you are operating. Any further details about your scenario and workload characteristics will also help.

TCP connection reuse

Last but not least, I would like to reflect on invoking HTTP requests from PHP in general, considering a typical web application responding to requests from HTTP clients. In this regard, I also want to ask whether this is actually the case on your end; otherwise I can only guess.

Specifically, I am wondering how any kind of connection reuse for HTTP connections might actually work with PHP. For connecting to databases, PHP has a special feature called »Persistent Database Connections«, implemented in, e.g., mysql_pconnect or pg_pconnect. What is called “persistent connections” here is actually “connection pooling” in a different jargon. I am not an expert in this area, but I do not know of any way to get connection pooling/reuse for HTTP connections invoked from PHP.
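For illustration, here is a minimal sketch of such a persistent database connection, using PHP's pgsql extension against CrateDB's PostgreSQL wire protocol endpoint. The connection parameters are assumptions, and I have not verified driver compatibility with CrateDB in every detail.

    <?php
    // Minimal sketch: a persistent database connection via pg_pconnect().
    // The connection is kept open and reused by later requests handled by
    // the same worker process. Connection parameters are assumptions;
    // CrateDB's PostgreSQL wire protocol listens on port 5432 by default.
    $conn = pg_pconnect('host=localhost port=5432 dbname=doc user=crate');
    if ($conn === false) {
        die('connection failed');
    }

    $result = pg_query($conn, 'SELECT name FROM sys.nodes');
    while ($row = pg_fetch_row($result)) {
        echo $row[0], "\n";
    }
    // No pg_close() here: the connection stays open for the next request.

Note that this covers database-protocol connections only; it does not answer the question about HTTP connections, which the following posts go into.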

With kind regards,
Andreas.

Some research revealed that the pecl_http extension indeed supports connection pooling; see Curl - mdref. On the other hand, I doubt that the regular PHP libcurl binding or the Guzzle HTTP client supports such an option. While the documentation about concurrent requests says:

You can use the GuzzleHttp\Pool object when you have an indeterminate amount of requests you wish to send.

this is not actually about a connection pool that can survive the boundaries of the worker process PHP executes in within a typical web application.

In contrast, the pecl_http documentation contains a note suggesting that it does reuse connections.

If that is true, I believe it is the only option for making this efficient enough to cope with high-traffic situations. See also TechnoSophos: Connection Sharing with CURL in PHP: How to re-use HTTP connections to knock 70% off REST network time.
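The basic technique described there can be sketched as follows: create one curl handle, keep it alive, and issue several requests through it, so libcurl reuses the underlying TCP connection instead of re-establishing it each time. The endpoint and statements are assumptions on my side.

    <?php
    // Minimal sketch: reuse a single curl handle for multiple requests,
    // so libcurl keeps the underlying TCP connection open between them.
    // Endpoint and statements are assumptions; adjust to your setup.
    $ch = curl_init('http://localhost:4200/_sql');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);

    $statements = ['SELECT 1', 'SELECT name FROM sys.cluster'];
    foreach ($statements as $stmt) {
        curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode(['stmt' => $stmt]));
        echo curl_exec($ch), "\n";
    }

    // Only close the handle when completely done: curl_close() also tears
    // down the kept-alive connection.
    curl_close($ch);

This reuse is still confined to a single PHP worker process, so the caveat above about process boundaries applies nevertheless.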

Dear Harshit,

have you been able to remedy the problems you have been observing?

With kind regards,
Andreas.