Infrequent "Connection refused" errors

  • We are getting "connection refused" errors infrequently.
  • CrateDB works fine about 95% of the time; about 5% of requests fail with "connection refused".
  • CrateDB is running on a single node.
  • How can we determine the maximum number of connections to CrateDB?
  • How can we monitor this, and is it caused by a resource constraint or some other issue?

GuzzleHttp\Ring\Exception\ConnectException: cURL error 7: Failed to connect to localhost port 4200: Connection refused in /var/www/api/current/vendor/guzzlehttp/ringphp/src/Client/CurlFactory.php:126
Stack trace:
#0 /var/www/api/current/vendor/guzzlehttp/ringphp/src/Client/CurlFactory.php(91): GuzzleHttp\Ring\Client\CurlFactory::createErrorResponse(Array, Array, Array)
#1 /var/www/api/current/vendor/guzzlehttp/ringphp/src/Client/CurlHandler.php(96): GuzzleHttp\Ring\Client\CurlFactory::createResponse(Array, Array, Array, Array, Resource id #13595)

Hi Harshit,

Is this happening with normal usage or are you trying to benchmark the max number of connections?
I am not aware of any hard limit.

How many parallel connections are you opening?
What are the specs of the node?

Best regards
Georg

Hey @proddata, this is happening under normal usage. We are not benchmarking the number of connections.

These are the process and file limits on the node:
  • core file size (blocks, -c) 0
  • data seg size (kbytes, -d) unlimited
  • scheduling priority (-e) 0
  • file size (blocks, -f) unlimited
  • pending signals (-i) 80030
  • max locked memory (kbytes, -l) 64
  • max memory size (kbytes, -m) unlimited
  • open files (-n) 1024
  • pipe size (512 bytes, -p) 8
  • POSIX message queues (bytes, -q) 819200
  • real-time priority (-r) 0
  • stack size (kbytes, -s) 8192
  • cpu time (seconds, -t) unlimited
  • max user processes (-u) 80030
  • virtual memory (kbytes, -v) unlimited
  • file locks (-x) unlimited

The specs of the node are: 2 CPU cores, 20 GB memory.
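One value in the limits above that may be worth a closer look is "open files (-n) 1024", since every TCP connection consumes one file descriptor. The following is only a minimal sketch for checking that limit from Python. The threshold of 65536 is an assumption based on the commonly recommended minimum for CrateDB, and the function name `nofile_headroom` is hypothetical. Whether this limit is actually the cause of the refused connections is not confirmed here.

```python
import resource

# Assumption: 65536 is used as the recommended minimum for open file
# descriptors; adjust it to whatever your CrateDB version documents.
RECOMMENDED_NOFILE = 65536


def nofile_headroom(soft_limit, recommended=RECOMMENDED_NOFILE):
    """Return how many descriptors are missing to reach `recommended`.

    Returns 0 when the limit is already high enough or unlimited.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return 0
    return max(0, recommended - soft_limit)


if __name__ == "__main__":
    # Limits of the *current* process; the CrateDB process may run
    # under a different user with different limits.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("soft limit:", soft, "hard limit:", hard)
    print("missing descriptors:", nofile_headroom(soft))
```

Note that the limits printed by a shell apply to that shell's user. The limits of a service started by systemd can differ, so it is worth checking them for the running CrateDB process as well.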

You can monitor the number of connections to the node with JMX, for example via the JMX exporter.
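As a rough illustration of the monitoring suggestion above, the sketch below filters connection-related lines out of Prometheus-style text output. Everything specific in it is an assumption: the URL `http://localhost:8080/metrics`, the search term `connections`, and the function names are hypothetical, and the actual metric names depend on the exporter and the CrateDB version.

```python
from urllib.request import urlopen


def filter_metrics(text, needle="connections"):
    """Return the non-comment metric lines that contain `needle`."""
    return [
        line
        for line in text.splitlines()
        if line and not line.startswith("#") and needle in line
    ]


def fetch_metrics(url="http://localhost:8080/metrics", timeout=5):
    """Fetch the raw metrics text. The URL is an assumption, see above."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `filter_metrics(fetch_metrics())` periodically and logging the result next to the timestamps of the refused connections would show whether the failures coincide with a high connection count.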

Dear Harshit,

thank you for your report. May I ask which version of CrateDB you are running?

While the localhost:4200 address in the error message indicates you are running CrateDB on your workstation, I wanted to have this confirmed. The localhost:4200 address might be tunneled elsewhere while CrateDB is actually running within a cloud environment.

There are other things to consider when it comes to network connectivity while operating a service within a cloud environment, so I want to humbly ask you to share some more details about the environment CrateDB is running on beyond just the specifications and system limit configuration settings.

With kind regards,
Andreas.

Hey @amotl, we are currently using version 3.0.7.

Dear Harshit,

thank you. Do you also have any information about the environment CrateDB is running on?

With kind regards,
Andreas.

Hey @amotl,

CrateDB is currently running on Ubuntu 18.04.

Dear Harshit,

thank you. Let me point you towards three different aspects I would like to gather more information about.

Hosting/runtime environment

We really need more details about the exact environment in order to figure out what might occasionally be going wrong on your end with respect to network connectivity. There are many possibilities.

Ubuntu 18 could be running on your workstation or on a remote host, either hosted on a bare metal machine or by any means of virtualization, either on dedicated hardware or within a shared environment operated by other people (sometimes also known as the cloud). Also, the address localhost:4200 might either be really on the local host or might get forwarded to another machine by any means.

Workload

Traffic characteristics

Once we know more about the environment, we might want to shed more light on the kind of workload and traffic your CrateDB instance is receiving. The goal is to figure out whether the database is being loaded beyond the system resources it has available, which could lead to resource exhaustion that then shows up as rejected connections.

Client behavior

That includes investigating exactly how the HTTP client (is it RingPHP? [1]) interacts with the database, and whether it quickly exhausts available resources by not reusing TCP connections appropriately [2]. To do that responsibly, any database driver should use some form of connection pooling, where the number of database connections is limited to the pool size. In high-traffic scenarios, this is absolutely crucial.
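To illustrate what limiting connections to a pool size means, here is a minimal sketch of a bounded pool in Python. It is not how RingPHP or any CrateDB driver implements pooling; the class name `BoundedPool` and its parameters are hypothetical and only demonstrate the idea.

```python
import http.client
import queue
from contextlib import contextmanager


class BoundedPool:
    """Hand out at most `size` connections; further callers have to wait."""

    def __init__(self, factory, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks until a connection is free; raises queue.Empty when
        # `timeout` seconds pass without one becoming available.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Return the connection so it is reused, not reopened.
            self._idle.put(conn)


def make_connection():
    # HTTPConnection connects lazily on the first request and can keep
    # the TCP connection open for subsequent requests.
    return http.client.HTTPConnection("localhost", 4200, timeout=10)
```

A client would create `BoundedPool(make_connection, size=10)` once at startup and wrap each request in `with pool.connection() as conn:`. The pool size of 10 is an arbitrary example value.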

Are you able to provide more details about the aspects I’ve outlined?

Thank you in advance and with kind regards,
Andreas.

[1] GitHub - guzzle/RingPHP: Simple handler system used to power clients and servers in PHP (this project is no longer used in Guzzle 6+)
[2] On this matter, looking into the metrics on the number of active connections, as outlined by @proddata, might also give us more insight. On the other hand, it might be sufficient to just look at the code.

Hey @amotl, would you please share the exact commands for Ubuntu 16 that produce the information you need? I'll share the outputs. It would really help me move forward.

Dear Harshit,

thanks for your response.

There is no command available to gather this sort of information I asked you about. It is both about finding out more details about the environment your Ubuntu system is running in and about how the client is talking to the database.

With kind regards,
Andreas.

Hey @proddata, when I install the JMX exporter on my server, all the tables that I normally see in the Crate UI vanish. The commands I run to install the JMX exporter are:

wget https://repo1.maven.org/maven2/io/crate/crate-jmx-exporter/1.0.0/crate-jmx-exporter-1.0.0.jar

export CRATE_JAVA_OPTS="-javaagent:/usr/share/crate/lib/crate-jmx-exporter-1.0.0.jar=8080"

./bin/crate

@Harshit_Anand

This seems very strange, as CrateDB would not delete or overwrite its state without user interaction. Did you change the crate.yml and/or download a new CrateDB version?

Posting the metrics here which I got from the jmx exporter @proddata @amotl

Could you maybe post the output from

SELECT * FROM sys.nodes;

In any case, upgrading to a newer version would probably be a good idea. I just realised that the connection metrics were only exposed starting with 3.1.
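If no SQL console is at hand, the statement above can also be sent to CrateDB's HTTP endpoint at `/_sql`. The following is a minimal sketch using only the Python standard library. The base URL `http://localhost:4200` is taken from the error message in this thread, while the helper names `build_sql_request` and `run_sql` are hypothetical.

```python
import json
from urllib.request import Request, urlopen


def build_sql_request(stmt, base_url="http://localhost:4200"):
    """Build a POST request for the /_sql endpoint carrying `stmt`."""
    payload = json.dumps({"stmt": stmt}).encode("utf-8")
    return Request(
        base_url + "/_sql",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sql(stmt, base_url="http://localhost:4200", timeout=10):
    """Send the statement and return the decoded JSON response."""
    with urlopen(build_sql_request(stmt, base_url), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `run_sql("SELECT * FROM sys.nodes")` against a running node should return a JSON document with the column names and rows, which can then be pasted into the thread.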

Hey @amotl, we upgraded CrateDB in our dummy VM using the full-restart approach, with the following steps:
1. sudo systemctl stop crate
2. sudo apt update crate
3. sudo systemctl start crate

After that we get the following error when we use sudo systemctl start crate

And when we use ./bin/crate we get the following error