Cratedb getting the count of NULL values excluding empty arrays

Hello crate community back here again. I have a question that is a bit tricky. So I am trying to get the count of “NULL” values. However an empty array is also considered NULL. I have empty arrays in my table. My question is in a table with millions of record is it possible to get a count of items where the actual value of the column is “NULL” without including an empty array?
I tried

SELECT COUNT (*)
FROM "scp_service_transaction"."transactions_v2"
WHERE "itemPrices" IS NULL AND "itemPrices" != []
LIMIT 100;

But that returns

Error!

[Object, Object]

Maybe because there are to0 many records being processed? Running

SELECT COUNT (*)
FROM "scp_service_transaction"."transactions_v2"
WHERE "itemPrices" IS NULL
LIMIT 100;

Gives me a count that includes empty arrays.

Thanks!

Dear @arafaraf,

thank you for creating this issue. May I ask which version of CrateDB you are using?

I think there has been an improvement on this matter recently: Ensure IS NULL does not match empty arrays by mfussenegger · Pull Request #12704 · crate/crate · GitHub. It should be available per CrateDB 5.x and the most recent 4.8.x release.

With kind regards,
Andreas.

Thank you for your response. Sadly I am on crate Version:
4.6.7. Wondering if there is any way around this on my current version since in prod I don’t think we will be updating versions.

Dear @arafaraf,

sadly there is no workaround for this problem, you will need CrateDB Version 4.8.2 or newer. Am I right to assume you are using a primitive inner-type for your itemPrices array? [1]

With kind regards,
Andreas.


  1. While investigating the problem space, we discovered a flaw for the non-primitive inner-types OBJECT and GEO_SHAPE, see Selecting empty arrays vs. NULL values when using non-primitive inner-types OBJECT and GEO_SHAPE · Issue #13100 · crate/crate · GitHub. I think it will not matter for your setup unless you are using such – I just wanted to check back about this detail with you. ↩︎

Yes the type is “ARRAY(REAL)”

Thank you for your prompt response. With CrateDB >=4.8.2, I think you can expect this issue to be fixed for your scenario. This quick snippet demonstrates it. Apologies that there is apparently no workaround.

docker run --rm -it --publish=4200:4200 crate:4.8.2
crash <<SQL
  DROP TABLE IF EXISTS testdrive;
  CREATE TABLE testdrive (details ARRAY(REAL));
  INSERT INTO testdrive (details) VALUES ([]);
  INSERT INTO testdrive (details) VALUES (NULL);
  REFRESH TABLE testdrive;
  SELECT count(*) AS count_array_is_null FROM testdrive WHERE details IS NULL;
  SELECT count(*) AS count_array_equals_empty FROM testdrive WHERE details = [];
SQL
+---------------------+
| count_array_is_null |
+---------------------+
|                   1 |
+---------------------+
SELECT 1 row in set (0.004 sec)

+--------------------------+
| count_array_equals_empty |
+--------------------------+
|                        1 |
+--------------------------+
SELECT 1 row in set (0.004 sec)