Case Insensitive searching in cratedb

I know we can achieve case-insensitive by creating full-text analyzer using a lower case filter, but I have a situation here

Situation

  • Right now I have more than 150 columns in cratedb and everyone is a plain index
  • We are also applying aggregate functions and group by, so all the columns are plain index.
  • Now a situation arises, and there is a need to do case insensitive search (earlier it was an exact match)

There are 2 solutions that I can think of it

Solution 1 (bad solution)

  • Using lowercase function like
    select "column1", "column2" from table where lower("column1")=lower('text')
    but this will scan whole table and hits performance

Solution 2

  • Create 2 columns for each field, one analyzed and one raw
  • Then query on the analyzed field and get raw field
    like
    select "col1_raw", "col2_raw" from table where col1_analyzed='data'
    but that means I need to change the existing table structure and create 150 more fields.
    Is this the way?

Is there any other solution to the problem or my assumptions above are wrong about solution 1.

3 Likes

Hi, I realise this question has been open for a while, but I thought of adding a few comments in case anyone comes across this.

You can use GROUP BY on columns indexed using fulltext with (type=‘simple’) , could you elaborate on the constraints that kept you away from this option?

Regarding “Solution 1”, you can use an ILIKE comparison for better readability (remember to escape % and _ characters if needed), but the engine will still need to evaluate each row, which hits performance.

Regarding “Solution 2”, you can also stick with plain indexes and store the data in the new “search” column in a format that is convenient for lookups, all uppercase stripped out of accents for instance, for new/empty tables you can do this with a generated column.