LAST_VALUE() ignores Window partition

I have a table of Ids which are versioned by timestamp (its just a DataVault satellite table) and I´d like to utilize window functions to get the most recent version of each Id like so:

    SELECT a.*
    FROM odp.s_entity_access a
    inner join (
        SELECT last_value(_id) OVER(PARTITION BY entity_id order by lddts) as rid
        FROM odp.s_entity_access
    ) b on a._id = b.rid
;

Unfortunately LAST_VALUE() seems to always return the value in the current_row instead of the last row in its partition, so my query just returns all rows.

If I don’t use PARTITION BY I get the last Rowid of all rows

As entity_id and lddts are the Primary key, I also joined them together and hashed them as an artifical primary key but this returns the same results as using _id

As A workaround I might using GROUP BY and join via the timestamps (as _id won’t be sortable/linear ascending I guess), but I am afraid that this might be significantly slower with a lot of rows.

Am I using LAST_VALUE() in a wrong way, or is this a bug?

EDIT: I found a good workaround utilizing row_number() instead of last_value() but still its not clear to me why one window function works as expected in the same partition/order by and the other does not

    SELECT a.*
    FROM odp.s_entity_access a
    inner join (
        SELECT ROW_NUMBER() OVER(PARTITION BY entity_id order by lddts desc) as ind,_id
        FROM odp.s_entity_access
    ) b on b.ind = 1 and a._id = b._id
;

I think this might be related to the default frame (https://crate.io/docs/crate/reference/en/4.6/general/builtins/window-functions.html#window-definition)

The default frame definition is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW . If frame_end is omitted it defaults to CURRENT ROW .

SELECT DISTINCT grp,
    FIRST_VALUE(ts) OVER (w) as "first_at_ts",
    FIRST_VALUE(val) OVER (w) as "first_val",
    LAST_VALUE(ts) OVER (w) as "last_at_ts",
    LAST_VALUE(val) OVER  (w) as "last_val"
FROM firstlast
WINDOW w AS (PARTITION BY grp ORDER BY ts  RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING);
1 Like

mhm…but why do other window functions deliver correct results then (see OP-Edit … btw. row_number() correctly counts/increments all partition members)? Also I tried to pass LAST_VALUE() either the _id, or the ts…both didn’t work as expected.

I also tried to set specific frame_end with UNBOUNDED FOLLOWING… still getting all group members returned (current rows)

Could the difference between ROW_NUMBER() and LAST_VALUE() be, that the ordering is different?

PARTITION BY entity_id order by lddts

and

PARTITION BY entity_id order by lddts desc

well…yes, because the very different nature of last_value() and row_number() and how my workaround actually works, thats obviously a difference… nevertheless: it SHOULD also work with last_value() and a correctly defined window, shouldn’it?

The combination first_value() and PARTITION BY entity_id ORDER BY lddts desc RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING doesn’t work either

1 Like