-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/230 get partitioning checkpoints #265
Conversation
Release notes:
|
JaCoCo model module code coverage report - scala 2.13.11
|
JaCoCo agent module code coverage report - scala 2.13.11
|
JaCoCo reader module code coverage report - scala 2.13.11
|
JaCoCo server module code coverage report - scala 2.13.11
|
database/src/main/postgres/runs/V1.8.3__get_partitioning_checkpoints.sql
Outdated
Show resolved
Hide resolved
database/src/main/postgres/runs/V1.8.3__get_partitioning_checkpoints.sql
Show resolved
Hide resolved
database/src/main/postgres/runs/V1.8.3__get_partitioning_checkpoints.sql
Outdated
Show resolved
Hide resolved
database/src/main/postgres/runs/V1.8.3__get_partitioning_checkpoints.sql
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I like it, I just found a few tiny improvement ideas
...in/scala/za/co/absa/atum/server/api/database/runs/functions/GetPartitioningCheckpoints.scala
Show resolved
Hide resolved
# Conflicts: # server/src/main/scala/za/co/absa/atum/server/api/controller/PartitioningControllerImpl.scala # server/src/main/scala/za/co/absa/atum/server/api/repository/PartitioningRepositoryImpl.scala
status := 41; | ||
status_text := 'Partitioning not found'; | ||
RETURN NEXT; | ||
RETURN; | ||
END IF; | ||
|
||
IF i_checkpoints_limit IS NOT NULL THEN | ||
SELECT count(*) > i_checkpoints_limit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternate solution, no need to change the current one, just for inspiration:
PERFORM 1
FROM runs.checkpoints C
WHERE C.fk_partitioning = i_partitioning_id
AND (i_checkpoint_name IS NULL OR C.checkpoint_name = i_checkpoint_name)
LIMIT 1
OFFSET coalesce(i_offset) + i_checkpoints_limit;
_has_more := found;
FROM runs.checkpoints C | ||
WHERE C.fk_partitioning = i_partitioning_id | ||
AND (i_checkpoint_name IS NULL OR C.checkpoint_name = i_checkpoint_name) | ||
LIMIT i_checkpoints_limit + 1 OFFSET i_offset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Complicated
First I thought this was a mistake to have ORDER BY
omitted from the query.
Then I thought: wait we don't care about the order of we just want to see if there are more records.
Then I thought: but having the ORDER BY
will cause to have the same data pages cached in memory for the query bellow.
And then I thought: but the ordering might be more expensive then the gain the cache gives.
So I don't know 🤣
LIMIT i_checkpoints_limit + 1 OFFSET i_offset | |
ORDER BY C.process_start_time DESC, C.id_checkpoint | |
LIMIT i_checkpoints_limit + 1 OFFSET i_offset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's keep the code as is then for now :)
runs.measurements M ON C.id_checkpoint = M.fk_checkpoint | ||
JOIN | ||
limited_checkpoints LC | ||
JOIN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personal quirk, no need to change here:
I do prefer INNER JOIN
to JOIN
to make it obvious. So if in future you would consider using that 👼
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, will add the 'INNER' and do so going forward always
C.process_start_time, | ||
C.id_checkpoint | ||
LIMIT nullif(i_limit, 0); | ||
LC.process_start_time, LC.id_checkpoint; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Major:
We want the measurements of one checkpoint go after each other followed by the next checkpoint etc.
While the current way how the measurements come in (all at once) ensure this gives the correct order, this is not guaranteed by the data itself, just by actual processes right now.
LC.process_start_time, LC.id_checkpoint; | |
LC.id_checkpoint, LC.process_start_time; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok switched the order
|
||
IF NOT FOUND THEN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a problem?
I think it's valid situation, if no checkpoint for an existing partitioning exist so far.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, so far in queries returning detail records of a master one, of there were no details one, we ust returned the empty set. Would prefer we stay consistent, and if for some reason this is considered better, changer there too.
Tag @TebaleloS @lsulak
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're right. How about status 12 with text 'OK with no checkpoints found'?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the OK solution might cause even more problems, as the data returned are NULL. So fa-db would need to be able to handle that and recognize that the particular record is OK but return empty sequence.
In the other cases we just returned no records at all.
.in(GetPartitioningCheckpoints) | ||
.in(jsonBody[CheckpointQueryDTO]) | ||
.in(V2Paths.Partitionings / path[Long]("partitioningId") / V2Paths.Checkpoints) | ||
.in(query[Option[Int]]("limit").default(Some(10)).validateOption(Validator.inRange(1, 1000))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor:
Don't we want to put the (upper) limit to constants? Or we might consider different limit for each paginated endpoint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree we could maybe have some base endpoint for all/most paginated endpoints so we don't have to repeat this for every single one of them. Didn't want to optimize it yet given it's the first paginated endpoint ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can do later sure. As hinted, this is minor thing, aboslutely not blocking.
Any chance to increase the coverage %, @salamonpavel ? |
Coincidentally @miroslavpojer today mentioned that jacoco-report had release version 1.7.1 with number of bugifxes. Can you please increase the version in the script to test and see how the new version works. |
Those coverage numbers are incorrect. |
The new jacoco-report version explains the meaning of the negative percentage numbers.
Do the numbers make more sense with this explanation? |
Every single line in GetPartitioningCheckpoints.scala is invoked when testing the code yet it reports coverage of 58.1 %. |
The reason for 58,1% is in jacoco.xml |
IF NOT FOUND THEN | ||
status := 12; | ||
status_text := 'OK with no checkpoints found'; | ||
RETURN NEXT; | ||
END IF; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IF NOT FOUND THEN | |
status := 12; | |
status_text := 'OK with no checkpoints found'; | |
RETURN NEXT; | |
END IF; |
Introduces GET partitioning checkpoints endpoint with pagination
Closes #230
Release notes: