[COST-5137] Azure managed summary #5475

myersCody · 2025-01-30T14:06:43Z

Jira Ticket

COST-5137

Description

This change will:

Break out our complex OCP on CLOUD daily summary logic into separate files.

Instead of one long, 1500-line SQL file, the logic is now split into separate stages:

Create table
Resource Matching
Daily summary <-- daily aggregation

During the daily aggregation flow we also handle some data transformation such as:

Unattributed storage
Unattributed network

As the features we support continue to grow, so will the size of these files, so we now have the ability to split the files up as we see fit. In the future I would like to break the unattributed network and storage data transformations out of the daily aggregation logic, but that is outside the scope of this issue.
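
For illustration only, the staged layout could be driven by a small helper like the sketch below. This is not the code in this PR: the function name `ordered_stage_files` is hypothetical, and it assumes every stage file carries a numeric prefix (as `0_prepare_daily_summary_tables.sql` does) that fixes its position in the flow.

```python
import re
from pathlib import Path

# Assumption: stage files are named "<number>_<description>.sql".
STAGE_PREFIX = re.compile(r"^(\d+)_")


def ordered_stage_files(directory):
    """Return the staged .sql files in `directory`, lowest prefix first."""
    staged = []
    for path in Path(directory).glob("*.sql"):
        match = STAGE_PREFIX.match(path.name)
        if match:  # skip files without a numeric stage prefix
            staged.append((int(match.group(1)), path.name, path))
    # Sort numerically so that stage 10 runs after stage 2, not before it.
    return [path for _, _, path in sorted(staged)]
```

Sorting on the parsed integer rather than the raw file name keeps the order stable if the flow ever grows past ten stages.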

Testing

Check out the branch
Load test customer data
Compare tables:

select sum(pretax_cost), resource_id, array_agg(distinct matched_tag) from managed_reporting_ocpazurecostlineitem_project_daily_summary where year='2024' and month='12' group by resource_id order by resource_id;

select sum(pretax_cost), resource_id from reporting_ocpazurecostlineitem_project_daily_summary where year='2024' and month='12' group by resource_id order by resource_id;

Example:

trino:org1234567> select sum(pretax_cost), resource_id, array_agg(distinct matched_tag) from managed_reporting_ocpazurecostlineitem_project_daily_summary where year='2024' and month='12' group by resource_id order by resource_id;
       _col0        |               resource_id               |                                     _col2
--------------------+-----------------------------------------+-------------------------------------------------------------------------------
  4.002846943148718 | azure-cloud-prefix-pvc-partial-matching | [NULL]
        9.324266345 | azure_compute1                          | [NULL]
  4.219567560000001 | azure_compute1_OSDisk                   | [NULL]
        4.595152264 | azure_compute2                          | [NULL]
  4.228297566999999 | azure_compute3                          | [NULL]
        5.561305184 | azure_master                            | [NULL]
  4.855337400895995 | disk-id-1234567                         | [NULL]
 3.2741646441796077 | pv-123-claimless                        | [NULL]
        28.87511884 | NULL                                    | ["app": "banking", "app": "weather", "storageclass": "Thor", "app": "mobile"]
(9 rows)

trino:org1234567> select sum(pretax_cost), resource_id from reporting_ocpazurecostlineitem_project_daily_summary where year='2024' and month='12' group by resource_id order by resource_id;
       _col0        |               resource_id
--------------------+-----------------------------------------
  4.002846943148717 | azure-cloud-prefix-pvc-partial-matching
  9.324266344999998 | azure_compute1
         4.21956756 | azure_compute1_OSDisk
  4.595152264000001 | azure_compute2
  4.228297566999999 | azure_compute3
  5.561305184000001 | azure_master
  4.861264507315131 | disk-id-1234567
 3.2741646441796073 | pv-123-claimless

You will notice that the managed table has an additional cost row with no resource_id (the NULL row in the first result). That cost comes from rows matched by tag. The old table has no such row, so apparently tag matching just doesn't work in our old flow.
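
One way to make the comparison above less dependent on eyeballing is to diff the two result sets with a floating-point tolerance. The sketch below hard-codes the figures from the example output; `compare_costs` and its `rel_tol` default are assumptions made for illustration, not part of this PR.

```python
import math

# Costs per resource_id copied from the example output above.
managed = {
    "azure-cloud-prefix-pvc-partial-matching": 4.002846943148718,
    "azure_compute1": 9.324266345,
    "azure_compute1_OSDisk": 4.219567560000001,
    "azure_compute2": 4.595152264,
    "azure_compute3": 4.228297566999999,
    "azure_master": 5.561305184,
    "disk-id-1234567": 4.855337400895995,
    "pv-123-claimless": 3.2741646441796077,
    None: 28.87511884,  # the NULL resource_id row (tag-matched cost)
}
old = {
    "azure-cloud-prefix-pvc-partial-matching": 4.002846943148717,
    "azure_compute1": 9.324266344999998,
    "azure_compute1_OSDisk": 4.21956756,
    "azure_compute2": 4.595152264000001,
    "azure_compute3": 4.228297566999999,
    "azure_master": 5.561305184000001,
    "disk-id-1234567": 4.861264507315131,
    "pv-123-claimless": 3.2741646441796073,
}


def compare_costs(new, legacy, rel_tol=1e-9):
    """Return (keys only in new, keys only in legacy, shared keys whose costs differ)."""
    only_new = sorted((k for k in new if k not in legacy), key=str)
    only_legacy = sorted((k for k in legacy if k not in new), key=str)
    mismatched = sorted(
        (k for k in new
         if k in legacy and not math.isclose(new[k], legacy[k], rel_tol=rel_tol)),
        key=str,
    )
    return only_new, only_legacy, mismatched


only_new, only_legacy, mismatched = compare_costs(managed, old)
```

On these figures the only key missing from the old table is the NULL resource_id, and the only shared key outside the tolerance is disk-id-1234567 (about 4.8553 in the managed table versus 4.8613 in the old one). The remaining rows differ only by floating-point noise.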

Release Notes

proposed release note

* [COST-####](https://issues.redhat.com/browse/COST-####) Fix some things

myersCody · 2025-01-30T14:14:39Z

/retest

codecov · 2025-01-31T19:48:47Z

Codecov Report

Attention: Patch coverage is 94.33962% with 3 lines in your changes missing coverage. Please review.

Project coverage is 94.1%. Comparing base (fb3f92f) to head (d19d694).
Report is 1 commit behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##            main   #5475     +/-   ##
=======================================
- Coverage   94.1%   94.1%   -0.0%     
=======================================
  Files        371     371             
  Lines      31624   31638     +14     
  Branches    3387    3388      +1     
=======================================
+ Hits       29756   29766     +10     
- Misses      1217    1218      +1     
- Partials     651     654      +3

myersCody · 2025-01-31T20:10:11Z

...asu/database/trino_sql/azure/openshift/daily_summary_flow/0_prepare_daily_summary_tables.sql

+    AND year = {{year}}
+    AND month= {{month}};
+
+INSERT INTO hive.{{schema | sqlsafe}}.managed_azure_uuid_temp (


https://issues.redhat.com/browse/COST-5866

myersCody · 2025-01-31T21:27:05Z

...asu/database/trino_sql/azure/openshift/daily_summary_flow/0_prepare_daily_summary_tables.sql

+    AND azure.date >= {{start_date}}
+    AND azure.date < date_add('day', 1, {{end_date}});
+
+CREATE TABLE IF NOT EXISTS hive.{{schema | sqlsafe}}.managed_azure_openshift_daily_temp


If we want to rename these tables to better match our flow, now is our chance. For example, this table is where we start doing resource matching.

Naming is always fun. Do you have suggestions here at all?

I guess just what I have named them so far 😂. I mostly wanted to call it out in case people had preferences.

koku/masu/database/trino_sql/verify/managed_ocp_on_azure_verification.sql

myersCody added 7 commits January 23, 2025 11:45

[COST-5137] Azure managed summary flow

152432e

Clean up sql & add prefix

1aad09a

Remove duplicate or unused columns

8d9f3be

Create logic to prepare the template within the dataclass

f606239

Rename dataclass to reflect pipeline functionality

216193c

Remove unused columns

a7d3480

Start reworking verification

c2bae58

github-actions bot added the smokes-required label Jan 30, 2025

myersCody changed the title ~~Azure managed summary two~~ Azure managed summary Jan 30, 2025

myersCody added the azure-smoke-tests label (pr_check will build the image and run azure + ocp on azure smoke tests) Jan 30, 2025

myersCody added 4 commits January 30, 2025 16:44

Merge remote-tracking branch 'origin' into azure_managed_summary_two

e0d31ac

Finish fixing merge conflicts

2c84d74

Fix unittests

a179ebe

Remove verification sql

2306fd8

lcouzens mentioned this pull request Jan 31, 2025

[COST-5137] Initial managed summary tasks rework #5469

Merged

1 task

myersCody and others added 2 commits January 31, 2025 14:15

Merge branch 'main' into azure_managed_summary_two

1a10887

Fix remaining unittests

b25bd28

myersCody changed the title ~~Azure managed summary~~ [COST-5137] Azure managed summary Jan 31, 2025

myersCody commented Jan 31, 2025


myersCody marked this pull request as ready for review January 31, 2025 20:18

myersCody requested review from a team as code owners January 31, 2025 20:18

myersCody commented Jan 31, 2025


lcouzens reviewed Feb 3, 2025


koku/masu/database/trino_sql/verify/managed_ocp_on_azure_verification.sql (outdated, resolved)

myersCody added 2 commits February 3, 2025 12:28

Remove missed verification file.

a55e3c2

Rename directory

d19d694

lcouzens approved these changes Feb 3, 2025


myersCody merged commit 6a609de into main Feb 3, 2025
14 checks passed

myersCody deleted the azure_managed_summary_two branch February 3, 2025 20:10
