Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[COST-5137] Azure managed summary #5475

Merged
merged 15 commits into from
Feb 3, 2025
Merged

[COST-5137] Azure managed summary #5475

merged 15 commits into from
Feb 3, 2025

Conversation

myersCody
Copy link
Contributor

@myersCody myersCody commented Jan 30, 2025

Jira Ticket

COST-5137

Description

This change will:

  1. Break out our complex OCP on CLOUD daily summary logic into separate files.

Instead of having one long 1500 line sql file, we have different stages of progress.

  1. Create table
  2. Resource Matching
  3. Daily summary <-- daily aggregation

During the daily aggregation flow we also handle some data transformation such as:

  1. Unattributed storage
  2. Unattributed network

As the features we support continue to grow, the size of these files will as well. Therefore, we now have the ability to chunk out the files as we see fit. Personally in the future I would like to break out the network & storage unattributed data transformations from the daily aggregation logic. But that is outside the scope of this issue.

Testing

  1. Checkout Branch
  2. Load test customer data
  3. Compare tables:
select sum(pretax_cost), resource_id, array_agg(distinct matched_tag) from managed_reporting_ocpazurecostlineitem_project_daily_summary where year='2024' and month='12' group by resource_id order by resource_id;
select sum(pretax_cost), resource_id from reporting_ocpazurecostlineitem_project_daily_summary where year='2024' and month='12' group by resource_id order by **resource_id;

Example:

trino:org1234567> select sum(pretax_cost), resource_id, array_agg(distinct matched_tag) from managed_reporting_ocpazurecostlineitem_project_daily_summary where year='2024' and month='12' group by resource_id order by resource_id;
       _col0        |               resource_id               |                                     _col2
--------------------+-----------------------------------------+-------------------------------------------------------------------------------
  4.002846943148718 | azure-cloud-prefix-pvc-partial-matching | [NULL]
        9.324266345 | azure_compute1                          | [NULL]
  4.219567560000001 | azure_compute1_OSDisk                   | [NULL]
        4.595152264 | azure_compute2                          | [NULL]
  4.228297566999999 | azure_compute3                          | [NULL]
        5.561305184 | azure_master                            | [NULL]
  4.855337400895995 | disk-id-1234567                         | [NULL]
 3.2741646441796077 | pv-123-claimless                        | [NULL]
        28.87511884 | NULL                                    | ["app": "banking", "app": "weather", "storageclass": "Thor", "app": "mobile"]
(9 rows)

trino:org1234567> select sum(pretax_cost), resource_id from reporting_ocpazurecostlineitem_project_daily_summary where year='2024' and month='12' group by resource_id order by resource_id;
       _col0        |               resource_id
--------------------+-----------------------------------------
  4.002846943148717 | azure-cloud-prefix-pvc-partial-matching
  9.324266344999998 | azure_compute1
         4.21956756 | azure_compute1_OSDisk
  4.595152264000001 | azure_compute2
  4.228297566999999 | azure_compute3
  5.561305184000001 | azure_master
  4.861264507315131 | disk-id-1234567
 3.2741646441796073 | pv-123-claimless

You will notice an additional cost that doesn't have a resource_id. That cost is related to tag matching rows. Apparently tag matching just doesn't work in our old flow.

Release Notes

  • proposed release note
* [COST-####](https://issues.redhat.com/browse/COST-####) Fix some things

@myersCody myersCody changed the title Azure managed summary two Azure managed summary Jan 30, 2025
@myersCody myersCody added the azure-smoke-tests pr_check will build the image and run azure + ocp on azure smoke tests label Jan 30, 2025
@myersCody
Copy link
Contributor Author

/retest

@myersCody myersCody changed the title Azure managed summary [COST-5137] Azure managed summary Jan 31, 2025
Copy link

codecov bot commented Jan 31, 2025

Codecov Report

Attention: Patch coverage is 94.33962% with 3 lines in your changes missing coverage. Please review.

Project coverage is 94.1%. Comparing base (fb3f92f) to head (d19d694).
Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##            main   #5475     +/-   ##
=======================================
- Coverage   94.1%   94.1%   -0.0%     
=======================================
  Files        371     371             
  Lines      31624   31638     +14     
  Branches    3387    3388      +1     
=======================================
+ Hits       29756   29766     +10     
- Misses      1217    1218      +1     
- Partials     651     654      +3     

AND year = {{year}}
AND month= {{month}};

INSERT INTO hive.{{schema | sqlsafe}}.managed_azure_uuid_temp (
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@myersCody myersCody marked this pull request as ready for review January 31, 2025 20:18
@myersCody myersCody requested review from a team as code owners January 31, 2025 20:18
AND azure.date >= {{start_date}}
AND azure.date < date_add('day', 1, {{end_date}});

CREATE TABLE IF NOT EXISTS hive.{{schema | sqlsafe}}.managed_azure_openshift_daily_temp
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to rename these tables to better match our flow now is our chance. For example this table is where we start doing resource matching.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming is always fun. Do you have suggestions here at all?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess what I have named them so far 😂 Mostly wanted to call it out in case people had preferences.

@myersCody myersCody merged commit 6a609de into main Feb 3, 2025
14 checks passed
@myersCody myersCody deleted the azure_managed_summary_two branch February 3, 2025 20:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
azure-smoke-tests pr_check will build the image and run azure + ocp on azure smoke tests smokes-required
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants