Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Partition Granularity not working on postgres internal aggregation month wise #9056

Open
alokthakur142 opened this issue Dec 17, 2024 · 5 comments
Assignees
Labels
question The issue is a question. Please use Stack Overflow for questions.

Comments

@alokthakur142
Copy link

alokthakur142 commented Dec 17, 2024

Describe the bug
I want to partition my activity data month-wise. However, cubejs is combining data of more than 1 months in this manner:
Oct 2024 -> From 1 Oct 2024 to 17 Dec 2024.
Nov 2024 -> From 1 Oct 2024 to 17 Dec 2024.
Dec 2024 -> From 1 Oct 2024 to 17 Dec 2024

Expected behavior
It should store only one month data in one partition:
Oct 2024 -> From 1 Oct 2024 to 31 Oct 2024.
Nov 2024 -> From 1 Nov 2024 to 30 Nov 2024.
Dec 2024 -> From 1 Dec 2024 to 17 Dec 2024

cube(`Activity`, {
  sql: `SELECT id, user_id, type, activity_time, remarks, modified_date FROM public.activity where company = ${COMPILE_CONTEXT.securityContext.companyId} and ${FILTER_PARAMS.Activity.activityTime.filter('activity_time')}
  `,
  
  pre_aggregations: {
    main: {
      type: `original_sql`,
      external: false,
      time_dimension: CUBE.activityTime,
      partition_granularity: `month`,
      allow_non_strict_date_range_match: true,
      refresh_key: {
        every: `1 hour`,
	      sql: `SELECT MAX(t.modified_date) FROM public.activity where company = ${COMPILE_CONTEXT.securityContext.companyId} and ${FILTER_PARAMS.Activity.activityTime.filter('activity_time')}`,
      },
      build_range_start: {
        sql: `SELECT date_trunc('month', NOW()) - interval '2 year'`
      },
      build_range_end: {
        sql: `SELECT date_trunc('month', NOW()) + interval '1 month'`
      }
    }
  },
  
  measures: {
    count: {
      type: `count`
    }
  },
  
  dimensions: {
    id: {
      sql: `id`,
      type: `number`,
      primaryKey: true
    },

    type: {
      sql: `type`,
      type: `string`
    },
    
    remarks: {
      sql: `remarks`,
      type: `string`
    },
    
    activityTime: {
      sql: `activity_time`,
      type: `time`
    }
  }
});

Request:

{"query":{
"limit":10,
"offset":0,
"order":{},
"timeDimensions":[{"dimension":"Activity.activityTime","granularity":"day","dateRange":"This Quarter"}],
"dimensions":["Activity.id"],
"measures":["Activity.count"],
"filters":[],
"renewQuery":false
},"queryType":"multi"}

Result:

Activity ID  Count
100                 3
101                 3
102                 3

Version:
1.0.0 to 1.0.7

@alokthakur142
Copy link
Author

@igorlukanin

@alokthakur142
Copy link
Author

alokthakur142 commented Dec 18, 2024

Here is log for pre aggregation planning:

"preAggregationsSchema":"pre_x","loadSql":["CREATE TABLE pre_x.activity_main20241001 AS SELECT id, user_id, remarks, activity_time, modified_date FROM public.activity where tenant = x and ((activity_time >= $1::timestamptz AND activity_time <= $2::timestamptz))\n  ",["2024-10-01T00:00:00.000Z","2024-12-31T23:59:59.999Z"],{}]

It is sending wrong parameter in $2. It should be 2024-10-31T23:59:59 instead.

Please help on this.

@igorlukanin
Copy link
Member

@alokthakur142 Does anything change if you use the rollup pre-aggregation type?

@igorlukanin igorlukanin self-assigned this Dec 18, 2024
@igorlukanin igorlukanin added the question The issue is a question. Please use Stack Overflow for questions. label Dec 18, 2024
@alokthakur142
Copy link
Author

Yes, enabling rollup will not work with joins.

I have provided a playground to the user with a capability to choose attributes of User, Customer, Activity Type and other related entities.

Queries like: City wise Activity, User wise Activity, Designation wise etc is possible only by using original_sql.

Let me know if you have any question on this.

@igorlukanin
Copy link
Member

I guess the issue here is that original_sql pre-aggregations are not supposed to be partitioned by a time dimension. I'll tag @paveltiunov so he can confirm this.

Queries like: City wise Activity, User wise Activity, Designation wise etc is possible only by using original_sql.

I don't think I follow. Once again, what prevents you from using a rollup pre-aggregation here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question The issue is a question. Please use Stack Overflow for questions.
Projects
None yet
Development

No branches or pull requests

2 participants