Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(torii-grpc): start rework to use 1 single query #2817

Merged
merged 23 commits into from
Dec 21, 2024

Conversation

Larkooo
Copy link
Collaborator

@Larkooo Larkooo commented Dec 17, 2024

Summary by CodeRabbit

  • New Features

    • Enhanced model retrieval to return all cached models when no selectors are provided.
    • Improved SQL query construction with new parameters for grouping and filtering.
  • Bug Fixes

    • Streamlined error handling in database interactions, aligning error types with gRPC status codes.
  • Refactor

    • Removed complex entity fetching logic for a simplified codebase.
    • Updated methods to adopt a new binding strategy for SQL queries.
    • Adjusted test parameters and assertions to reflect expected query results.

Copy link

coderabbitai bot commented Dec 17, 2024

Walkthrough

Ohayo, sensei! This pull request introduces modifications across several components of the Torii system, focusing on enhancing SQL query handling and caching mechanisms. Key changes include a new conditional check in the models method of the ModelCache to return all cached models when no selectors are provided, the removal and refactoring of methods in the DojoWorld struct to simplify entity fetching logic, and updates to the build_sql_query function in ModelSQLReader to support new parameters for improved SQL query construction.

Changes

File Change Summary
crates/torii/core/src/sql/cache.rs Modified models method to return all cached models when no selectors are provided.
crates/torii/grpc/src/server/mod.rs Removed fetch_entities method; refactored fetch_historical_event_messages and query_by_hashed_keys methods to implement a new binding strategy for SQL queries.
crates/torii/core/src/model.rs Updated build_sql_query function to include new parameters for model relation tables and having clauses.
crates/torii/grpc/src/server/tests/entities_test.rs Adjusted test_entities_queries function to modify query parameters and update assertions based on expected results.

Possibly Related PRs


📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 759c5b8 and feecbdb.

⛔ Files ignored due to path filters (1)
  • examples/spawn-and-move/Scarb.lock is excluded by !**/*.lock
📒 Files selected for processing (1)
  • crates/torii/grpc/src/server/tests/entities_test.rs (1 hunks)
🔇 Additional comments (2)
crates/torii/grpc/src/server/tests/entities_test.rs (2)

145-145: Ohayo! Consider expanding test coverage for edge cases.

While the test now verifies the query with specific models ("ns-Moves" and "ns-Position"), we should ensure comprehensive coverage of the new single query approach.

Let's verify if we have tests for other scenarios:

Consider adding test cases for:

  • Empty models list
  • Non-existent model names
  • Different ordering of model names

145-147: Verify model name consistency across codebase, sensei!

The hardcoded model names "ns-Moves" and "ns-Position" should be consistent with model definitions.

Let's verify the model names:


🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
crates/torii/grpc/src/server/mod.rs (1)

497-503: Avoid shadowing the count_query variable.

At line 497, count_query is redefined, shadowing the previous count_query declared earlier. While Rust allows variable shadowing, it can reduce code readability and may cause confusion.

Consider renaming the second count_query to something like entity_count_query to enhance clarity.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 28ec8e8 and 060cf0e.

📒 Files selected for processing (2)
  • crates/torii/core/src/sql/cache.rs (1 hunks)
  • crates/torii/grpc/src/server/mod.rs (1 hunks)
🔇 Additional comments (1)
crates/torii/core/src/sql/cache.rs (1)

44-46: Ohayo, sensei! Efficient handling of empty selectors.

The added conditional in lines 44-46 optimizes the models method by returning all cached models when selectors is empty. This avoids unnecessary database queries and improves performance. Great job!

Comment on lines 505 to 511
// retrieve all schemas
let schemas = self
.model_cache
.models(&[])
.await?
.iter()
.map(|m| m.schema.clone())
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Performance concern when loading all models.

In lines 505-511, calling self.model_cache.models(&[]).await? with an empty slice loads all models from the cache. If the cache contains many models, this could impact performance.

Consider specifying only the required model selectors or implementing pagination to limit the number of models loaded.

Comment on lines 467 to 494
if table == EVENT_MESSAGES_HISTORICAL_TABLE {
let count_query = format!(
r#"
SELECT count(*)
FROM {table}
{where_clause}
"#
);
let total_count = sqlx::query_scalar(&count_query)
.bind(entity_updated_after.clone())
.fetch_one(&self.pool)
.await?;
if total_count == 0 {
return Ok((Vec::new(), 0));
}
}

if let Some(entity_updated_after) = entity_updated_after.clone() {
count_query = count_query.bind(entity_updated_after);
}
let total_count = count_query.fetch_optional(&self.pool).await?.unwrap_or(0);
if total_count == 0 {
return Ok((Vec::new(), 0));
let entities =
self.fetch_historical_event_messages(&format!(
r#"
SELECT {table}.id, {table}.data, {table}.model_id, group_concat({model_relation_table}.model_id) as model_ids
FROM {table}
JOIN {model_relation_table} ON {table}.id = {model_relation_table}.entity_id
{where_clause}
GROUP BY {table}.event_id
ORDER BY {table}.event_id DESC
"#
), None, limit, offset).await?;
return Ok((entities, total_count));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Ohayo, sensei! Conditional binding of entity_updated_after.

In lines 476-477, entity_updated_after.clone() is bound to the count_query without checking if it's Some. If entity_updated_after is None, this could lead to unintended SQL queries or errors. Ensure that you bind entity_updated_after only when it's present.

Apply this diff to conditionally bind entity_updated_after:

 let total_count = sqlx::query_scalar(&count_query)
-    .bind(entity_updated_after.clone())
+    .apply(|query| {
+        if let Some(updated_after) = &entity_updated_after {
+            query.bind(updated_after.clone())
+        } else {
+            query
+        }
+    })
     .fetch_one(&self.pool)
     .await?;

This ensures that entity_updated_after is only bound when it's available.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if table == EVENT_MESSAGES_HISTORICAL_TABLE {
let count_query = format!(
r#"
SELECT count(*)
FROM {table}
{where_clause}
"#
);
let total_count = sqlx::query_scalar(&count_query)
.bind(entity_updated_after.clone())
.fetch_one(&self.pool)
.await?;
if total_count == 0 {
return Ok((Vec::new(), 0));
}
}
if let Some(entity_updated_after) = entity_updated_after.clone() {
count_query = count_query.bind(entity_updated_after);
}
let total_count = count_query.fetch_optional(&self.pool).await?.unwrap_or(0);
if total_count == 0 {
return Ok((Vec::new(), 0));
let entities =
self.fetch_historical_event_messages(&format!(
r#"
SELECT {table}.id, {table}.data, {table}.model_id, group_concat({model_relation_table}.model_id) as model_ids
FROM {table}
JOIN {model_relation_table} ON {table}.id = {model_relation_table}.entity_id
{where_clause}
GROUP BY {table}.event_id
ORDER BY {table}.event_id DESC
"#
), None, limit, offset).await?;
return Ok((entities, total_count));
if table == EVENT_MESSAGES_HISTORICAL_TABLE {
let count_query = format!(
r#"
SELECT count(*)
FROM {table}
{where_clause}
"#
);
let total_count = sqlx::query_scalar(&count_query)
.apply(|query| {
if let Some(updated_after) = &entity_updated_after {
query.bind(updated_after.clone())
} else {
query
}
})
.fetch_one(&self.pool)
.await?;
if total_count == 0 {
return Ok((Vec::new(), 0));
}
let entities =
self.fetch_historical_event_messages(&format!(
r#"
SELECT {table}.id, {table}.data, {table}.model_id, group_concat({model_relation_table}.model_id) as model_ids
FROM {table}
JOIN {model_relation_table} ON {table}.id = {model_relation_table}.entity_id
{where_clause}
GROUP BY {table}.event_id
ORDER BY {table}.event_id DESC
"#
), None, limit, offset).await?;
return Ok((entities, total_count));

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
crates/torii/grpc/src/server/mod.rs (1)

513-520: 🛠️ Refactor suggestion

Ohayo, sensei! Loading all models may impact performance

Calling self.model_cache.models(&[]) with an empty slice loads all models from the cache. If the cache contains many models, this could affect performance. Consider specifying only the required model selectors or implementing pagination to limit the number of models loaded.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 060cf0e and 3a45876.

📒 Files selected for processing (2)
  • crates/torii/core/src/model.rs (1 hunks)
  • crates/torii/grpc/src/server/mod.rs (4 hunks)
🔇 Additional comments (1)
crates/torii/core/src/model.rs (1)

206-206: Ohayo, sensei! Verify parameter support in LIMIT and OFFSET clauses

Ensure that using parameter placeholders in LIMIT ? OFFSET ? is supported and behaves as expected in SQLite. While SQLite allows binding parameters in these clauses, it's important to confirm that the query executes correctly with the bound limit and offset values.

Comment on lines 475 to 479
if let Some(limit) = limit {
bind_values.push(limit.to_string());
}
if let Some(offset) = offset {
bind_values.push(offset.to_string());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Ohayo, sensei! Ensure correct data types when binding limit and offset

Currently, limit and offset are converted to strings before being added to bind_values. When binding these values to the query, ensure they are bound as integers to match the expected data types in the SQL query and prevent potential type mismatch issues.

Comment on lines +392 to +401
bind_values: Vec<String>,
limit: Option<u32>,
offset: Option<u32>,
) -> Result<Vec<proto::types::Entity>, Error> {
let db_entities: Vec<(String, String, String, String)> = if keys_pattern.is_some() {
sqlx::query_as(query)
.bind(keys_pattern.unwrap())
.bind(limit)
.bind(offset)
.fetch_all(&self.pool)
.await?
} else {
sqlx::query_as(query).bind(limit).bind(offset).fetch_all(&self.pool).await?
};
let mut query = sqlx::query_as(query);
for value in bind_values {
query = query.bind(value);
}
let db_entities: Vec<(String, String, String, String)> =
query.bind(limit).bind(offset).fetch_all(&self.pool).await?;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Ohayo, sensei! Verify the parameter binding order in fetch_historical_event_messages

Ensure that the bind_values, limit, and offset parameters are bound in the correct order to match the placeholders in your SQL query. Misalignment may lead to unexpected behavior or SQL errors.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (9)
crates/torii/grpc/src/server/mod.rs (6)

410-418: Ohayo, sensei! Provide additional failure mode information on fetch.

If query.fetch_all fails, consider capturing more context (like table name or bound values) so we can debug from logs more easily.


459-470: Ohayo, sensei! Factor out count-query logic.

Repeatedly constructing a count statement may be simplified by a shared helper function, reducing duplication and centralizing error handling.


494-504: Ohayo, sensei! Potential pagination synergy.

When retrieving schemas from model_cache, consider supporting advanced pagination or caching strategies if the data is large or frequently requested.


634-637: Ohayo, sensei! Confirm schema usage is validated.

Ensure that the &schemas slice is not empty or unexpected at this point, otherwise column mapping might fail in subsequent logic.


1153-1158: Ohayo, sensei! Watch out for deep composite recursion.

The nested call to build_composite_clause could lead to performance overhead or stack usage if heavily nested. Consider imposing recursion depth limits or rewriting.


1173-1173: Ohayo, sensei! Style suggestion on formatting.

“{} {}” with two placeholders can sometimes be more readable if extracted into a single phrase or two separate lines. Minor style point only.

crates/torii/core/src/model.rs (3)

191-194: Ohayo, sensei! Evaluate necessity for forced JOIN.

Enforcing JOIN on model_relation_table could produce partial result sets if records are missing. Ensure you need a strict join rather than LEFT JOIN.


210-214: Ohayo, sensei! Flexible filtering with HAVING.

Appending HAVING can be powerful, but be mindful that logical or ephemeral columns from group_concat must be carefully validated.


513-513: Ohayo, sensei! No offset usage in test.

Currently, the test defaults to None. Consider adding a quick check that ensures offset is properly applied for partial results.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3a45876 and 9a4d414.

📒 Files selected for processing (2)
  • crates/torii/core/src/model.rs (5 hunks)
  • crates/torii/grpc/src/server/mod.rs (10 hunks)
🔇 Additional comments (22)
crates/torii/grpc/src/server/mod.rs (18)

267-276: Ohayo, sensei! Rechecking parameter-binding sequencing.

This code segment's parameter-binding order for (query, limit, offset) might lead to hard-to-diagnose issues if it doesn’t match the placeholders in the SQL statement.


338-344: Ohayo, sensei! Clarify behavior when hashed_keys is empty.

If hashed_keys is None or empty, ensure that a subsequent code path properly handles this scenario (e.g., retrieving all results). Otherwise, we may inadvertently collect an empty list, leading to unexpected query results.


347-347: Ohayo, sensei! Validate timestamp correctness.

When pushing entity_updated_after into bind_values, confirm its validity (e.g., that it’s properly formatted). Otherwise, an invalid or incomplete timestamp might trigger SQL errors or mismatched rows.


350-354: Ohayo, sensei! Reiterating type safety for limit/offset.

Limit and offset should ideally be bound as numeric types instead of strings to avoid casting pitfalls.


388-408: Ohayo, sensei! Double-check empty entity model usage.

When retrieving schemas via entity_models, if the vector is empty, we fetch all models from cache, which may quickly grow large. Please verify that this is intended behavior and won’t degrade performance in bigger data sets.


438-443: Ohayo, sensei! Review usage of REGEXP with dynamic patterns.

When constructing "WHERE keys REGEXP ?", confirm that the pattern correctly matches your intended rows and reject user-provided patterns that might lead to unexpected expansions.


447-449: Ohayo, sensei! Confirm updated_after usage.

Pushing entity_updated_after into bind_values across multiple query paths can cause confusion if some code segments assume a different date/time format.


452-456: Ohayo, sensei! Reinforce numeric binding approach for limit/offset.

Similar to lines 350-354, prefer numeric binding for these parameters to maintain type integrity.


471-472: Ohayo, sensei! Early return check acknowledged.

Returning early on a zero count ensures minimal overhead. This snippet seems fine as is.


476-490: Ohayo, sensei! Reconfirm conditional binding of updated_after in historical.

Double-check that the logic around entity_updated_after is consistent with your existing code for partial or missing timestamps in historical queries.


505-515: Ohayo, sensei! Reusing existing model_cache logic.

Loading all models from the cache when none are specified can be expensive and was previously noted. Ensure you really need everything or implement additional filters.


516-525: Ohayo, sensei! Data mapping approach looks straightforward.

The method calls map_row_to_entity, handles errors gracefully, and yields a clear result. No obvious issues here.


1095-1095: Ohayo, sensei! Type signature clarity.

Returning (String, String, Vec) is crisp. Keep consistent naming or doc comments to clarify each returned element’s meaning.


1127-1127: Ohayo, sensei! Validate presence of model_ids.

If model_ids is missing or empty, INSTR(...) checks might incorrectly include or exclude rows. Confirm that we handle a zero-length string gracefully.


1161-1164: Ohayo, sensei! Require a valid subclause.

If nested_where or nested_having is empty, verify how the logic merges with existing where_clauses/having_clauses. We might inadvertently create queries missing logical operators.


1184-1184: Ohayo, sensei! Confirm time-based filter format.

If {table}.updated_at is not a standard date/time column, ensure the string is parsed as intended.


1190-1190: Ohayo, sensei! Double-check combined logic.

HAVING with “OR” vs. “AND” can drastically change results. Confirm that your design needs “logical OR” or “logical AND.”


1195-1195: Ohayo, sensei! Tuple return is neat.

Returning (where_clause, having_clause, bind_values) is nice and straightforward. No issues found.

crates/torii/core/src/model.rs (4)

123-126: Ohayo, sensei! Great addition of parameters.

Introducing model_relation_table and having_clause clarifies grouping and advanced filtering logic in build_sql_query.


177-177: Ohayo, sensei! Confirm group_concat usage.

Using group_concat(model_id) as model_ids can simplify queries but watch out for the separator character. The default comma should be consistent with your splits.


208-209: Ohayo, sensei! Grouping by entity ID.

Grouping by {table_name}.id must align with the large column selection. Check for potential data skew or unexpected grouping with large tables.


507-507: Ohayo, sensei! Using “entity_model” for the test.

Explicitly testing the new parameter ensures stable coverage. Good practice to confirm correctness via integration tests as well.

Comment on lines 357 to 384
let count_query = format!(
r#"
SELECT count(*)
FROM {table}
JOIN {model_relation_table} ON {table}.id = {model_relation_table}.entity_id
{where_clause}
GROUP BY {table}.id
ORDER BY {table}.event_id DESC
"#
)
};

if limit.is_some() {
query += " LIMIT ?"
"#
);
let mut count_query = sqlx::query_scalar(&count_query);
for value in &bind_values {
count_query = count_query.bind(value);
}

if offset.is_some() {
query += " OFFSET ?"
let total_count = count_query.fetch_one(&self.pool).await?;
if total_count == 0 {
return Ok((Vec::new(), 0));
}

if table == EVENT_MESSAGES_HISTORICAL_TABLE {
let entities =
self.fetch_historical_event_messages(&query, None, limit, offset).await?;
self.fetch_historical_event_messages(&format!(
r#"
SELECT {table}.id, {table}.data, {table}.model_id, group_concat({model_relation_table}.model_id) as model_ids
FROM {table}
JOIN {model_relation_table} ON {table}.id = {model_relation_table}.entity_id
{where_clause}
GROUP BY {table}.event_id
ORDER BY {table}.event_id DESC
"#
), bind_values, limit, offset).await?;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Ohayo, sensei! Consider refactoring for readability and performance.

The logic that constructs a separate count query and then conditionally fetches historical event messages is lengthy. Splitting it into smaller helper functions could improve maintainability and testability.


// Use the column name directly since it's already flattened
where_clauses
.push(format!("([{alias}].[{}] {comparison_operator} ?)", member.member));
.push(format!("([{model}].[{}] {comparison_operator} ?)", member.member));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Ohayo, sensei! Safely handle dynamic column references.

Interpolating bracketed object references could break if the model name or member contains unexpected characters. Add sanitization or strict naming constraints.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
crates/torii/core/src/model.rs (1)

123-126: Ohayo sensei! Consider using a builder pattern or options struct for query parameters.

The function signature is becoming quite lengthy with 8 parameters. This could make it harder to maintain and use correctly.

Consider refactoring to use a builder pattern or options struct:

pub struct QueryOptions<'a> {
    pub schemas: &'a Vec<Ty>,
    pub table_name: &'a str,
    pub model_relation_table: &'a str,
    pub entity_relation_column: &'a str,
    pub where_clause: Option<&'a str>,
    pub having_clause: Option<&'a str>,
    pub order_by: Option<&'a str>,
    pub limit: Option<u32>,
    pub offset: Option<u32>,
}

pub fn build_sql_query(options: QueryOptions) -> Result<(String, String), Error>
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9a4d414 and 754805e.

📒 Files selected for processing (1)
  • crates/torii/core/src/model.rs (4 hunks)
🔇 Additional comments (2)
crates/torii/core/src/model.rs (2)

Line range hint 177-193: Be cautious of SQLite's GROUP_CONCAT limitations and JOIN performance.

Two potential concerns to consider:

  1. SQLite's GROUP_CONCAT has a default length limit which might truncate results for entities with many models
  2. The unconditional JOIN with model_relation_table could impact performance for queries that don't need model IDs

Let's verify the usage of GROUP_CONCAT in the codebase:

Consider:

  1. Setting appropriate group_concat_max_len in your SQLite configuration
  2. Making the model_relation_table JOIN conditional based on whether model IDs are needed

200-205: Verify count query behavior with GROUP_CONCAT.

The combination of COUNT(DISTINCT) with GROUP_CONCAT might affect query performance and could produce unexpected results.

Let's check for similar query patterns in the codebase:

Comment on lines +511 to +517
"entity_model",
"internal_entity_id",
None,
None,
None,
None,
None,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Ohayo! Enhance test coverage for new query features.

The test case has been updated with the new parameters, but it doesn't verify the behavior of GROUP BY, HAVING clause, or model_ids aggregation.

Add test cases to verify:

  1. GROUP BY behavior with aggregated model_ids
  2. HAVING clause filtering
  3. Count query results with different combinations of parameters
#[test]
fn test_query_with_aggregation() {
    // Test with model_ids aggregation
}

#[test]
fn test_query_with_having() {
    // Test HAVING clause filtering
}

Comment on lines 212 to 218
query += &format!(" GROUP BY {table_name}.id");

if let Some(having_clause) = having_clause {
query += &format!(" HAVING {}", having_clause);
count_query += &format!(" HAVING {}", having_clause);
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider making GROUP BY conditional and document HAVING clause behavior.

The unconditional GROUP BY might affect performance for simple queries that don't need grouping. Also, applying the HAVING clause to the count query might lead to unexpected total counts.

Consider:

  1. Making GROUP BY conditional based on whether aggregation is needed
  2. Documenting the impact of HAVING clause on count queries
- query += &format!(" GROUP BY {table_name}.id");
+ if !selections.iter().any(|s| s.contains("group_concat")) {
+     query += &format!(" GROUP BY {table_name}.id");
+ }

Committable suggestion skipped: line range outside the PR's diff.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
crates/torii/core/src/model.rs (1)

512-518: 🛠️ Refactor suggestion

Ohayo sensei! Enhance test coverage for new query features.

The test case has been updated with the new parameters, but it doesn't verify the behavior of GROUP BY, HAVING clause, or model_ids aggregation.

🧹 Nitpick comments (2)
crates/torii/core/src/model.rs (2)

177-177: Consider making model_ids aggregation conditional.

The unconditional GROUP_CONCAT might impact performance for queries that don't need model IDs aggregation.

-    selections.push(format!("group_concat({model_relation_table}.model_id) as model_ids"));
+    if needs_model_ids {  // Add a parameter to control this
+        selections.push(format!("group_concat({model_relation_table}.model_id) as model_ids"));
+    }

191-194: Consider making model relation join conditional.

The JOIN with model_relation_table is always added, which might be unnecessary for queries that don't need model IDs.

-    joins.push(format!(
-        "JOIN {model_relation_table} ON {table_name}.id = {model_relation_table}.entity_id"
-    ));
+    if needs_model_ids {  // Add a parameter to control this
+        joins.push(format!(
+            "JOIN {model_relation_table} ON {table_name}.id = {model_relation_table}.entity_id"
+        ));
+    }
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9a4d414 and 55ed216.

📒 Files selected for processing (1)
  • crates/torii/core/src/model.rs (4 hunks)
🔇 Additional comments (4)
crates/torii/core/src/model.rs (4)

200-203: LGTM! Correct approach for count query construction.

The subquery approach ensures accurate counts when using GROUP BY.


210-216: Consider making GROUP BY conditional.

The GROUP BY clause is always added, which might impact performance for queries that don't need aggregation.


217-219: LGTM! Proper subquery closure.

The implementation correctly closes the subquery for the count query.


123-126: Ohayo! Verify the impact of new parameters on existing callers.

The addition of model_relation_table and having_clause parameters might affect existing callers of this function.

✅ Verification successful

Ohayo! The new parameters are properly handled by all callers, sensei!

All callers of build_sql_query are already providing the required model_relation_table parameter and correctly handling the optional having_clause parameter (which is passed as None when not needed). The function is called in:

  • crates/torii/grpc/src/server/mod.rs (4 calls)
  • crates/torii/core/src/model.rs (1 test call)
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find all callers of build_sql_query to verify parameter updates
rg -A 5 "build_sql_query\(" --type rust

Length of output: 2505

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
crates/torii/core/src/model.rs (1)

514-520: 🛠️ Refactor suggestion

Ohayo sensei! Enhance test coverage for new query features.

The test case has been updated with the new parameters, but it doesn't verify the behavior of GROUP BY, HAVING clause, or model_ids aggregation.

🧹 Nitpick comments (2)
crates/torii/core/src/model.rs (2)

177-177: Ohayo sensei! Consider documenting SQLite-specific functionality.

The GROUP_CONCAT function is SQLite-specific. Consider adding a comment to document this dependency for maintainability.

+    // Using SQLite's GROUP_CONCAT to aggregate model IDs
     selections.push(format!("group_concat({model_relation_table}.model_id) as model_ids"));

211-218: Consider making GROUP BY conditional.

The GROUP BY clause is always added, which might not be necessary for all queries. Consider making it conditional based on whether aggregation is needed.

-    query += &format!(" GROUP BY {table_name}.id");
-    count_query += &format!(" GROUP BY {table_name}.id");
+    if selections.iter().any(|s| s.contains("group_concat")) {
+        query += &format!(" GROUP BY {table_name}.id");
+        count_query += &format!(" GROUP BY {table_name}.id");
+    }
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 55ed216 and b0d3683.

📒 Files selected for processing (1)
  • crates/torii/core/src/model.rs (4 hunks)
🔇 Additional comments (2)
crates/torii/core/src/model.rs (2)

123-126: Ohayo! Clean function signature update.

The new parameters are well-organized and maintain a logical grouping of related parameters.


200-204: Ohayo! Verify count query performance.

The count query uses a subquery with GROUP_CONCAT, which might impact performance with large datasets. Consider verifying the performance impact.

Comment on lines +191 to +194
joins.push(format!(
"JOIN {model_relation_table} ON {table_name}.id = {model_relation_table}.entity_id"
));

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Consider using LEFT JOIN for model relations.

Using JOIN might exclude entities without model relations. Consider using LEFT JOIN to ensure all entities are included in the result set.

-    joins.push(format!(
-        "JOIN {model_relation_table} ON {table_name}.id = {model_relation_table}.entity_id"
-    ));
+    joins.push(format!(
+        "LEFT JOIN {model_relation_table} ON {table_name}.id = {model_relation_table}.entity_id"
+    ));
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
joins.push(format!(
"JOIN {model_relation_table} ON {table_name}.id = {model_relation_table}.entity_id"
));
joins.push(format!(
"LEFT JOIN {model_relation_table} ON {table_name}.id = {model_relation_table}.entity_id"
));

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
crates/torii/grpc/src/server/mod.rs (1)

1153-1158: Ohayo, sensei! Consider extracting nested clause handling.

The nested composite clause handling logic could be extracted into a separate function for better readability and maintainability.

Consider creating a dedicated function:

fn handle_nested_composite(
    nested: &proto::types::CompositeClause,
    table: &str,
    model_relation_table: &str,
    entity_updated_after: Option<String>,
) -> Result<(Vec<String>, Vec<String>, Vec<String>), Error> {
    let (nested_where, nested_having, nested_values) = build_composite_clause(
        table,
        model_relation_table,
        nested,
        entity_updated_after,
    )?;
    
    Ok((
        if nested_where.is_empty() { vec![] } else { vec![nested_where] },
        if nested_having.is_empty() { vec![] } else { vec![nested_having] },
        nested_values,
    ))
}

Also applies to: 1161-1164

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b0d3683 and ee9f43a.

📒 Files selected for processing (1)
  • crates/torii/grpc/src/server/mod.rs (10 hunks)
🔇 Additional comments (2)
crates/torii/grpc/src/server/mod.rs (2)

267-276: Ohayo, sensei! Verify parameter binding order.

The sequential binding of values could lead to incorrect parameter mapping if the order of bind_values doesn't match the query placeholders.

#!/bin/bash
# Description: Check query parameter placeholders and their order
rg -A 5 "SELECT.*FROM\s+$EVENT_MESSAGES_HISTORICAL_TABLE" | grep -E '\?'

338-354: Ohayo, sensei! Ensure correct data types for limit and offset.

Converting limit and offset to strings before binding could cause type mismatches with SQLite's expectations.

"INSTR(group_concat({model_relation_table}.model_id), '{:#x}') > 0",
model_id
));
having_clauses.push(format!("INSTR(model_ids, '{:#x}') > 0", model_id));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Ohayo, sensei! Potential SQL injection risk in having clause.

Direct string interpolation of model_id in the having clause could be vulnerable to SQL injection. Consider using parameterized queries here as well.

Apply this diff to use parameterized queries:

-having_clauses.push(format!("INSTR(model_ids, '{:#x}') > 0", model_id));
+having_clauses.push("INSTR(model_ids, ?) > 0".to_string());
+bind_values.push(format!("{:#x}", model_id));
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
having_clauses.push(format!("INSTR(model_ids, '{:#x}') > 0", model_id));
having_clauses.push("INSTR(model_ids, ?) > 0".to_string());
bind_values.push(format!("{:#x}", model_id));

Comment on lines +1035 to +1040
let model_ids = row
.get::<String, _>("model_ids")
.split(',')
.map(|id| Felt::from_str(id).map_err(ParseError::FromStr))
.collect::<Result<Vec<_>, _>>()?;

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Ohayo, sensei! Add validation for model IDs.

The code assumes that all model IDs in the comma-separated string are valid. Consider adding validation to handle malformed model IDs gracefully.

Apply this diff to add validation:

     let model_ids = row
         .get::<String, _>("model_ids")
         .split(',')
+        .filter(|id| !id.is_empty())  // Skip empty strings
         .map(|id| Felt::from_str(id).map_err(ParseError::FromStr))
         .collect::<Result<Vec<_>, _>>()?;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let model_ids = row
.get::<String, _>("model_ids")
.split(',')
.map(|id| Felt::from_str(id).map_err(ParseError::FromStr))
.collect::<Result<Vec<_>, _>>()?;
let model_ids = row
.get::<String, _>("model_ids")
.split(',')
.filter(|id| !id.is_empty()) // Skip empty strings
.map(|id| Felt::from_str(id).map_err(ParseError::FromStr))
.collect::<Result<Vec<_>, _>>()?;

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (4)
crates/torii/grpc/src/server/mod.rs (4)

350-354: ⚠️ Potential issue

Ohayo, sensei! Use proper numeric types for limit and offset.

Converting limit and offset to strings before binding could lead to type conversion issues. Consider binding them as integers.

-bind_values.push(limit.to_string());
+bind_values.push(limit);

-bind_values.push(offset.to_string());
+bind_values.push(offset);

1127-1127: ⚠️ Potential issue

Ohayo, sensei! SQL injection vulnerability in having clause.

Direct string interpolation of model_id in the having clause is unsafe. Use parameterized queries instead.

-having_clauses.push(format!("INSTR(model_ids, '{:#x}') > 0", model_id));
+having_clauses.push("INSTR(model_ids, ?) > 0".to_string());
+bind_values.push(format!("{:#x}", model_id));

1035-1040: ⚠️ Potential issue

Ohayo, sensei! Add validation for model IDs parsing.

The current implementation assumes all model IDs in the comma-separated string are valid. Add validation to handle malformed or empty model IDs.

 let model_ids = row
     .get::<String, _>("model_ids")
     .split(',')
+    .filter(|id| !id.trim().is_empty())
     .map(|id| Felt::from_str(id).map_err(ParseError::FromStr))
     .collect::<Result<Vec<_>, _>>()?;

1149-1149: ⚠️ Potential issue

Ohayo, sensei! Sanitize column name references.

Direct string interpolation of model and member names in SQL queries could be unsafe. Consider adding validation or using a query builder that handles escaping.

-push(format!("([{model}].[{}] {comparison_operator} ?)", member.member));
+push(format!("([{}].[{}] {comparison_operator} ?)", 
+    sanitize_identifier(&model), 
+    sanitize_identifier(&member.member)));

+fn sanitize_identifier(id: &str) -> String {
+    // Add proper SQL identifier sanitization
+    id.replace(['[', ']', '"', '`'], "_")
+}
🧹 Nitpick comments (1)
crates/torii/grpc/src/server/mod.rs (1)

338-344: Ohayo, sensei! Consider using parameterized queries for hashed keys.

The current implementation converts Felt values to hexadecimal strings before binding. Consider using parameterized queries to handle the conversion at the database level for better type safety.

-bind_values = hashed_keys
-    .hashed_keys
-    .iter()
-    .map(|key| format!("{:#x}", Felt::from_bytes_be_slice(key)))
-    .collect::<Vec<_>>()
+bind_values = hashed_keys
+    .hashed_keys
+    .iter()
+    .map(|key| Felt::from_bytes_be_slice(key).to_string())
+    .collect::<Vec<_>>()
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ee9f43a and 496364f.

📒 Files selected for processing (1)
  • crates/torii/grpc/src/server/mod.rs (10 hunks)
🔇 Additional comments (3)
crates/torii/grpc/src/server/mod.rs (3)

267-276: Ohayo, sensei! Verify bind values order matches SQL placeholders.

The function now accepts a vector of bind values and binds them sequentially. Ensure that the order of values in bind_values matches the order of placeholders in the SQL query to prevent data misalignment.

#!/bin/bash
# Description: Check SQL query placeholders order
# Look for SQL queries that use the bind_values parameter

rg -U "fetch_historical_event_messages.*bind_values.*\n.*\n.*SELECT.*FROM.*WHERE" 

1153-1158: Ohayo, sensei! Well-structured nested clause handling.

The recursive handling of nested composite clauses is clean and maintains proper bind value propagation.


1190-1193: Ohayo, sensei! Robust error handling implementation.

The error handling for composite clauses is well-implemented, properly propagating errors and maintaining clean error boundaries.

);

let mut db_query = sqlx::query_as(&query);
println!("query: {}", query);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Ohayo, sensei! Remove debug print statement.

Debug print statements should not be in production code. Consider using the logging framework instead.

-println!("query: {}", query);
+log::debug!("Executing query: {}", query);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
println!("query: {}", query);
log::debug!("Executing query: {}", query);

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
crates/torii/grpc/src/server/mod.rs (2)

1165-1170: Ohayo, sensei! Enhance error handling for nested composites.

Consider adding more specific error handling for nested composite clauses to provide better error messages and debugging information.

Apply this diff to improve error handling:

 let (nested_where, nested_having, nested_values) = build_composite_clause(
     table,
     model_relation_table,
     nested,
     entity_updated_after.clone(),
-)?;
+).map_err(|e| Error::from(QueryError::CompositeClauseError(format!("Failed to build nested clause: {}", e))))?;

Line range hint 1185-1196: Ohayo, sensei! Simplify where clause construction.

The where clause construction can be made more concise using Option's map and unwrap_or.

Apply this diff to simplify the code:

-let where_clause = if !where_clauses.is_empty() {
-    format!(
-        "{} {}",
-        where_clauses.join(if is_or { " OR " } else { " AND " }),
-        if let Some(entity_updated_after) = entity_updated_after.clone() {
-            bind_values.push(entity_updated_after);
-            format!("AND {table}.updated_at >= ?")
-        } else {
-            String::new()
-        }
-    )
-} else if let Some(entity_updated_after) = entity_updated_after.clone() {
-    bind_values.push(entity_updated_after);
-    format!("{table}.updated_at >= ?")
-} else {
-    String::new()
-};
+let where_clause = {
+    let mut parts = Vec::new();
+    if !where_clauses.is_empty() {
+        parts.push(where_clauses.join(if is_or { " OR " } else { " AND " }));
+    }
+    if let Some(updated_after) = entity_updated_after.clone() {
+        bind_values.push(updated_after);
+        parts.push(format!("{table}.updated_at >= ?"));
+    }
+    parts.join(" AND ")
+};
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 78bc49e and 343d1e0.

📒 Files selected for processing (1)
  • crates/torii/grpc/src/server/mod.rs (10 hunks)
🔇 Additional comments (6)
crates/torii/grpc/src/server/mod.rs (6)

267-276: Ohayo, sensei! Verify parameter binding order.

The order of binding parameters in fetch_historical_event_messages needs to be verified to ensure they match the SQL query placeholders.


338-354: Ohayo, sensei! Ensure correct data type binding.

Converting limit and offset to strings before binding could cause type mismatches with the SQL query.


357-384: Ohayo, sensei! Consider refactoring for better maintainability.

The query construction and execution logic could be split into smaller helper functions.


501-511: Ohayo, sensei! Consider performance implications.

Loading all models when entity_models is empty could impact performance with large datasets.


719-719: Ohayo, sensei! Remove debug print statement.

Debug print statements should be replaced with proper logging.


1202-1205: Ohayo, sensei! Having clause implementation looks good!

The having clause construction is clean and straightforward.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (4)
crates/torii/grpc/src/server/mod.rs (4)

338-354: ⚠️ Potential issue

Ohayo, sensei! Use appropriate data types for LIMIT and OFFSET.

Converting limit and offset to strings before binding could lead to type conversion issues in SQLite. Consider binding them directly as integers.

Apply this diff to fix the type conversion:

-bind_values.push(limit.to_string());
+bind_values.push(limit.to_string()); // TODO: Use proper numeric binding
-bind_values.push(offset.to_string());
+bind_values.push(offset.to_string()); // TODO: Use proper numeric binding

399-403: ⚠️ Potential issue

Ohayo, sensei! Protect against SQL injection in HAVING clause.

Direct string interpolation of model_id in the HAVING clause could be vulnerable to SQL injection. Use parameterized queries instead.

Apply this diff to use parameterized queries:

-            .map(|model| format!("INSTR(model_ids, '{:#x}') > 0", model))
+            .map(|model| {
+                bind_values.push(format!("{:#x}", model));
+                "INSTR(model_ids, ?) > 0".to_string()
+            })

726-726: ⚠️ Potential issue

Ohayo, sensei! Replace debug print with proper logging.

Debug print statements should not be in production code.

Apply this diff to use the logging framework:

-println!("query: {}", query);
+log::debug!("Executing query: {}", query);

1054-1059: ⚠️ Potential issue

Ohayo, sensei! Add validation for model IDs.

The code assumes all model IDs in the comma-separated string are valid. Consider adding validation to handle malformed or empty model IDs.

Apply this diff to add validation:

     let model_ids = row
         .get::<String, _>("model_ids")
         .split(',')
+        .filter(|id| !id.trim().is_empty())  // Skip empty strings
         .map(|id| Felt::from_str(id).map_err(ParseError::FromStr))
         .collect::<Result<Vec<_>, _>>()?;
🧹 Nitpick comments (1)
crates/torii/grpc/src/server/mod.rs (1)

1140-1149: Ohayo, sensei! Remove commented out code.

The commented out code for model checks is no longer needed since top-level entity models are being used. Remove it to improve code cleanliness.

Apply this diff to clean up the code:

-                // NOTE: disabled since we are now using the top level entity models
-
-                // for model in &keys.models {
-                //     let (namespace, model_name) = model
-                //         .split_once('-')
-                //         .ok_or(QueryError::InvalidNamespacedModel(model.clone()))?;
-                //     let model_id = compute_selector_from_names(namespace, model_name);
-
-                //     having_clauses.push(format!("INSTR(model_ids, '{:#x}') > 0", model_id));
-                // }
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 343d1e0 and 56e4e6d.

📒 Files selected for processing (1)
  • crates/torii/grpc/src/server/mod.rs (10 hunks)
🔇 Additional comments (2)
crates/torii/grpc/src/server/mod.rs (2)

267-276: Ohayo, sensei! Verify bind values order in query execution.

The function now accepts a vector of bind values and binds them sequentially. Ensure that the order of values in bind_values matches the order of placeholders in the SQL query to prevent incorrect parameter binding.

#!/bin/bash
# Description: Check query parameter placeholders order
# Look for SQL queries that use these bind values

rg -A 5 'fetch_historical_event_messages.*bind_values'

389-397: Ohayo, sensei! Consider performance impact of loading all models.

Loading all models when entity_models is empty could impact performance if there are many models in the cache.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
crates/torii/grpc/src/server/mod.rs (1)

1140-1149: Ohayo, sensei! Remove commented out code.

If the code is no longer needed due to using top-level entity models, it should be removed rather than left commented out.

Remove the commented section to improve code maintainability.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 56e4e6d and 39002ac.

📒 Files selected for processing (1)
  • crates/torii/grpc/src/server/mod.rs (10 hunks)
🔇 Additional comments (7)
crates/torii/grpc/src/server/mod.rs (7)

357-370: Ohayo, sensei! Well-structured count query implementation.

The count query implementation with proper bind value handling and early return optimization looks good.


1174-1180: Ohayo, sensei! Well-implemented nested composite clause handling.

The recursive handling of nested composite clauses with proper bind value management is well done.


727-727: ⚠️ Potential issue

Ohayo, sensei! Replace debug print with proper logging.

Debug print statements should be replaced with proper logging in production code.

Apply this diff:

-println!("query: {}", query);
+log::debug!("Executing query: {}", query);

Likely invalid or redundant comment.


399-403: ⚠️ Potential issue

Ohayo, sensei! SQL injection vulnerability in having clause.

Direct string interpolation of model_id in the having clause is unsafe.

Apply this diff to use parameterized queries:

-            .map(|model| format!("INSTR(model_ids, '{:#x}') > 0", model))
+            .map(|model| {
+                bind_values.push(format!("{:#x}", model));
+                "INSTR(model_ids, ?) > 0".to_string()
+            })

Likely invalid or redundant comment.


1055-1060: 🛠️ Refactor suggestion

Ohayo, sensei! Add validation for model IDs parsing.

The current implementation could be more robust in handling malformed or empty model IDs.

Apply this diff to add validation:

     let model_ids = row
         .get::<String, _>("model_ids")
         .split(',')
+        .filter(|id| !id.trim().is_empty())  // Skip empty strings
         .map(|id| Felt::from_str(id).map_err(ParseError::FromStr))
         .collect::<Result<Vec<_>, _>>()?;

Likely invalid or redundant comment.


338-354: ⚠️ Potential issue

Ohayo, sensei! Ensure proper type handling for numeric bind values.

Converting limit and offset to strings before binding could lead to type mismatches in SQL queries. Consider using native numeric types for binding.

Apply this diff to fix the type handling:

-bind_values.push(limit.to_string());
+bind_values.push(limit); // Let SQLx handle the type conversion

-bind_values.push(offset.to_string());
+bind_values.push(offset); // Let SQLx handle the type conversion

Likely invalid or redundant comment.


267-276: Ohayo, sensei! Verify bind values order in query execution.

The refactoring to use bind_values improves query parameter handling, but ensure that the values are bound in the correct order to match the query placeholders.

✅ Verification successful

Ohayo, sensei! The bind values order is correctly maintained.

The verification shows that:

  1. The SQL query is constructed with placeholders in a specific order in the calling code
  2. The bind_values are applied in the same order through the for loop
  3. limit and offset parameters are bound last, maintaining consistency with the query structure

The implementation ensures that bind values match their corresponding placeholders in the SQL query, as evidenced by the two call sites where the function is used with properly formatted queries and corresponding bind values.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check query parameter binding order
# Look for all usages of fetch_historical_event_messages to verify bind_values construction

rg -A 5 'fetch_historical_event_messages.*bind_values'

Length of output: 56


Script:

#!/bin/bash
# Let's try a broader search to understand the context better
# First, let's find the query construction and usage
rg -B 5 -A 10 'fetch_historical_event_messages'

# Also search for the SQL query construction
ast-grep --pattern 'sqlx::query_as($_)'

Length of output: 10204


Script:

#!/bin/bash
# Let's also check where this function is called from
rg -B 3 -A 3 'historical_event_messages'

Length of output: 1617


Script:

#!/bin/bash
# Check the full implementation of the function
ast-grep --pattern 'async fn fetch_historical_event_messages($$$) -> Result<$$$> {
  $$$
}'

Length of output: 2778

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
crates/torii/grpc/src/server/mod.rs (2)

360-364: ⚠️ Potential issue

Ohayo, sensei! SQL injection risk in having clause.

Direct string interpolation of model IDs in the having clause could be vulnerable to SQL injection.

Apply this diff to use parameterized queries:

-            .map(|model| format!("INSTR(model_ids, '{:#x}') > 0", model))
+            .map(|model| {
+                bind_values.push(format!("{:#x}", model));
+                "INSTR(model_ids, ?) > 0".to_string()
+            })

1025-1030: 🛠️ Refactor suggestion

Ohayo, sensei! Add validation for model IDs.

The code should handle malformed model IDs gracefully.

    let model_ids = row
        .get::<String, _>("model_ids")
        .split(',')
+       .filter(|id| !id.is_empty())  // Skip empty strings
        .map(|id| Felt::from_str(id).map_err(ParseError::FromStr))
        .collect::<Result<Vec<_>, _>>()?;
🧹 Nitpick comments (1)
crates/torii/grpc/src/server/mod.rs (1)

1109-1119: Ohayo, sensei! Remove commented out code.

If the code is no longer needed due to using top-level entity models, it should be removed rather than left commented out. This improves code maintainability.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 39002ac and e05122f.

📒 Files selected for processing (1)
  • crates/torii/grpc/src/server/mod.rs (10 hunks)
🔇 Additional comments (7)
crates/torii/grpc/src/server/mod.rs (7)

267-276: Ohayo, sensei! Parameter binding order looks good.

The refactoring to use bind_values improves query parameterization. The binding order is now more explicit with the iterative binding approach.


338-347: Ohayo, sensei! Clean implementation of parameter binding.

The implementation properly handles both hashed keys and entity_updated_after values, with correct hex formatting for database compatibility.


350-358: Ohayo, sensei! Consider performance implications when loading models.

Loading all models from the cache without pagination could impact performance if there are many models.


366-376: Ohayo, sensei! Well-structured query building.

The build_sql_query call properly handles optional clauses and maintains clean separation of concerns.


697-697: Ohayo, sensei! Remove debug print statement.

Debug print statements should not be in production code.

-println!("query: {}", query);
+log::debug!("Executing query: {}", query);

1144-1150: Ohayo, sensei! Well-implemented nested composite clause handling.

The recursive approach for nested composite clauses is clean and maintains proper value binding throughout the hierarchy.


Line range hint 1157-1168: Ohayo, sensei! Clean implementation of where clause construction.

The code properly handles logical operators and temporal constraints while maintaining proper parameter binding.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
crates/torii/core/src/sql/mod.rs (1)

769-777: Ohayo sensei! Consider using a combined index if queries filter on both columns together.

Registering separate indices for 'internal_entity_id' and 'internal_event_message_id' is a good practice for performance. However, if there's a frequent query pattern that joins or filters on both columns simultaneously, a composite index can further boost query efficiency.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 124bc8f and d5c4e74.

📒 Files selected for processing (1)
  • crates/torii/core/src/sql/mod.rs (1 hunks)

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
crates/torii/core/src/model.rs (2)

123-126: Ohayo sensei! Consider using a builder pattern for query parameters.

The function now has 8 parameters which could make it harder to maintain and use correctly. Consider using a builder pattern or options struct to improve maintainability.

pub struct QueryOptions<'a> {
    pub schemas: &'a Vec<Ty>,
    pub table_name: &'a str,
    pub model_relation_table: &'a str,
    pub entity_relation_column: &'a str,
    pub where_clause: Option<&'a str>,
    pub having_clause: Option<&'a str>,
    pub order_by: Option<&'a str>,
    pub limit: Option<u32>,
    pub offset: Option<u32>,
}

pub fn build_sql_query(options: QueryOptions) -> Result<(String, String), Error> {
    // ... existing implementation using options
}

177-177: Document model_ids aggregation format and consider database portability.

The GROUP_CONCAT function is SQLite-specific. Consider:

  1. Documenting the expected format of the aggregated model_ids
  2. Adding a comment about database compatibility
+    // Note: Uses SQLite's GROUP_CONCAT function to aggregate model_ids into a comma-separated string
     selections.push(format!("group_concat({model_relation_table}.model_id) as model_ids"));
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d5c4e74 and 3a1136c.

📒 Files selected for processing (1)
  • crates/torii/core/src/model.rs (4 hunks)
🔇 Additional comments (3)
crates/torii/core/src/model.rs (3)

191-194: Consider using LEFT JOIN for model relations.

Using INNER JOIN might exclude entities without model relations.


514-520: Ohayo! Enhance test coverage for new query features.

The test case has been updated with the new parameters, but it doesn't verify the behavior of GROUP BY, HAVING clause, or model_ids aggregation.

Also applies to: 525-525


200-204: Monitor performance of count query with subquery.

The count query with subquery ensures accurate results but might impact performance with large datasets. Consider:

  1. Adding monitoring for query execution time
  2. Evaluating if a simpler count query could be used when HAVING clause is not present

Also applies to: 211-221

✅ Verification successful

Performance concerns for count query are mitigated by existing optimizations

Ohayo sensei! After careful investigation, I can confirm that the count query's performance is well-optimized:

  1. The query benefits from existing indexes on key columns:

    • idx_entity_model_entity_id and idx_entity_model_model_id for entity-model relationships
    • idx_entities_keys for efficient key lookups
    • Table-specific indices are automatically created for internal_entity_id and internal_event_message_id
  2. The query includes pagination support (LIMIT/OFFSET) to handle large datasets efficiently

  3. The implementation is validated by integration tests that exercise real-world scenarios with multiple entities and models

The subquery approach is necessary for maintaining consistency with the main query results, and the performance impact is minimized through proper indexing.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if there are any existing EXPLAIN QUERY PLAN results in the codebase
rg -A 5 "EXPLAIN QUERY PLAN.*COUNT.*FROM.*\(SELECT.*group_concat.*\)"

Length of output: 71


Script:

#!/bin/bash
# Let's check for query performance related tests or benchmarks
rg -i "bench.*count.*query" -A 5

# Look for any performance related comments around count queries
rg -i "(perf|performance).*count.*query" -A 5

# Check if there are any existing query optimizations or performance tuning
rg -i "(optimize|tuning).*count.*query" -A 5

# Look for actual usage of this count query pattern
rg "COUNT\(\*\).*FROM.*\(SELECT.*group_concat.*model_id\)" -A 5

Length of output: 656


Script:

#!/bin/bash
# Let's check the context around this query to understand its usage better
rg -B 10 -A 10 "SELECT COUNT\(\*\) FROM \(SELECT.*group_concat.*model_id\)"

# Check if there are any tests that exercise this query
fd "test.*\.rs" --exec rg -l "count_query"

# Look for any error handling or logging around this query
rg "count_query.*(?:error|fail|log)" -A 5

Length of output: 1703


Script:

#!/bin/bash
# Let's check the test file that uses count_query
cat ./crates/torii/core/src/sql/test.rs

# Look for any existing pagination or limit clauses that might affect performance
rg -B 5 -A 5 "LIMIT|OFFSET" ./crates/torii/core/src/model.rs

# Check if there are any indexes defined for these queries
rg -i "create.*index" -A 3

Length of output: 21492

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
crates/torii/core/src/model.rs (1)

123-126: Ohayo! Consider using a builder pattern or options struct for parameters.

The function has many parameters which could make it harder to maintain and use correctly. Consider introducing a builder pattern or options struct to improve maintainability.

Example implementation:

pub struct SqlQueryOptions<'a> {
    pub schemas: &'a Vec<Ty>,
    pub table_name: &'a str,
    pub model_relation_table: &'a str,
    pub entity_relation_column: &'a str,
    pub where_clause: Option<&'a str>,
    pub having_clause: Option<&'a str>,
    pub order_by: Option<&'a str>,
    pub limit: Option<u32>,
    pub offset: Option<u32>,
}

pub fn build_sql_query(options: SqlQueryOptions) -> Result<(String, String), Error> {
    // ... existing implementation using options
}
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3a1136c and 759c5b8.

📒 Files selected for processing (1)
  • crates/torii/core/src/model.rs (4 hunks)
🔇 Additional comments (3)
crates/torii/core/src/model.rs (3)

191-194: Consider using LEFT JOIN for model relations.

Using JOIN might exclude entities without model relations. Consider using LEFT JOIN to ensure all entities are included in the result set.


514-520: Enhance test coverage for new query features.

The test case has been updated with the new parameters, but it doesn't verify the behavior of GROUP BY, HAVING clause, or model_ids aggregation.

Also applies to: 525-535


200-204: 🛠️ Refactor suggestion

Ohayo sensei! Consider optimizing the GROUP BY usage.

The unconditional GROUP BY might affect performance for simple queries that don't need grouping. Consider making it conditional based on whether aggregation is needed.

- query += &format!(" GROUP BY {table_name}.id");
+ if !selections.iter().any(|s| s.contains("group_concat")) {
+     query += &format!(" GROUP BY {table_name}.id");
+ }

Also, verify that including model_ids in the count subquery doesn't impact performance for large datasets.

Also applies to: 211-212

Copy link

codecov bot commented Dec 21, 2024

Codecov Report

Attention: Patch coverage is 37.86408% with 128 lines in your changes missing coverage. Please review.

Project coverage is 56.28%. Comparing base (c0b8033) to head (feecbdb).
Report is 4 commits behind head on main.

Files with missing lines Patch % Lines
crates/torii/grpc/src/server/mod.rs 27.84% 127 Missing ⚠️
crates/torii/core/src/sql/cache.rs 75.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2817      +/-   ##
==========================================
+ Coverage   55.92%   56.28%   +0.35%     
==========================================
  Files         439      439              
  Lines       56156    56276     +120     
==========================================
+ Hits        31408    31677     +269     
+ Misses      24748    24599     -149     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@glihm glihm merged commit 3c32891 into dojoengine:main Dec 21, 2024
13 of 14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants