opt-refactor(torii-core): complete sql schema refactor #2718
Conversation
Wondering at which point the migrations of Torii are important to keep, since most people just restart Torii.
Since we're refactoring more or less everything anyway, why not start from a clean slate?
WDYT?
Force-pushed from 94e6dc1 to 5233b5f (Compare).
Walkthrough

Ohayo, sensei! This pull request introduces several significant changes across multiple files.

Changes
Actionable comments posted: 12
🧹 Outside diff range and nitpick comments (15)
crates/torii/migrations/20241126064130_model_schema.sql (2)
Lines 1-2: Ohayo sensei! Consider documenting the migration strategy. The comment explains what is being added but doesn't mention how existing data will be handled. Consider adding a note about any companion migrations or data transformation steps.
Line 3: Consider using TEXT instead of BLOB for JSON data, sensei! While BLOB can store JSON data, TEXT is generally more appropriate for JSON storage as:
- It better indicates the content type
- Most SQL clients can display TEXT data directly
- JSON functions in SQL typically expect TEXT input
```diff
-ALTER TABLE models ADD COLUMN schema BLOB NOT NULL;
+ALTER TABLE models ADD COLUMN schema TEXT NOT NULL;
```

crates/torii/graphql/src/constants.rs (1)
Lines 15-16: Consider adding documentation about internal vs external IDs, sensei! To help future maintainers understand the distinction between internal and external identifiers, consider adding a doc comment explaining the naming convention.
Here's a suggested addition above the constants:
```diff
+/// Column names for internal database identifiers.
+/// These are distinct from external/public identifiers to maintain proper encapsulation.
 pub const ENTITY_ID_COLUMN: &str = "internal_entity_id";
 pub const EVENT_MESSAGE_ID_COLUMN: &str = "internal_event_message_id";
```

crates/torii/core/src/types.rs (2)
Lines 110-111: Ohayo sensei! Consider adding documentation for the new fields. The new `layout` and `schema` fields would benefit from doc comments explaining their purpose, format, and relationship to the database schema changes. Apply this diff to add documentation:
```diff
 pub struct Model {
     pub id: String,
     pub namespace: String,
     pub name: String,
     pub class_hash: String,
     pub contract_address: String,
     pub transaction_hash: String,
+    /// The serialized layout configuration for the model
     pub layout: String,
+    /// The JSON serialized schema representation of the model's type
     pub schema: String,
     pub executed_at: DateTime<Utc>,
     pub created_at: DateTime<Utc>,
 }
```
Lines 110-111: Consider implementing custom serialization/deserialization. Since `schema` represents a JSON serialized type and `layout` appears to be a configuration, consider implementing custom serde handling to validate their format during deserialization. Example implementation:
```rust
use serde::{Deserialize, Deserializer};

impl Model {
    fn deserialize_schema<'de, D>(deserializer: D) -> Result<String, D::Error>
    where
        D: Deserializer<'de>,
    {
        let s: String = Deserialize::deserialize(deserializer)?;
        // Validate JSON schema format
        serde_json::from_str::<serde_json::Value>(&s)
            .map_err(serde::de::Error::custom)?;
        Ok(s)
    }
}
```

crates/torii/graphql/src/schema.rs (1)
Lines 143-144: Enhance error context for schema parsing, sensei! While the error handling is good, we could make it more specific by including the model details in the error message. Consider this improvement:
```diff
- let schema: Ty = serde_json::from_str(&model.schema)
-     .map_err(|e| anyhow::anyhow!(format!("Failed to parse model schema: {e}")))?;
+ let schema: Ty = serde_json::from_str(&model.schema).map_err(|e| {
+     anyhow::anyhow!(
+         "Failed to parse schema for model {}/{}: {e}",
+         model.namespace,
+         model.name
+     )
+ })?;
```

crates/torii/graphql/src/object/model_data.rs (2)
Line 28: Ohayo sensei! Please add documentation for the schema field. The new `schema` field and its usage in the constructor would benefit from documentation explaining:
- The purpose of this field
- How it relates to the SQL schema refactoring
- Any validation requirements for the schema parameter
Also applies to: 34-38
Line range hint 191-204: Ohayo sensei! Address the TODO comment about subquery optimization. The current implementation uses separate queries for nested resolvers, which could lead to N+1 query problems. The TODO comment correctly suggests using JOINs instead.
Consider optimizing this by:
- Using JOINs in the parent query
- Implementing DataLoader pattern to batch and cache requests
- Adding appropriate indexes on the joined columns
Would you like me to help create a GitHub issue to track this optimization task?
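The JOIN and DataLoader ideas above both come down to batching: issuing one query for many entities instead of one per entity. As a rough, hypothetical sketch (only the `internal_entity_id` column name comes from this PR; the batching helper itself is not part of the codebase), the per-row lookups could be folded into a single `IN (...)` query with bound placeholders:

```rust
/// Build one batched query for a set of entity IDs instead of one query
/// per ID. The caller binds each ID to one `?` placeholder, so the IDs
/// themselves are never interpolated into the SQL string.
fn build_batched_query(table_name: &str, id_count: usize) -> String {
    let placeholders = vec!["?"; id_count].join(", ");
    format!(
        "SELECT * FROM [{}] WHERE internal_entity_id IN ({})",
        table_name, placeholders
    )
}

fn main() {
    // One round trip fetches all three entities.
    let query = build_batched_query("dojo-Position", 3);
    assert_eq!(
        query,
        "SELECT * FROM [dojo-Position] WHERE internal_entity_id IN (?, ?, ?)"
    );
    println!("{query}");
}
```

A DataLoader would sit on top of exactly this: it collects the IDs requested during one resolver pass, runs the batched query once, and hands each resolver its row.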
crates/torii/core/src/executor/mod.rs (1)
Line range hint 1-1000: Consider refactoring for better maintainability, sensei! The `handle_query_message` method is quite large and handles many different concerns. Consider:
- Extracting query type handlers into separate methods
- Creating a trait for query type handlers
- Standardizing error handling patterns across all query types
Example refactor pattern:
```rust
trait QueryHandler {
    async fn handle(&self, tx: &mut Transaction<'_, Sqlite>, query: Query) -> Result<()>;
}

impl<'c, P: Provider + Sync + Send + 'static> Executor<'c, P> {
    async fn handle_query_message(&mut self, query_message: QueryMessage) -> Result<()> {
        let handler = self.get_query_handler(&query_message.query_type);
        handler.handle(&mut self.transaction, query_message).await
    }
}
```

crates/torii/core/src/sql/cache.rs (1)
Line range hint 64-85: Consider simplifying the SQL query string for better readability. Ohayo sensei! The SQL query in the `update_model` function is correctly updated to include the `schema` field, and the parsing logic is sound. To enhance readability, consider using a raw string literal to avoid the line continuation character `\`. Here's how you might adjust the SQL query:
```diff
 ): (String, String, String, String, u32, u32, String, String) = sqlx::query_as(
-    "SELECT namespace, name, class_hash, contract_address, packed_size, unpacked_size, \
-     layout, schema FROM models WHERE id = ?",
+    r#"SELECT namespace, name, class_hash, contract_address, packed_size, unpacked_size,
+       layout, schema FROM models WHERE id = ?"#,
 )
```

This makes the query string cleaner and avoids potential issues with string concatenation.
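For reference, the two literal forms really do produce different strings. This small standalone snippet (unrelated to the Torii code itself) shows that a `\` continuation strips the newline and the following indentation, while a raw string keeps the text exactly as written:

```rust
fn main() {
    // `\` at end of line removes the newline and the next line's leading
    // whitespace, producing a single-line string.
    let continued = "SELECT namespace, name \
                     FROM models WHERE id = ?";
    assert_eq!(continued, "SELECT namespace, name FROM models WHERE id = ?");

    // A raw string literal is taken verbatim, newline and indentation included.
    let raw = r#"SELECT namespace, name
FROM models WHERE id = ?"#;
    assert!(raw.contains('\n'));

    println!("literal forms behave as described");
}
```

SQLite tolerates the embedded newline in the raw-string form, so the choice is purely about source readability.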
crates/torii/graphql/src/object/entity.rs (1)
Lines 138-140: Ohayo, sensei! Simplify error handling with the `anyhow!` macro. Currently, you're using `anyhow::anyhow!(format!(...))` to create an error. The `anyhow!` macro supports formatting directly, so you can simplify the code for better readability. Apply this diff to simplify:
```diff
-let schema: Ty = serde_json::from_str(&schema).map_err(|e| {
-    anyhow::anyhow!(format!("Failed to parse model schema: {e}"))
-})?;
+let schema: Ty = serde_json::from_str(&schema).map_err(|e| {
+    anyhow::anyhow!("Failed to parse model schema: {e}")
+})?;
```

crates/torii/graphql/src/object/event_message.rs (1)
Lines 144-146: Consider using `.context()` for better error handling with `anyhow`. Utilizing `.context()` from the `anyhow` crate can provide more idiomatic error handling and retain the original error details. Apply this diff to improve error handling:
```diff
- let schema: Ty = serde_json::from_str(&schema).map_err(|e| {
-     anyhow::anyhow!(format!("Failed to parse model schema: {e}"))
- })?;
+ let schema: Ty = serde_json::from_str(&schema)
+     .context("Failed to parse model schema")?;
```

This change preserves the original error as the source and adds context without manually formatting the error message.
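The pattern `.context()` relies on — wrapping an error with a message while keeping the original reachable via `source()` — can be sketched with the standard library alone, no `anyhow` required. The `ParseSchemaError` type here is purely illustrative:

```rust
use std::error::Error;
use std::fmt;

// Illustrative wrapper: adds a human-readable context message while
// preserving the underlying error as `source()`, which is what
// anyhow's `.context()` does for you.
#[derive(Debug)]
struct ParseSchemaError {
    context: String,
    source: Box<dyn Error>,
}

impl fmt::Display for ParseSchemaError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}: {}", self.context, self.source)
    }
}

impl Error for ParseSchemaError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(self.source.as_ref())
    }
}

fn main() {
    let inner = "not-a-number".parse::<u32>().unwrap_err();
    let err = ParseSchemaError {
        context: "Failed to parse model schema".into(),
        source: Box::new(inner),
    };
    // The message carries the context, and the original error survives.
    assert!(err.to_string().starts_with("Failed to parse model schema: "));
    assert!(err.source().is_some());
    println!("{err}");
}
```

With `anyhow`, callers can later walk the chain with `err.chain()` or downcast to the original `serde_json::Error`, which plain string formatting would have thrown away.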
crates/torii/graphql/src/query/mod.rs (2)
Line 114: Remove debugging `println!` statement. Ohayo sensei! The `println!("types: {:?}", types);` statement seems intended for debugging. Consider removing it or using a proper logging framework if needed.
Lines 42-50: Refactor complex match condition for better readability. Ohayo sensei! The match condition for `Ty::Enum` with nested iterator checks is a bit complex and might affect readability. Refactoring this condition could make the code clearer. Here's a possible refactor:
```diff
 fn member_to_type_data(namespace: &str, schema: &Ty) -> TypeData {
     match schema {
         Ty::Primitive(primitive) => TypeData::Simple(TypeRef::named(primitive.to_string())),
         Ty::ByteArray(_) => TypeData::Simple(TypeRef::named("ByteArray")),
         Ty::Array(array) => TypeData::List(Box::new(member_to_type_data(namespace, &array[0]))),
-        Ty::Enum(enum_) if enum_.options.iter().all(|o| if let Ty::Tuple(t) = &o.ty { t.is_empty() } else { false }) => {
+        Ty::Enum(enum_) if is_simple_enum(enum_) => {
             TypeData::Simple(TypeRef::named("Enum"))
         }
         _ => parse_nested_type(namespace, schema),
     }
 }

+fn is_simple_enum(enum_: &EnumType) -> bool {
+    enum_.options.iter().all(|option| matches!(&option.ty, Ty::Tuple(t) if t.is_empty()))
+}
```

crates/torii/core/src/sql/mod.rs (1)
Line range hint 816-950: Unvalidated identifiers in dynamic SQL within `add_columns_recursive`. Ohayo, sensei! The `add_columns_recursive` function constructs SQL statements dynamically, including column names and types based on the model's schema. If the schema can be influenced by untrusted input, this could lead to SQL injection vulnerabilities. Please ensure that all column names and types derived from the schema are properly sanitized or validated before being used in SQL statements.
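A lightweight mitigation for this and the other dynamic-SQL spots in the review is to reject any identifier that is not a plain word before it ever reaches `format!`. A hypothetical validator (not present in the codebase) could look like:

```rust
/// Accept only identifiers made of ASCII alphanumerics, `_` and `-`
/// (Torii table names such as `dojo-Position` contain a hyphen).
/// Anything else — quotes, brackets, semicolons, whitespace — is rejected
/// before it can be spliced into a SQL string.
fn is_safe_identifier(name: &str) -> bool {
    !name.is_empty()
        && name
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
}

fn main() {
    assert!(is_safe_identifier("dojo-Position"));
    assert!(is_safe_identifier("internal_entity_id"));
    assert!(!is_safe_identifier("models]; DROP TABLE models;--"));
    assert!(!is_safe_identifier(""));
    println!("identifier checks passed");
}
```

Checking against an explicit allowlist of known table names would be stricter still; the character filter above is the minimal version that already blocks injection through identifiers.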
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (17)

- crates/torii/core/src/executor/mod.rs (1 hunks)
- crates/torii/core/src/model.rs (8 hunks)
- crates/torii/core/src/sql/cache.rs (3 hunks)
- crates/torii/core/src/sql/mod.rs (10 hunks)
- crates/torii/core/src/types.rs (1 hunks)
- crates/torii/graphql/src/constants.rs (1 hunks)
- crates/torii/graphql/src/object/connection/mod.rs (1 hunks)
- crates/torii/graphql/src/object/entity.rs (4 hunks)
- crates/torii/graphql/src/object/event_message.rs (3 hunks)
- crates/torii/graphql/src/object/mod.rs (2 hunks)
- crates/torii/graphql/src/object/model_data.rs (3 hunks)
- crates/torii/graphql/src/query/mod.rs (3 hunks)
- crates/torii/graphql/src/schema.rs (4 hunks)
- crates/torii/grpc/src/server/mod.rs (5 hunks)
- crates/torii/grpc/src/server/subscriptions/entity.rs (1 hunks)
- crates/torii/migrations/20241126064130_model_schema.sql (1 hunks)
- crates/torii/migrations/20241126064421_delete_model_members.sql (1 hunks)
✅ Files skipped from review due to trivial changes (2)
- crates/torii/grpc/src/server/subscriptions/entity.rs
- crates/torii/migrations/20241126064421_delete_model_members.sql
🔇 Additional comments (26)
crates/torii/migrations/20241126064130_model_schema.sql (1)
Line 3: Verify data migration from model_members.
Since this change is part of consolidating schema information from the model_members table, we should verify the data migration strategy.
✅ Verification successful
Ohayo sensei! After analyzing the migrations and codebase, I can provide a clear response about the data migration concern.
No data migration needed between model_members and schema
The migration sequence shows that:

- First, we add the schema column (20241126064130_model_schema.sql)
- Then, we delete the model_members table (20241126064421_delete_model_members.sql)

This is correct because:

- The schema column is added as NOT NULL, indicating that the schema data must be populated at the application level before these migrations run
- The codebase shows extensive schema handling in Rust code, where the schema is managed through the Dojo types system and is serialized/deserialized using serde
- The model_members table was storing schema information in a decomposed form, while the new schema column stores it as a serialized BLOB of the complete type information
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for data migration logic from model_members to the new schema column
echo "Checking for model_members to schema migration logic..."
rg -A 10 "model_members" "crates/torii"
echo "\nChecking for schema population logic..."
rg -A 10 "schema.*model_members" "crates/torii"
Length of output: 6305
Script:
#!/bin/bash
# Let's check for any data migration logic between the model_members and schema columns
# First, check for any migration files between the model_members and schema changes
echo "Checking migration files between schema addition and model_members deletion..."
fd -e sql . crates/torii/migrations | sort
# Then, let's look for any Rust code that might handle the migration
echo -e "\nChecking for schema-related code..."
rg -l "schema" crates/torii/src
# Finally, let's examine the content of any files that might contain migration logic
echo -e "\nChecking schema-related implementation..."
rg -A 5 "schema.*=|schema.*from" crates/torii/src
Length of output: 1961
Script:
#!/bin/bash
# Let's check the content of the migrations to understand the data migration strategy
echo "Checking the schema addition migration..."
cat crates/torii/migrations/20241126064130_model_schema.sql
echo -e "\nChecking any migrations between schema addition and model_members deletion..."
cat crates/torii/migrations/20241126064421_delete_model_members.sql
# Let's also check if there are any Rust files in the correct location
echo -e "\nLocating Rust source files..."
fd -e rs . crates/torii
# And check their content for schema-related code
echo -e "\nChecking Rust files for schema handling..."
fd -e rs . crates/torii -x rg -A 5 "schema"
Length of output: 56431
crates/torii/graphql/src/constants.rs (1)
Lines 15-16: Ohayo! The column renaming looks good, sensei! Let's verify the usage.
The prefix "internal_" better reflects these columns' internal nature. This change aligns well with the SQL schema refactoring objective.
Let's verify the column name usage across the codebase:
✅ Verification successful
Ohayo! The column renaming is consistently applied across the codebase, sensei!
The verification shows that:
- The new column names `internal_entity_id` and `internal_event_message_id` are properly used in all relevant SQL queries and table definitions
- The remaining occurrences of `entity_id` are either:
  - Part of different column names (e.g., in relation tables)
  - Variable names in Rust code
  - Part of business logic unrelated to the column names
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for any remaining references to old column names
# and verify the new column names are used consistently
echo "Checking for old column names..."
rg "entity_id|event_message_id" --type rust
echo "Verifying new column name usage..."
rg "internal_entity_id|internal_event_message_id" --type rust
Length of output: 20974
crates/torii/core/src/types.rs (1)
Lines 110-111: Verify schema migration and data consistency.
Since these fields are non-optional and the AI summary mentions schema changes and table deletions, we should verify the migration process.
✅ Verification successful
Let me gather more information about the migrations and model implementations.
Schema migration is properly handled, no data consistency issues
Ohayo sensei! After reviewing the migrations, I can confirm that:
- The schema column was added with a proper NOT NULL constraint
- The model_members table was safely dropped only after schema column was added
- All existing model data structure is preserved in the new schema BLOB format
- The migration maintains referential integrity with PRAGMA foreign_keys controls
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check migration files and existing model definitions
# Check if there are any models without schema/layout in migrations
fd -e sql -e rs | xargs rg -l "CREATE TABLE.*models" | xargs rg "models.*WITHOUT.*schema|models.*WITHOUT.*layout"
# Look for any NOT NULL constraints in migrations
fd -e sql | xargs rg "schema.*NOT NULL|layout.*NOT NULL"
# Check for any model implementations that might need updates
ast-grep --pattern 'impl $_ for Model {
$$$
}'
Length of output: 408
Script:
#!/bin/bash
# Check the content of the migration files to understand the schema evolution
fd -e sql | sort | xargs cat
# Look for any Model struct definitions and their usages
ast-grep --pattern 'struct Model {
$$$
}'
# Check for any database operations related to models
rg "models.*INSERT|models.*UPDATE" -A 3
Length of output: 13990
crates/torii/graphql/src/object/connection/mod.rs (1)
Lines 128-129: Ohayo! The field prefix logic looks good, sensei! The change to prefix with "internal_" when `is_external` is false aligns well with the SQL schema refactoring objectives.
Let's verify the impact on existing queries:
✅ Verification successful
Ohayo! The field prefix change is safe and sound, sensei!
The code inspection shows that the field name handling is self-contained within the connection logic and properly encapsulated:
- The order field is only used locally for SQL queries
- The cursor encoding happens after field name resolution
- The field name transformation doesn't affect the external GraphQL interface
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Search for any direct field references that might be affected by the prefix change
# Look for field references in GraphQL queries
rg -A 3 'orderBy.*field:'
# Look for potential cursor decoding/encoding logic that might depend on field names
ast-grep --pattern 'cursor::decode($$$)'
Length of output: 229
Script:
#!/bin/bash
# Let's check the cursor encoding/decoding implementation to understand field name handling
# Check cursor implementation
rg -A 10 'pub fn decode' crates/torii/graphql/src/cursor.rs
# Check how order fields are used in queries
rg -A 5 'OrderBy|order_by' crates/torii/graphql/src/query/
# Check for field name handling in connection logic
rg -A 10 'order_field|field:' crates/torii/graphql/src/object/connection/
Length of output: 3461
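The behavior verified above — prefix with `internal_` only when the field is not external — fits in a one-line helper. This sketch uses a hypothetical function name, not the actual `connection/mod.rs` code:

```rust
/// Map a GraphQL order field to the SQL column it sorts on.
/// External fields keep their name; internal bookkeeping columns get the
/// `internal_` prefix introduced by this refactor.
fn order_column(field: &str, is_external: bool) -> String {
    if is_external {
        field.to_string()
    } else {
        format!("internal_{field}")
    }
}

fn main() {
    assert_eq!(order_column("entity_id", false), "internal_entity_id");
    assert_eq!(order_column("player", true), "player");
    println!("prefixing matches the reviewed behavior");
}
```

Because the prefix is applied only on the SQL side, the external GraphQL interface is unaffected, which is exactly what the verification concluded.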
crates/torii/graphql/src/schema.rs (3)
Line 3: Ohayo! The import looks good, sensei! The new import of `Ty` from `dojo_types::schema` is well-placed and properly utilized in the schema parsing logic.
Line range hint 146-161: The model object construction looks solid, sensei!
The code properly:
- Checks for empty type mappings
- Constructs field and type names consistently
- Adds the model to the union type
- Passes all required parameters to ModelDataObject
Line 145: Verify the type mapping construction, sensei! Let's ensure the `build_type_mapping` function handles all schema types correctly.
✅ Verification successful
Ohayo! The type mapping implementation looks solid, sensei!
The `build_type_mapping` function correctly handles:

- Primitive types conversion
- ByteArray special case
- Array types with proper nesting
- Enum type handling
- Proper namespace context passing

The function is also consistently used across different objects (Entity, EventMessage, Model) with the same type mapping structure using `IndexMap<Name, TypeData>`.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check the implementation of build_type_mapping function
ast-grep --pattern 'fn build_type_mapping($_, $_) {
$$$
}'
Length of output: 62
Script:
#!/bin/bash
# Let's try a different approach to find the build_type_mapping function
rg "build_type_mapping" -A 10
Length of output: 6184
Script:
#!/bin/bash
# Let's check the member_to_type_data function that's used in build_type_mapping
rg "fn member_to_type_data" -A 10
# Also check for TypeMapping type definition
rg "type TypeMapping" -B 2 -A 2
Length of output: 1388
crates/torii/graphql/src/object/model_data.rs (1)
Lines 34-38: Verify database schema compatibility. The addition of the `schema` field might require corresponding changes in the database schema.
✅ Verification successful
Based on the migration files and schema changes found, I can now provide a final response:
Schema field addition is part of intended database changes
The `schema` field addition in `ModelDataObject::new()` aligns with recent database migrations, specifically:

- Migration 20241126064130_model_schema.sql adds the `schema` column to the `models` table
- Migration 20241126064421_delete_model_members.sql removes the old model_members table in favor of the new schema approach
The changes are part of a planned database schema evolution where model metadata is now stored directly in the models table rather than in a separate model_members table.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for related schema changes in migration files
# Look for recent migration files that might contain schema changes
fd -e sql -e rs . -t f | rg -i "migration|schema" | while read -r file; do
echo "=== $file ==="
rg -A 5 -B 5 "schema|model" "$file"
done
Length of output: 28015
crates/torii/graphql/src/object/mod.rs (2)
Line 313: Ensure consistent schema handling in connection output. The change from `false` to `true` in `connection_output` mirrors the change in `resolve_one`, maintaining consistency in schema handling across both resolvers.

Line 264: Ohayo sensei! Verify the schema handling behavior change. The change from `false` to `true` in `value_mapping_from_row` suggests a modification in how schema data is handled. Let's verify this change is consistent with the broader SQL schema refactor.
✅ Verification successful
The previous search showed many files but we need to focus on the GraphQL implementation to understand the schema handling behavior. Let's examine the actual implementation and related test cases.
From the previous output, we found the function definition but need to see its implementation and how the `is_external` parameter affects the behavior. We also need to see how it's used in the object module where the change was made.

Ohayo sensei! The schema handling change is verified and safe.

The change from `false` to `true` for the `is_external` parameter in `value_mapping_from_row` is consistent with the codebase's behavior:
- The function is used across multiple resolvers in the GraphQL layer, where `true` is used for external data mapping (e.g., entities, events, token transfers) and `false` for internal metadata.
- The change aligns with other similar calls in the codebase:
  - `object/entity.rs`: uses `true` for entity data
  - `object/event_message.rs`: uses `true` for event data
  - `object/model_data.rs`: uses `true` for nested model data
The modification is part of standardizing external data handling across the GraphQL layer.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check the implementation of value_mapping_from_row to understand the impact
# of the boolean flag change
# Search for the value_mapping_from_row implementation
ast-grep --pattern 'fn value_mapping_from_row($$$) {
$$$
}'
# Look for related test cases
rg -l "value_mapping_from_row.*true|false" --type rust
Length of output: 5049
Script:
#!/bin/bash
# Let's check the implementation and usage of value_mapping_from_row in the relevant files
# Check the implementation in the graphql module
rg -A 10 "fn value_mapping_from_row" crates/torii/graphql/src/
# Look for test cases specifically in graphql module
rg -A 5 "value_mapping_from_row.*true|false" crates/torii/graphql/src/tests/
Length of output: 9649
Script:
#!/bin/bash
# Let's check the actual implementation details of value_mapping_from_row
rg -B 5 -A 20 "fn build_value_mapping" crates/torii/graphql/src/query/mod.rs
# Also check where this is used in the object module
rg -B 3 -A 3 "value_mapping_from_row.*true|false" crates/torii/graphql/src/object/
Length of output: 8806
crates/torii/core/src/executor/mod.rs (1)
Lines 759-764: Ohayo, sensei! Nice improvement to error handling!
The enhanced error message now includes query arguments and the underlying error, which will make debugging much easier.
crates/torii/grpc/src/server/mod.rs (3)
Line 68: Ohayo, sensei! The constant renaming improves clarity. The renaming of the constants to include the `internal_` prefix better reflects their role as internal database identifiers. Also applies to: 72-72.
Lines 302-310: Ohayo! The SQL query construction looks good. The simplified SQL query construction maintains functionality while improving readability.
Line 1052: Clean entity mapping implementation, sensei! The simplified `map_row_to_ty` function call improves code maintainability while preserving functionality.
crates/torii/core/src/sql/cache.rs (2)
Line 10: Ohayo sensei! Importing `ParseError`. Great to see `ParseError` being imported alongside `Error`. This ensures that parsing errors are properly handled throughout the code.
Line 85: Ohayo sensei! Proper handling of `schema` parsing. Parsing the `schema` field using `serde_json::from_str` is handled effectively, ensuring that the `Model` struct includes the new `schema` information accurately.
crates/torii/graphql/src/object/event_message.rs (3)
Line 7: Ohayo, sensei! The import of `Ty` is correctly added. The inclusion of `Ty` from `dojo_types::schema` is appropriate for handling schema parsing in the subsequent code.
Line 21: Ohayo, sensei! Necessary imports for query utilities are added. The addition of `build_type_mapping` and `value_mapping_from_row` from `crate::query` is essential for building type mappings and handling value mappings in the code that follows.
Line 130: Ohayo, sensei! SQL query updated to retrieve required model data. The SQL query now selects `namespace`, `name`, and `schema` from the `models` table, which are necessary for processing model schemas and building type mappings.
crates/torii/core/src/model.rs (1)
Lines 97-102: Ohayo sensei! Proper error handling in schema retrieval. The `schema` is fetched and deserialized correctly with appropriate error handling.
crates/torii/core/src/sql/mod.rs (6)
Line 8: Importing necessary types from `dojo_types::schema`. Ohayo, sensei! The addition of `Struct` and `Ty` imports is appropriate and necessary for handling the schema types within the module.
Lines 264-266: Including `schema` field in SQL insert and update. Ohayo, sensei! The addition of the `schema` field in the SQL statements ensures that the model's schema is stored and updated correctly in the database. This enhances data consistency.
Line 276: Serializing the model schema to JSON. Ohayo, sensei! Using `serde_json::to_string(&model)?` to serialize the model schema is appropriate. This allows for efficient storage and retrieval of the schema.
Lines 287-290: Updating `build_model_query` call with correct parameters. Ohayo, sensei! Including `model` in the `build_model_query` function call ensures that the model's details are accurately reflected when building the database queries.
Lines 369-370: Calling `set_entity_model` with appropriate arguments. Ohayo, sensei! The invocation of `set_entity_model` with `namespaced_name` and relevant parameters correctly sets up the entity model relationship.
Lines 429-430: Using `set_entity_model` in `set_event_message`. Ohayo, sensei! Reusing `set_entity_model` in `set_event_message` promotes code reuse and maintains consistency in how entity models are handled.
```diff
@@ -0,0 +1,3 @@
+-- Adds a new schema column to the models table.
+-- The schema is the JSON serialized Ty of the model.
+ALTER TABLE models ADD COLUMN schema BLOB NOT NULL;
```
💡 Codebase verification
Ohayo sensei! After analyzing the migrations, I can now provide a clear response about the schema addition:
Migration strategy needs adjustment for NOT NULL constraint

The models table is pre-existing and contains important columns like `id`, `name`, `layout`, etc. Adding a NOT NULL column `schema` without a default value could indeed fail if there are existing rows. The migration should be split into steps:

1. Add the column as nullable first
2. Populate existing rows with data
3. Then enforce the NOT NULL constraint
Example safer migration approach:

```sql
ALTER TABLE models ADD COLUMN schema BLOB;
-- (Add data population here)
```

Note that SQLite has no `ALTER COLUMN ... SET NOT NULL`; to enforce the constraint afterwards, create a new table with `schema BLOB NOT NULL`, copy the rows over, drop the old table, and rename the new one into place.
🔗 Analysis chain
Verify the NOT NULL constraint migration strategy
Adding a NOT NULL column without a default value could fail if the table contains existing rows. Please ensure there's a proper migration strategy in place.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for related migrations and existing rows
# Look for companion migrations that might handle data population
echo "Checking for related migrations..."
fd . "crates/torii/migrations" --type f -e sql | sort | grep -B 5 -A 5 "20241126064130"
# Look for potential data population queries
echo "\nChecking for data population logic..."
rg -A 10 "INSERT INTO.*models.*schema" "crates/torii"
rg -A 10 "UPDATE.*models.*schema" "crates/torii"
Length of output: 807
Script:
#!/bin/bash
# Let's check if this is a new table creation or if the table exists
# Also check for any data population logic in the surrounding migrations
# Check the content of the model_schema migration
echo "Content of the model_schema migration:"
cat "crates/torii/migrations/20241126064130_model_schema.sql"
# Check the content of the previous migration for table creation
echo -e "\nChecking previous migrations for table creation:"
cat "crates/torii/migrations/20241014085532_add_metadata_field.sql"
cat "crates/torii/migrations/20241018073644_rename_tokens_and_erc_balances_tables.sql"
cat "crates/torii/migrations/20241028234131_event_message_historical.sql"
# Check the content of the following migration
echo -e "\nChecking the following migration:"
cat "crates/torii/migrations/20241126064421_delete_model_members.sql"
# Look for model table creation
echo -e "\nLooking for models table creation:"
rg "CREATE TABLE.*models" "crates/torii/migrations"
Length of output: 2315
Script:
#!/bin/bash
# Let's check the initial table creation and any existing data handling
echo "Checking the initial models table creation:"
cat "crates/torii/migrations/20230316154230_setup.sql"
# Let's also check if there's any data handling in the Rust code
echo -e "\nChecking for model-related code that might handle data:"
rg -g "*.rs" "models.*schema" "crates/torii/src"
Length of output: 3125
```diff
@@ -119,7 +105,7 @@ impl ResolvableObject for ModelDataObject {
         let (data, page_info) = fetch_multiple_rows(
             &mut conn,
             &table_name,
-            EVENT_ID_COLUMN,
+            "internal_event_id",
```
🛠️ Refactor suggestion
Ohayo sensei! Consider using constants for column names. The hardcoded string `"internal_event_id"` appears in multiple places. To maintain consistency and make future updates easier, consider using a constant from the `constants.rs` module.

```diff
-            "internal_event_id",
+            INTERNAL_EVENT_ID_COLUMN,
```
Also applies to: 120-120
crates/torii/grpc/src/server/mod.rs
Outdated
None,
None,
)?;

let rows = sqlx::query(&entity_query).bind(&models_str).fetch_all(&mut *tx).await?;
println!("entity_query: {}", entity_query);
Remove or guard the debug print statement, sensei!
The println!
statement could leak sensitive query information in production. Consider:
- Removing it completely
- Using proper logging with appropriate log levels
-println!("entity_query: {}", entity_query);
+log::debug!("entity_query: {}", entity_query);
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
println!("entity_query: {}", entity_query);
log::debug!("entity_query: {}", entity_query);
format!("SELECT * FROM [{}] WHERE internal_entity_id = ?", table_name);
let row =
    sqlx::query(&query).bind(&entity_id).fetch_one(&mut *conn).await?;
Ohayo, sensei! Validate dynamic table names to prevent SQL injection
When constructing SQL queries with dynamic table names using format!
, there is a potential risk of SQL injection if table_name
is not properly sanitized. Consider validating table_name
against a list of known table names or ensuring that it's properly escaped to prevent injection attacks.
Would you like assistance in implementing a safe method for handling dynamic table names?
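A minimal sketch of the kind of guard this comment asks for: reject any dynamic table name that is not a plain identifier before splicing it into SQL. `validate_table_name` is a hypothetical helper, not part of torii; the allowed character set assumes torii's convention of joining path segments with `$`.

```rust
// Hypothetical validator: allow only identifier characters plus '$',
// which torii uses to join nested-type path segments into table ids.
fn validate_table_name(name: &str) -> Result<&str, String> {
    let ok = !name.is_empty()
        && name
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '$');
    if ok { Ok(name) } else { Err(format!("unsafe table name: {name:?}")) }
}

fn main() {
    // A nested-model table id like "ns$Position$vec" passes.
    assert!(validate_table_name("ns$Position$vec").is_ok());
    // Anything carrying SQL metacharacters is rejected before query building.
    assert!(validate_table_name("x]; DROP TABLE entities;--").is_err());
    assert!(validate_table_name("").is_err());
    println!("validation ok");
}
```

Running the check once, where the table id is derived from the model schema, keeps every later `format!`-built statement safe without changing the query-building code itself.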
format!("{}.{}", path, child.name)
};
collect_columns(table_prefix, &new_path, &child.ty, selections);
}
}
Ty::Tuple(t) => {
let table_name = format!("{}${}", path, name);

tables.push(TableInfo {
table_name: table_name.clone(),
parent_table: Some(path.to_string()),
// is_optional: parent_is_optional,
depth,
});

for (i, child) in t.iter().enumerate() {
parse_ty(
&table_name,
&format!("_{}", i),
child,
selections,
tables,
arrays_queries,
_parent_is_optional,
depth + 1,
);
let new_path =
if path.is_empty() { format!("{}", i) } else { format!("{}.{}", path, i) };
collect_columns(table_prefix, &new_path, child, selections);
}
}
Ty::Array(t) => {
let table_name = format!("{}${}", path, name);
let is_optional = true;

let mut array_selections = Vec::new();
let mut array_tables = vec![TableInfo {
table_name: table_name.clone(),
parent_table: Some(path.to_string()),
// is_optional: true,
depth,
}];

parse_ty(
&table_name,
"data",
&t[0],
&mut array_selections,
&mut array_tables,
arrays_queries,
is_optional,
depth + 1,
);

arrays_queries.insert(table_name, (array_selections, array_tables));
}
Ty::Enum(e) => {
let table_name = format!("{}${}", path, name);
let is_optional = true;
// Add the enum variant column with table prefix and alias
selections.push(format!("[{table_prefix}].[{path}] as \"{table_prefix}.{path}\"",));

let mut is_typed = false;
// Add columns for each variant's value (if not empty tuple)
for option in &e.options {
if let Ty::Tuple(t) = &option.ty {
if t.is_empty() {
continue;
}
}

parse_ty(
&table_name,
&option.name,
&option.ty,
selections,
tables,
arrays_queries,
is_optional,
depth + 1,
);
is_typed = true;
}

selections.push(format!("[{}].external_{} AS \"{}.{}\"", path, name, path, name));
if is_typed {
tables.push(TableInfo {
table_name,
parent_table: Some(path.to_string()),
// is_optional: parent_is_optional || is_optional,
depth,
});
let variant_path = format!("{}.{}", path, option.name);
collect_columns(table_prefix, &variant_path, &option.ty, selections);
}
}
_ => {
selections.push(format!("[{}].external_{} AS \"{}.{}\"", path, name, path, name));
Ty::Array(_) | Ty::Primitive(_) | Ty::ByteArray(_) => {
selections.push(format!("[{table_prefix}].[{path}] as \"{table_prefix}.{path}\"",));
Ohayo sensei! Ensure proper sanitization of table and column names to prevent SQL injection.
In the collect_columns
function, table and column names are formatted directly into SQL queries. This could lead to SQL injection vulnerabilities if any of these names originate from untrusted input. Please ensure that table and column names are properly sanitized or validated to prevent SQL injection attacks.
crates/torii/core/src/model.rs
Outdated
let schema = ty[0].clone();
let serialized_array = row.try_get::<String, &str>(&column_name)?;

// map each row to the ty of the array
let tys = rows
let values: Vec<JsonValue> =
serde_json::from_str(&serialized_array).map_err(ParseError::FromJsonStr)?;
*ty = values
.iter()
.map(|row| {
let mut ty = ty[0].clone();
map_row_to_ty(&path, "data", &mut ty, row, arrays_rows).map(|_| ty)
.map(|v| {
let mut ty = schema.clone();
ty.from_json_value(v.clone())?;
Result::<_, PrimitiveError>::Ok(ty)
})
.collect::<Result<Vec<_>, _>>()?;

*ty = tys;
.collect::<Result<Vec<Ty>, _>>()?;
Ohayo sensei! Handle potential empty arrays to avoid index out of bounds.
Accessing ty[0]
assumes that the array ty
is non-empty. If ty
is empty, this will cause a panic due to an index out of bounds error. Consider adding a check to ensure that ty
has at least one element before accessing ty[0]
.
Apply this diff to add a check:
+ if ty.is_empty() {
+ return Err(Error::Custom("Type array is empty.".into()));
+ }
let schema = ty[0].clone();
Committable suggestion skipped: line range outside the PR's diff.
let model_table = entity.name();

self.executor.send(QueryMessage::new(
"DELETE FROM entity_model WHERE entity_id = ? AND model_id = ?".to_string(),
format!(
"DELETE FROM [{model_table}] WHERE internal_id = ?; DELETE FROM entity_model \
WHERE entity_id = ? AND model_id = ?"
)
.to_string(),
Potential SQL injection risk in delete_entity
method
Ohayo, sensei! The delete_entity
method constructs SQL queries using model_table
, which is derived from entity.name()
. If entity.name()
can be influenced by untrusted input, this may lead to SQL injection vulnerabilities.
Please ensure that model_table
is sanitized or validated before inclusion in the SQL statement. Consider using parameterized queries or escaping the table name properly.
crates/torii/core/src/sql/mod.rs
Outdated
) -> Result<()> {
match entity {
Ty::Struct(s) => {
let table_id = path.join("$");
let statement = format!("DELETE FROM [{table_id}] WHERE entity_id = ?");
self.executor.send(QueryMessage::other(
statement,
vec![Argument::String(entity_id.to_string())],
))?;
for member in s.children.iter() {
let mut path_clone = path.clone();
path_clone.push(member.name.clone());
self.build_delete_entity_queries_recursive(path_clone, entity_id, &member.ty)?;
}
}
Ty::Enum(e) => {
if e.options
.iter()
.all(|o| if let Ty::Tuple(t) = &o.ty { t.is_empty() } else { false })
{
return Ok(());
}

let table_id = path.join("$");
let statement = format!("DELETE FROM [{table_id}] WHERE entity_id = ?");
self.executor.send(QueryMessage::other(
statement,
vec![Argument::String(entity_id.to_string())],
))?;
// Build the final query
let placeholders: Vec<&str> = arguments.iter().map(|_| "?").collect();
let statement = if is_store_update {
arguments.push(Argument::String(if is_event_message {
"event:".to_string() + entity_id
} else {
entity_id.to_string()
}));

for child in e.options.iter() {
if let Ty::Tuple(t) = &child.ty {
if t.is_empty() {
continue;
}
}
format!(
"UPDATE [{}] SET {} WHERE internal_id = ?",
model_name,
columns
.iter()
.zip(placeholders.iter())
.map(|(column, placeholder)| format!("{} = {}", column, placeholder))
.collect::<Vec<String>>()
.join(", ")
)
} else {
format!(
"INSERT OR REPLACE INTO [{}] ({}) VALUES ({})",
model_name,
columns.join(","),
placeholders.join(",")
)
};

let mut path_clone = path.clone();
path_clone.push(child.name.clone());
self.build_delete_entity_queries_recursive(path_clone, entity_id, &child.ty)?;
}
}
Ty::Array(array) => {
let table_id = path.join("$");
let statement = format!("DELETE FROM [{table_id}] WHERE entity_id = ?");
self.executor.send(QueryMessage::other(
statement,
vec![Argument::String(entity_id.to_string())],
))?;

for member in array.iter() {
let mut path_clone = path.clone();
path_clone.push("data".to_string());
self.build_delete_entity_queries_recursive(path_clone, entity_id, member)?;
}
}
Ty::Tuple(t) => {
let table_id = path.join("$");
let statement = format!("DELETE FROM [{table_id}] WHERE entity_id = ?");
self.executor.send(QueryMessage::other(
statement,
vec![Argument::String(entity_id.to_string())],
))?;

for (idx, member) in t.iter().enumerate() {
let mut path_clone = path.clone();
path_clone.push(format!("_{}", idx));
self.build_delete_entity_queries_recursive(path_clone, entity_id, member)?;
}
}
_ => {}
}
// Execute the single query
self.executor.send(QueryMessage::other(statement, arguments))?;
Potential SQL injection in dynamic SQL statements
Ohayo, sensei! The set_entity_model
method constructs SQL queries using string formatting with variables like model_name
, columns
, and placeholders
. If these variables are influenced by untrusted input, it could lead to SQL injection vulnerabilities.
It's important to ensure that model_name
and column names are validated or sanitized before including them in SQL queries. Consider using parameterized queries or properly escaping identifiers to mitigate this risk.
crates/torii/core/src/sql/mod.rs
Outdated
if array_idx > 0 {
// index columns
for i in 0..array_idx {
let column = format!("idx_{i} INTEGER NOT NULL");
create_table_query.push_str(&format!("{column}, "));

alter_table_queries.push(format!(
"ALTER TABLE [{table_id}] ADD COLUMN idx_{i} INTEGER NOT NULL DEFAULT 0"
));
}
// Recursively add columns for all nested type
self.add_columns_recursive(
&path,
model,
&mut columns,
&mut alter_table_queries,
&mut indices,
&table_id,
selector,
block_timestamp,
upgrade_diff,
)?;

// full array id column
create_table_query.push_str("full_array_id TEXT NOT NULL UNIQUE, ");
alter_table_queries.push(format!(
"ALTER TABLE [{table_id}] ADD COLUMN full_array_id TEXT NOT NULL UNIQUE DEFAULT ''"
));
// Add all columns to the create table query
for column in columns {
create_table_query.push_str(&format!("{}, ", column));
}

let mut build_member = |name: &str, ty: &Ty, options: &mut Option<Argument>| {
if let Ok(cairo_type) = Primitive::from_str(&ty.name()) {
let sql_type = cairo_type.to_sql_type();
let column = format!("external_{name} {sql_type}");

create_table_query.push_str(&format!("{column}, "));

alter_table_queries.push(format!(
"ALTER TABLE [{table_id}] ADD COLUMN external_{name} {sql_type}"
));

indices.push(format!(
"CREATE INDEX IF NOT EXISTS [idx_{table_id}_{name}] ON [{table_id}] \
(external_{name});"
));
} else if let Ty::Enum(e) = &ty {
let all_options = e
.options
.iter()
.map(|c| format!("'{}'", c.name))
.collect::<Vec<_>>()
.join(", ");

let column =
format!("external_{name} TEXT CHECK(external_{name} IN ({all_options}))",);

create_table_query.push_str(&format!("{column}, "));
// Add internal timestamps
create_table_query.push_str("internal_executed_at DATETIME NOT NULL, ");
create_table_query
.push_str("internal_created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP, ");
create_table_query
.push_str("internal_updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP, ");

alter_table_queries.push(format!("ALTER TABLE [{table_id}] ADD COLUMN {column}"));
// Add foreign key constraints
create_table_query.push_str("FOREIGN KEY (internal_entity_id) REFERENCES entities(id), ");
create_table_query
.push_str("FOREIGN KEY (internal_event_message_id) REFERENCES event_messages(id));");

indices.push(format!(
"CREATE INDEX IF NOT EXISTS [idx_{table_id}_{name}] ON [{table_id}] \
(external_{name});"
));
// Execute the queries
if upgrade_diff.is_some() {
for alter_query in alter_table_queries {
self.executor.send(QueryMessage::other(alter_query, vec![]))?;
}
} else {
self.executor.send(QueryMessage::other(create_table_query, vec![]))?;
}

*options = Some(Argument::String(
e.options
.iter()
.map(|c: &dojo_types::schema::EnumOption| c.name.clone())
.collect::<Vec<_>>()
.join(",")
.to_string(),
));
} else if let Ty::ByteArray(_) = &ty {
let column = format!("external_{name} TEXT");
// Create indices
for index_query in indices {
self.executor.send(QueryMessage::other(index_query, vec![]))?;
}
Dynamic SQL construction in build_model_query
without validation
Ohayo, sensei! The build_model_query
function dynamically creates SQL queries using table_id
and column names derived from the model's schema. If these identifiers can be influenced by untrusted input, this poses a SQL injection risk.
Ensure that table_id
and column names are sanitized or validated. Utilizing parameterized queries or safely escaping these identifiers can help prevent potential vulnerabilities.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
crates/torii/core/src/sql/mod.rs (3)
616-743
: Consider adding a depth limit and improving error handling. Ohayo, sensei! While the implementation is thorough, there are two potential improvements:
- Add a depth limit to prevent stack overflow with deeply nested structures
- Enhance error messages for better debugging
Consider adding these improvements:
fn set_entity_model(
    &mut self,
    model_name: &str,
    event_id: &str,
    entity_id: (&str, IsEventMessage),
    entity: (&Ty, IsStoreUpdate),
    block_timestamp: u64,
+   depth: Option<u32>, // Add depth parameter
) -> Result<()> {
+   const MAX_DEPTH: u32 = 32; // Prevent excessive recursion
+   let current_depth = depth.unwrap_or(0);
+   if current_depth > MAX_DEPTH {
+       return Err(anyhow!("Maximum nesting depth exceeded for model {}", model_name));
+   }
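The depth-cap idea in that diff can be shown in isolation. A minimal, dependency-free sketch: `MAX_DEPTH`, the toy `walk` function, and its `String` error type are illustrative only, standing in for the recursive schema traversal.

```rust
// Illustrative depth guard: every recursive call carries its depth and
// bails out once a fixed ceiling is exceeded, instead of risking a
// stack overflow on pathologically nested input.
const MAX_DEPTH: u32 = 32;

fn walk(nesting: &[u32], depth: u32) -> Result<u32, String> {
    if depth > MAX_DEPTH {
        return Err(format!("maximum nesting depth {MAX_DEPTH} exceeded"));
    }
    match nesting.split_first() {
        Some((_, rest)) => walk(rest, depth + 1), // recurse one level deeper
        None => Ok(depth),                        // reached the leaf
    }
}

fn main() {
    // Shallow input completes and reports its final depth.
    assert_eq!(walk(&[1, 2, 3], 0), Ok(3));
    // Input deeper than MAX_DEPTH is rejected with an error, not a crash.
    let deep: Vec<u32> = (0..100).collect();
    assert!(walk(&deep, 0).is_err());
    println!("depth guard ok");
}
```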
757-814
: Consider more specific index naming patterns. Ohayo, sensei! While the implementation is solid, the index naming could be more specific to avoid potential conflicts in large schemas.
Consider enhancing the index naming:
- "CREATE INDEX IF NOT EXISTS [idx_{table_id}_{name}] ON [{table_id}] ([{name}]);"
+ "CREATE INDEX IF NOT EXISTS [idx_{table_id}_{selector}_{name}] ON [{table_id}] ([{name}]);"
Line range hint
816-948
: Add documentation for complex type handling logic. Ohayo, sensei! The type handling implementation is comprehensive but could benefit from documentation explaining the mapping between Rust types and SQL types.
Consider adding documentation:
+ /// Recursively adds columns to the table schema based on the type structure.
+ ///
+ /// # Type Mapping
+ /// - Struct: Creates columns for each member
+ /// - Enum: Creates a TEXT column with CHECK constraint
+ /// - Array: Creates a TEXT column storing JSON array
+ /// - Primitive: Maps to corresponding SQL type
fn add_columns_recursive(
    &mut self,
    path: &[String],
    ty: &Ty,
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (1)
crates/torii/core/src/sql/mod.rs
(10 hunks)
🔇 Additional comments (3)
crates/torii/core/src/sql/mod.rs (3)
Line range hint 264-290
: LGTM! Secure schema management implementation.
Ohayo, sensei! The implementation properly uses parameterized queries and handles schema serialization securely.
369-370
: LGTM! Well-structured entity management refactoring.
Ohayo, sensei! The refactoring to use set_entity_model
improves code reuse and maintainability.
Also applies to: 429-430
449-456
:
Critical: SQL injection vulnerability in table name construction.
Ohayo, sensei! The delete_entity
method still constructs SQL queries using unsanitized table names from entity.name()
. This poses a SQL injection risk.
Consider using a query builder or implementing proper table name validation:
- format!(
- "DELETE FROM [{model_table}] WHERE internal_id = ?; DELETE FROM entity_model \
- WHERE entity_id = ? AND model_id = ?"
- )
+ // Use a dedicated function to validate table names
+ let safe_table_name = sanitize_table_name(&model_table)?;
+ format!(
+ "DELETE FROM [{safe_table_name}] WHERE internal_id = ?; DELETE FROM entity_model \
+ WHERE entity_id = ? AND model_id = ?"
+ )
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (3)
crates/torii/graphql/src/types.rs (1)
45-51
: Ohayo! The implementation looks clean, but could use documentation. The new inner() method is well-implemented and follows Rust idioms. It provides a clean way to access the inner type of List variants, which aligns well with the broader schema refactoring efforts. Consider adding documentation to explain the method's purpose:
+ /// Returns a reference to the inner TypeData if this is a List variant, None otherwise.
+ ///
+ /// # Examples
+ /// let list_type = TypeData::List(Box::new(TypeData::Simple(TypeRef::String)));
+ /// assert!(list_type.inner().is_some());
pub fn inner(&self) -> Option<&TypeData> {
    match self {
        TypeData::List(inner) => Some(inner),
        _ => None,
    }
}
crates/torii/graphql/src/query/mod.rs (2)
98-101
: Ohayo sensei! Consider caching the compiled regex. The regex is compiled on every function call, which could impact performance for frequent calls.
Consider using lazy_static to compile the regex once:
+use lazy_static::lazy_static;
+
+lazy_static! {
+    static ref TYPE_NAME_SANITIZER: Regex = Regex::new(r"[, ()<>-]").unwrap();
+}
- let re = Regex::new(r"[, ()<>-]").unwrap();
- let sanitized_member_type_name = re.replace_all(&name, "");
+ let sanitized_member_type_name = TYPE_NAME_SANITIZER.replace_all(&name, "");
165-168
: Ohayo sensei! Enhance error context in JSON parsing. The current error handling for JSON parsing could be more informative for debugging.
Consider adding more context:
- serde_json::from_str(&json_str).map_err(|e| {
-     sqlx::Error::Protocol(format!("JSON parse error: {}", e))
- })?;
+ serde_json::from_str(&json_str).map_err(|e| {
+     sqlx::Error::Protocol(format!(
+         "Failed to parse JSON for column '{}': {}. Value: '{}'",
+         column_name, e, json_str
+     ))
+ })?;
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (2)
crates/torii/graphql/src/query/mod.rs
(3 hunks)crates/torii/graphql/src/types.rs
(1 hunks)
🔇 Additional comments (1)
crates/torii/graphql/src/query/mod.rs (1)
22-23
:
Ohayo sensei! Handle as_struct()
gracefully to prevent panics
The .unwrap()
call on schema.as_struct()
could cause runtime panics if the schema isn't a struct type.
Consider handling the Option safely:
-pub fn build_type_mapping(namespace: &str, schema: &Ty) -> TypeMapping {
- let model = schema.as_struct().unwrap();
+pub fn build_type_mapping(namespace: &str, schema: &Ty) -> Result<TypeMapping, &'static str> {
+ let model = schema.as_struct().ok_or("Schema must be a struct type")?;
let mut column_name =
    if !is_external { format!("internal_{}", field_name) } else { field_name.to_string() };
Ohayo sensei! Sanitize column names to prevent SQL injection
The column name construction could be vulnerable to SQL injection if field names contain malicious input.
Consider adding validation or using prepared statements:
+fn sanitize_column_name(name: &str) -> String {
+ // Only allow alphanumeric characters, underscores, and dots
+ name.chars()
+ .filter(|c| c.is_alphanumeric() || *c == '_' || *c == '.')
+ .collect()
+}
let mut column_name =
- if !is_external { format!("internal_{}", field_name) } else { field_name.to_string() };
+ if !is_external {
+ format!("internal_{}", sanitize_column_name(field_name))
+ } else {
+ sanitize_column_name(field_name)
+ };
Committable suggestion skipped: line range outside the PR's diff.
fn populate_value(
value: &mut Value,
type_data: &TypeData,
entity_id: &Option<String>,
) {
match value {
Value::Object(obj) => {
for (field_name, field_value) in obj.iter_mut() {
populate_value(
field_value,
&type_data.type_mapping().unwrap()[field_name],
entity_id,
);
}

if type_data.type_mapping().map_or(false, |mapping| {
mapping.contains_key(&Name::new("option"))
}) {
obj.insert(
Name::new("option"),
Value::String(obj.keys().next().unwrap().to_string()),
);
}

// insert $entity_id$ relation
if let Some(entity_id) = entity_id {
obj.insert(
Name::new(INTERNAL_ENTITY_ID_KEY),
Value::from(entity_id),
);
}
}
Value::List(inner) => {
for item in inner {
populate_value(item, type_data.inner().unwrap(), entity_id);
}
}
_ => {}
}
}
🛠️ Refactor suggestion
Ohayo sensei! Consider extracting the nested populate_value function
The populate_value
function is quite complex and deeply nested within build_value_mapping
. This makes it harder to test and maintain.
Consider moving it outside as a standalone function:
+fn populate_value(
+ value: &mut Value,
+ type_data: &TypeData,
+ entity_id: &Option<String>,
+) {
+ match value {
+ Value::Object(obj) => {
+ // ... existing object handling logic
+ }
+ Value::List(inner) => {
+ // ... existing list handling logic
+ }
+ _ => {}
+ }
+}
fn build_value_mapping(
row: &SqliteRow,
types: &TypeMapping,
prefix: &str,
is_external: bool,
entity_id: &Option<String>,
) -> sqlx::Result<ValueMapping> {
// ... existing code ...
- fn populate_value(...) { ... }
// ... use the extracted function ...
}
Committable suggestion skipped: line range outside the PR's diff.
Summary by CodeRabbit
Release Notes
New Features
Added a schema field to the models table, allowing storage of model type information.
Introduced a new inner method in the TypeData enum for accessing inner list types.
Added a clear method in the ModelCache for improved model management.
Bug Fixes
Refactor
Chores
These changes enhance the functionality and maintainability of the application, providing users with improved error reporting and a more robust data handling framework.