-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DO NOT MERGE] Add validateValueType function to ensure data type integrity #1002
base: main
Are you sure you want to change the base?
Conversation
Hi, I’m Jit, a friendly security platform designed to help developers build secure applications from day zero with an MVS (Minimal viable security) mindset. In case there are security findings, they will be communicated to you as a comment inside the PR. Hope you’ll enjoy using Jit. Questions? Comments? Want to learn more? Get in touch with us. |
It might be worth adding a couple scale tests to this to ensure that running on tens or hundreds of thousands of values doesn't significantly impact performance. You don't have to include those on every ci run, but they're good to have to validate. We did that with normalization in persister and caught a couple of important bugs that would have caused significant slow downs. |
@xdumaine I've added tests for 500k of items, they run properly on my M1 Pro laptop.
|
Added |
I think this approach is both valid and a good incremental step forward. It solves a problem we have. I do think that it may be worth considering if using a JSON schema validator on the entities and properties would not provide a more battle-tested form of data validation. We already make use of some JSON Schema validation during development and with better integration I think it could be a powerful safeguard in our platform. Additionally, JSON schemas would help with search and entity property documentation, a current and longstanding problem. |
packages/integration-sdk-core/src/data/createIntegrationEntity.ts
Outdated
Show resolved
Hide resolved
packages/integration-sdk-core/src/data/createIntegrationEntity.ts
Outdated
Show resolved
Hide resolved
// For non-array values, check if the type is supported | ||
const valueType = typeof value; | ||
if (!SUPPORTED_TYPES.includes(valueType)) { | ||
throw new IntegrationError({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can we keep steps from failing on unsupported data? This is a step towards strictness that'll have any overall impact on job statuses and the ingestion team.
How else could we improve the overall health of jobs and data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should open a bigger discussion for this but this should be a MAJOR, we have some ideas in the AWS Pod as to how to proceed
Can we simplify this and avoid recursive calls with something like validate(entity: Entity): void {
for(const [key, value] of Object.entries(entity)) {
const valid = Array.isArray(value) ? value.every(p => SUPPORTED_TYPES.includes(typeof p)) : SUPPORTED_TYPES.includes(typeof value);
valid || throw new IntegrationError({
code: 'UNSUPPORTED_TYPE',
message: `Unsupported type found at "${key}": Nested arrays are not supported.`,
});
}
} |
@Nick-NCSU tried adding your code into my existing codebase but couldn't succeed, feel free to push to this branch! |
Added the validateValueType function, significantly enhancing the robustness of our data handling in the createIntegrationEntity process.
The function is designed to enforce strict type checking, ensuring that all data conforms to our predefined SUPPORTED_TYPES.
This only applies for createIntegrationEntity, we already have types for
assign
but IMO it's never too late to validate anyway