Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Typed search attributes #1612

Open
wants to merge 16 commits into
base: main
Choose a base branch
from
Open

Typed search attributes #1612

wants to merge 16 commits into from

Conversation

THardy98
Copy link
Contributor

@THardy98 THardy98 commented Jan 24, 2025

What was changed

Introduces TypedSearchAttributes to the Typescript SDK, allowing users to supply and receive type-enforced search attributes. TypedSearchAttributes are intended to replace the existing (not type-enforced) SearchAttributes which have been marked as @deprecated in its existing usages.

Why?

Brings parity with other SDKs which have a typed search attribute implementation.
Brings parity to how the server represents search attributes.
Provides a typed, structured interface to use search attributes.

For reviewers

Points of interest:

  • typed-search-attributes.ts contains the API
  • test-typed-search-attributes.ts contains tests and demonstrates API usage
  • worflow.ts for changes to upsertSearchAttributes (mutations to search attributes in workflow info)
  • payload-search-attributes.ts contains decoding/encoding logic for search attributes (both legacy and typed)

  1. Closes [Feature Request] Typed Search Attributes #1232

  2. How was this tested:
    Added tests, see test-typed-search-attributes.ts

  3. Any docs updates needed?
    Likely as we are deprecating existing SearchAttributes

@THardy98 THardy98 requested a review from a team as a code owner January 24, 2025 13:39
@THardy98 THardy98 marked this pull request as draft January 24, 2025 13:39
@THardy98 THardy98 requested a review from mjameswh January 24, 2025 16:54
@THardy98
Copy link
Contributor Author

Most of the necessary revisions are done, fixing failing tests and adding new ones.

@THardy98 THardy98 marked this pull request as ready for review February 6, 2025 08:54
- remove ITypedSearchAttributes, expose concrete class TypedSearchAttributes
- update constructor to take a list of pairs, consistent with the usage of pairs and class throughout the API
- `Options` objects (i.e. user input) allow for both pairs and the concrete class as input types
- created `TypedSearchAttributeUpdate`/`Pair` types, to be used for upserts and updates, the only difference with these types and the existing `TypedSearchAttribute/Pair` types is that they except `null` values (for deletion)
- fixed the `upsert` method to accomodate these types
- `updateSearchAttributes` now returns a new copy of `TypedSearchAttributes` to avoid mutation of existing `TypedSearchAttributes` objects
- a myriad of small changes to accomodate the type changes
@THardy98 THardy98 force-pushed the typed-search-attributes branch from e6c6b31 to af0754e Compare February 6, 2025 18:29
@THardy98
Copy link
Contributor Author

THardy98 commented Feb 6, 2025

Updated the PR description with an overview for reviewers.

Some remaining points of uncertainty:

  • a couple TODO comments in places where I follow an existing pattern but am unsure if this is still the case (particularly those in payload-search-attributes.ts and internals.ts
  • I serialized expected responses in tests (i.e. JSON.parse(JSON.stringify(...))) to handle JSON converted responses from query. Figured this was okay, but just want to double-check
  • wondering if it's worth to test the isTypeSearchAttributePair type guard directly, or if the existing tests are sufficient

@THardy98 THardy98 force-pushed the typed-search-attributes branch from 9d727b3 to 3877147 Compare February 6, 2025 23:54
@THardy98 THardy98 force-pushed the typed-search-attributes branch from 3877147 to 3261083 Compare February 7, 2025 00:09
@THardy98 THardy98 requested a review from mjameswh February 7, 2025 00:32
@THardy98
Copy link
Contributor Author

THardy98 commented Feb 7, 2025

Who do I need to get in touch with to update docs? (considering we're deprecating a user-facing feature in exchange for another)

…tic methods within the class

renamed some types and methods
@THardy98 THardy98 force-pushed the typed-search-attributes branch from 1a601a1 to 0c3afe2 Compare February 7, 2025 22:58
@@ -723,6 +728,10 @@ test('Workflow can read WorkflowInfo', configMacro, async (t, config) => {
runId: handle.firstExecutionRunId,
taskQueue,
searchAttributes: {},
// Ensure serialized data structure is correct
typedSearchAttributes: JSON.parse(JSON.stringify(new TypedSearchAttributes())),
Copy link
Contributor

@mjameswh mjameswh Feb 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was not evident to me at first sight that the TypedSearchAttributes object would actually serialize to JSON properly. By spec, JSON.stringify(obj) only serializes enumerable own properties, so an object's private fields shouldn't be listed.

But turns out it does, as TS predates JS' EcmaScript 5 class model, so the TypeScript compiler translates "private fields" to regular properties on the JS object. So we get something like this:

{
    "searchAttributes": {
        "CustomIntField": ["INT", 42],
        "CustomKeywordField": ["KEYWORD", "test-value2"]
    }
}

Still, that feels fragile, especially given that this is what sink functions will see, and what users might see in logs if they print out Workflow Info or Search Attributes. I particularly don't like that we have this inner searchAttributes object, and that we serialize using our internal storage format rather than something that adheres to the "user visible" API (i.e. the format that a user could pass as input to the TypedSearchAttributes class).

So two suggestions:

  1. On the TypedSearchAttributes class, redefine the toJSON() function to return the precise object as we'd like it to get serialized, i.e. probably a TypedSearchAttributePair[];
  2. In at least some of your tests, rather than doing JSON.parse(JSON.stringify(someSearchAttributeObject)) for the purpose of deepEquals(), be explicit on what you would expect to get (e.g. [[["CustomIntField", "INT"], 42]]), so that we don't break that at a later point (that would be just too easy to miss otherwise).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree with these points.

  1. I've defined toJSON() on the class to return its JSON representation as SearchAttributePair[]
  2. I've removed all usages of JSON.stringify in tests. I've changed WorkflowInfo to return SearchAttributePair[] instead of TypedSearchAttributes. This is the only case where the server response is SearchAttributePair[] because we don't get an opportunity to decode. I think this is okay, particularly because users will likely use SearchAttributePair[] when specifying search attribute options. Maybe we can change the field name to reflect that the value is a list of pairs, I have no preference. Consequently, when making assertions on WorkflowInfo we just need to compare lists, no need for JSON.stringify. All other test cases work with TypedSearchAttributes directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted the type back to a class type TypedSearchAttributes as discussed offline (keeping the toJSON override).

@@ -118,6 +118,7 @@ if (RUN_INTEGRATION_TESTS) {
memo: {},
parent: undefined,
searchAttributes: {},
typedSearchAttributes: JSON.parse(JSON.stringify(new TypedSearchAttributes())),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we might have more and more discrepencies between the actual WorkflowInfo interface and that sink functions see of it. We already had the unsafe.now(), mentionned below, and now, typedSearchAttributes will be plain objects/arrays rather than an instance of the TypedSearchAttributes class.

We could consider rehydrating the WorkflowInfo objects before actually passing them to sink functions, but I honestly don't think there's a strong use case for it enough to justify the extra CPU cycle and memory churning that would involve. So I think we may simply create a variant of the WorkflowInfo interface that corresponds to what we actually get in sinks.

Don't fix this just now, but please add a FIXME and/or open a ticket for that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above on changing this type to being SearchAttributePair[] instead of TypedSearchAttributes (again, maybe we can change the field name to better reflect this).

Copy link
Contributor Author

@THardy98 THardy98 Feb 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, just created a literal to satisfy the test and added a FIXME. As discussed offline, we don't expect this to be a problem in practice.

...info,
searchAttributes: {
...info.searchAttributes,
// Note that we keep empty arrays in the search attributes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that was a previous oversight and would prefer we actually fix that. We should prune out "empty" search attributes on upsert.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

I hesitated to do this before because I noticed this comment in Python's upsert impl:

 # Update the keys in the existing dictionary. We keep exact values
 # sent in instead of any kind of normalization. This means empty
 # lists remain as empty lists which matches what the server does.

It's not clear to me why this would be useful, regardless if it matches server. Maybe worth asking Chad about this ?

@@ -1347,7 +1362,8 @@ export function setDefaultSignalHandler(handler: DefaultSignalHandler | undefine
*
* @param searchAttributes The Record to merge. Use a value of `[]` to clear a Search Attribute.
*/
export function upsertSearchAttributes(searchAttributes: SearchAttributes): void {

export function upsertSearchAttributes(searchAttributes: SearchAttributes | TypedSearchAttributeUpdatePair[]): void {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to update the typedoc above.

Copy link
Contributor Author

@THardy98 THardy98 Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

for (const pair of searchAttributes) {
const [k, v] = pair;
if (v === null) {
// If the value is null (i.e. delete), set the value to an empty list.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as below, I think keeping the empty list in worklfowInfo.searchAttributes was a previous oversight, and would prefer that we remove the attribute entirely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto the previous comment on the Python SDK's implementation (removed but maybe we should clarify semantics with Python SDK and Chad)

const typedKey = TypedSearchAttributes.getKeyFromUntyped(k, v);

// Unable to discern the typing of the key, skip.
if (!typedKey) {
Copy link
Contributor

@mjameswh mjameswh Feb 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in this case it would be preferable to see if there is already an SA named k in typedSearchAttributes, and if so:

  1. Try to use type of SA k in typedSearchAttributes as the type for v;
  2. If the type is incompatible, remove entry k from typedSearchAttributes.

That will avoid some possible cases of inconsistencies where typedSearchAttributes[k] and searchAttributes[k] both exists but have different values.

Copy link
Contributor Author

@THardy98 THardy98 Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think case 1 is possible. If we get undefined from TypedSearchAttributes.getKeyFromUntyped, we've checked that the value is not string, number, boolean, Date, or string[]. Consequently, there would be no possible SearchAttributeType that would be valid. The value would fail a isValidValueForType check for any SearchAttributeType.

For case 2, I had initially implemented this change but then reverted it. I can't think of scenarios where typedSearchAttributes[k] and searchAttributes[k] both exist but have different values (but I very well could just not be aware of the semantics).

If we enumerate the possibilities for attr A:

  1. A belongs in both typed and untyped, same value
  2. A belongs in both typed and untyped, different value (if this is possible)
  3. A belongs in typed but not untyped (if this is possible)
  4. A belongs in untyped but not typed (if this is possible)
  5. A does not belong in typed or untyped

Deletion makes things more consistent for cases 2 (A will only exist in untyped but at least there are no differing values) and 3 (A will not exist in either), but makes things less consistent for case 1 (which is arguably the common case). It's a no-op for cases 4 and 5.

In any case, I'm not sure if this would be expected behavior for a user. I don't think a user would expect an update/insert operation to delete unless explicitly specified to do so.

WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed the deletion case for upsert.

Added some improvements to handle date strings and less type loss from KEYWORD to TEXT

// Use updateSearchAttributes to get a deep copy of the current state.
let typedSearchAttributes = info.typedSearchAttributes.updateAttributes([]);
for (const [k, v] of Object.entries(searchAttributes)) {
const typedKey = TypedSearchAttributes.getKeyFromUntyped(k, v);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the code here (getKeyFromUntyped and below) wants to be permissive, kind of allowing going through with v being a single primitive, rather than an array, but then you'll get until line 1442 and fail on newValue = v[0]. You may also get weird behaviors if v is set to the single value false or 0, because of the !v check at line 1427.

Technically speaking, the former API only allowed v to be an array or undefined, so let's just throw early if v anything else than that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

v is recognized as SearchAttributeValueOrReadonly within the loop, which does imply array or undefined.

I don't get any compiler issues here given that typing, but I've added an error check just in case if we want to be extra safe.

let newValue: unknown = v;
if (typedKey.type !== SearchAttributeType.KEYWORD_LIST) {
// Skip if we found multiple values for a non-list key.
if (Array.isArray(v) && v.length > 1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove from typedSearchAttributes in that case. That's better than having incohenrencies.

Copy link
Contributor Author

@THardy98 THardy98 Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto the previous comment on removing from typedSearchAttributes.

In any case, I've removed this check and a couple other validation checks in the upsert logic. If we can determine a valid type for the given input then these additional checks are redundant.

I've also left a short comment in the upsert logic that covers the basic cases, thought it might be useful though the logic is not complicated. Can remove if you think unnecessary. Testing some of these cases (i.e. multi-value input for non keyword list, invalid objects/primitives) is difficult given that server will error if you try to do this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the comment. Code is pretty readable and doesn't really warrant it.

@@ -12,6 +12,7 @@ export * from './converter/data-converter';
export * from './converter/failure-converter';
export * from './converter/payload-codec';
export * from './converter/payload-converter';
export * from './converter/payload-search-attributes';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's be specific in what we reexport. TBH, I don't think there's anything in that file that I would consider as public API. It's ok for our own packages to import inner files of the common package.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree.

Haven't commit this change yet, getting some weird import issues after replacing with @temporalio/common/lib/converter/payload-search-attributes, but I'll add it once I fix it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -24,6 +25,7 @@ export type { Timestamp, Duration, StringValue } from './time';
export * from './workflow-handle';
export * from './workflow-options';
export * from './versioning-intent';
export * from './typed-search-attributes';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, be specific. There are things here that are meant to be public APIs, but not *.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, lack of oversight here. I've made the exports more explicit now.

@mjameswh
Copy link
Contributor

Naming question (probably/possibly bike-shedding):

Do we prefer TypedSearchAttributes or SearchAttributesTyped?

Definitely TypedSearchAttributes for me.

};

// The corresponding typed search attributes from untypedSearchAttributes.
const typedFromUntypedInput: TypedSearchAttributePair[] = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though it may be useful to have a "createAtttributeValuePair()" utility function, it is not how I would expect a user to use the Typed Search Attributes API in general.

The mental model, from a user's POV must be key(name, type) => value.

Take this Python code:

    # Define TSA Keys, would usually be in some global place
    key_1 = SearchAttributeKey.for_text("python-test-schedule-sa-update-key-1")
    key_2 = SearchAttributeKey.for_keyword("python-test-schedule-sa-update-key-2")

    # Example of setting TSA Keys=>Values from scratch
    handle = await client.create_schedule(
        f"schedule-{uuid.uuid4()}",
        Schedule(action=create_action, spec=ScheduleSpec()),
        search_attributes=TypedSearchAttributes(
            [
                SearchAttributePair(key_1, val_1),
                SearchAttributePair(key_2, val_2),
            ]
        ),
    )

    # Example of reading TSA values
    assert input.description.typed_search_attributes[key_1] == val_1
    assert input.description.typed_search_attributes[key_2] == val_2

    # Creating new TSA by deriving exisitng TSA K/V
    return ScheduleUpdate(
        input.description.schedule,
        search_attributes=input.description.typed_search_attributes.updated(
            SearchAttributePair(key_1, val_1 + "-new"),
            SearchAttributePair(key_2, val_2 + "-new"),
        ),
    )

In TS and following existing idioms of the TS SDK, that give something like this:

    // Define TSA Keys, would usually be in some global place
    export const key1 = defineSearchAttribute("my-custom-sa-key-1", 'STRING');
    export const key2 = defineSearchAttribute("my-custom-sa-key-2", 'KEYWORD');
    export const key3 = defineSearchAttribute("my-custom-sa-key-3", 'INT');

    // Example of setting TSA Keys=>Values from scratch
    const handle = await client.workflow.start(
        // ...
        searchAttributes: [              // Technically no need for a builder function here as TS type 
            [key1, "My customer Name"],  // system is expressive enough to provide type safety on mappings.
            [key2, "some-keyword"],      // I wouldn't mind some utility function to make this more readable,
        ]                                // but the tuple notation should technically be enough.
    );

    // Example of reading TSA values
    assert input.description.typedSearchAttributes.get(key1) === "My customer Name";
    assert input.description.typedSearchAttributes.get(key2) "some-keyword";

    # Creating new TSA by deriving exisitng TSA K/V
    return ScheduleUpdate(
        input.description.schedule,
        search_attributes=input.description.typedSearchAttributes.updated([
            [key2, undefined],  // Delete an existing key
            [key3, 84],
        ]),
    )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline, the API now offers usage like:

// Define keys:
const key1 = { name: 'key_text', type: SearchAttributeType.TEXT };
const key2 = { name: 'key_int', type: SearchAttributeType.INT };

// Define pairs:
const textPair = { key: key1, value: 'even_newer_value' };
const intPair = { key: key2, value: 45 };

// Define collections
const searchAttrs = [textPair, intPair];
// or inline
const inlineAttrs = [
    { key: key1, value: 'even_newer_value' },
    { key: key2, value: 45 }
]

// Define class instance 
const typedSearchAttrs = new TypedSearchAttributes(searchAttrs);
// or inline
const typedSearchAttrs = new TypedSearchAttributes([
    { key: key2, value: 45 }
]); 

@THardy98
Copy link
Contributor Author

THardy98 commented Feb 26, 2025

Just want to call out this comment I wrote in test-typed-search-attributes.ts, because I'm having trouble figuring out why this behavior occurs:

    // Note: There is a discrepancy between typed search attributes in workflow info and workflow
    // exec description after upserts with untyped search attributes.
    //
    // The upsert in this test is done through a signal, which serializes the upsert input.
    // The serialized input converts dates to their ISO string representation. When we read untyped
    // seralized search attributes in upsert, we try to infer their key type based on the value.
    // By default, untyped string values are identified as TEXT type. As a result, serialized date strings
    // become TEXT type and attributes that may have been KEYWORD type are also inferred as TEXT type.
    //
    // When we read the search attributes from the workflow info, we expect to see these type conversions
    // in the typed search attributes.
    //
    // This does not happen with workflow exec description. It's unclear to me why this happens, given that
    // the same upsert logic applies. The only difference being that we decode search attributes from server
    // for the workflow exec description.

I suspect the decoding from server attributes is the cause, but I don't understand how this could happen given that the type conversions due to the upsert should occur first?

- change API
- clarify upsert code/semantics
- improve test assertions (removing json serialization for data comparisons)
@THardy98 THardy98 force-pushed the typed-search-attributes branch from 2fb4d77 to 40f9117 Compare February 26, 2025 07:57
@THardy98 THardy98 requested a review from mjameswh February 26, 2025 17:05
…es. Improvements to handle date strings and TEXT/KEYWORD type inferences
@THardy98
Copy link
Contributor Author

THardy98 commented Feb 28, 2025

Just want to call out this comment I wrote in test-typed-search-attributes.ts, because I'm having trouble figuring out why this behavior occurs:
...

We're pushing to the activator before upserting and the workflow exec description reads from server, so we don't incur this type loss for the description.

In any case, we handle date strings and KEYWORD -> TEXT type loss a bit better now which removes this discrepancy.

Removed this comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature Request] Typed Search Attributes
2 participants