-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow some special characters on TagsQuery component #205
Conversation
🦋 Changeset detectedLatest commit: 79e9e2b The changes in this PR will be included in the next version bump. This PR includes changesets to release 3 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
@@ -118,19 +105,4 @@ let TagsInput = forwardRef( | |||
} | |||
) | |||
|
|||
// Just uses an internal observable array | |||
export let MockTagsInput = inject(() => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This component was unused. Looks like it was used for a test at some point.
5d43273
to
2bdceb0
Compare
2bdceb0
to
2a33db8
Compare
I asked Marc for the spreadsheet he made about the analyzers, we might want to update this at some point but at the very least this should live somewhere in our notion documentation: https://docs.google.com/spreadsheets/d/1ObJyl2w0e1jszgYIdk9yUuZU7N9oeSOOtFDFRK_LDwg/edit#gid=0 Here is the code he used to generate this as well: https://github.com/smartprocure/elasticsearch-test-suite/blob/master/src/index.js |
_.replace(/([&|!(){}[\]^"~*?\\<>;,$'])/g, ' '), | ||
// These characters are not stripped out by our analyzers but they are | ||
// `query_string` reserved characters so we need to escape them. | ||
_.replace(/([+\-=:/])/g, '\\$1') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though we are stripping out the char '\', it should not exist in this replace for reserved characters based on the requirements: found here
Escape + - = & : /
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this context it's escaping the -
character
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add &
to this list and remove it from the list above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
packages/react/src/greyVest/utils.js
Outdated
_.trim, | ||
splitCommas ? splitTagOnComma : _.identity, | ||
_.castArray, | ||
sanitizeTagFn ? _.map(sanitizeTagFn) : _.identity |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In what case will this be called without a sanitizeTagFn?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Full disclaimer: I think sanitizeTagFn
is a terrible API. Having said that, sanitizeTagFn
is only currently relevant to transform the tags text for the main search bar, from which we end up sending elastic query_string
queries.
An example of where it's not needed is the TagsText component, which I think is used in the "Matches" filter.
|
||
let isValidInput = (tag, tags) => !_.isEmpty(tag) && !_.includes(tag, tags) | ||
|
||
let TagsInput = forwardRef( | ||
( | ||
{ | ||
tags, | ||
addTags, | ||
addTags: setTags, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see setTags in the story but nowhere else, where is this used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_.replace(/([&|!(){}[\]^"~*?\\<>;,$'])/g, ' '), | ||
// These characters are not stripped out by our analyzers but they are | ||
// `query_string` reserved characters so we need to escape them. | ||
_.replace(/([+\-=:/])/g, '\\$1') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add &
to this list and remove it from the list above.
packages/react/src/greyVest/utils.js
Outdated
// | ||
// If in doubt, make a request to the `/{index}/analyze` elasticsearch endpoint | ||
// to see exactly which characters get stripped out of text. | ||
let wordRegex = /[^|><!(){}[\]^"~*?\\;,$']+/g |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's consolidate this list with the list here: https://github.com/smartprocure/contexture/pull/205/files#diff-69f43c306be976d94d4493c9ca003a660ca8f6084b779bf99ae2288ca58d4519R38
We can pull this out into a common package that's shared between the front end and back end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created the common package and moved the regex there 👍
Let's also make sure to handle the scenario where there is no text left after stripping reserved characters from a tag. In that case, we should remove the tag altogether. |
@@ -68,7 +68,6 @@ | |||
"mobx": "^4.3.1", | |||
"mobx-react": "^6.3.0", | |||
"mobx-utils": "^5.0.0", | |||
"moment": "^2.24.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This dependency was duplicated.
f111190
to
5cdeda6
Compare
5cdeda6
to
79e9e2b
Compare
Coverage after merging feature/allow-special-keyword-characters into main will be
Coverage Report
|
Done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
The tags component used in the search bar currently strips out non-word characters in each tag. For example a tag
my+email
will be converted tomy email
. We do this to visually convey tags that contain text that closely resembles what we send to elastic (see futil-js/contexture-elasticsearch#170 for more information).There are three places to think about when reviewing this PR:
query_string
characters before sending the elastic request to prevent our queries from exploding. See https://github.com/smartprocure/contexture/blob/feature/allow-special-keyword-characters/packages/provider-elasticsearch/src/example-types/filters/tagsQuery.js#L36/{index}/_analyze
with a payload such as{ "analyzer" : "exact", "text" : "student discourse"}
Behavior in this PR
query_string
reserved characters in the tags component:| > < ! ( ) { } [ ] ^ " ~ * ? \
.; , $ '
as well since they are removed by all our analyzers anyway.& + - = : /
in contexture-elasticsearch because those are kept by our analyzers but are also reservedquery_string
characters.Further notes
The props
wordsMatchPattern
,maxWordsPerTag
, andmaxCharsPerTagWord
taken by theExpandableTagsInput
andTagsInput
component are not passed from anywhere (including spark) so this PR removes them. This is a case of adding props in case they're useful but without a use case.A common package
contexture-util
was introduced to share code between contexture packages.