-
Notifications
You must be signed in to change notification settings - Fork 144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(chunking): minimal fix for combine arg typo #345
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Looks like that unit test I mentioned is catching this as expected. Parallel mode isn't so relevant right now, but that test is asserting that the api parameters are properly sent to partition, so I think you'll just need to rename the arg over there as well.
I'd also make a note in the changelog that this is fixed in case users are checking the release notes.
cbdc673
to
5acd8a7
Compare
The parameter name in Actually, It's one of the challenges with taking advantage of the |
@awalker4 how does the versioning work for this repo? Should I be bumping the version to If you can help me understand how this ends up appearing in production after merging, that would help me a lot so I can let the end-users know what to expect :) |
5acd8a7
to
edd5d2f
Compare
@scanny The version change is driven by the changelog - so you can add a section there under 0.0.63-dev0. Running |
For consistency and to enable further lint-error fixing in future PRs, add the configuration required for `black` and `ruff` to run from the command-line. This also enables these to be run by language servers (LSPs) in popular editors. Settings are the same as in the `unstructured` repo.
A pair of anomalies, possibly related to Apple M3 Silicon or just newest MacbookPro. The reported available virtual memory is 90 GB so bumped that up to 300 GB in `test_general_api_returns_503()`. Also, for whatever reason, `starlette.Request.client` is `None` when running the tests locally, at least on my machine, so include that possible case (which should be accounted for anyway, `None` is a legitimate value for `Request.client`). All tests pass locally after these two changes.
edd5d2f
to
6d936e6
Compare
6d936e6
to
a8935ad
Compare
This is the minimal fix for the
combine_text_under_n_chars
arg-name typo bug described in #337.Chunking parameters were added to the API about 4 months ago in this commit: 49fe710
At that time, the identifier
combine_under_n_chars
was used instead ofcombine_text_under_n_chars
(as that parameter appears inchunk_by_title()
). As a result, any value supplied for this argument via the REST API is ignored and defaults to 500. Thecombine_under_n_chars
parameter name appears in the API documentation and elsewhere (README.md etc.) and differs from the parameter name documented forchunk_by_title()
.A fuller remedy will be to add a separate
combine_text_under_n_chars
parameter to the API endpoint such that either name is accepted for now, then update the documentation such that all mentions arecombine_text_under_n_chars
. At some point in the future we may remove thecombine_under_n_chars
parameter from the endpoint.However, making those changes requires a synchronized release of the
unstructured
library and in the interest of time we're making this "quick-fix" that can be released independently of any other dependencies.