Modified get_stop_words() to prevent its stop-word list from being changed from outside. #31
Dear Alir3z4,
I used this repo for work at my previous company, and I found one issue with the function `get_stop_words()`: if we store the returned list in a variable and then modify that variable, the list returned by later calls to `get_stop_words()` is changed as well.

This leads to mistakes when `get_stop_words('en')` is called many times, for example recursively.

To solve this issue, the user can of course use `copy.deepcopy(get_stop_words('en'))`, but the user may not notice that this is necessary. Thus I added a `copy` inside `get_stop_words()`, and as a result, modifying the returned list no longer affects later calls.
I have also tested the performance before and after the change:
before: https://github.com/yyanhan/python-stop-words/blob/example/test_before.ipynb
after: https://github.com/yyanhan/python-stop-words/blob/example/test_after.ipynb
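For readers who want a quick local comparison, the cost of returning a copy versus returning the same object could be measured roughly as follows. This is only a sketch and does not reproduce the linked notebooks: `STOP_WORDS`, `return_reference` and `return_copy` are hypothetical names, and the 200-entry list is made-up data.

```python
import timeit

# Hypothetical cached list; the size is an arbitrary choice for illustration.
STOP_WORDS = ["word%d" % i for i in range(200)]


def return_reference():
    """Hand back the cached list object itself."""
    return STOP_WORDS


def return_copy():
    """Hand back a new list with the same items."""
    return list(STOP_WORDS)


# Time 100,000 calls of each variant.
t_ref = timeit.timeit(return_reference, number=100_000)
t_copy = timeit.timeit(return_copy, number=100_000)
print(f"reference: {t_ref:.4f}s  copy: {t_copy:.4f}s")
```

Absolute numbers depend on the machine and the list length, so only the relative difference is meaningful.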
I hope this PR can make it better!
Best
Han