Replies: 8 comments 1 reply
-
I would recommend using the Hugging Face tokenizer instead. See: https://github.com/deepjavalibrary/djl/tree/master/extensions/tokenizers
-
thanks. could you explain the reason why?
On Thu, May 9, 2024 at 7:57 AM Frank Liu wrote:
I would recommend you to use Huggingface tokenizer instead. see:
https://github.com/deepjavalibrary/djl/tree/master/extensions/tokenizers
-
how do we utilize the HF tokenizers which are written in Rust? asking for
my understanding.
On Thu, May 9, 2024 at 9:28 AM Frank Liu wrote:
1. DJL's HuggingFaceTokenizer behaves exactly as the Hugging Face model expects
2. We can automatically import those tokenizers and model-specific
configurations from the Hugging Face Hub
3. BertFullTokenizer doesn't support all the configurations that
HuggingFaceTokenizer does
-
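HuggingFaceTokenizer is a DJL class.
See more examples: https://github.com/deepjavalibrary/djl/blob/master/extensions/tokenizers/src/test/java/ai/djl/huggingface/tokenizers/HuggingFaceTokenizerTest.java#L34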
-
thanks. i guess my question is: did someone write a Java equivalent of
the Rust code, or does it interoperate with Rust somehow? if so, how?
On Thu, May 9, 2024 at 12:56 PM Frank Liu wrote:
HuggingFaceTokenizer
<https://javadoc.io/doc/ai.djl.huggingface/tokenizers/latest/ai/djl/huggingface/tokenizers/HuggingFaceTokenizer.html>
is a DJL class
import ai.djl.huggingface.tokenizers.Encoding;
import ai.djl.huggingface.tokenizers.HuggingFaceTokenizer;
HuggingFaceTokenizer tokenizer = HuggingFaceTokenizer.newInstance("bert-base-cased");
Encoding encoding = tokenizer.encode(inputs);
See more examples:
https://github.com/deepjavalibrary/djl/blob/master/extensions/tokenizers/src/test/java/ai/djl/huggingface/tokenizers/HuggingFaceTokenizerTest.java#L34
-
DJL uses JNI to interop with Rust.
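For readers unfamiliar with JNI, the general pattern looks roughly like the sketch below. This is an illustration only, not DJL's actual source: the class name `NativeTokenizerSketch`, the library name `tokenizers_sketch`, and the native method `encode` are all hypothetical. The Java side declares a `native` method and loads a shared library; the native side (which can be written in Rust, for example with the `jni` crate) exports the matching function.

```java
final class NativeTokenizerSketch {

    // Hypothetical native method: its body lives in a shared library, not in Java.
    // The native side must export a function following the JNI naming scheme
    // (Java_<class>_<method>) for the JVM to bind it.
    private static native long[] encode(String text);

    // Tries to load a shared library by name from java.library.path.
    // Returns false instead of throwing when the library is not present.
    static boolean tryLoad(String libName) {
        try {
            System.loadLibrary(libName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "tokenizers_sketch" is a made-up library name used only for illustration.
        if (tryLoad("tokenizers_sketch")) {
            long[] ids = encode("Hello world");
            System.out.println("token count: " + ids.length);
        } else {
            System.out.println("native library not found; JNI call skipped");
        }
    }
}
```

Since the hypothetical library does not exist, running this as-is takes the fallback branch; it only shows where the Java/native boundary sits.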
-
thanks. one more question: where is the native library stored? is it packaged within some jar?
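One way to check this yourself is to list the entries of a jar and look for shared-library files. Below is a minimal sketch using only the JDK. The class name `NativeLibFinder` is illustrative, the jar path is whatever you pass in, and the sketch makes no assumption about where DJL actually places its libraries.

```java
import java.io.IOException;
import java.util.List;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

final class NativeLibFinder {

    // Returns the names of jar entries whose extension suggests a native shared library.
    static List<String> findNativeLibs(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.stream()
                    .map(entry -> entry.getName())
                    .filter(name -> name.endsWith(".so")
                            || name.endsWith(".dll")
                            || name.endsWith(".dylib")
                            || name.endsWith(".jnilib"))
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java NativeLibFinder <path-to-jar>");
            return;
        }
        // Print one matching entry per line.
        findNativeLibs(args[0]).forEach(System.out::println);
    }
}
```

An empty result for a given jar would suggest the library is shipped or downloaded some other way, which this sketch cannot tell you.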
-
is this class thread-safe? it looks like so but i'd like to confirm.