Replies: 3 comments 3 replies
-
DJL supports transfer learning for PyTorch; you can take a look at this example: https://github.com/deepjavalibrary/djl/blob/master/examples/src/main/java/ai/djl/examples/training/transferlearning/TrainAmazonReviewRanking.java
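For context, here is a minimal sketch of what that transfer-learning pattern can look like in DJL: load a pretrained block, freeze its parameters, and stack a fresh task head on top. This is an illustrative sketch, not the code from the linked example; the model path and output unit count are placeholders.

```java
import java.nio.file.Paths;

import ai.djl.ndarray.NDList;
import ai.djl.nn.Parameter;
import ai.djl.nn.SequentialBlock;
import ai.djl.nn.core.Linear;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ZooModel;

public class TransferLearningSketch {
    public static void main(String[] args) throws Exception {
        // Load a pretrained model (the path is a placeholder)
        Criteria<NDList, NDList> criteria = Criteria.builder()
                .setTypes(NDList.class, NDList.class)
                .optModelPath(Paths.get("build/pretrained"))
                .build();
        try (ZooModel<NDList, NDList> pretrained = criteria.loadModel()) {
            // Freeze the pretrained parameters so only the new head trains
            for (Parameter p : pretrained.getBlock().getParameters().values()) {
                p.getArray().setRequiresGradient(false);
            }
            // Stack a new task-specific head on top of the frozen base
            SequentialBlock network = new SequentialBlock()
                    .add(pretrained.getBlock())
                    .add(Linear.builder().setUnits(2).build()); // 2 classes, illustrative
            // ... build a Trainer around `network` and train as usual
        }
    }
}
```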
-
We can access the C++ API with the JavaCPP Presets for PyTorch, so it should be possible to do that:
-
Hi, I made some changes to load the model using the PtEngine's loader. Now I can see the blocks from the loaded model appearing as PtSymbolBlock. However, loading the model this way causes checkGradients() to fail. On closer inspection, it looks like the loaded block is not returning gradients to DJL properly. Specifically, this block of code
-
It seems PyTorch allows a TorchScript model to continue training when loaded outside Python, for example in C++, as discussed here: https://github.com/pytorch/pytorch/issues/17614.
Is something like this possible with DJL?
I tried generating a TorchScript model from Hugging Face following the script here: https://huggingface.co/transformers/torchscript.html#saving-a-model
and loading it with DJL.
Then I added my task-specific loss and trained for a few iterations. There were no errors, but I noticed that only the task head was updated, while the language model (the TorchScript model) did not seem to update (forward gives the same result before and after training).
The PyTorch engine reports no errors. I tried to follow the C++ example from PyTorch, but I am not sure how to access the parameters the way this person did in C++ here: https://github.com/pytorch/pytorch/issues/17614#issuecomment-769151466. When I examine the Block object I loaded, I cannot find the parameters that the C++ example shows.
Any help is greatly appreciated =)
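In case it helps anyone debugging the same thing, here is a hedged sketch of one way to inspect the parameters of a loaded block in DJL and check whether gradients actually reach the TorchScript part. The loss function (`Loss.l2Loss()`) and method names here are illustrative stand-ins, not taken from my actual setup.

```java
import ai.djl.Model;
import ai.djl.engine.Engine;
import ai.djl.ndarray.NDArray;
import ai.djl.ndarray.NDList;
import ai.djl.nn.Parameter;
import ai.djl.training.GradientCollector;
import ai.djl.training.ParameterStore;
import ai.djl.training.loss.Loss;
import ai.djl.util.Pair;

public class GradientCheckSketch {
    /** Runs one forward/backward pass and prints which parameters received a gradient. */
    public static void inspectGradients(Model model, NDList input, NDArray label) {
        try (GradientCollector collector = Engine.getInstance().newGradientCollector()) {
            // Forward in training mode so the graph is recorded
            NDList output = model.getBlock()
                    .forward(new ParameterStore(input.getManager(), false), input, true);
            NDArray loss = Loss.l2Loss().evaluate(new NDList(label), output);
            collector.backward(loss);
        }
        // If the TorchScript block is frozen or not wired into the graph,
        // its parameters will show no (or zero) gradient here
        for (Pair<String, Parameter> pair : model.getBlock().getParameters()) {
            NDArray array = pair.getValue().getArray();
            if (array.hasGradient()) {
                System.out.println(pair.getKey() + " |grad| sum = " + array.getGradient().abs().sum());
            } else {
                System.out.println(pair.getKey() + " has no gradient attached");
            }
        }
    }
}
```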