Fix Doc-Sum stream output format #1219
Conversation
This change makes the output v1.1 compatible, but the issue of the wrong first-token latency count remains. In v1.1, DocSum outputs tokens like this, e.g.:
```
curl http://10.96.106.94:8888/v1/docsum -H "Content-Type: multipart/form-data" -F "type=text" -F "messages=" -F "files=@./pubmed_10.txt" -F "max_tokens=1024" -F "language=en" -F "stream=true"

data: b' \n\n'
data: b'The'
data: b' provided'
data: b' text'
```
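Since the reported first-token latency depends on which streamed line is treated as the first token, here is a minimal sketch of measuring it against the endpoint above. This is illustrative only, assuming the curl parameters shown; the function name is hypothetical and not part of the DocSum code.

```python
import time

import requests

def first_token_latency(url: str) -> float:
    """Time from sending the request until the first `data:` line arrives.

    If the stream emits non-token lines before the first real token (as
    discussed below), this naive measurement is wrong unless those lines
    are filtered out first.
    """
    start = time.perf_counter()
    with open("./pubmed_10.txt", "rb") as f, requests.post(
        url,
        files={"files": f},
        data={"type": "text", "messages": "", "max_tokens": "1024",
              "language": "en", "stream": "true"},
        stream=True,
    ) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines(decode_unicode=True):
            # First SSE payload line is taken as the first token.
            if line and line.startswith("data:"):
                return time.perf_counter() - start
    raise RuntimeError("stream ended without any data line")
```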
Now the output format is aligned with v1.1, but for some reason it still emits lines that should not be output as tokens. The next example shows the current output: the first three `data:` lines are not token outputs; the fourth line is the first real token. What are those three lines, and why are they there?
```
curl http://10.110.23.165:8888/v1/docsum \
  -H "Content-Type: multipart/form-data" \
  -F "type=text" \
  -F "messages=" \
  -F "files=@./pubmed_10.txt" \
  -F "max_tokens=1024" \
  -F "language=en" \
  -F "stream=true"

data: {"ops":[{"op":"replace","path":"","value":{"id":"0d08e051-f332-4ee4-9ffa-af7c38a32d91","streamed_output":[],"final_output":null,"logs":{},"name":"StuffDocumentsChain","type":"chain"}}]}
data: {"ops":[{"op":"add","path":"/logs/LLMChain","value":{"id":"1338af3d-063f-4cac-a7d0-4e0c891dcf67","name":"LLMChain","type":"chain","tags":[],"metadata":{},"start_time":"2025-01-23T07:42:11.221+00:00","streamed_output":[],"streamed_output_str":[],"final_output":null,"end_time":null}}]}
data: {"ops":[{"op":"add","path":"/logs/HuggingFaceEndpoint","value":{"id":"fbc35b7c-9cc0-4ea7-8878-011feb75ea14","name":"HuggingFaceEndpoint","type":"llm","tags":[],"metadata":{},"start_time":"2025-01-23T07:42:11.225+00:00","streamed_output":[],"streamed_output_str":[],"final_output":null,"end_time":null}}]}
data: {"ops":[{"op":"add","path":"/logs/HuggingFaceEndpoint/streamed_output_str/-","value":" \n\n"},{"op":"add","path":"/logs/HuggingFaceEndpoint/streamed_output/-","value":" \n\n"}]}
data: {"ops":[{"op":"add","path":"/logs/HuggingFaceEndpoint/streamed_output_str/-","value":"The"},{"op":"add","path":"/logs/HuggingFaceEndpoint/streamed_output/-","value":"The"}]}
```
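For reference, those three leading lines look like LangChain `astream_log`-style JSON-Patch run events (chain/LLM start metadata) rather than tokens; only the ops whose path ends in `/streamed_output_str/-` carry actual token text. Below is a minimal sketch of filtering the stream down to tokens, assuming the format shown above (the helper name is hypothetical):

```python
import json

def extract_tokens(sse_lines):
    """Yield only LLM tokens from an astream_log-style `data:` stream.

    Token events are JSON-Patch "add" ops whose path ends in
    "/streamed_output_str/-"; the run-metadata patches (the first three
    lines above) carry no such op and are skipped.
    """
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        try:
            payload = json.loads(line[len("data: "):])
        except json.JSONDecodeError:
            continue  # not a JSON-Patch event
        for op in payload.get("ops", []):
            if op.get("op") == "add" and op.get("path", "").endswith(
                "/streamed_output_str/-"
            ):
                yield op["value"]
```

Running this over the five `data:` lines above yields just `" \n\n"` and `"The"`.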
Yes, you are right. The stream output is not as expected: not only the format, but also the contents.
Workaround to keep Doc-Sum stream output aligned with v1.1 format

This is a workaround to extract the tokens from the stream output.

Fix issue: opea-project/GenAIInfra#753

Signed-off-by: Wang, Xigui <[email protected]>
The workaround extracts the LLM output tokens from the stream output, and the extracted tokens are correct. As I am not familiar with the stream output format, I believe there is a more mature solution that I have not found yet.
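As a sketch of what such a workaround could look like (reusing the hypothetical `extract_tokens` helper from above, not the actual patch), each extracted token can be re-emitted in the v1.1 wire format, i.e. `data: b'...'`:

```python
def to_v11_line(token: str) -> str:
    # v1.1 streams the repr of the token bytes, e.g.:  data: b'The'
    return f"data: {token.encode('utf-8')!r}\n\n"

def align_stream(sse_lines):
    """Re-emit an astream_log-style stream in the v1.1 token format."""
    for token in extract_tokens(sse_lines):
        yield to_v11_line(token)
```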
Closing this to wait for a complete fix.
Keep Doc-Sum stream output aligned with v1.1 format

Fix issue: opea-project/GenAIInfra#753
Type of change
List the type of change, as below. Please delete options that are not relevant.
Dependencies
List any newly introduced 3rd-party dependencies, if any exist.
Tests
Describe the tests that you ran to verify your changes.