
Make TextStreamer public & add unit-tests #1700

Open — wants to merge 9 commits into master from add_text_streamer_tests

Conversation

@pavel-esir (Contributor) commented Feb 10, 2025

  • The check for incomplete text should be done as early as possible so that m_decoded_lengths records that the text at this token's position is incomplete (expressed with -1).
  • Made TextStreamer public.
  • Added unit tests for TextStreamer, which use pybind to expose this object to Python for test purposes. TextCallbackStreamer remains private and visible only to developers.

Ticket: CVS-148635
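The first bullet can be illustrated with a minimal Python sketch (an assumption for illustration, not the library's actual C++ code; the list name mirrors the C++ member m_decoded_lengths, and the real implementation overwrites the last entry rather than appending):

```python
# Minimal sketch (not the actual implementation) of the incomplete-text
# bookkeeping described above. When a decoded chunk ends with U+FFFD (the
# replacement character), the last token did not finish a UTF-8 sequence,
# so its position is marked with -1 instead of a decoded length.

REPLACEMENT = "\ufffd"  # U+FFFD, shown as "�"

def record_decoded_length(decoded_lengths: list, text: str) -> None:
    """Append the decoded length, or -1 if the text is still incomplete."""
    if text.endswith(REPLACEMENT):
        decoded_lengths.append(-1)  # text at this token's position is incomplete
    else:
        decoded_lengths.append(len(text))

lengths: list = []
record_decoded_length(lengths, "Hello")        # complete text
record_decoded_length(lengths, "Hello\ufffd")  # truncated multi-byte character
print(lengths)  # [5, -1]
```

Doing this check as early as possible matters because later logic (e.g., deciding whether to flush text to the callback) consults these recorded lengths.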

@pavel-esir pavel-esir added the bug Something isn't working label Feb 10, 2025
@pavel-esir pavel-esir added this to the 2025.1 milestone Feb 10, 2025
@github-actions github-actions bot added category: LLM LLM pipeline (stateful, static) category: GHA CI based on Github actions category: cmake / build Cmake scripts no-match-files labels Feb 10, 2025
Resolved (outdated) review threads on: CMakeLists.txt, .github/workflows/windows.yml, .github/workflows/linux.yml
if (text.size() >= 3 && text.compare(text.size() - 3, 3, replacement) == 0) {
m_decoded_lengths[m_decoded_lengths.size() - 1] = -1;
// Don't print incomplete text
return set_streaming_status(on_finalized_subword_callback(res.str()));
Collaborator commented:
IterableStreamer needs a similar change. Please add ChunkStreamer to tests as well.

Contributor (author) replied:

  1. It turned out that IterableStreamer already handles this situation correctly.

  2. We discussed this with @as-suvorov; it looks like he will add such unit tests together with the implementation of TextStreamer with vectorized write.
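For context, the C++ hunk above compares the last three bytes of the byte-indexed std::string against the replacement character, which encodes as 0xEF 0xBF 0xBD in UTF-8. A hedged Python equivalent of just that check, operating on bytes the way the C++ code does:

```python
# U+FFFD encodes to exactly three bytes in UTF-8, which is why the C++
# check compares the last 3 bytes of the std::string.
REPLACEMENT_UTF8 = "\ufffd".encode("utf-8")  # b"\xef\xbf\xbd"

def ends_with_replacement(text: bytes) -> bool:
    # Mirrors: text.size() >= 3 && text.compare(text.size() - 3, 3, replacement) == 0
    return len(text) >= 3 and text[-3:] == REPLACEMENT_UTF8

print(ends_with_replacement("Hello".encode("utf-8")))  # False
print(ends_with_replacement(b"ab" + REPLACEMENT_UTF8))  # True
```

This is only a sketch of the condition; the surrounding control flow (recording -1 and returning without printing) is shown in the diff itself.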

@pavel-esir pavel-esir force-pushed the add_text_streamer_tests branch 3 times, most recently from 220815c to c30f1a8 Compare February 12, 2025 10:45
@github-actions github-actions bot added category: visual language Visual language pipeline category: continuous batching Continuous batching category: whisper Whisper pipeline category: speculative decoding Speculative decoding category: Python API Python API for GenAI category: samples GenAI samples category: GenAI C++ API Changes in GenAI C++ public headers category: prompt lookup labels Feb 12, 2025
@pavel-esir pavel-esir force-pushed the add_text_streamer_tests branch from 1667b9e to 8d9c781 Compare February 12, 2025 15:34
@github-actions github-actions bot removed the category: samples GenAI samples label Feb 12, 2025
@pavel-esir pavel-esir changed the title Fix TextCallbackStreamer add unit-tests Make TextStreamer public & add unit-tests Feb 12, 2025
@pavel-esir pavel-esir requested a review from Wovchena February 12, 2025 15:45
@@ -131,6 +143,11 @@ PYBIND11_MODULE(py_openvino_genai, m) {
.def(py::init<>())
.def("write", &StreamerBase::write, "Write is called every time new token is decoded. Returns a StreamingStatus flag to indicate whether generation should be stopped or cancelled", py::arg("token"))
.def("end", &StreamerBase::end, "End is called at the end of generation. It can be used to flush cache if your own streamer has one");

py::class_<TextStreamer, std::shared_ptr<TextStreamer>>(m, "TextStreamer", text_streamer_docstring)
Contributor commented:

I think we need to highlight that TextStreamerBase is a base class.

I would also create a dedicated py_text_streamer.cpp file for all streamers on the Python side.

# ("black-forest-labs/FLUX.1-dev", dict(subfolder="tokenizer")), # FLUX.1-dev has tokenizer in subfolder
]

# Chekc that fix for CVS-157216 works.
Contributor commented:

Suggested change
# Chekc that fix for CVS-157216 works.
# Check that fix for CVS-157216 works.

callback: User-defined callback function to process the decoded text, callback should return either boolean flag or StreamingStatus.

"""
def __init__(self, tokenizer: ..., callback: typing.Callable[[str], bool | StreamingStatus]) -> None:
Contributor commented:

The type `...` is unexpected.

The Tokenizer class should be defined before TextStreamer in PYBIND11_MODULE(py_openvino_genai, ..)

...
def end(self) -> None:
...
def write(self, arg0: int) -> StreamingStatus:
Contributor commented:

Please use an explicit argument name via py::arg to avoid automatic arg0 names.

__all__ = ['Adapter', 'AdapterConfig', 'AggregationMode', 'AutoencoderKL', 'CLIPTextModel', 'CLIPTextModelWithProjection', 'CacheEvictionConfig', 'ChunkStreamerBase', 'ContinuousBatchingPipeline', 'CppStdGenerator', 'DecodedResults', 'EncodedResults', 'FluxTransformer2DModel', 'GenerationConfig', 'GenerationResult', 'Generator', 'Image2ImagePipeline', 'ImageGenerationConfig', 'InpaintingPipeline', 'LLMPipeline', 'PerfMetrics', 'RawPerfMetrics', 'SD3Transformer2DModel', 'Scheduler', 'SchedulerConfig', 'StopCriteria', 'StreamerBase', 'StreamingStatus', 'T5EncoderModel', 'Text2ImagePipeline', 'TokenizedInputs', 'Tokenizer', 'TorchGenerator', 'UNet2DConditionModel', 'VLMPipeline', 'WhisperGenerationConfig', 'WhisperPerfMetrics', 'WhisperPipeline', 'WhisperRawPerfMetrics', 'draft_model', 'get_version', 'openvino', 'os', 'py_openvino_genai']
__version__: str
__all__ = ['Adapter', 'AdapterConfig', 'AggregationMode', 'AutoencoderKL', 'CLIPTextModel', 'CLIPTextModelWithProjection', 'CacheEvictionConfig', 'ChunkStreamerBase', 'ContinuousBatchingPipeline', 'CppStdGenerator', 'DecodedResults', 'EncodedResults', 'FluxTransformer2DModel', 'GenerationConfig', 'GenerationResult', 'Generator', 'Image2ImagePipeline', 'ImageGenerationConfig', 'InpaintingPipeline', 'LLMPipeline', 'PerfMetrics', 'RawPerfMetrics', 'SD3Transformer2DModel', 'Scheduler', 'SchedulerConfig', 'StopCriteria', 'StreamerBase', 'StreamingStatus', 'T5EncoderModel', 'Text2ImagePipeline', 'TextStreamer', 'TokenizedInputs', 'Tokenizer', 'TorchGenerator', 'UNet2DConditionModel', 'VLMPipeline', 'WhisperGenerationConfig', 'WhisperPerfMetrics', 'WhisperPipeline', 'WhisperRawPerfMetrics', 'draft_model', 'get_version', 'openvino', 'os', 'py_openvino_genai']
__version__: str = '2025.1.0.0-1760-6e22001f986-add_text_streamer_tests'
Contributor commented:

Having a version here is unexpected.

It is explicitly removed to avoid regenerating this file on every commit because of differing hashes.
