Tensorboard Eval Images with TF-Vision #11270

RayanMoarkech · 2024-10-18T05:50:05Z

Prerequisites

Please answer the following questions for yourself before submitting an issue.

I am using the latest TensorFlow Model Garden release and TensorFlow 2.
I am reporting the issue to the correct repository. (Model Garden official or research directory)
I checked to make sure that this issue has not been filed already.

1. The entire URL of the file you are using

https://www.tensorflow.org/tfmodels/vision/object_detection#load_logs_in_tensorboard

2. Describe the bug

I am following this documentation: https://www.tensorflow.org/tfmodels/vision/object_detection#load_logs_in_tensorboard
When I open TensorBoard and select Images, I get "No image data was found."

I also tried setting EXPERIMENT_CONFIG.task.allow_image_summary = True, but that raises an error, even with the dataset and code given in the documentation.

The error:

ValueError: Expected scalar shape, saw shape: (1, 640, 640, 3).

The code:

model, eval_logs = tfm.core.train_lib.run_experiment(
    distribution_strategy=distribution_strategy,
    task=task,
    mode='train_and_eval',
    params=EXPERIMENT_CONFIG,
    model_dir=paths['MODEL_CHECKPOINT_PATH'],
    run_post_eval=True,
)

3. Steps to reproduce

Following this documentation: https://www.tensorflow.org/tfmodels/vision/object_detection#load_logs_in_tensorboard
After training, open TensorBoard and select Images
See "No image data was found."

Now, train again with

EXPERIMENT_CONFIG.task.allow_image_summary = True

and see the error:

ValueError: Expected scalar shape, saw shape: (1, 640, 640, 3).

4. Expected behavior

I would like to see the evaluated images for each epoch saved to TensorBoard.

5. Additional context

Let me know if you need anything extra

6. System information

OS Platform and Distribution (e.g., Linux Ubuntu 16.04): MacOS 15.0
Mobile device name if the issue happens on a mobile device: N/A
TensorFlow installed from (source or binary): source
TensorFlow version (use command below): v2.17.0-rc1-2-gad6d8cc177d 2.17.0
Python version: 3.10
Bazel version (if compiling from source): N/A
GCC/Compiler version (if compiling from source): N/A
CUDA/cuDNN version: N/A
GPU model and memory: N/A -> using CPU


bharatjetti · 2024-11-19T08:43:46Z

Hi @RayanMoarkech
I worked on the problem and reproduced the "No image data was found" issue. I added a piece of code to the existing code:
in the show_batch function, I added these lines and made the necessary changes.

with summary_writer.as_default():
    tf.summary.image(f'Image_with_bboxes_{i+1}', np.expand_dims(image, axis=0), step=train_steps)

With this change, it works. Here is the notebook that I worked on; please check it. Here is the screenshot.

RayanMoarkech · 2024-11-19T17:29:57Z

But this will not create an image log at every summary interval while training the model with:

tfm.core.train_lib.run_experiment

Correct me if I'm wrong, but I was not able to get the training run to produce an image summary at each summary_interval. Does this mean this option is only manual code that I have to run at every train step I want to stop at?

Based on your screenshot, you can see the data is from the "." run.

bharatjetti · 2024-12-10T06:44:47Z

Hi @RayanMoarkech,
It seems there is no issue with model training, and we can observe the results on a few example images. However, to get the entire image summary automatically, could you please raise this in the TensorBoard repo? Please feel free to close this issue.

google-ml-butler · 2024-12-14T19:22:56Z

Are you satisfied with the resolution of your issue?
Yes
No

RayanMoarkech · 2025-01-28T20:03:35Z

I figured out how to log the images to TensorBoard. This does not seem to be supported out of the box between TensorFlow and TensorBoard, so it needs to be implemented manually. Here it is for whoever is searching:

First, enable image summaries in the task config:

EXPERIMENT_CONFIG.task.allow_image_summary = True

Then define a callable Orbit action that takes the images from the eval logs and writes them to TensorBoard. It is important to delete each image from the data afterwards, since TensorFlow does not know how to log it as a summary (this is the broken part that needs to be fixed internally).

import os
from typing import Dict, Union

import numpy as np
import tensorflow as tf

# paths, steps_per_loop, and valid_batch_size are assumed to be defined earlier in the notebook.
img_val_logs_path = f"{paths['MODEL_CHECKPOINT_PATH']}/validation"
os.makedirs(img_val_logs_path, exist_ok=True)

summary_writer = tf.summary.create_file_writer(img_val_logs_path)
Output = float  # Replace with actual type if needed

evaluated_step = steps_per_loop  # Not from 0 since it runs 1 time without validation


def image_eval(data: Dict[str, Union[tf.Tensor, float, np.number, np.ndarray, Output]]) -> None:
    """Write the eval images to TensorBoard and remove them from the eval logs."""
    global evaluated_step
    evaluated_step += steps_per_loop
    with summary_writer.as_default():
        for i in range(valid_batch_size):
            # Remove the image from the logs so the default summary writing does not try to log it
            image = data.pop(f'image/validation_outputs/{i}', None)
            if image is not None:
                tf.summary.image(f'image_{i}', image, step=evaluated_step)

    print(f"Logged images with bounding boxes at step {evaluated_step}")

Lastly, pass the callable to tfm.core.train_lib.run_experiment as an additional argument:

eval_actions=[image_eval]
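
For anyone adapting this, the bookkeeping in the action above (a step counter and removing the image entries from the eval logs) can also be written without a global variable. The following is only a sketch in plain Python, not part of Model Garden or Orbit: make_image_eval_action, write_image, and start_step are hypothetical names, and the key prefix is the one used in the snippet above.

```python
from typing import Any, Callable, MutableMapping

# Key prefix taken from the workaround above; adjust if your eval logs differ.
IMAGE_KEY_PREFIX = "image/validation_outputs/"


def make_image_eval_action(
    write_image: Callable[[str, Any, int], None],
    steps_per_loop: int,
    start_step: int = 0,
) -> Callable[[MutableMapping[str, Any]], None]:
    """Build an eval action that logs image entries and removes them from the logs.

    write_image is a hypothetical callback taking (tag, image, step); it is
    injected so this bookkeeping can be exercised without TensorFlow.
    """
    state = {"step": start_step}

    def action(logs: MutableMapping[str, Any]) -> None:
        # Advance the step before logging, as the original action does.
        state["step"] += steps_per_loop
        # Collect the keys first so the mapping is not mutated while iterating.
        image_keys = [
            key for key in logs
            if isinstance(key, str) and key.startswith(IMAGE_KEY_PREFIX)
        ]
        for key in image_keys:
            image = logs.pop(key)  # remove the entry so only scalars remain
            index = key[len(IMAGE_KEY_PREFIX):]
            write_image(f"image_{index}", image, state["step"])

    return action
```

With TensorFlow, write_image could be a small function that calls tf.summary.image(tag, image, step=step) inside summary_writer.as_default(), and the returned action would be passed as eval_actions=[action]. Matching on the key prefix instead of looping over valid_batch_size means a batch of a different size does not leave image entries behind in the logs.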

@bharatjetti I am not sure if you want to reopen the issue to fix something internally. The issue is that when allow_image_summary is True, the model training does not know how to write the images to the event files: it expects scalar values but receives an image.

RayanMoarkech added the models:official and type:bug labels Oct 18, 2024

laxmareddyp assigned bharatjetti and LakshmiKalaKadali Oct 21, 2024

bharatjetti added the stat:awaiting response label Nov 19, 2024

google-ml-butler bot removed the stat:awaiting response label Nov 19, 2024

bharatjetti added the stat:awaiting response label Dec 10, 2024

LakshmiKalaKadali removed their assignment Dec 10, 2024

RayanMoarkech closed this as completed Dec 14, 2024

RayanMoarkech mentioned this issue Dec 14, 2024: Tensorboard Eval Images with TF-Vision tensorflow/tensorboard#6963

RayanMoarkech reopened this Jan 28, 2025

google-ml-butler bot removed the stat:awaiting response label Jan 28, 2025
