Add YOLOv4 to studio #120

Merged
33 commits merged into staging from object-detection
Jan 11, 2025

@bgoelTT bgoelTT commented Dec 23, 2024

This PR adds frontend and backend support for YOLOv4 object detection to TT-Studio. It fully implements:

  • frontend UI components similar to the AI Playground
    • upload an image from local disk mode
    • use local webcam mode
    • draws boxes to screen
    • displays table of detections at bottom of screen
  • backend image handling
    • image resizing (inference server expects images to be 320x320)
    • sending request to inference server backend
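
As a rough illustration of how the resize step relates to drawing boxes, the sketch below maps a detection from the 320x320 inference input back onto the original image. It assumes the image is stretched (not letterboxed) to 320x320 and that boxes come back as pixel corners `(x1, y1, x2, y2)` in that input space; neither detail is confirmed by this PR. `scale_box` is a hypothetical helper, not a function in TT-Studio.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels

MODEL_INPUT_SIZE = 320  # the inference server expects 320x320 images


def scale_box(box: Box, original_width: int, original_height: int,
              model_size: int = MODEL_INPUT_SIZE) -> Box:
    """Map a box from model-input space back to original-image pixels.

    Assumption: the image was stretched to model_size x model_size
    with no letterboxing, so each axis has its own scale factor.
    """
    sx = original_width / model_size
    sy = original_height / model_size
    x1, y1, x2, y2 = box
    return (x1 * sx, y1 * sy, x2 * sx, y2 * sy)


# A box in the 320x320 input, mapped onto a 640x480 webcam frame.
print(scale_box((80, 80, 240, 240), 640, 480))
```

If the real pipeline letterboxes the image or returns normalized coordinates, the scale factors would need to change accordingly.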

Future improvements

  • move image resizing into backend (the total round-trip latency from the frontend is currently quite large, ~200 ms)
    • this should significantly improve performance, since we would no longer perform 2x the number of JPEG decodes/encodes that we do now
  • write an optimized version of post-processing (this is the true current bottleneck; removing it could see us reaching upwards of 50 FPS)
    • best case, we can display the 30 FPS that was previously demonstrated with the Python-based WebRTC frontend
  • create colour mapping for each class category so that each category has a different box colour (currently all red)
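
One possible shape for the colour mapping is sketched below: derive a hue from a hash of the class name so that each category keeps the same colour across frames and sessions. This is only an illustration of the idea, not the planned implementation; `class_colour` is a hypothetical name, and the class labels used are examples rather than the model's confirmed label set.

```python
import colorsys
import hashlib


def class_colour(class_name: str) -> str:
    """Return a stable '#rrggbb' colour for a class name.

    The hue is taken from a hash of the name, so the same class
    always maps to the same colour.
    """
    digest = hashlib.sha256(class_name.encode("utf-8")).digest()
    hue = digest[0] / 256.0  # first hash byte -> hue in [0, 1)
    # Fixed lightness and full saturation keep boxes vivid on most images.
    r, g, b = colorsys.hls_to_rgb(hue, 0.5, 1.0)
    return "#{:02x}{:02x}{:02x}".format(
        round(r * 255), round(g * 255), round(b * 255)
    )


for name in ("person", "car", "dog"):
    print(name, class_colour(name))
```

With only 256 possible hues, two classes can occasionally land on similar colours; a fixed hand-picked palette indexed by class ID would avoid that at the cost of maintaining the list.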

bgoelTT and others added 30 commits December 9, 2024 14:40
- adds a component to open live webcam
- hit endpoint
- draws bounding boxes WIP: has errors
@bgoelTT bgoelTT added the enhancement New feature or request label Dec 23, 2024
@bgoelTT bgoelTT requested review from milank94 and anirudTT December 23, 2024 16:54
@bgoelTT bgoelTT self-assigned this Dec 23, 2024
@bgoelTT bgoelTT changed the base branch from main to staging December 23, 2024 17:00
@milank94
Contributor

@bgoelTT have any screenshots or a walkthrough of the frontend for the object detection demo?

@bgoelTT
Author

bgoelTT commented Dec 23, 2024

Yes, please see this video https://drive.google.com/file/d/1Jxvbvl79YtoktRYD3GLSyRQC-RiLXpDv/view?usp=share_link

It demonstrates the two modes for the new Object Detection component: the image upload and webcam mode. The video demonstrates the need for the improvements described in this PR's description.

Contributor

@anirudTT anirudTT left a comment

Post testing feedback

  1. Dropdown Menu:
    • Update the yolov4 option in the dropdown to use capital letters (YOLOV4) to maintain consistency with naming conventions.
Screenshot 2025-01-02 at 9 40 35 AM
  2. Tooltip on Models Deployed Page:
    • Modify the tooltip for the "Object Detection" button.
Screenshot 2025-01-02 at 9 38 56 AM
  3. Webcam Cleanup:
    • We would need to implement a mechanism to stop or clean up the webcam when:
      • The user navigates away from the "Start Webcam" page.
      • The user clicks the "Stop" button.
    • Currently, the green light on my MacBook remains on, indicating the webcam is still active even after navigation. This needs to be addressed for better resource management and user privacy.
Screenshot 2025-01-02 at 9 40 03 AM

Contributor

@milank94 milank94 left a comment

Pending @anirudTT changes, looks good.

@anirudTT anirudTT self-requested a review January 3, 2025 21:16
Contributor

@anirudTT anirudTT left a comment

LGTM! Nice work 💯
Tested the following on n150:

  • Model Deployment via TT-Studio
  • upload image and check detection
  • Webcam detection
  • Webcam object is unmounted when stop capture button is clicked.
Screenshot 2025-01-03 at 4 15 37 PM

@bgoelTT bgoelTT merged commit a1076eb into staging Jan 11, 2025
4 checks passed
@bgoelTT bgoelTT deleted the object-detection branch January 11, 2025 00:56
bgoelTT added a commit that referenced this pull request Jan 22, 2025
* Initial commit - add object detection route

* Add package-lock.json

* Add two-column object detection component

* Add new layout and component structure

* Use Aceternity UI file picker

* adds tabs to control menu

* modifies to move webcam to main component

* adds webcam component

* add the react package for webcam util

* add shadcn tabs ui component

* modifies file upload to show last uploaded file + color change + always show icon to upload

* Fix containing element scroll and z-stack

* Add overflow scroll to main component

* Allow images to assume full width of ObjectDetectionComponent

* Add YoloV4 model config to backend API

* Create new object-detection endpoint & expand DeviceConfigurations enum to support WH_ARCH_YAML setting

* Add ModelType enumeration in frontend to facilitate conditional navigation & use modelID in endpoint invocation

* WIP add components to support:
- adds a component to open live webcam
- hit endpoint
- draws bounding boxes WIP: has errors

* draw box on image

* remove

* Optimize real-time object detection to prevent frame backlog

* Ensure webcam stops completely when stop button is clicked +
layout changes

* ts fixes

* Fix aspect ratio of video container to 4:3

* Fix navigation and add <img> to SourcePicker component - TODO - wire up API call and detection handling

* Refactor inference API call and UI

* Fix UI bugs

* Add API authentication to YOLOv4 backend

* Address PR comments

---------

Co-authored-by: Anirudh Ramchandran <[email protected]>