Marker tracking, Status update? #10

Open
AdaRoseCannon opened this issue Mar 30, 2022 · 44 comments

Comments

@AdaRoseCannon
Member

This is frequently requested.

/facetoface

@toji
Member

toji commented Jul 19, 2022

/agenda

@probot-label probot-label bot added the agenda label Jul 19, 2022
@tangobravo

tangobravo commented Jul 27, 2022

We discussed on the call the difference between QR markers and tracked images. I believe @cabanier asked whether QR codes would be sufficient for everyone.

I mentioned that a common use case for Image Tracking in handheld AR is augmenting print designs. Although the design often includes a QR code too, that's usually a small part of the design and not large enough in the camera view to offer good-quality tracking by itself.

Here's a brief example of a very common flow for Image Tracked content in the web - the user moves in close to scan the QR code in their camera app, but the site uses the full image target for the AR experience:

webar-image-tracking.mp4

@hybridherbst

I'd also like to know what the status is. Seems this part of WebXR, while frequently requested, is currently stalled. What's needed to un-stall it? 🙂

I agree with tangobravo on use cases; QR code tracking is useful as well, but for many experiences it's the entry point into the browser, a convenient way to avoid typing a URL.

What may be worth noting is that both SceneViewer and QuickLook contain image tracking functionality with the ability to directly jump from a QR Code (usually scanned through device cameras) into AR through a website.

@AdaRoseCannon
Member Author

This was discussed at both the Face to Face and TPAC. The main blocker is that WebXR implementations are built on top of lower level platforms and don't implement a massive amount of low-level XR capabilities in the browser itself.

These platforms have very different performance capabilities when it comes to marker tracking with almost no overlap in image types that perform well.

E.g. ARCore is very good at tracking detailed full colour images like a painting but doesn't perform well on simple high contrast shapes like a QR code. HoloLens is very good at tracking QR codes but doesn't work well on rich images.

@tangobravo

tangobravo commented Nov 29, 2022

I could be misremembering when we discussed it, but I believe HoloLens explicitly detects and tracks QR codes rather than simply having an "image marker tracking" algorithm that works well with them. The discussion was more about whether, from an API point of view, they could both be considered "markers" and enabled via the same API. My view was that they are distinct algorithms and should be distinct features with their own APIs.

As far as general "Image Tracking" is concerned, most algorithms like the same kinds of things in their target images: high-contrast detail without repetitive texture. Photos and complex graphics layouts (like product packaging) usually work well. That would be the case for both ARKit and ARCore image markers on mobile at least. Right now it might be that Android is the only platform with WebXR implementations with an underlying AR platform that would support these image markers, but I don't think that should be a blocker to moving forwards with the spec.

@tangobravo

I know there was also the thought that if the same markers could work between devices then it would also allow for co-located experiences between Android and HoloLens for example, which is definitely a good idea. Again I don't think it should be a blocker for the spec - HoloLens can implement a QR Marker feature and Android an Image Marker feature for now. Whoever wants that interoperability the most can then implement the one they're missing to get the ability to operate in the same space.

@Maksims

Maksims commented Nov 30, 2022

I believe image tracking and marker tracking are different use cases and user flows.

For image tracking, images are provided in advance, before tracking is activated for an image.

For QR tracking, by contrast, it should be possible to auto-track generic QR codes without needing to provide their images in advance. QR tracking can also provide the decoded information, which is used to differentiate each marker.
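
To make the difference between the two flows concrete, here is a rough TypeScript sketch. All of the type and function names are hypothetical and made up for illustration; none of them come from the explainer or from any shipping API.

```typescript
// Hypothetical shapes, for illustration only; not part of any spec or browser API.

// Image tracking: the page supplies each target image before tracking starts.
interface ImageTarget {
  kind: "image";
  imageUrl: string;      // known to the page in advance
  widthInMeters: number; // physical size of the printed image
}

// QR tracking: nothing is supplied up front; the decoded payload
// arrives with the detection and is what identifies the marker.
interface QrDetection {
  kind: "qr";
  decodedText: string;
}

type Tracked = ImageTarget | QrDetection;

// Derive an identifier for whichever kind of target was tracked.
function targetId(t: Tracked): string {
  switch (t.kind) {
    case "image":
      return `image:${t.imageUrl}`;
    case "qr":
      return `qr:${t.decodedText}`;
  }
}
```

The point of the sketch is only that the identity of an image target is chosen by the page ahead of time, while the identity of a QR marker is read from the marker itself at detection time.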

@hybridherbst

@AdaRoseCannon just to understand the implications, does that mean this is blocked until new devices with hopefully better capabilities come out (and there is some agreement that new devices should be better at marker and/or QR code tracking)? Or is it blocked by not having enough data points to agree on a standard (for potential future devices) because devices do it so differently today?

@AdaRoseCannon
Member Author

I personally don't think it should be blocked at all. In my opinion the devices have distinct enough target audiences that the issue shouldn't be a significant one. I.e. HoloLens is more industry focused and smart phones are more consumer focused.

@tangobravo

I think the naming of the repo is causing some problems here. I propose we rename this repo image-tracking and just focus on that use case here: see #15.

@cabanier
Member

I think marker tracking is more useful and distinct from image tracking.
Maybe we need another repo for image tracking...

@tangobravo

> I think marker tracking is more useful and distinct from image tracking.

Which marker type are you thinking of?

> Maybe we need another repo for image tracking...

The API currently explained in this one is for image tracking.

I'd suggest we move further discussion of this to #15.

@hybridherbst

I think there's a distinction between "marker tracking" (e.g. being able to put a 3D transform on some known target, whether that's a QR code, a known image, or ...) and "data marker detection" (detecting the existence of, and reading the data from, for example, a QR code). Maybe you mean the latter, and not actually the "tracking" case?

@FireDragonGameStudio

Hi, I know I'm pretty late to the discussion here, but are there any updates on marker/image tracking in WebXR? Already played around with it a lot, created various image marker experiments, like indoor navigation (-> https://youtu.be/riiJdNq2LWI?t=1250)

Would love to see any updates here :)

br,
Max

@gabrieljbaker

Just chiming in that at framevr.io we are very interested in seeing the status of this or an update on the current thinking around it, if it's changed at all over the past few months. :)

We'd like to use it (it's supported in babylon.js webxr implementation) but are just a bit nervous about whether it will be moving forward.

@ROBYER1

ROBYER1 commented Jun 21, 2023

Looking to implement this through Needle Tools, however the fact that Android users need to enable this via an experimental flag is a real show-stopper. I have tested out the marker tracking with the #webxr-incubations flag enabled and it works great. What's stopping it from being enabled by default?
https://engine.needle.tools/docs/everywhere-actions.html#image-tracking
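
For anyone who wants to ship something while the flag situation is unresolved, one option is to ask for image tracking as an optional feature and fall back when it isn't granted. A minimal sketch, assuming the `image-tracking` feature descriptor and the `trackedImages` / `widthInMeters` session options as I understand them from this repo's explainer, and assuming the browser exposes `XRSession.enabledFeatures`. The helper names and the local `SessionLike` type are made up for this example.

```typescript
// Minimal local types so the sketch compiles without WebXR typings.
interface TrackedImageInit {
  image: unknown;        // an ImageBitmap in a real page
  widthInMeters: number; // expected physical width of the printed image
}

interface SessionLike {
  enabledFeatures?: readonly string[];
}

// Build the options to pass to navigator.xr.requestSession("immersive-ar", ...).
// Requesting the feature as optional lets the session start even when
// image tracking is unavailable (for example, when the flag is not enabled).
function buildSessionInit(images: TrackedImageInit[]) {
  return {
    optionalFeatures: ["image-tracking"],
    trackedImages: images,
  };
}

// True only if the session reports that image tracking was granted.
function imageTrackingGranted(session: SessionLike): boolean {
  return (session.enabledFeatures ?? []).indexOf("image-tracking") !== -1;
}
```

In a page this would be used roughly as `const session = await navigator.xr.requestSession("immersive-ar", buildSessionInit(images))`, switching to a non-tracked placement flow (hit-test, for instance) when `imageTrackingGranted(session)` is false. It doesn't solve the flag problem, it just keeps the experience usable for regular users.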

@koktavy

koktavy commented Aug 8, 2023

@toji any news about enabling this by default? Curious where the discussion is happening and how best to follow along. Thanks!

@BostonLeeK

@toji any updates or terms of implementation on phone side?

@ryo0ka

ryo0ka commented Jan 15, 2024

Hi @toji,

Excuse me for tagging you, just throwing my ideas here.

8thWall is currently the only viable alternative and it's expensive: they charge close to $3000/mo for a commercial license (for performing computation on their cloud), and there have been a lot of interesting clients/projects that I couldn't work with because they would barely make a profit due to the license fee.

I've tried other alternatives but found them rather underperforming.

For iOS (which doesn't employ WebXR) image tracking "works" because my friend has managed to implement WebXR with ARKit via WebView and AppClip, which essentially takes one extra button click to open, and is a lot less expensive for upkeep. It's truly ironic to me that WebXR works better on iOS Safari than Android Chrome.

All that is because this feature is flagged "experimental" on Android Chrome, though the feature actually works! I'm struggling to understand why it has to be like that, all things considered -- it makes me sad when clients can't pay for 8thWall's license fee.

I hope that this may bring a new perspective to you and make a positive change in your decision making going forward.

Cheers, Ryo

@kfarr

kfarr commented Mar 18, 2024

Hi all, sharing that this is a common request from many diverse personas, in addition to the creative agency and advertising use cases.

Here are user stories requesting this feature that represent 3 different personas:

  • As a user publishing a book I wish to have the ability to use an image from a page in the book as a marker to initialize a geospatial experience on a mobile device using a web-based standard. The image marker feature is required to use the image from the page as a marker to localize the experience in the correct orientation relative to the user agent camera.
  • As a city agency employee sharing information about an upcoming streets improvement project I wish to use an image, which may also be a QR code that doubles as a way to load the app, as a mechanism to allow passers-by to initialize a city-hosted web application using the WebXR standard (no app required) to display one or more alternative designs in the 3D context of the surrounding environment. The image marker feature is required to use the image as a mechanism to localize the scene in the correct orientation relative to the user agent camera.
  • As a safe streets advocate suggesting improvements to a public right-of-way I wish to use a well-known, publicized image in a public location such as a famous piece of art, billboard, logo, store front, building face, street art, or other semi-permanent 2-dimensional visual as an orientation to begin the placement of geolocated traffic control devices such as cones, safe-hit delineator posts, or permanently affixed bollards. The image marker feature is required to use the image as a mechanism to localize the scene in the correct orientation relative to the user agent camera.

The current workaround is to use a software SLAM solution, or to publish a native app to take advantage of ARCore.

I echo @ryo0ka's comments that the feature behind the flag on Android Chromium works as expected to satisfy the above use cases and more, and that the de facto iOS solution of an App Clip polyfill is capable of supporting this standard. So, if I understand correctly, we are waiting for the Android Chrome implementation to be removed from behind the feature flag? How best can we help to advocate for this, @toji? We want to make your job easier by helping to provide any examples, users, or other commercial evidence that will help move this forward.

@Maksims

Maksims commented Mar 19, 2024

Some of the use cases you've provided are already possible using Image Tracking API.

@FireDragonGameStudio

> Some of use cases you've provided are already possible using Image Tracking API.

Maybe I'm not getting something here, but what do you mean with "using Image Tracking API"? Because although the naming doesn't reflect it (yet?), this repo basically is the Image Tracking API afaik.

@tangobravo

> Maybe I'm not getting something here, but what do you mean with "using Image Tracking API"? Because although the naming doesn't reflect it (yet?), this repo basically is the Image Tracking API afaik.

Yes. I think the point being made was the implementation here is already useful for various things, but being behind a flag means it's not something that can be used in practice for projects that are aimed at regular users who are never going to mess with experimental flags.

It feels a bit odd that Google have gone to the effort of doing the implementation but are not advocating for it to be moved beyond experimental.

@FireDragonGameStudio

That makes sense, thx for the clarification @tangobravo

@AdaRoseCannon
Member Author

The issue with moving image tracking forward is that the image tracking capabilities of the various devices are so divergent that even if they shared the same API they couldn't be used to track the same images. There isn't really a subset of markers that works well everywhere.

@tangobravo

> The issue with moving image tracking forward is that the image tracking capabilities of the various devices is so divergent that even if they shared the same API they couldn't be used to track the same images. There isn't really a subset of markers that works well everywhere.

You stated earlier in this issue that you didn't consider that to be a blocker.

Image tracking isn't commonly implemented on headsets but is practically universal in mobile AR SDKs. "WebAR" SDKs built on getUserMedia (such as those from Zappar and 8th Wall) support this, as do the increasing number of "Instant App / App Clip" solutions such as that used by Adobe Aero, which are backed by ARCore and ARKit in the end.

The simple solution here is to just rename this repo to image-tracking (as I proposed in #15), and release it without requiring the experimental flag.

What's clear is that the interests of the implementors in the working group are still very much geared towards headsets. None of my attempts from the sidelines over the last couple of years to advocate for changes to make WebXR more useful in real-world mobile AR scenarios have proved effective. I listed all the issues as I see them in immersive-web/webxr-ar-module#77 and proposed some solutions.

Obviously it needs an implementor in the end to care sufficiently about these issues to drive these changes, and Google as the sole implementor of handheld WebXR don't seem interested in driving the agenda.

@benferns

benferns commented Apr 9, 2024

I build Variant Launch (a WebXR iOS shim that supports basic WebXR and a limited subset of features - mainly basic items like hit-test) and marker tracking is definitely the most commonly requested feature, despite the limited feature support.

I'd estimate about 1/3 of people I get signing up are extremely interested in it based on the feedback channels I have. I assume this represents a reasonable cross-section of non-headset WebXR devs and real-world use-cases, so the desire is definitely there.

Currently the only alternatives for web-based AR experiences with markers are very expensive (some would argue Launch is too!) or have limited accuracy, but markers have become a common element of many XR projects.

I am of course ignoring the giant apple-shaped elephant in the room, but that's (mostly) out of our hands!

@michaeltheory

Just adding another voice to the crowd asking for this feature to be un-flagged.

While access to the raw camera data theoretically allows one to process the frame and calculate an image marker, that requires loading large libraries like tensorflowjs and opencvjs and then doing complex calculations that will consume frame time.

I'm surprised that image tracking is such a left-behind feature when it works just fine behind a flag and is a critical component of many marketing-campaign-based AR experiences. WebXR is so much better than the getUserMedia SLAM pay-to-play services, but this is a critically missing feature.

Please enable this soon.

@hybridherbst

> I am of course ignoring the giant apple-shaped elephant in the room

Maybe another reason for Google to enable this: the apple-shaped elephant has supported excellent image tracking within QuickLook/USDZ for multiple years now, leading to the awkward situation that image tracking without installing an app "just works" on iOS, but requires a flag to be enabled in Chrome WebXR.

@Maksims

Maksims commented May 27, 2024

> > I am of course ignoring the giant apple-shaped elephant in the room
>
> Maybe another reason for Google to enable this: the apple-shaped elephant supports excellent image tracking within QuickLook/USDZ for multiple years now, leading to the awkward situation that image tracking without installing an app "just works" on iOS, but requires a flag to be enabled in Chrome WebXR.

WebXR does not work on iOS. Native viewers are a different story.

@hybridherbst

hybridherbst commented May 27, 2024

In Needle, we generate interactive scenes for QuickLook on-the-fly in the browser and those support image tracking on iOS.
From a user's perspective, this means "image tracking in AR works without an app on iOS but does not work without an app on Android", even when the underlying technologies are vastly different. Sorry if this was misleading :)

@CoderSilas

+1 for an update, it's been 2+ years. Would be great to have this.

@somepablo

I understand that image-tracking is a highly requested feature. However, I also understand the working group's decision not to push this feature forward due to several reasons:

  • Limited Applicability: At present, image-tracking is primarily useful for mobile AR. It is not supported by OpenXR or the internal SDKs of well-known headsets.
  • Inconsistent Implementation: Even within mobile AR, there are significant differences in how image-tracking is implemented between ARCore and ARKit, making it challenging to provide a consistent experience.

At Onirix (a WebAR platform), we considered integrating with this API for Android/Chrome, but ultimately decided to develop a custom native image-tracking solution that works consistently across both Android and iOS. About three years after its launch, we can affirm our customers have been very satisfied with this approach.

Given the current landscape, I don't foresee this feature being implemented in the short to mid-term. If you need a solution right now, I recommend using one of the WebAR platforms that support image-tracking for the best results. Alternatively, you can explore some of the free and open-source options available, though they may come with compromises in performance and quality.

Hope this helps anyone interested in this feature.

@PetterGs

I understand that Niantic and Zappar have a vested interest in keeping WebXR gimped. How many of their "representatives" are in the OpenXR workgroup?

@tangobravo

tangobravo commented Aug 26, 2024

> I understand that Niantic and Zappar have a vested interest in keeping WebXR gimped. How many of their "representatives" are in the OpenXR workgroup?

Zappar doesn't have any members in the Working Group. AFAIK I am the only person from Zappar to have joined the Community Group (which is freely open for anyone to join). [edit: I was talking here about the Immersive Web Working Group and Community Group who are the people responsible for the WebXR specification. OpenXR is a distinct group working on a shared native C API for headsets, we have no involvement with that at all.]

In my view I've been the most active advocate for improvements in WebXR for mobile use cases for the last few years, laying out the practical issues and suggesting some potential solutions - see immersive-web/webxr-ar-module#77 for example. Also as you'll see from this issue I'm strongly supportive of moving this proposal forward. Zappar's Mattercraft tool also supports building WebXR experiences for both headset and handheld, alongside support for our own in-browser computer vision libraries.

@cabanier
Member

> I understand that Niantic and Zappar have a vested interest in keeping WebXR gimped. How many of their "representatives" are in the OpenXR workgroup?

There is support for marker tracking in OpenXR through Magic Leap's extension.
The group wants to have support for this; it's just that our current devices don't have support for marker tracking.

@hybridherbst

I think there seems to still be confusion as to what this repository and spec proposal is about (see also #15). What most (all?) users in the thread above are asking about is image-tracking – as supported on Android WebXR behind a flag for many years now. There is no expectation that e.g. Quest would have to gain image-tracking capabilities just because Android has image-tracking capabilities.

@cabanier
Member

> I think there seems to still be confusion as to what this repository and spec proposal is about (see also #15). What most (all?) users in the thread above are asking about is image-tracking – as supported on Android WebXR behind a flag for many years now. There is no expectation that e.g. Quest would have to gain image-tracking capabilities just because Android has image-tracking capabilities.

What I meant to say is that it's not Niantic or Zappar that are holding back progress.

@PetterGs

If you say so. I just wish people would stop using the almighty flag as an excuse. If it is behind a flag then it might as well not exist as far as website content is concerned. Nobody is using browsers for running content on their own device with whatever flags they may or may not have.

Content is geared towards the general public. They aren't gonna mess around with flags before visiting your website.

@AdaRoseCannon
Member Author

> I understand that Niantic and Zappar have a vested interest in keeping WebXR gimped. How many of their "representatives" are in the OpenXR workgroup?

Chair comment: @PetterGs You can't be using this kind of ableist language here, please check your tone and review the W3C's https://www.w3.org/policies/code-of-conduct/#unacceptablebehavior

Also, you should not be making wild accusations about other members of the community. Niantic and Zappar have been excellent participants in the community, and the reasons this API is stalled are purely technical. This feature has been hotly requested by developers and within the group for a long time, but it has not been possible to make an API which will work reliably cross platform, which is an essential feature of a Web API.

@hybridherbst

> it has not been possible to make an API which will work reliably cross platform

Out of curiosity, has there been consensus whether the hereby proposed API is about image tracking (e.g. getting the pose of a known image like a poster) or about marker tracking (e.g. reading the data of markers like QR codes)?

@PetterGs

> You can't be using this kind of ableist language here, please check your tone and review the W3C's https://www.w3.org/policies/code-of-conduct/#unacceptablebehavior

I sincerely apologize for my word choice. English isn't my first language, which makes it a challenge to place common vernacular on the correct spot on the euphemism treadmill.

I am very thankful for the chair's comment, since this elucidates the role of both Niantic and Zappar. It's all crystal clear now.

Satobtage

@cperey

cperey commented Aug 28, 2024

I haven't read every message in this thread, so I'm sorry if this has previously been mentioned, but people might not be aware that the Khronos OpenXR WG Spatial Entities subgroup has been up and running since late 2023. AR features, including marker (and image) tracking, are receiving their attention (currently the focus).

The group expects to be able to release a draft spec by the end of 2024.

@tangobravo

> > it has not been possible to make an API which will work reliably cross platform
>
> Out of curiosity, has there been consensus whether the hereby proposed API is about image tracking (e.g. getting the pose of a known image like a poster) or about marker tracking (e.g. reading the data of markers like QR codes)?

This is still unclear to me too. @cabanier suggested he'd prefer image-tracking move to a new repo. Whether we rename this one or start a new one, the practical blocker for progress is that Google, the sole implementor of this current proposal, appear no longer interested in progressing it - @toji it would be really useful to have some input on that.

There is also precedent for WebXR features that are more appropriate on mobile than headsets, i.e. dom-overlay, which doesn't have any headset implementations AFAIK. @AdaRoseCannon, you did also previously mention your personal view was that it wasn't a blocker if there were no headset implementations that supported this.
