Generated sitemap.xml weirdness #2777

Open
MJSPollard opened this issue Feb 25, 2025 · 2 comments

@MJSPollard

What is the location of your example repository?

hydrogen.shop

Which package or tool is having this issue?

Hydrogen

What version of that package or tool are you using?

"@shopify/hydrogen": "2025.1.1"

What version of Remix are you using?

"@remix-run/react": "^2.15.3"

Steps to Reproduce

First Issue:
Go to https://hydrogen.shop/sitemap.xml and click one of the links, such as https://hydrogen.shop/sitemap/products/1.xml. This is not valid XML. Unless it's somehow being interpolated back into the original sitemap, I have no idea how this is working. There are no docs on this, and I've never seen a sitemap that looks like this. It happens in both the current Hydrogen example and our production app, which is running on the latest version.

Second Issue:
The sitemap generator emits /articles/article-handle URLs. To my knowledge, there is no way to query an article by handle alone, and there is no template for the articles routes. It seems incorrect for these URLs to be listed in the sitemap when there is no template, only for them to 404 by default. All the examples on hydrogen.shop 404, such as https://hydrogen.shop/articles/10-tips-for-better-snowboarding

Expected Behavior

I would expect a normal, valid XML file at a sitemap XML URL, or at least some explanation in the docs of this esoteric technique.

I would not expect the sitemap to include articles/article-handle URLs when there is no Hydrogen template for articles outside of blogs and no possible way to query an article by handle without knowing the blog handle or the article's GID.

Actual Behavior

See Above.

@wizardlyhel
Contributor

Thanks for flagging this.

For the article link https://hydrogen.shop/articles/10-tips-for-better-snowboarding, this is a mistake on our end: we forgot that we changed the URL of our blog links 🤦. The actual URL should have been https://hydrogen.shop/journal/10-tips-for-better-snowboarding. It can be easily fixed in routes/sitemap/$type.$page[.xml].tsx: if we detect type === 'article', we should change the path segment to journal.
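
A minimal sketch of that change, written as a standalone function rather than against the real route file. The name `buildSitemapLink`, its argument shape, and the override map are assumptions made for illustration, not Hydrogen's actual API; the type strings it matches (`article` and `articles`) are also a guess and should be checked against what the generator really passes.

```typescript
// Hypothetical argument shape for a sitemap link builder (an assumption).
type LinkArgs = {
  type: string; // resource type, e.g. "products" or "articles"
  baseUrl: string; // e.g. "https://hydrogen.shop"
  handle: string; // resource handle
  locale?: string; // optional locale path prefix
};

// Resource types whose public path segment differs from the type name.
// Both spellings are listed because the exact value is an assumption.
const PATH_SEGMENT_OVERRIDES = new Map<string, string>([
  ['article', 'journal'],
  ['articles', 'journal'],
]);

// Builds the URL written into the sitemap for one resource.
function buildSitemapLink({type, baseUrl, handle, locale}: LinkArgs): string {
  const segment = PATH_SEGMENT_OVERRIDES.get(type) ?? type;
  const prefix = locale ? `${baseUrl}/${locale}` : baseUrl;
  return `${prefix}/${segment}/${handle}`;
}
```

With this mapping, an article handle resolves under `/journal/` while every other type keeps its own path segment.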

As for the sitemap index file, it follows the sitemap spec as described by the Sitemap Protocol: https://developers.google.com/search/docs/crawling-indexing/sitemaps/large-sitemaps

Since we have no way of telling how many sitemap URLs will be generated per shop, it's safer to assume the largest sitemap format that is allowed.
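
For reference, the two-level layout that the protocol describes can be sketched as below: an index document listing child sitemaps, and child documents listing page URLs. This is a simplified illustration of the format only, not the code Hydrogen uses; the function names are hypothetical.

```typescript
// XML namespace defined by the sitemaps.org protocol.
const SITEMAP_NS = 'http://www.sitemaps.org/schemas/sitemap/0.9';

// Escapes the characters that are not allowed raw inside XML text.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Index document: each <sitemap> entry points at a child sitemap file.
function buildSitemapIndex(childSitemapUrls: string[]): string {
  const entries = childSitemapUrls
    .map((url) => `  <sitemap><loc>${escapeXml(url)}</loc></sitemap>`)
    .join('\n');
  return (
    `<?xml version="1.0" encoding="UTF-8"?>\n` +
    `<sitemapindex xmlns="${SITEMAP_NS}">\n${entries}\n</sitemapindex>`
  );
}

// Child document: each <url> entry points at an actual page.
function buildUrlSet(pageUrls: string[]): string {
  const entries = pageUrls
    .map((url) => `  <url><loc>${escapeXml(url)}</loc></url>`)
    .join('\n');
  return (
    `<?xml version="1.0" encoding="UTF-8"?>\n` +
    `<urlset xmlns="${SITEMAP_NS}">\n${entries}\n</urlset>`
  );
}
```

A child sitemap such as `/sitemap/products/1.xml` would be expected to have the `urlset` shape, which is easiest to confirm by viewing the raw response source rather than the browser's rendered view.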

@MJSPollard
Author

Thanks for the reply!

I understand the sitemap index having 'child' sitemaps, but https://hydrogen.shop/sitemap/products/1.xml does not appear to be a valid sitemap XML file. There are no sitemap tags, just a wall of text. Maybe I've just never seen this before, but it's very confusing. Earlier projects I've worked on with Hydrogen had valid sitemap XML in the child sitemaps.

For the other issue: is the recommended approach to detect 'article' and return something else for now? Don't articles need to be in the context of a blog in order to be fetched? articleByHandle is not a valid top-level Storefront GraphQL query; you have to use the blog handle or the GID of the article. Will there soon be a journal page template, and is there a shift away from blogs?
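
To illustrate the constraint, here is a rough sketch of a blog-scoped lookup, where the article is resolved through its parent blog. The selected fields, the `/blogs/<blogHandle>/<articleHandle>` URL shape, and the helper name `articleVariablesFromPath` are assumptions for illustration; the point is only that both handles are needed.

```typescript
// Storefront API query: the article is looked up inside its parent blog,
// so the blog handle must be known in addition to the article handle.
const ARTICLE_BY_HANDLE_QUERY = `#graphql
  query Article($blogHandle: String!, $articleHandle: String!) {
    blog(handle: $blogHandle) {
      articleByHandle(handle: $articleHandle) {
        id
        title
        handle
      }
    }
  }
`;

type ArticleVariables = {blogHandle: string; articleHandle: string};

// Hypothetical helper: extracts both handles from a path shaped like
// "/blogs/<blogHandle>/<articleHandle>" (the URL shape is an assumption).
// Returns null when the path does not carry a blog handle.
function articleVariablesFromPath(pathname: string): ArticleVariables | null {
  const match = /^\/blogs\/([^/]+)\/([^/]+)\/?$/.exec(pathname);
  if (!match) return null;
  const [, blogHandle, articleHandle] = match;
  return {blogHandle, articleHandle};
}
```

A path like `/articles/<articleHandle>` yields `null` here, since it carries no blog handle to pass to the query.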
