ElementHandle fails to find node by id #75
Comments
Okay, I sat down and investigated, and I think this is a logic issue with this code. One unfortunate thing about how element handles work right now is that if the page transforms, handles obtained earlier likely no longer refer to the same nodes. Essentially, here when you call ...
The correct way of implementing this would be to request the nth child of some parent div in a while loop. Something like this should work just fine:
import { ElementHandle, launch } from "jsr:@astral/astral";
const browser = await launch({ headless: false });
const page = await browser.newPage();
await page.goto("https://www.google.com/search?true&source=lnms&tbm=isch&sa=X&tbs=isz:l&hl=en&q=Barceloneta%2C+Beach");
await page.waitForSelector('div[role="dialog"]', { timeout: 30_000 });

// Pick the consent-dialog button whose markup mentions "accept".
const [accept_cookies] = await page.$$("div[role='dialog'] button")
  .then((buttons) =>
    Promise.all(
      buttons.map((b, i) => b.innerHTML().then((it): [ElementHandle, string] => [buttons[i], it])),
    )
  )
  .then((it) => it.filter(([_, html]) => html.toLowerCase().includes("accept")))
  .then((it) => it.map(([el, _]) => el));
if (accept_cookies) {
  await accept_cookies.click();
  await page.waitForNavigation({ waitUntil: "none" });
}

const images = [];
let i = 0;
while (true) {
  // Re-query on every iteration: handles from before a page change may be stale.
  const pageImages = await page.$$("g-img");
  const cur = pageImages[i++];
  if (!cur) break; // no thumbnails left before reaching 20 images
  // Skip thumbnails that have no class name.
  const hasClassName = await cur.evaluate((it: HTMLElement) => !!it.className);
  if (!hasClassName) continue;
  await cur.click();
  await page.waitForNavigation({ waitUntil: "none" });
  await page.waitForSelector('img[aria-hidden="false"]', { timeout: 30_000 });
  // Read the details of the currently opened image from the preview pane.
  const image = await page.evaluate(() => {
    const img = document.querySelector<HTMLImageElement>('img[aria-hidden="false"]');
    if (img) {
      return {
        title: img.closest("c-wiz")?.querySelector("h1")?.innerText!,
        url: img.src,
        source: (img.parentElement as HTMLAnchorElement).href,
      };
    }
    throw new Error("Image not found");
  });
  images.push(image);
  if (images.length === 20) {
    break;
  }
}
console.log(images);
Going to give it a try and let you know if this approach works out. Thank you so much for your response.
This code was essentially moved from Puppeteer to Astral, so while this works for this specific script, I bet we're breaking a lot of unwritten rules and expectations when element handles are lost in page transitions (if I understand correctly).
We're trying to move everything to locators (#2) ASAP, so hopefully this will not be a problem soon. Puppeteer has these unwritten rules, but they do a lot of work to keep them from showing up too often. Unfortunately, when they do show up, it's basically impossible to debug because of the weird hacks they use. TL;DR: locators fix this elegantly, they're just not quite ready yet.
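To make the locator idea a bit more concrete, here is a rough sketch of the general pattern, not Astral's actual locator API (which the comment above says is not ready yet). Everything in it is hypothetical: `SketchLocator`, `PageLike`, and `HandleLike` are made-up names, and the only assumption about the page is a `$$` method shaped like the one used in the script earlier in this thread. The point is that a locator stores how to find an element and re-queries on every action, so it never holds on to a stale handle.

```typescript
// Hypothetical minimal shapes for illustration; these are NOT Astral's real types.
interface HandleLike {
  click(): Promise<void>;
}

interface PageLike {
  // Same shape as the page.$$ call used earlier in the thread:
  // resolves to the handles that match the selector right now.
  $$(selector: string): Promise<HandleLike[]>;
}

// Stores a selector and an index instead of an element handle.
class SketchLocator {
  private readonly page: PageLike;
  private readonly selector: string;
  private readonly index: number;

  constructor(page: PageLike, selector: string, index = 0) {
    this.page = page;
    this.selector = selector;
    this.index = index;
  }

  // Returns a new locator pointing at the n-th match of the same selector.
  nth(n: number): SketchLocator {
    return new SketchLocator(this.page, this.selector, n);
  }

  // Re-queries the page, polling until a match exists or the timeout elapses.
  async resolve(timeoutMs = 1_000, pollMs = 50): Promise<HandleLike> {
    const deadline = Date.now() + timeoutMs;
    while (true) {
      const matches = await this.page.$$(this.selector);
      const handle = matches[this.index];
      if (handle) return handle;
      if (Date.now() >= deadline) {
        throw new Error(`No match for "${this.selector}" at index ${this.index}`);
      }
      await new Promise((r) => setTimeout(r, pollMs));
    }
  }

  // Every action resolves a fresh handle first, then acts on it.
  async click(): Promise<void> {
    const handle = await this.resolve();
    await handle.click();
  }
}
```

With something like this, the loop body in the script above could call `new SketchLocator(page, "g-img").nth(i).click()` instead of indexing into a freshly queried array by hand, assuming the real page object is structurally compatible with `PageLike`.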
Deno v.1.44.1
Astral 0.4.2
reproduction:
Happy to open a PR given some pointers on what to fix.