Cannot get page content using puppeteer ? #365

Open
geminigeek opened this issue Jan 18, 2025 · 5 comments
Assignees
Labels
CDP Chrome Debug Protocol DOM API puppeteer

Comments

@geminigeek

geminigeek commented Jan 18, 2025

Hi,

I am trying to get the page content (HTML), but I get the error Error [ReferenceError]: XMLSerializer is not defined

I run the script in Docker. From inside the container, I can dump the content of the URL using the command line.

EDIT: I tried the same code against "cloud.lightpanda.io" with browser=lightpanda, and it does not work there either.

my code

import puppeteer from "puppeteer-core"
let url = "https://www.wikipedia.org/"

// use browserWSEndpoint to pass the Lightpanda's CDP server address.
const browser = await puppeteer.connect({
  browserWSEndpoint: "ws://127.0.0.1:9222",
})

// The rest of your script remains the same.
const context = await browser.createBrowserContext()
const page = await context.newPage()

await page.goto(url)

const html = await page.content()
console.log("html :>> ", html)

await page.close()
await context.close()
await browser?.disconnect()

error

node:internal/modules/run_main:122
    triggerUncaughtException(
    ^

Error [ReferenceError]: XMLSerializer is not defined
    at  ( at CdpFrame.<anonymous> (file:///root/lightpanda-docker/using-light-panda/node_modules/.pnpm/[email protected]/node_modules/puppeteer-core/lib/esm/puppeteer/util/decorators.js:101:27), <anonymous>:8:43)
    at #evaluate (file:///root/lightpanda-docker/using-light-panda/node_modules/.pnpm/[email protected]/node_modules/puppeteer-core/lib/esm/puppeteer/cdp/ExecutionContext.js:387:19)
    at async ExecutionContext.evaluate (file:///root/lightpanda-docker/using-light-panda/node_modules/.pnpm/[email protected]/node_modules/puppeteer-core/lib/esm/puppeteer/cdp/ExecutionContext.js:274:16)
    at async IsolatedWorld.evaluate (file:///root/lightpanda-docker/using-light-panda/node_modules/.pnpm/[email protected]/node_modules/puppeteer-core/lib/esm/puppeteer/cdp/IsolatedWorld.js:97:16)
    at async CdpFrame.evaluate (file:///root/lightpanda-docker/using-light-panda/node_modules/.pnpm/[email protected]/node_modules/puppeteer-core/lib/esm/puppeteer/api/Frame.js:345:20)
    at async CdpFrame.content (file:///root/lightpanda-docker/using-light-panda/node_modules/.pnpm/[email protected]/node_modules/puppeteer-core/lib/esm/puppeteer/api/Frame.js:574:20)
    at async CdpPage.content (file:///root/lightpanda-docker/using-light-panda/node_modules/.pnpm/[email protected]/node_modules/puppeteer-core/lib/esm/puppeteer/api/Page.js:555:20)
    at async file:///root/lightpanda-docker/using-light-panda/error.mjs:15:14

Node.js v22.11.0
@krichprollsch
Member

Hello @geminigeek
Thanks for the report. I think it's caused by a missing Web API implementation: XMLSerializer.
https://developer.mozilla.org/en-US/docs/Web/API/XMLSerializer
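
For context, the stack trace shows the failure inside the function that page.content() evaluates in the page. The sketch below is an approximation of that kind of serialization, not Puppeteer's actual source; the name serializeDocument and its two parameters are hypothetical. It illustrates why a browser would need both XMLSerializer (for nodes outside the root element, such as the doctype) and Element.outerHTML (for the root element) for the call to succeed.

```javascript
// Hypothetical approximation of an in-page "serialize the whole document" step.
// `doc` and `Serializer` default to the browser globals; they are parameters
// only so the sketch can be exercised outside a browser.
function serializeDocument(doc = document, Serializer = XMLSerializer) {
  let html = ""
  for (const node of doc.childNodes) {
    if (node === doc.documentElement) {
      // Root element: relies on Element.outerHTML being implemented.
      html += node.outerHTML
    } else {
      // Anything else (e.g. the doctype): relies on XMLSerializer.
      html += new Serializer().serializeToString(node)
    }
  }
  return html
}
```

If the XMLSerializer global does not exist, resolving the default for Serializer raises a ReferenceError of the same kind as the one reported in this issue.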

@krichprollsch krichprollsch added DOM API CDP Chrome Debug Protocol labels Jan 20, 2025
@krichprollsch krichprollsch self-assigned this Jan 20, 2025
@Redskull-127

Hey, any update on this?

@frankgreco

I'm hitting this with just about any website. Is there a workaround?

@krichprollsch
Member

Hello, no progress for now. I hope to be able to take a look next week.

@krichprollsch
Member

Hello here 👋 I merged the XMLSerializer and outerHTML implementations.
These changes should unblock puppeteer's page.content() call.

I forced a rebuild of the nightly releases.
Could you give it a try? 🙏
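
To check whether a given build exposes the two APIs mentioned above before calling page.content(), a small probe along these lines might help. This is only a sketch: probeContentApis is a hypothetical helper name, and it assumes a connected puppeteer page like the one in the original script.

```javascript
// Hypothetical helper: asks the connected browser whether XMLSerializer and
// Element.outerHTML are available in the page's JavaScript context.
async function probeContentApis(page) {
  return page.evaluate(() => ({
    // False when the XMLSerializer global is missing.
    xmlSerializer: typeof XMLSerializer !== "undefined",
    // False when there is no document or outerHTML is not a string.
    outerHTML:
      typeof document !== "undefined" &&
      typeof document.documentElement?.outerHTML === "string",
  }))
}
```

In the script from the first comment, this could be called right after page.goto(url), for example console.log(await probeContentApis(page)). If both fields are true, page.content() should no longer fail with this ReferenceError.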
