-
You can inspect the redirect chain with Playwright's request.redirectedFrom() and request.redirectedTo(). Example:
// https://crawlee.dev/api/playwright-crawler/class/PlaywrightCrawler
import { PlaywrightCrawler, log } from 'crawlee';

(async () => {
    const crawler = new PlaywrightCrawler({
        // https://crawlee.dev/api/playwright-crawler/interface/PlaywrightRequestHandler
        async requestHandler({ response }) {
            // https://crawlee.dev/api/playwright-crawler/interface/PlaywrightCrawlingContext#response
            // redirectedFrom() returns null when the request was not the result of a redirect.
            const redirectedFrom = response?.request().redirectedFrom();
            if (redirectedFrom) {
                log.info(`Url: ${redirectedFrom.url()} Redirected to ${redirectedFrom.redirectedTo().url()}`);
            }
        },
    });
    await crawler.run([
        'https://httpbin.org/relative-redirect/1',
    ]);
})();
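Building on redirectedFrom(), here is a minimal sketch of how the number of redirects could be counted and checked against a limit. The names countRedirects and assertRedirectLimit are hypothetical helpers, not part of Crawlee or Playwright. The only API they assume is Playwright's Request.redirectedFrom(), which returns null for the first request in a chain.

```javascript
// Count how many redirects led to `request` by walking redirectedFrom() links.
// Works with any object that exposes Playwright's Request.redirectedFrom().
function countRedirects(request) {
    let hops = 0;
    // redirectedFrom() returns null for the first request in the chain.
    for (let prev = request.redirectedFrom(); prev; prev = prev.redirectedFrom()) {
        hops += 1;
    }
    return hops;
}

// Hypothetical helper: throws when the chain is longer than `maxRedirects`,
// otherwise returns the number of redirects that were followed.
function assertRedirectLimit(request, maxRedirects) {
    const hops = countRedirects(request);
    if (hops > maxRedirects) {
        throw new Error(`Too many redirects: ${hops} > ${maxRedirects}`);
    }
    return hops;
}
```

Inside requestHandler this could be called as assertRedirectLimit(response.request(), 5). Throwing from the handler makes Crawlee treat the request as failed. Note that this only checks the chain after navigation has finished. It does not stop the browser from following the redirects in the first place.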
-
You might be able to use Puppeteer to check the length of a request's redirect chain inside a request interception handler, and then continue or abort the request depending on your limit. https://pptr.dev/guides/request-interception/
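A rough sketch of that idea follows. It assumes Puppeteer's HTTPRequest methods redirectChain(), isInterceptResolutionHandled(), abort() and continue(). The names decideRedirectAction and makeRedirectLimiter are hypothetical, and whether the chain is fully populated at interception time should be verified against your Puppeteer version.

```javascript
// Decide what to do with an intercepted request, without side effects.
// Returns 'skip', 'abort' or 'continue'.
function decideRedirectAction(request, maxRedirects) {
    // Another handler may already have resolved this request.
    if (request.isInterceptResolutionHandled()) return 'skip';
    // redirectChain() lists the earlier requests that were redirected to this one.
    return request.redirectChain().length > maxRedirects ? 'abort' : 'continue';
}

// Build a 'request' event handler that enforces the limit.
function makeRedirectLimiter(maxRedirects) {
    return async (request) => {
        const action = decideRedirectAction(request, maxRedirects);
        if (action === 'abort') await request.abort();
        else if (action === 'continue') await request.continue();
    };
}

// Usage sketch (needs Puppeteer and a live page, so it is left commented out):
// await page.setRequestInterception(true);
// page.on('request', makeRedirectLimiter(5));
```

Keeping the decision in a separate pure function makes the limit easy to test with stub request objects, without launching a browser.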
-
Thank you very much for your answers. For example:
const crawler = new PlaywrightCrawler({
    launchContext: {
        // Here you can set options that are passed to the playwright .launch() function.
        launchOptions: {
            // This is an option to the Playwright engine.
            maxRedirectToFollow: 5,
        },
    },
    // This is an option to the crawler.
    maxRedirectToFollow: 5,
});
-
Is there a way to set a maximum number of redirects to follow when starting PlaywrightCrawler?
There is a maxRedirects option for APIRequestContext, but I guess it will not work here,
since I assume PlaywrightCrawler uses a Browser.