Replies: 1 comment
On my Windows system, Crawlee v3.1.4, Node.js v16.17.1:

    import { CheerioCrawler } from 'crawlee';

    let urls = [
        ['http://example.com/1.html'],
        ['http://example.com/2.html'],
        ['http://example.com/3.html'],
        ['http://example.com/4.html'],
    ]; // array of urls

    for (const element of urls) { // limit to only 10 requests in the queue
        const crawler = new CheerioCrawler({
            async requestHandler({ $, request }) {
                const title = $('title').text();
                console.log(`The title of "${request.url}" is: ${title}.`);
            },
        });
        await crawler.run(element);
    }

Why do you loop on the urls?

    import { CheerioCrawler } from 'crawlee';

    let urls = [
        'http://example.com/1.html',
        'http://example.com/2.html',
        'http://example.com/3.html',
        'http://example.com/4.html',
    ]; // array of urls

    const crawler = new CheerioCrawler({
        async requestHandler({ $, request }) {
            const title = $('title').text();
            console.log(`The title of "${request.url}" is: ${title}.`);
        },
    });
    await crawler.run(urls);

To limit a crawl to 10 requests, use the maxRequestsPerCrawl option.
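
The advice above (one crawler over a flat list of URLs, capped with maxRequestsPerCrawl) could look roughly like the sketch below. The names `nestedUrls`, `crawlerOptions` and `main` are made up for illustration, and loading crawlee through a dynamic import inside `main` is just one way to arrange it, not something the thread prescribes.

```javascript
// Nested list as in the question: each URL is wrapped in its own array.
const nestedUrls = [
    ['http://example.com/1.html'],
    ['http://example.com/2.html'],
    ['http://example.com/3.html'],
    ['http://example.com/4.html'],
];

// Flatten one level so a single crawler can take the whole list.
const urls = nestedUrls.flat();

// Options for a single CheerioCrawler instance (illustrative name).
const crawlerOptions = {
    // Stop after at most 10 requests in this crawl.
    maxRequestsPerCrawl: 10,
    async requestHandler({ $, request }) {
        const title = $('title').text();
        console.log(`The title of "${request.url}" is: ${title}.`);
    },
};

// Hypothetical entry point: builds one crawler and runs it over all URLs.
async function main() {
    const { CheerioCrawler } = await import('crawlee');
    const crawler = new CheerioCrawler(crawlerOptions);
    await crawler.run(urls);
}
```

Calling `main().catch(console.error)` would start the crawl. Creating the crawler once, instead of once per URL inside a loop, keeps all requests in a single run, so the maxRequestsPerCrawl limit applies to the crawl as a whole.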
Which package is this bug report for? If unsure which one to select, leave blank
@crawlee/cheerio (CheerioCrawler)
Issue description
The issue is not that easy to reproduce. I am getting the error on version 3; on version 2 things were running very smoothly.
log file content:
Code sample
Package version
3.1.4
Node.js version
16
Operating system
windows
Apify platform
I have tested this on the next release
3.1.5.beta8
Other context
No response