Crawler mode speed #17
Comments
Did you run it on a local network? This crawler currently can't run behind NAT. |
Sure. I read the documentation.
Throughput seems to be 60-100 per minute. With p2pspider I get 3200/minute after an hour.
|
Is it the sample that you are running?
|
Yes, I am running the sample.
I understand what you are saying, but on the same PC, with the same bootstrap servers, the two programs behave differently.
Yours starts at about 60 peers/minute and stays there even after 3 hours.
p2pspider (which I suggest you test, just for comparison) also starts at 60, then after about 2 hours it is at full speed and consumes almost half of my bandwidth, and also 100% CPU time.
p2pspider runs on Node.js because it is written in JavaScript. I just wanted to test yours to see whether I get similar results with less CPU time (the bandwidth use comes with the throughput, I think).
Or is there anything I need to set?
Regards,
Zibri
http://www.zibri.org
https://twitter.com/Zibri
Lime wrote (Wednesday, January 11, 2017 03:03):
Is it the sample that you are running?
a. The DHT protocol has two kinds of peer messages, get_peers and announce_peer. Only announce_peer is what we want. The example prints only the BT seeds it fetched successfully. I don't know what p2pspider prints.
b. When we get an announce_peer message, we then fetch the BT seed. If that fails, the IP:port is put on a blacklist, and the DHT crawler will fetch it again after maybe one hour.
|
OK, I'll figure it out. |
I have the same question as Zibri. Intrinsically, golang should be faster than nodejs, and I am listening for announce_peer right now. But please let us know what in the config we can tweak to make the spider mode go faster. |
At the moment I am using simdht with pypy... check it out. But I still think C is the way...
fanpei91 wrote (13/12/17 16:39 GMT+01:00):
I have rewritten p2pspider from node to golang recently. Same efficiency as before, but higher performance.
|
simdht for golang is here: godht. I don't think C is the way. You need to learn about the performance of the golang 1.9 runtime. |
I am checking dht in go…
I see the output:
link: magnet:?xt=urn:btih:49a2afaa0a3bb5e1eb45cb2cc598c7ed6cd9c2c5
node: 2.136.205.155:58236
peer: 2.136.205.155:58236
How can I include the announced file name (if present in the announcement)?
What I need is just the hash and the name.
|
Hmm
Even though I have never coded in Go, I did this:
for announce := range dht.Announce {
    // "a" holds the arguments of the announce_peer message.
    rawa := announce.Raw["a"].(map[string]interface{})
    fmt.Printf("link: magnet:?xt=urn:btih:%v\nraw: %s\n\n",
        announce.InfohashHex,
        rawa["name"],
    )
}
And now it writes hash and name.
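One caveat with the loop above: a plain type assertion panics when "a" is not a dictionary, and the name may simply be absent. A more defensive variant could look like the sketch below. The helper `announceName` is hypothetical and not part of any library; it assumes the decoded message is a `map[string]interface{}` and that the bencode decoder yields strings or byte slices.

```go
package main

import "fmt"

// announceName (hypothetical helper) returns the optional "name" entry
// from the "a" (arguments) dictionary of a decoded message.
// The second result is false when the entry is missing or has an
// unexpected type, so callers never hit a failed type assertion.
func announceName(raw map[string]interface{}) (string, bool) {
	args, ok := raw["a"].(map[string]interface{})
	if !ok {
		return "", false
	}
	switch v := args["name"].(type) {
	case string:
		return v, true
	case []byte: // some decoders return byte slices instead of strings
		return string(v), true
	}
	return "", false
}

func main() {
	withName := map[string]interface{}{
		"a": map[string]interface{}{"name": "example.iso"},
	}
	withoutName := map[string]interface{}{
		"a": map[string]interface{}{},
	}

	if name, ok := announceName(withName); ok {
		fmt.Println("name:", name)
	}
	if _, ok := announceName(withoutName); !ok {
		fmt.Println("no name in this announcement")
	}
}
```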
Hmm... I wonder how the speed compares to simDHT with pypy.
|
By the way, as of now my crawler (a modified simDHT running multithreaded) is doing:
(20494 hashes / min) (8019 unique hashes /min)
Bandwidth: 11834.47 / 6569.03 Kbit/s
But I think that with the right coding it could go much higher than that!
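For anyone comparing crawlers, the gap between "hashes" and "unique hashes" matters, since the same infohash gets reported many times. A minimal sketch of counting both, assuming infohashes arrive as hex strings (`HashStats` is a hypothetical name, and resetting the counters every minute is left out):

```go
package main

import "fmt"

// HashStats (hypothetical) counts every reported infohash and how many
// distinct infohashes were seen.
type HashStats struct {
	total int
	seen  map[string]struct{}
}

// NewHashStats returns an empty counter.
func NewHashStats() *HashStats {
	return &HashStats{seen: make(map[string]struct{})}
}

// Add records one reported infohash.
func (s *HashStats) Add(infohash string) {
	s.total++
	s.seen[infohash] = struct{}{}
}

// Total is the number of reports, duplicates included.
func (s *HashStats) Total() int { return s.total }

// Unique is the number of distinct infohashes.
func (s *HashStats) Unique() int { return len(s.seen) }

func main() {
	s := NewHashStats()
	for _, h := range []string{"aa", "bb", "aa", "cc", "aa"} {
		s.Add(h)
	}
	fmt.Printf("%d hashes, %d unique\n", s.Total(), s.Unique()) // prints "5 hashes, 3 unique"
}
```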
|
I did some testing.
Even when setting 1500 friends/sec, the speed is ridiculously low:
pid 1217 speed 27 running time 29:51
Total speed: 27
Top speed: 75
Top speed 75? After half an hour? I get that speed in 1 minute with the modified simdht.
And look at its speed now:
pid 3687 speed 12561 running time 19-19:34:17
pid 3686 speed 6662 running time 19-19:34:17
Total speed: 19223
TOP SPEED
Top speed: 19223
That’s nearly 20K hashes per minute!
|
Even after increasing the connection limits, I notice that crawler mode gets only 60 peers/minute.
Is there a setting to increase the speed?
With another crawler I have, I can get 100000/hour!