forked from jech/dht
DTNDHT is a library providing a fully distributed naming service, based on the
BitTorrent DHT, for DTN Bundle Protocol EIDs (RFC 5050). It can be used on any
IP-based network (IPv4 or IPv6). Further information and an evaluation can be
found here: http://dl.acm.org/citation.cfm?doid=2348616.2348619
This library implements a variant of the Kademlia Distributed Hash Table (DHT)
used in the BitTorrent network (``mainline'' variant). It is based on the
work of Juliusz Chroboczek for the Transmission DHT.
This library adds a new message type to the DHT, which can be used to look up
and announce DTN convergence layer information.
To use this library, include dtndht/dtndht.h.
The files tools/src/lookup.c and tools/src/announce.c are stand-alone programs
that participate in the BitTorrent DHT and perform lookups or announcements.
The code is designed to work well in both event-driven and threaded code.
The caller, which is either an event-loop or a dedicated thread, must
periodically call the function dtn_dht_periodic. In addition, it must call
dtn_dht_periodic whenever any data has arrived from the network.
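The calling pattern described above can be sketched with select(). This is
only a minimal sketch, not code from the library: wait_for_packet is a
hypothetical helper that watches a single descriptor, and the exact argument
list of dtn_dht_periodic has to be taken from dtndht/dtndht.h.

```c
#include <assert.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper, not part of the library: block until fd becomes
 * readable or timeout_sec seconds have passed.
 * Returns 1 if data is waiting, 0 on timeout, -1 on error. */
int wait_for_packet(int fd, time_t timeout_sec)
{
    fd_set readfds;
    struct timeval tv;
    int rc;

    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    rc = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return rc > 0 ? 1 : 0;
}
```

A main loop would call wait_for_packet with the DHT socket and the last
tosleep value, and then call dtn_dht_periodic in both cases: with the received
packet when the result is 1, and without one when it is 0. A real loop would
watch both the IPv4 and the IPv6 socket.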
All functions return -1 in case of failure, or a positive value in case of
success.
Initialization
**************
All library calls need a pointer to a dtn_dht_context struct. A few helper
functions set one up:
* dtn_dht_initstruct(struct dtn_dht_context *ctx);
This initializes the struct to default values. It sets the default DHT UDP
port and generates a random DHT ID. The values can also be set manually.
The context also holds a pointer to a convergence layer struct, which may be
changed at any time between library calls. It should list all active
convergence layers and their parameters.
To open the UDP sockets for the DHT, simply call:
* dtn_dht_init_sockets(struct dtn_dht_context *ctx);
This can also be done manually.
The first library function to be called is:
* dtn_dht_init(struct dtn_dht_context *ctx);
This must be called before using the DHT. You pass it the created context,
containing a bound IPv4 datagram socket, a bound IPv6 datagram socket, and
your DHT node ID, a 20-octet array that should be globally unique.
If you're on a multi-homed host, you should bind the sockets to one of your
addresses.
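If you open the sockets manually, binding to one specific local address could
look like the sketch below. bind_udp4 is a hypothetical helper, not a library
function. How the resulting descriptor is stored in the context is not
described in this README, so check dtndht/dtndht.h for the field names.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create an IPv4 UDP socket bound to one local
 * address. Returns the socket descriptor, or -1 on failure. */
int bind_udp4(const char *local_addr, unsigned short port)
{
    struct sockaddr_in sin;
    int fd;

    assert(local_addr != NULL);

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_port = htons(port);
    if (inet_pton(AF_INET, local_addr, &sin.sin_addr) != 1)
        return -1;              /* not a valid dotted-decimal address */

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The IPv6 socket can be bound the same way with AF_INET6 and a struct
sockaddr_in6.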
DHT node IDs must be well distributed: you should either generate a truly
random value (using plenty of entropy, as dtn_dht_initstruct does), or at
least take the SHA-1 of something. However, it is a good idea to keep the ID
stable, so you may want to save it to stable storage at client shutdown.
* dtn_dht_build_id_from_str(unsigned char *target, const char *s, size_t len)
Generates a DHT node ID from the given string s with a given length len.
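A stable random ID, as recommended above, could be handled as in the
following sketch. It assumes a system that provides /dev/urandom.
node_id_load_or_create and NODE_ID_LEN are hypothetical names, not part of
the library, and copying the ID into the context is left out because the
field name is not given here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NODE_ID_LEN 20          /* a DHT node ID is a 20-octet array */

/* Read exactly len bytes from the start of a file. Returns 0 or -1. */
static int read_exact(const char *path, unsigned char *buf, size_t len)
{
    FILE *f = fopen(path, "rb");
    size_t n;

    if (f == NULL)
        return -1;
    n = fread(buf, 1, len, f);
    fclose(f);
    return n == len ? 0 : -1;
}

/* Hypothetical helper: reuse the ID stored in path if there is one,
 * otherwise draw a random ID and store it. Returns 0 or -1. */
int node_id_load_or_create(const char *path, unsigned char id[NODE_ID_LEN])
{
    FILE *f;
    size_t n;

    assert(path != NULL && id != NULL);

    if (read_exact(path, id, NODE_ID_LEN) == 0)
        return 0;                               /* stored ID found */
    if (read_exact("/dev/urandom", id, NODE_ID_LEN) != 0)
        return -1;                              /* no entropy source */

    f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    n = fwrite(id, 1, NODE_ID_LEN, f);
    if (fclose(f) != 0)
        return -1;
    return n == NODE_ID_LEN ? 0 : -1;
}
```

This sketch writes the ID as soon as it is created, so it also survives an
unclean exit. Saving it at shutdown, as suggested above, works as well.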
Shutdown
********
* dtn_dht_uninit
This may be called at the end of the DHT session.
After shutting down the DHT, you can close the open UDP sockets by
calling the function:
* dtn_dht_close_sockets(struct dtn_dht_context *ctx)
Bootstrapping
*************
The DHT needs to be taught a small number of contacts to begin functioning.
You can hard-wire a small number of stable nodes in your application, but
this obviously fails to scale. You may save the list of known good nodes
at shutdown, and restore it at restart.
* dtn_dht_ping_node
This is the main bootstrapping primitive. You pass it an address at which
you believe that a DHT node may be living, and a query will be sent. If
a node replies, and if there is space in the routing table, it will be
inserted.
* dtn_dht_insert_node
This is a softer bootstrapping method, which doesn't actually send
a query -- it only stores the node in the routing table for later use. It
is a good idea to use that when e.g. restoring your routing table from
disk.
Note that dtn_dht_insert_node requires that you supply a node id. If the
id turns out to be wrong, the DHT will eventually recover; still, inserting
massive amounts of incorrect information into your routing table is
certainly not a good idea.
An additional difficulty with dtn_dht_insert_node is that, for various
reasons, a Kademlia routing table cannot absorb nodes faster than a certain
rate. Dumping a large number of nodes into a table using dtn_dht_insert_node
will probably cause most of these nodes to be discarded straight away.
(The tolerable rate is difficult to estimate; it is probably on the order
of one node every few seconds per node already in the table divided by 8,
for some suitable value of 8.)
* dtn_dht_dns_bootstrap(struct dtn_dht_context *ctx, const char* name, const
char* service)
Starts a DNS lookup for the given host name and service. Every IP address
returned by the lookup is passed to dtn_dht_ping_node. This is a very simple
bootstrapping mechanism if you have a DNS server providing the IP addresses
and ports of existing DHT nodes.
If the given host name is NULL, the DNS name of the IBR is used:
dtndht.ibr.cs.tu-bs.de
If the given service name is NULL, the default port of the BitTorrent DHT is
used:
6881
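The kind of lookup described above can be sketched with getaddrinfo(). This
is an illustration under assumptions, not the library's implementation:
resolve_bootstrap and addr_visitor are hypothetical names, and the callback
stands in for the call to dtn_dht_ping_node, whose argument list is not
given in this README.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#define DEFAULT_BOOTSTRAP_HOST    "dtndht.ibr.cs.tu-bs.de"
#define DEFAULT_BOOTSTRAP_SERVICE "6881"

/* Hypothetical callback type: called once per resolved address. */
typedef void (*addr_visitor)(const struct sockaddr *sa, socklen_t salen);

/* Hypothetical helper: resolve name/service to UDP addresses (IPv4 and
 * IPv6) and hand each one to visit. NULL arguments select the defaults.
 * Returns the number of addresses found, or -1 if resolution failed. */
int resolve_bootstrap(const char *name, const char *service,
                      addr_visitor visit)
{
    struct addrinfo hints, *res, *ai;
    int count = 0;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;        /* accept IPv4 and IPv6 */
    hints.ai_socktype = SOCK_DGRAM;     /* the DHT speaks UDP */

    if (getaddrinfo(name != NULL ? name : DEFAULT_BOOTSTRAP_HOST,
                    service != NULL ? service : DEFAULT_BOOTSTRAP_SERVICE,
                    &hints, &res) != 0)
        return -1;

    for (ai = res; ai != NULL; ai = ai->ai_next) {
        if (visit != NULL)
            visit(ai->ai_addr, ai->ai_addrlen);
        count++;
    }
    freeaddrinfo(res);
    return count;
}
```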
Once an active DHT node has been contacted, the number of DHT neighbors can
be increased by searching for random hashes on the DHT. This can be done by
calling:
* dtn_dht_start_random_lookup(struct dtn_dht_context *ctx);
Doing some work
***************
* dtn_dht_periodic
This function should be called by your main loop periodically, and also
whenever data is available on the socket. The time after which
dtn_dht_periodic should be called if no data is available is returned in the
parameter tosleep. (You do not need to be particularly accurate; actually,
it is a good idea to be late by a random value.)
The parameters buf, buflen, from and fromlen optionally carry a received
message. If buflen is 0, then no message was received.
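Being late by a random value, as suggested above, could be done as in this
sketch. jittered_sleep is a hypothetical helper, and the upper bound on the
lateness is an assumption to be chosen by the application.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: return tosleep plus a random lateness of
 * 0..max_late seconds. rand() is good enough here, since the jitter only
 * has to spread the calls out; seed it once with srand() at startup. */
time_t jittered_sleep(time_t tosleep, unsigned int max_late)
{
    assert(tosleep >= 0);

    if (max_late > (unsigned int)RAND_MAX)
        max_late = (unsigned int)RAND_MAX;  /* keeps max_late + 1 nonzero */
    return tosleep + (time_t)((unsigned int)rand() % (max_late + 1u));
}
```

The result can be used directly as the timeout for the next wait on the
sockets.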
* dtn_dht_lookup(struct dtn_dht_context *ctx, const char *eid, size_t eidlen);
This schedules a lookup for information about the given EID; eidlen is the
length of eid. Up to DHT_MAX_SEARCHES (1024) searches can be in progress at a
given time; any more, and dtn_dht_lookup will return -1. If you start a new
lookup for an EID while a lookup for the same EID is still in progress, the
two are combined -- you will only receive a completion indication once.
* dtn_dht_announce(struct dtn_dht_context *ctx, const char *eid,
size_t eidlen, enum dtn_dht_lookup_type type)
This starts a periodic announcement of the given EID to the DHT. The type
(SINGLETON, NEIGHBOUR, or GROUP) states what kind of EID is being announced.
SINGLETON must be used for the daemon's own EID, NEIGHBOUR for the EID of a
neighbor that should be announced, and GROUP for every group membership that
should be announced.
* dtn_dht_deannounce(const char *eid, size_t eidlen)
Stops a previous announcement, so the EID is no longer re-announced to the
DHT. This does not delete the announcement from the DHT: it may remain
visible for about 30 minutes after this function has been called. There is
no way to delete announced information from the DHT; this is by design.
Information queries
*******************
* dtn_dht_nodes
This returns the number of known good, dubious and cached nodes in our
routing table. This can be used to decide whether it's reasonable to start
a search; a search is likely to be successful as long as we have a few good
nodes; however, in order to avoid overloading your bootstrap nodes, you may
want to wait until good is at least 4 and good + dubious is at least 30 or
so.
It also returns the number of nodes that recently sent us an unsolicited
request; this can be used to determine if the UDP port used for the DHT is
firewalled.
If you want to display a single figure to the user, you should display
good + dubious, which is the total number of nodes in your routing table.
Some clients try to estimate the total number of nodes, but this doesn't
make much sense -- since the result is exponential in the number of nodes
in the routing table, small variations in the latter cause huge jumps in
the former.
* dtn_dht_ready_for_work(struct dtn_dht_context *ctx)
Returns true if the thresholds described above have been reached, in which
case you can start lookups.
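The rule of thumb given under dtn_dht_nodes can be written down as a small
check. This is a sketch of that rule only, not the body of
dtn_dht_ready_for_work. The function and constant names are hypothetical.

```c
#include <assert.h>

#define MIN_GOOD_NODES   4      /* "good is at least 4" */
#define MIN_KNOWN_NODES 30      /* "good + dubious is at least 30 or so" */

/* Hypothetical helper: returns 1 if the routing table looks large enough
 * to start lookups without overloading the bootstrap nodes, 0 otherwise. */
int enough_nodes_for_lookup(int good, int dubious)
{
    assert(good >= 0 && dubious >= 0);

    return good >= MIN_GOOD_NODES && good + dubious >= MIN_KNOWN_NODES;
}
```

For example, 4 good and 26 dubious nodes pass the check, while 3 good and
100 dubious nodes do not.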
* dtn_dht_get_nodes
This retrieves the list of known good nodes, starting with the nodes in our
own bucket. It is a good idea to save the list of known good nodes at
shutdown, and ping them at startup.
* dtn_dht_save_conf(const char *filename)
Saves the routing table to the given file.
* dtn_dht_load_prev_conf(const char *filename)
Loads a previously saved routing file and bootstraps from it.
Functions provided by you
*************************
* dtn_dht_handle_lookup_result(const struct dtn_dht_lookup_result *result)
This function is called when a lookup for an EID has been answered. The
result contains the parsed contact information: the EID of the answering
node, its convergence layer information, its neighbors, and its group
memberships.
* dtn_dht_operation_done(const unsigned char *info_hash)
This is called when a lookup or an announcement has finished. The info_hash
passed in is the SHA-1 hash of the EID whose operation has finished.
Functions possibly helpful for you
**********************************
* dtn_dht_blacklist(int enable)
Switches the blacklist on or off. The default is on. The blacklist blocks
misbehaving DHT nodes.
* unsigned int dtn_dht_blacklisted_nodes(unsigned int *ipv4_return,
unsigned int *ipv6_return)
If the blacklist is enabled (the default), the number of blocked addresses
(IPv4 and IPv6) is returned.
* dtn_dht_free_convergence_layer_struct(struct dtn_convergence_layer *clayer)
Frees a given list of convergence layers. It is just a utility function for
simpler cleanup after changing the convergence layer in the context.
Final notes
***********
* NAT
Nothing works well across NATs, but Kademlia is somewhat less impacted than
many other protocols. While there is no periodic pinging in this
implementation, maintaining a full routing table requires slightly more than
one packet exchange per minute, even in a completely idle network; this
should be sufficient to make most full cone NATs happy. A node behind a NAT
can therefore take part in the DHT, although not as well as a node with a
public address. However, announcing DTN information to other nodes will very
probably fail. Lookups work, because the UDP communication is initiated from
behind the NAT towards the Internet.
* IBRDTN
This library is used by IBRDTN (http://www.ibr.cs.tu-bs.de/projects/ibr-dtn/).
Its integration into the IBRDTN daemon is a good usage example and can be
very helpful for understanding how the library works.
* Evaluation
The folder "eval" contains a lot of evaluation scripts. They were written
quickly and for one particular setup, so simply running them will probably
fail. After adjusting paths and IP addresses they should be easy to run.
* Missing functionality
Some of the code has had very little testing. If it breaks, you get to
keep both pieces.
Till Lorentzen