soneyworld/dtndht

DTNDHT is a library providing a fully distributed, BitTorrent DHT based naming
service designed for DTN Bundle Protocol EIDs (RFC 5050). It can be used on
any IP (v4/v6) based network. Further information and an evaluation can be
found here: http://dl.acm.org/citation.cfm?doid=2348616.2348619

This library implements a variant of the Kademlia Distributed Hash Table (DHT)
used in the BitTorrent network (``mainline'' variant). It is based on the
work of Juliusz Chroboczek for the Transmission DHT.
This library adds a new message type to the DHT, which can be used to look up
and announce DTN convergence layer information.

To use this library, include dtndht/dtndht.h.

The files tools/src/lookup.c and tools/src/announce.c are stand-alone programs
that participate in the BitTorrent DHT and execute lookups or announcements.

The code is designed to work well in both event-driven and threaded code.
The caller, which is either an event-loop or a dedicated thread, must
periodically call the function dtn_dht_periodic.  In addition, it must call
dtn_dht_periodic whenever any data has arrived from the network.

All functions return -1 in case of failure, or a positive value in case of
success.

Initialization
**************

All library calls need a pointer to a dtn_dht_context struct. A few helper
functions exist to set it up:

* dtn_dht_initstruct(struct dtn_dht_context *ctx);

This initializes the struct to default values. It sets the default DHT UDP
port and generates a random DHT ID. The values can also be set manually.
The struct holds a pointer to a convergence layer struct, which may be changed
at any time between library calls. It should describe all active convergence
layers and their parameters.

To open the UDP sockets for the DHT, simply call:
* dtn_dht_init_sockets(struct dtn_dht_context *ctx);
This can also be done manually.


The first library function to be called is:
* dtn_dht_init(struct dtn_dht_context *ctx);
This must be called before using the DHT. You pass it the created context,
containing a bound IPv4 datagram socket, a bound IPv6 datagram socket, and 
your DHT node ID, a 20-octet array that should be globally unique.

If you're on a multi-homed host, you should bind the sockets to one of your
addresses.
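
If you open the sockets manually, binding to one specific local address can be
sketched with plain POSIX calls as below. The helper name open_udp4_on is
hypothetical and not part of DTNDHT; how the descriptor is then stored in the
context depends on the fields declared in dtndht/dtndht.h, which are not shown
here.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not part of DTNDHT): opens a UDP/IPv4 socket bound
 * to one specific local address. Port 0 lets the kernel pick a free port.
 * Returns the descriptor, or -1 on failure. */
int open_udp4_on(const char *local_addr, unsigned short port)
{
    struct sockaddr_in sin;
    int fd;

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_port = htons(port);
    /* Reject anything that is not a numeric IPv4 address. */
    if (inet_pton(AF_INET, local_addr, &sin.sin_addr) != 1)
        return -1;

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

An IPv6 socket would be opened the same way with AF_INET6 and a
struct sockaddr_in6.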

DHT node IDs must be well distributed; you should either generate a truly
random value (using plenty of entropy, as dtn_dht_initstruct does), or at
least take the SHA-1 of something.  However, it is a good idea to keep the ID
stable, so you may want to write it to stable storage at client shutdown and
reuse it at the next start.
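
A minimal sketch of keeping the ID stable across restarts, assuming a POSIX
system with /dev/urandom. The function name load_or_create_node_id and the
file layout (20 raw octets) are illustrative assumptions, not part of DTNDHT.
For brevity it saves the ID right after generating it rather than at shutdown.

```c
#include <stdio.h>
#include <string.h>

#define NODE_ID_LEN 20  /* a DHT node ID is a 20-octet array */

/* Hypothetical helper (not part of DTNDHT).
 * Returns 2 if an ID was loaded from path, 1 if a fresh random ID was
 * generated and saved to path, -1 on failure. */
int load_or_create_node_id(const char *path, unsigned char id[NODE_ID_LEN])
{
    size_t n;
    FILE *f = fopen(path, "rb");

    if (f != NULL) {
        n = fread(id, 1, NODE_ID_LEN, f);
        fclose(f);
        if (n == NODE_ID_LEN)
            return 2;
        /* Short or damaged file: fall through and regenerate. */
    }

    /* Draw 20 random octets from the kernel's entropy source. */
    f = fopen("/dev/urandom", "rb");
    if (f == NULL)
        return -1;
    n = fread(id, 1, NODE_ID_LEN, f);
    fclose(f);
    if (n != NODE_ID_LEN)
        return -1;

    /* Store the ID so that the next start reuses it. */
    f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    n = fwrite(id, 1, NODE_ID_LEN, f);
    if (fclose(f) != 0 || n != NODE_ID_LEN)
        return -1;
    return 1;
}
```

The resulting array would be copied into the context before dtn_dht_init is
called.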

* dtn_dht_build_id_from_str(unsigned char *target, const char *s, size_t len)
Generates a DHT node ID from the given string s with a given length len.

Shutdown
********

* dtn_dht_uninit
This may be called at the end of the DHT session.

After shutting down the DHT, you could shut down the opened UDP sockets by
calling the function:
* dtn_dht_close_sockets(struct dtn_dht_context *ctx)



Bootstrapping
*************

The DHT needs to be taught a small number of contacts to begin functioning.
You can hard-wire a small number of stable nodes in your application, but
this obviously fails to scale.  You may save the list of known good nodes
at shutdown, and restore it at restart.

* dtn_dht_ping_node

This is the main bootstrapping primitive.  You pass it an address at which
you believe that a DHT node may be living, and a query will be sent.  If
a node replies, and if there is space in the routing table, it will be
inserted.

* dtn_dht_insert_node

This is a softer bootstrapping method, which doesn't actually send
a query -- it only stores the node in the routing table for later use.  It
is a good idea to use that when e.g. restoring your routing table from
disk.

Note that dtn_dht_insert_node requires that you supply a node id.  If the
id turns out to be wrong, the DHT will eventually recover; still, inserting
massive amounts of incorrect information into your routing table is
certainly not a good idea.

An additional difficulty with dtn_dht_insert_node is that, for various
reasons, a Kademlia routing table cannot absorb nodes faster than a certain
rate.  Dumping a large number of nodes into a table using dtn_dht_insert_node
will probably cause most of these nodes to be discarded straight away.
(The tolerable rate is difficult to estimate; it is probably on the order
of one node every few seconds per node already in the table divided by 8,
for some suitable value of 8.)

* dtn_dht_dns_bootstrap(struct dtn_dht_context *ctx, const char* name, const
 char* service)
Starts a DNS lookup for the given host name and service. Every IP address
returned by the lookup is passed to dtn_dht_ping_node. This is a very simple
bootstrapping mechanism if you have a DNS name that resolves to the IP
addresses of existing DHT nodes.

If the given domain name is NULL, the DNS server of the IBR will be chosen:
dtndht.ibr.cs.tu-bs.de
If the given service name is NULL, the default service name of the bittorrent
DHT will be chosen:
6881
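
The resolution step described above can be sketched with getaddrinfo. This is
an illustration under assumptions, not the library's code: resolve_bootstrap
and the addr_cb callback type are hypothetical names, and in a real program
the callback would hand each address to dtn_dht_ping_node. The two defaults
are the ones stated above.

```c
#include <netdb.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical callback type: receives one resolved address. */
typedef void (*addr_cb)(const struct sockaddr *sa, socklen_t salen,
                        void *closure);

/* Hypothetical helper (not part of DTNDHT): resolves name/service and calls
 * cb once per returned address. Returns the number of addresses found, or
 * -1 if the lookup failed. */
int resolve_bootstrap(const char *name, const char *service,
                      addr_cb cb, void *closure)
{
    struct addrinfo hints, *res, *ai;
    int count = 0;

    if (name == NULL)
        name = "dtndht.ibr.cs.tu-bs.de";  /* default named in this README */
    if (service == NULL)
        service = "6881";                 /* default BitTorrent DHT service */

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;     /* accept both IPv4 and IPv6 */
    hints.ai_socktype = SOCK_DGRAM;  /* the DHT speaks UDP */

    if (getaddrinfo(name, service, &hints, &res) != 0)
        return -1;

    for (ai = res; ai != NULL; ai = ai->ai_next) {
        if (cb != NULL)
            cb(ai->ai_addr, ai->ai_addrlen, closure);
        count++;
    }
    freeaddrinfo(res);
    return count;
}
```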

Once contact with an active DHT node has been made, the number of DHT
neighbors can be increased by searching for random hashes on the DHT. This
can be done by calling:
* dtn_dht_start_random_lookup(struct dtn_dht_context *ctx);

Doing some work
***************

* dtn_dht_periodic

This function should be called by your main loop periodically, and also
whenever data is available on the socket.  The time after which
dtn_dht_periodic should be called if no data is available is returned in the
parameter tosleep.  (You do not need to be particularly accurate; actually,
it is a good idea to be late by a random value.)

The parameters buf, buflen, from and fromlen optionally carry a received
message.  If buflen is 0, then no message was received.
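
One way to drive this from an event loop is to wait on the socket with
select, using tosleep plus a small random delay as the timeout. The sketch
below covers only the waiting step; wait_for_datagram is a hypothetical
helper, not part of DTNDHT, and the one-second bound on the random lateness
is an arbitrary choice.

```c
#include <stdlib.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper (not part of DTNDHT): waits up to tosleep seconds,
 * plus up to one second of random lateness, for a datagram on fd.
 * Returns the number of bytes received, 0 if the timeout expired (or an
 * empty datagram arrived), or -1 on error. */
long wait_for_datagram(int fd, time_t tosleep, void *buf, size_t buflen,
                       struct sockaddr_storage *from, socklen_t *fromlen)
{
    struct timeval tv;
    fd_set readfds;
    int rc;

    tv.tv_sec = tosleep;
    tv.tv_usec = rand() % 1000000;  /* be late by a random amount */

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    rc = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    if (rc == 0)
        return 0;  /* nothing arrived: report "no message" */

    *fromlen = sizeof(*from);
    return (long)recvfrom(fd, buf, buflen, 0,
                          (struct sockaddr *)from, fromlen);
}
```

After each call, the buffer, the returned length (0 meaning no message), from
and fromlen would be handed to dtn_dht_periodic, and the tosleep value it
reports would be used for the next wait. The exact prototype should be taken
from dtndht/dtndht.h. With one IPv4 and one IPv6 socket, both descriptors
would be added to the same fd_set.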

* dtn_dht_lookup(struct dtn_dht_context *ctx, const char *eid, size_t eidlen);

This schedules a lookup for information about the given EID; eidlen is the
length of the eid string.

Up to DHT_MAX_SEARCHES (1024) searches can be in progress at a given time;
any more, and dtn_dht_lookup will return -1.  If you start a new search for
an EID while a lookup for the same EID is still in progress, the previous
lookup is combined with the new one -- you will only receive a completion
indication once.

* dtn_dht_announce(struct dtn_dht_context *ctx, const char *eid,
		size_t eidlen, enum dtn_dht_lookup_type type)

This starts a periodic announcement of the given EID to the DHT. The type
(SINGLETON, NEIGHBOUR, or GROUP) states what kind of EID is being announced.
SINGLETON must be used for the daemon's own EID. NEIGHBOUR has to be used
when a neighbor is announced. GROUP should be used for every group
membership that should be announced.
  
* dtn_dht_deannounce(const char *eid, size_t eidlen)

Stops a previous announcement, which means that the EID will no longer be
announced to the DHT. It does not mean that the announcement is deleted from
the DHT: an announcement may stay on the DHT for about 30 minutes after this
function has been called. There is no way to delete announced information
from the DHT; this is by design.

Information queries
*******************

* dtn_dht_nodes

This returns the number of known good, dubious and cached nodes in our
routing table.  This can be used to decide whether it's reasonable to start
a search; a search is likely to be successful as long as we have a few good
nodes; however, in order to avoid overloading your bootstrap nodes, you may
want to wait until good is at least 4 and good + dubious is at least 30 or
so.

It also includes the number of nodes that recently sent us an unsolicited
request; this can be used to determine if the UDP port used for the DHT is
firewalled.

If you want to display a single figure to the user, you should display
good + dubious, which is the total number of nodes in your routing table.
Some clients try to estimate the total number of nodes, but this doesn't
make much sense -- since the result is exponential in the number of nodes
in the routing table, small variations in the latter cause huge jumps in
the former.

* dtn_dht_ready_for_work(struct dtn_dht_context *ctx)

Returns true if the limits described above have been reached. Once it
returns true, you can start lookups.
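
The rule of thumb given for dtn_dht_nodes can be written as a small predicate.
The thresholds 4 and 30 come from the text above; whether
dtn_dht_ready_for_work applies exactly this test internally is an assumption,
and enough_nodes_for_lookup is a hypothetical name.

```c
/* Hypothetical helper (not part of DTNDHT): returns 1 when the routing
 * table looks populated enough to start lookups, 0 otherwise.
 * good and dubious are the counts reported for the routing table. */
int enough_nodes_for_lookup(int good, int dubious)
{
    return good >= 4 && good + dubious >= 30;
}
```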

* dtn_dht_get_nodes

This retrieves the list of known good nodes, starting with the nodes in our
own bucket.  It is a good idea to save the list of known good nodes at
shutdown, and ping them at startup.

* dtn_dht_save_conf(const char *filename)

Saves the routing table to the given file.

* dtn_dht_load_prev_conf(const char *filename)

Loads a previously saved routing table file and bootstraps from it.

Functions provided by you
*************************

* dtn_dht_handle_lookup_result(const struct dtn_dht_lookup_result *result)

This function is called when a lookup for an EID has been answered. The result
contains the parsed contact information: the EID of the answering node, its
convergence layer information, its neighbors and its group memberships.

* dtn_dht_operation_done(const unsigned char *info_hash)

This is called when a lookup or an announcement has finished.
The info_hash passed in is the SHA-1 hash of the EID whose operation has
finished.
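
Since a SHA-1 hash is 20 octets long, a callback implementation may want to
print the hash or compare it with the hash of an EID it is waiting for. The
two helpers below are hypothetical names for illustration and are not part of
DTNDHT.

```c
#include <stdio.h>
#include <string.h>

#define INFO_HASH_LEN 20  /* length of a SHA-1 hash in octets */

/* Hypothetical helper: writes the hash as 40 lowercase hex digits plus a
 * terminating NUL into out, which must hold at least 41 bytes. */
void info_hash_to_hex(const unsigned char *info_hash,
                      char out[2 * INFO_HASH_LEN + 1])
{
    int i;

    for (i = 0; i < INFO_HASH_LEN; i++)
        snprintf(out + 2 * i, 3, "%02x", info_hash[i]);
}

/* Hypothetical helper: returns 1 if both hashes are identical, else 0. */
int info_hash_equal(const unsigned char *a, const unsigned char *b)
{
    return memcmp(a, b, INFO_HASH_LEN) == 0;
}
```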

Functions possibly helpful for you
**********************************

* dtn_dht_blacklist(int enable)

Switches the blacklist on or off. The default is on. The blacklist blocks
misbehaving DHT nodes.

* unsigned int dtn_dht_blacklisted_nodes(unsigned int *ipv4_return,
		unsigned int *ipv6_return)

If the blacklist is enabled (the default), the number of blocked addresses
(IPv4 and IPv6) is returned.

* dtn_dht_free_convergence_layer_struct(struct dtn_convergence_layer *clayer)

Frees a given list of convergence layers. It is a utility function that
simplifies cleanup after changing the convergence layer in the context.

Final notes
***********

* NAT

Nothing works well across NATs, but Kademlia is somewhat less impacted than
many other protocols. While there is no periodic pinging in this
implementation, maintaining a full routing table requires slightly more than
one packet exchange per minute, even in a completely idle network; this
should be sufficient to make most full cone NATs happy. So a node behind a
NAT can take part in the DHT, although not as well as a directly reachable
node. However, announcing any DTN information to others will fail with very
high probability. Lookups work, because the UDP communication is initiated
from behind the NAT towards the internet.

* IBRDTN

This library is used by IBRDTN (http://www.ibr.cs.tu-bs.de/projects/ibr-dtn/).
So a good usage example is the integration in the IBRDTN daemon. This could
be very helpful to understand the way it works.

* Evaluation

The folder "eval" contains a lot of scripts for evaluation. They were written
quickly and are tailored to the author's setup, so simply running them will
probably fail. With a few changes to paths and IPs, it should be easy to run
them.

* Missing functionality

Some of the code has had very little testing.  If it breaks, you get to
keep both pieces.


                                        Till Lorentzen
