µtorrent - DHT?
Solved
kang59
Posted messages
103
Status
Member
-
verdy_p Posted messages 214 Status Member -
Hello,
I'm a brand new member of this site, though I already know it quite well from browsing the internet in search of all sorts of information ^^.
My problem (to keep it brief):
I've been using µtorrent for a while.
Until today I had no issues, but this morning, ever since it connected, the status bar at the bottom shows "DHT waiting for identification" or "DHT: 0 nodes (identification)", and my download speed has dropped noticeably.
Yet I haven't changed anything in my software or hardware configuration...
(connected via a USB modem, 2 Mbit/s ADSL; Bitdefender 9 Professional Plus with integrated firewall, permissions granted, ports open, etc.)
My question(s):
- What is DHT? I've searched everywhere on the site and I was unable to find a definition or explanation.
- How can I fix my problem?
Thank you in advance for any help!
Configuration: Windows XP, Firefox 2.0.0.4
4 replies
http://ww12.quebectorrent.com
Distributed hash tables are used by the Chord, CAN, Tapestry, and Kademlia protocols (Kademlia is used by eMule), and by Ares Galaxy. They are also used by many recent BitTorrent clients such as Azureus, BitComet, and µTorrent (pronounced "micro torrent"). The first BitTorrent client to use a DHT was Azureus, followed by the official BitTorrent client; the two used different implementations. The official one came to be called Mainline DHT, and most clients now support it.
Principle
Suppose a large number of users (5 million) have launched their P2P (peer-to-peer) software on their computers. Each shares some files (movies in MPEG format, images, disks, etc.).
A user (Luc) owns, for example, the music album "Les idees saines de Serge Dassault" (available under Creative Commons license).
Suppose another user (Pierre) wants to download this album. How can his P2P software find Luc's computer?
Pierre's software could possibly ask the 5 million computers if they happen to have this album. Luc's software would then respond: "I have it and can start transferring it."
However, it would be quite slow to ask the 5 million computers if they have this album, as there would be millions of queries like "I'm looking for this album, do you have it?" resulting in millions of responses: "No, sorry!"
A large directory archiving the names of the files shared by all users would solve the problem: one would only need to ask this "large directory" (= the hash table) for the music album "Les idees saines de Serge Dassault" to get the response: "it is available on Luc's computer (and Mathieu's, Paul's, etc.)."
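To make the contrast above concrete, here is a minimal Python sketch of the two approaches. The album title and the names Luc and Mathieu come from the example; everything else (the `peers` layout, the function names, the reduced number of peers) is invented for illustration and is not taken from any real client.

```python
# Hypothetical toy model: each peer shares a set of file names.
ALBUM = "Les idees saines de Serge Dassault"
peers = {f"peer{i}": set() for i in range(1000)}  # 1000 stands in for 5 million
peers["Luc"] = {ALBUM}
peers["Mathieu"] = {ALBUM}

def ask_everyone(title):
    """Naive search: one query per peer; almost every answer is 'No, sorry!'."""
    queries = 0
    holders = []
    for name, files in peers.items():
        queries += 1
        if title in files:
            holders.append(name)
    return holders, queries

# The "large directory": a hash table mapping a title to the peers that hold it.
directory = {}
for name, files in peers.items():
    for title in files:
        directory.setdefault(title, []).append(name)

def ask_directory(title):
    """Directory search: a single hash-table lookup."""
    return directory.get(title, []), 1

print(ask_everyone(ALBUM))
print(ask_directory(ALBUM))
```

Both calls find the same holders, but the first costs one query per peer and the second costs one lookup. In a distributed hash table this directory is not kept on a single machine; it is split among the peers themselves.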
Eric
The goal, then, is to build a graph that can be queried step by step, from neighbor to neighbor: we pass our key searches to the peers we are "connected" to (generally no more than 64), and they pass them on to their own DHT peers. The DHT is designed so that, once those peers have queried their own peers, the same member of the global DHT receives as few duplicate searches as possible via different paths. The DHT nevertheless allows multiple paths, since its members can disconnect or become unreachable or overloaded. The result is a global graph that is very robust and stable, and that can be used to query a network holding a vast amount of disparate data.
A DHT by itself does not let you search for a specific file, only for a digital fingerprint. Every peer connected to the DHT ends up asking its immediate peers whether they know that fingerprint, and the answers travel back to the requester along the reverse path (the same answer may arrive through several paths, but usually through very few).
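One common way to get this "few neighbors, few hops" behavior is the XOR distance used by Kademlia-style DHTs. The sketch below is only a simulation under stated assumptions: identifiers are shortened to 16 bits, every node keeps a single contact per "bucket", and the names `make_id`, `build_tables` and `lookup` are hypothetical. It is not the routing code of µTorrent or of any other client, and real DHTs keep several contacts per bucket to tolerate peers that disappear.

```python
import hashlib

BITS = 16  # assumption: short identifiers for readability; real DHTs use 160 bits or more

def make_id(name):
    """Hash a name to a BITS-bit integer identifier."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") >> (160 - BITS)

def build_tables(ids):
    """Give every node one contact per bucket (bucket = highest differing bit)."""
    tables = {}
    for me in ids:
        buckets = {}
        for other in ids:
            if other != me:
                bit = (me ^ other).bit_length() - 1
                buckets.setdefault(bit, other)  # keep the first contact seen
        tables[me] = list(buckets.values())
    return tables

def lookup(start, key, tables):
    """Hop to the known contact closest to the key until no contact is closer."""
    current, hops = start, 0
    while True:
        best = min(tables[current], key=lambda n: n ^ key, default=current)
        if best ^ key >= current ^ key:
            return current, hops
        current, hops = best, hops + 1

ids = sorted({make_id(f"peer{i}") for i in range(500)})
tables = build_tables(ids)
key = make_id("Les idees saines de Serge Dassault")
owner, hops = lookup(ids[0], key, tables)
print(f"node {owner:#06x} is closest to key {key:#06x}, reached in {hops} hops")
```

Each node knows at most 16 other nodes here, yet the search reaches the node whose identifier is closest to the key in at most 16 hops, without ever asking the whole network.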
We thus end up with a list of Internet hosts that know a key and that we can connect to in order to ask for it explicitly. At that point we can, for example, check a real identity (perform a PKI key exchange, verify public signatures, PGP keys, etc.), connect securely once the identity is verified, and then ask for what we actually wanted using the fingerprint we looked up on the DHT; the host will say whether it really has what we requested under this anonymous fingerprint. The DHT can also be used to search for a non-anonymous file, known only by the public digital fingerprint of its content. This is what file sharing with torrents does, for example: we generally search for a file by a public SHA-1 fingerprint and, once connected, check that we really have the file we wanted by asking for a longer, more secure key such as a SHA-512 fingerprint.
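The two-step check described above (a short public fingerprint to search with, a longer one to verify with) can be sketched with Python's standard `hashlib` module. The function names and the sample content are hypothetical, and this is not a description of how any particular client verifies its downloads.

```python
import hashlib

def fingerprints(content: bytes):
    """Return (short lookup key, longer verification key) as hex strings."""
    lookup_key = hashlib.sha1(content).hexdigest()      # what we search for on the DHT
    verify_key = hashlib.sha512(content).hexdigest()    # what we check once connected
    return lookup_key, verify_key

def verify(received: bytes, expected_sha512: str) -> bool:
    """Check that the bytes a host sent us match the longer fingerprint."""
    return hashlib.sha512(received).hexdigest() == expected_sha512

original = b"contents of the shared file"  # placeholder data
lookup_key, verify_key = fingerprints(original)

print(verify(original, verify_key))               # the genuine file passes
print(verify(b"something else", verify_key))      # a different file does not
```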
The DHT can be used for many things, including consolidating a blockchain and preventing a takeover, or the fragmentation of the network into a few large islands and many isolated islets, since blocking 100% of the Internet would be very complicated. At worst, a country can block all Internet addresses of other countries to cut the bridges and create an islet, but in practice this is very difficult because of VPNs and the various ways of passing along a routing address or a mandatory service that allows interconnection. There are also many encryption schemes that hide what we do on a private connection transiting over the Internet through various intermediaries, chosen according to router load, etc.
A DHT is therefore quite generic. It can be used to consolidate any peer-to-peer protocol, and even the routing of the Internet itself (the links that let several distinct networks locate a routing path through announcements, which are themselves digitally signed). Indeed the Internet itself, even IPv4 alone, is a peer-to-peer network, just like the entire DNS system: everything works through delegations of trust and multiple possible paths, with the aim of consolidating the structure so that if one network link fails we can still find alternative routes, assess them, see which work best, and cause the fewest duplicate transmissions over the possible alternative paths.
The concept of the DHT is therefore very valuable: it is difficult to break the network and isolate parts of it, because alternative routes are easy to find. The DHT network also optimizes itself, even though the DHT is not very fast to respond: the paths followed by DHT searches make many "hops". In practice, though, 64 hops at about 50 ms each give a very acceptable response time of just over 3 seconds to find something precious, because unique, in a dataset of millions (I should say billions of billions) of very different elements of different types, known across billions of objects connected to the Net. It also takes very little bandwidth, because we do not have to query each of the billions of hosts on the network by connecting to every one, which would take much longer.
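The arithmetic behind that estimate is easy to check. In the short Python sketch below, the 64 hops and 50 ms come from the paragraph above; the one billion hosts queried strictly one after another is an assumption added purely for comparison.

```python
HOPS = 64                # figure from the post
HOP_LATENCY_MS = 50      # figure from the post
HOSTS = 10**9            # assumption: a billion hosts, asked one after another

dht_lookup_ms = HOPS * HOP_LATENCY_MS      # 3200 ms, i.e. 3.2 seconds
exhaustive_ms = HOSTS * HOP_LATENCY_MS     # one round trip per host, sequentially

print(f"DHT lookup: {dht_lookup_ms / 1000} s")
print(f"Asking every host in turn: {exhaustive_ms / 1000 / 86400:.0f} days")
```

A real client would query many hosts in parallel, so the second figure is only an upper bound, but the gap in the number of messages sent remains.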
In principle, then, the DHT is much faster and more reliable, and it requires no permanent session: there is no need for TCP to interconnect its peers, and queries are made with very small isolated datagrams. A DHT member that receives a datagram can simply ignore it, without doing anything else, if the datagram does not share enough bits with the unique keys being searched for, and if it does respond, it does not need to wait for an acknowledgment from the searcher. It can, however, answer the same search several times if a response was lost along the way, provided the same DHT peer does not ask too frequently, for example with a minimum of 30 seconds to 1 minute before the same response is retransmitted.
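The "enough common bits" test can be sketched as a comparison of key prefixes. The Python below is one possible reading of that rule, not the wire format of any real DHT: the function names, the two-byte keys and the `min_bits` threshold are all hypothetical.

```python
def shared_prefix_bits(a: bytes, b: bytes) -> int:
    """Count the leading bits that two equal-length keys have in common."""
    if len(a) != len(b):
        raise ValueError("keys must have the same length")
    diff = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return len(a) * 8 - diff.bit_length()

def handle_datagram(searched_key: bytes, known_keys, min_bits: int = 8):
    """Return the known keys that match closely enough, or None to stay silent."""
    matches = [k for k in known_keys
               if shared_prefix_bits(searched_key, k) >= min_bits]
    return matches or None

known = [b"\xff\x80", b"\x10\x00"]              # keys this member knows about
print(handle_datagram(b"\xff\x00", known))      # shares 8 leading bits with the first key
print(handle_datagram(b"\x00\x00", known))      # too far from both keys: no reply is sent
```

Because the member keeps no per-sender state between datagrams, ignoring a query costs it almost nothing, which is the point made above about not needing a session.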
Based on a DHT, one can therefore build a protocol as reliable as TCP, with very good delivery guarantees, even across a giant network that is very unstable and heterogeneous in the speed and throughput of its individual connections. The DHT itself does not require cryptography, and anyone can join it. Even a hacker who wants to sabotage it will not succeed; on the contrary, he will strengthen it. We do not need to know "who we are talking to" on a DHT: every lie eventually isolates those who do not follow the normal rules by which the DHT is built and relegates them far away in the network. The DHT easily detects "spoofing", for example, or "man-in-the-middle" attacks, because it constrains each member to meet the highest standards; a member that does not is relegated to a greater distance and quickly becomes more and more unreachable and unused. The DHT requires trust, but first and foremost it enforces honesty, which is not easy to achieve with traditional P2P, including IP routing protocols and routing announcements, where someone can still easily divert traffic and "silence" certain parts of the IP network.
The DHT is an excellent building block in terms of efficiency; it is very economical. A DHT can be connected over IPv4 and IPv6 simultaneously, or over any other protocol; there will always be an intermediary that understands and adapts the protocols. Although the DHT is normally designed to be connected in "datagram" mode, generally UDP, some of its peers may use reliable transports such as TCP, or a serial port, a USB hub, a PCI bus, or shared memory addresses within a host; it does not matter, the DHT works over a very heterogeneous network.