Author Archives: ldao

1. BitTorrent clients

     If you want to download a large Blu-ray movie file in its original format at your maximum sustained download speed, sooner rather than later you will have to consider peer-to-peer file transfer, most notably BitTorrent.

     To use the BitTorrent protocol, you need a BitTorrent client: a computer program designed for peer-to-peer file sharing that follows the messages and rules defined by the protocol.

     In a nutshell, the BitTorrent protocol segments a file into many smaller chunks of a specified size and coordinates the transfer of these chunks among computer users, called peers, who are connected in one or more large global networks called swarms. The chunks are transferred in no particular order, depending on which participating peers have them available. At any moment, one peer can act as a server and send a particular chunk of data to thousands of other peers, while simultaneously receiving data from those same peers acting as servers. So it is quite possible that, in a normal file transfer, you hold the entire movie except for one small chunk somewhere in the middle until just before the download completes.
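
     The chunking idea above can be illustrated with a minimal sketch. It is not taken from any particular client: the 4-byte piece size, the function names, and the simulated arrival order are all assumptions made for the demonstration (real torrents use far larger pieces). Each piece is checked against a SHA-1 digest before the file is rebuilt, which is how out-of-order delivery can still yield an exact copy.

```python
import hashlib
import random

PIECE_SIZE = 4  # assumed demo value; real torrents use much larger pieces


def split_into_pieces(data: bytes, piece_size: int) -> list[bytes]:
    """Cut the data into fixed-size pieces; the last one may be shorter."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


def reassemble(received: dict[int, bytes], piece_hashes: list[bytes]) -> bytes:
    """Verify every piece against its SHA-1 digest, then join them in index order."""
    if len(received) != len(piece_hashes):
        raise ValueError("download incomplete")
    for index, piece in received.items():
        if hashlib.sha1(piece).digest() != piece_hashes[index]:
            raise ValueError(f"piece {index} failed verification")
    return b"".join(received[i] for i in range(len(piece_hashes)))


original = b"The quick brown fox jumps over the lazy dog"
pieces = split_into_pieces(original, PIECE_SIZE)
piece_hashes = [hashlib.sha1(p).digest() for p in pieces]

# Simulate pieces arriving from different peers in arbitrary order.
arrival_order = list(range(len(pieces)))
random.Random(0).shuffle(arrival_order)

received: dict[int, bytes] = {}
for index in arrival_order:
    received[index] = pieces[index]

rebuilt = reassemble(received, piece_hashes)
```

     Because each piece carries its own index and digest, the order in which peers deliver pieces has no effect on the rebuilt file.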

     Popular BitTorrent clients such as BitComet, BitTorrent, uTorrent, qBittorrent, Tixati, Transmission, and Vuze, to name a few, are compiled programs developed in either C++ or Java.

     The BitTorrent protocol coordinates segmented file transfer among peers connected in a swarm. A BitTorrent client enables a user to exchange data as a peer in one or more swarms. From the user's perspective, a peer participating in a transfer is simply an Internet address with an associated port number. Tools are available to map a public Internet address to a city and country. A swarm is a collection of peers participating in the transfer of a particular piece of digital content. The mainline DHT and the Vuze DHT networks contain millions of swarms with hundreds of millions of peers worldwide at any moment.
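
     To make the "address plus port" view of a peer concrete, here is a small sketch that decodes the compact peer list format commonly returned by trackers, in which each IPv4 peer occupies 6 bytes: 4 for the address and 2 for the port in network byte order. The function name and the sample addresses (taken from documentation-only ranges) are illustrative assumptions.

```python
import ipaddress
import struct


def parse_compact_peers(blob: bytes) -> list[tuple[str, int]]:
    """Decode a compact IPv4 peer list: 4 address bytes + 2 port bytes per peer."""
    if len(blob) % 6 != 0:
        raise ValueError("compact peer list length must be a multiple of 6")
    peers = []
    for offset in range(0, len(blob), 6):
        # "!4sH" = network byte order, 4 raw bytes, unsigned 16-bit port
        raw_ip, port = struct.unpack_from("!4sH", blob, offset)
        peers.append((str(ipaddress.IPv4Address(raw_ip)), port))
    return peers


# Two made-up peers, using addresses reserved for documentation.
sample = (
    bytes([192, 0, 2, 10]) + (6881).to_bytes(2, "big")
    + bytes([198, 51, 100, 7]) + (51413).to_bytes(2, "big")
)
peers = parse_compact_peers(sample)
```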

     It is entirely possible to obtain a complete download from many peers who do not yet have the complete download themselves but do hold different fragments that together make up the complete content. In this sense, each peer in a swarm acts as both a client and a server, which is very different from the conventional client–server model.

     The BitTorrent protocol is, after all, just one of several standards for transferring files over the Internet as efficiently as possible. It neither cares about nor is able to recognize whether the material circulating in its many swarms is copyrighted or in the public domain. In the United States, copyright infringement is a serious offense under federal law. Specifically, violation of the United States Copyright Act (17 U.S.C. §§ 101-1332) is the typical claim in copyright infringement lawsuits against BitTorrent users.

     Not all BitTorrent clients are open-source software. Open source means that anyone can build the client from its published source code. For example, Vuze is open source and its Java source code is freely available. Building it is easier said than done, however, since converting the source code into its executable form is a major and difficult task. Once the program is in executable form, features can no longer be added or modified. Vuze supports a plugin architecture that allows the main program to be extended, but a compiled plugin likewise cannot be changed. Any change, major or minute, requires rebuilding the executable from the modified source code.

2. Legal implications

     The following explanation to the court, or one much like it, appears in many of these lawsuits. The BitTorrent protocol is a decentralized method of distributing data. Instead of relying on a central server to distribute data directly to individual users, the BitTorrent protocol allows individual users to distribute data among themselves by exchanging pieces of the file with each other to eventually obtain a whole copy of the file. When using the BitTorrent protocol, every user simultaneously receives information from and transfers information to one another. Peers are individual downloaders or distributors of a particular file. A swarm is a group of peers involved in downloading or distributing a particular file. A tracker is a server which stores a list of peers in a swarm. Each swarm is unique to a particular file.

     The BitTorrent protocol functions as follows:
     First, a user locates a small torrent file, typically from a traditional search engine or from a torrent index site. This file contains information about the files to be shared and about the tracker, the computer that coordinates the file distribution.

     Second, the user loads this torrent file into a BitTorrent client, which automatically attempts to connect to the tracker listed in the torrent file.

     Third, the tracker responds with a list of peers and the BitTorrent client connects to those peers to begin downloading data from and distributing data to the other peers in the swarm. When the download is complete, the BitTorrent client continues distributing data to other peers in the swarm until the user manually disconnects from the swarm or the BitTorrent client otherwise does the same.
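
     The first two steps above depend on reading the torrent file. Torrent files are stored in a simple encoding called bencode, and the sketch below shows a minimal decoder pulling the tracker address and piece information out of one. The decoder is an illustrative simplification with little error handling, and the sample torrent (tracker URL, file name, sizes, zero-filled piece hashes) is entirely made up.

```python
def bdecode(data: bytes, pos: int = 0):
    """Decode one bencoded value starting at pos; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":                       # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":                       # list: l<items>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":                       # dictionary: d<key><value>...e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            value, pos = bdecode(data, pos)
            result[key] = value
        return result, pos + 1
    if lead.isdigit():                     # byte string: <length>:<bytes>
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"invalid bencode at offset {pos}")


# A made-up, minimal single-file torrent: 1 MiB file, 256 KiB pieces,
# and four placeholder 20-byte piece hashes (all zeros).
sample_torrent = (
    b"d8:announce31:http://tracker.example/announce"
    b"4:infod6:lengthi1048576e4:name9:movie.mkv"
    b"12:piece lengthi262144e6:pieces80:" + bytes(80) + b"ee"
)

meta, end = bdecode(sample_torrent)
tracker_url = meta[b"announce"].decode()
info = meta[b"info"]
piece_count = len(info[b"pieces"]) // 20   # one 20-byte SHA-1 digest per piece
```

     With the tracker address in hand, a client would go on to contact the tracker and request the peer list described in the third step.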

     Torrent index sites are also targets of copyright infringement lawsuits. Because they host the torrents that facilitate the infringing activity, they are treated as accomplices even though the sites do not store any actual copyrighted material.

3. Overview of how a BitTorrent client works

     Internet bandwidth approaching gigabit speeds, high-quality encoding and compression of digital media, increasingly capable home computers at much lower cost, and efficient, low-power, small-footprint computers on a chip are some of the main factors behind the widespread adoption of peer-to-peer file sharing. Users can transfer one or more files from one computer to another across the Internet through various file transfer systems and file-sharing networks. Peer-to-peer file sharing differs from traditional downloading from a website server. In peer-to-peer sharing, you use a client program (not a typical web browser) to locate computers in the peer-to-peer network that have the file or movie you want. Because these are ordinary computers like yours, as opposed to traditional servers, they are called peers. In a nutshell, a flexible peer-to-peer client program must offer the following capabilities, with optional user-configurable parameters:

     a. You run peer-to-peer file-sharing software on your computer and send out a request for the specific file or content you want to download. The request is typically a search word or phrase, the partial name of a known file, or an identifier that uniquely specifies the desired content without relying on additional information from any other site. BitTorrent uses a small torrent file, or simply a 20-byte hash code, to uniquely identify the content on its network.

     b. To locate the file, the software on your computer queries other computers that are connected to the Internet and running file-sharing software that uses the same message protocols as yours. There are two large BitTorrent networks: the mainline DHT and the Vuze DHT. The mainline DHT can have several hundred million users worldwide at a time. The Vuze DHT network is much smaller but can still have millions of users at a time, and its searches return far more relevant hits.

     c. When the software finds one or more computers that have either the complete file or parts of the file you want on their local storage, the download begins. The BitTorrent protocol breaks the file into many smaller chunks, and these chunks can be transferred in any order. It is therefore possible to download a complete file successfully from a group of computers that each have only parts of it. Many users abuse the protocol through asymmetric transfer (receiving far more than they send), so the peer-to-peer file-sharing software must give users ways to balance, limit, or otherwise adjust such transfer behavior.

     d. Besides the visible Internet, the peer-to-peer file-sharing software must offer access to content available on networks outside it, such as I2P and Tor, as well as to what other peers are currently downloading.
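
     The 20-byte hash code mentioned in item a can be sketched as follows. In the standard BitTorrent specification this identifier is the SHA-1 digest of the bencoded "info" dictionary of a torrent, and it is the value carried in a magnet link. The encoder below is a minimal illustration, and the info dictionary contents are invented for the example.

```python
import hashlib


def bencode(value) -> bytes:
    """Minimal bencode encoder for ints, bytes, str, lists, and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, str):
        return bencode(value.encode("utf-8"))
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


# Invented info dictionary: 1 MiB file, 256 KiB pieces, placeholder hashes.
info = {
    "name": "movie.mkv",
    "length": 1048576,
    "piece length": 262144,
    "pieces": bytes(80),
}

info_hash = hashlib.sha1(bencode(info)).digest()      # the 20-byte identifier
magnet = "magnet:?xt=urn:btih:" + info_hash.hex()     # hex form used in magnet links
```

     Because the hash is computed from the content description itself, any two peers holding the same info dictionary derive the same identifier without consulting a third party.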

4. The need for a scripting engine

     No matter how well designed a BitTorrent client is, after a certain period of use specific needs will arise from the way the program is used. For example, the Vuze BitTorrent client displays on its status bar the number of peers currently available on the Vuze DHT network, and also on the mainline DHT network if the mainline DHT plugin is installed. Like most BitTorrent clients, Vuze has two independent parts: the engine or API (Application Programming Interface) part that calculates the number of peers, and the GUI (Graphical User Interface) part that displays this number on the status bar.

     If the engine part is a component of a scripting engine, or its components are made available to external programs, the number of peers can be obtained without dealing with the GUI part of the client program at all. In this scenario, a scripting engine can support any interactive requirement in a true scripting environment such as Python, or even in a headless environment such as a web server. This is possible because every feature and parameter displayed by the GUI component can be configured from the scripting environment without modifying the source code and recompiling or rebuilding the client.
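
     The separation described above can be sketched in a few lines. Everything here is hypothetical: the `Engine` and `StatusBar` classes, their methods, the peer counts, and the "0 means unlimited" convention are invented for illustration and do not correspond to the actual Vuze or Azureus API. The point is only that a script can query and configure the engine directly, with the display layer being optional.

```python
from dataclasses import dataclass, field


@dataclass
class Engine:
    """Hypothetical engine: tracks state and knows nothing about display."""
    peers_by_network: dict = field(default_factory=dict)
    max_upload_kbps: int = 0  # assumed convention: 0 means unlimited

    def update_peer_count(self, network: str, count: int) -> None:
        self.peers_by_network[network] = count

    def total_peers(self) -> int:
        return sum(self.peers_by_network.values())


class StatusBar:
    """Hypothetical GUI layer: only formats what the engine reports."""

    def __init__(self, engine: Engine) -> None:
        self.engine = engine

    def render(self) -> str:
        return f"DHT peers: {self.engine.total_peers()}"


# Headless use: a script talks to the engine with no GUI involved.
engine = Engine()
engine.update_peer_count("mainline", 1200)   # made-up numbers
engine.update_peer_count("vuze", 300)
engine.max_upload_kbps = 50                  # configure without rebuilding anything
total = engine.total_peers()

# The GUI, when present, is just another consumer of the same engine.
status_text = StatusBar(engine).render()
```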

5. The Azureus engine

     The Azureus engine was developed for the Java platform. It was first released in June 2003 on SourceForge.net, mostly as an experiment with the Standard Widget Toolkit from Eclipse. It is used by Vuze, one of the most popular BitTorrent clients. The Azureus engine was released under the GNU General Public License and remains a free, open-source application. It offers the following major benefits:

     1. A platform-independent engine to support the development of BitTorrent applications. The engine can work with or without a graphical user interface (GUI). A popular BitTorrent client such as Vuze uses the Eclipse Standard Widget Toolkit (SWT) as the graphical user interface that communicates with the Azureus engine. Freed from the GUI, the Azureus engine can provide all the essential features of a typical GUI in a headless or scripting environment, with a much smaller footprint and lower demands on system hardware resources.

     2. Since the Azureus engine was developed as a Java application, all public functions of its Application Programming Interface (API) and utilities can be accessed directly by scripting in the Python language using Jython, an implementation of Python that runs on the Java platform. A complete, functional BitTorrent environment can be built quickly using standard interactive Python features together with the robust Java run-time libraries and the Python standard libraries.

     3. Besides the obvious advantage of not having to reinvent the wheel when deploying a new BitTorrent application, since all essential functions and tools are already provided by the Azureus engine, the Java and Python platforms provide vast and robust resources for meeting additional network, storage, and database requirements.

6. Design concept and requirements of the btScript environment

     This project offers a Python scripting environment for building a general-purpose BitTorrent search engine. The engine monitors users currently active in one or more transfer activities on two large public BitTorrent networks. The Python scripting environment is in beta testing and is planned for release in late 2020. Since the script is written in Python, it is cross-platform (Windows, Linux, etc.) and can be used with a relational database engine to build a private torrent index site. Running btScript on 4 low-cost Linux computers can collect one million unique torrents in a matter of days.

     BitTorrent has established itself as a robust, fast, and efficient protocol for transferring large files across peer-to-peer networks. In trackerless DHT networks, the use of trackers or special servers to locate the desired files is optional. There are two major trackerless networks in use: the mainline DHT and the Vuze DHT. The mainline DHT uses the LibTorrent (LT) messaging protocol. The Vuze DHT network supports both the LibTorrent (LT) and the Azureus (AZ) messaging protocols. The mainline DHT network is much larger than the Vuze DHT network and can have as many as 200 million peers active at any moment. The Vuze DHT network is much smaller but can still have as many as 1 million users at a time.

     Most BitTorrent clients support only the LibTorrent message protocol, which enables peer-to-peer data transfer on the mainline DHT network. Only a few BitTorrent clients (Vuze, BiglyBT, Transmission) support both message protocols and can access the mainline DHT and the Vuze DHT networks at the same time. The Vuze message protocol allows the titles of the torrents held by each peer potentially participating in a transfer to be seen by any other peer in the network. The addresses of up to 50 active peers currently associated with a particular peer can also be seen. This capability makes possible features such as “Related Content” in the Vuze client, as well as the automatic detection of new content from peers participating in a BitTorrent swarm, without relying on external torrent sites or any other type of search engine.

     Because of the huge and diverse volume of digital content in many languages available on the DHT networks, both public and copyrighted, it is desirable to have a general-purpose torrent search engine that finds new and recent content without relying on traditional torrent index sites, search engines, or RSS feeds. The search engine can monitor specific peers without participating in any content transfer. Over time, finding or indexing new content by monitoring the download queues of a specific group of active peers, and then requesting data from those peers, will yield more relevant and satisfactory results. This should be a significant feature of a torrent search engine, because the top hits returned by most if not all traditional search engines, which are based on categories from index sites or feeds, often do not match the user's interests or are simply unavailable to download.

     Vuze is the only BitTorrent client that offers automatic discovery of new content, via its “Swarm Discoveries” and “Related Contents Management” features. This makes Vuze unique among the many outstanding BitTorrent clients, because it gives users a mechanism for finding new content that does not involve visiting a torrent site. The mechanism finds content to download based on the activity of other BitTorrent users who have downloaded the same or similar content as you have. In other words, the “Swarm Discoveries” feature helps BitTorrent users anonymously relate one piece of downloaded content to another.

     Users have always looked for simple ways to get more content that they might enjoy: new material similar to what they have already downloaded. Searching traditional search engines or existing torrent discovery sites tends not to offer anything beyond a small set of predetermined categories, such as format or genre. Most users are looking for something that works across many similar sites while still being able to focus on things that may be of interest.

     “Swarm Discoveries” is based on the concept of related content. In a nutshell, if you are currently downloading torrents A, B, and C, and two or more users in the swarm are downloading torrents A, B, D and A, B, E, then there is a strong likelihood that you would be interested in torrents D and E as related content. The amount of related content grows quickly, since at any moment millions of people are exchanging content via the mainline and Vuze DHT networks.
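
     The A, B, C example above can be expressed as a short sketch. This is not Vuze's actual algorithm; it is one plausible way to score related content under the assumption that a peer counts as "similar" when it shares at least two torrents with you. The function name, the `min_shared` threshold, and the scoring rule are all illustrative choices.

```python
from collections import Counter


def related_content(mine: set[str], peers: list[set[str]],
                    min_shared: int = 2) -> list[tuple[str, int]]:
    """Suggest torrents held by peers whose downloads overlap with ours.

    A peer contributes only if it shares at least min_shared torrents
    with us (an assumed threshold); each suggestion is scored by the
    total overlap of the peers that hold it.
    """
    scores: Counter = Counter()
    for peer_torrents in peers:
        shared = mine & peer_torrents
        if len(shared) >= min_shared:
            for title in peer_torrents - mine:
                scores[title] += len(shared)
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


my_torrents = {"A", "B", "C"}
swarm_peers = [
    {"A", "B", "D"},   # shares A and B with us, also has D
    {"A", "B", "E"},   # shares A and B with us, also has E
    {"C", "F"},        # shares only C: below the threshold, so F is ignored
]
suggestions = related_content(my_torrents, swarm_peers)
```

     Here D and E are suggested, each with a score of 2, while F is dropped because the peer holding it overlaps with us on only one torrent.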

     Since Vuze is open-source software, its core design and plugin modules can be expanded and modified to suit our needs. Taking the related-content feature of “Swarm Discoveries” to the next level, by automating the detection and acquisition of related content from the DHT network, makes it possible to build a local database that acts as a private content discovery site in a matter of days.
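
     A local database of the kind described above might look like the following sketch, which uses Python's built-in `sqlite3` module. The table layout, column names, and helper functions are assumptions made for illustration, not the schema btScript uses, and the sample rows are invented.

```python
import sqlite3


def open_index(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a small torrent index keyed by info hash."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS torrents (
               info_hash  TEXT PRIMARY KEY,
               title      TEXT NOT NULL,
               size_bytes INTEGER,
               seen_count INTEGER NOT NULL DEFAULT 1
           )"""
    )
    return conn


def record_sighting(conn: sqlite3.Connection, info_hash: str,
                    title: str, size_bytes: int) -> None:
    """Insert a newly discovered torrent, or bump its counter if already known."""
    cur = conn.execute(
        "UPDATE torrents SET seen_count = seen_count + 1 WHERE info_hash = ?",
        (info_hash,),
    )
    if cur.rowcount == 0:
        conn.execute(
            "INSERT INTO torrents (info_hash, title, size_bytes) VALUES (?, ?, ?)",
            (info_hash, title, size_bytes),
        )
    conn.commit()


def search(conn: sqlite3.Connection, keyword: str) -> list[tuple[str, int]]:
    """Return (title, seen_count) rows whose title contains the keyword."""
    return conn.execute(
        "SELECT title, seen_count FROM torrents "
        "WHERE title LIKE ? ORDER BY seen_count DESC, title",
        (f"%{keyword}%",),
    ).fetchall()


index = open_index()
record_sighting(index, "aa" * 20, "Ubuntu 20.04 ISO", 2_700_000_000)
record_sighting(index, "bb" * 20, "Debian 10 netinst ISO", 350_000_000)
record_sighting(index, "aa" * 20, "Ubuntu 20.04 ISO", 2_700_000_000)  # seen again
hits = search(index, "ISO")
```

     Keying the table on the info hash keeps each torrent unique no matter how many peers report it, while the counter gives a rough popularity signal for ranking search results.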

     btScript is a group of convenience functions written in Python that seamlessly integrate the Java and Python platforms by exposing the application programming interface (API) of the open-source Azureus engine to the Python scripting environment. Developers no longer need to design, develop, and test existing or added features of the Azureus engine on the Java platform.

     All features of the Azureus engine including plugins can be accessed and developed strictly in the interactive Python scripting environment using btScript. This capability enables the rapid development and testing of new BitTorrent applications since the Azureus engine is already known as a stable, robust, and time-tested engine for BitTorrent applications.

     The Azureus engine is the only platform that makes torrents on the visible Internet (clearnet) available on I2P and vice versa via its plugin architecture. If the user adds a torrent from I2P, it will be seeded on both I2P and the clearnet, and if a user adds a torrent from the clearnet, it will be seeded on both the clearnet and I2P. Therefore, torrents previously published only on I2P are now available to the entire visible Internet, and users of I2P can download any torrent on the Internet while maintaining the anonymity of I2P.