Initially written 09 Mar 2002
Updated 2003-03-01
Compared to a client-server connection, peer-to-peer can be many times more efficient. Instead of centralizing every servent's transfers on one server, users can transfer a file from someone who already has it or is currently transferring it. Bandwidth is spread across all users, which makes P2P a very exciting technology. However, current P2P implementations lack one very important aspect: decent searching.
It makes sense to concentrate on searching P2P networks, as actual transfer of files is trivial compared to searching. This document is not concerned with downloading technologies, such as swarming, parallel downloading, and resuming.
Metadata is the specification of an individual file, used in searches. General metadata includes:
Filename: most sharing programs make use of the filename, though some, like Freenet, use a hash instead. Common filename patterns are artist - title and artist - album - track no - title.
Size: given in bytes, it can be used to estimate the time required to download. Some servents, like WinMX 3.0, are able to download from users who have not finished downloading the file, so you'll see things like "29% of 5,464,064".
Hash: allows the program to quickly determine whether files of the same size are identical, without comparing them byte by byte. Blubster uses MD5, but other P2P networks are using SHA1 (which is superior).
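The use of a hash to tell whether same-sized files are identical can be sketched as follows. This is a minimal illustration using SHA1 from Python's standard library, not how any particular servent does it; the function names `sha1_of_file` and `group_identical` are hypothetical.

```python
import hashlib
import os
from collections import defaultdict


def sha1_of_file(path, chunk_size=64 * 1024):
    """Return the hex SHA1 of a file, read in chunks to bound memory use."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def group_identical(paths):
    """Group files that share both size and SHA1 (hypothetical helper).

    Size is compared first because it is free to obtain from the
    filesystem; only files whose size collides are actually hashed.
    """
    by_size = defaultdict(list)
    for p in paths:
        by_size[os.path.getsize(p)].append(p)

    groups = defaultdict(list)
    for size, same_size in by_size.items():
        if len(same_size) < 2:
            continue  # a unique size cannot have an identical twin
        for p in same_size:
            groups[(size, sha1_of_file(p))].append(p)

    return [g for g in groups.values() if len(g) > 1]
```

Checking the size first means most files never need to be read at all; the hash only settles the cases where sizes match.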
Three types of metadata are:
Gained from the filesystem, for example Filename and Size.
From within the file's contents; most metadata falls in this category.
Stored in a database somewhere separate from the file, so editing it does not affect the contents of the file. FastTrack's keywords and description are examples of this.
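As an illustration of metadata gained from the filesystem alone, the common filename patterns mentioned earlier (artist - title and artist - album - track no - title) can be split into fields. This is a rough sketch; the function name `parse_audio_filename` and the returned dictionary keys are assumptions, and real filenames are far messier than this handles.

```python
import os


def parse_audio_filename(filename):
    """Guess metadata from an 'artist - title' style filename (hypothetical)."""
    stem, _ext = os.path.splitext(os.path.basename(filename))
    parts = [p.strip() for p in stem.split(" - ")]

    if len(parts) == 2:
        artist, title = parts
        return {"artist": artist, "title": title}
    if len(parts) == 4:
        artist, album, track, title = parts
        return {"artist": artist, "album": album, "track": track, "title": title}

    # Unknown pattern: fall back to treating the whole name as the title.
    return {"title": stem}
```

A name that fits neither pattern is returned as a bare title rather than guessed at, since a wrong artist field is worse for searching than a missing one.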
Audio is the killer file type for P2P networks. Napster pioneered audio sharing, and some newer sharing programs, such as the 23-million-user Audiogalaxy and the ever-growing Blubster, only allow sharing of audio.
ID3v1 tag, exactly 128 bytes long, located at the very end of the file.
ID3v2 tag (specifications), relevant fields include:
Protocol | Filename | Artist | Title | Bytes | Length | Bitrate | Frequency | Hash |
---|---|---|---|---|---|---|---|---|
Gnutella | X | X | - | - | - | - | - | - |
Blubster | X | - | - | X | X | X | X | X |
OpenNap | X | - | - | X | X | X | X | X |
Audiogalaxy | - | X | X | X | X | X | - | - |
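Fields such as Artist and Title in the table can be read from within the file. A minimal sketch of reading the 128-byte ID3v1 tag from the end of a file follows. The field layout (3-byte "TAG" marker, 30-byte title, artist and album, 4-byte year, 30-byte comment, 1-byte genre) is the standard ID3v1 layout; the function name `parse_id3v1` and the Latin-1 decoding are assumptions made for the example.

```python
import struct

ID3V1_SIZE = 128
# "TAG", title, artist, album, year, comment, genre index = 128 bytes
ID3V1_LAYOUT = "<3s30s30s30s4s30sB"


def _clean(raw):
    """Cut at the first NUL and strip padding spaces."""
    return raw.split(b"\x00", 1)[0].decode("latin-1").strip()


def parse_id3v1(data):
    """Return ID3v1 fields from the last 128 bytes, or None if no tag."""
    if len(data) < ID3V1_SIZE:
        return None
    marker, title, artist, album, year, comment, genre = struct.unpack(
        ID3V1_LAYOUT, data[-ID3V1_SIZE:]
    )
    if marker != b"TAG":
        return None
    return {
        "title": _clean(title),
        "artist": _clean(artist),
        "album": _clean(album),
        "year": _clean(year),
        "comment": _clean(comment),
        "genre": genre,  # numeric index into the ID3v1 genre list
    }
```

In practice only the last 128 bytes of the file need to be read, for example by seeking to `-128` from the end before calling the parser.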
Reverse-engineered protocol used by Napster. Specification.
As shown above, Napster includes:
Gnutella was once the most metadata-lacking network (as shown above), but servents are now beginning to add and recognize metadata on shared files. Yet most are limited to:
BearShare, however, adds:
LimeWire can optionally use XML metadata.
King of metadata. This protocol has capabilities to store and transmit tons of information about many types of media and information. All file types can have:
Audio:
Document:
Image:
Other:
Software:
Video:
All:
Audio (broken):
Book:
Image:
Video:
magnet:?xt=urn:bitprint:SHA1&dn=filename - magnet-uri project, Bearshare, Xolox, Shareaza
gnutella://urn:bitprint:SHA1
urn:sha1:SHA1 - linked to by Bitzi
ed2k: - eDonkey2000, based on MD4, the precursor to MD5
sig2dat:/// "UUHash" - FastTrack, first 300 KB is hashed with MD5, rest is custom
http://bitzi.com/lookup/SHA1 - publishes metadata, used by BearShare, Limewire, Shareaza, Xolox, Acquisition, Mutella, Atomwire, FreeAmp (music player)
Audio part hash - allows files with different ID tags but the same audio data to be grouped together (very good). Hashing is essential to swarming.
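The audio part hash and the link formats above can be combined in a small sketch. It assumes the only tag to ignore is a trailing ID3v1 tag (a real implementation would also need to skip ID3v2 data at the start of the file), and it uses the urn:sha1 form with the SHA1 written in Base32. The function names are hypothetical.

```python
import base64
import hashlib
from urllib.parse import quote


def strip_id3v1(data):
    """Drop a trailing 128-byte ID3v1 tag if one is present."""
    if len(data) >= 128 and data[-128:-125] == b"TAG":
        return data[:-128]
    return data


def audio_part_sha1(data):
    """Base32 SHA1 of the audio data only (assumption: ID3v1 is the only tag)."""
    digest = hashlib.sha1(strip_id3v1(data)).digest()
    return base64.b32encode(digest).decode("ascii")  # 20 bytes -> 32 chars


def magnet_link(sha1_b32, filename):
    """Build a magnet link of the form magnet:?xt=urn:sha1:...&dn=..."""
    return "magnet:?xt=urn:sha1:" + sha1_b32 + "&dn=" + quote(filename)
```

Two copies of a song whose tags were edited differently get the same audio part hash, so a search can group them as one result even though their whole-file hashes differ.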
Modified Sun Mar 25 08:48:47 2007
generated Sun Mar 25 08:56:33 2007
http://jeff.tk/p2p/search.html