ThesisArchive/SecondHalf2002
(via decentralization) Melc writes: "We have started development on our CP2PC uberclient file-sharing application. As part of the CP2PC project we have developed a minimal programming interface (API) to peer-to-peer file-sharing systems. Now, based on this API, we have started building a file-sharing uberclient that will provide seamless access to multiple file-sharing networks from a single client."
gnutella search | http://www.grouter.net/gnutella/search.htm
(via decentralization) Serguei Osokine writes: "I've just published a study that can be of some interest to the people who think about the ways to improve the P2P network search performance."
[december 5, 2002]
Paul Ford's use of the Google News Search results for java as an example of why metadata is necessary (http://www.ftrain.com/links_sub_Semantic_Web.html) gave me the idea to create a new page here to keep track of other similar examples. I don't disagree that implicit metadata is powerful and useful, but it doesn't solve every problem and that's what WhyMetadata attempts to illustrate. (full disclosure: I've got a vested interest since U-P2P is built around explicit metadata)
[november 24, 2002]
A project that allows users to easily publish websites into Freenet. Interesting because Freenet URLs (URIs?) are location independent and theoretically persistent (you still have to update your site daily for it to be accessible!). From a search perspective this is interesting because it implements link structure in an "exact identifier" system, and that link structure could be used to create Google-style indices. The same applies equally well to DHTs. Of course, it only works for hypertext documents, and it requires the spider to have an entry point and the link network to be well connected.
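To make the last caveat concrete, here is a minimal sketch of a spider walking a link graph from a single entry point. The page names and the `links` dictionary are hypothetical; a real spider would fetch pages from Freenet (or a DHT) by key and parse their links rather than read them from memory.

```python
from collections import deque

# Hypothetical link graph: page key -> keys it links to.
# Page "D" links into the graph, but nothing links to "D".
links = {"A": ["B"], "B": ["C"], "C": [], "D": ["A"]}

def crawl(links, entry_point):
    """Breadth-first walk returning every page reachable from entry_point."""
    seen = {entry_point}
    queue = deque([entry_point])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

reachable = crawl(links, "A")
```

Starting from "A" the spider never discovers "D", which is why an index built this way depends on both the choice of entry point and how well connected the link network is.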
exact identifiers / search in dhts | http://www.employees.org/~alokem/thesis/notes-11-23-2002.txt
Some notes about why exactly "exact-identifier" systems (e.g. Freenet, Chord, CAN, etc.) don't support "partial match" search. Search requires access to metadata that is meaningful to humans. The part of these systems that is problematic for search is that file identifiers are opaque (hashes) or that the file contents are opaque (encrypted). If the requirement for anonymity / privacy is relaxed (i.e. if human-readable filenames and file contents were also maintained) then there is nothing to stop implementation of traditional metadata search strategies on top of these stores. The more challenging question is how to accommodate search within the architecture.
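A small sketch of the problem, under assumptions: a plain dictionary stands in for the distributed store, SHA-1 of the filename stands in for the opaque identifier, and the filenames are made up. None of this is the actual key scheme of Freenet, Chord, or CAN.

```python
import hashlib

def exact_key(name: str) -> str:
    """Opaque identifier: SHA-1 of the full name (illustrative choice of hash)."""
    return hashlib.sha1(name.encode("utf-8")).hexdigest()

# Toy key-value store standing in for a DHT: lookups need the exact key.
store = {}
for name in ["beatles - yesterday.mp3", "beatles - help.mp3"]:
    store[exact_key(name)] = name

def lookup(name: str):
    """Exact-identifier lookup: succeeds only for the complete, exact name."""
    return store.get(exact_key(name))

def partial_match(term: str, readable_names):
    """Only possible if human-readable names are kept alongside the hashes."""
    return [n for n in readable_names if term in n]

exact_hit = lookup("beatles - help.mp3")   # found: the name hashes to a stored key
partial_miss = lookup("beatles")           # None: the hash of a fragment matches nothing
relaxed = sorted(partial_match("beatles", store.values()))
```

The hash of "beatles" has no useful relationship to the hash of either full filename, so a partial query cannot be routed by key. Once readable names are retained, as in `partial_match`, ordinary metadata search works again; what remains open is where that readable index lives in the architecture.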
userv papers | http://www.almaden.ibm.com/cs/people/bayardo/userv/plugins/plugin.html
(via slashdot) P2P sharing of web applications (plugins) written in Java.
"One important feature of the PluginAPI is a suite of methods that allow the plugin code to query the set of all (private or public) files accessible on the current site, and to obtain a file handle and/or URL to the content. The plugin can also query the site owner's name, the site's domain, the location of the shared web folder, the name assigned to the plugin, and so on. This state allows plugins to be implemented so that they work properly on any YouServ site within which they are installed, without this sort of information having to be explicitly configured by the user."
http://www-db.stanford.edu/~bawa/Pub/usearch.pdf - "Make it Fresh, Make it Quick — Searching a Network of Personal Webservers"
Distributed search engine for "YouServ" personal webservers. Users connect to an individual peer through a web interface (e.g. Bob can use Alice's YouServ instance to search the network). Each peer indexes its local collection, creating an inverted index that associates each keyword with a set of documents. Information about which keywords have matches is transferred to a central "registrar", which maintains a mapping between keywords and the IP addresses of peers that have advertised matches for that keyword. The mapping is done using Bloom filters: a peer generates a bitmap indicating which keywords it supports and sends the bitmap to the registrar, and the peer's IP address is added to the bucket for each set bit. Searches can be restricted to a group of peers; users maintain the group definitions.
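The registrar scheme described above can be sketched roughly as follows. This is not the YouServ implementation: the bitmap size `M`, the hash count `K`, the salted SHA-1 hashing, and the class and function names are all assumptions made for illustration.

```python
import hashlib

M = 64  # bitmap size in bits (hypothetical; a real system would use far more)
K = 3   # number of hash functions per keyword (hypothetical)

def bit_positions(keyword: str):
    """Derive K bit positions for a keyword from salted SHA-1 digests."""
    return [
        int.from_bytes(hashlib.sha1(f"{i}:{keyword}".encode()).digest()[:4], "big") % M
        for i in range(K)
    ]

def make_bitmap(keywords) -> int:
    """Peer side: set the bits for every locally indexed keyword."""
    bitmap = 0
    for kw in keywords:
        for pos in bit_positions(kw):
            bitmap |= 1 << pos
    return bitmap

class Registrar:
    """Central side: one bucket of peer IPs per bit position."""

    def __init__(self):
        self.buckets = [set() for _ in range(M)]

    def register(self, ip: str, bitmap: int):
        # Add the peer's IP to the bucket of every bit set in its bitmap.
        for pos in range(M):
            if (bitmap >> pos) & 1:
                self.buckets[pos].add(ip)

    def candidates(self, keyword: str):
        """Peers whose bitmap has all of the keyword's bits set."""
        positions = bit_positions(keyword)
        result = set(self.buckets[positions[0]])
        for pos in positions[1:]:
            result &= self.buckets[pos]
        return result

registrar = Registrar()
registrar.register("10.0.0.1", make_bitmap(["p2p", "search"]))
registrar.register("10.0.0.2", make_bitmap(["gnutella"]))
```

A peer that advertised a keyword always appears in `candidates`, but because different keywords can share bits, the result may also include peers that do not hold it. The querying peer still has to contact each candidate and run the search against its local inverted index.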
More YouServ papers here: http://www.almaden.ibm.com/cs/people/bayardo/userv/
[november 21, 2002]
[november 19, 2002]
Order form. Table of contents is here: http://computer.org/proceedings/p2p/1810/1810toc.htm.
[november 13, 2002]
[october 17, 2002]
"Pond" prototype source. Project based at Berkeley run by John Kubiatowicz. More info here: http://oceanstore.cs.berkeley.edu/
"OceanStore is a global persistent data store designed to scale to billions of users. It provides a consistent, highly-available, and durable storage utility atop an infrastructure comprised of untrusted servers."
conference | http://www2003.org/cfp.htm
WWW 2003 - Budapest Hungary - 20-24 May 2003 - paper submission deadline november 15 - 8-10 pages - template: http://www2003.org/www2003-submission.doc
Possible tracks: Search and Data Mining, Semantic Web
todo |
[october 5, 2002]
Some papers of interest...
http://research.microsoft.com/sn/Herald/papers/tr-2002-48.pdf - "Overlook: Scalable Name Service on an Overlay Network", Marvin Theimer, Michael B. Jones, April 2002, Technical Report, MSR-TR-2002-48
Pastry as a basis for an "internet-scale" (fast update, resilient to flash crowds) naming service.
http://www.ececs.uc.edu/~mjovanov/thesis/thesis.html - "Modeling Large-scale Peer-to-Peer Networks and a Case Study of Gnutella", Masters Thesis, Mihajlo A. Jovanovic, June 2001
Modelled topology and latency of Gnutella network. Includes Java code for a Gnutella simulation "gnutsim".
http://dbpubs.stanford.edu:8090/pub/2002-13 - "Designing a Super-Peer Network" - Yang, Beverly; Garcia-Molina, Hector, 22 February 2002
Good description of "super-peer" networks, identifies important parameters for efficient network operation.
http://www.research.microsoft.com/~antr/PAST/location.pdf - "Topology-aware routing in structured peer-to-peer overlay networks", Miguel Castro, Peter Druschel, Y. Charlie Hu, Antony Rowstron, submitted for publication February 2002, Technical Report MSR-TR-2002-82
How to take into account IP-level proximity in the overlay network.
http://www.research.microsoft.com/~antr/PAST/ring.pdf - "One ring to rule them all: Service discovery and binding in structured peer-to-peer overlay networks", M. Castro, P. Druschel, A-M. Kermarrec and A. Rowstron, SIGOPS European Workshop, France, September, 2002
Talks about distributed inverted index (keyword: loc1, loc2, loc3) for search. Also idea of "universal ring" - a bootstrap for service discovery.
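A rough sketch of the distributed inverted index idea from the last paper: each keyword is hashed onto the overlay, and the node responsible for that point stores the keyword's location list. Everything here is an assumption for illustration, including the node names, the 32-bit identifier space, and the successor-style ownership rule (Pastry itself routes to the numerically closest node identifier).

```python
import hashlib
from bisect import bisect_left

def ring_id(text: str) -> int:
    """Map a node name or keyword to a 32-bit point on the ring."""
    return int.from_bytes(hashlib.sha1(text.encode()).digest()[:4], "big")

class ToyRing:
    """In-memory stand-in for a structured overlay holding an inverted index."""

    def __init__(self, node_names):
        self.nodes = sorted((ring_id(n), n) for n in node_names)
        self.index = {n: {} for n in node_names}  # per-node keyword -> locations

    def owner(self, keyword: str) -> str:
        """First node at or after the keyword's point, wrapping around the ring."""
        ids = [nid for nid, _ in self.nodes]
        i = bisect_left(ids, ring_id(keyword)) % len(self.nodes)
        return self.nodes[i][1]

    def publish(self, keyword: str, location: str):
        self.index[self.owner(keyword)].setdefault(keyword, []).append(location)

    def search(self, keyword: str):
        return self.index[self.owner(keyword)].get(keyword, [])

ring = ToyRing(["node-a", "node-b", "node-c"])
ring.publish("gnutella", "loc1")
ring.publish("gnutella", "loc2")
hits = ring.search("gnutella")
```

Each keyword's entry lives on exactly one node, so a single-keyword query is one lookup. Multi-keyword queries would need the location lists from several nodes to be intersected, which this sketch does not attempt.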
[october 1, 2002]
Submissions due: 25 October 2002 Notification of Acceptance: 20 December 2002 Camera-ready copy due: 15 January 2003 Workshop: 20-21 February 2003
iris project | http://www.ddj.com/documents/s=7338/ddj1033245969026/
Article from Dr. Dobb's about IRIS - a project to develop "a secure, fault-tolerant, distributed system for data storage" based on Distributed Hash Tables (DHTs). They just got $12 million in funding over the next five years from the NSF. Most of the quotes in this article are from Frans Kaashoek, who was one of the researchers behind Chord. The IRIS project website is here: http://iris.lcs.mit.edu.
The article also includes this interesting link: http://www.planet-lab.org - an Intel Research (http://www.intel-research.net) backed project to create "a global testbed for developing and accessing new network services"
Another IRIS article here: http://www.newscientist.com/news/news.jsp?id=ns99992861 (no mention of its difficulties with search?)
conference | http://wwwteo.informatik.uni-rostock.de/DASD/
Neal writes: "There's a call for papers for the Design, Analysis and Simulation of Distributed Systems (DASD) conference that is due on October 18th (three weeks) if y'all are interested. It's in Orlando, Florida and the CFP is here: http://wwwteo.informatik.uni-rostock.de/DASD/."
[september 28, 2002]
Project somewhat along the lines of U-P2P to develop a framework for plugging together different peer-to-peer components to form applications. Basically, specifying interfaces for the components that allow them to interoperate without specifying the inner workings. It does seem to be standardizing a search interface though: http://tristero.sourceforge.net/search-java.html. (involves Brandon Wiley, one of the original Freenet programmers as well as Sam "Neurogrid" Joseph and Aaron "W3C" Swartz)
overview of p2p meta-data search | http://www.neurogrid.net/Decentralized_Meta-Data_Strategies-neat.html
Excellent and comprehensive overview of some of the existing approaches to distributed meta-data search. The techniques he lists are:
Bloom Filter, Semantic Routing, Reputation Learning, Query Spaces, Trust Metrics, Query Forwarding, Distributed Hash Tables, Caching
A U-P2P user could choose which one he wanted by plugging in a different Peer Network Adapter. Might be fruitful to discuss how each would work in U-P2P framework.
p2p radio | http://www.openp2p.com/pub/a/p2p/2002/09/24/p2pradio.html
Two recent pieces of software allow users to host and share their own radio stations. Peercast: http://www.peercast.org/ Streamer: http://www.chaotica.u-net.com/streamer.htm
Some implementation information from Peercast's website:
The client software uses the Gnutella 0.6 protocol, but is not connected to the Gnutella file share network. It works in much the same way as other Gnutella clients except that instead of downloading files, the users download streams. These streams are then exchanged in real-time with other users. No data is stored locally on any machine connected to the network.
The client software has the ability to serve web pages to normal browsers such as Mozilla and Internet Explorer. This means that people on your LAN can search for and listen to channels without having to install the client software on their PC. Offices can have one PeerCast client providing audio streams to the entire LAN. Or you can set up a private network with your friends on the Internet to listen to music. It's your choice about whether you connect directly to the PeerCast network or not.
[august 12, 2002]
"Developers use the Buddy Script SDK released Monday to write interactive agents that mine information from databases in a corporate network or on the Internet. End users at a company can then add a new contact, or "buddy," to their instant messaging software that makes use of those agents to retrieve answers to questions, company officials said." (via Professor E.)
[august 2, 2002]
This might be useful for quickly converting our up2p objects into XML. Need to download it and give it a try.
[july 21, 2002]
Starts out with the traditional "taxonomy of p2p" section then mentions a few apps: