Give P2P a Chance
Why you should be using peer-to-peer networks to share your data.
Remember peer-to-peer (P2P) networking? It's the software technology that incurred the wrath of the entertainment industry for its use in pirating copyrighted material. But P2P isn't bad; it exists simply to share data, a mission that dovetails nicely with science's collaborative ethos. With P2P you can create pooled image libraries, disseminate genomic-scale datasets, and share publications, patient data, and poster presentations.

You could do that from your lab's home page, too, of course, but P2P has an edge over traditional Web pages. Should your host computer go down, lab priorities change, or funding falter, the site could be lost. You also might lack the bandwidth to accommodate a popular site, or the hardware to host a large site. That's why P2P is so useful: All you need to deliver files via P2P is a desktop computer. Simply download the application, select the files you want to share, and let the software do the rest.

In traditional client-server transactions, you (the client) request data from a database (the server), which then delivers that content back to you. With only two parties to the conversation, you have no recourse should the server go down. And if the transferred file is large enough, your one request could grind the entire system to a halt.

In P2P networks, every machine is both client and server. There is no central repository; each user can request files from other machines and deliver them to its peers in turn. This arrangement relieves bandwidth bottlenecks while also improving stability: By duplicating key files among peers, you keep them available even if their primary host goes offline.

One P2P system designed expressly for academia is Pennsylvania State University's LionShare (http://lionshare.its.psu.edu), whose first public release was expected in June. Employing a mixed P2P/client-server architecture, LionShare blends traditional P2P functionality with traditional academic data repositories.

"So the real value added in LionShare," says Mike Halm, senior strategist for e-learning technologies at Penn State, "is that you're doing a federated search over all resources being shared, plus a variety of learning resource repositories around the world, so you get rich results coming back to you when you do a query."

Unlike traditional P2P networks, LionShare places a heavy emphasis on security and user identification, says Halm. Users choose which files on their hard drives will be shared, describe them using metadata tags, and then broadcast those descriptions across the network. Private data remains private, but even public files can be restricted to a specific audience, such as class participants.

LionShare offers a relatively sophisticated and secure form of P2P, but that won't necessarily ensure wide adoption in the scientific community. To do that, P2P software must:

1. Enable scheduled searching: Searching each day to see if anything new and interesting has been added to the network gets old fast. Use RSS feeds to advise users of new content, and allow filtering to show only files matching certain keywords.
2. Grant write permission: Give trusted colleagues the ability not only to access (read) files, but also to edit them remotely. That would simplify the maintenance of consortium-wide spreadsheets. Version control would let owners roll back unwanted changes.
3. Develop OS-specific versions: As a Java application, LionShare can run on any OS, but it's also slow. OS-specific versions would be faster, and it would then be possible to browse the P2P network as just another folder in your file system.
4. Address copyright issues: Give trusted individuals authorization to oversee the network and enforce copyright laws. Also, give each file an associated license, so it's clear who owns it and how it may be used.
5. Integrate with other P2P networks: Enable users to search across different P2P networks. Otherwise, they'll have to run different clients and de-duplicate the resulting hit lists in order to perform a comprehensive search.

The final key: Evangelize. P2P is useless without a large pool of users to sustain it. So get out there and remind your peers: Share and share alike.
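
For readers who want to see what the keyword filtering in item 1 and the de-duplication in item 5 might involve, here is a minimal Python sketch. It is not LionShare code: the `Hit` record, its field names, and the `merge_hits` function are hypothetical, and it assumes that every network reports a content hash for each file so that copies can be matched across networks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One search result; all field names here are hypothetical."""
    network: str     # label of the P2P network that returned the hit
    filename: str
    digest: str      # content hash, assumed comparable across networks
    tags: frozenset  # metadata keywords attached by the file's owner


def merge_hits(hit_lists, keywords=()):
    """Combine hit lists, keep keyword matches, and group copies by digest."""
    wanted = {k.lower() for k in keywords}
    merged = {}
    for hits in hit_lists:
        for hit in hits:
            tags = {t.lower() for t in hit.tags}
            if wanted and not (wanted & tags):
                continue  # none of the requested keywords is among the tags
            # Copies of the same file share a digest, so they share one entry.
            merged.setdefault(hit.digest, []).append(hit)
    return merged
```

Calling `merge_hits` with one hit list per network returns a dictionary with one entry per distinct file, each listing every network that holds a copy. Leaving `keywords` empty skips the filter and keeps every hit.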
Give p2p a chance ... ? by Rudolf Potucek [Comment posted 2006-07-14 14:27:11]
This article presents an odd understanding of the difference between P2P and more traditional methods such as webpages, ftp servers and, god forbid, paper.
P2P has classically been touted for its speed of access, a claim to fame that is important in a market where all the sources of the data have only limited bandwidth - not really a problem for a major research institution - and in circumstances where caching of the information on the web is problematic, e.g. due to copyright violations - again not an issue in voluntary sharing of scientific information. A claim, also, that only holds during periods of high popularity of the desired data, e.g., immediately after the release of a movie on DVD. In contrast to classical web data, however, P2P tends to perform miserably when only one copy of the data is available, a situation that should be considered typical for old, yet valid, data.
More importantly, however, there is a purpose to sharing data that have been carefully selected and edited by the source, e.g. in the form of a paper or a webpage, rather than carelessly, meaninglessly, shared on the hard drive. I have gigabytes worth of images of cells but they would be useless without knowing anything else about the cells. In contrast, a single image with the correct caption can tell the story of years of research.
Sharing information is work, and it is much appreciated that someone thought about how to make it simpler. However, there is a purpose to the editorial process, and any tool that aims to make information exchange more efficient should strive to retain that editorial process.

Give P2P a Chance by Ambrin Fatima [Comment posted 2006-07-06 10:25:34]
P2P sharing, the idea is indeed majestic, but this world is no Utopia. In the present state, where plagiarism is not restricted to the entertainment industry but by default can be a part of scientific research (e.g. process patents), little can be said. No doubt this might be a pessimistic approach, but the question is one of practicality.
First and foremost, do we have an equal relationship with every ally in the scientific community? If the answer is no, then how do we manage the time and energy needed to control the transfer of information from personal hard drives, depending on what each relationship permits? Secondly, with the increasing hue and cry for confidentiality, especially in clinical trials, and growing consumer rights, sharing patient data via P2P needs reconsideration for relative risk. Lastly, who all will be part of P2P sharing? Is the pharmaceutical industry ready, given the amount of information it is holding and the rate at which it is producing more? One of the main goals of P2P sharing is to expedite and channelise scientific research, contributing importantly to fast drug approvals, but the key question is: where does academia stand in this race?