Disclaimer: the images here are taken from a Microsoft article, so they’re probably copyrighted by Microsoft. They’re not mine. :P
About a year ago, Microsoft quietly announced that they were developing a peer-to-peer networking protocol. It’s not a file sharing program, mind you; it’s a network routing protocol. It’s called, unimaginatively, Windows Peer to Peer Networking. Since late March 2004, I’ve been working on implementing this protocol. I’d like to create an interoperable library for use on other operating systems, such as Mac OS X and Linux, as well as Windows.
The proof-of-concept app for this protocol is Threedegrees, a social interaction program that lets people send instant messages, share a browser and a music player, and interact in various other ways. More importantly, this protocol will be a fundamental part of Indigo, the networking infrastructure in the Vista platform.
In a nutshell, Windows P2P uses a distributed hash table (DHT) for name resolution, with some interesting quirks. It includes decent mechanisms for creating, maintaining, and broadcasting to small groups of peers. It uses IPv6 as its underlying transport, with 6to4 and Teredo for traversing NATs and the IPv4 Internet. It also includes security and authentication. All of these are solidly designed compared to related work. Most of the new ideas are in the DHT implementation, which they call the Peer Name Resolution Protocol (PNRP).
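To make the "distributed hash table for name resolution" idea concrete, here is a minimal sketch in Python. It is a generic toy, not PNRP's actual algorithm or wire format: the 32-bit ID space, the SHA-1 hash, and the names `name_to_id`, `ring_distance`, `closest_node`, and `ToyDHT` are all assumptions made for illustration. The one idea it shows is that a name hashes to a numeric ID, and the record lives on whichever node has the numerically closest ID.

```python
import hashlib

ID_BITS = 32                 # toy ID size, chosen for readability (an assumption)
ID_SPACE = 2 ** ID_BITS


def name_to_id(name: str) -> int:
    """Hash a peer name into the numeric ID space."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:ID_BITS // 8], "big")


def ring_distance(a: int, b: int) -> int:
    """Distance between two IDs, treating the ID space as a circle."""
    diff = abs(a - b)
    return min(diff, ID_SPACE - diff)


def closest_node(target_id: int, node_ids) -> int:
    """Return the node ID nearest to target_id (ties go to the lower ID)."""
    return min(node_ids, key=lambda n: (ring_distance(n, target_id), n))


class ToyDHT:
    """All nodes live in one process here; a real DHT spreads them over a network."""

    def __init__(self, node_ids):
        # Each node keeps its own small table of {key ID: address}.
        self.stores = {node_id: {} for node_id in node_ids}

    def publish(self, name: str, address: str) -> int:
        """Store the record on the node closest to the name's ID."""
        key = name_to_id(name)
        owner = closest_node(key, self.stores)
        self.stores[owner][key] = address
        return owner

    def resolve(self, name: str):
        """Ask the node closest to the name's ID; None if nothing is published."""
        key = name_to_id(name)
        owner = closest_node(key, self.stores)
        return self.stores[owner].get(key)


if __name__ == "__main__":
    dht = ToyDHT([10, 2 ** 30, 2 ** 31, 3 * 2 ** 30])
    dht.publish("alice", "2001:db8::1")
    print(dht.resolve("alice"))   # the address published above
    print(dht.resolve("bob"))     # None, nothing was published for bob
```

The big simplification is that `closest_node` sees every node ID at once. In a real DHT each peer only knows a handful of others, so a lookup hops from peer to peer, getting numerically closer to the target each time.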
There are lots of technical details in the white paper, the API docs, and the FAQ. (These pictures are taken from the white paper; they’re Microsoft’s, not mine.) Amusingly, there’s also a lot of deep technical information in Microsoft’s patent filing for PNRP, including some details that aren’t published anywhere else. Unfortunately, the network protocol itself is proprietary, so I need to reverse engineer it from network traces obtained with a sniffer.
This is where I am right now. I’ve written a decent amount of code before, but I’ve done more or less zero reverse engineering of network protocols. Furthermore, there aren’t a lot of resources on how to do it. It’s definitely been done before: examples include instant messaging protocols such as Yahoo, OSCAR (which AIM and ICQ now use), and MSN. However, the write-ups I’ve found describe the protocols themselves, not the methodologies used to reverse engineer them. If you know of good resources on reverse engineering network protocols, or you’d like to help, let me know!
As an aside, I got sidetracked from the reverse engineering one day, and wrote an implementation of the PNRP multi-level cache. You can download the code and a unit test. (The multi-level cache is specifically covered by their patent, but this code isn’t in the context of name resolution, so I’m not too worried. :P)
Finally, reverse engineering like this usually is (and should be) legal. It is expressly permitted under the right of “fair use” established by copyright law. Recent rulings have eroded this right with respect to software; specifically, they have held that software publishers can withhold it in “shrinkwrap” or click-through licenses. I don’t know if Microsoft’s license for Windows P2P forbids reverse engineering, but I’m not inclined to find out.