Additional Readings in Distributed and Cluster Computing

Note: I have hard copies of references that are not listed as links.

Background

  1. End-To-End Arguments in System Design J.H. Saltzer, D.P. Reed and D.D. Clark. ACM Transactions on Computer Systems, 4(4):277-288, November 1984

Distributed Systems Overview

  1. Distributed Operating Systems Andrew S. Tanenbaum and Robbert Van Renesse; ACM Comput. Surv. 17, 4 (Dec. 1985), Pages 419 - 470.
  2. A Comparison of Two Distributed Systems: Amoeba and Sprite Fred Douglis, John K. Ousterhout, M. Frans Kaashoek, Andrew S. Tanenbaum
  3. Specifying graceful degradation in distributed systems, M. P. Herlihy and J. M. Wing. Proceedings Sixth ACM Symposium on Principles of Distributed Computing, Vancouver, British Columbia, Canada, 1987, pages 167-177.
  4. Distributed systems L. Kleinrock. Communications of the ACM 28(11):1200-1213, November 1985.
  5. Design and implementation of a distributed virtual machine for networked computers Emin Gun Sirer, Robert Grimm, Arthur J. Gregory, Brian N. Bershad (University of Washington), Proceedings of the 17th ACM Symposium on Operating Systems Principles (SOSP), Charleston, South Carolina, December, 1999: pp: 202-216
  6. Plan 9 From Bell Labs Rob Pike, Dave Presotto, Sean Dorward, Bob Flandrena, Ken Thompson, Howard Trickey, and Phil Winterbottom, Computing Systems, Vol 8 #3, Summer 1995, pp. 221-254.
    Plan 9 homepage
  7. "Introduction to Parallel Computing and Cluster Computers", by Dave Turner - Ames Laboratory

Clusters

  1. Cluster Computing at a Glance Mark Baker and Rajkumar Buyya, High Performance Cluster Computing, Volume 1, Chapter 1, Prentice Hall, 1999.
  2. High-Performance Computing: Clusters, Constellations, MPPs, and Future Directions Jack Dongarra, Thomas Sterling, Horst Simon, and Eric Strohmaier, IEEE Computing Volume 7 No. 2, March/April 2005
  3. What's Next in High Performance Computing Gordon Bell, Jim Gray, Communications of the ACM, Volume 45 Issue 2, February 2002
  4. A Case for NOW (Networks of Workstations) T. Anderson, D. E. Culler, D. A. Patterson, et al.
  5. Beowulf: A Parallel Workstation for Scientific Computation Donald J. Becker, Thomas Sterling, Daniel Savarese, John E. Dorband, Udaya A. Ranawake, Charles V. Packer, Proceedings, International Conference on Parallel Processing, 1995
  6. Scalable Cluster Computing with MOSIX for Linux Barak, La'adan, Shiloh, Proc. Linux Expo '99, pp. 95-100, Raleigh, N.C., May 1999.
  7. Introduction to Single System Imaging in Clusters Bruce J. Walker, Compaq.
  8. "GLUnix: A Global Layer Unix for a Network of Workstations", Douglas P. Ghormley, David Petrou, Steven H. Rodrigues, Amin M. Vahdat, Thomas E. Anderson, Software Practice and Experience.
  9. "Efficient, Portable, and Robust Extension of Operating System Functionality" Amin M. Vahdat, Douglas P. Ghormley, and Thomas E. Anderson, UC Berkeley Technical Report CS-94-842, December, 1994.
  10. File Systems for Clusters from a Protocol Perspective Braam, P.J. Second Extreme Linux Topics Workshop Jun. 1999, Monterey CA
  11. Resource Aware Cluster Computing James D. Teresco, Jamal Faik, Joseph E. Flaherty, IEEE Computing, March/April 2005 (Vol. 7, No. 2)

Distributed Communication

  1. The Design Philosophy of the DARPA Internet Protocols David D. Clark, Proceedings of the 1988 SIGCOMM Symposium, pp 106-114, Stanford, CA, August 1988.
  2. Architectural Considerations for a New Generation of Protocols D.D. Clark and D.L. Tennenhouse, In Proceedings of the 1990 SIGCOMM Symposium on Communications Architectures and Protocols, pp. 200-208, Philadelphia, PA, September 1990.
  3. A Brief History of the Internet Barry M. Leiner, Vinton G. Cerf, David D. Clark, Robert E. Kahn, Leonard Kleinrock, Daniel C. Lynch, Jon Postel, Larry G. Roberts, Stephen Wolff
  4. Ethernet: Distributed Packet Switching for Local Computer Networks, Robert M. Metcalfe and David R. Boggs, Communications of the ACM, Vol. 19, No. 5, July 1976 pp. 395 - 404
  5. Masking the Overhead of Protocol Layering Robbert van Renesse, Proceedings of the 1996 ACM SIGCOMM Conference, Stanford, September 1996
  6. MBone: The Multicast Backbone H. Eriksson, Communications of the ACM, 37(8):54-60, August 1994.
  7. "Building reliable, high-performance communication systems from components", Xiaoming Liu, Christoph Kreitz, Robbert van Renesse, Jason Hickey, Mark Hayden, Kenneth Birman, and Robert Constable (Cornell University), Proceedings of the 17th ACM Symposium on Operating Systems Principles (SOSP), Charleston, South Carolina, December, 1999: pp: 80-92
  8. "Building TCP/IP Active Messages" Lok Tin Liu, Alan Mainwaring, Chad Yoshikawa, Berkeley NOW Project White Paper, 1994.
  9. Improving The Performance of Distributed Applications Using Active Networks Ulana Legedza, David J. Wetherall, and John Guttag Appears in IEEE INFOCOM'98.
  10. Towards an Active Network D. Tennenhouse and D. Wetherall, ACM SIGCOMM CCR, Vol. 26, No. 2, April 1996.
  11. "High Level Programming for Distributed Computing", J. A. Feldman, Communications of the ACM, 22 6, June 1979, pp. 353-368.

  12. Intro to Socket Programming From University of Wisconsin
  13. Davin's collection of unix programming links Lots of links to Network programming references.

    Message Passing Libraries

  14. The PVM Concurrent Computing System: Evolution, Experiences, and Trends V. S. Sunderam, G. A. Geist, J. Dongarra, R. Manchek, ACM Journal of Parallel Computing, vol. 20 no. 4, 1994
  15. A message passing standard for MPP and workstations J. J. Dongarra, S. W. Otto, M. Snir, and D. Walker, CACM, 39(7), 1996, pp. 84-90
  16. "PVM: A Framework for Parallel Distributed Computing", V. S. Sunderam, Concurrency: Practice and Experience, 2, 4, pp 315--339, December, 1990.
  17. "A Users' Guide to PVM Parallel Virtual Machine", A. Beguelin, J. J. Dongarra, G. A. Geist, R. Manchek, and V. S. Sunderam, Oak Ridge National Laboratory, ORNL/TM-12187, September, 1994

    RPC

  18. Lightweight Remote Procedure Call B. N. Bershad, T. E. Anderson, E. D. Lazowska and H. M. Levy, ACM Transactions on Computer Systems 8, 1 (February 1990), 37-55
  19. Reflective Remote Method Invocation George K. Thiruvathukal, Lovely S. Thomas, and Andy T. Korczynski, ACM Java '98, Stanford University, Palo Alto, CA and Concurrency: Practice and Experience 1998.
  20. Implementing remote procedure calls Andrew D. Birrell and Bruce Jay Nelson, ACM Transactions on Computer Systems, 2(1):39-59, February 1984.
  21. "Performance of the Firefly RPC" M. D. Schroeder and M. Burrows, ACM Trans. on Computer Systems, 8 1, February 1990, pp. 1-17.

    some on-line references:

    1. Whitepaper - RMI
    2. RMI

    Distributed Shared Memory

  22. Memory Coherence in Shared Virtual Memory Systems K. Li and P. Hudak. ACM Trans. Computer Systems Vol. 7, No. 4. Nov. 1989. pp. 321-359.
  23. TreadMarks: Distributed shared memory on standard workstations and operating systems A. Cox, S. Dwarkadas, P. Keleher, and W. Zwaenepoel. Proceedings of the Winter 94 Usenix Conference, USENIX Assoc., Berkeley, Calif. pp. 115-131
  24. "An Evaluation of Software Based Release Consistent Protocols" Pete Keleher, Alan L. Cox, Sandhya Dwarkadas, Willy Zwaenepoel, JPDC, 29(2), Sept. 1995, pp 126-141.
  25. "Towards Transparent and Efficient Software Distributed Shared Memory" D.J. Scales and K. Gharachorloo, 16th Symposium on Operating Systems Principles, Saint Malo, France, October 1997, pp. 157-169.
  26. "Cashmere-2L: Software Coherent Shared Memory on a Clustered Remote-Write Network", R. Stets, S. Dwarkadas, N. Hardavellas, G. Hunt, L. Kontothanassis, S. Parthasarathy, and M. Scott, 16th Symposium on Operating Systems Principles, Saint Malo, France, October 1997, pp. 170-183.
  27. Scalable fault-tolerant distributed shared memory, Florin Sultan, Liviu Iftode, Thu Nguyen, Proceedings of the 2000 ACM/IEEE conference on Supercomputing
  28. OpenMP: An Industry Standard API for Shared Memory Programming, Leonardo Dagum, Ramesh Menon , IEEE Computing in Science and Engineering, January-March 1998, Vol. 5, No. 1
  29. www.openmp.org

    Distributed Data Structures/Objects

  30. The S/Net's Linda Kernel N. Carriero and D. Gelernter, ACM Trans. on Computer Systems, 4 2, May 1986, pp. 110-129.
  31. Network Objects A. Birrell, G. Nelson, S. Owicki, and E. Wobber (DEC SRC). In Proceedings of the 14th ACM Symposium on Operating Systems Principles, pp. 217-230, Asheville, NC, December 1993.
  32. "Distributed Component Object Model (DCOM) Binary Protocol", Nat Brown and Charlie Kindel, Network Working Group Microsoft Corporation, May 1996
  33. "Bringing Distributed Objects to the World Wide Web" Ron I. Resnick
  34. "Generative communication in Linda" David Gelernter, ACM Trans. Program. Lang. Syst. 7, 1 (Jan. 1985), Pages 80 - 112
  35. Some on-line references:
    1. CORBA meets Java, JavaWorld October 1997
    2. OMG Homepage
    3. Distributed Object Computing with CORBA
    4. Concurrent Programming in Java

    Also look for CORBA and JavaSpaces documents on-line

Network RAM

  1. "Nswap: A Network Swapping Module for Linux Clusters", Tia Newhall, Sean Finney, Kuzman Ganchev, Michael Spiegel. In Proceedings of Euro-Par'03 International Conference on Parallel and Distributed Computing, Klagenfurt, Austria, August 2003.
  2. Implementing Global Memory Management in a Workstation Cluster Michael J. Feeley, William E. Morgan, Frederic H. Pighin, Anna R. Karlin, Henry M. Levy, Chandramohan A. Thekkath, 15th ACM Symposium on Operating Systems Principles, December, 1995.
  3. Implementation of a Reliable Remote Memory Pager Evangelos P. Markatos and George Dramitinos, USENIX 1996 Annual Technical Conference
  4. Parallel Network RAM: Effectively Utilizing Global Cluster Memory for Large Data-Intensive Parallel Programs John Oleszkiewicz, Li Xiao, Yunhao Liu, 2004 International Conference on Parallel Processing (ICPP'04), August 2004, pp. 353-360
  5. Adaptive memory allocations in clusters to handle unexpectedly large data-intensive jobs, Li Xiao, Songqing Chen, and Xiaodong Zhang IEEE Transactions on Parallel and Distributed Systems, Vol. 15, No. 7, 2004, pp. 577-592.
  6. Availability and Utility of Idle Memory in Workstation Clusters
  7. Incorporating Job Migration and Network RAM to Share Cluster Memory Resources
  8. Remote Memory Paging in Networks of Workstations
  9. The Network RamDisk: Using Remote Memory on Heterogeneous NOWs
  10. Memory Servers for Multicomputers
  11. Collaborative Memory Pool in Cluster System, Nan Wang, Xuhui Liu, Jin He, Jizhong Han, Lisheng Zhang, Zhiyong Xu, Proceedings of the 2007 IEEE International Conference on Parallel Processing
  12. Performance analysis of a user-level memory server, S. Pakin and G. Johnson, Cluster Computing, 2007 IEEE International Conference on Cluster Computing
  13. Remote Paging references

Event Ordering and Distributed State

  1. Time, clocks, and the ordering of events in a distributed system Leslie Lamport, Communications of the ACM, 21(7):558-565, July 1978.
  2. Distributed snapshots: determining global states of distributed systems K. Mani Chandy and Leslie Lamport; ACM Trans. Comput. Syst. 3, 1 (Feb. 1985), Pages 63 - 75
  3. The Role of Distributed State John K. Ousterhout, University of California Berkeley

Replication and Fault Tolerance and Recovery

  1. "Replicated Distributed Programs", E. C. Cooper, 10th ACM Symposium on Operating Systems Principles (SOSP), Orcas Island, WA, December 1985 pp. 63-78.
  2. "Replication and Fault-Tolerance in the ISIS System", K.P. Birman, 10th Symposium on Operating Systems Principles, Orcas Island, WA, December 1985
  3. "Reliable Communication in the Presence of Failures", K.P. Birman and T.A. Joseph, ACM Transactions on Computer Systems, 5 1, February 1987, pp. 47-76.
  4. Fundamentals of Fault-Tolerant Distributed Computing in Asynchronous Environments, Felix C. Gartner, ACM Computing Surveys, 31(1), March 1999
  5. Weighted voting for replicated data, D. K. Gifford. Proceedings Seventh ACM Symposium on Operating Systems Principles, Pacific Grove, California, December 1979, pages 150-162.
  6. Paxos Made Simple Leslie Lamport, November 2001
  7. "The Byzantine Generals Problem" L. Lamport, R. Shostak, and M. Pease, ACM Transactions on Programming Languages and Systems, 4 3, July 1982, pp. 382-401.
  8. Microreboot - A Technique for Cheap Recovery, George Candea, Shinichi Kawamoto, Yuichi Fujiki, Greg Friedman, and Armando Fox, Stanford University, OSDI'04
  9. Path-Based Failure and Evolution Management, Mike Y. Chen, University of California, Berkeley; Anthony Accardi, Tellme; Emre Kiciman, Stanford University; Dave Patterson, University of California, Berkeley; Armando Fox, Stanford University; Eric Brewer, University of California, Berkeley, NSDI'04

Distributed Coordination

  1. "Communicating Sequential Processes" C.A.R. Hoare, Communications of the ACM 21, 8, August 1978, pp. 666-677.
  2. Experiences with Processes and Monitors in Mesa Butler W. Lampson, David D. Redell, Communications of the ACM, 23 2, February 1980, pp. 105-117.
  3. Scheduler activations: effective kernel support for the user-level management of parallelism. Thomas E. Anderson, Brian N. Bershad, Edward D. Lazowska and Henry M. Levy; Proceedings of the thirteenth ACM symposium on Operating systems principles, 1991, Pages 95-109

Naming in Distributed Systems

  1. Development of the Domain Name System Paul V. Mockapetris, Kevin J. Dunlap, ACM SIGCOMM Computer Communication Review, Volume 25 Issue 1 January 1995
  2. "Grapevine: An Exercise in Distributed Computing", Andrew D. Birrell, Roy Levin, Roger M. Needham, Michael D. Schroeder, Communications of the ACM, 25 4, April 1982, pp. 260-274.
  3. Decentralizing a global naming service for improved performance and fault tolerance D. R. Cheriton and T. P. Mann. ACM Transactions on Computer Systems 7(2):147-183, May 1989.

Peer-to-Peer Systems

  1. Chord: a scalable peer-to-peer lookup protocol for internet applications Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, Hari Balakrishnan, IEEE/ACM Transactions on Networking (TON), Volume 11 Issue 1, February 2003
  2. Peer-to-peer: Making gnutella-like P2P systems scalable Yatin Chawathe, Sylvia Ratnasamy, Lee Breslau, Nick Lanham, Scott Shenker, Proceedings of the 2003 conference on Applications, technologies, architectures, and protocols for computer communications, August 2003
  3. Ivy: A Read/Write Peer-to-Peer File System Athicha Muthitacharoen, Robert Morris, Thomer M. Gil, and Benjie Chen, Proceedings of OSDI'02, 2002.
  4. "A Survey of Peer-to-Peer Storage Techniques for Distributed File Systems", Ragib Hasan, Zahid Anwar, William Yurcik, Roy Campbell, IEEE International Conference on Information Technology (ITCC), Las Vegas, NV, April 2005
  5. "Incentives Build Robustness in BitTorrent", Bram Cohen, 2003
  6. Wide-Area Cooperative Storage with CFS, Frank Dabek, M. Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, SOSP'01
  7. Storage Management and Caching in PAST, A Large-scale, Persistent Peer-to-peer Storage Utility, Antony Rowstron, Peter Druschel, SOSP'01
  8. An Analysis of Internet Content Delivery Systems Stefan Saroiu, Krishna P. Gummadi, Richard J. Dunn, Steven D. Gribble, and Henry M. Levy, University of Washington, Proceedings of USENIX OSDI 2002.
  9. Technical and social components of peer-to-peer computing: Extracting guarantees from chaos John Kubiatowicz, Communications of the ACM, Volume 46 Issue 2 February 2003
  10. A Scalable Content-Addressable Network Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, Scott Shenker (UCB), ACM SIGCOMM, 2001
  11. Scheduling and resource allocation: Samsara: honor among thieves in peer-to-peer storage Landon P. Cox, Brian D. Noble, Proceedings of the nineteenth ACM symposium on Operating systems principles October 2003
  12. Technical and social components of peer-to-peer computing: Looking up data in P2P systems Hari Balakrishnan, M. Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, Communications of the ACM, Volume 46 Issue 2, February 2003
  13. Distributed object location in a dynamic network Kirsten Hildrum, John D. Kubiatowicz, Satish Rao, Ben Y. Zhao, Proceedings of the fourteenth annual ACM symposium on Parallel algorithms and architectures, August 2002
  14. Finding Good Peers in Peer-to-Peer Networks Murali Krishna Ramanathan, Vana Kalogeraki, Jim Pruyne HP Laboratories Palo Alto

Distributed File and Storage Systems

  1. "Spritely NFS: experiments with cache-consistency protocols", V. Srinivasan and J. Mogul, Proceedings of the Twelfth ACM symposium on Operating Systems Principles, December 3 - 6, 1989, Litchfield Pk., AZ USA
  2. "Scalable, Secure, and Highly Available Distributed File Access", M. Satyanarayanan, IEEE Computer, May 1990, Vol. 23, No. 5
  3. A Case for Redundant Array of Inexpensive Disks (RAID) David A. Patterson, Garth Gibson, Randy H. Katz, ACM SIGMOD International Conference on Management of Data, 1988, pp. 109-116.
  4. "Zebra: A Striped Network File System" John Hartman and John Ousterhout, In the Proceedings of the USENIX Workshop on File Systems.
  5. "The Design and Implementation of a Log-Structured File System", Mendel Rosenblum and John K. Ousterhout, Proceedings of the Thirteenth ACM Symposium on Operating Systems Principles, 1991, Pages 1 - 15
  6. "Serverless Network File Systems", Tom Anderson, Michael Dahlin, Jeanna Neefe, David Patterson, Drew Roselli, Randy Wang, 15th Symposium on Operating Systems Principles, ACM Transactions on Computer Systems, 1995.
  7. The Google file system Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung, Proceedings of the nineteenth ACM symposium on Operating systems principles, October 2003
  8. Taming Aggressive Replication in the Pangaea Wide-Area File System, Yasushi Saito, Christos Karamanolis, Magnus Karlsson, and Mallik Mahalingam, HP Labs, Proceedings of OSDI'02
  9. Feasibility of a Serverless Distributed File System Deployed on an Existing Set of Desktop PCs W. Bolosky, J. Douceur, D. Ely, and M. Theimer. In SIGMETRICS, pages 34--43, 2000.
  10. Scale and Performance in a Distributed File System J. Howard, M. Kazar, S. Menees, D. Nichols, M. Satyanarayanan, R. Sidebotham, and M. West., ACM Transactions on Computer Systems, Vol. 6, No. 1, February 1988, pp. 51-81.
  11. The Sun Network Filesystem: Design, Implementation and Experience Russel Sandberg, Sun Microsystems, Inc.
  12. Design and Implementation of the Sun Network Filesystem Russel Sandberg, David Goldberg, Steve Kleiman, Dan Walsh, Bob Lyon, Sun Microsystems, Inc.
  13. "The LOCUS Distributed Operating System", Bruce Walker, Gerald Popek, Robert English, Charles Kline, Greg Thiel, 9th Symposium on Operating Systems Principles (SOSP), Bretton Woods, New Hampshire, November 1983, pp. 49-70.
  14. "Storage Management and Caching in PAST, A Large-scale, Persistent Peer-to-peer Storage Utility", Antony Rowstron, Peter Druschel, Proceedings of SOSP'01

Authentication and Security

  1. "The Internet Worm: Crisis and Aftermath" Eugene H. Spafford, Communications of the ACM, 32 (6): 678-687, June 1989
  2. "Encryption and Secure Computer Networks" Gerald J. Popek, Charles S. Kline, Computing Surveys, 11 4, December 1979, pp. 331-356.
  3. "Kerberos: an Authentication Service for Computer Networks", B. Clifford Neuman and Theodore Ts'o, IEEE Communications, 32(9):33-39, September 1994.
  4. "New directions in cryptography" Diffie, W.; Hellman, M., IEEE Transactions on Information Theory, Volume: 22, Issue: 6, Nov 1976
  5. "A Logic of Authentication", M. Burrows, M. Abadi, and R. Needham, 12th Symposium on Operating Systems Principles, Litchfield Park, AZ, December 1989, pp. 1-13.
  6. Decentralized user authentication in a global file system Michael Kaminsky, George Savvides, David Mazieres, M. Frans Kaashoek Proceedings of the nineteenth ACM symposium on Operating systems principles, October 2003
  7. "Using Encryption for Authentication in Large Networks of Computers" R. M. Needham and M. D. Schroeder, Communications of the ACM, 21 12, December 1978, pp. 993-999.
  8. How SSL Works, http://developer.netscape.com/tech/security/basics/index.html
  9. Exploring RSA Encryption in OpenSSL Linux Journal, September 25, 2003 by James Tandon
  10. Authentication in distributed systems: Theory and practice. B. Lampson, M. Abadi, M. Burrows, and E. Wobber. ACM Transactions on Computer Systems 10, 4 (Nov. 1992), pp 265-310.
  11. "Kerberos: An authentication service for open network systems." J. G. Steiner, B. C. Neuman, and J. I. Schiller In Proceedings of the Winter 1988 Usenix Conference, pages 191-201, February 1988
  12. Extensible Security Architectures for Java Dan S. Wallach, Dirk Balfanz, Drew Dean, and Edward W. Felten. 16th Symposium on Operating Systems Principles (Saint-Malo, France), October 1997
  13. Reflections on Trusting Trust, Ken Thompson, from Communications of the ACM, Vol. 27, No. 8, August 1984, pp. 761-763.

The Grid (Meta-Computing)

  1. The Anatomy of the Grid: Enabling Scalable Virtual Organizations, I. Foster, C. Kesselman, S. Tuecke, International Journal of Supercomputer Applications, 15(3), 2001.
  2. Grids, the TeraGrid, and Beyond, Daniel Reed, IEEE Computer, January 2003 (Vol. 36, No. 1)
  3. Grid Services for Distributed System Integration, I. Foster, C. Kesselman, J.M. Nick, S. Tuecke, IEEE Computer, Volume 35, Issue 6, June 2002 pp.37 - 46
  4. www.globus.org
  5. "Globus: A Metacomputing Infrastructure Toolkit", I. Foster and C. Kesselman, International Journal of Supercomputer Applications, 11(2):115-128, 1997.
  6. "Legion--A View From 50,000 Feet", Andrew S. Grimshaw and William A. Wulf, Proceedings of the Fifth IEEE International Symposium on High Performance Distributed Computing, IEEE Computer Society Press, Los Alamitos, CA, August 1996
  7. Javelin: Internet-Based Parallel Computing Using Java P. Cappello, B. O. Christiansen, Mihai F. Ionescu, M. O. Neary, K. E. Schauser. June 20, 1997 ACM Workshop on Java for Science and Engineering Computation, Las Vegas.
  8. WebOS: Operating System Services For Wide Area Applications Amin Vahdat, Thomas Anderson, Michael Dahlin, David Culler, Eshwar Belani, Paul Eastham, and Chad Yoshikawa. July 1998. The Seventh IEEE Symposium on High Performance Distributed Computing.

    Meta-computing Security

  9. "A Security Architecture for Computational Grids", I. Foster, C. Kesselman, G. Tsudik and S. Tuecke, Proc. 5th ACM Conference on Computer and Communications Security, pp. 83-92, 1998.
  10. "A Flexible Security System for Metacomputing Environments" Adam Ferrari, Frederick Knabe, Marty Humphrey, Steve Chapin, and Andrew Grimshaw, Proceedings of HPCN'99 (High-Performance Computing and Networking), April 1999, Amsterdam, The Netherlands.
  11. A New Model of Security for Metasystems Steve Chapin, Chenxi Wang, William Wulf, Fritz Knabe, Andrew Grimshaw, University of Virginia Technical Report CS-95-34, August 1995

Web Computing

  1. "Maintaining Strong Cache Consistency in the World-Wide Web", C. Liu and P. Cao, 17th International Conf. on Distributed Computing Systems, 1997.
  2. "On the scale and performance of cooperative Web proxy caching", Alec Wolman, Geoffrey M. Voelker, Nitin Sharma, Neal Cardwell, Anna Karlin, and Henry M. Levy, Proc. of the 17th ACM Symposium on Operating Systems Principles (SOSP '99), December 1999.
  3. "Design Considerations for Integrated Proxy Servers" S. Sahu, P. Shenoy, and D. Towsley, Proc. 9th IEEE Int'l. Workshop on Network and Operating System Support for Digital Audio and Video (NOSSDAV'99), June 1999, pp. 247-250.
  4. World-Wide Web Cache Consistency, Gwertzman, J., Seltzer, M., Proceedings of the 1996 Usenix Technical Conference, San Diego, CA January 1996.
  5. Websys Alec Wolman, Geoff Voelker, Nitin Sharma, Neal Cardwell, Molly Brown, Tashana Landray, Denise Pinnel, Anna Karlin, and Henry Levy.

    Security and the Web

  6. "Weaving a Web of Trust" Rohit Khare and Adam Rifkin (working draft of the version that appeared in the World Wide Web Journal, Summer 1997, Volume 2, Number 3, Pages 77-112).

    Web File Systems

  7. AFS and the Web: Competitors or Collaborators? M. Satyanarayanan and Mirjana Spasojevic. Proceedings of the Seventh ACM SIGOPS European Workshop, Connemara, Ireland September 1996
  8. WebFS: A Global Cache Coherent Filesystem Amin Vahdat, Paul Eastham, and Thomas Anderson. December 1996. Technical Draft. Computer Science Division, University of California Berkeley
  9. WebNFS: Filesystem for the Internet Brent Callaghan Sun Microsystems, Inc. technical report, April 1997

Scheduling

  1. "The Interaction of Parallel and Sequential Workloads on a Network of Workstations" Remzi H. Arpaci, Andrea C. Dusseau, Amin M. Vahdat, Lok T. Liu, Thomas E. Anderson, and David A. Patterson, SIGMETRICS 1995
  2. "Scheduling with Implicit Information in Distributed Systems" Andrea C. Arpaci-Dusseau, David E. Culler, Alan Mainwaring, Sigmetrics'98 Conference on the Measurement and Modeling of Computer Systems
  3. "A closer look at coscheduling approaches for a network of workstations" Shailabh Nagar, Ajit Banerjee, Anand Sivasubramaniam and Chita R. Das, Proceedings of the eleventh annual ACM symposium on Parallel algorithms and architectures, 1999, Pages 96 - 105
  4. Gang Scheduling
  5. Gang Scheduling on Clusters
  6. Self Scheduling in Clusters
  7. Task Scheduling on Clusters
  8. Workload Management, more than just job scheduling
  9. Adaptive Scheduling on Clusters

Process Migration, Load Balancing

  1. Process Migration Milojicic, Douglis, Paindaveine, Wheeler, Zhou 1999.
  2. Checkpoint and Migration of UNIX Processes in the Condor Distributed Processing System, Michael Litzkow, Todd Tannenbaum, Jim Basney, Miron Livny Computer Sciences Technical Report #1346, University of Wisconsin-Madison, April 1997
  3. "Deploying a High Throughput Computing Cluster", Jim Basney and Miron Livny, High Performance Cluster Computing, Rajkumar Buyya, Editor, Vol. 1, Chapter 5, Prentice Hall PTR, May 1999.
  4. "Process Migration in DEMOS/MP", M.L. Powell and B.P. Miller, 9th Symposium on Operating Systems Principles, Bretton Woods, NH, October 1983, pp. 110-119.
  5. "The Kangaroo Approach to Data Movement on the Grid", D. Thain, J. Basney, S.-C. Son, and M. Livny, Tenth IEEE Symposium on High Performance Distributed Computing (HPDC10), San Francisco, California, August 7-9, 2001
  6. Exploiting idle periods in Clusters
  7. References to online documents about process migration, checkpointing and load balancing

Performance Tools

  1. "The Paradyn Parallel Performance Measurement Tools", B. P. Miller, M. D. Callaghan, J. M. Cargille, J. K. Hollingsworth, R. B. Irvin, K. L. Karavanic, K. Kunchithapadam, and T. Newhall, IEEE Computer, Nov. 1995. 28(11), pp. 37-46.
  2. "Scalable Performance Analysis: The Pablo Performance Analysis Environment" D. A. Reed, R. A. Aydt, R. J. Noe, P. C. Roth, K. A. Shields, B. W. Schwartz, and L. F. Tavera, Proceedings of the Scalable Parallel Libraries Conference, A. Skjellum, Editor. 1993, IEEE Computer Society.

Distributed Database Management Systems

  1. "Transaction Management in the R* Distributed Database Management System" Mohan, Lindsay, and Obermark, TODS, 11(4), 1986
  2. The Dangers of Replication and a Solution, Gray, Helland, O'Neil, and Shasha, Proceedings of the ACM SIGMOD Conference, 1996
  3. Mariposa: A Wide-Area Distributed Database, Stonebraker et al., VLDB Journal 5, 1996
  4. Client-Server Paradise, D. DeWitt, J. Patel, J. Luo, and J. Yu, Proceedings of the 1994 VLDB Conference, Chile, August 1994.

Parallel Algorithms

Parallel Algorithms for Regular Architectures by Russ Miller and Quentin F. Stout, MIT Press, 1996.

Supercomputers

  1. An Overview of the BlueGene/L Supercomputer, The BlueGene/L Team, Proceedings of the IEEE SC2002, 2002.
  2. Top 500 list

Misc.

Does Systems Research Measure Up? Small, C., Ghosh, N., Saleeb, H., Seltzer, M., Smith, K., Harvard University Computer Science Technical Report TR-16-97, November 1997.