
Session One: Intro/Demo, lonc/d, Replication and Load Balancing (Gerd)

Fig. 1.1.1 – Overview of Network

Overview

Physically, the Network consists of relatively inexpensive upper-PC-class server machines which are linked through the commodity internet in a load-balancing, dynamically content-replicating and failover-secure way. Fig. 1.1.1 shows an overview of this network.

All machines in the Network are connected with each other through two-way persistent TCP/IP connections. Clients (B, F, G and H in Fig. 1.1.1) connect to the servers via standard HTTP. There are two classes of servers: Library Servers (A and E in Fig. 1.1.1) and Access Servers (C, D, I and J in Fig. 1.1.1). Library Servers are used to store all personal records of a set of users, and are responsible for their initial authentication when a session is opened on any server in the Network. For Authors, Library Servers also host their construction area and the authoritative copy of the current and previous versions of every resource that was published by that author. Library Servers can be used as backups to host sessions when all Access Servers in the Network are overloaded. Otherwise, for learners, Access Servers are used to host the sessions. Library Servers need to be strong on I/O, while Access Servers can generally be cheaper hardware. The Network is designed so that the number of concurrent sessions can be increased over a wide range by simply adding additional Access Servers before having to add additional Library Servers. Preliminary tests showed that a Library Server could handle up to 10 Access Servers running fully in parallel.

The Network is divided into so-called domains, which are logical boundaries between participating institutions. These domains can be used to limit the flow of personal user information across the Network, set access privileges, and enforce royalty schemes.

Example of Transactions

Fig. 1.1.1 also depicts examples of several kinds of transactions conducted across the Network.

An instructor at client B modifies and publishes a resource on her Home Server A. Server A has a record of all server machines currently subscribed to this resource, and replicates it to servers D and I. However, server D is currently offline, so the update notification gets buffered on A until D comes online again. Servers C and J are currently not subscribed to this resource.

Learners F and G have open sessions on server I, and the new resource is immediately available to them.

Learner H tries to connect to server I for a new session; however, the machine is not reachable, so he connects to another Access Server, J, instead. This server currently does not have all necessary resources locally present to host learner H, but subscribes to them and replicates them as they are accessed by H.

Learner H solves a problem on server J. Library Server E is H's Home Server, so this information gets forwarded to E, where the records of H are updated.

lonc/lond/lonnet

Fig. 1.1.2 elaborates on the details of this network infrastructure.

Fig. 1.1.2A depicts three servers (A, B and C) and a client who has a session on server C.

As C accesses different resources in the system, different handlers, which are incorporated as modules into the child processes of the web server software, process these requests.

Our current implementation uses mod_perl inside the Apache web server software. As an example, server C currently has four active web server child processes. The chain of handlers dealing with a certain resource is determined by both the server content resource area (see below) and the MIME type, which in turn is determined by the URL extension. For most URL structures, both an authentication handler and a content handler are registered.
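For illustration, a minimal mod_perl 1.x content handler of the kind described here could look as follows; the package name and the output are placeholders, not the actual LON-CAPA handler code. In httpd.conf, such a module would be attached to a URL pattern together with an authentication handler via the PerlAccessHandler and PerlHandler directives.

    package Apache::examplehandler;   # placeholder name, not a LON-CAPA module

    use strict;
    use Apache::Constants qw(:common);

    sub handler {
        my $r = shift;                          # Apache request object
        $r->content_type('text/html');          # MIME type chosen via the URL extension
        $r->send_http_header;
        return OK if $r->header_only;
        $r->print('<html><body>rendered resource</body></html>');
        return OK;                              # hand control back to Apache
    }

    1;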

Handlers use a common library, lonnet, to interact with both locally present temporary session data and data across the server network. For example, lonnet provides routines for finding the home server of a user, finding the server with the lowest loadavg, sending simple command-reply sequences, and sending critical messages such as a homework completion. For a non-critical message, the routines reply with a simple "connection lost" if the message could not be delivered. For critical messages, lonnet tries to re-establish connections, re-send the command, etc. If no valid reply could be received, it answers "connection deferred" and stores the message in buffer space to be sent at a later point in time. Failed critical messages are also logged.
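The retry-and-buffer behaviour for critical messages can be sketched roughly as follows; send_command() and the buffer file are hypothetical stand-ins, not lonnet's actual interface.

    use strict;

    sub send_command { return 'connection lost' }        # placeholder for the real transport

    sub send_critical {
        my ($cmd, $server) = @_;
        for my $attempt (1 .. 3) {
            my $answer = send_command($cmd, $server);    # simple command-reply sequence
            return $answer if defined $answer && $answer ne 'connection lost';
            sleep 1;                                     # then try to re-establish and re-send
        }
        # No valid reply: log the failure and buffer the message for later delivery.
        open(my $buf, '>>', "/tmp/delayed-$server") or die $!;
        print {$buf} "$cmd\n";
        close($buf);
        return 'connection deferred';
    }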

The interface between lonnet and the Network is established by a multiplexed UNIX domain socket, denoted DS in Fig. 1.1.2A. The rationale behind this rather involved architecture is that httpd processes (Apache children) dynamically come and go on the timescale of minutes, based on workload and number of processed requests. Over the lifetime of an httpd child, however, it has to establish several hundred connections to several different servers in the Network.

On the other hand, establishing a TCP/IP connection is resource-consuming for both ends of the line, and to optimize this connectivity between different servers, connections in the Network are designed to be persistent on the timescale of months, until either end is rebooted. This mechanism will be elaborated on below.

Establishing a connection to a UNIX domain socket is far less resource-consuming than establishing a TCP/IP connection. lonc is a proxy daemon that forks off a child for every server in the Network. Which servers are members of the Network is determined by a lookup table, of which Fig. 1.1.2B is an example. In order, the entries denote an internal name for the server, the domain of the server, the type of the server, the host name and the IP address.
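As a sketch, such a lookup table can be read into a hash keyed by the internal server name (the path is the one given with Fig. 1.1.2B below; the hash layout is illustrative):

    use strict;

    # Read the lookup table of Fig. 1.1.2B; the field order is as described above.
    my %hosts;
    open(my $tab, '<', '/home/httpd/lonTabs/hosts.tab') or die "hosts.tab: $!";
    while (my $line = <$tab>) {
        chomp $line;
        next if $line =~ /^\s*(?:#|$)/;                  # skip comments and blank lines
        my ($id, $domain, $type, $name, $ip) = split(/:/, $line);
        $hosts{$id} = { domain => $domain, type => $type,
                        name   => $name,   ip   => $ip };
    }
    close($tab);
    # e.g. $hosts{msul1}{name} is 'zaphod.lite.msu.edu'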

The lonc parent process maintains the population and listens for signals to restart or shut down, as well as for USR1. Every child establishes a multiplexed UNIX domain socket for its server and opens a TCP/IP connection to the lond daemon (discussed below) on the remote machine, which it keeps alive. If the connection is interrupted, the child dies, whereupon the parent makes several attempts to fork another child for that server.

When starting a new child (a new connection), first an init sequence is carried out, which includes receiving the information from the remote lond which is needed to establish the 128-bit encryption key – the key is different for every connection. Next, any buffered (delayed) messages for the server are sent.

In normal operation, the child listens to the UNIX socket, forwards requests to the TCP connection, gets the reply from lond, and sends it back to the UNIX socket. lonc also takes care of the encryption and decryption of messages.
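A heavily simplified sketch of this forwarding loop for a single lonc child is shown below; the socket path, host name and port are illustrative, and encrypt()/decrypt() are placeholders for the per-connection key handling.

    use strict;
    use IO::Socket::UNIX;
    use IO::Socket::INET;

    sub encrypt { return $_[0] }    # placeholder for the per-connection 128-bit encryption
    sub decrypt { return $_[0] }

    # One child: a multiplexed UNIX domain socket towards the local handlers,
    # one persistent TCP connection towards the remote lond.
    my $unix = IO::Socket::UNIX->new(Local  => '/home/httpd/sockets/exampleserver',
                                     Listen => 10) or die "UNIX socket: $!";
    my $tcp  = IO::Socket::INET->new(PeerAddr => 'remote.example.edu',
                                     PeerPort => 5663) or die "TCP connection: $!";
    while (my $client = $unix->accept()) {
        while (my $request = <$client>) {
            print {$tcp} encrypt($request);     # forward the request to the remote lond
            my $reply = <$tcp>;                 # wait for its reply
            print {$client} decrypt($reply);    # hand the reply back via the UNIX socket
        }
        close($client);
    }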

lonc was built by putting a non-forking multiplexed UNIX domain socket server into a framework that forks a TCP/IP client for every remote lond.

lond is the remote end of the TCP/IP connection and acts as a remote command processor. It receives commands, executes them, and sends replies. In normal operation, a lonc child is constantly connected to a dedicated lond child on the remote server, and the same is true vice versa (two persistent connections per server combination).

lond listens to a TCP/IP port (denoted P in Fig. 1.1.2A) and forks off enough child processes to have one for each other server in the Network plus two spare children. The parent process maintains the population and listens for signals to restart or shutdown. Client servers are authenticated by IP.
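A correspondingly simplified sketch of the command-processor role (one connection at a time, no forking, two example commands) might look like this; the port, the allowed-IP list and the command set are illustrative.

    use strict;
    use IO::Socket::INET;

    my %allowed = ('35.8.63.68' => 1);            # client servers taken from hosts.tab
    my $listen = IO::Socket::INET->new(LocalPort => 5663, Listen => 10,
                                       Reuse => 1) or die "listen: $!";
    while (my $conn = $listen->accept()) {
        next unless $allowed{ $conn->peerhost() };    # authenticate the client server by IP
        while (my $cmd = <$conn>) {
            chomp $cmd;
            if    ($cmd =~ /^ping/) { print {$conn} "examplehost\n"; }  # reply with the short name
            elsif ($cmd =~ /^put:/) { print {$conn} "ok\n"; }           # would write user data here
            else                    { print {$conn} "unknown_cmd\n"; }
        }
        close($conn);
    }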

Fig. 1.1.2A – Overview of Network Communication

When a new client server comes online, lond sends a signal USR1 to lonc, whereupon lonc tries again to re-establish all lost connections, even if it had given up on them before – a new client connecting could mean that that machine came online again after an interruption.

The gray boxes in Fig. 1.1.2A denote the entities involved in an example transaction of the Network. The Client is logged into server C, while server B is her Home Server. Server C can be an Access Server or a Library Server, while server B is a Library Server. She submits a solution to a homework problem, which is processed by the appropriate handler for the MIME type "problem". Through lonnet, the handler writes information about this transaction to the local session data. To make a permanent log entry, lonnet establishes a connection to the UNIX domain socket for server B. lonc receives this command, encrypts it, and sends it through the persistent TCP/IP connection to the TCP/IP port of the remote lond. lond decrypts the command, executes it by writing to the permanent user data files of the client, and sends back a reply regarding the success of the operation. If the operation was unsuccessful, or if the connection had broken down, lonc would write the command into a FIFO buffer stack to be sent again later. lonc now sends a reply regarding the overall success of the operation to lonnet via the UNIX domain socket, which is eventually received back by the handler.

Scalability and Performance Analysis

The scalability was tested in a test bed of servers spread across different physical network segments; Fig. 1.1.2B shows the network configuration of this test.

msul1:msu:library:zaphod.lite.msu.edu:35.8.63.51
msua1:msu:access:agrajag.lite.msu.edu:35.8.63.68
msul2:msu:library:frootmig.lite.msu.edu:35.8.63.69
msua2:msu:access:bistromath.lite.msu.edu:35.8.63.67
hubl14:hub:library:hubs128-pc-14.cl.msu.edu:35.8.116.34
hubl15:hub:library:hubs128-pc-15.cl.msu.edu:35.8.116.35
hubl16:hub:library:hubs128-pc-16.cl.msu.edu:35.8.116.36
huba20:hub:access:hubs128-pc-20.cl.msu.edu:35.8.116.40
huba21:hub:access:hubs128-pc-21.cl.msu.edu:35.8.116.41
huba22:hub:access:hubs128-pc-22.cl.msu.edu:35.8.116.42
huba23:hub:access:hubs128-pc-23.cl.msu.edu:35.8.116.43
hubl25:other:library:hubs128-pc-25.cl.msu.edu:35.8.116.45
huba27:other:access:hubs128-pc-27.cl.msu.edu:35.8.116.47

Fig. 1.1.2B – Example of Hosts Lookup Table /home/httpd/lonTabs/hosts.tab

In the first test, the simple ping command was used. The ping command tests connections and yields the server short name as reply. In this scenario, lonc was expected to be the speed-determining step, since lond at the remote end does not need any disk access to reply. The graph in Fig. 1.1.2C shows the number of seconds until completion versus the number of processes issuing 10,000 ping commands each against one Library Server (a 450 MHz Pentium II with a single IDE hard disk in this test). For the solid dots, the processes were started concurrently on the same Access Server and the time was measured until the processes finished – all processes finished at the same time. One Access Server (a 233 MHz Pentium II in the test bed) can process about 150 pings per second, and, as expected, the total time grows linearly with the number of pings.

The gray dots were taken with up to seven processes concurrently running on different machines and pinging the same server – the processes ran fully concurrently, and each process finished as if the others were not present (about 1,000 pings per second). Execution was fully parallel.

In a second test, lond was the speed-determining step – 10,000 put commands each were issued, first from up to seven concurrent processes on the same machine, and then from up to seven processes on different machines. The put command requires data to be written to the permanent record of the user on the remote server.

In particular, one "put" request meant that the process on the Access Server would connect to the UNIX domain socket dedicated to the Library Server, and lonc would take the data from there and shuffle it through the persistent TCP connection. lond on the remote Library Server would take the data, write it to disk (both to a dbm file and to a flat-text transaction history file), and answer "ok". lonc would take that reply and send it to the domain socket, where the process would read it and then close the domain-socket connection.
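The benchmark driver implied here can be sketched as follows; the socket path and the wire format of the put command are illustrative placeholders, not the literal lond protocol.

    use strict;
    use IO::Socket::UNIX;
    use Time::HiRes qw(time);

    my $socket_path = '/home/httpd/sockets/msul1';   # socket dedicated to the Library Server
    my $start = time();
    for my $i (1 .. 10_000) {
        # One request per connection, as described above.
        my $s = IO::Socket::UNIX->new(Peer => $socket_path) or die "connect: $!";
        print {$s} "put:testdomain:testuser:benchdb:key$i=value$i\n";
        my $reply = <$s>;                            # expect "ok" back from the remote lond
        close($s);
    }
    printf "10000 puts in %.1f seconds\n", time() - $start;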

Fig. 1.1.2C – Benchmark on Parallelism of Server-Server Communication (no disk access)

The graph in Fig. 1.1.2D shows the results. Series 1 (solid black diamonds) is the result of concurrent processes on the same server – all of these are handled by the same server-dedicated lond child, which lets the total amount of time grow linearly.

Fig. 1.1.2D – Benchmark on Parallelism of Server-Server Communication (with disk access, as in Fig. 1.1.2A)

Series 2 through 8 were obtained from running the processes on different Access Servers against one Library Server; each series corresponds to one server. In this experiment, the processes did not finish at the same time, which is most likely due to disk caching on the Library Server – lond children whose data file was (partly) in the disk cache finished earlier. With seven processes from seven different servers, the operation took 255 seconds until the last process finished for 70,000 put commands (270 per second) – versus 530 seconds if the processes ran on the same server (130 per second).

Dynamic Resource Replication

Since resources are assembled into higher-order resources simply by reference, in principle it would be sufficient to retrieve them from the respective Home Servers of the authors. However, there are several problems with this simple approach: since the resource assembly mechanism is designed to facilitate content assembly from a large number of widely distributed sources, individual sessions would depend on a large number of machines and network connections being available, and would thus be rather fragile. Also, frequently accessed resources could potentially drive individual machines in the Network into overload situations.

Finally, since most resources depend on content handlers on the Access Servers to be served to a client within the session context, the raw source would first have to be transferred across the Network from the respective Library Server to the Access Server, processed there, and then transferred on to the client.

To enable resource assembly in a reliable and scalable way, a dynamic resource replication scheme was developed. Fig. 1.1.3 shows the details of this mechanism.

Anytime a resource out of the resource space is requested, a handler routine is called, which in turn calls the replication routine (Fig. 1.1.3A). As a first step, this routine determines whether or not the resource is currently in replication transfer (Fig. 1.1.3A, Step D1a). During replication transfer, the incoming data is stored in a temporary file, and Step D1a checks for the presence of that file. If transfer of a resource is actively going on, the controlling handler receives an error message, waits for a few seconds, and then calls the replication routine again. If the resource is still in transfer, the client will receive the message "Service currently not available".

In the next step (Fig. 1.1.3A, Step D1b), the replication routine checks if the URL is locally present. If it is, the replication routine returns OK to the controlling handler, which in turn passes the request on to the next handler in the chain.

If the resource is not locally present, the Home Server of the resource author (as extracted from the URL) is determined (Fig. 1.1.3A, Step D2). This is done by contacting all Library Servers in the author's domain (as determined from the lookup table, see Fig. 1.1.2B). In Step D2b, a query is sent to the remote server asking whether or not it is the Home Server of the author (in our current implementation, an additional cache is used to store already identified Home Servers; not shown in the figure). In Step D2c, the remote server answers the query with True or False. If the Home Server was found, the routine continues; otherwise it contacts the next server (Step D2a). If no server could be found, a "File not Found" error message is issued. In our current implementation, the Home Server found in this step is also written into the cache for faster access if resources by the same author are needed again (not shown in the figure).

Fig. 1.1.3A – Dynamic Resource Replication, subscription

Fig. 1.1.3B – Dynamic Resource Replication, modification

In Step D3a, the routine sends a subscribe command for the URL to the Home Server of the author. The Home Server first determines if the resource is present and if the access privileges allow it to be copied to the requesting server (Fig. 1.1.3A, Step D3b). If this is true, the requesting server is added to the list of subscribed servers for that resource (Step D3c). The Home Server will reply with either OK or an error message, which is evaluated in Step D4. If the remote resource was not present, the error message "File not Found" is passed on to the client; if the access was not allowed, the error message "Access Denied" is passed on. If the operation succeeded, the requesting server sends an HTTP request for the resource out of the /raw server content resource area of the Home Server.

The Home Server will then check if the requesting server is part of the Network and if it is subscribed to the resource (Step D5b). If it is, it will send the resource via HTTP to the requesting server without any content handlers processing it (Step D5c). The requesting server will store the incoming data in a temporary data file (Step D5a) – this is the file that Step D1a checks for. If the transfer could not complete, an appropriate error message is sent to the client (Step D6). Otherwise, the transferred temporary file is renamed as the actual resource, and the replication routine returns OK to the controlling handler (Step D7).
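A condensed sketch of Steps D1 through D7 is given below; home_server(), subscribe() and the file locations are hypothetical stand-ins for the handlers and areas described above.

    use strict;
    use LWP::Simple qw(getstore);

    sub home_server { return 'zaphod.lite.msu.edu' }   # placeholder for the domain-wide lookup (Step D2)
    sub subscribe   { return 'ok' }                    # placeholder for the subscribe exchange (Steps D3a-D4)

    sub replicate {
        my ($url)  = @_;
        my $local  = "/home/httpd/html$url";                  # illustrative location of the local copy
        return 'in transfer'    if -e "$local.in.transfer";   # Step D1a: transfer already running
        return 'ok'             if -e $local;                 # Step D1b: already locally present
        my $home   = home_server($url)
            or return 'File not Found';
        my $answer = subscribe($url, $home);
        return $answer unless $answer eq 'ok';
        my $status = getstore("http://$home/raw$url",         # Step D5: fetch from the /raw area
                              "$local.in.transfer");
        return 'copy failed' unless $status == 200;           # Step D6
        rename("$local.in.transfer", $local);                 # Step D7: rename to the actual resource
        return 'ok';
    }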

Fig. 1.1.3B depicts the process of modifying a resource. When an author publishes a new version of a resource, the Home Server will contact every server currently subscribed to the resource (Fig. 1.1.3B, Step U1), as determined from the list of subscribed servers for the resource generated in Fig. 1.1.3A, Step D3c. The subscribing servers will receive and acknowledge the update message (Step U1c). The update mechanism finishes when the last subscribed server has been contacted (messages to unreachable servers are buffered).

Each subscribing server will check if the resource in question had been accessed recently, that is, within a configurable amount of time (Step U2).

If the resource had not been accessed recently, the local copy of the resource is deleted (Step U3a) and an unsubscribe command is sent to the Home Server (Step U3b). The Home Server will check if the server had indeed originally subscribed to the resource (Step U3c) and then delete the server from the list of subscribed servers for the resource (Step U3d).

If the resource had been accessed recently, the modified resource will be copied over using the same mechanism as in Steps D5a through D7 of Fig. 1.1.3A (Fig. 1.1.3B, Steps U4a through U6).
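On a subscribing server, the reaction to an update message could be sketched like this; the access window, send_command() and replicate() (standing in for the subscription routine sketched above) are illustrative.

    use strict;

    sub send_command { return 'ok' }        # placeholder for the lonc/lond transport
    sub replicate    { return 'ok' }        # the subscription routine sketched above

    my $recent_window = 24 * 60 * 60;       # configurable "recently accessed" period (here: one day)

    sub handle_update {
        my ($url, $home) = @_;
        my $local = "/home/httpd/html$url";              # illustrative location of the local copy
        return unless -e $local;
        my $last_access = (stat($local))[8];             # atime of the local copy
        if (time() - $last_access > $recent_window) {
            unlink($local);                              # Step U3a: drop the unused local copy
            send_command("unsub:$url", $home);           # Step U3b: unsubscribe at the Home Server
        } else {
            replicate($url);                             # Steps U4a-U6: fetch the new version
        }
    }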

Load Balancing

lond provides a function to query the server's current loadavg. As a configuration parameter, one can set the value of loadavg that is to be considered 100%, for example 2.00.
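A sketch of this computation, reading the one-minute loadavg on Linux and scaling it against the configured 100% value:

    use strict;

    my $loadavg_100_percent = 2.00;       # configuration parameter: this loadavg counts as 100%
    open(my $proc, '<', '/proc/loadavg') or die "loadavg: $!";
    my $line = <$proc>;
    close($proc);
    my ($loadavg) = split(/\s+/, $line);  # first field: one-minute load average
    printf "current load: %.0f%%\n", 100 * $loadavg / $loadavg_100_percent;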

Access Servers can have a list of spare Access Servers, /home/httpd/lonTabs/spares.tab, to offload sessions depending on their own workload. This check is done by the login handler, which redirects the login information and session to the least busy spare server if the server itself is overloaded. An additional round-robin IP scheme is possible. See Fig. 1.1.4 for an example of a load-balancing scheme.
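The offload decision can be sketched as follows; load_percent() stands in for the loadavg query that lond answers, and the actual redirect of the login is omitted.

    use strict;

    sub load_percent { return int(rand(200)) }        # placeholder for the remote loadavg query

    sub pick_spare {
        my ($own_load) = @_;
        return undef if $own_load <= 100;              # not overloaded: keep the session here
        open(my $tab, '<', '/home/httpd/lonTabs/spares.tab') or return undef;
        chomp(my @spares = <$tab>);
        close($tab);
        my ($best, $best_load);
        for my $server (grep { length } @spares) {
            my $load = load_percent($server);          # ask the remote lond
            ($best, $best_load) = ($server, $load)
                if !defined($best_load) || $load < $best_load;
        }
        return $best;                                  # least busy spare Access Server
    }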

Fig. 1.1.4 – Example of Load Balancing