File:  [LON-CAPA] / doc / gutshtml / SessionOne.html
Revision 1.1
Fri Jun 28 20:30:29 2002 UTC (21 years, 10 months ago) by www
Branches: MAIN
CVS tags: version_0_99_3, version_0_99_2, version_0_99_1, version_0_99_0, version_0_6_2, version_0_6, version_0_5_1, version_0_5, version_0_4, stable_2002_july, conference_2003, STABLE, HEAD
HTML version of GUTS manual. Individual files will still need cleanup.

    1: <html>
    2: <head>
    3: <meta name=Title
    4: content="Session One: Intro/Demo, lonc/d, Replication and Load Balancing (Gerd)">
    5: <meta http-equiv=Content-Type content="text/html; charset=macintosh">
    6: <link rel=Edit-Time-Data href="Session%20One_files/editdata.mso">
    7: <title>Session One: Intro/Demo, lonc/d, Replication and Load Balancing (Gerd)</title>
    8: <style><!--
    9: .Section1
   10: 	{page:Section1;}
   11: .Section2
   12: 	{page:Section2;}
   13: -->
   14: </style>
   15: </head>
   16: <body bgcolor=#FFFFFF link=blue vlink=purple class="Normal" lang=EN-US>
   17: <div class=Section1> 
   18:   <h2>Session One: Intro/Demo, lonc/d, Replication and Load Balancing (Gerd)</h2>
   19:   <p> <img width=432 height=555
   20: src="Session%20One_files/image002.jpg" v:shapes="_x0000_i1025"> <span
   21: style='font-size:14.0pt'><b>Fig. 1.1.1</b></span><span style='font-size:14.0pt'> 
   22:     &ndash; Overview of Network</span></p>
   23:   <h3><a name="_Toc514840838"></a><a name="_Toc421867040">Overview</a></h3>
   24:   <p>Physically, the Network consists of relatively inexpensive upper-PC-class 
   25:     server machines which are linked through the commodity internet in a load-balancing, 
   26:     dynamically content-replicating and failover-secure way. <b>Fig. 1.1.1</b><span style='font-weight:normal'> 
   27:     shows an overview of this network.</span></p>
   28:   <p>All machines in the Network are connected with each other through two-way 
   29:     persistent TCP/IP connections. Clients (<b>B</b><span
   30: style='font-weight:normal'>, </span><b>F</b><span style='font-weight:normal'>, 
   31:     </span><b>G</b><span
   32: style='font-weight:normal'> and </span><b>H</b><span style='font-weight:normal'> 
   33:     in </span><b>Fig. 1.1.1</b><span style='font-weight:normal'>) connect to the 
   34:     servers via standard HTTP. There are two classes of servers, Library Servers 
   35:     (</span><b>A</b><span
   36: style='font-weight:normal'> and </span><b>E</b><span style='font-weight:normal'> 
   37:     in </span><b>Fig. 1.1.1</b><span style='font-weight:normal'>) and Access Servers 
   38:     (</span><b>C</b><span style='font-weight:normal'>, </span><b>D</b><span
   39: style='font-weight:normal'>, </span><b>I</b><span style='font-weight:normal'> 
   40:     and </span><b>J</b><span style='font-weight:normal'> in </span><b>Fig. 1.1.1</b><span
   41: style='font-weight:normal'>). Library Servers are used to store all personal records 
   42:     of a set of users, and are responsible for their initial authentication when 
   43:     a session is opened on any server in the Network. For Authors, Library Servers 
   44:     also host their construction area and the authoritative copy of the current 
   45:     and previous versions of every resource that was published by that author. 
   46:     Library servers can be used as backups to host sessions when all access servers 
   47:     in the Network are overloaded. Otherwise, for learners, access servers are 
   48:     used to host the sessions. Library servers need to be strong on I/O, while 
   49:     access servers can generally be cheaper hardware. The network is designed 
   50:     so that the number of concurrent sessions can be increased over a wide range 
   51:     by simply adding additional Access Servers before having to add additional 
   52:     Library Servers. Preliminary tests showed that a Library Server could handle 
   53:     up to 10 Access Servers fully in parallel.</span></p>
   54:   <p>The Network is divided into so-called domains, which are logical boundaries 
   55:     between participating institutions. These domains can be used to limit the 
   56:     flow of personal user information across the network, set access privileges 
   57:     and enforce royalty schemes.</p>
   58:   <h3><a name="_Toc514840839"></a><a name="_Toc421867041">Example of Transactions</a></h3>
   59:   <p><b>Fig. 1.1.1</b><span style='font-weight:normal'> also depicts examples 
   60:     of several kinds of transactions conducted across the Network. </span></p>
   61:   <p>An instructor at client <b>B</b><span style='font-weight:
   62: normal'> modifies and publishes a resource on her Home Server </span><b>A</b><span
   63: style='font-weight:normal'>. Server </span><b>A</b><span style='font-weight:
   64: normal'> has a record of all server machines currently subscribed to this resource, 
   65:     and replicates it to servers </span><b>D</b><span style='font-weight:
   66: normal'> and </span><b>I</b><span style='font-weight:normal'>. However, server 
   67:     </span><b>D</b><span
   68: style='font-weight:normal'> is currently offline, so the update notification gets 
   69:     buffered on </span><b>A</b><span style='font-weight:normal'> until </span><b>D</b><span
   70: style='font-weight:normal'> comes online again.</span><b> </b><span
   71: style='font-weight:normal'>Servers </span><b>C</b><span style='font-weight:
   72: normal'> and </span><b>J</b><span style='font-weight:normal'> are currently not 
   73:     subscribed to this resource. </span></p>
   74:   <p>Learners <b>F</b><span style='font-weight:normal'> and </span><b>G</b><span
   75: style='font-weight:normal'> have open sessions on server </span><b>I</b><span
   76: style='font-weight:normal'>, and the new resource is immediately available to 
   77:     them. </span></p>
   78:   <p>Learner <b>H</b><span style='font-weight:normal'> tries to connect to server 
   79:     </span><b>I</b><span style='font-weight:normal'> for a new session. However, 
   80:     that machine is not reachable, so he connects to another Access Server </span><b>J</b><span style='font-weight:normal'> 
   81:     instead. This server currently does not have all necessary resources locally 
   82:     present to host learner </span><b>H</b><span style='font-weight:normal'>, 
   83:     but subscribes to them and replicates them as they are accessed by </span><b>H</b><span
   84: style='font-weight:normal'>. </span></p>
   85:   <p>Learner <b>H</b><span style='font-weight:normal'> solves a problem on server 
   86:     </span><b>J</b><span style='font-weight:normal'>. Library Server </span><b>E</b><span style='font-weight:normal'> 
   87:     is </span><b>H</b><span
   88: style='font-weight:normal'>'s Home Server, so this information gets forwarded 
   89:     to </span><b>E</b><span style='font-weight:normal'>, where the records of 
   90:     </span><b>H</b><span
   91: style='font-weight:normal'> are updated. </span></p>
   92:   <h3><a name="_Toc514840840"></a><a name="_Toc421867042">lonc/lond/lonnet</a></h3>
   93:   <p><b>Fig. 1.1.2</b><span style='font-weight:normal'> elaborates on the details 
   94:     of this network infrastructure. </span></p>
   95:   <p><b>Fig. 1.1.2A</b><span style='font-weight:normal'> depicts three servers 
   96:     (</span><b>A</b><span style='font-weight:normal'>, </span><b>B</b><span
   97: style='font-weight:normal'> and </span><b>C</b><span style='font-weight:normal'>, 
   98:     </span><b>Fig. 1.1.2A</b><span style='font-weight:normal'>) and a client who 
   99:     has a session on server </span><b>C.</b></p>
  100:   <p>As <b>C</b><span style='font-weight:normal'> accesses different resources 
  101:     in the system, different handlers, which are incorporated as modules into 
  102:     the child processes of the web server software, process these requests.</span></p>
  103:   <p>Our current implementation uses <span style='font-family:
  104: "Courier New"'>mod_perl</span> inside of the Apache web server software. As an 
  105:     example, server <b>C</b><span style='font-weight:normal'> currently has four 
  106:     active web server software child processes. The chain of handlers dealing 
  107:     with a certain resource is determined by both the server content resource 
  108:     area (see below) and the MIME type, which in turn is determined by the URL 
  109:     extension. For most URL structures, both an authentication handler and a content 
  110:     handler are registered.</span></p>
  111:   <p>Handlers use a common library <span style='font-family:"Courier New"'>lonnet</span> 
  112:     to interact with both locally present temporary session data and data across 
  113:     the server network. For example, <span style='font-family:"Courier New"'>lonnet</span> 
  114:     provides routines for finding the home server of a user, finding the server 
  115:     with the lowest loadavg, sending simple command-reply sequences, and sending 
  116:     critical messages such as a homework completion, etc. For a non-critical message, 
  117:     the routines reply with a simple &quot;connection lost&quot; if the message could not 
  118:     be delivered. For critical messages,<i> </i><span style='font-family:
  119: "Courier New";font-style:normal'>lonnet</span><i> </i><span style='font-style:
  120: normal'>tries to re-establish</span><i> </i><span style='font-style:normal'>connections, 
  121:     re-send the command, etc. If no valid reply could be received, it answers 
  122:     &quot;connection deferred&quot; and stores the message in</span><i> </i><span
  123: style='font-style:normal'>buffer space to be sent</span><i> </i><span
  124: style='font-style:normal'>at a later point in time. Also, failed critical messages 
  125:     are logged.</span></p>
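  <p>The reply behaviour described above can be sketched as follows. This is a minimal Python illustration, not the Perl code of <span style='font-family:"Courier New"'>lonnet</span>: the class <span style='font-family:"Courier New"'>HypotheticalMessenger</span>, its <span style='font-family:"Courier New"'>transport</span> callable and the retry count of three are assumptions made for the example, and the reply strings are taken from the prose above rather than from the actual wire protocol.</p>

```python
from collections import deque


class HypotheticalMessenger:
    """Toy model of the critical / non-critical reply behaviour."""

    def __init__(self, transport, retries=3):
        # transport: callable(server, command) returning a reply string,
        # or None when no valid reply could be received (assumption)
        self.transport = transport
        self.retries = retries      # assumed value; the text gives no number
        self.deferred = deque()     # buffered critical messages, oldest first
        self.log = []               # failed critical messages are also logged

    def send(self, server, command):
        """Non-critical message: a single attempt, no buffering."""
        reply = self.transport(server, command)
        return reply if reply is not None else "connection lost"

    def send_critical(self, server, command):
        """Critical message: retry, then buffer and log on failure."""
        for _ in range(self.retries):
            reply = self.transport(server, command)
            if reply is not None:
                return reply
        self.deferred.append((server, command))
        self.log.append((server, command))
        return "connection deferred"


def dead_link(server, command):
    """Stand-in transport that never gets a reply."""
    return None


messenger = HypotheticalMessenger(dead_link)
print(messenger.send("msul1", "ping"))
print(messenger.send_critical("msul1", "put"))
```

  <p>Keeping the deferred messages in a queue means they can be re-sent in their original order once the connection is back, which matches the buffered delivery described later in this session.</p>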
  126:   <p>The interface between <span style='font-family:"Courier New"'>lonnet</span> 
  127:     and the Network is established by a multiplexed UNIX domain socket, denoted 
  128:     DS in <b>Fig. 1.1.2A</b><span style='font-weight:normal'>. The rationale behind 
  129:     this rather involved architecture is that httpd processes (Apache children) 
  130:     dynamically come and go on the timescale of minutes, based on workload and 
  131:     number of processed requests. Over the lifetime of an httpd child, however, 
  132:     it has to establish several hundred connections to several different servers 
  133:     in the Network.</span></p>
  134:   <p>On the other hand, establishing a TCP/IP connection is resource consuming 
  135:     for both ends of the line, and to optimize this connectivity between different 
  136:     servers, connections in the Network are designed to be persistent on the timescale 
  137:     of months, until either end is rebooted. This mechanism will be elaborated 
  138:     on below.</p>
  139:   <p>Establishing a connection to a UNIX domain socket is far less resource consuming 
  140:     than establishing a TCP/IP connection. <span
  141: style='font-family:"Courier New"'>lonc</span> is a proxy daemon that forks off 
  142:     a child for every server in the Network. Which servers are members of the 
  143:     Network is determined by a lookup table; <b>Fig. 1.1.2B</b><span
  144: style='font-weight:normal'> shows an example. In order, the fields of each entry denote an 
  145:     internal name for the server, the domain of the server, the type of the server, 
  146:     the host name and the IP address.</span></p>
  147:   <p>The <span style='font-family:"Courier New"'>lonc</span> parent process maintains 
  148:     the population and listens for signals to restart or shut down, as well as 
  149:     <i>USR1</i><span style='font-style:normal'>. Every child establishes a multiplexed 
  150:     UNIX domain socket for its server and opens a TCP/IP connection to the </span><span style='font-family:"Courier New"'>lond</span> 
  151:     daemon (discussed below) on the remote machine, which it keeps alive.<i> </i><span
  152: style='font-style:normal'>If the connection is interrupted, the child dies, whereupon 
  153:     the parent makes several attempts to fork another child for that server. </span></p>
  154:   <p>When starting a new child (a new connection), first an init-sequence is carried 
  155:     out, which includes receiving the information from the remote <span style='font-family:"Courier New"'>lond</span> 
  156:     which is needed to establish the 128-bit encryption key; the key is different 
  157:     for every connection. Next, any buffered (delayed) messages for the server 
  158:     are sent.</p>
  159:   <p>In normal operation, the child listens to the UNIX socket, forwards requests 
  160:     to the TCP connection, gets the reply from <span
  161: style='font-family:"Courier New"'>lond</span>, and sends it back to the UNIX socket. 
  162:     Also, <span style='font-family:"Courier New"'>lonc</span> takes care of the 
  163:     encryption and decryption of messages.</p>
  164:   <p><span style='font-family:"Courier New"'>lonc</span> was built by putting 
  165:     a non-forking multiplexed UNIX domain socket server into a framework that 
  166:     forks a TCP/IP client for every remote <span style='font-family:
  167: "Courier New"'>lond</span>.</p>
  168:   <p><span style='font-family:"Courier New"'>lond</span> is the remote end of 
  169:     the TCP/IP connection and acts as a remote command processor. It receives 
  170:     commands, executes them, and sends replies. In normal operation,<i> </i><span
  171: style='font-style:normal'>a </span><span style='font-family:"Courier New"'>lonc</span> 
  172:     child is constantly connected to a dedicated <span style='font-family:"Courier New"'>lond</span> 
  173:     child on the remote server, and the same is true vice versa (two persistent 
  174:     connections per server combination). </p>
  175:   <p><span style='font-family:"Courier New"'>lond</span><i>&nbsp; </i><span style='font-style:normal'>listens 
  176:     to a TCP/IP port (denoted P in <b>Fig. 1.1.2A</b></span>) and forks off enough 
  177:     child processes to have one for each other server in the network plus two 
  178:     spare children. The parent process maintains the population and listens for 
  179:     signals to restart or shut down. Client servers are authenticated by IP<i>.</i></p>
  180:   <br
  181: clear=ALL style='page-break-before:always'>
  182:   <p><span style='font-size:14.0pt'> <img width=432 height=492
  183: src="Session%20One_files/image004.jpg" v:shapes="_x0000_i1026"> </span></p>
  184:   <p><span style='font-size:14.0pt'><b>Fig. 1.1.2A</b></span><span
  185: style='font-size:14.0pt'> &ndash; Overview of Network Communication</span></p>
  186:   <p>When a new client server comes online<i>,</i><span
  187: style='font-style:normal'> </span><span style='font-family:"Courier New"'>lond</span> 
  188:     sends a signal<i> USR1 </i><span style='font-style:normal'>to </span><span
  189: style='font-family:"Courier New"'>lonc</span>, whereupon <span
  190: style='font-family:"Courier New"'>lonc</span> tries again to reestablish all lost 
  191:     connections, even if it had given up on them before, because a new client connecting 
  192:     could mean that that machine came online again after an interruption.</p>
  193:   <p>The gray boxes in <b>Fig. 1.1.2A</b><span style='font-weight:
  194: normal'> denote the entities involved in an example transaction of the Network. 
  195:     The Client is logged into server </span><b>C</b><span style='font-weight:normal'>, 
  196:     while server </span><b>B</b><span style='font-weight:normal'> is her Home 
  197:     Server. Server </span><b>C</b><span style='font-weight:normal'> can be an 
  198:     Access Server or a Library Server, while server </span><b>B</b><span
  199: style='font-weight:normal'> is a Library Server. She submits a solution to a homework 
  200:     problem, which is processed by the appropriate handler for the MIME type &quot;problem&quot;. 
  201:     Through </span><span style='font-family:"Courier New"'>lonnet</span>, the 
  202:     handler writes information about this transaction to the local session data. 
  203:     To make a permanent log entry, <span style='font-family:"Courier New"'>lonnet 
  204:     </span>establishes a connection to the UNIX domain socket for server <b>B</b><span
  205: style='font-weight:normal'>. </span><span style='font-family:"Courier New"'>lonc</span> 
  206:     receives this command, encrypts it, and sends it through the persistent TCP/IP 
  207:     connection to the TCP/IP port of the remote <span style='font-family:"Courier New"'>lond</span>. 
  208:     <span style='font-family:"Courier New"'>lond</span> decrypts the command, 
  209:     executes it by writing to the permanent user data files of the client, and 
  210:     sends back a reply regarding the success of the operation. If the operation 
  211:     was unsuccessful, or if the connection had broken down, <span style='font-family:
  212: "Courier New"'>lonc</span> would write the command into a FIFO buffer to 
  213:     be sent again later. <span style='font-family:"Courier New"'>lonc</span> now 
  214:     sends a reply regarding the overall success of the operation to <span
  215: style='font-family:"Courier New"'>lonnet</span> via the UNIX domain socket, which 
  216:     is eventually received back by the handler.</p>
  217:   <h3><a name="_Toc514840841"></a><a name="_Toc421867043">Scalability and Performance 
  218:     Analysis</a></h3>
  219:   <p>Scalability was tested in a test bed of servers spread across different physical 
  220:     network segments; <b>Fig. 1.1.2B</b><span style='font-weight:
  221: normal'> shows the network configuration of this test.</span></p>
  222:   <table border=1 cellspacing=0 cellpadding=0>
  223:     <tr> 
  224:       <td width=443 valign=top class="Normal"> <p><span style='font-family:"Courier New"'>msul1:msu:library:zaphod.lite.msu.edu:35.8.63.51</span></p>
  225:         <p><span style='font-family:"Courier New"'>msua1:msu:access:agrajag.lite.msu.edu:35.8.63.68</span></p>
  226:         <p><span style='font-family:"Courier New"'>msul2:msu:library:frootmig.lite.msu.edu:35.8.63.69</span></p>
  227:         <p><span style='font-family:"Courier New"'>msua2:msu:access:bistromath.lite.msu.edu:35.8.63.67</span></p>
  228:         <p><span style='font-family:"Courier New"'>hubl14:hub:library:hubs128-pc-14.cl.msu.edu:35.8.116.34</span></p>
  229:         <p><span style='font-family:"Courier New"'>hubl15:hub:library:hubs128-pc-15.cl.msu.edu:35.8.116.35</span></p>
  230:         <p><span style='font-family:"Courier New"'>hubl16:hub:library:hubs128-pc-16.cl.msu.edu:35.8.116.36</span></p>
  231:         <p><span style='font-family:"Courier New"'>huba20:hub:access:hubs128-pc-20.cl.msu.edu:35.8.116.40</span></p>
  232:         <p><span style='font-family:"Courier New"'>huba21:hub:access:hubs128-pc-21.cl.msu.edu:35.8.116.41</span></p>
  233:         <p><span style='font-family:"Courier New"'>huba22:hub:access:hubs128-pc-22.cl.msu.edu:35.8.116.42</span></p>
  234:         <p><span style='font-family:"Courier New"'>huba23:hub:access:hubs128-pc-23.cl.msu.edu:35.8.116.43</span></p>
  235:         <p><span style='font-family:"Courier New"'>hubl25:other:library:hubs128-pc-25.cl.msu.edu:35.8.116.45</span></p>
  236:         <p><span style='font-family:"Courier New"'>huba27:other:access:hubs128-pc-27.cl.msu.edu:35.8.116.47</span></p></td>
  237:     </tr>
  238:   </table>
  239:   <p><span style='font-size:14.0pt'><b>Fig. 1.1.2B</b></span><span
  240: style='font-size:14.0pt'> &ndash; Example of Hosts Lookup Table </span><span
  241: style='font-size:9.0pt;font-family:"Courier New"'>/home/httpd/lonTabs/hosts.tab</span></p>
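  <p>As an illustration of the five-field entry format shown in <b>Fig. 1.1.2B</b>, the following Python sketch splits entries into named fields and lists the Library Servers of one domain. The names <span style='font-family:"Courier New"'>HostEntry</span>, <span style='font-family:"Courier New"'>parse_hosts_line</span> and <span style='font-family:"Courier New"'>library_servers</span> are hypothetical and chosen for this example only; the system itself is implemented in Perl and may read the table differently.</p>

```python
from collections import namedtuple

# Field order as given in the text: internal name, domain, server type,
# host name, IP address. The field names themselves are assumptions.
HostEntry = namedtuple("HostEntry", "name domain role hostname ip")


def parse_hosts_line(line):
    """Split one colon-separated lookup-table entry into its five fields."""
    fields = line.strip().split(":")
    if len(fields) != 5:
        raise ValueError("expected 5 colon-separated fields: " + line)
    return HostEntry(*fields)


def library_servers(entries, domain):
    """Internal names of all Library Servers in the given domain."""
    return [e.name for e in entries
            if e.domain == domain and e.role == "library"]


sample = [
    "msul1:msu:library:zaphod.lite.msu.edu:35.8.63.51",
    "msua1:msu:access:agrajag.lite.msu.edu:35.8.63.68",
    "msul2:msu:library:frootmig.lite.msu.edu:35.8.63.69",
    "hubl14:hub:library:hubs128-pc-14.cl.msu.edu:35.8.116.34",
]
entries = [parse_hosts_line(line) for line in sample]
print(library_servers(entries, "msu"))
```

  <p>A per-domain list of Library Servers like this is what the Home Server search in the replication scheme iterates over.</p>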
  242:   <p>In the first test,<span style='layout-grid-mode:line'> the simple </span><span style='font-family:"Courier New";layout-grid-mode:line'>ping</span><span
  243: style='layout-grid-mode:line'> command was used. The </span><span
  244: style='font-family:"Courier New";layout-grid-mode:line'>ping</span><span
  245: style='layout-grid-mode:line'> command is used to test connections and yields 
  246:     the server short name as reply.&nbsp; In this scenario, </span><span style='font-family:"Courier New";layout-grid-mode:
  247: line'>lonc</span><span style='layout-grid-mode:line'> was expected to be the speed-determining 
  248:     step, since </span><span style='font-family:"Courier New";
  249: layout-grid-mode:line'>lond</span><span style='layout-grid-mode:line'> at the 
  250:     remote end does not need any disk access to reply.&nbsp; The graph <b>Fig. 
  251:     1.1.2C</b></span><span style='layout-grid-mode:
  252: line'> shows the number of seconds until completion versus the number of processes issuing 
  253:     10,000 ping commands each against one Library Server (450 MHz Pentium II in 
  254:     this test, single IDE HD). For the solid dots, the processes were concurrently 
  255:     started on <i>the same</i></span><span style='layout-grid-mode:
  256: line'> Access Server and the time was measured until the processes finished; all 
  257:     processes finished at the same time. One Access Server (233 MHz Pentium II 
  258:     in the test bed) can process about 150 pings per second, and as expected, 
  259:     the total time grows linearly with the number of pings.</span></p>
  260:   <p><span style='layout-grid-mode:line'>The gray dots were taken with up to seven 
  261:     processes concurrently running on <i>different</i></span><span
  262: style='layout-grid-mode:line'> machines and pinging the same server. The processes 
  263:     ran fully concurrently, and each process finished as if the other ones were 
  264:     not present (about 1000 pings per second). Execution was fully parallel.</span></p>
  265:   <p>In a second test, <span style='font-family:"Courier New"'>lond</span> was 
  266:     the speed-determining step: 10,000 <span style='font-family:"Courier New"'>put</span> 
  267:     commands each were issued first from up to seven concurrent processes on the 
  268:     same machine, and then from up to seven processes on different machines. The 
  269:     <span
  270: style='font-family:"Courier New"'>put</span> command requires data to be written 
  271:     to the permanent record of the user on the remote server.</p>
  272:   <p>In particular, one <span style='font-family:"Courier New"'>&quot;put&quot;</span> 
  273:     request meant that the process on the Access Server would connect to the UNIX 
  274:     domain socket dedicated to the library server, <span style='font-family:"Courier New"'>lonc</span> 
  275:     would take the data from there, shuffle it through the persistent TCP connection, 
  276:     <span style='font-family:"Courier New"'>lond</span> on the remote library 
  277:     server would take the data, write to disk (both to a dbm-file and to a flat-text 
  278:     transaction history file), answer &quot;ok&quot;, <span
  279: style='font-family:"Courier New"'>lonc</span> would take that reply and send it 
  280:     to the domain socket, the process would read it from there and close the domain-socket 
  281:     connection.</p>
  282:   <p><span style='font-size:14.0pt'> <img width=220 height=190
  283: src="Session%20One_files/image005.jpg" v:shapes="_x0000_i1027"> </span></p>
  284:   <p><span style='font-size:14.0pt'><b>Fig. 1.1.2C</b></span><span
  285: style='font-size:14.0pt'> &ndash; Benchmark on Parallelism of Server-Server Communication 
  286:     (no disk access)</span></p>
  287:   <p>The graph <b>Fig. 1.1.2D</b><span style='font-weight:normal'> shows the results. 
  288:     Series 1 (solid black diamond) is the result of concurrent processes on the 
  289:     same server; all of these are handled by the same server-dedicated </span><span style='font-family:"Courier New"'>lond-</span>child, 
  290:     which lets the total amount of time grow linearly.</p>
  291:   <p><span style='font-size:14.0pt'> <img width=432 height=311
  292: src="Session%20One_files/image007.jpg" v:shapes="_x0000_i1028"> </span></p>
  293:   <p><span style='font-size:14.0pt'><b>Fig. 1.1.2D</b></span><span
  294: style='font-size:14.0pt'> &ndash; Benchmark on Parallelism of Server-Server Communication 
  295:     (with disk access as in Fig. 1.1.2A)</span></p>
  296:   <p>Series 2 through 8 were obtained from running the processes on different 
  297:     Access Servers against one Library Server; each series corresponds to one server. 
  298:     In this experiment, the processes did not finish at the same time, which most 
  299:     likely is due to disk caching on the Library Server: <span
  300: style='font-family:"Courier New"'>lond</span>-children whose datafile was (partly) 
  301:     in disk cache finished earlier. With seven processes from seven different 
  302:     servers, the operation took 255 seconds till the last process was finished 
  303:     for 70,000 <span style='font-family:"Courier New"'>put</span> commands (270 
  304:     per second), versus 530 seconds if the processes ran on the same server (130 
  305:     per second).</p>
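  <p>The throughput figures quoted above follow directly from the measured times. The short Python calculation below only reworks the numbers given in the text (seven processes, 10,000 commands each, 255 and 530 seconds); the function and variable names are made up for the example.</p>

```python
def throughput(commands, seconds):
    """Commands completed per second."""
    return commands / seconds


total_puts = 7 * 10000                       # seven processes, 10,000 puts each
different_servers = throughput(total_puts, 255)
same_server = throughput(total_puts, 530)
speedup = different_servers / same_server

print(int(different_servers), int(same_server), round(speedup, 2))
```

  <p>The unrounded rates come out slightly above the rounded 270 and 130 per second quoted in the text, and spreading the processes over seven Access Servers roughly doubles the rate in this measurement.</p>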
  306:   <h3><a name="_Toc514840842"></a><a name="_Toc421867044">Dynamic Resource Replication</a></h3>
  307:   <p>Since resources are assembled into higher order resources simply by reference, 
  308:     in principle it would be sufficient to retrieve them from the respective Home 
  309:     Servers of the authors. However, there are several problems with this simple 
  310:     approach: since the resource assembly mechanism is designed to facilitate 
  311:     content assembly from a large number of widely distributed sources, individual 
  312:     sessions would depend on a large number of machines and network connections 
  313:     to be available, thus be rather fragile. Also, frequently accessed resources 
  314:     could potentially drive individual machines in the network into overload situations.</p>
  315:   <p>Finally, since most resources depend on content handlers on the Access Servers 
  316:     to be served to a client within the session context, the raw source would 
  317:     first have to be transferred across the Network from the respective Library 
  318:     Server to the Access Server, processed there, and then transferred on to the 
  319:     client.</p>
  320:   <p>To enable resource assembly in a reliable and scalable way, a dynamic resource 
  321:     replication scheme was developed. <b>Fig. 1.1.3</b><span
  322: style='font-weight:normal'> shows the details of this mechanism.</span></p>
  323:   <p>Anytime a resource out of the resource space is requested, a handler routine 
  324:     is called which in turn calls the replication routine (<b>Fig. 1.1.3A</b><span style='font-weight:normal'>). 
  325:     As a first step, this routine determines whether the resource is currently 
  326:     in replication transfer (</span><b>Fig. 1.1.3A,</b><span style='font-weight:normal'> 
  327:     </span><b>Step D1a</b><span
  328: style='font-weight:normal'>). During replication transfer, the incoming data is 
  329:     stored in a temporary file, and </span><b>Step D1a</b><span style='font-weight:
  330: normal'> checks for the presence of that file. If transfer of a resource is actively 
  331:     going on, the controlling handler receives an error message, waits for a few 
  332:     seconds, and then calls the replication routine again. If the resource is 
  333:     still in transfer, the client will receive the message &quot;Service currently 
  334:     not available&quot;.</span></p>
  335:   <p>In the next step (<b>Fig. 1.1.3A, Step D1b</b><span
  336: style='font-weight:normal'>), the replication routine checks if the URL is locally 
  337:     present. If it is, the replication routine returns OK to the controlling handler, 
  338:     which in turn passes the request on to the next handler in the chain.</span></p>
  339:   <p>If the resource is not locally present, the Home Server of the resource author 
  340:     (as extracted from the URL) is determined (<b>Fig. 1.1.3A, Step D2</b><span style='font-weight:normal'>). 
  341:     This is done by contacting all library servers in the author's domain (as 
  342:     determined from the lookup table, see </span><b>Fig. 1.1.2B</b><span style='font-weight:normal'>). 
  343:     In </span><b>Step D2b</b><span style='font-weight:normal'> a query is sent 
  344:     to the remote server whether or not it is the Home Server of the author (in 
  345:     our current implementation, an additional cache is used to store already identified 
  346:     Home Servers (not shown in the figure)). In Step </span><b>D2c</b><span
  347: style='font-weight:normal'>, the remote server answers the query with True or 
  348:     False. If the Home Server was found, the routine continues, otherwise it contacts 
  349:     the next server (</span><b>Step D2a</b><span style='font-weight:normal'>). 
  350:     If no server could be found, a &quot;File not Found&quot; error message is issued. In 
  351:     our current implementation, in this step the Home Server is also written into 
  352:     a cache for faster access if resources by the same author are needed again 
  353:     (not shown in the figure). </span></p>
  354:   <br
  355: clear=ALL style='page-break-before:always'>
  356:   <p><span style='font-size:14.0pt'> <img width=432 height=581
  357: src="Session%20One_files/image009.jpg" v:shapes="_x0000_i1029"> </span></p>
  358:   <p><span style='font-size:14.0pt'><b>Fig. 1.1.3A</b></span><span
  359: style='font-size:14.0pt'> &ndash; Dynamic Resource Replication, subscription</span></p>
  360:   <br
  361: clear=ALL style='page-break-before:always'>
  362:   <p><span style='font-size:14.0pt'> <img width=432 height=523
  363: src="Session%20One_files/image011.jpg" v:shapes="_x0000_i1030"> </span></p>
  364:   <p><span style='font-size:14.0pt'><b>Fig. 1.1.3B</b></span><span
  365: style='font-size:14.0pt'> &ndash; Dynamic Resource Replication, modification</span></p>
  366:   <p>In <b>Step D3a</b><span style='font-weight:normal'>, the routine sends a 
  367:     subscribe command for the URL to the Home Server of the author. The Home Server 
  368:     first determines if the resource is present, and if the access privileges 
  369:     allow it to be copied to the requesting server (</span><b>Fig. 1.1.3A, Step 
  370:     D3b</b><span style='font-weight:normal'>). If this is true, the requesting 
  371:     server is added to the list of subscribed servers for that resource (</span><b>Step 
  372:     D3c</b><span style='font-weight:normal'>). The Home Server will reply with 
  373:     either OK or an error message, which is determined in </span><b>Step D4</b><span style='font-weight:normal'>. 
  374:     If the remote resource was not present, the error message &quot;File not Found&quot; 
  375:     is passed on to the client; if access was not allowed, the error 
  376:     message &quot;Access Denied&quot; is passed on. If the operation succeeded, the requesting 
  377:     server sends an HTTP request for the resource out of the </span><span style='font-family:"Courier New"'>/raw</span> 
  378:     server content resource area of the Home Server.</p>
  379:   <p>The Home Server will then check if the requesting server is part of the network, 
  380:     and if it is subscribed to the resource (<b>Step D5b</b><span
  381: style='font-weight:normal'>). If it is, it will send the resource via HTTP to 
  382:     the requesting server without any content handlers processing it (</span><b>Step 
  383:     D5c</b><span style='font-weight:normal'>). The requesting server will store 
  384:     the incoming data in a temporary data file (</span><b>Step D5a</b><span
  385: style='font-weight:normal'>); this is the file that </span><b>Step D1a</b><span
  386: style='font-weight:normal'> checks for. If the transfer could not complete, an 
  387:     appropriate error message is sent to the client (</span><b>Step D6</b><span
  388: style='font-weight:normal'>). Otherwise, the transferred temporary file is renamed 
  389:     as the actual resource, and the replication routine returns OK to the controlling 
  390:     handler (</span><b>Step D7</b><span style='font-weight:normal'>). </span></p>
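  <p>The subscribe-and-fetch sequence above (Steps D3a through D7) can be sketched 
    roughly as follows. This is only an illustrative Python model, not LON-CAPA code: 
    the <span style='font-family:"Courier New"'>HomeServer</span> class, the 
    <span style='font-family:"Courier New"'>replicate</span> function, the 
    <span style='font-family:"Courier New"'>.in.transfer</span> temporary-file suffix 
    and the "Transfer failed" message are all assumptions made for the example, and 
    the network exchange is replaced by direct method calls.</p>
<pre>
```python
import os


class HomeServer:
    """Toy in-memory stand-in for an author's Home Server (hypothetical)."""

    def __init__(self, resources, allowed):
        self.resources = resources      # url -> bytes
        self.allowed = allowed          # set of urls that may be copied
        self.subscribers = {}           # url -> set of subscribed server ids

    def subscribe(self, url, server_id):
        # Steps D3b/D3c: check presence and privileges, then record subscriber.
        if url not in self.resources:
            return "File not Found"
        if url not in self.allowed:
            return "Access Denied"
        self.subscribers.setdefault(url, set()).add(server_id)
        return "OK"

    def raw(self, url, server_id):
        # Steps D5b/D5c: only subscribed servers receive the unprocessed bytes.
        if server_id not in self.subscribers.get(url, set()):
            raise PermissionError("not subscribed")
        return self.resources[url]


def replicate(home, url, server_id, target_path):
    """Requesting-server side of Steps D3a to D7; returns a status string."""
    reply = home.subscribe(url, server_id)       # D3a, reply checked as in D4
    if reply != "OK":
        return reply
    tmp_path = target_path + ".in.transfer"      # assumed temp-file naming
    try:
        data = home.raw(url, server_id)
        with open(tmp_path, "wb") as fh:         # D5a: write to temporary file
            fh.write(data)
    except OSError:                              # includes PermissionError
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        return "Transfer failed"                 # D6 (message text assumed)
    os.replace(tmp_path, target_path)            # D7: rename into place
    return "OK"
```
</pre>
  <p>Writing to a temporary file and renaming it only after the transfer completes 
    means that a partially transferred resource is never served under its final name.</p>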
  391:   <p><b>Fig. 1.1.3B</b><span style='font-weight:normal'>&nbsp; depicts the process 
  392:     of modifying a resource. When an author publishes a new version of a resource, 
  393:     the Home Server will contact every server currently subscribed to the resource 
  394:     (</span><b>Fig. 1.1.3B, Step U1</b><span style='font-weight:normal'>), as 
  395:     determined from the list of subscribed servers for the resource generated 
   396:     in </span><b>Fig. 1.1.3A, Step D3c</b><span style='font-weight:normal'>. 
  397:     The subscribing servers will receive and acknowledge the update message (</span><b>Step 
  398:     U1c</b><span
  399: style='font-weight:normal'>). The update mechanism finishes when the last subscribed 
  400:     server has been contacted (messages to unreachable servers are buffered).</span></p>
   401:   <p>Each subscribing server will check if the resource in question has been accessed 
  402:     recently, that is, within a configurable amount of time (<b>Step U2</b><span style='font-weight:normal'>). 
  403:     </span></p>
   404:   <p>If the resource has not been accessed recently, the local copy of the resource 
  405:     is deleted (<b>Step U3a</b><span style='font-weight:normal'>) and an unsubscribe 
  406:     command is sent to the Home Server (</span><b>Step U3b</b><span
  407: style='font-weight:normal'>). The Home Server will check if the server had indeed 
  408:     originally subscribed to the resource (</span><b>Step U3c</b><span
  409: style='font-weight:normal'>) and then delete the server from the list of subscribed 
  410:     servers for the resource (</span><b>Step U3d</b><span
  411: style='font-weight:normal'>).</span></p>
   412:   <p>If the resource has been accessed recently, the modified resource will be 
  413:     copied over using the same mechanism as in <b>Step D5a</b><span
  414: style='font-weight:normal'> through </span><b>D7</b><span style='font-weight:
  415: normal'> of </span><b>Fig. 1.1.3A</b><span style='font-weight:normal'> (</span><b>Fig. 
  416:     1.1.3B</b><span style='font-weight:normal'>, </span><b>Steps U4a </b><span
  417: style='font-weight:normal'>through</span><b> U6</b><span style='font-weight:
  418: normal'>).</span></p>
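  <p>The update path (Steps U1 through U6) can be modeled in a few lines. The sketch 
    below is a simplified Python illustration under stated assumptions: the function 
    names, the action labels and the representation of buffered messages as a 
    dictionary are invented for the example and do not come from the LON-CAPA sources.</p>
<pre>
```python
def publish_update(subscribers, reachable, pending):
    """Home Server side of Step U1: notify subscribers, buffer failures."""
    notified = []
    for server in sorted(subscribers):
        if server in reachable:
            notified.append(server)
        else:
            # Unreachable server: queue the message for later delivery.
            pending.setdefault(server, []).append("update")
    return notified


def on_update(last_access, now, keep_seconds):
    """Subscribing server side of Step U2: choose the follow-up actions."""
    if now - last_access > keep_seconds:
        # Not accessed recently: drop the copy and unsubscribe (U3a, U3b).
        return ["delete_local_copy", "unsubscribe"]
    # Accessed recently: fetch the new version again (U4a to U6).
    return ["refetch"]
```
</pre>
  <p>Here <span style='font-family:"Courier New"'>keep_seconds</span> stands in for 
    the configurable amount of time mentioned above; its actual name and default 
    value are not given in this text.</p>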
   419:   <h3>Load Balancing</h3>
  420:   <p><span style='font-family:"Courier New"'>lond</span> provides a function to 
   421:     query the server's current <span style='font-family:"Courier New"'>loadavg</span><span
   422: style='font-size:14.0pt'>. </span>A configuration parameter sets 
   423:     the value of <span style='font-family:"Courier New"'>loadavg</span> that 
   424:     is to be considered 100% load, for example 2.00. </p>
  425:   <p>Access servers can have a list of spare access servers, <span
  426: style='font-size:9.0pt;font-family:"Courier New"'>/home/httpd/lonTabs/spares.tab</span>, 
   427:     to which they can offload sessions depending on their own workload. This check is done 
   428:     by the login handler, which redirects the login information and session to the 
   429:     least busy spare server if the access server itself is overloaded. An additional round-robin 
   430:     IP scheme is possible. See <b>Fig. 1.1.4</b><span style='font-weight:normal'> 
  431:     for an example of a load-balancing scheme.</span></p>
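  <p>As a rough illustration of the login-time decision described above, the following 
    Python sketch converts a <span style='font-family:"Courier New"'>loadavg</span> 
    into a percentage of the configured full-load value and picks a server. It rests 
    on several assumptions that the text does not state: that "overloaded" means at 
    or above 100%, that all servers share the same full-load value, and that the 
    spares' load averages are already known to the caller. The function names are 
    hypothetical.</p>
<pre>
```python
def load_percent(loadavg, full_load=2.00):
    """Express loadavg as a percentage of the configured 100% value."""
    return 100.0 * loadavg / full_load


def pick_server(own_name, own_loadavg, spare_loadavgs,
                full_load=2.00, threshold=100.0):
    """Return the server that should take the session.

    spare_loadavgs maps spare server names to their current loadavg.
    The 100% threshold for "overloaded" is an assumption of this sketch.
    """
    overloaded = load_percent(own_loadavg, full_load) >= threshold
    if overloaded and spare_loadavgs:
        # Redirect to the least busy spare.
        return min(spare_loadavgs, key=spare_loadavgs.get)
    return own_name
```
</pre>
  <p>With a full-load value of 2.00, a server reporting a 
    <span style='font-family:"Courier New"'>loadavg</span> of 3.00 is at 150% and 
    would hand the session to whichever spare reports the lowest value.</p>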
  432:   <p><span style='font-size:28.0pt;color:green'> <img width=241 height=139
  433: src="Session%20One_files/image013.jpg" v:shapes="_x0000_i1031"> </span></p>
  434:   <p><span
   435: style='font-size:14.0pt'><b>Fig. 1.1.4 - </b></span><span style='font-size:14.0pt'>Example 
  436:     of Load Balancing</span><span style='font-size:14.0pt'> <b><i><br
  437: clear=ALL style='page-break-before:always'>
  438:     </i></b></span></p>
  439: </div>
  440: <br
  441: clear=ALL style='page-break-before:always;'>
  442: <div class=Section2> </div>
  443: </body>
  444: </html>
