File:  [LON-CAPA] / doc / gutshtml / SessionOne.html
Revision 1.2
Tue Jul 22 14:47:00 2003 UTC (20 years, 9 months ago) by bowersj2
Branches: MAIN
Convert GUTs HTML to PROPER line endings.

    1: <html>
    2: 
    3: <head>
    4: 
    5: <meta name=Title
    6: 
    7: content="Session One: Intro/Demo, lonc/d, Replication and Load Balancing (Gerd)">
    8: 
    9: <meta http-equiv=Content-Type content="text/html; charset=macintosh">
   10: 
   11: <link rel=Edit-Time-Data href="Session%20One_files/editdata.mso">
   12: 
   13: <title>Session One: Intro/Demo, lonc/d, Replication and Load Balancing (Gerd)</title>
   14: 
   15: <style><!--
   16: 
   17: .Section1
   18: 
   19: 	{page:Section1;}
   20: 
   21: .Section2
   22: 
   23: 	{page:Section2;}
   24: 
   25: -->
   26: 
   27: </style>
   28: 
   29: </head>
   30: 
   31: <body bgcolor=#FFFFFF link=blue vlink=purple class="Normal" lang=EN-US>
   32: 
   33: <div class=Section1> 
   34: 
   35:   <h2>Session One: Intro/Demo, lonc/d, Replication and Load Balancing (Gerd)</h2>
   36: 
   37:   <p> <img width=432 height=555
   38: 
   39: src="Session%20One_files/image002.jpg" v:shapes="_x0000_i1025"> <span
   40: 
   41: style='font-size:14.0pt'><b>Fig. 1.1.1</b></span><span style='font-size:14.0pt'> 
   42: 
   43:     &ndash; Overview of Network</span></p>
   44: 
   45:   <h3><a name="_Toc514840838"></a><a name="_Toc421867040">Overview</a></h3>
   46: 
   47:   <p>Physically, the Network consists of relatively inexpensive upper-PC-class 
   48: 
   49:     server machines which are linked through the commodity internet in a load-balancing, 
   50: 
   51:     dynamically content-replicating and failover-secure way. <b>Fig. 1.1.1</b><span style='font-weight:normal'> 
   52: 
   53:     shows an overview of this network.</span></p>
   54: 
   55:   <p>All machines in the Network are connected with each other through two-way 
   56: 
   57:     persistent TCP/IP connections. Clients (<b>B</b><span
   58: 
   59: style='font-weight:normal'>, </span><b>F</b><span style='font-weight:normal'>, 
   60: 
   61:     </span><b>G</b><span
   62: 
   63: style='font-weight:normal'> and </span><b>H</b><span style='font-weight:normal'> 
   64: 
   65:     in </span><b>Fig. 1.1.1</b><span style='font-weight:normal'>) connect to the 
   66: 
   67:     servers via standard HTTP. There are two classes of servers, Library Servers 
   68: 
   69:     (</span><b>A</b><span
   70: 
   71: style='font-weight:normal'> and </span><b>E</b><span style='font-weight:normal'> 
   72: 
   73:     in </span><b>Fig. 1.1.1</b><span style='font-weight:normal'>) and Access Servers 
   74: 
   75:     (</span><b>C</b><span style='font-weight:normal'>, </span><b>D</b><span
   76: 
   77: style='font-weight:normal'>, </span><b>I</b><span style='font-weight:normal'> 
   78: 
   79:     and </span><b>J</b><span style='font-weight:normal'> in </span><b>Fig. 1.1.1</b><span
   80: 
   81: style='font-weight:normal'>). Library Servers are used to store all personal records 
   82: 
   83:     of a set of users, and are responsible for their initial authentication when 
   84: 
   85:     a session is opened on any server in the Network. For Authors, Library Servers 
   86: 
   87:     also host their construction area and the authoritative copy of the current 
   88: 
   89:     and previous versions of every resource that was published by that author. 
   90: 
   91:     Library servers can be used as backups to host sessions when all access servers 
   92: 
   93:     in the Network are overloaded. Otherwise, for learners, access servers are 
   94: 
   95:     used to host the sessions. Library servers need to be strong on I/O, while 
   96: 
   97:     access servers can generally be cheaper hardware. The network is designed 
   98: 
   99:     so that the number of concurrent sessions can be increased over a wide range 
  100: 
  101:     by simply adding additional Access Servers before having to add additional 
  102: 
  103:     Library Servers. Preliminary tests showed that a Library Server could handle 
  104: 
  105:     up to 10 Access Servers fully parallel.</span></p>
  106: 
  107:   <p>The Network is divided into so-called domains, which are logical boundaries 
  108: 
  109:     between participating institutions. These domains can be used to limit the 
  110: 
  111:     flow of personal user information across the network, set access privileges 
  112: 
  113:     and enforce royalty schemes.</p>
  114: 
  115:   <h3><a name="_Toc514840839"></a><a name="_Toc421867041">Example of Transactions</a></h3>
  116: 
  117:   <p><b>Fig. 1.1.1</b><span style='font-weight:normal'> also depicts examples 
  118: 
  119:     for several kinds of transactions conducted across the Network. </span></p>
  120: 
  121:   <p>An instructor at client <b>B</b><span style='font-weight:
  122: 
  123: normal'> modifies and publishes a resource on her Home Server </span><b>A</b><span
  124: 
  125: style='font-weight:normal'>. Server </span><b>A</b><span style='font-weight:
  126: 
  127: normal'> has a record of all server machines currently subscribed to this resource, 
  128: 
  129:     and replicates it to servers </span><b>D</b><span style='font-weight:
  130: 
  131: normal'> and </span><b>I</b><span style='font-weight:normal'>. However, server 
  132: 
  133:     </span><b>D</b><span
  134: 
  135: style='font-weight:normal'> is currently offline, so the update notification gets 
  136: 
  137:     buffered on </span><b>A</b><span style='font-weight:normal'> until </span><b>D</b><span
  138: 
  139: style='font-weight:normal'> comes online again.</span><b> </b><span
  140: 
  141: style='font-weight:normal'>Servers </span><b>C</b><span style='font-weight:
  142: 
  143: normal'> and </span><b>J</b><span style='font-weight:normal'> are currently not 
  144: 
  145:     subscribed to this resource. </span></p>
  146: 
  147:   <p>Learners <b>F</b><span style='font-weight:normal'> and </span><b>G</b><span
  148: 
  149: style='font-weight:normal'> have open sessions on server </span><b>I</b><span
  150: 
  151: style='font-weight:normal'>, and the new resource is immediately available to 
  152: 
  153:     them. </span></p>
  154: 
  155:   <p>Learner <b>H</b><span style='font-weight:normal'> tries to connect to server 
  156: 
  157:     </span><b>I</b><span style='font-weight:normal'> for a new session; however, 
  158: 
  159:     the machine is not reachable, so he connects to another Access Server </span><b>J</b><span style='font-weight:normal'> 
  160: 
  161:     instead. This server currently does not have all necessary resources locally 
  162: 
  163:     present to host learner </span><b>H</b><span style='font-weight:normal'>, 
  164: 
  165:     but subscribes to them and replicates them as they are accessed by </span><b>H</b><span
  166: 
  167: style='font-weight:normal'>. </span></p>
  168: 
  169:   <p>Learner <b>H</b><span style='font-weight:normal'> solves a problem on server 
  170: 
  171:     </span><b>J</b><span style='font-weight:normal'>. Library Server </span><b>E</b><span style='font-weight:normal'> 
  172: 
  173:     is </span><b>H</b><span
  174: 
  175: style='font-weight:normal'>'s Home Server, so this information gets forwarded 
  176: 
  177:     to </span><b>E</b><span style='font-weight:normal'>, where the records of 
  178: 
  179:     </span><b>H</b><span
  180: 
  181: style='font-weight:normal'> are updated. </span></p>
  182: 
  183:   <h3><a name="_Toc514840840"></a><a name="_Toc421867042">lonc/lond/lonnet</a></h3>
  184: 
  185:   <p><b>Fig. 1.1.2</b><span style='font-weight:normal'> elaborates on the details 
  186: 
  187:     of this network infrastructure. </span></p>
  188: 
  189:   <p><b>Fig. 1.1.2A</b><span style='font-weight:normal'> depicts three servers 
  190: 
  191:     (</span><b>A</b><span style='font-weight:normal'>, </span><b>B</b><span
  192: 
  193: style='font-weight:normal'> and </span><b>C</b><span style='font-weight:normal'>, 
  194: 
  195:     </span><b>Fig. 1.1.2A</b><span style='font-weight:normal'>) and a client who 
  196: 
  197:     has a session on server </span><b>C.</b></p>
  198: 
  199:   <p>As <b>C</b><span style='font-weight:normal'> accesses different resources 
  200: 
  201:     in the system, different handlers, which are incorporated as modules into 
  202: 
  203:     the child processes of the web server software, process these requests.</span></p>
  204: 
  205:   <p>Our current implementation uses <span style='font-family:
  206: 
  207: "Courier New"'>mod_perl</span> inside of the Apache web server software. As an 
  208: 
  209:     example, server <b>C</b><span style='font-weight:normal'> currently has four 
  210: 
  211:     active web server software child processes. The chain of handlers dealing 
  212: 
  213:     with a certain resource is determined by both the server content resource 
  214: 
  215:     area (see below) and the MIME type, which in turn is determined by the URL 
  216: 
  217:     extension. For most URL structures, both an authentication handler and a content 
  218: 
  219:     handler are registered.</span></p>
  220: 
  221:   <p>Handlers use a common library <span style='font-family:"Courier New"'>lonnet</span> 
  222: 
  223:     to interact with both locally present temporary session data and data across 
  224: 
  225:     the server network. For example, <span style='font-family:"Courier New"'>lonnet</span> 
  226: 
  227:     provides routines for finding the home server of a user, finding the server 
  228: 
  229:     with the lowest loadavg, sending simple command-reply sequences, and sending 
  230: 
  231:     critical messages such as a homework completion, etc. For a non-critical message, 
  232: 
  233:     the routines reply with a simple &quot;connection lost&quot; if the message could not 
  234: 
  235:     be delivered. For critical messages,<i> </i><span style='font-family:
  236: 
  237: "Courier New";font-style:normal'>lonnet</span><i> </i><span style='font-style:
  238: 
  239: normal'>tries to re-establish</span><i> </i><span style='font-style:normal'>connections, 
  240: 
  241:     re-send the command, etc. If no valid reply could be received, it answers 
  242: 
  243:     &quot;connection deferred&quot; and stores the message in</span><i> </i><span
  244: 
  245: style='font-style:normal'>buffer space to be sent</span><i> </i><span
  246: 
  247: style='font-style:normal'>at a later point in time. Also, failed critical messages 
  248: 
  249:     are logged.</span></p>
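<p>The two delivery modes described above can be sketched as follows. This is a minimal illustration in Python, not the actual <span style='font-family:"Courier New"'>lonnet</span> code (which is Perl): the class name <span style='font-family:"Courier New"'>Messenger</span>, the <span style='font-family:"Courier New"'>transport</span> callable, the retry count and the in-memory buffer are all assumptions made for the example; only the two reply strings come from the text.</p>

```python
from collections import deque

# Reply strings taken from the text; everything else here is hypothetical.
CON_LOST = "connection lost"
CON_DEFERRED = "connection deferred"


class Messenger:
    """Toy model of non-critical versus critical message delivery."""

    def __init__(self, transport, max_attempts=3):
        # transport: callable(server, command) returning a reply string,
        # raising ConnectionError when the message cannot be delivered.
        self.transport = transport
        self.max_attempts = max_attempts
        self.deferred = deque()    # buffered critical messages, oldest first
        self.failure_log = []      # record of failed critical messages

    def send(self, server, command):
        """Non-critical message: one attempt, failure is reported to the caller."""
        try:
            return self.transport(server, command)
        except ConnectionError:
            return CON_LOST

    def send_critical(self, server, command):
        """Critical message: retry, then buffer and log if still undelivered."""
        for _ in range(self.max_attempts):
            try:
                return self.transport(server, command)
            except ConnectionError:
                continue
        self.deferred.append((server, command))
        self.failure_log.append((server, command))
        return CON_DEFERRED

    def flush(self):
        """Re-send buffered messages in order; keep the ones that fail again."""
        remaining = deque()
        while self.deferred:
            server, command = self.deferred.popleft()
            try:
                self.transport(server, command)
            except ConnectionError:
                remaining.append((server, command))
        self.deferred = remaining
```

<p>In this sketch a caller always gets an answer back immediately; whether a critical message eventually arrives depends on a later call to <span style='font-family:"Courier New"'>flush</span>, which stands in for the delayed delivery described in the text.</p>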
  250: 
  251:   <p>The interface between <span style='font-family:"Courier New"'>lonnet</span> 
  252: 
  253:     and the Network is established by a multiplexed UNIX domain socket, denoted 
  254: 
  255:     DS in <b>Fig. 1.1.2A</b><span style='font-weight:normal'>. The rationale behind 
  256: 
  257:     this rather involved architecture is that httpd processes (Apache children) 
  258: 
  259:     dynamically come and go on the timescale of minutes, based on workload and 
  260: 
  261:     number of processed requests. Over the lifetime of an httpd child, however, 
  262: 
  263:     it has to establish several hundred connections to several different servers 
  264: 
  265:     in the Network.</span></p>
  266: 
  267:   <p>On the other hand, establishing a TCP/IP connection is resource consuming 
  268: 
  269:     for both ends of the line, and to optimize this connectivity between different 
  270: 
  271:     servers, connections in the Network are designed to be persistent on the timescale 
  272: 
  273:     of months, until either end is rebooted. This mechanism will be elaborated 
  274: 
  275:     on below.</p>
  276: 
  277:   <p>Establishing a connection to a UNIX domain socket is far less resource consuming 
  278: 
  279:     than the establishing of a TCP/IP connection. <span
  280: 
  281: style='font-family:"Courier New"'>lonc</span> is a proxy daemon that forks off 
  282: 
  283:     a child for every server in the Network. Which servers are members of the 
  284: 
  285:     Network is determined by a lookup table, which <b>Fig. 1.1.2B</b><span
  286: 
  287: style='font-weight:normal'> is an example of. In order, the entries denote an 
  288: 
  289:     internal name for the server, the domain of the server, the type of the server, 
  290: 
  291:     the host name and the IP address.</span></p>
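<p>As an illustration of the five-field layout just described, the following Python sketch parses lines in the format shown in <b>Fig. 1.1.2B</b>. The names <span style='font-family:"Courier New"'>HostEntry</span>, <span style='font-family:"Courier New"'>parse_host_line</span> and <span style='font-family:"Courier New"'>library_servers</span> are invented for this example and are not part of the LON-CAPA code base.</p>

```python
from typing import NamedTuple


class HostEntry(NamedTuple):
    """One line of the hosts lookup table (field names are illustrative)."""
    server_id: str   # internal name for the server, e.g. "msul1"
    domain: str      # domain of the server, e.g. "msu"
    role: str        # type of the server: "library" or "access"
    hostname: str    # host name
    ip: str          # IP address


def parse_host_line(line):
    """Split one colon-separated table line into its five fields."""
    fields = line.strip().split(":")
    if len(fields) != 5:
        raise ValueError(f"expected 5 colon-separated fields, got {len(fields)}")
    return HostEntry(*fields)


def library_servers(entries, domain):
    """Return the internal names of all library servers in one domain."""
    return [e.server_id for e in entries
            if e.domain == domain and e.role == "library"]


if __name__ == "__main__":
    sample = [
        "msul1:msu:library:zaphod.lite.msu.edu:35.8.63.51",
        "msua1:msu:access:agrajag.lite.msu.edu:35.8.63.68",
        "msul2:msu:library:frootmig.lite.msu.edu:35.8.63.69",
    ]
    entries = [parse_host_line(s) for s in sample]
    print(library_servers(entries, "msu"))
```

<p>Filtering by domain and server type is the kind of lookup needed later when all library servers of an author's domain have to be contacted.</p>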
  292: 
  293:   <p>The <span style='font-family:"Courier New"'>lonc</span> parent process maintains 
  294: 
  295:     the population and listens for signals to restart or shutdown, as well as 
  296: 
  297:     <i>USR1</i><span style='font-style:normal'>. Every child establishes a multiplexed 
  298: 
  299:     UNIX domain socket for its server and opens a TCP/IP connection to the </span><span style='font-family:"Courier New"'>lond</span> 
  300: 
  301:     daemon (discussed below) on the remote machine, which it keeps alive.<i> </i><span
  302: 
  303: style='font-style:normal'>If the connection is interrupted, the child dies, whereupon 
  304: 
  305:     the parent makes several attempts to fork another child for that server. </span></p>
  306: 
  307:   <p>When starting a new child (a new connection), first an init-sequence is carried 
  308: 
  309:     out, which includes receiving the information from the remote <span style='font-family:"Courier New"'>lond</span> 
  310: 
  311:     which is needed to establish the 128-bit encryption key &ndash; the key is different 
  312: 
  313:     for every connection. Next, any buffered (delayed) messages for the server 
  314: 
  315:     are sent.</p>
  316: 
  317:   <p>In normal operation, the child listens to the UNIX socket, forwards requests 
  318: 
  319:     to the TCP connection, gets the reply from <span
  320: 
  321: style='font-family:"Courier New"'>lond</span>, and sends it back to the UNIX socket. 
  322: 
  323:     Also, <span style='font-family:"Courier New"'>lonc</span> takes care of the 
  324: 
  325:     encryption and decryption of messages.</p>
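<p>The request/reply cycle of a child in normal operation can be sketched like this. It is a simplified Python illustration under several assumptions: the function name <span style='font-family:"Courier New"'>relay_once</span> is invented, each message is assumed to fit into a single <span style='font-family:"Courier New"'>recv</span> call, and the <span style='font-family:"Courier New"'>encrypt</span> and <span style='font-family:"Courier New"'>decrypt</span> hooks default to doing nothing because the actual cipher is not described here.</p>

```python
import socket


def relay_once(local_conn, remote_conn,
               encrypt=lambda data: data, decrypt=lambda data: data,
               bufsize=4096):
    """Forward one request from the local socket to the remote end
    and pass the reply back to the local socket."""
    request = local_conn.recv(bufsize)          # request from a handler
    remote_conn.sendall(encrypt(request))       # forward over the TCP link
    reply = decrypt(remote_conn.recv(bufsize))  # reply from the remote daemon
    local_conn.sendall(reply)                   # hand the reply back
    return reply


if __name__ == "__main__":
    # socketpair() stands in for both the UNIX domain socket and the
    # persistent TCP connection, so the sketch runs in a single process.
    handler_end, local_end = socket.socketpair()
    remote_end, lond_end = socket.socketpair()

    lond_end.sendall(b"msul1\n")   # queue the remote reply in advance
    handler_end.sendall(b"ping\n")  # the handler's request
    relay_once(local_end, remote_end)

    print(lond_end.recv(4096), handler_end.recv(4096))
    for s in (handler_end, local_end, remote_end, lond_end):
        s.close()
```

<p>A real proxy would loop over many requests and multiplex several local clients; the sketch shows only the data path of a single exchange.</p>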
  326: 
  327:   <p><span style='font-family:"Courier New"'>lonc</span> was built by putting 
  328: 
  329:     a non-forking multiplexed UNIX domain socket server into a framework that 
  330: 
  331:     forks a TCP/IP client for every remote <span style='font-family:
  332: 
  333: "Courier New"'>lond</span>.</p>
  334: 
  335:   <p><span style='font-family:"Courier New"'>lond</span> is the remote end of 
  336: 
  337:     the TCP/IP connection and acts as a remote command processor. It receives 
  338: 
  339:     commands, executes them, and sends replies. In normal operation,<i> </i><span
  340: 
  341: style='font-style:normal'>a </span><span style='font-family:"Courier New"'>lonc</span> 
  342: 
  343:     child is constantly connected to a dedicated <span style='font-family:"Courier New"'>lond</span> 
  344: 
  345:     child on the remote server, and the same is true vice versa (two persistent 
  346: 
  347:     connections per server combination). </p>
  348: 
  349:   <p><span style='font-family:"Courier New"'>lond</span><i>&nbsp; </i><span style='font-style:normal'>listens 
  350: 
  351:     to a TCP/IP port (denoted P in <b>Fig. 1.1.2A</b></span>) and forks off enough 
  352: 
  353:     child processes to have one for each other server in the network plus two 
  354: 
  355:     spare children. The parent process maintains the population and listens for 
  356: 
  357:     signals to restart or shutdown. Client servers are authenticated by IP<i>.</i></p>
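<p>The sizing rule and the IP check from the paragraph above amount to very little logic. The following Python sketch uses invented function names and treats the set of known addresses as given; it is meant only to make the arithmetic explicit.</p>

```python
def lond_child_count(total_servers, spares=2):
    """One child for each *other* server in the network, plus spare children."""
    if total_servers < 1:
        raise ValueError("the network contains at least the local server")
    return (total_servers - 1) + spares


def is_known_peer(peer_ip, known_ips):
    """Accept a connecting server only if its IP address is in the lookup table."""
    return peer_ip in known_ips


if __name__ == "__main__":
    known = {"35.8.63.51", "35.8.63.68", "35.8.63.69"}
    print(lond_child_count(len(known)))
    print(is_known_peer("35.8.63.68", known), is_known_peer("10.0.0.1", known))
```

<p>For a network of 13 servers, for instance, this rule gives 12 dedicated children plus 2 spares, 14 in total.</p>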
  358: 
  359:   <br
  360: 
  361: clear=ALL style='page-break-before:always'>
  362: 
  363:   <p><span style='font-size:14.0pt'> <img width=432 height=492
  364: 
  365: src="Session%20One_files/image004.jpg" v:shapes="_x0000_i1026"> </span></p>
  366: 
  367:   <p><span style='font-size:14.0pt'><b>Fig. 1.1.2A</b></span><span
  368: 
  369: style='font-size:14.0pt'> &ndash; Overview of Network Communication</span></p>
  370: 
  371:   <p>When a new client server comes online<i>,</i><span
  372: 
  373: style='font-style:normal'> </span><span style='font-family:"Courier New"'>lond</span> 
  374: 
  375:     sends a signal<i> USR1 </i><span style='font-style:normal'>to </span><span
  376: 
  377: style='font-family:"Courier New"'>lonc</span>, whereupon <span
  378: 
  379: style='font-family:"Courier New"'>lonc</span> tries again to reestablish all lost 
  380: 
  381:     connections, even if it had given up on them before &ndash; a new client connecting 
  382: 
  383:     could mean that that machine came online again after an interruption.</p>
  384: 
  385:   <p>The gray boxes in <b>Fig. 1.1.2A</b><span style='font-weight:
  386: 
  387: normal'> denote the entities involved in an example transaction of the Network. 
  388: 
  389:     The Client is logged into server </span><b>C</b><span style='font-weight:normal'>, 
  390: 
  391:     while server </span><b>B</b><span style='font-weight:normal'> is her Home 
  392: 
  393:     Server. Server </span><b>C</b><span style='font-weight:normal'> can be an 
  394: 
  395:     Access Server or a Library Server, while server </span><b>B</b><span
  396: 
  397: style='font-weight:normal'> is a Library Server. She submits a solution to a homework 
  398: 
  399:     problem, which is processed by the appropriate handler for the MIME type &quot;problem&quot;. 
  400: 
  401:     Through </span><span style='font-family:"Courier New"'>lonnet</span>, the 
  402: 
  403:     handler writes information about this transaction to the local session data. 
  404: 
  405:     To make a permanent log entry, <span style='font-family:"Courier New"'>lonnet 
  406: 
  407:     </span>establishes a connection to the UNIX domain socket for server <b>B</b><span
  408: 
  409: style='font-weight:normal'>. </span><span style='font-family:"Courier New"'>lonc</span> 
  410: 
  411:     receives this command, encrypts it, and sends it through the persistent TCP/IP 
  412: 
  413:     connection to the TCP/IP port of the remote <span style='font-family:"Courier New"'>lond</span>. 
  414: 
  415:     <span style='font-family:"Courier New"'>lond</span> decrypts the command, 
  416: 
  417:     executes it by writing to the permanent user data files of the client, and 
  418: 
  419:     sends back a reply regarding the success of the operation. If the operation 
  420: 
  421:     was unsuccessful, or if the connection had broken down, <span style='font-family:
  422: 
  423: "Courier New"'>lonc</span> would write the command into a FIFO buffer to 
  424: 
  425:     be sent again later. <span style='font-family:"Courier New"'>lonc</span> now 
  426: 
  427:     sends a reply regarding the overall success of the operation to <span
  428: 
  429: style='font-family:"Courier New"'>lonnet</span> via the UNIX domain socket, which 
  430: 
  431:     is eventually received back by the handler.</p>
  432: 
  433:   <h3><a name="_Toc514840841"></a><a name="_Toc421867043">Scalability and Performance 
  434: 
  435:     Analysis</a></h3>
  436: 
  437:   <p>The scalability was tested in a test bed of servers between different physical 
  438: 
  439:     network segments; <b>Fig. 1.1.2B</b><span style='font-weight:
  440: 
  441: normal'> shows the network configuration of this test.</span></p>
  442: 
  443:   <table border=1 cellspacing=0 cellpadding=0>
  444: 
  445:     <tr> 
  446: 
  447:       <td width=443 valign=top class="Normal"> <p><span style='font-family:"Courier New"'>msul1:msu:library:zaphod.lite.msu.edu:35.8.63.51</span></p>
  448: 
  449:         <p><span style='font-family:"Courier New"'>msua1:msu:access:agrajag.lite.msu.edu:35.8.63.68</span></p>
  450: 
  451:         <p><span style='font-family:"Courier New"'>msul2:msu:library:frootmig.lite.msu.edu:35.8.63.69</span></p>
  452: 
  453:         <p><span style='font-family:"Courier New"'>msua2:msu:access:bistromath.lite.msu.edu:35.8.63.67</span></p>
  454: 
  455:         <p><span style='font-family:"Courier New"'>hubl14:hub:library:hubs128-pc-14.cl.msu.edu:35.8.116.34</span></p>
  456: 
  457:         <p><span style='font-family:"Courier New"'>hubl15:hub:library:hubs128-pc-15.cl.msu.edu:35.8.116.35</span></p>
  458: 
  459:         <p><span style='font-family:"Courier New"'>hubl16:hub:library:hubs128-pc-16.cl.msu.edu:35.8.116.36</span></p>
  460: 
  461:         <p><span style='font-family:"Courier New"'>huba20:hub:access:hubs128-pc-20.cl.msu.edu:35.8.116.40</span></p>
  462: 
  463:         <p><span style='font-family:"Courier New"'>huba21:hub:access:hubs128-pc-21.cl.msu.edu:35.8.116.41</span></p>
  464: 
  465:         <p><span style='font-family:"Courier New"'>huba22:hub:access:hubs128-pc-22.cl.msu.edu:35.8.116.42</span></p>
  466: 
  467:         <p><span style='font-family:"Courier New"'>huba23:hub:access:hubs128-pc-23.cl.msu.edu:35.8.116.43</span></p>
  468: 
  469:         <p><span style='font-family:"Courier New"'>hubl25:other:library:hubs128-pc-25.cl.msu.edu:35.8.116.45</span></p>
  470: 
  471:         <p><span style='font-family:"Courier New"'>huba27:other:access:hubs128-pc-27.cl.msu.edu:35.8.116.47</span></p></td>
  472: 
  473:     </tr>
  474: 
  475:   </table>
  476: 
  477:   <p><span style='font-size:14.0pt'><b>Fig. 1.1.2B</b></span><span
  478: 
  479: style='font-size:14.0pt'> &ndash; Example of Hosts Lookup Table </span><span
  480: 
  481: style='font-size:9.0pt;font-family:"Courier New"'>/home/httpd/lonTabs/hosts.tab</span></p>
  482: 
  483:   <p>In the first test,<span style='layout-grid-mode:line'> the simple </span><span style='font-family:"Courier New";layout-grid-mode:line'>ping</span><span
  484: 
  485: style='layout-grid-mode:line'> command was used. The </span><span
  486: 
  487: style='font-family:"Courier New";layout-grid-mode:line'>ping</span><span
  488: 
  489: style='layout-grid-mode:line'> command is used to test connections and yields 
  490: 
  491:     the server short name as reply.&nbsp; In this scenario, </span><span style='font-family:"Courier New";layout-grid-mode:
  492: 
  493: line'>lonc</span><span style='layout-grid-mode:line'> was expected to be the speed-determining 
  494: 
  495:     step, since </span><span style='font-family:"Courier New";
  496: 
  497: layout-grid-mode:line'>lond</span><span style='layout-grid-mode:line'> at the 
  498: 
  499:     remote end does not need any disk access to reply.&nbsp; The graph <b>Fig. 
  500: 
  501:     1.1.2C</b></span><span style='layout-grid-mode:
  502: 
  503: line'> shows number of seconds till completion versus number of processes issuing 
  504: 
  505:     10,000 ping commands each against one Library Server (450 MHz Pentium II in 
  506: 
  507:     this test, single IDE HD). For the solid dots, the processes were concurrently 
  508: 
  509:     started on <i>the same</i></span><span style='layout-grid-mode:
  510: 
  511: line'> Access Server and the time was measured till the processes finished &ndash; all 
  512: 
  513:     processes finished at the same time. One Access Server (233 MHz Pentium II 
  514: 
  515:     in the test bed) can process about 150 pings per second, and as expected, 
  516: 
  517:     the total time grows linearly with the number of pings.</span></p>
  518: 
  519:   <p><span style='layout-grid-mode:line'>The gray dots were taken with up to seven 
  520: 
  521:     processes concurrently running on <i>different</i></span><span
  522: 
  523: style='layout-grid-mode:line'> machines and pinging the same server &ndash; the processes 
  524: 
  525:     ran fully concurrently, and each process finished as if the other ones were 
  526: 
  527:     not present (about 1000 pings per second). Execution was fully parallel.</span></p>
  528: 
  529:   <p>In a second test, <span style='font-family:"Courier New"'>lond</span> was 
  530: 
  531:     the speed-determining step &ndash; 10,000 <span style='font-family:"Courier New"'>put</span> 
  532: 
  533:     commands each were issued first from up to seven concurrent processes on the 
  534: 
  535:     same machine, and then from up to seven processes on different machines. The 
  536: 
  537:     <span
  538: 
  539: style='font-family:"Courier New"'>put</span> command requires data to be written 
  540: 
  541:     to the permanent record of the user on the remote server.</p>
  542: 
  543:   <p>In particular, one <span style='font-family:"Courier New"'>&quot;put&quot;</span> 
  544: 
  545:     request meant that the process on the Access Server would connect to the UNIX 
  546: 
  547:     domain socket dedicated to the library server, <span style='font-family:"Courier New"'>lonc</span> 
  548: 
  549:     would take the data from there, shuffle it through the persistent TCP connection, 
  550: 
  551:     <span style='font-family:"Courier New"'>lond</span> on the remote library 
  552: 
  553:     server would take the data, write to disk (both to a dbm-file and to a flat-text 
  554: 
  555:     transaction history file), answer &quot;ok&quot;, <span
  556: 
  557: style='font-family:"Courier New"'>lonc</span> would take that reply and send it 
  558: 
  559:     to the domain socket, the process would read it from there and close the domain-socket 
  560: 
  561:     connection.</p>
  562: 
  563:   <p><span style='font-size:14.0pt'> <img width=220 height=190
  564: 
  565: src="Session%20One_files/image005.jpg" v:shapes="_x0000_i1027"> </span></p>
  566: 
  567:   <p><span style='font-size:14.0pt'><b>Fig. 1.1.2C</b></span><span
  568: 
  569: style='font-size:14.0pt'> &ndash; Benchmark on Parallelism of Server-Server Communication 
  570: 
  571:     (no disk access)</span></p>
  572: 
  573:   <p>The graph <b>Fig. 1.1.2D</b><span style='font-weight:normal'> shows the results. 
  574: 
  575:     Series 1 (solid black diamond) is the result of concurrent processes on the 
  576: 
  577:     same server &ndash; all of these are handled by the same server-dedicated </span><span style='font-family:"Courier New"'>lond-</span>child, 
  578: 
  579:     which lets the total amount of time grow linearly.</p>
  580: 
  581:   <p><span style='font-size:14.0pt'> <img width=432 height=311
  582: 
  583: src="Session%20One_files/image007.jpg" v:shapes="_x0000_i1028"> </span></p>
  584: 
  585:   <p><span style='font-size:14.0pt'><b>Fig. 1.1.2D</b></span><span
  586: 
  587: style='font-size:14.0pt'> &ndash; Benchmark on Parallelism of Server-Server Communication 
  588: 
  589:     (with disk access as in Fig. 1.1.2A)</span></p>
  590: 
  591:   <p>Series 2 through 8 were obtained from running the processes on different 
  592: 
  593:     Access Servers against one Library Server; each series corresponds to one server. 
  594: 
  595:     In this experiment, the processes did not finish at the same time, which most 
  596: 
  597:     likely is due to disk-caching on the Library Server &ndash; <span
  598: 
  599: style='font-family:"Courier New"'>lond</span>-children whose datafile was (partly) 
  600: 
  601:     in disk cache finished earlier. With seven processes from seven different 
  602: 
  603:     servers, the operation took 255 seconds till the last process was finished 
  604: 
  605:     for 70,000 <span style='font-family:"Courier New"'>put</span> commands (270 
  606: 
  607:     per second) &ndash; versus 530 seconds if the processes ran on the same server (130 
  608: 
  609:     per second).</p>
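<p>The per-second rates quoted above follow directly from the totals; the short Python calculation below reproduces them. The function name <span style='font-family:"Courier New"'>throughput</span> is chosen for this example, and the figures in the text are rounded versions of the computed values.</p>

```python
def throughput(commands, seconds):
    """Average number of commands completed per second."""
    return commands / seconds


total_puts = 7 * 10_000          # seven processes, 10,000 put commands each

different_servers = throughput(total_puts, 255)   # about 274.5, quoted as 270
same_server = throughput(total_puts, 530)         # about 132.1, quoted as 130

# Speed-up from spreading the processes over seven servers: about 2.08
speedup = different_servers / same_server

if __name__ == "__main__":
    print(f"{different_servers:.1f} {same_server:.1f} {speedup:.2f}")
```

<p>The speed-up of roughly a factor of two, rather than seven, is consistent with the disk on the Library Server being the limiting resource in this test.</p>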
  610: 
  611:   <h3><a name="_Toc514840842"></a><a name="_Toc421867044">Dynamic Resource Replication</a></h3>
  612: 
  613:   <p>Since resources are assembled into higher order resources simply by reference, 
  614: 
  615:     in principle it would be sufficient to retrieve them from the respective Home 
  616: 
  617:     Servers of the authors. However, there are several problems with this simple 
  618: 
  619:     approach: since the resource assembly mechanism is designed to facilitate 
  620: 
  621:     content assembly from a large number of widely distributed sources, individual 
  622: 
  623:     sessions would depend on a large number of machines and network connections 
  624: 
  625:     to be available, thus be rather fragile. Also, frequently accessed resources 
  626: 
  627:     could potentially drive individual machines in the network into overload situations.</p>
  628: 
  629:   <p>Finally, since most resources depend on content handlers on the Access Servers 
  630: 
  631:     to be served to a client within the session context, the raw source would 
  632: 
  633:     first have to be transferred across the Network from the respective Library 
  634: 
  635:     Server to the Access Server, processed there, and then transferred on to the 
  636: 
  637:     client.</p>
  638: 
  639:   <p>To enable resource assembly in a reliable and scalable way, a dynamic resource 
  640: 
  641:     replication scheme was developed. <b>Fig. 1.1.3</b><span
  642: 
  643: style='font-weight:normal'> shows the details of this mechanism.</span></p>
  644: 
  645:   <p>Anytime a resource out of the resource space is requested, a handler routine 
  646: 
  647:     is called which in turn calls the replication routine (<b>Fig. 1.1.3A</b><span style='font-weight:normal'>). 
  648: 
  649:     As a first step, this routine determines whether or not the resource is currently 
  650: 
  651:     in replication transfer (</span><b>Fig. 1.1.3A,</b><span style='font-weight:normal'> 
  652: 
  653:     </span><b>Step D1a</b><span
  654: 
  655: style='font-weight:normal'>). During replication transfer, the incoming data is 
  656: 
  657:     stored in a temporary file, and </span><b>Step D1a</b><span style='font-weight:
  658: 
  659: normal'> checks for the presence of that file. If transfer of a resource is actively 
  660: 
  661:     going on, the controlling handler receives an error message, waits for a few 
  662: 
  663:     seconds, and then calls the replication routine again. If the resource is 
  664: 
  665:     still in transfer, the client will receive the message &ldquo;Service currently 
  666: 
  667:     not available&rdquo;.</span></p>
  668: 
  669:   <p>In the next step (<b>Fig. 1.1.3A, Step D1b</b><span
  670: 
  671: style='font-weight:normal'>), the replication routine checks if the URL is locally 
  672: 
  673:     present. If it is, the replication routine returns OK to the controlling handler, 
  674: 
  675:     which in turn passes the request on to the next handler in the chain.</span></p>
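  <p>Steps D1a and D1b, together with the single retry by the controlling handler, can be sketched as follows. This is a minimal illustration, not LON-CAPA code: the function names, the status values, the <code>.in.transfer</code> suffix for the temporary file, and the three-second wait are all assumptions made for the example.</p>

```python
import os
import tempfile
import time

# Hypothetical status values; the source only names "OK" and an error message.
OK, IN_TRANSFER, NOT_PRESENT = "ok", "in_transfer", "not_present"


def replication_status(path, transfer_suffix=".in.transfer"):
    """Run Step D1a, then Step D1b, for one local resource path."""
    if os.path.exists(path + transfer_suffix):
        return IN_TRANSFER   # D1a: the temporary transfer file is present
    if os.path.exists(path):
        return OK            # D1b: the resource is locally present
    return NOT_PRESENT       # the caller would continue with Step D2


def handle_request(path, wait_seconds=3.0, sleep=time.sleep):
    """Controlling handler: retry once if a transfer is in progress."""
    status = replication_status(path)
    if status == IN_TRANSFER:
        sleep(wait_seconds)  # assumed value for "a few seconds"
        status = replication_status(path)
        if status == IN_TRANSFER:
            return "Service currently not available"
    return status
```

  <p>Passing <code>sleep</code> as a parameter is only a convenience so that the retry can be exercised without actually waiting.</p>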
  676: 
  677:   <p>If the resource is not locally present, the Home Server of the resource author 
  678: 
  679:     (as extracted from the URL) is determined (<b>Fig. 1.1.3A, Step D2</b><span style='font-weight:normal'>). 
  680: 
  681:     This is done by contacting all library servers in the author&rsquo;s domain (as 
  682: 
  683:     determined from the lookup table, see </span><b>Fig. 1.1.2B</b><span style='font-weight:normal'>). 
  684: 
  685:     In </span><b>Step D2b</b><span style='font-weight:normal'> a query is sent 
  686: 
  687:     to the remote server asking whether it is the Home Server of the author (in 
  688: 
  689:     our current implementation, an additional cache is used to store already identified 
  690: 
  691:     Home Servers (not shown in the figure)). In Step </span><b>D2c</b><span
  692: 
  693: style='font-weight:normal'>, the remote server answers the query with True or 
  694: 
  695:     False. If the Home Server was found, the routine continues, otherwise it contacts 
  696: 
  697:     the next server (</span><b>Step D2a</b><span style='font-weight:normal'>). 
  698: 
  699:     If no server could be found, a &ldquo;File not Found&rdquo; error message is issued. In 
  700: 
  701:     our current implementation, in this step the Home Server is also written into 
  702: 
  703:     a cache for faster access if resources by the same author are needed again 
  704: 
  705:     (not shown in the figure). </span></p>
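  <p>The Home Server lookup of Step D2, including the cache mentioned above, might look roughly like the sketch below. The data structures are assumptions: <code>library_servers</code> stands in for the lookup table of Fig. 1.1.2B, and <code>is_home_server</code> is a hypothetical callable representing the query and its True/False reply (Steps D2b and D2c).</p>

```python
def find_home_server(author, domain, library_servers, is_home_server, cache):
    """Return the author's Home Server, or None if no server claims the author.

    library_servers: dict mapping a domain to its list of library servers.
    is_home_server:  callable(server, author) -> bool (remote query, assumed).
    cache:           dict remembering Home Servers already identified.
    """
    key = (author, domain)
    if key in cache:
        return cache[key]                            # skip the network round trips
    for server in library_servers.get(domain, []):   # D2a: try the next server
        if is_home_server(server, author):           # D2b query, D2c answer
            cache[key] = server                      # remember for later requests
            return server
    return None                                      # caller reports "File not Found"
```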
  706: 
  707:   <br
  708: 
  709: clear=ALL style='page-break-before:always'>
  710: 
  711:   <p><span style='font-size:14.0pt'> <img width=432 height=581
  712: 
  713: src="Session%20One_files/image009.jpg" v:shapes="_x0000_i1029"> </span></p>
  714: 
  715:   <p><span style='font-size:14.0pt'><b>Fig. 1.1.3A</b></span><span
  716: 
  717: style='font-size:14.0pt'> &ndash; Dynamic Resource Replication, subscription</span></p>
  718: 
  719:   <br
  720: 
  721: clear=ALL style='page-break-before:always'>
  722: 
  723:   <p><span style='font-size:14.0pt'> <img width=432 height=523
  724: 
  725: src="Session%20One_files/image011.jpg" v:shapes="_x0000_i1030"> </span></p>
  726: 
  727:   <p><span style='font-size:14.0pt'><b>Fig. 1.1.3B</b></span><span
  728: 
  729: style='font-size:14.0pt'> &ndash; Dynamic Resource Replication, modification</span></p>
  730: 
  731:   <p>In <b>Step D3a</b><span style='font-weight:normal'>, the routine sends a 
  732: 
  733:     subscribe command for the URL to the Home Server of the author. The Home Server 
  734: 
  735:     first determines if the resource is present, and if the access privileges 
  736: 
  737:     allow it to be copied to the requesting server (</span><b>Fig. 1.1.3A, Step 
  738: 
  739:     D3b</b><span style='font-weight:normal'>). If this is true, the requesting 
  740: 
  741:     server is added to the list of subscribed servers for that resource (</span><b>Step 
  742: 
  743:     D3c</b><span style='font-weight:normal'>). The Home Server will reply with 
  744: 
  745:     either OK or an error message, which is determined in </span><b>Step D4</b><span style='font-weight:normal'>. 
  746: 
  747:     If the remote resource was not present, the error message &ldquo;File not Found&rdquo; 
  748: 
  749:     will be passed on to the client; if the access was not allowed, the error 
  750: 
  751:     message &ldquo;Access Denied&rdquo; is passed on. If the operation succeeded, the requesting 
  752: 
  753:     server sends an HTTP request for the resource out of the /</span><span style='font-family:"Courier New"'>raw</span> 
  754: 
  755:     server content resource area of the Home Server.</p>
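  <p>On the Home Server side, the handling of a subscribe command (Steps D3b and D3c) can be sketched as below. This is an illustration under assumptions: the in-memory <code>resources</code> set, the <code>subscriptions</code> dictionary, and the <code>may_copy</code> privilege check are hypothetical stand-ins, not the actual storage or access-control code.</p>

```python
def subscribe(resources, subscriptions, url, requesting_server, may_copy):
    """Process one subscribe command and return the reply for Step D4.

    resources:     set of URLs published on this Home Server (assumed).
    subscriptions: dict mapping a URL to the set of subscribed servers.
    may_copy:      callable(url, server) -> bool, the privilege check (assumed).
    """
    if url not in resources:
        return "File not Found"       # D3b: resource is not present
    if not may_copy(url, requesting_server):
        return "Access Denied"        # D3b: copying is not permitted
    subscriptions.setdefault(url, set()).add(requesting_server)  # D3c
    return "OK"
```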
  756: 
  757:   <p>The Home Server will then check if the requesting server is part of the network, 
  758: 
  759:     and if it is subscribed to the resource (<b>Step D5b</b><span
  760: 
  761: style='font-weight:normal'>). If it is, it will send the resource via HTTP to 
  762: 
  763:     the requesting server without any content handlers processing it (</span><b>Step 
  764: 
  765:     D5c</b><span style='font-weight:normal'>). The requesting server will store 
  766: 
  767:     the incoming data in a temporary data file (</span><b>Step D5a</b><span
  768: 
  769: style='font-weight:normal'>) &ndash; this is the file that </span><b>Step D1a</b><span
  770: 
  771: style='font-weight:normal'> checks for. If the transfer could not complete, an 
  772: 
  773:     appropriate error message is sent to the client (</span><b>Step D6</b><span
  774: 
  775: style='font-weight:normal'>). Otherwise, the transferred temporary file is renamed 
  776: 
  777:     as the actual resource, and the replication routine returns OK to the controlling 
  778: 
  779:     handler (</span><b>Step D7</b><span style='font-weight:normal'>). </span></p>
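  <p>The reason for writing to a temporary file and renaming it afterwards is that a partially transferred resource is never visible under its real name. A minimal sketch of Steps D5a through D7 on the requesting server follows; the <code>.in.transfer</code> suffix and the iterable of byte chunks standing in for the HTTP response body are assumptions.</p>

```python
import os
import tempfile


def store_replica(path, chunks, transfer_suffix=".in.transfer"):
    """Write incoming data to a temporary file, then rename it into place.

    Returns True on success (Step D7) and False if the transfer failed
    (Step D6), in which case the temporary file is removed again.
    """
    tmp = path + transfer_suffix
    try:
        with open(tmp, "wb") as handle:   # D5a: temporary data file
            for chunk in chunks:
                handle.write(chunk)
    except Exception:                     # broad on purpose: any failure aborts
        if os.path.exists(tmp):
            os.remove(tmp)
        return False
    os.replace(tmp, path)                 # D7: temporary file becomes the resource
    return True
```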
  780: 
  781:   <p><b>Fig. 1.1.3B</b><span style='font-weight:normal'>&nbsp; depicts the process 
  782: 
  783:     of modifying a resource. When an author publishes a new version of a resource, 
  784: 
  785:     the Home Server will contact every server currently subscribed to the resource 
  786: 
  787:     (</span><b>Fig. 1.1.3B, Step U1</b><span style='font-weight:normal'>), as 
  788: 
  789:     determined from the list of subscribed servers for the resource generated 
  790: 
  791:     in </span><b>Fig. 1.1.3A, Step D3c</b><span style='font-weight:normal'>. 
  792: 
  793:     The subscribing servers will receive and acknowledge the update message (</span><b>Step 
  794: 
  795:     U1c</b><span
  796: 
  797: style='font-weight:normal'>). The update mechanism finishes when the last subscribed 
  798: 
  799:     server has been contacted (messages to unreachable servers are buffered).</span></p>
  800: 
  801:   <p>Each subscribing server will check if the resource in question had been accessed 
  802: 
  803:     recently, that is, within a configurable amount of time (<b>Step U2</b><span style='font-weight:normal'>). 
  804: 
  805:     </span></p>
  806: 
  807:   <p>If the resource had not been accessed recently, the local copy of the resource 
  808: 
  809:     is deleted (<b>Step U3a</b><span style='font-weight:normal'>) and an unsubscribe 
  810: 
  811:     command is sent to the Home Server (</span><b>Step U3b</b><span
  812: 
  813: style='font-weight:normal'>). The Home Server will check if the server had indeed 
  814: 
  815:     originally subscribed to the resource (</span><b>Step U3c</b><span
  816: 
  817: style='font-weight:normal'>) and then delete the server from the list of subscribed 
  818: 
  819:     servers for the resource (</span><b>Step U3d</b><span
  820: 
  821: style='font-weight:normal'>).</span></p>
  822: 
  823:   <p>If the resource had been accessed recently, the modified resource will be 
  824: 
  825:     copied over using the same mechanism as in <b>Step D5a</b><span
  826: 
  827: style='font-weight:normal'> through </span><b>D7</b><span style='font-weight:
  828: 
  829: normal'> of </span><b>Fig. 1.1.3A</b><span style='font-weight:normal'> (</span><b>Fig. 
  830: 
  831:     1.1.3B</b><span style='font-weight:normal'>, </span><b>Steps U4a </b><span
  832: 
  833: style='font-weight:normal'>through</span><b> U6</b><span style='font-weight:
  834: 
  835: normal'>).</span></p>
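  <p>The decision each subscribing server makes after an update message (Steps U2 through U4a) can be sketched as follows. The callables <code>delete_local</code>, <code>unsubscribe</code> and <code>refetch</code> are hypothetical hooks, and the timestamps are plain seconds; the source only states that the window is configurable.</p>

```python
def on_update(url, last_access, now, recent_window,
              delete_local, unsubscribe, refetch):
    """React to an update message for one resource.

    last_access:   dict mapping a URL to the time of its last access (assumed).
    recent_window: number of seconds that counts as "recently" (configurable).
    """
    accessed = last_access.get(url)
    if accessed is not None and now - accessed <= recent_window:   # U2
        refetch(url)          # U4a onward: same transfer as Steps D5a to D7
        return "refetched"
    delete_local(url)         # U3a: drop the stale local copy
    unsubscribe(url)          # U3b: tell the Home Server
    return "unsubscribed"
```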
  836: 
  837:   <p><span style='font-family:Arial'>Load Balancing</span></p>
  838: 
  839:   <p><span style='font-family:"Courier New"'>lond</span> provides a function to 
  840: 
  841:     query the server&rsquo;s current <span style='font-family:"Courier New"'>loadavg</span><span
  842: 
  843: style='font-size:14.0pt'>. </span>A configuration parameter sets 
  844: 
  845:     the value of <span style='font-family:"Courier New"'>loadavg</span> that 
  846: 
  847:     is to be considered 100%, for example 2.00. </p>
  848: 
  849:   <p>Access servers can have a list of spare access servers, <span
  850: 
  851: style='font-size:9.0pt;font-family:"Courier New"'>/home/httpd/lonTabs/spares.tab</span>, 
  852: 
  853:     to offload sessions depending on their own workload. This check is done 
  854: 
  855:     by the login handler, which redirects the login information and session to the 
  856: 
  857:     least busy spare server if the server itself is overloaded. An additional round-robin 
  858: 
  859:     IP scheme is possible. See <b>Fig. 1.1.4</b><span style='font-weight:normal'> 
  860: 
  861:     for an example of a load-balancing scheme.</span></p>
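  <p>The login handler's decision can be illustrated with the sketch below. It assumes that a server counts as overloaded once its load reaches 100% of the configured value, and that the spare servers' load averages have already been collected into a dictionary; neither detail is specified in the text, and the function names are made up for the example.</p>

```python
def load_percent(loadavg, full_load=2.00):
    """Express a load average as a percentage of the configured 100% value."""
    return 100.0 * loadavg / full_load


def choose_server(own_loadavg, spare_loadavgs, full_load=2.00):
    """Return the spare server to redirect to, or None to keep the session.

    spare_loadavgs: dict mapping a spare server name to its current loadavg,
                    for the servers listed in spares.tab (assumed to be
                    queried beforehand).
    """
    overloaded = load_percent(own_loadavg, full_load) >= 100.0  # assumed threshold
    if not overloaded or not spare_loadavgs:
        return None
    return min(spare_loadavgs, key=spare_loadavgs.get)          # least busy spare
```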
  862: 
  863:   <p><span style='font-size:28.0pt;color:green'> <img width=241 height=139
  864: 
  865: src="Session%20One_files/image013.jpg" v:shapes="_x0000_i1031"> </span></p>
  866: 
  867:   <p><span
  868: 
  869: style='font-size:14.0pt'><b>Fig. 1.1.4 &ndash; </b></span><span style='font-size:14.0pt'>Example 
  870: 
  871:     of Load Balancing</span><span style='font-size:14.0pt'> <b><i><br
  872: 
  873: clear=ALL style='page-break-before:always'>
  874: 
  875:     </i></b></span></p>
  876: 
  877: </div>
  878: 
  879: <br
  880: 
  881: clear=ALL style='page-break-before:always;'>
  882: 
  883: <div class=Section2> </div>
  884: 
  885: </body>
  886: 
  887: </html>
  888: 
