Diff for /doc/build/Attic/loncapasqldatabase.html between versions 1.1 and 1.8

version 1.1, 2001/02/07 12:49:17 version 1.8, 2001/02/14 14:33:35
Line 8 Line 8
 Scott Harrison  Scott Harrison
 </P>  </P>
 <P>  <P>
 Last updated: 02/07/2001  Last updated: 02/12/2001
 </P>  </P>
   <P>
 This file describes issues associated with LON-CAPA  This file describes issues associated with LON-CAPA
 and a SQL database.  and a SQL database.
 </P>  </P>
   <H2>Latest HOWTO</H2>
   <P>
   <UL>
   <LI>Current status of documentation</LI>
   <LI>Current status of implementation</LI>
   <LI>Purpose within LON-CAPA</LI>
   <LI>Dependencies</LI>
   <LI>Installation</LI>
   <LI>Installation from source</LI>
   <LI>Configuration (automated)</LI>
   <LI>Manual configuration</LI>
   <LI>Testing</LI>
   <LI>Example sections of code relevant to LON-CAPA</LI>
   </UL>
   </P>
   <H2>Current status of documentation</H2>
   <P>
   I am beginning the documentation by inserting the notes
   I have into this file.  I will subsequently rearrange
   and edit them based on the tests that I conduct.
   I am trying to make sure that documentation, installation,
   and run-time issues are all consistent and correct.  The
   current status is that everything works and has
   been minimally tested, but things need to be cleaned up
   and checked again!
   </P>
   <H2>Current status of implementation</H2>
   <P>
   The following still needs to be done:
   <UL>
   <LI>Installation: Fix binary file listings for user permissions and ownership.</LI>
   <LI>Installation: Make sure the SQL server starts and, if the database does not
   exist, create it (/etc/rc.d).</LI>
   <LI>Processes: Make sure loncron initiates lonsql on library machines.</LI>
   <LI>Read in metadata from the right place periodically.</LI>
   <LI>Implement a tested perl module handler.</LI>
   </UL>
   </P>
   <P>
   Right now, a lot of "feasibility" work has been done.
   Recipes for manual installation and configuration have
   been gathered.  Network connectivity tests of the
   lond->lonsql->lond->lonc type have been performed.  A binary
   installation has been packaged as an RPM (LON-CAPA-mysql, with the
   perl components part of LON-CAPA-systemperl).
   The main feasibility test still missing is benchmarking:
   determining the load up to which the SQL database can
   efficiently serve many users making simultaneous requests
   of the metadata database.
   </P>
   <P>
   Documentation has been pieced together over time.  But,
   as mentioned in the previous section, it needs an
   overhaul.
   </P>
   <P>
   The binary installation has some quirks associated with it.
   Some of the user permissions are wrong, although this is
   benign.  Also, other options of binary installation (such
   as using binary RPMs put together by others) were dismissed
   given the difficulty of getting differing combinations of
   these external RPMs to work together.
   </P>
   <P>
   Most configuration questions have been worked out
   to the point of getting this SQL software component working;
   however, there may be better approaches than those currently
   in use.
   </P>
   <H2>Purpose within LON-CAPA</H2>
   <P>
   LON-CAPA is meant to distribute A LOT of educational content
   to A LOT of people.  Scanning files in the ext2 filesystem
   directly is too slow for on-the-fly searches of content
   descriptions.  (Simply put, it takes a cumbersome amount of
   time to open, read, analyze, and close thousands of files.)
   </P>
   <P>
   The solution is to hash-index various data fields that are
   descriptive of the educational resources on a LON-CAPA server
   machine.  Descriptive data fields are referred to as
   "metadata".  The question then arises as to how this metadata
   is handled in terms of the rest of the LON-CAPA network
   without burdening client and daemon processes.  I now
   answer this question in the format of Problem and Solution
   below.
   </P>
   <P>
   <PRE>
   PROBLEM SITUATION:
   
     If Server A wants data from Server B, Server A uses a lonc process to
     send a database command to a Server B lond process.
       lonc= loncapa client process    A-lonc= a lonc process on Server A
       lond= loncapa daemon process
   
                    database command
       A-lonc  --------TCP/IP----------------> B-lond
   
     The problem emerges that A-lonc and B-lond are kept waiting for the
     MySQL server to "do its stuff", or in other words, perform the conceivably
     sophisticated, data-intensive, time-sucking database transaction.  By tying
     up a lonc and lond process, this significantly cripples the capabilities
     of LON-CAPA servers. 
   
     While commercial databases have a variety of features that ATTEMPT to
     deal with this, freeware databases are still experimenting with
     different schemes, with varying degrees of performance stability.
   
   THE SOLUTION:
   
     A separate daemon process was created that B-lond works with to
     handle database requests.  This daemon process is called "lonsql".
   
     So,
                   database command
     A-lonc  ---------TCP/IP-----------------> B-lond =====> B-lonsql
            <---------------------------------/                |
              "ok, I'll get back to you..."                    |
                                                               |
                                                               /
     A-lond  <-------------------------------  B-lonc   <======
              "Guess what? I have the result!"
   
     Of course, depending on success or failure, the messages may vary,
     but the principle remains the same: a separate pool of child
     processes (lonsql's) handles the MySQL database manipulations.
   </PRE>
   </P>
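   <P>
   The hand-off in the diagram can be sketched in a few lines of Perl.
   This is only an in-memory illustration, not the actual daemon code:
   the subroutine names (b_lond_querysend, b_lonsql_work,
   a_lond_queryreply), the %pending and %results hashes, and the
   "serverB_N" id format are all hypothetical, and the real daemons
   talk over TCP/IP sockets rather than calling each other directly.
   </P>
   <P>
   <PRE>
   use strict;
   use warnings;

   my %pending;    # B side: query id => query text waiting for lonsql
   my %results;    # A side: query id => result delivered by queryreply
   my $counter = 0;

   # B-lond: accept the query, queue it for lonsql, answer at once with an id.
   sub b_lond_querysend {
       my ($query) = @_;
       my $id = "serverB_" . (++$counter);    # hypothetical id format
       $pending{$id} = $query;
       return $id;                            # A-lonc is free again
   }

   # A-lond: store the result under the query id and acknowledge.
   sub a_lond_queryreply {
       my ($id, $result) = @_;
       $results{$id} = $result;
       return "ok";
   }

   # B-lonsql: do the slow database work later, then report back to A-lond.
   sub b_lonsql_work {
       for my $id (sort keys %pending) {
           my $query  = delete $pending{$id};
           my $result = "result of [$query]";  # stands in for the MySQL call
           a_lond_queryreply($id, $result);
       }
   }

   my $id = b_lond_querysend("SELECT 1");
   print "got id $id, result ready: ", (exists $results{$id} ? "yes" : "no"), "\n";
   b_lonsql_work();
   print "result: $results{$id}\n";
   </PRE>
   </P>
   <P>
   The point of the sketch is that b_lond_querysend returns before any
   database work is done, so neither the lonc nor the lond process is
   tied up while the query runs.
   </P>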
   <H2>Dependencies</H2>
   <P>
   I believe (but am not 100% confident) that the following
   RPMs are necessary (in addition to the current ones
   in rpm_list.txt) to run MySQL.  Basically I discovered these
   dependencies while trying to do external RPM based installs.
   I assume, and sometimes found, that these dependencies apply
   to tarball-based distributions too.  (So to play it on the
   safe side, I am going to include these RPMs as part of the
   core, minimal RPM set.)
   <UL>
   <LI>egcs-1.1.2-30</LI>
   <LI>cpp-1.1.2-30</LI>
   <LI>glibc-devel-2.1.3-15</LI>
   <LI>zlib-devel-1.1.3-6</LI>
   </UL>
   
   </P>
   <H2>Installation</H2>
   <P>
   Installation of the LON-CAPA SQL database normally occurs
   by default when using the LON-CAPA installation CD
   (see http://install.lon-capa.org).  It is installed
   as the LON-CAPA-mysql RPM.  This RPM provides the MySQL
   engine.  The related perl interfaces (Perl::DBI, Perl::Msql-Mysql)
   are provided by the LON-CAPA-systemperl RPM.
   </P>
 <P>  <P>
 <H3>Latest HOWTO</H3>  The three components of a MySQL installation for the
   LON-CAPA system are further described immediately below.
   <TABLE BORDER="0">
   <TR><TD COLSPAN="2"><STRONG>Perl::DBI module</STRONG>-
   the API "front-end"...</TD></TR>
   <TR><TD WIDTH="10%"></TD><TD>database interface module for organizing generic
   database commands which are independent of specific
   database implementation (such as MySQL, mSQL, Postgres, etc).
   </TD></TR>
   <TR><TD COLSPAN="2"><STRONG>Perl::MySQL module</STRONG>-
   the API "mid-section"...</TD></TR>
   <TR><TD WIDTH="10%"></TD><TD>the module to directly interface with the actual
   MySQL database engine</TD></TR>
   <TR><TD COLSPAN="2"><STRONG>MySQL database engine</STRONG>-
   the "back-end"...</TD></TR>
   <TR><TD WIDTH="10%"></TD><TD>the binary installation (compiled either
   from source or pre-compiled file listings) which provides the
   actual MySQL functionality on the system</TD></TR>
   </TABLE>
   </P>
   <H2>Installation from source</H2>
   <P>
   Note: the mysql site recommends that Linux users install by
   using the MySQL RPMs (MySQL-client, MySQL, MySQL-shared, etc).
   While these RPMs work, I was unsuccessful at integrating
   this RPM-installed database with perl modules from www.cpan.org.
   Hence, I <STRONG>strongly</STRONG> recommend that, when installing
   from "source", MySQL and the perl components be in fact installed
   from their tarballs (.tar.gz, .tgz).  (Perl components, when installed
   from RPMs, also wound up in incorrect locations on the disk.)
   Do not coordinate a source install with externally made RPMs!
   It is, of course, okay to use LON-CAPA RPMs such as LON-CAPA-systemperl
   and LON-CAPA-mysql since we, in fact, made these RPMs correctly :).
   <UL>
   <LI>http://www.cpan.org/authors/id/JWIED/Msql-Mysql-modules-1.2215.tar.gz
   <BR>This tarball was released 20th August 2000.</LI>
   <LI>http://www.mysql.com/Downloads/MySQL-3.23/mysql-3.23.33-pc-linux-gnu-i686.tar.gz
   <BR>This tarball was last changed 2000-11-11.
   <BR>This is actually a binary tarball (as opposed to source code
   that is subsequently compiled).</LI>
   </UL>
   </P>
   <FONT COLOR="green"> old notes in green
   <P>
   The following set of tarballs was found to work together
   properly on a LON-CAPA RedHat 6.2 system:
   <UL>
   <LI>DBI-1.13.tar.gz
   <LI>Msql-Mysql-modules-1.2209.tar.gz
   <LI>mysql-3.22.32.tar.gz
   </UL>
   </P>
   <P>
   Installation was simply a matter of following the instructions
   and typing the several "make" commands for each tarball.
   </P>
   </FONT>
   <H2>Configuration (automated)</H2>
   <P>
   Not yet developed.  This will be part of an interface
   present on LON-CAPA systems that can be launched by
   entering the command <TT>/usr/sbin/loncapaconfig</TT>.
   </P>
   <H2>Manual configuration</H2>
   <P>
   This is not complete.
   </P>
   <P>
   <STRONG>Starting the mysql daemon</STRONG>: Log in to the Linux
   system as user 'www'.  Enter the command
   <TT>/usr/local/bin/safe_mysqld &amp;</TT>
   </P>
   <P>
   <STRONG>Set a password for 'root'</STRONG>:
   <TT>/usr/local/bin/mysqladmin -u root password 'new-password'</TT>
   </P>
   <P>
   <STRONG>Adding a user</STRONG>:  Start the mysql daemon.  Log in to the
   mysql system as root (<TT>mysql -u root -p mysql</TT>)
   and enter the root password (for instance 'newmysql').  Then add the user
   www:
   <PRE>
   INSERT INTO user (Host, User, Password)
   VALUES ('localhost','www',password('newmysql'));
   </PRE>
   </P>
   <P>
   <STRONG>Granting privileges to user 'www'</STRONG>:
   <PRE>
   GRANT ALL PRIVILEGES ON *.* TO www@localhost;
   FLUSH PRIVILEGES;
   </PRE>
   </P>
   <P>
   <STRONG>Set the SQL server to start upon system startup</STRONG>:
   Copy support-files/mysql.server to the right place on the system
   (/etc/rc.d/...).
   </P>
   <P>
   <STRONG>The Perl API</STRONG>
   <PRE>
      $dbh = DBI->connect("DBI:mysql:loncapa",
                          "www",
                          "SOMEPASSWORD",
                          { RaiseError => 0, PrintError => 0 });
   
   There is an obvious need to CONNECT to the database, and in order to do
   this, there must be:
     a RUNNING mysql daemon;
     a DATABASE named "loncapa";
     a USER named "www";
     and an ABILITY for LON-CAPA on one machine to access
          SQL database on another machine;
     
   So, here are some notes on implementing these configurations.
   
   ** RUNNING mysql daemon (safe_mysqld method)
   
   The recommended way to run the MySQL daemon is as a non-root user
   (probably www)...
   
   so, 1) login as user www on the linux machine
       2) start the mysql daemon as /usr/local/bin/safe_mysqld &
   
   safe_mysqld only works if the directories of the local MySQL installation
   have the right ownership, which I found to be:
   chown www:users /usr/local/var/mysql
   chown www:users /usr/local/lib/mysql
   chown -R www:users /usr/local/mysql
   chown www:users /usr/local/include/mysql
   chown www:users /usr/local/var
   
   ** DATABASE named "loncapa"
   
   As user www, run this command
       mysql -u root -p mysql
   enter the password as SOMEPASSWORD
   
   This allows you to manually enter MySQL commands.
   The MySQL command to generate the loncapa DATABASE is:
   
   CREATE DATABASE loncapa;
   
   ** USER named "www"
   
   As user www, run this command
       mysql -u root -p mysql
   enter the password as SOMEPASSWORD
   
   To add the user www to the MySQL server and grant it all
   privileges on *.* (as www@localhost, with the password
   'SOMEPASSWORD'), enter:
   
   INSERT INTO user (Host, User, Password)
   VALUES ('localhost','www',password('SOMEPASSWORD'));
   
   GRANT ALL PRIVILEGES ON *.* TO www@localhost;
   
   FLUSH PRIVILEGES;
   
   ** ABILITY for LON-CAPA machines to communicate with SQL databases on
      other LON-CAPA machines
   
   An up-to-date lond and lonsql.
   </PRE>
   </P>
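   <P>
   Once the daemon, the database, and the user are in place, the
   connection can be checked with a short script.  The following is a
   minimal sketch, assuming the DBI module and its MySQL driver are
   installed and that SOMEPASSWORD has been replaced by the real
   password; it is not part of LON-CAPA.  Because RaiseError and
   PrintError are switched off, a failed connect returns undef and the
   reason is found in $DBI::errstr.  The eval is there because DBI
   dies outright when the driver itself cannot be loaded.
   </P>
   <P>
   <PRE>
   use strict;
   use warnings;
   use DBI;

   # Same connection settings as in the notes above; SOMEPASSWORD is a placeholder.
   my $dbh = eval {
       DBI->connect("DBI:mysql:loncapa", "www", "SOMEPASSWORD",
                    { RaiseError => 0, PrintError => 0 });
   };

   if ($dbh) {
       # A trivial round trip shows that daemon, database and user all work.
       my ($one) = $dbh->selectrow_array("SELECT 1");
       print "connected, SELECT 1 returned ", (defined $one ? $one : "nothing"), "\n";
       $dbh->disconnect;
   } else {
       # $@ is set if the driver could not be loaded, $DBI::errstr otherwise.
       my $why = $@ || $DBI::errstr || "unknown error";
       print "connection failed: $why\n";
   }
   </PRE>
   </P>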
   <H2>Testing</H2>
   <P>
   <PRE>
   <STRONG>** TEST the database connection with my current tester.pl code,
   which mimics the command that will eventually be sent through lonc.</STRONG>
   
   $reply=reply(
       "querysend:SELECT * FROM general_information WHERE Id='AAAAA'",$lonID);
   </PRE>
   </P>
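   <P>
   The path of such a command can be followed by hand.  The sketch
   below builds the querysend line, then does what lond and lonsql do
   with it: lond splits off the command word and prepends the id of the
   connecting server, and lonsql splits on &amp; and unescapes the
   query.  Several things here are assumptions and not taken from the
   LON-CAPA source: the escape and unescape subroutines (simple %XX
   escaping), the host id "msul1", and the decision to escape the query
   before sending it (done here so that a ":" or &amp; inside the
   query cannot be mistaken for a delimiter).
   </P>
   <P>
   <PRE>
   use strict;
   use warnings;

   # Assumed escaping: every non-word character becomes %XX (hex).
   sub escape {
       my ($str) = @_;
       $str =~ s/(\W)/"%" . unpack("H2", $1)/eg;
       return $str;
   }

   sub unescape {
       my ($str) = @_;
       $str =~ s/%([0-9a-fA-F]{2})/pack("C", hex($1))/eg;
       return $str;
   }

   my $query = "SELECT * FROM general_information WHERE Id='AAAAA'";

   # What the tester sends through lonc to lond.
   my $command = "querysend:" . escape($query);

   # What lond does with it: split off the command word, prepend the
   # id of the connecting server, and hand the line to lonsql.
   my ($cmd, $payload) = split(/:/, $command, 2);
   my $to_lonsql = "msul1" . "&" . $payload;    # "msul1" is a made-up host id

   # What lonsql does: split on the ampersand and unescape the query.
   my ($conserver, $querytmp) = split(/&/, $to_lonsql);
   print "server: $conserver\n";
   print "query:  ", unescape($querytmp), "\n";
   </PRE>
   </P>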
   <H2>Example sections of code relevant to LON-CAPA</H2>
   <P>
   Here are excerpts of code which implement the above handling:
   </P>
   <P>
   <PRE>
   <STRONG>**LONSQL
   A subroutine from "lonsql" which establishes a child process for handling
   database interactions.</STRONG>
   
   sub make_new_child {
       my $pid;
       my $sigset;
       
       # block signal for fork
       $sigset = POSIX::SigSet->new(SIGINT);
       sigprocmask(SIG_BLOCK, $sigset)
           or die "Can't block SIGINT for fork: $!\n";
       
       die "fork: $!" unless defined ($pid = fork);
       
       if ($pid) {
           # Parent records the child's birth and returns.
           sigprocmask(SIG_UNBLOCK, $sigset)
               or die "Can't unblock SIGINT for fork: $!\n";
           $children{$pid} = 1;
           $children++;
           return;
       } else {
           # Child can *not* return from this subroutine.
           $SIG{INT} = 'DEFAULT';      # make SIGINT kill us as it did before
       
           # unblock signals
           sigprocmask(SIG_UNBLOCK, $sigset)
               or die "Can't unblock SIGINT for fork: $!\n";
   
   
           # open database handle
           # making dbh global to avoid garbage collector
           unless (
               $dbh = DBI->connect("DBI:mysql:loncapa","www","SOMEPASSWORD",
                                   { RaiseError => 0, PrintError => 0 })
           ) {
               my $st=120+int(rand(240));
               # with RaiseError off, DBI reports the failure in $DBI::errstr, not $@
               &logthis("<font color=blue>WARNING: Couldn't connect to database  ($st secs): $DBI::errstr</font>");
               print "database handle error\n";
               sleep($st);
               exit;
           }
           # make sure that a database disconnection occurs with ending kill signals
           $SIG{TERM}=$SIG{INT}=$SIG{QUIT}=$SIG{__DIE__}=\&DISCONNECT;
   
           # handle connections until we've reached $MAX_CLIENTS_PER_CHILD
           for ($i=0; $i < $MAX_CLIENTS_PER_CHILD; $i++) {
               $client = $server->accept()     or last;
               
               # do something with the connection
               $run = $run+1;
               my $userinput = <$client>;
               chomp($userinput);

               # input is "connecting server&escaped query"
               my ($conserver,$querytmp)=split(/&/,$userinput);
               my $query=unescape($querytmp);

               # send query id which is server_pid_unixdatetime_runningcounter
               $queryid = $thisserver;
               $queryid .="_".($$)."_";
               $queryid .= time."_";
               $queryid .= $run;
               print $client "$queryid\n";

               # prepare and execute the query
               my $sth = $dbh->prepare($query);
               my $result;
               unless ($sth and $sth->execute()) {
                   # $DBI::errstr holds the error ($@ is not set with RaiseError off)
                   &logthis("<font color=blue>WARNING: Could not retrieve from database: $DBI::errstr</font>");
                   $result="";
               }
               else {
                   # rows are joined by "&", escaped fields within a row by ","
                   my $r1=$sth->fetchall_arrayref;
                   my @r2=map { join(",", map { escape($_) } @$_) } @$r1;
                   $result=join("&",@r2) . "\n";
               }
               &reply("queryreply:$queryid:$result",$conserver);
   
           }
       
           # tidy up gracefully and finish
   
           # close the database handle
           $dbh->disconnect
               or &logthis("<font color=blue>WARNING: Couldn't disconnect from database: $DBI::errstr</font>");
       
           # this exit is VERY important, otherwise the child will become
           # a producer of more and more children, forking yourself into
           # process death.
           exit;
       }
   }
   </PRE>
   </P>
   <P>
   <STRONG>** LOND enabling of MySQL requests</STRONG>
   <BR />
   This code is part of every lond child process; it is where
   lond parses the commands sent to it by lonc processes.
   In terms of the diagram above, querysend is the command that
   A-lonc sends to B-lond: B-lond hands the query to lonsql
   (through sqlreply) and passes lonsql's answer, the query id,
   back to A-lonc ($client).  queryreply is the command that
   B-lonc later sends to A-lond with the result of the query:
   A-lond stores the result in a file named after the query id
   and returns "ok".
   <PRE>
   # ------------------------------------------------------------------- querysend
                      } elsif ($userinput =~ /^querysend/) {
                          my ($cmd,$query)=split(/:/,$userinput);
                          $query=~s/\n*$//g;
                          print $client sqlreply("$hostid{$clientip}\&$query")."\n";
   # ------------------------------------------------------------------ queryreply
                      } elsif ($userinput =~ /^queryreply/) {
                          my ($cmd,$id,$reply)=split(/:/,$userinput);
                          my $store;
                          my $execdir=$perlvar{'lonDaemons'};
                          if ($store=IO::File->new(">$execdir/tmp/$id")) {
                              print $store $reply;
                              close $store;
                              print $client "ok\n";
                          }
                          else {
                              print $client "error:$!\n";
                          }
   
   </PRE>
   
 </P>  </P>
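   <P>
   The result string that lonsql sends back with queryreply has a
   simple shape: each field is escaped, the fields of a row are joined
   by ",", and the rows are joined by &amp;.  A receiver could take it
   apart as sketched below.  The subroutine names encode_rows and
   decode_rows are hypothetical, and escape and unescape are again
   assumed to be plain %XX escaping; the real LON-CAPA routines may
   differ.
   </P>
   <P>
   <PRE>
   use strict;
   use warnings;

   # Assumed %XX escaping of every non-word character.
   sub escape   { my ($s) = @_; $s =~ s/(\W)/"%" . unpack("H2", $1)/eg; return $s; }
   sub unescape { my ($s) = @_; $s =~ s/%([0-9a-fA-F]{2})/pack("C", hex($1))/eg; return $s; }

   # Encode rows the way the lonsql excerpt does.
   sub encode_rows {
       my ($rows) = @_;
       return join("&", map { join(",", map { escape($_) } @$_) } @$rows);
   }

   # Reverse it: split rows on the ampersand, fields on the comma,
   # then unescape each field.
   sub decode_rows {
       my ($result) = @_;
       chomp($result);
       return [ map { [ map { unescape($_) } split(/,/, $_, -1) ] } split(/&/, $result) ];
   }

   my $rows    = [ [ "AAAAA", "Intro to Physics, part 1" ], [ "BBBBB", "Q&A" ] ];
   my $encoded = encode_rows($rows);
   my $decoded = decode_rows($encoded);

   print "$encoded\n";
   print scalar(@$decoded), " rows, second field of first row: $decoded->[0][1]\n";
   </PRE>
   </P>
   <P>
   Because the delimiters are escaped inside the fields, a comma or
   ampersand in the data survives the round trip.  NULL fields are not
   handled in this sketch.
   </P>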
 </BODY>  </BODY>
 </HTML>  
   
   </HTML>
