Diff for /loncom/lond between versions 1.408 and 1.409

version 1.408, 2008/09/06 00:47:13 version 1.409, 2008/10/07 10:08:06
Line 7118  linux Line 7118  linux
 Server/Process  Server/Process
   
 =cut  =cut
   
   
   =pod
   
   =head1 LOG MESSAGES
   
   The messages below can be emitted in the lond log.  This log is located
   in ~httpd/perl/logs/lond.log.  Many log messages have HTML encapsulation
   to provide coloring if examined from inside a web page.  Some do not.
   Where color is used, the colors are: Red for something to get excited
   about and to follow up on, Yellow for something to keep an eye on to
   be sure it does not get worse, and Green and Blue for informational items.
   
   In the discussions below, reference is sometimes made to ~httpd
   when describing file locations.  There isn't really an httpd
   user; however, there is an httpd directory that gets installed in the
   place that user home directories go.  On Linux, this is usually
   (always?) /home/httpd.
   
   
   Some messages are colorless.  These are usually (not always)
   Green/Blue color level messages.
   
   =over 2
   
   =item (Red)  LocalConnection rejecting non local: <ip> ne 127.0.0.1
   
   A local connection negotiation was attempted by
   a host whose IP address was not 127.0.0.1.
   The socket is closed and the child will exit.
   lond has three ways to establish an encryption
   key with a client:
   
   =over 2
   
   =item local 
   
   The key is written to and read from a file.
   This is only valid for connections from localhost.
   
   =item insecure 
   
   The key is generated by the server and
   transmitted to the client.
   
   =item  ssl (secure)
   
   An ssl connection is negotiated with the client.
   The key is generated by the server and sent to the
   client across this ssl connection before the
   ssl connection is terminated and clear text
   transmission resumes.
   
   =back
   
   =item (Red) LocalConnection: caller is insane! init = <init> and type = <type>
   
   The client is local but has not sent an initialization
   string that is the literal "init:local".  The connection
   is closed and the child exits.
   
   =item (Red) CRITICAL Can't get key file <error>

   SSL key negotiation is being attempted but the call to
   lonssl::KeyFile failed.  This usually means that the
   configuration file is not correctly defining or protecting
   the directories/files lonCertificateDirectory or
   lonnetPrivateKey.
   <error> is a string that describes the reason that
   the key file could not be located.
   
   =item (Red) CRITICAL  Can't get certificates <error>  
   
   SSL key negotiation failed because we were not able to retrieve our certificate
   or the CA's certificate in the call to lonssl::CertificateFile.
   <error> is the textual reason this failed.  Usual reasons:
   
   =over 2
          
   =item Apache config file for loncapa incorrect:

   One of the variables
   lonCertificateDirectory, lonnetCertificateAuthority, or lonnetCertificate
   is undefined or incorrect.
   
   =item Permission error:
   
   The directory pointed to by lonCertificateDirectory is not readable by lond.
   
   =item Permission error:
   
   Files in the directory pointed to by lonCertificateDirectory are not readable by lond.
   
   =item Installation error:

   Either the certificate authority file or the certificate has not
   been installed in lonCertificateDirectory.

   =back
   
   =item (Red) CRITICAL SSL Socket promotion failed:  <err>

   The promotion of the connection from plaintext to SSL failed.
   <err> is the reason for the failure.  There are two
   calls involved in the promotion (one of which failed):
   a dup to produce
   a second fd on the raw socket over which the encrypted data
   will flow, and IO::Socket::SSL->new_from_fd, which creates
   the SSL connection on the duped fd.
   
   =item (Blue)   WARNING client did not respond to challenge 
   
   This occurs on an insecure (non SSL) connection negotiation request.
   lond generates a number from the time and the PID and sends it to
   the client.  The client must respond by echoing this information back.
   If the client does not do so, that is a violation of the challenge
   protocol and the connection fails.
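
   The challenge exchange described above can be sketched as follows.  This is
   a minimal illustration, not lond's actual code: the helper names
   make_challenge and challenge_ok are hypothetical, and the way the time and
   the PID are combined is an assumption, since the text only says that both
   are used.

```perl
use strict;
use warnings;

# Hypothetical helper: build a challenge string from a time and a PID.
# The "time.pid" layout is an assumption made for this sketch.
sub make_challenge {
    my ($time, $pid) = @_;
    return "$time.$pid";
}

# Hypothetical helper: true only when the client echoed the challenge exactly.
sub challenge_ok {
    my ($challenge, $reply) = @_;
    return 0 unless defined $reply;     # client sent nothing back
    $reply =~ s/\r?\n\z//;              # drop the line terminator, if any
    return $reply eq $challenge ? 1 : 0;
}
```

   A server would call make_challenge(time, $$), send the result, read one
   line back, and log the WARNING when challenge_ok returns false.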
   
   =item (Red) No manager table. Nobody can manage!!    
   
   lond has the concept of privileged hosts that
   can perform remote management functions such
   as updating hosts.tab.  The manager hosts
   are described in the
   ~httpd/lonTabs/managers.tab file.
   This message is logged if this file is missing.
   
   
   =item (Green) Registering manager <dnsname> as <cluster_name> with <ipaddress>
   
   Reports the successful parse and registration
   of a specific manager. 
   
   =item (Green) existing host <clustername:dnsname>

   The manager host is already defined in hosts.tab.
   The information in that table, rather than the info in the
   manager table, will be used to determine the manager's ip.
   
   =item (Red) Unable to create <filename>
   
   lond has been asked to create new versions of an administrative
   file (by a manager).  When this is done, the new file is created
   in a temp file and then renamed into place so that there are always
   usable administrative files, even if the update fails.  This failure
   message means that the temp file could not be created.
   The update is abandoned, and the old file is available for use.
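
   The write-to-temp-then-rename approach described above can be sketched as
   follows.  This is an illustration under stated assumptions, not lond's
   implementation: the function name install_file and the ".tmp" suffix are
   hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: replace $target with $contents without ever leaving
# a partially written $target behind.  Returns 1 on success, 0 on failure.
sub install_file {
    my ($target, $contents) = @_;
    my $temp = "$target.tmp";               # assumed temp-file naming

    # Failing here corresponds to the "Unable to create" case:
    # the old file is untouched and still usable.
    open(my $fh, '>', $temp) or return 0;
    print {$fh} $contents or do { close($fh); unlink($temp); return 0 };
    close($fh) or do { unlink($temp); return 0 };

    # rename() swaps the new file into place in a single step when both
    # names are on the same filesystem.
    unless (rename($temp, $target)) {
        unlink($temp);
        return 0;
    }
    return 1;
}
```

   Because the old file is only replaced by the final rename, a failure at any
   earlier step leaves the existing administrative file in place.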
   
   =item (Green) CopyFile from <oldname> to <newname> failed
   
   In an update of administrative files, the copy of the existing file to a
   backup file failed.  The installation of the new file may still succeed,
   but there will not be a backup file to revert to (this should probably
   be yellow).
   
   =item (Green) Pushfile: backed up <oldname> to <newname>
   
   See above.  The backup of the old administrative file succeeded.
   
   =item (Red)  Pushfile: Unable to install <filename> <reason>
   
   The new administrative file could not be installed.  In this case,
   the old administrative file is still in use.
   
   =item (Green) Installed new <filename>.

   The new administrative file was successfully installed.
   
   =item (Red) Reinitializing lonc pid=<pid>
   
   The lonc child process <pid> will be sent a USR2 
   signal.
   
   =item (Red) Reinitializing self                                    
   
   We've been asked to re-read our administrative files, and
   are doing so.
   
   =item (Yellow) error:Invalid process identifier <ident>  
   
   A reinit command was received, but the target part of the
   command was not valid.  It must be either
   'lond' or 'lonc' but was <ident>.
   
   =item (Green) isValideditCommand checking: Command = <command> Key = <key> newline = <newline>
   
   Checking to see if lond has been handed a valid edit
   command.  It is possible the edit command is not valid;
   in that case there are no log messages to indicate that.
   
   =item Result of password change for  <username> pwchange_success
   
   The password for <username> was
   successfully changed.
   
   =item Unable to open <user> passwd to change password
   
   Could not rewrite the
   internal password file for a user.
   
   =item Result of password change for <user> : <result>
                                                                        
   A Unix password change for <user> was attempted
   and the pipe returned <result>.
   
   =item LWP GET: <message> for <fname> (<remoteurl>)
   
   The LWP (libwww-perl) fetch for a resource failed
   with <message>.  The local filename that should
   have existed or been created was <fname>; the
   corresponding URI is <remoteurl>.  This is emitted in several
   places.
   
   =item Unable to move <transname> to <destname>     
   
   From fetch_user_file_handler - the user file was replicated but could not
   be mv'd to its final location.
   
   =item Looking for <domain> <username>              
   
   From user_has_session_handler.  This should be a Debug call instead.
   It indicates lond is about to check whether the specified user has a
   session active on the specified domain on the local host.
   
   =item Client <ip> (<name>) hanging up: <input>     
   
   lond has been asked to exit by its client.  The <ip> and <name> identify the
   client system and <input> is the full exit command sent to the server.
   
   =item (Red) CRITICAL: ABNORMAL EXIT. child <pid> for server <hostname> died through a crass with this error->[<message>].

   A lond child terminated.  Note that this termination can also occur when the
   child receives the QUIT or DIE signals.  <pid> is the process id of the child,
   <hostname> the host lond is working for, and <message> the reason the child died
   to the best of our ability to get it (I would guess that any numeric value
   represents an errno value).  This is immediately followed by:
   
   =item  Famous last words: Catching exception - <log> 
   
   where <log> is some recent information about the state of the child.
   
   =item (Red) CRITICAL: TIME OUT <pid>

   Some timeout occurred for server <pid>.  This is normally a timeout on an LWP
   doing an HTTP GET.
   
   =item child <pid> died                              
   
   The reaper caught a SIGCHLD for the lond child process <pid>.
   This should be modified to also display the IP of the dying child,
   $children{$pid}.
   
   =item Unknown child 0 died

   A child died but the wait for it returned a pid of zero, which really should not
   ever happen.
   
   =item Child <which> - <pid> looks like we missed it's death 
   
   When a SIGCHLD is received, the reaper process checks all children to see if they are
   alive.  If children are dying quite quickly, the lack of signal queuing can mean
   that a signal heralds the death of more than one child.  If so, this message indicates
   which other one died.  <which> is the ip of a dead child.
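
   Because one SIGCHLD can stand for several dead children, a reaper normally
   loops until no more exited children remain.  The sketch below shows that
   pattern with waitpid and WNOHANG.  It is not lond's reaper: the name
   reap_children and the pid-to-ip layout of the hash are assumptions made for
   the example.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);

# Hypothetical reaper: collect every child that has already exited.
# $children is a hash ref mapping pid => client ip (assumed layout).
sub reap_children {
    my ($children) = @_;
    my @reaped;
    # waitpid returns a pid > 0 for each exited child, 0 when children
    # exist but none has exited yet, and -1 when there are no children.
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        push @reaped, $pid;
        delete $children->{$pid};
    }
    return @reaped;
}
```

   Calling this from a SIGCHLD handler, or from the main loop after the handler
   sets a flag, picks up every child that died since the last call, even when
   several deaths were reported by a single signal.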
   
   =item Free socket: <shutdownretval>                
   
   The HUNTSMAN sub was called due to a SIGINT in a child process.  The socket is being shut down.
   For whatever reason, <shutdownretval> is printed, but in fact shutdown() is not documented
   to return anything.  This is followed by:
   
   =item (Red) CRITICAL: Shutting down
   
   Just prior to exit.
   
   =item Free socket: <shutdownretval>                 
   
   The HUPSMAN sub was called due to a SIGHUP.  All children get killed, and lond execs itself.
   This is followed by:
   
   =item (Red) CRITICAL: Restarting                         
   
   lond is about to exec itself to restart.
   
   =item (Blue) Updating connections                        
   
   (In response to a USR2).  All the children (except the one for localhost)
   are about to be killed, the hosts tab reread, and Apache reloaded via apachereload.
   
   =item (Blue) UpdateHosts killing child <pid> for ip <ip>   
   
   Due to USR2 as above.
   
   =item (Green) keeping child for ip <ip> (pid = <pid>)    
   
   In response to USR2 as above, the child indicated is not being restarted because
   it's assumed that we'll always need a child for the localhost.
   
   
   =item Going to check on the children                
   
   Parent is about to check on the health of the child processes.
   Note that this is in response to a USR1 sent to the parent lond.
   There may be one or more of the next two messages:
   
   =item <pid> is dead                                 
   
   A child that we have in our child hash as alive has evidently died.
   
   =item  Child <pid> did not respond                   
   
   In the health check the child <pid> did not update/produce a pid_.txt
   file when sent its USR1 signal.  That process is killed with signal 9, as it's
   assumed to be hung in some un-fixable way.
   
   =item Finished checking children                   
    
   Master process's USR1 processing is complete.
   
   =item (Red) CRITICAL: ------- Starting ------            
   
   (There are more '-'s on either side).  lond has forked itself off to
   form a new session and is about to start actual initialization.
   
   =item (Green) Attempting to start child (<client>)       
   
   Started a new child process for <client>.  <client> is the IO::Socket object
   connected to the child.  This was as a result of a TCP/IP connection from a client.
   
   =item Unable to determine who caller was, getpeername returned nothing
                                                     
   In child process initialization.  Either getpeername returned undef or
   a zero sized object was returned.  Processing continues, but in my opinion,
   this should be cause for the child to exit.
   
   =item Unable to determine clientip                  
   
   In child process initialization.  The peer address from getpeername was not defined.
   The client address is stored as "Unavailable" and processing continues.
   
   =item (Yellow) INFO: Connection <ip> <name> connection type = <type>
                                                     
   In child initialization.  A good connection was received from <ip>.
   
   =over 2
   
   =item <name> 
   
   is the name of the client from hosts.tab.
   
   =item <type> 
   
   is the connection type, which is one of:
   
   =over 2
   
   =item manager 
   
   The connection is from a manager node, not in hosts.tab.

   =item client

   The connection is from a non-manager in the hosts.tab.
   
   =item both
   
   The connection is from a manager in the hosts.tab.
   
   =back
   
   =back
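
   The three connection types above follow from two yes/no facts about the
   peer: whether it is listed as a manager and whether it appears in
   hosts.tab.  A minimal sketch of that mapping is shown below.  The function
   name connection_type and its two boolean arguments are hypothetical, and
   returning undef for a peer in neither table is an assumption.

```perl
use strict;
use warnings;

# Hypothetical classifier for the <type> field of the INFO message.
# Returns undef when the peer is in neither table.
sub connection_type {
    my ($is_manager, $in_hosts_tab) = @_;
    return 'both'    if $is_manager && $in_hosts_tab;
    return 'manager' if $is_manager;
    return 'client'  if $in_hosts_tab;
    return;
}
```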
   
   =item (Blue) Certificates not installed -- trying insecure auth
   
   One of the certificate file, key file, or
   certificate authority file could not be found for a client attempting
   SSL connection initiation.  Connection will be attempted in insecure mode.
   (This would be a system with an up-to-date lond that has not gotten a
   certificate from us.)
   
   =item (Green)  Successful local authentication            
   
   A local connection successfully negotiated the encryption key. 
   In this case the IDEA key is in a file (that is hopefully well protected).
   
   =item (Green) Successful ssl authentication with <client>  
   
   The client (<client> is the peer's name in hosts.tab) has successfully
   negotiated an SSL connection with this child process.
   
   =item (Green) Successful insecure authentication with <client>
                                                      
   
   The client has successfully negotiated an insecure connection with the child process.
   
   =item (Yellow) Attempted insecure connection disallowed    
   
   The client attempted to negotiate an insecure connection and failed.
   This can happen either because the variable londAllowInsecure is false
   or undefined, or because the client did not successfully echo back the challenge
   string.
   
   
   =back
   
   
   =cut
