--- loncom/lond 2008/09/06 00:47:13 1.408 +++ loncom/lond 2008/10/07 10:08:06 1.409 @@ -2,7 +2,7 @@ # The LearningOnline Network # lond "LON Daemon" Server (port "LOND" 5663) # -# $Id: lond,v 1.408 2008/09/06 00:47:13 raeburn Exp $ +# $Id: lond,v 1.409 2008/10/07 10:08:06 foxr Exp $ # # Copyright Michigan State University Board of Trustees # @@ -59,7 +59,7 @@ my $DEBUG = 0; # Non zero to ena my $status=''; my $lastlog=''; -my $VERSION='$Revision: 1.408 $'; #' stupid emacs +my $VERSION='$Revision: 1.409 $'; #' stupid emacs my $remoteVERSION; my $currenthostid="default"; my $currentdomainid; @@ -7118,3 +7118,406 @@ linux Server/Process =cut + + +=pod + +=head1 LOG MESSAGES + +The messages below can be emitted in the lond log. This log is located +in ~httpd/perl/logs/lond.log Many log messages have HTML encapsulation +to provide coloring if examined from inside a web page. Some do not. +Where color is used, the colors are; Red for sometihhng to get excited +about and to follow up on. Yellow for something to keep an eye on to +be sure it does not get worse, Green,and Blue for informational items. + +In the discussions below, sometimes reference is made to ~httpd +when describing file locations. There isn't really an httpd +user, however there is an httpd directory that gets installed in the +place that user home directories go. On linux, this is usually +(always?) /home/httpd. + + +Some messages are colorless. These are usually (not always) +Green/Blue color level messages. + +=over 2 + +=item (Red) LocalConnection rejecting non local: ne 127.0.0.1 + +A local connection negotiation was attempted by +a host whose IP address was not 127.0.0.1. +The socket is closed and the child will exit. +lond has three ways to establish an encyrption +key with a client: + +=over 2 + +=item local + +The key is written and read from a file. +This is only valid for connections from localhost. + +=item insecure + +The key is generated by the server and +transmitted to the client. + +=item ssl (secure) + +An ssl connection is negotiated with the client, +the key is generated by the server and sent to the +client across this ssl connection before the +ssl connectionis terminated and clear text +transmission resumes. + +=back + +=item (Red) LocalConnection: caller is insane! init = and type = + +The client is local but has not sent an initialization +string that is the literal "init:local" The connection +is closed and the child exits. + +=item Red CRITICAL Can't get key file + +SSL key negotiation is being attempted but the call to +lonssl::KeyFile failed. This usually means that the +configuration file is not correctly defining or protecting +the directories/files lonCertificateDirectory or +lonnetPrivateKey + is a string that describes the reason that +the key file could not be located. + +=item (Red) CRITICAL Can't get certificates + +SSL key negotiation failed because we were not able to retrives our certificate +or the CA's certificate in the call to lonssl::CertificateFile + is the textual reason this failed. Usual reasons: + +=over 2 + +=item Apache config file for loncapa incorrect: + +one of the variables +lonCertificateDirectory, lonnetCertificateAuthority, or lonnetCertificate +undefined or incorrect + +=item Permission error: + +The directory pointed to by lonCertificateDirectory is not readable by lond + +=item Permission error: + +Files in the directory pointed to by lonCertificateDirectory are not readable by lond. + +=item Installation error: + +Either the certificate authority file or the certificate have not +been installed in lonCertificateDirectory. + +=item (Red) CRITICAL SSL Socket promotion failed: + +The promotion of the connection from plaintext to SSL failed + is the reason for the failure. There are two +system calls involved in the promotion (one of which failed), +a dup to produce +a second fd on the raw socket over which the encrypted data +will flow and IO::SOcket::SSL->new_from_fd which creates +the SSL connection on the duped fd. + +=item (Blue) WARNING client did not respond to challenge + +This occurs on an insecure (non SSL) connection negotiation request. +lond generates some number from the time, the PID and sends it to +the client. The client must respond by echoing this information back. +If the client does not do so, that's a violation of the challenge +protocols and the connection will be failed. + +=item (Red) No manager table. Nobody can manage!! + +lond has the concept of privileged hosts that +can perform remote management function such +as update the hosts.tab. The manager hosts +are described in the +~httpd/lonTabs/managers.tab file. +this message is logged if this file is missing. + + +=item (Green) Registering manager as with + +Reports the successful parse and registration +of a specific manager. + +=item Green existing host + +The manager host is already defined in the hosts.tab +the information in that table, rather than the info in the +manager table will be used to determine the manager's ip. + +=item (Red) Unable to craete + +lond has been asked to create new versions of an administrative +file (by a manager). When this is done, the new file is created +in a temp file and then renamed into place so that there are always +usable administrative files, even if the update fails. This failure +message means that the temp file could not be created. +The update is abandoned, and the old file is available for use. + +=item (Green) CopyFile from to failed + +In an update of administrative files, the copy of the existing file to a +backup file failed. The installation of the new file may still succeed, +but there will not be a back up file to rever to (this should probably +be yellow). + +=item (Green) Pushfile: backed up to + +See above, the backup of the old administrative file succeeded. + +=item (Red) Pushfile: Unable to install + +The new administrative file could not be installed. In this case, +the old administrative file is still in use. + +=item (Green) Installed new < filename>. + +The new administrative file was successfullly installed. + +=item (Red) Reinitializing lond pid= + +The lonc child process will be sent a USR2 +signal. + +=item (Red) Reinitializing self + +We've been asked to re-read our administrative files,and +are doing so. + +=item (Yellow) error:Invalid process identifier + +A reinit command was received, but the target part of the +command was not valid. It must be either +'lond' or 'lonc' but was + +=item (Green) isValideditCommand checking: Command = Key = newline = + +Checking to see if lond has been handed a valid edit +command. It is possible the edit command is not valid +in that case there are no log messages to indicate that. + +=item Result of password change for pwchange_success + +The password for was +successfully changed. + +=item Unable to open passwd to change password + +Could not rewrite the +internal password file for a user + +=item Result of password change for : + +A unix password change for was attempted +and the pipe returned + +=item LWP GET: for () + +The lightweight process fetch for a resource failed +with the local filename that should +have existed/been created was the +corresponding URI: This is emitted in several +places. + +=item Unable to move to + +From fetch_user_file_handler - the user file was replicated but could not +be mv'd to its final location. + +=item Looking for + +From user_has_session_handler - This should be a Debug call instead +it indicates lond is about to check whether the specified user has a +session active on the specified domain on the local host. + +=item Client () hanging up: + +lond has been asked to exit by its client. The and identify the +client systemand is the full exit command sent to the server. + +=item Red CRITICAL: ABNORMAL EXIT. child for server died through a crass with this error->[]. + +A lond child terminated. NOte that this termination can also occur when the +child receives the QUIT or DIE signals. is the process id of the child, + the host lond is working for, and the reason the child died +to the best of our ability to get it (I would guess that any numeric value +represents and errno value). This is immediately followed by + +=item Famous last words: Catching exception - + +Where log is some recent information about the state of the child. + +=item Red CRITICAL: TIME OUT + +Some timeout occured for server . THis is normally a timeout on an LWP +doing an HTTP::GET. + +=item child died + +The reaper caught a SIGCHILD for the lond child process +This should be modified to also display the IP of the dying child +$children{$pid} + +=item Unknown child 0 died +A child died but the wait for it returned a pid of zero which really should not +ever happen. + +=item Child - looks like we missed it's death + +When a sigchild is received, the reaper process checks all children to see if they are +alive. If children are dying quite quickly, the lack of signal queuing can mean +that a signal hearalds the death of more than one child. If so this message indicates +which other one died. is the ip of a dead child + +=item Free socket: + +The HUNTSMAN sub was called due to a SIGINT in a child process. The socket is being shutdown. +for whatever reason, is printed but in fact shutdown() is not documented +to return anything. This is followed by: + +=item Red CRITICAL: Shutting down + +Just prior to exit. + +=item Free socket: + +The HUPSMAN sub was called due to a SIGHUP. all children get killsed, and lond execs itself. +This is followed by: + +=item (Red) CRITICAL: Restarting + +lond is about to exec itself to restart. + +=item (Blue) Updating connections + +(In response to a USR2). All the children (except the one for localhost) +are about to be killed, the hosts tab reread, and Apache reloaded via apachereload. + +=item (Blue) UpdateHosts killing child for ip + +Due to USR2 as above. + +=item (Green) keeping child for ip (pid = ) + +In response to USR2 as above, the child indicated is not being restarted because +it's assumed that we'll always need a child for the localhost. + + +=item Going to check on the children + +Parent is about to check on the health of the child processes. +Note that this is in response to a USR1 sent to the parent lond. +there may be one or more of the next two messages: + +=item is dead + +A child that we have in our child hash as alive has evidently died. + +=item Child did not respond + +In the health check the child did not update/produce a pid_.txt +file when sent it's USR1 signal. That process is killed with a 9 signal, as it's +assumed to be hung in some un-fixable way. + +=item Finished checking children + +Master processs's USR1 processing is cojmplete. + +=item (Red) CRITICAL: ------- Starting ------ + +(There are more '-'s on either side). Lond has forked itself off to +form a new session and is about to start actual initialization. + +=item (Green) Attempting to start child () + +Started a new child process for . Client is IO::Socket object +connected to the child. This was as a result of a TCP/IP connection from a client. + +=item Unable to determine who caller was, getpeername returned nothing + +In child process initialization. either getpeername returned undef or +a zero sized object was returned. Processing continues, but in my opinion, +this should be cause for the child to exit. + +=item Unable to determine clientip + +In child process initialization. The peer address from getpeername was not defined. +The client address is stored as "Unavailable" and processing continues. + +=item (Yellow) INFO: Connection connection type = + +In child initialization. A good connectionw as received from . + +=over 2 + +=item + +is the name of the client from hosts.tab. + +=item + +Is the connection type which is either + +=over 2 + +=item manager + +The connection is from a manager node, not in hosts.tab + +=item client + +the connection is from a non-manager in the hosts.tab + +=item both + +The connection is from a manager in the hosts.tab. + +=back + +=back + +=item (Blue) Certificates not installed -- trying insecure auth + +One of the certificate file, key file or +certificate authority file could not be found for a client attempting +SSL connection intiation. COnnection will be attemptied in in-secure mode. +(this would be a system with an up to date lond that has not gotten a +certificate from us). + +=item (Green) Successful local authentication + +A local connection successfully negotiated the encryption key. +In this case the IDEA key is in a file (that is hopefully well protected). + +=item (Green) Successful ssl authentication with + +The client ( is the peer's name in hosts.tab), has successfully +negotiated an SSL connection with this child process. + +=item (Green) Successful insecure authentication with + + +The client has successfully negotiated an insecure connection withthe child process. + +=item (Yellow) Attempted insecure connection disallowed + +The client attempted and failed to successfully negotiate a successful insecure +connection. This can happen either because the variable londAllowInsecure is false +or undefined, or becuse the child did not successfully echo back the challenge +string. + + +=back + + +=cut