loncom/build/piml_parse.pl - view

File: [LON-CAPA] / loncom / build / piml_parse.pl
Revision 1.10: download - view: text, annotated - select for diffs
Tue Dec 3 22:36:32 2002 UTC (21 years, 9 months ago) by harris41
Branches: MAIN
CVS tags: version_2_0_X, version_2_0_2, version_2_0_1, version_2_0_0, version_1_99_3, version_1_99_2, version_1_99_1_tmcc, version_1_99_1, version_1_99_0_tmcc, version_1_99_0, version_1_3_X, version_1_3_3, version_1_3_2, version_1_3_1, version_1_3_0, version_1_2_X, version_1_2_99_1, version_1_2_99_0, version_1_2_1, version_1_2_0, version_1_1_X, version_1_1_99_5, version_1_1_99_4, version_1_1_99_3, version_1_1_99_2, version_1_1_99_1, version_1_1_99_0, version_1_1_3, version_1_1_2, version_1_1_1, version_1_1_0, version_1_0_99_3, version_1_0_99_2, version_1_0_99_1, version_1_0_99, version_1_0_3, version_1_0_2, version_1_0_1, version_1_0_0, version_0_99_5, version_0_99_4, version_0_99_3, version_0_99_2, version_0_99_1, version_0_99_0, version_0_6_2, version_0_6, conference_2003, HEAD

BUG 1018 FIXED; fixing command-line argument handling; "dist" tag processing
of *.piml files is now effective; fixing documentation; minor code
beautification

1: #!/usr/bin/perl 2: 3: # -------------------------------------------------------- Documentation notice 4: # Run "perldoc ./piml_parse.pl" in order to best view the software 5: # documentation internalized in this program. 6: 7: # --------------------------------------------------------- License Information 8: # The LearningOnline Network with CAPA 9: # piml_parse.pl - Linux Packaging Markup Language parser 10: # 11: # $Id: piml_parse.pl,v 1.10 2002/12/03 22:36:32 harris41 Exp $ 12: # 13: # Written by Scott Harrison, codeharrison@yahoo.com 14: # 15: # Copyright Michigan State University Board of Trustees 16: # 17: # This file is part of the LearningOnline Network with CAPA (LON-CAPA). 18: # 19: # LON-CAPA is free software; you can redistribute it and/or modify 20: # it under the terms of the GNU General Public License as published by 21: # the Free Software Foundation; either version 2 of the License, or 22: # (at your option) any later version. 23: # 24: # LON-CAPA is distributed in the hope that it will be useful, 25: # but WITHOUT ANY WARRANTY; without even the implied warranty of 26: # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 27: # GNU General Public License for more details. 28: # 29: # You should have received a copy of the GNU General Public License 30: # along with LON-CAPA; if not, write to the Free Software 31: # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA 32: # 33: # /home/httpd/html/adm/gpl.txt 34: # 35: # http://www.lon-capa.org/ 36: # 37: # YEAR=2002 38: # 1/28,1/29,1/30,1/31,2/5,4/8 - Scott Harrison 39: # 40: ### 41: 42: ############################################################################### 43: ## ## 44: ## ORGANIZATION OF THIS PERL SCRIPT ## 45: ## 1. Notes ## 46: ## 2. Get command line arguments ## 47: ## 3. First pass through (grab distribution-specific information) ## 48: ## 4. Second pass through (parse out what is not necessary) ## 49: ## 5. Third pass through (translate markup according to specified mode) ## 50: ## 6. Functions (most all just format contents of different markup tags) ## 51: ## 7. POD (plain old documentation, CPAN style) ## 52: ## ## 53: ############################################################################### 54: 55: # ----------------------------------------------------------------------- Notes 56: # 57: # I am using a multiple pass-through approach to parsing 58: # the piml file. This saves memory and makes sure the server 59: # will never be overloaded. 60: # 61: # This is meant to parse files meeting the piml document type. 62: # See piml.dtd. PIML=Post Installation Markup Language. 63: 64: # To reduce system dependencies, I'm using a lightweight 65: # parser. At some point, I need to get serious with a 66: # better xml parsing engine and stylesheet usage. 67: use HTML::TokeParser; 68: 69: my $usage=(<<END); 70: **** ERROR ERROR ERROR ERROR **** 71: Usage is for piml file to come in through standard input. 72: 1st argument is the category permissions to use (runtime or development) 73: 2nd argument is the distribution (default,redhat6,debian2.2,redhat7,etc). 74: 3rd argument is to manually specify a targetroot 75: 76: Only the 1st argument is mandatory for the program to run. 77: 78: Example: 79: 80: cat ../../doc/sanitycheck.piml |\\ 81: perl piml_parse.pl development default /home/sherbert/loncapa 82: END 83: 84: # ------------------------------------------------- Grab command line arguments 85: 86: # If number of arguments is incorrect, then give up and print usage message. 87: unless (@ARGV == 3) 88: { 89: @ARGV=();shift(@ARGV); 90: while(<>){} # throw away the input to avoid broken pipes 91: print($usage); # print usage message 92: exit -1; # exit with error status 93: } 94: 95: my $categorytype; 96: if (@ARGV) 97: { 98: $categorytype = shift(@ARGV); 99: } 100: 101: my $dist; 102: if (@ARGV) 103: { 104: $dist = shift(@ARGV); 105: } 106: 107: my $targetroot; 108: my $targetrootarg; 109: if (@ARGV) 110: { 111: $targetroot = shift(@ARGV); 112: } 113: 114: $targetroot=~s/\/$//; 115: $targetrootarg=$targetroot; 116: 117: my $logcmd='| tee -a WARNINGS'; 118: 119: my $invocation; 120: # --------------------------------------------------- Record program invocation 121: if ($mode eq 'install' or $mode eq 'configinstall' or $mode eq 'build') 122: { 123: $invocation=(<<END); 124: # Invocation: STDINPUT | piml_parse.pl 125: # 1st argument (category type) is: $categorytype 126: # 2nd argument (distribution) is: $dist 127: # 3rd argument (targetroot) is: described below 128: END 129: } 130: 131: # ---------------------------------------------------- Start first pass through 132: my @parsecontents = <>; 133: my $parsestring = join('',@parsecontents); 134: my $outstring=''; 135: 136: # Need to make a pass through and figure out what defaults are 137: # overrided. Top-down overriding strategy (leaves don't know 138: # about distant leaves). 139: 140: my @hierarchy; 141: $hierarchy[0]=0; 142: my $hloc=0; 143: my $token=''; 144: $parser = HTML::TokeParser->new(\$parsestring) or 145: die('can\'t create TokeParser object'); 146: $parser->xml_mode('1'); 147: my %hash; 148: my $key; 149: while ($token = $parser->get_token()) 150: { 151: if ($token->[0] eq 'S') 152: { 153: $hloc++; 154: $hierarchy[$hloc]++; 155: $key=$token->[1].join(',',@hierarchy[0..($hloc-1)]); 156: my $thisdist=' '.$token->[2]{'dist'}.' '; 157: if ($thisdist eq ' default ') 158: { 159: $hash{$key}=1; # there is a default setting for this key 160: } 161: elsif ($dist && $hash{$key}==1 && $thisdist=~/\s$dist\s/) 162: { 163: $hash{$key}=2; # disregard default setting for this key if 164: # there is a directly requested distribution match 165: } 166: } 167: if ($token->[0] eq 'E') 168: { 169: $hloc--; 170: } 171: } 172: 173: # --------------------------------------------------- Start second pass through 174: undef $hloc; 175: undef @hierarchy; 176: undef $parser; 177: $hierarchy[0]=0; 178: $parser = HTML::TokeParser->new(\$parsestring) or 179: die('can\'t create TokeParser object'); 180: $parser->xml_mode('1'); 181: my $cleanstring; 182: while ($token = $parser->get_token()) { 183: if ($token->[0] eq 'S') { 184: $hloc++; 185: $hierarchy[$hloc]++; 186: $key=$token->[1].join(',',@hierarchy[0..($hloc-1)]); 187: my $thisdist=' '.$token->[2]{'dist'}.' '; 188: # This conditional clause is set up to ignore two sets 189: # of invalid conditions before accepting entry into 190: # the cleanstring. 191: if ($hash{$key}==2 and 192: !($thisdist eq ' ' or $thisdist =~/\s$dist\s/)) { 193: if ($token->[4]!~/\/>$/) { 194: $parser->get_tag('/'.$token->[1]); 195: $hloc--; 196: } 197: } 198: elsif ($thisdist ne ' ' and $thisdist!~/\s$dist\s/ and 199: !($thisdist eq ' default ' and $hash{$key}!=2)) { 200: if ($token->[4]!~/\/>$/) { 201: $parser->get_tag('/'.$token->[1]); 202: $hloc--; 203: } 204: } 205: else { 206: $cleanstring.=$token->[4]; 207: } 208: if ($token->[4]=~/\/>$/) { 209: # $hloc--; 210: } 211: } 212: if ($token->[0] eq 'E') { 213: $cleanstring.=$token->[2]; 214: $hloc--; 215: } 216: if ($token->[0] eq 'T') { 217: $cleanstring.=$token->[1]; 218: } 219: } 220: $cleanstring=&trim($cleanstring); 221: $cleanstring=~s/\>\s*\n\s*\</\>\</g; 222: 223: # ---------------------------------------------------- Start final pass through 224: 225: # storage variables 226: my $piml; 227: my $categories; 228: my @categorynamelist; 229: my $category; 230: my $category_att_name; 231: my $category_att_type; 232: my $chown; 233: my $chmod; 234: my $abbreviation; # space-free abbreviation; esp. for image names 235: my $categoryname; 236: my $description; 237: my $files; 238: my $file; 239: my $target; 240: my $note; 241: my $commands; 242: my $command; 243: my $dependencies; 244: my @links; 245: my %categoryhash; 246: my $dpathlength; 247: my %fab; # file category abbreviation 248: my $directory_count; 249: my $file_count; 250: my $link_count; 251: my $fileglob_count; 252: my $fileglobnames_count; 253: my %categorycount; 254: 255: my $mode; 256: 257: my @buildall; 258: my @buildinfo; 259: 260: my @configall; 261: 262: # Make new parser with distribution specific input 263: undef($parser); 264: $parser = HTML::TokeParser->new(\$cleanstring) or 265: die('can\'t create TokeParser object'); 266: $parser->xml_mode('1'); 267: 268: # Define handling methods for mode-dependent text rendering 269: 270: $parser->{textify}={ 271: specialnotices => \&format_specialnotices, 272: specialnotice => \&format_specialnotice, 273: targetroot => \&format_targetroot, 274: categories => \&format_categories, 275: category => \&format_category, 276: abbreviation => \&format_abbreviation, 277: chown => \&format_chown, 278: chmod => \&format_chmod, 279: categoryname => \&format_categoryname, 280: files => \&format_files, 281: file => \&format_file, 282: target => \&format_target, 283: note => \&format_note, 284: build => \&format_build, 285: dependencies => \&format_dependencies, 286: filenames => \&format_filenames, 287: perlscript => \&format_perlscript, 288: TARGET => \&format_TARGET, 289: }; 290: 291: my $text; 292: my $token; 293: undef($hloc); 294: undef(@hierarchy); 295: my $hloc; 296: my @hierarchy2; 297: while ($token = $parser->get_tag('piml')) 298: { 299: &format_piml(@{$token}); 300: $text = &trim($parser->get_text('/piml')); 301: $token = $parser->get_tag('/piml'); 302: print($piml); 303: print("\n"); 304: print($text); 305: print("\n"); 306: print(&end()); 307: } 308: exit(0); 309: 310: # ---------- Functions (most all just format contents of different markup tags) 311: 312: # ------------------------ Final output at end of markup parsing and formatting 313: sub end { 314: 315: } 316: 317: # ----------------------- Take in string to parse and the separation expression 318: sub extract_array { 319: my ($stringtoparse,$sepexp) = @_; 320: my @a=split(/$sepexp/,$stringtoparse); 321: return \@a; 322: } 323: 324: # --------------------------------------------------------- Format piml section 325: sub format_piml { 326: my (@tokeninfo)=@_; 327: my $date=`date`; chop $date; 328: $piml=<<END; 329: #!/usr/bin/perl 330: 331: # Generated from a PIML (Post Installation Markup Language) document 332: 333: END 334: } 335: 336: # --------------------------------------------------- Format targetroot section 337: sub format_targetroot { 338: my $text=&trim($parser->get_text('/targetroot')); 339: $text=$targetroot if $targetroot; 340: $parser->get_tag('/targetroot'); 341: return('# TARGET INSTALL LOCATION is "'.$targetroot."\"\n"); 342: } 343: 344: # -------------------------------------------------- Format perl script section 345: sub format_perlscript { 346: my (@tokeninfo)=@_; 347: $mode=$tokeninfo[2]->{'mode'}; 348: my $text=$parser->get_text('/perlscript'); 349: $parser->get_tag('/perlscript'); 350: if ($mode eq 'bg') { 351: open(OUT,">/tmp/piml$$.pl"); 352: print(OUT $text); 353: close(OUT); 354: return(<<END); 355: # launch background process for $target 356: system("perl /tmp/piml$$.pl &"); 357: END 358: } 359: else { 360: return($text); 361: } 362: } 363: 364: # --------------------------------------------------------------- Format TARGET 365: sub format_TARGET { 366: my (@tokeninfo)=@_; 367: $parser->get_tag('/TARGET'); 368: return($target); 369: } 370: 371: # --------------------------------------------------- Format categories section 372: sub format_categories { 373: my $text=&trim($parser->get_text('/categories')); 374: $parser->get_tag('/categories'); 375: return('# CATEGORIES'."\n".$text); 376: } 377: 378: # --------------------------------------------------- Format categories section 379: sub format_category { 380: my (@tokeninfo)=@_; 381: $category_att_name=$tokeninfo[2]->{'name'}; 382: $category_att_type=$tokeninfo[2]->{'type'}; 383: $abbreviation=''; $chmod='';$chown=''; 384: $parser->get_text('/category'); 385: $parser->get_tag('/category'); 386: $fab{$category_att_name}=$abbreviation; 387: if ($category_att_type eq $categorytype) { 388: my ($user,$group)=split(/\:/,$chown); 389: $categoryhash{$category_att_name}='-o '.$user.' -g '.$group. 390: ' -m '.$chmod; 391: } 392: return(''); 393: } 394: 395: # --------------------------------------------------- Format categories section 396: sub format_abbreviation { 397: my @tokeninfo=@_; 398: $abbreviation=''; 399: my $text=&trim($parser->get_text('/abbreviation')); 400: if ($text) { 401: $parser->get_tag('/abbreviation'); 402: $abbreviation=$text; 403: } 404: return(''); 405: } 406: 407: # -------------------------------------------------------- Format chown section 408: sub format_chown { 409: my @tokeninfo=@_; 410: $chown=''; 411: my $text=&trim($parser->get_text('/chown')); 412: if ($text) { 413: $parser->get_tag('/chown'); 414: $chown=$text; 415: } 416: return(''); 417: } 418: 419: # -------------------------------------------------------- Format chmod section 420: sub format_chmod { 421: my @tokeninfo=@_; 422: $chmod=''; 423: my $text=&trim($parser->get_text('/chmod')); 424: if ($text) { 425: $parser->get_tag('/chmod'); 426: $chmod=$text; 427: } 428: return(''); 429: } 430: 431: # ------------------------------------------------- Format categoryname section 432: sub format_categoryname { 433: my @tokeninfo=@_; 434: $categoryname=''; 435: my $text=&trim($parser->get_text('/categoryname')); 436: if ($text) { 437: $parser->get_tag('/categoryname'); 438: $categoryname=$text; 439: } 440: return(''); 441: } 442: 443: # -------------------------------------------------------- Format files section 444: sub format_files { 445: my $text=$parser->get_text('/files'); 446: $parser->get_tag('/files'); 447: return("\n".'# There are '.$file_count.' files this script works on'. 448: "\n\n".$text); 449: } 450: 451: # --------------------------------------------------------- Format file section 452: sub format_file { 453: my @tokeninfo=@_; 454: $file=''; $source=''; $target=''; $categoryname=''; $description=''; 455: $note=''; $build=''; $status=''; $dependencies=''; 456: my $text=&trim($parser->get_text('/file')); 457: $file_count++; 458: $categorycount{$categoryname}++; 459: $parser->get_tag('/file'); 460: return("# File: $target\n". 461: "$text\n"); 462: } 463: 464: # ------------------------------------------------------- Format target section 465: sub format_target { 466: my @tokeninfo=@_; 467: $target=''; 468: my $text=&trim($parser->get_text('/target')); 469: if ($text) { 470: $parser->get_tag('/target'); 471: $target=$targetrootarg.$text; 472: } 473: return(''); 474: } 475: 476: # --------------------------------------------------------- Format note section 477: sub format_note { 478: my @tokeninfo=@_; 479: $note=''; 480: my $aref; 481: my $text; 482: while ($aref=$parser->get_token()) { 483: if ($aref->[0] eq 'E' && $aref->[1] eq 'note') { 484: last; 485: } 486: elsif ($aref->[0] eq 'S') { 487: $text.=$aref->[4]; 488: } 489: elsif ($aref->[0] eq 'E') { 490: $text.=$aref->[2]; 491: } 492: else { 493: $text.=$aref->[1]; 494: } 495: } 496: if ($text) { 497: $note=$text; 498: } 499: return(''); 500: } 501: 502: # ------------------------------------------------- Format dependencies section 503: sub format_dependencies { 504: my @tokeninfo=@_; 505: $dependencies=''; 506: my $text=&trim($parser->get_text('/dependencies')); 507: if ($text) { 508: $parser->get_tag('/dependencies'); 509: $dependencies=join(';', 510: (map {s/^\s*//;s/\s$//;$_} split(/\;/,$text))); 511: } 512: return(''); 513: } 514: 515: # ------------------------------------------------ Format specialnotice section 516: sub format_specialnotices { 517: $parser->get_tag('/specialnotices'); 518: return(''); 519: } 520: 521: # ------------------------------------------------ Format specialnotice section 522: sub format_specialnotice { 523: $parser->get_tag('/specialnotice'); 524: return(''); 525: } 526: 527: # ------------------------------------- Render less-than and greater-than signs 528: sub htmlsafe { 529: my $text=@_[0]; 530: $text =~ s/</</g; 531: $text =~ s/>/>/g; 532: return($text); 533: } 534: 535: # --------------------------------------- remove starting and ending whitespace 536: sub trim { 537: my ($s)=@_; $s=~s/^\s*//; $s=~s/\s*$//; return($s); 538: } 539: 540: # ----------------------------------- POD (plain old documentation, CPAN style) 541: 542: =pod 543: 544: =head1 NAME 545: 546: piml_parse.pl - This is meant to parse files meeting the piml document type. 547: See piml.dtd. PIML=Post Installation Markup Language. 548: 549: =head1 SYNOPSIS 550: 551: Usage is for piml file to come in through standard input. 552: 553: =over 4 554: 555: =item * 556: 557: 1st argument is the category permissions to use (runtime or development) 558: 559: =item * 560: 561: 2nd argument is the distribution 562: (default,redhat6,debian2.2,redhat7,etc). 563: 564: =item * 565: 566: 3rd argument is to manually specify a targetroot. 567: 568: =back 569: 570: Only the 1st argument is mandatory for the program to run. 571: 572: Example: 573: 574: cat ../../doc/loncapafiles.piml |\\ 575: perl piml_parse.pl development default /home/sherbert/loncapa 576: 577: =head1 DESCRIPTION 578: 579: I am using a multiple pass-through approach to parsing 580: the piml file. This saves memory and makes sure the server 581: will never be overloaded. 582: 583: =head1 README 584: 585: I am using a multiple pass-through approach to parsing 586: the piml file. This saves memory and makes sure the server 587: will never be overloaded. 588: 589: =head1 PREREQUISITES 590: 591: HTML::TokeParser 592: 593: =head1 COREQUISITES 594: 595: =head1 OSNAMES 596: 597: linux 598: 599: =head1 SCRIPT CATEGORIES 600: 601: Packaging/Administrative 602: 603: =head1 AUTHOR 604: 605: Scott Harrison 606: sharrison@users.sourceforge.net 607: 608: Please let me know how/if you are finding this script useful and 609: any/all suggestions. -Scott 610: 611: =cut