File:  [LON-CAPA] / loncom / localize / localize / checkduplicates.pl
Revision 1.2
Wed Apr 8 15:10:22 2009 UTC (15 years, 2 months ago) by bisitz
Branches: MAIN
CVS tags: version_2_9_99_0, version_2_12_X, version_2_11_X, version_2_11_4_uiuc, version_2_11_4_msu, version_2_11_4, version_2_11_3_uiuc, version_2_11_3_msu, version_2_11_3, version_2_11_2_uiuc, version_2_11_2_msu, version_2_11_2_educog, version_2_11_2, version_2_11_1, version_2_11_0_RC3, version_2_11_0_RC2, version_2_11_0_RC1, version_2_11_0, version_2_10_X, version_2_10_1, version_2_10_0_RC2, version_2_10_0_RC1, version_2_10_0, loncapaMITrelate_1, language_hyphenation_merge, language_hyphenation, bz6209-base, bz6209, bz5969, bz2851, PRINT_INCOMPLETE_base, PRINT_INCOMPLETE, HEAD, GCI_3, BZ5971-printing-apage, BZ5434-fox, BZ4492-merge, BZ4492-feature_horizontal_radioresponse
Heavily optimized search for duplicate keys:
- Read the translation file only once and count key occurrences directly
  (including the lexicon hash is no longer needed; thanks to Stefan Droeschler for the idea)
- More flexible key matching pattern
  (leading whitespace)
- Optimized key matching pattern (quotes)
- Now also prints the number of occurrences of each duplicate key
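
The single-pass counting described in this log message can be sketched on in-memory lines as follows. This is a minimal illustration rather than the script itself: the helper name `count_keys` and the sample lines are hypothetical, and the sample assumes that each key stands alone on an indented line with its value on the following line.

```perl
use strict;
use warnings;

# Hypothetical helper: count how often each quoted key occurs.
# Assumes each key stands on its own indented line.
sub count_keys {
    my @lines = @_;
    my %counter;
    for my $line (@lines) {
        next if $line =~ /^\s*#/;               # skip comment lines
        if ( $line =~ m/^\s+["'](.*)["']/ ) {   # indented "..." or '...' key
            $counter{$1}++;
        }
    }
    return %counter;
}

# Made-up sample data in which the key 'Save' appears twice.
my @sample = (
    "   'Save'\n",
    "=> 'Speichern',\n",
    "# 'Save' in a comment is ignored\n",
    "   'Cancel'\n",
    "=> 'Abbrechen',\n",
    "   'Save'\n",
    "=> 'Sichern',\n",
);

my %counter = count_keys(@sample);
for my $key ( sort keys %counter ) {
    print "Found $counter{$key} times key: $key\n" if $counter{$key} > 1;
}
# prints: Found 2 times key: Save
```

Because every line is inspected exactly once and the counts live in a single hash, no second pass over the file and no loaded lexicon hash is needed.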

#!/usr/bin/perl
# The LearningOnline Network with CAPA
# $Id: checkduplicates.pl,v 1.2 2009/04/08 15:10:22 bisitz Exp $

# 07.04.2009 Stefan Bisitz
# Optimization ideas by Stefan Droeschler

use strict;
use warnings;

my $man = "
checkduplicates - Checks whether hash keys in translation files occur more than once. If so, a warning is displayed.

The reported keys and their values need to be corrected. Otherwise, there is no guarantee which value is used. This is dangerous if the same key is used with different values, or if one value is changed but the screen still shows the old value, which actually comes from the other occurrence.


SYNOPSIS:\tcheckduplicates -h
\t\tcheckduplicates FILE

OPTIONS:
-h\t\tDisplay this help and exit.

";

my $filename;
die "Use option -h for help.\n" unless exists $ARGV[0];
# Analyze options
if ( $ARGV[0] =~ m/^\s*-h/ ) {
    print $man;
    exit();
} else {
    $filename = $ARGV[0];
    die "$filename is not a file.\n" unless -f $filename;
}


# ----------------------------------------------------------------
# Start Analysis
print "checkduplicates is searching for duplicates in $filename...\n";

# Manually read all stored keys from the translation file (including possible duplicates)
# and count key occurrences in a separate hash.
my %counter;
open( my $fh, "<", $filename ) or die "$filename cannot be opened: $!\n";
while ( my $line = <$fh> ) {
    next if $line =~ /^\s*#/; # ignore comments
    #$exprNP=~s/^["'](.*)["']$/$1/; # Remove " and ' at beginning and end
    if ( $line =~ m/^\s+["'](.*)["']/ ) { # Find "..." or '...' key
        $counter{$1}++;
    }
}
close($fh);

# Print all keys that occur more than once (sorted for reproducible output)
my $dupl = 0; # number of distinct keys that occur more than once
foreach my $count_key ( sort keys %counter ) {
    my $count_value = $counter{$count_key};
    if ( $count_value > 1 ) {
        print 'Found '.$count_value.' times key: '.$count_key."\n";
        $dupl++;
    }
}

if ( $dupl == 0 ) {
    print "Be happy - No duplicates found.\n";
} else {
    print "--- Found $dupl duplicate(s) in $filename which need to be corrected!\n";
}

# ----------------------------------------------------------------

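
One property of the key pattern `["'](.*)["']` used in the script is worth illustrating: `.*` is greedy, so if a key and its value share a single line, the capture runs up to the last quote on that line. The sketch below compares the script's pattern with a hypothetical alternative that stops at the first quote matching the opening one. Both helper names are made up for this illustration, and whether single-line entries actually occur in the translation files is an assumption.

```perl
use strict;
use warnings;

# Pattern as used in the script: greedy capture up to the last quote on the line.
sub key_greedy {
    my ($line) = @_;
    return $line =~ m/^\s+["'](.*)["']/ ? $1 : undef;
}

# Hypothetical alternative: non-greedy capture that ends at the first quote
# of the same kind as the opening quote (via the backreference \1).
sub key_first_quoted {
    my ($line) = @_;
    return $line =~ m/^\s+(["'])(.*?)\1/ ? $2 : undef;
}

# Made-up entry with key and value on the same line.
my $one_line = "   'Save' => 'Speichern',\n";
print "greedy:       ", key_greedy($one_line), "\n";        # Save' => 'Speichern
print "first quoted: ", key_first_quoted($one_line), "\n";  # Save
```

With the greedy pattern, two single-line entries count as duplicates only if both key and value are identical. The alternative has its own limitation: a key containing an escaped quote of the same kind would be cut short, so it is a sketch of the trade-off, not a drop-in replacement.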
