File:  [LON-CAPA] / loncom / interface / entities.pm
Revision 1.19
Thu Jul 23 10:53:03 2009 UTC (14 years, 9 months ago) by foxr
Branches: MAIN
CVS tags: version_2_9_X, version_2_9_99_0, version_2_9_1, version_2_9_0, version_2_8_99_1, version_2_8_99_0, version_2_11_0_RC3, version_2_11_0_RC2, version_2_11_0_RC1, version_2_10_X, version_2_10_1, version_2_10_0_RC2, version_2_10_0_RC1, version_2_10_0, loncapaMITrelate_1, language_hyphenation_merge, language_hyphenation, bz6209-base, bz6209, bz2851, PRINT_INCOMPLETE_base, PRINT_INCOMPLETE, HEAD, GCI_3, GCI_2, BZ4492-merge, BZ4492-feature_horizontal_radioresponse
This merge with HEAD should resolve BZ 5969, 5970, 5732, 5927: various
problems with printing pages, as well as some missing or not-quite-right entity
handling in the arrows.
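
For orientation: the table in the listing below is keyed both by numeric code point and by entity name, and each key maps to a LaTeX fragment. The following is a minimal sketch of how such a table might be consulted. The `%sample` hash (a few rows copied from the listing), the `entity_to_tex` helper, and the choice to leave unknown references untouched are illustrative assumptions, not the module's actual interface.

```perl
use strict;
use warnings;

# Hypothetical excerpt of the entity table: keys are numeric code points
# or entity names, values are LaTeX fragments taken from the listing.
my %sample = (
    38      => '\&',
    'amp'   => '\&',
    945     => '\ensuremath{\alpha}',
    'alpha' => '\ensuremath{\alpha}',
    8594    => '\ensuremath{\rightarrow}',
    'rarr'  => '\ensuremath{\rightarrow}',
);

# Hypothetical helper: replace &name; and &#NNN; references with the
# table's LaTeX; references with no table entry are left untouched.
sub entity_to_tex {
    my ($text) = @_;
    $text =~ s{&(#?)(\w+);}{
        exists $sample{$2} ? $sample{$2} : "&$1$2;"
    }gex;
    return $text;
}

print entity_to_tex('&alpha; &rarr; &#945; &amp; &bogus;'), "\n";
```

Because Perl stores hash keys as strings, the numeric key `945` and the captured text `945` from `&#945;` refer to the same entry, which is why one hash can serve both lookup styles.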

    1: # The LearningOnline Network
    2: # entity -> tex.
    3: #
    4: # $Id: entities.pm,v 1.19 2009/07/23 10:53:03 foxr Exp $
    5: #
    6: # Copyright Michigan State University Board of Trustees
    7: #
    8: # This file is part of the LearningOnline Network with CAPA (LON-CAPA).
    9: #
   10: # LON-CAPA is free software; you can redistribute it and/or modify
   11: # it under the terms of the GNU General Public License as published by
   12: # the Free Software Foundation; either version 2 of the License, or
   13: # (at your option) any later version.
   14: #
   15: # LON-CAPA is distributed in the hope that it will be useful,
   16: # but WITHOUT ANY WARRANTY; without even the implied warranty of
   17: # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   18: # GNU General Public License for more details.
   19: #
   20: # You should have received a copy of the GNU General Public License
   21: # along with LON-CAPA; if not, write to the Free Software
   22: # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
   23: #
   24: # /home/httpd/html/adm/gpl.txt
   25: # http://www.lon-capa.org/
   26: #
   27: #
   28: 
   29: package Apache::entities;
   30: use strict;
   31: 
   32: =pod
   33: 
   34: =head1 TABLES ASCII code page
   35: 
   36: =over
   37: 
   38: =item (7-13)
   39: 
   40:     Translation to empty strings
   41: 
   42: =item (32-126)
   43: 
   44:     Translations to simple characters
   45: 
   46: =item (130-140)
   47: 
   48:     Controls and Latin-1 supplement.  Note that some entities that have
   49:     visible effect are not printing unicode characters.  Specifically
   50:     &#130;-
   51: 
   52: =item (145-156)
   53: 
   54:     There's a gap here in my entity table
   55: 
   56: =item (159-255)
   57: 
   58:      Another short gap
   59: 
   60: =item (295)
   61: 
   62:      hbar entity number comes from the unicode character:
   63:      see e.g. http://www.unicode.org/charts/PDF/U0100.pdf
   64:      ISO also documents a 'planck' entity.
   65: 
   66: =item (338-376)
   67: 
   68:     Latin extended-A HTML 4.01 entities
   69: 
   70: =item (402)
   71: 
   72:     Latin extended B HTML 4.01 entities
   73: 
   74: =item (710 & 732)
   75: 
   76:     Spacing modifier letters
   77: 
   78: =item (913-937)
   79: 
   80:     Greek uppercase (skips 930)
   81: 
   82: =item (945-982)
   83: 
   84:     Greek lowercase
   85: 
   86: =item (8194-8364)
   87: 
   88:     The general punctuation set
   89: 
   90: =item (8472-8501)
   91: 
   92:     Letter like symbols
   93: 
   94: =item (8592-8669)
   95: 
   96:     Arrows and then some (harpoons from Hon Kie).
   97: 
   98: =item (8704-8734)
   99: 
  100:     Mathematical operators.
  101: 
  102: =item (8735-9830)
  103: 
  104:     The items below require the isoent latex package which I can't find at least for FC5.
  105:     Temporarily commented out.
  106: 
  107: =back
  108: 
  109: =cut
  110: 
  111: my %entities = (
  112: 
  113:     # Translation to empty strings:
  114: 
  115:     7        => "",
  116:     9        => "",
  117:     10       => "",
  118:     13       => "",
  119:     
  120:     # Translations to simple characters:
  121: 
  122:     32       => " ",
  123:     33       => "!",
  124:     34       => '"',
  125:     'quot'   => '"',
  126:     35       => '\\#',
  127:     36       => '\\$',
  128:     37       => '\%',
  129:     38       => '\&',
  130:     'amp'    => '\&',
  131:     39       => '\'',		# Apostrophe
  132:     40       => '(',
  133:     41       => ')',
  134:     42       => '*',
  135:     43       => '+',
  136:     44       => ',',		#  comma
  137:     45       => '-',
  138:     46       => '.',
  139:     47       => '/',
  140:     48       => '0',
  141:     49       => '1',
  142:     50       => '2',
  143:     51       => '3',
  144:     52       => '4',
  145:     53       => '5',
  146:     54       => '6',
  147:     55       => '7',
  148:     56       => '8',
  149:     57       => '9',
  150:     58       => ':',
  151:     59       => ';',
  152:     60       => '\ensuremath{<}',
  153:     'lt'     => '\ensuremath{<}',
  154:     61       => '\ensuremath{=}',
  155:     62       => '\ensuremath{>}',
  156:     'gt'     => '\ensuremath{>}',
  157:     63       => '?',
  158:     64       => '@',
  159:     65       => 'A',
  160:     66       => 'B',
  161:     67       => 'C',
  162:     68       => 'D',
  163:     69       => 'E',
  164:     70       => 'F',
  165:     71       => 'G',
  166:     72       => 'H',
  167:     73       => 'I',
  168:     74       => 'J',
  169:     75       => 'K',
  170:     76       => 'L',
  171:     77       => 'M',
  172:     78       => 'N',
  173:     79       => 'O',
  174:     80       => 'P',
  175:     81       => 'Q',
  176:     82       => 'R',
  177:     83       => 'S',
  178:     84       => 'T',
  179:     85       => 'U',
  180:     86       => 'V',
  181:     87       => 'W',
  182:     88       => 'X',
  183:     89       => 'Y',
  184:     90       => 'Z',
  185:     91       => '[',
  186:     92       => '\ensuremath{\setminus}', # \setminus is \ with special spacing.
  187:     93       => ']',
  188:     94       => '\ensuremath{\wedge}',
  189:     95       => '\underline{\makebox[2mm]{\strut}}', # Underline 2mm of space for _
  190:     96       => '`',
  191:     97       => 'a',
  192:     98       => 'b',
  193:     99       => 'c',
  194:     100      => 'd',
  195:     101      => 'e',
  196:     102      => 'f',
  197:     103      => 'g',
  198:     104      => 'h', 
  199:     105      => 'i',
  200:     106      => 'j',
  201:     107      => 'k',
  202:     108      => 'l',
  203:     109      => 'm',
  204:     110      => 'n',
  205:     111      => 'o',
  206:     112      => 'p',
  207:     113      => 'q',
  208:     114      => 'r',
  209:     115      => 's',
  210:     116      => 't',
  211:     117      => 'u',
  212:     118      => 'v',
  213:     119      => 'w',
  214:     120      => 'x',
  215:     121      => 'y',
  216:     122      => 'z',
  217:     123      => '\{',
  218:     124      => '|',
  219:     125      => '\}',
  220:     126      => '\~{}',
  221: 
  222:     # Controls and Latin-1 supplement.
  223: 
  224:     130     => ',',
  225:     131     => '\ensuremath{f}',
  226:     132     => ',,',		# Low double left quotes.
  227:     133     => '\ensuremath{\ldots}',
  228:     134     => '\ensuremath{\dagger}',
  229:     135     => '\ensuremath{\ddagger}',
  230:     136     => '\ensuremath{\wedge}',
  231:     137     => '\textperthousand ',
  232:     138     => '\v{S}',
  233:     139     => '\ensuremath{<}',
  234:     140     => '{\OE}',
  235:     
  236:     # There's a gap here in my entity table
  237: 
  238:     145     => '`',
  239:     146     => '\'',
  240:     147     => '``',
  241:     148     => '\'\'',
  242:     149     => '\ensuremath{\bullet}',
  243:     150     => '--',
  244:     151     => '---',
  245:     152     => '\ensuremath{\sim}',
  246:     153     => '\texttrademark',
  247:     154     => '\v{s}',
  248:     155     => '\ensuremath{>}',
  249:     156     => '\oe ',
  250: 
  251:      # Another short gap:
  252: 
  253:     159     => '\"Y',
  254:     160     => '~',
  255:     'nbsp'  => '~',
  256:     161     => '\textexclamdown ',
  257:     'iexcl' => '\textexclamdown ',
  258:     162     => '\textcent ',
  259:     'cent'  => '\textcent ',
  260:     163     => '\pounds ',
  261:     'pound' => '\pounds ',
  262:     164     => '\textcurrency ',
  263:     'curren' => '\textcurrency ',
  264:     165     => '\textyen ',
  265:     'yen'   => '\textyen ',
  266:     166     => '\textbrokenbar ',
  267:     'brvbar' => '\textbrokenbar ',
  268:     167     => '\textsection ',
  269:     'sect'  => '\textsection ',
  270:     168     => '\"{}',
  271:     'uml'   => '\"{}',
  272:     169     => '\copyright ',
  273:     'copy'  => '\copyright ',
  274:     170     => '\textordfeminine ',
  275:     'ordf'  => '\textordfeminine ',
  276:     171     => '\ensuremath{\ll}', # approximation of left angle quote.
  277:     'laquo' => '\ensuremath{\ll}', #   ""
  278:     172     => '\ensuremath{\neg}',
  279:     'not'   => '\ensuremath{\neg}',
  280:     173     => ' - ',
  281:     'shy'   => ' - ',
  282:     174     => '\textregistered ',
  283:     'reg'   => '\textregistered ',
  284:     175     => '\ensuremath{^{-}}',
  285:     'macr'  => '\ensuremath{^{-}}',
  286:     176     => '\ensuremath{^{\circ}}',
  287:     'deg'   => '\ensuremath{^{\circ}}',
  288:     177     => '\ensuremath{\pm}',
  289:     'plusmn' => '\ensuremath{\pm}',
  290:     178     => '\ensuremath{^2}',
  291:     'sup2'  => '\ensuremath{^2}',
  292:     179     => '\ensuremath{^3}',
  293:     'sup3'  => '\ensuremath{^3}',
  294:     180     => "\\'{}",
  295:     'acute' => "\\'{}",
  296:     181     => '\ensuremath{\mu}',
  297:     'micro' => '\ensuremath{\mu}',
  298:     182     => '\P ',
  299:     para    => '\P ',
  300:     183     => '\ensuremath{\cdot}',
  301:     'middot' => '\ensuremath{\cdot}',
  302:     184     => '\c{\strut}',
  303:     'cedil' => '\c{\strut}',
  304:     185     => '\ensuremath{^1}',
  305:     sup1    => '\ensuremath{^1}',
  306:     186     => '\textordmasculine ',
  307:     'ordm'  => '\textordmasculine ',
  308:     187     => '\ensuremath{\gg}',
  309:     'raquo' => '\ensuremath{\gg}',
  310:     188     => '\textonequarter ',
  311:     'frac14' => '\textonequarter ',
  312:     189     => '\textonehalf ',
  313:     'frac12' => '\textonehalf ',
  314:     190     => '\textthreequarters ',
  315:     'frac34' => '\textthreequarters ',
  316:     191     =>  '\textquestiondown ',
  317:     'iquest' => '\textquestiondown ',
  318:     192     => '\\`{A}',
  319:     'Agrave' => '\\`{A}',
  320:     193     => "\\'{A}",
  321:     'Aacute' => "\\'{A}",
  322:     194     => '\^{A}',
  323:     'Acirc' => '\^{A}',
  324:     195     => '\~{A}',
  325:     'Atilde'=> '\~{A}',
  326:     196     => '\\"{A}',
  327:     'Auml'  => '\\"{A}',
  328:     197     => '{\AA}',
  329:     'Aring' => '{\AA}',
  330:     198     => '{\AE}',
  331:     'AElig' => '{\AE}',
  332:     199     => '\c{C}',
  333:     'Ccedil'=> '\c{C}',
  334:      200   =>  '\\`{E}',
  335:     'Egrave'=> '\\`{E}',
  336:     201     => "\\'{E}",
  337:     'Eacute'=> "\\'{E}",
  338:     202     => '\\^{E}',
  339:     'Ecirc' => '\\^{E}',
  340:     203     => '\\"{E}',
  341:     'Euml'  => '\\"{E}',
  342:     204     => '\\`{I}',
  343:     'Igrave'=> '\\`{I}',
  344:     205     => "\\'{I}",
  345:     'Iacute'=> "\\'{I}",
  346:     206     => '\\^{I}',
  347:     'Icirc' => '\\^{I}',
  348:     207     => '\\"{I}',
  349:     'Iuml'  => '\\"{I}',
  350:     208     => '\DH',
  351:     'ETH'   => '\DH',
  352:     209     => '\~{N}',
  353:     'Ntilde'=> '\~{N}',
  354:     210     => '\\`{O}',
  355:     'Ograve'=> '\\`{O}',
  356:     211     => "\\'{O}",
  357:     'Oacute'=> "\\'{O}",
  358:     212     => '\\^{O}',
  359:     'Ocirc' => '\\^{O}',
  360:     213     => '\~{O}',
  361:     'Otilde'=> '\~{O}',
  362:     214     => '\\"{O}',
  363:     'Ouml'  => '\\"{O}',
  364:     215     => '\ensuremath{\times}',
  365:     'times' => '\ensuremath{\times}',
  366:     216     => '{\O}',
  367:     'Oslash'=> '{\O}',
  368:     217     => '\\`{U}',
  369:     'Ugrave'=> '\\`{U}',
  370:     218     => "\\'{U}",
  371:     'Uacute'=> "\\'{U}",
  372:     219     => '\\^{U}',
  373:     'Ucirc' => '\\^{U}',
  374:     220     => '\\"{U}',
  375:     'Uuml'  => '\\"{U}',
  376:     221     => "\\'{Y}",
  377:     'Yacute'=> "\\'{Y}",
  378:     223     => '{\ss}',
  379:     'szlig' => '{\ss}',
  380:     224     => '\\`{a}',
  381:     'agrave'=> '\\`{a}',
  382:     225     => "\\'{a}",
  383:     'aacute'=> "\\'{a}",
  384:     226     => '\\^{a}',
  385:     'acirc' => '\\^{a}',
  386:     227     => '\\~{a}',
  387:     'atilde'=> '\\~{a}',
  388:     228     => '\\"{a}',
  389:     'auml'  => '\\"{a}',
  390:     229     => '{\aa}',
  391:     'aring' => '{\aa}',
  392:     230     => '{\ae}',
  393:     'aelig' => '{\ae}',
  394:     231     => '\c{c}',
  395:     'ccedil'=> '\c{c}',
  396:     232     => '\\`{e}',
  397:     'egrave'=> '\\`{e}',
  398:     233     => "\\'{e}",
  399:     'eacute'=> "\\'{e}",
  400:     234     => '\\^{e}',
  401:     'ecirc' => '\\^{e}',
  402:     235     => '\\"{e}',
  403:     'euml'  => '\\"{e}',
  404:     236     => '\\`{i}',
  405:     'igrave'=> '\\`{i}',
  406:     237     => "\\'{i}",
  407:     'iacute'=> "\\'{i}",
  408:     238     => '\\^{i}',
  409:     'icirc' => '\\^{i}',
  410:     239     => '\\"{i}',
  411:     'iuml'  => '\\"{i}',
  412:     241     => '\\~{n}',
  413:     'ntilde'=> '\\~{n}',
  414:     242     => '\\`{o}',
  415:     'ograve'=> '\\`{o}',
  416:     243     => "\\'{o}",
  417:     'oacute'=> "\\'{o}",
  418:     244     => '\\^{o}',
  419:     'ocirc' => '\\^{o}',
  420:     245     => '\\~{o}',
  421:     'otilde'=> '\\~{o}',
  422:     246     => '\\"{o}',
  423:     'ouml'  => '\\"{o}',
  424:     247     => '\ensuremath{\div}',
  425:     'divide'=> '\ensuremath{\div}',
  426:     248     => '{\o}',
  427:     'oslash'=> '{\o}',
  428:     249     => '\\`{u}',
  429:     'ugrave'=> '\\`{u}',
  430:     250     => "\\'{u}",
  431:     'uacute'=> "\\'{u}",
  432:     251     => '\\^{u}',
  433:     'ucirc' => '\\^{u}',
  434:     252     => '\\"{u}',
  435:     'uuml'  => '\\"{u}',
  436:     253     => "\\'{y}",
  437:     'yacute'=> "\\'{y}",
  438:     255     => '\\"{y}',
  439:     'yuml'  => '\\"{y}',
  440: 
  441: 
  442:      # hbar entity number comes from the unicode character:
  443: 
  444:     295     => '\ensuremath{\hbar}',
  445:     'planck' => '\ensuremath{\hbar}',
  446: 
  447:     # Latin extended-A HTML 4.01 entities:
  448: 
  449:     338      => '{\OE}',
  450:     'OElig'  => '{\OE}',
  451:     339      => '{\oe}',
  452:     'oelig'  => '{\oe}',
  453:     352      => '\v{S}',
  454:     'Scaron' => '\v{S}',
  455:     353      => '\v{s}',
  456:     'scaron' => '\v{s}',
  457:     376      => '\\"{Y}',
  458:     'Yuml'   => '\\"{Y}', 
  459: 
  460:     # Latin extended B HTML 4.01 entities
  461: 
  462:     402      => '\ensuremath{f}',
  463:     'fnof'   => '\ensuremath{f}',
  464: 
  465:     # Spacing modifier letters:
  466:     
  467:     710      => '\^{}',
  468:     'circ'   => '\^{}',
  469:     732      => '\~{}',
  470:     'tilde'  => '\~{}',
  471: 
  472:     # Greek uppercase:
  473: 
  474:     913      => '\ensuremath{\mathrm{A}}',
  475:     'Alpha'  => '\ensuremath{\mathrm{A}}',
  476:     914      => '\ensuremath{\mathrm{B}}',
  477:     'Beta'   => '\ensuremath{\mathrm{B}}',
  478:     915      => '\ensuremath{\Gamma}',
  479:     'Gamma'  => '\ensuremath{\Gamma}',
  480:     916      => '\ensuremath{\Delta}',
  481:     'Delta'  => '\ensuremath{\Delta}',
  482:     917      => '\ensuremath{\mathrm{E}}',
  483:     'Epsilon'=> '\ensuremath{\mathrm{E}}',
  484:     918      => '\ensuremath{\mathrm{Z}}',
  485:     'Zeta'   => '\ensuremath{\mathrm{Z}}',
  486:     919      => '\ensuremath{\mathrm{H}}',
  487:     'Eta'    => '\ensuremath{\mathrm{H}}',
  488:     920      => '\ensuremath{\Theta}',
  489:     'Theta'  => '\ensuremath{\Theta}',
  490:     921      => '\ensuremath{\mathrm{I}}',
  491:     'Iota'   => '\ensuremath{\mathrm{I}}',
  492:     922      => '\ensuremath{\mathrm{K}}',
  493:     'Kappa'  => '\ensuremath{\mathrm{K}}',
  494:     923      => '\ensuremath{\Lambda}',
  495:     'Lambda' => '\ensuremath{\Lambda}',
  496:     924      => '\ensuremath{\mathrm{M}}',
  497:     'Mu'     => '\ensuremath{\mathrm{M}}',
  498:     925      => '\ensuremath{\mathrm{N}}',
  499:     'Nu'     => '\ensuremath{\mathrm{N}}',
  500:     926      => '\ensuremath{\mathrm{\Xi}}',
  501:     'Xi'     => '\ensuremath{\mathrm{\Xi}}',
  502:     927      => '\ensuremath{\mathrm{O}}',
  503:     'Omicron'=> '\ensuremath{\mathrm{O}}',
  504:     928      => '\ensuremath{\Pi}',
  505:     'Pi'     => '\ensuremath{\Pi}',
  506:     929      => '\ensuremath{\mathrm{P}}',
  507:     'Rho'    => '\ensuremath{\mathrm{P}}',
  508:     931      => '\ensuremath{\Sigma}',
  509:     'Sigma'  => '\ensuremath{\Sigma}',
  510:     932      => '\ensuremath{\mathrm{T}}',
  511:     'Tau'    => '\ensuremath{\mathrm{T}}',
  512:     933      => '\ensuremath{\Upsilon}',
  513:     'Upsilon'=> '\ensuremath{\Upsilon}',
  514:     934      => '\ensuremath{\Phi}',
  515:     'Phi'    => '\ensuremath{\Phi}',
  516:     935      => '\ensuremath{\mathrm{X}}',
  517:     'Chi'    => '\ensuremath{\mathrm{X}}',
  518:     936      => '\ensuremath{\Psi}',
  519:     'Psi'    => '\ensuremath{\Psi}',
  520:     937      => '\ensuremath{\Omega}',
  521:     'Omega'  => '\ensuremath{\Omega}',
  522: 
  523:     # Greek lowercase:
  524: 
  525:     945      => '\ensuremath{\alpha}',
  526:     'alpha'  => '\ensuremath{\alpha}',
  527:     946      => '\ensuremath{\beta}',
  528:     'beta'   => '\ensuremath{\beta}',
  529:     947      => '\ensuremath{\gamma}',
  530:     'gamma'  => '\ensuremath{\gamma}',
  531:     948      => '\ensuremath{\delta}',
  532:     'delta'  => '\ensuremath{\delta}',
  533:     949      => '\ensuremath{\epsilon}',
  534:     'epsilon'=> '\ensuremath{\epsilon}',
  535:     950      => '\ensuremath{\zeta}',
  536:     'zeta'   => '\ensuremath{\zeta}',
  537:     951      => '\ensuremath{\eta}',
  538:     'eta'    => '\ensuremath{\eta}',
  539:     952      => '\ensuremath{\theta}',
  540:     'theta'  => '\ensuremath{\theta}',
  541:     953      => '\ensuremath{\iota}',
  542:     'iota'   => '\ensuremath{\iota}',
  543:     954      => '\ensuremath{\kappa}',
  544:     'kappa'  => '\ensuremath{\kappa}',
  545:     955      => '\ensuremath{\lambda}',
  546:     'lambda' => '\ensuremath{\lambda}',
  547:     956      => '\ensuremath{\mu}',
  548:     'mu'     => '\ensuremath{\mu}',
  549:     957      => '\ensuremath{\nu}',
  550:     'nu'     => '\ensuremath{\nu}',
  551:     958      => '\ensuremath{\xi}',
  552:     'xi'     => '\ensuremath{\xi}',
  553:     959      => '\ensuremath{o}',
  554:     'omicron'=> '\ensuremath{o}',
  555:     960      => '\ensuremath{\pi}',
  556:     'pi'     => '\ensuremath{\pi}',
  557:     961      => '\ensuremath{\rho}',
  558:     'rho'    => '\ensuremath{\rho}',
  559:     962      => '\ensuremath{\varsigma}',
  560:     'sigmaf' => '\ensuremath{\varsigma}',
  561:     963      => '\ensuremath{\sigma}',
  562:     'sigma'  => '\ensuremath{\sigma}',
  563:     964      => '\ensuremath{\tau}',
  564:     'tau'    => '\ensuremath{\tau}',
  565:     965      => '\ensuremath{\upsilon}',
  566:     'upsilon'=> '\ensuremath{\upsilon}',
  567:     966      => '\ensuremath{\phi}',
  568:     'phi'    => '\ensuremath{\phi}',
  569:     967      => '\ensuremath{\chi}',
  570:     'chi'    => '\ensuremath{\chi}',
  571:     968      => '\ensuremath{\psi}',
  572:     'psi'    => '\ensuremath{\psi}',
  573:     969      => '\ensuremath{\omega}',
  574:     'omega'  => '\ensuremath{\omega}',
  575:     977      => '\ensuremath{\vartheta}',
  576:     'thetasym'=>'\ensuremath{\vartheta}',
  577:     978      => '\ensuremath{\mathit{\Upsilon}}',
  578:     'upsih'  => '\ensuremath{\mathit{\Upsilon}}',
  579:     982      => '\ensuremath{\varpi}',
  580:     'piv'    => '\ensuremath{\varpi}',
  581: 
  582:     # The general punctuation set:
  583: 
  584:     8194     => '\hspace{.5em}',
  585:     'ensp'   => '\hspace{.5em}',
  586:     8195     => '\hspace{1.0em}',
  587:     'emsp'   => '\hspace{1.0em}',
  588:     8201     => '\hspace{0.167em}',
  589:     'thinsp' => '\hspace{0.167em}',
  590:     8204     => '{}',
  591:     'zwnj'   => '{}',
  592:     8205     => '',
  593:     'zwj'    => '',
  594:     8206     => '',
  595:     'lrm'    => '',
  596:     8207     => '',
  597:     'rlm'    => '',
  598:     8211     => '--',
  599:     'ndash'  => '--',
  600:     8212     => '---',
  601:     'mdash'  => '---',
  602:     8216     => '`',
  603:     'lsquo'  => '`',
  604:     8217     => "'",
  605:     'rsquo'  => "'",
  606:     8218     => '\quotesinglbase',
  607:     'sbquo'  => '\quotesinglbase',
  608:     8220     => '``',
  609:     'ldquo'  => '``',
  610:     8221     => "''",
  611:     'rdquo'  => "''",
  612:     8222     => '\quotedblbase',
  613:     'bdquo'  => '\quotedblbase',
  614:     8224     => '\ensuremath{\dagger}',
  615:     'dagger' => '\ensuremath{\dagger}',
  616:     '8225'   => '\ensuremath{\ddag}',
  617:     'Dagger' => '\ensuremath{\ddag}',
  618:     8226     => '\textbullet',
  619:     'bull'   => '\textbullet',
  620:     8230     => '\textellipsis',
  621:     'hellip' => '\textellipsis',
  622:     8240     => '\textperthousand',
  623:     permil   => '\textperthousand',
  624:     8242     => '\textquotesingle',
  625:     'prime'  => '\textquotesingle',
  626:     8243     => '\textquotedbl',
  627:     'Prime'  => '\textquotedbl',
  628:     8249     => '\guilsinglleft',
  629:     'lsaquo' => '\guilsinglleft',
  630:     8250     => '\guilsinglright',
  631:     'rsaquo' => '\guilsinglright',
  632:     8254     => '\textasciimacron',
  633:     oline    => '\textasciimacron',
  634:     8260     => '\textfractionsolidus',
  635:     'frasl'  => '\textfractionsolidus',
  636:     8364     => '\texteuro',
  637:     'euro'   => '\texteuro',
  638: 
  639:     # Letter like symbols
  640:     
  641:     8472     => '\ensuremath{\wp}',
  642:     'weierp' => '\ensuremath{\wp}',
  643:     8465     => '\ensuremath{\Im}',
  644:     'image'  => '\ensuremath{\Im}',
  645:     8476     => '\ensuremath{\Re}',
  646:     'real'   => '\ensuremath{\Re}',
  647:     8482     => '\texttrademark',
  648:     'trade'  => '\texttrademark',
  649:     8501     => '\ensuremath{\aleph}',
  650:     'alefsym'=> '\ensuremath{\aleph}',
  651: 
  652:     # Arrows and then some (harpoons from Hon Kie).
  653: 
  654:     8592     => '\ensuremath{\leftarrow}',
  655:     'larr'   => '\ensuremath{\leftarrow}',
  656:     8593     => '\ensuremath{\uparrow}',
  657:     'uarr'   => '\ensuremath{\uparrow}',
  658:     8594     => '\ensuremath{\rightarrow}',
  659:     'rarr'   => '\ensuremath{\rightarrow}',
  660:     8595     => '\ensuremath{\downarrow}',
  661:     'darr'   => '\ensuremath{\downarrow}',
  662:     8596     => '\ensuremath{\leftrightarrow}',
  663:     'harr'   => '\ensuremath{\leftrightarrow}',
  664:     8598     => '\ensuremath{\nwarrow}',
  665:     8599     => '\ensuremath{\nearrow}',
  666:     8600     => '\ensuremath{\searrow}',
  667:     8601     => '\ensuremath{\swarrow}',
  668:     8605     => '\ensuremath{\leadsto}',
  669:     8614     => '\ensuremath{\mapsto}',
  670:     8617     => '\ensuremath{\hookleftarrow}',
  671:     8618     => '\ensuremath{\hookrightarrow}',
  672:     8629     => '\ensuremath{\hookleftarrow}', # not an exact match but best I know.
  673:     'crarr'  => '\ensuremath{\hookleftarrow}', # not an exact match but best I know.
  674:     8636     => '\ensuremath{\leftharpoonup}',
  675:     8637     => '\ensuremath{\leftharpoondown}',
  676:     8640     => '\ensuremath{\rightharpoonup}',
  677:     8641     => '\ensuremath{\rightharpoondown}',
  678:     8652     => '\ensuremath{\rightleftharpoons}',
  679:     8656     => '\ensuremath{\Leftarrow}',
  680:     'lArr'   => '\ensuremath{\Leftarrow}',
  681:     8657     => '\ensuremath{\Uparrow}',
  682:     'uArr'   => '\ensuremath{\Uparrow}',
  683:     8658     => '\ensuremath{\Rightarrow}',
  684:     'rArr'   => '\ensuremath{\Rightarrow}',
  685:     8659     => '\ensuremath{\Downarrow}',
  686:     'dArr'   => '\ensuremath{\Downarrow}',
  687:     8660     => '\ensuremath{\Leftrightarrow}',
  688:     'hArr'   => '\ensuremath{\Leftrightarrow}',
  689:     8661     => '\ensuremath{\Updownarrow}',
  690:     'vArr'   => '\ensuremath{\Updownarrow}',
  691:     8666     => '\ensuremath{\Lleftarrow}',
  692:     'lAarr'   => '\ensuremath{\Lleftarrow}',
  693:     8667     => '\ensuremath{\Rrightarrow}',
  694:     'rAarr'  => '\ensuremath{\Rrightarrow}',
  695:     8669     => '\ensuremath{\rightsquigarrow}',
  696:     'rarrw'  => '\ensuremath{\rightsquigarrow}',
  697:     
  698:     # Mathematical operators.
  699:     
  700:     'forall' => '\ensuremath{\forall}',
  701:     8704     => '\ensuremath{\forall}',
  702:     'comp'   => '\ensuremath{\complement}',
  703:     8705     => '\ensuremath{\complement}',
  704:     'part'   => '\ensuremath{\partial}',
  705:     8706     => '\ensuremath{\partial}',
  706:     'exist'  => '\ensuremath{\exists}',
  707:     8707     => '\ensuremath{\exists}',
  708:     'nexist' => '\ensuremath{\nexists}',
  709:     8708     => '\ensuremath{\nexists}',
  710:     'empty'  => '\ensuremath{\emptyset}',
  711:     8709     => '\ensuremath{\emptyset}',
  712:     8710     => '\ensuremath{\Delta}',
  713:     'nabla'  => '\ensuremath{\nabla}',
  714:     8711     => '\ensuremath{\nabla}',
  715:     'isin'   => '\ensuremath{\in}',
  716:     8712     => '\ensuremath{\in}',
  717:     'notin'  => '\ensuremath{\notin}',
  718:     8713     => '\ensuremath{\notin}',
  719:     ni       => '\ensuremath{\ni}',
  720:     8715     => '\ensuremath{\ni}',
  721:     8716     => '\ensuremath{\not\ni}',
  722:     'prod'   => '\ensuremath{\prod}',
  723:     8719     => '\ensuremath{\prod}',
  724:     8720     => '\ensuremath{\coprod}',
  725:     'sum'    => '\ensuremath{\sum}',
  726:     8721     => '\ensuremath{\sum}',
  727:     'minus'  => '\ensuremath{-}',
  728:     8722     => '\ensuremath{-}',
  729:     8723     => '\ensuremath{\mp}',
  730:     8724     => '\ensuremath{\dotplus}',
  731:     8725     => '\ensuremath{\diagup}',
  732:     8726     => '\ensuremath{\smallsetminus}',
  733:     'lowast' => '\ensuremath{*}',
  734:     8727     => '\ensuremath{*}',
  735:     8728     => '\ensuremath{\circ}',
  736:     8729     => '\ensuremath{\bullet}',
  737:     'radic'  => '\ensuremath{\surd}',
  738:     8730     => '\ensuremath{\surd}',
  739:     8731     => '\ensuremath{\sqrt[3]{}}',
  740:     8732     => '\ensuremath{\sqrt[4]{}}',
  741:     'prop'   => '\ensuremath{\propto}',
  742:     8733     => '\ensuremath{\propto}',
  743:     'infin'  => '\ensuremath{\infty}',
  744:     8734     => '\ensuremath{\infty}',
  745: 
  746:     # The items below require the isoent latex package which I can't find at least for FC5.
  747:     # Temporarily commented out.
  748:     
  749:     'ang90'  => '\ensuremath{\sqangle}',
  750:     8735     => '\ensuremath{\sqangle}',
  751: 
  752:     'ang'    => '\ensuremath{\angle}',
  753:     8736     => '\ensuremath{\angle}',
  754:     'angmsd' => '\ensuremath{\measuredangle}',
  755:     8737     => '\ensuremath{\measuredangle}',
  756:     'angsph' => '\ensuremath{\sphericalangle}',
  757:     8738     => '\ensuremath{\sphericalangle}',
  758:     8739     => '\ensuremath{\vert}',
  759:     8741     => '\ensuremath{\Vert}', # 8741 is PARALLEL TO; 8740 is "does not divide".
  760:     'and'    => '\ensuremath{\land}',
  761:     8743     => '\ensuremath{\land}',
  762:     'or'     => '\ensuremath{\lor}',
  763:     8744     => '\ensuremath{\lor}',
  764:     'cap'    => '\ensuremath{\cap}',
  765:     8745     => '\ensuremath{\cap}',
  766:     'cup'    => '\ensuremath{\cup}',
  767:     8746     => '\ensuremath{\cup}',
  768:     'int'    => '\ensuremath{\int}',
  769:     8747     => '\ensuremath{\int}',
  770:     'conint' => '\ensuremath{\oint}',
  771:     8750     => '\ensuremath{\oint}',
  772:     'there4' => '\ensuremath{\therefore}',
  773:     8756     => '\ensuremath{\therefore}',
  774:     'becaus' => '\ensuremath{\because}',
  775:     8757     => '\ensuremath{\because}',
  776:     8758     => '\ensuremath{:}',
  777:     8759     => '\ensuremath{::}',
  778:     'sim'    => '\ensuremath{\sim}',
  779:     8764     => '\ensuremath{\sim}',
  780:     8765     => '\ensuremath{\backsim}',
  781:     'wreath' => '\ensuremath{\wr}',
  782:     8768     => '\ensuremath{\wr}',
  783:     'nsim'   => '\ensuremath{\not\sim}',
  784:     8769     => '\ensuremath{\not\sim}',
  785: #    'asymp'  => '\ensuremath{\asymp}',  &asymp; is actually a different glyph.
  786:     8771     => '\ensuremath{\simeq}',
  787:     8772     => '\ensuremath{\not\simeq}',
  788:     'cong'   => '\ensuremath{\cong}',
  789:     8773     => '\ensuremath{\cong}',
  790:     8775     => '\ensuremath{\ncong}',
  791:     8778     => '\ensuremath{\approxeq}',
  792:     8784     => '\ensuremath{\doteq}',
  793:     8785     => '\ensuremath{\doteqdot}',
  794:     8786     => '\ensuremath{\fallingdotseq}',
  795:     8787     => '\ensuremath{\risingdotseq}',
  796:     8788     => '\ensuremath{:=}',
  797:     8789     => '\ensuremath{=:}',
  798:     8790     => '\ensuremath{\eqcirc}',
  799:     8791     => '\ensuremath{\circeq}',
  800:     'wedgeq' => '\ensuremath{\stackrel{\wedge}{=}}',
  801:     8793     => '\ensuremath{\stackrel{\wedge}{=}}',
  802:     8794     => '\ensuremath{\stackrel{\vee}{=}}',
  803:     8795     => '\ensuremath{\stackrel{\star}{=}}',
  804:     8796     => '\ensuremath{\triangleq}',
  805:     8797     => '\ensuremath{\stackrel{def}{=}}',
  806:     8798     => '\ensuremath{\stackrel{m}{=}}',
  807:     8799     => '\ensuremath{\stackrel{?}{=}}',
  808:     'ne'     => '\ensuremath{\neq}',
  809:     8800     => '\ensuremath{\neq}',
  810:     'equiv'  => '\ensuremath{\equiv}',
  811:     8801     => '\ensuremath{\equiv}',
  812:     8802     => '\ensuremath{\not\equiv}',
  813:     'le'     => '\ensuremath{\leq}',
  814:     8804     => '\ensuremath{\leq}',
  815:     'ge'     => '\ensuremath{\geq}',
  816:     8805     => '\ensuremath{\geq}',
  817:     8806     => '\ensuremath{\leqq}',
  818:     8807     => '\ensuremath{\geqq}',
  819:     8810     => '\ensuremath{\ll}',
  820:     8811     => '\ensuremath{\gg}',
  821:     'twixt'  => '\ensuremath{\between}',
  822:     8812     => '\ensuremath{\between}',
  823:     8813     => '\ensuremath{\not\asymp}',
  824:     8814     => '\ensuremath{\not<}',
  825:     8815     => '\ensuremath{\not>}',
  826:     8816     => '\ensuremath{\not\leqslant}',
  827:     8817     => '\ensuremath{\not\geqslant}',
  828:     8818     => '\ensuremath{\lesssim}',
  829:     8819     => '\ensuremath{\gtrsim}',
  830:     8822     => '\ensuremath{\stackrel{<}{>}}',
  831:     8823     => '\ensuremath{\stackrel{>}{<}}',
  832:     8826     => '\ensuremath{\prec}',
  833:     8827     => '\ensuremath{\succ}',
  834:     8828     => '\ensuremath{\preceq}',
  835:     8829     => '\ensuremath{\succeq}',
  836:     8832     => '\ensuremath{\not\prec}',
  837:     8833     => '\ensuremath{\not\succ}',
  838:     'sub'    => '\ensuremath{\subset}',
  839:     8834     => '\ensuremath{\subset}',
  840:     'sup'    => '\ensuremath{\supset}',
  841:     8835     => '\ensuremath{\supset}',
  842:     'nsub'   => '\ensuremath{\not\subset}',
  843:     8836     => '\ensuremath{\not\subset}',
  844:     8837     => '\ensuremath{\not\supset}',
  845:     'sube'   => '\ensuremath{\subseteq}',
  846:     8838     => '\ensuremath{\subseteq}',
  847:     'supe'   => '\ensuremath{\supseteq}',
  848:     8839     => '\ensuremath{\supseteq}',
  849:     8840     => '\ensuremath{\nsubseteq}',
  850:     8841     => '\ensuremath{\nsupseteq}',
  851:     8842     => '\ensuremath{\subsetneq}',
  852:     8843     => '\ensuremath{\supsetneq}',
  853:     8847     => '\ensuremath{\sqsubset}',
  854:     8848     => '\ensuremath{\sqsupset}',
  855:     8849     => '\ensuremath{\sqsubseteq}',
  856:     8850     => '\ensuremath{\sqsupseteq}',
  857:     8851     => '\ensuremath{\sqcap}',
  858:     8852     => '\ensuremath{\sqcup}',
  859:     'oplus'  => '\ensuremath{\oplus}',
  860:     8853     => '\ensuremath{\oplus}',
  861:     8854     => '\ensuremath{\ominus}',
  862:     'otimes' => '\ensuremath{\otimes}',
  863:     8855     => '\ensuremath{\otimes}',
  864:     8856     => '\ensuremath{\oslash}',
  865:     8857     => '\ensuremath{\odot}',
  866:     8858     => '\ensuremath{\circledcirc}',
  867:     8859     => '\ensuremath{\circledast}',
  868:     8861     => '\ensuremath{\ominus}', # Close enough for government work.
  869:     8862     => '\ensuremath{\boxplus}',
  870:     8863     => '\ensuremath{\boxminus}',
  871:     8864     => '\ensuremath{\boxtimes}',
  872:     8865     => '\ensuremath{\boxdot}',
  873:     'vdash'  => '\ensuremath{\vdash}',
  874:     8866     => '\ensuremath{\vdash}',
  875:     'dashv'  => '\ensuremath{\dashv}',
  876:     8867     => '\ensuremath{\dashv}',
  877:     'perp'   => '\ensuremath{\perp}',
  878:     8869     => '\ensuremath{\perp}',
  879:     8871     => '\ensuremath{\models}',
  880:     8872     => '\ensuremath{\vDash}',    
  881:     8873     => '\ensuremath{\Vdash}',
  882:     8874     => '\ensuremath{\Vvdash}',
  883:     8876     => '\ensuremath{\nvdash}',
  884:     8877     => '\ensuremath{\nvDash}',
  885:     8878     => '\ensuremath{\nVdash}',
  886:     8880     => '\ensuremath{\prec}',
  887:     8881     => '\ensuremath{\succ}',
  888:     8882     => '\ensuremath{\vartriangleleft}',
  889:     8883     => '\ensuremath{\vartriangleright}',
  890:     8884     => '\ensuremath{\trianglelefteq}',
  891:     8885     => '\ensuremath{\trianglerighteq}',
  892:     8891     => '\ensuremath{\veebar}',
  893:     8896     => '\ensuremath{\land}',
  894:     8897     => '\ensuremath{\lor}',
  895:     8898     => '\ensuremath{\cap}',
  896:     8899     => '\ensuremath{\cup}',
  897:     8900     => '\ensuremath{\diamond}',
  898:     'sdot'   => '\ensuremath{\cdot}',
  899:     8901     => '\ensuremath{\cdot}',
  900:     8902     => '\ensuremath{\star}',
  901:     8903     => '\ensuremath{\divideontimes}',
  902:     8904     => '\ensuremath{\bowtie}',
  903:     8905     => '\ensuremath{\ltimes}',
  904:     8906     => '\ensuremath{\rtimes}',
  905:     8907     => '\ensuremath{\leftthreetimes}',
  906:     8908     => '\ensuremath{\rightthreetimes}',
  907:     8909     => '\ensuremath{\simeq}',
  908:     8910     => '\ensuremath{\curlyvee}',
  909:     8911     => '\ensuremath{\curlywedge}',
  910:     8912     => '\ensuremath{\Subset}',
  911:     8913     => '\ensuremath{\Supset}',
  912:     8914     => '\ensuremath{\Cap}',
  913:     8915     => '\ensuremath{\Cup}',
  914:     8916     => '\ensuremath{\pitchfork}',
  915:     8918     => '\ensuremath{\lessdot}',
  916:     8919     => '\ensuremath{\gtrdot}',
  917:     8920     => '\ensuremath{\lll}',
  918:     8921     => '\ensuremath{\ggg}',
  919:     8922     => '\ensuremath{\lesseqgtr}',
  920:     8923     => '\ensuremath{\gtreqless}',
  921:     8924     => '\ensuremath{\eqslantless}',
  922:     8925     => '\ensuremath{\eqslantgtr}',
  923:     8926     => '\ensuremath{\curlyeqprec}',
  924:     8927     => '\ensuremath{\curlyeqsucc}',
  925:     8928     => '\ensuremath{\not\preccurlyeq}',
  926:     8929     => '\ensuremath{\not\succcurlyeq}',
  927:     8930     => '\ensuremath{\not\sqsubseteq}',
  928:     8931     => '\ensuremath{\not\sqsupseteq}',
  929:     8938     => '\ensuremath{\not\vartriangleleft}',
  930:     8939     => '\ensuremath{\not\vartriangleright}',
  931:     8940     => '\ensuremath{\not\trianglelefteq}',
  932:     8941     => '\ensuremath{\not\trianglerighteq}',
  933:     8942     => '\ensuremath{\vdots}',
  934:     8960     => '\ensuremath{\varnothing}',
  935:     'lceil'  => '\ensuremath{\lceil}',
  936:     8968     => '\ensuremath{\lceil}',
  937:     'rceil'  => '\ensuremath{\rceil}',
  938:     8969     => '\ensuremath{\rceil}',
  939:     'lfloor' => '\ensuremath{\lfloor}',
  940:     8970     => '\ensuremath{\lfloor}',
  941:     'rfloor' => '\ensuremath{\rfloor}',
  942:     8971     => '\ensuremath{\rfloor}',
  943:     'lang'   => '\ensuremath{\langle}',
  944:     9001     => '\ensuremath{\langle}',
  945:     'rang'   => '\ensuremath{\rangle}',
  946:     9002     => '\ensuremath{\rangle}',
  947:     'loz'    => '\ensuremath{\lozenge}',
  948:     9674     => '\ensuremath{\lozenge}',
  949:     'spades' => '\ensuremath{\spadesuit}',
  950:     9824     => '\ensuremath{\spadesuit}',
  951:     9825     => '\ensuremath{\heartsuit}',
  952:     9826     => '\ensuremath{\diamondsuit}',
  953:     'clubs'  => '\ensuremath{\clubsuit}',
  954:     9827     => '\ensuremath{\clubsuit}',
  955:     'diams'  => '\ensuremath{\blacklozenge}',
  956:     9830     => '\ensuremath{\blacklozenge}'
  957:     
  958: );
  959: 
  960: =pod
  961: 
  962: =head1 UNICODE TABLE
  963: 
  964: =over
  965: 
  966:     Some named entities have no good LaTeX equivalent;
  967:     these are converted to UTF-8 via this table
  968:     of entity name -> Unicode code point.
  969: 
  970: =back
  971: 
  972: =cut
  973: 
  974: my  %utf_table = (
  975:     'THORN'  => 222,
  976:     'thorn'  => 254,
  977:     'eth'    => 240,
  978:     'hearts' => 9829
  979: );
  980: 
  981: sub entity_to_utf8 {
  982:     my ($unicode) = @_;
  983:     my $result =  pack("U", $unicode);
  984:     return $result;
  985: }
  986: 
  987: 
  988: 
  989: sub entity_to_latex {
  990:     my ($entity) = @_;
  991: 
  992:     # Try to look up the entity (text or numeric) in the hash:
  993: 
  994: 
  995: 
  996:     my $latex = $entities{"$entity"};
  997:     if (defined $latex) {
  998: 	return $latex;
  999:     }
 1000:     # If the entity is purely numeric, convert it directly to UTF-8.
 1001:     # Otherwise, a few named entities that have no good LaTeX equivalent
 1002:     # can be converted via the Unicode table:
 1003:     #
 1004:     if ($entity =~ /^\d+$/) {
 1005: 	return &entity_to_utf8($entity);
 1006:     } else {
 1007: 	my $result = $utf_table{"$entity"};
 1008: 	if (defined $result) {
 1009: 	    return &entity_to_utf8($result);
 1010: 	}
 1011:     }
 1012:     #  Can't do the conversion ...
 1013: 
 1014:     return " ";
 1015: }
 1016: 
 1017: 
 1018: sub replace_entities {
 1019:     my ($input)  = @_;
 1020:     my $start;
 1021:     my $end;
 1022:     my $entity;
 1023:     my $latex;
 1024:     
 1025:     # First the &#nnn; entities:
 1026: 
 1027:     while ($input =~ /(&\#\d+;)/) {
 1028: 	($start) = @-;
 1029: 	($end)   = @+;
 1030: 	$entity  = substr($input, $start+2, $end-$start-3);
 1031: 	$latex = &entity_to_latex($entity);
 1032: 	substr($input, $start, $end-$start) = $latex;
 1033:     }
 1034: 
 1035:     # Hexadecimal entities:
 1036: 
 1037:     while ($input =~ /&\#x[\da-fA-F]+;/) {
 1038: 	($start) = @-;
 1039: 	($end)   = @+;
 1040: 	$entity  = "0" . substr($input, $start+2, $end-$start-3); # 0xhexnumber
 1041: 	$latex = &entity_to_latex(hex($entity));
 1042: 	substr($input, $start, $end-$start) = $latex;
 1043:     }
 1044: 
 1045: 
 1046:     # Now the &text; entities:
 1047:     
 1048:     while ($input =~/(&\w+;)/) {
 1049: 	($start) = @-;
 1050: 	($end)   = @+;
 1051: 	$entity   = substr($input, $start+1, $end-$start-2);
 1052: 	$latex    = &entity_to_latex($entity);
 1053: 	substr($input, $start, $end-$start) = $latex;
 1054: 	
 1055:    }
 1056:     return $input;
 1057: }
 1058: 
 1059: 1; 
 1060: 
 1061: __END__
 1062: 
 1063: =pod
 1064: 
 1065: =head1 NAME
 1066: 
 1067: Apache::entities.pm
 1068: 
 1069: =head1 SYNOPSIS
 1070: 
 1071: This file contains a table driven entity-->latex converter.
 1072: 
 1073: This is part of the LearningOnline Network with CAPA project
 1074: described at http://www.lon-capa.org.
 1075: 
 1076: =head1 OVERVIEW
 1077: 
 1078: 
 1079: Assumptions:
 1080:  The number of entities in a resource is small compared with the
 1081:  number of possible entities that might be translated.
 1082:  Therefore the strategy is to match a general entity pattern
 1083:  &.+; over and over, pull out the match, look it up in an entity -> tex hash,
 1084:  and do the replacement.
 1085: 
 1086: In order to simplify the hash, the following reductions are done:
 1087:  &#\d+; has the &# and ; stripped and is converted to an int.
 1088:  &#x.+; has the &#x and ; stripped and is converted to an int as a hex
 1089:                            value.
 1090:  All others have the & and ; stripped.
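
The three reductions above can be sketched as a single helper. This is only an
illustration: entity_key() is a hypothetical name that does not exist in
entities.pm, which performs the stripping inline in replace_entities().

```perl
use strict;
use warnings;

# Hypothetical helper (not part of entities.pm): reduce a complete entity
# such as "&#8800;", "&#x2260;" or "&ne;" to the key used in the hash.
sub entity_key {
    my ($entity) = @_;
    if ($entity =~ /^&\#(\d+);$/) {
        return $1 + 0;      # decimal: strip "&#" and ";", convert to an int
    }
    if ($entity =~ /^&\#[xX]([0-9a-fA-F]+);$/) {
        return hex($1);     # hexadecimal: strip "&#x" and ";", convert from hex
    }
    if ($entity =~ /^&(\w+);$/) {
        return $1;          # named: strip "&" and ";"
    }
    return;                 # not a complete entity
}

print entity_key('&#8800;'),  "\n";
print entity_key('&#x2260;'), "\n";
print entity_key('&ne;'),     "\n";
```

Because the decimal and hexadecimal forms reduce to the same integer, the hash
only needs one numeric key per character.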
 1091: 
 1092: 
 1093: The hash:  Add new conversions here; leave off the leading & and the trailing ;.
 1094: All numeric entities need only appear as their decimal versions
 1095: (e.g. 1234 is sufficient; there is no need for 0x4d2 as well).
 1096: 
 1097: This entity table is mercilessly cribbed from the HTML pocket reference
 1098: table starting at pg 82.  In most cases the LaTeX equivalent codes come from
 1099: the massive regular expression replacements originally written by
 1100: A. Sakharuk in lonprintout.pm
 1101: 
 1102: I also want to acknowledge
 1103:  ISO Character entities and their LaTeX equivalents by 
 1104:     Vidar Bronken Gundersen, and Rune Mathisen
 1105:   http://www.bitjungle.com/isoent-ref.pdf
 1106: 
 1107: 
 1108: Note that numerical entities are essentially Unicode character codes.
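
As a small illustration of that note, the sketch below turns a code point into
a character with pack("U", ...), the same call entity_to_utf8() uses, and then
encodes it explicitly so the UTF-8 bytes can be inspected. The variable names
are illustrative only.

```perl
use strict;
use warnings;

# pack("U", $n) yields the character whose Unicode code point is $n.
my $thorn = pack("U", 222);     # 'THORN' => 222 in %utf_table

# Encode a copy explicitly to look at the UTF-8 byte sequence.
my $bytes = $thorn;
utf8::encode($bytes);

printf "code point %d, %d UTF-8 byte(s)\n", ord($thorn), length($bytes);
```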
 1109: 
 1110: 
 1111: =head1 SUBROUTINES
 1112: 
 1113: =over
 1114: 
 1115: =item entity_to_utf8()
 1116: 
 1117: 
 1118: Convert a numerical entity (that does not exist in our hash)
 1119:  to its UTF-8 equivalent representation.
 1120:  This allows us to support, to some extent, any entity for which
 1121:  dvipdf can find a glyph (given that LaTeX is now UTF-8 clean).
 1122: 
 1123: Parameters:
 1124:   unicode  - The Unicode code point of the character.  This is assumed to
 1125:              be a decimal value.
 1126: Returns:
 1127:   The UTF-8 equivalent of the value.
 1128: 
 1129: =item entity_to_latex()
 1130: 
 1131:  Convert an entity to the corresponding LaTeX if possible.
 1132:  If not possible, and the entity is numeric,
 1133:  the entity is treated like a Unicode character and converted
 1134:  to UTF-8, which should display as long as dvipdf can find the
 1135:  appropriate glyph.
 1136: 
 1137:  The entity is assumed to have already had the leading & (or &#)
 1138:  and the trailing ; removed.
 1139: 
 1140: Parameters:
 1141:   entity    - Name of entity to convert.
 1142: Returns:
 1143:  One of the following:
 1144:   - Latex string that produces the entity.
 1145:   - UTF-8 equivalent of a numeric entity for which we don't have a latex string.
 1146:   - ' ' for text entities for which there's no latex equivalent.
 1147: 
 1148: 
 1149: =item replace_entities()
 1150: 
 1151:  Convert all the entities in a string.
 1152:  We locate all the entities, pass them into entity_to_latex, and
 1153:  replace the occurrences in the input string.
 1154:  The assumption is that there are few entities in any string/document
 1155:  so this looping is not too bad.  The advantage of looping vs. regexping is
 1156:  that we now can use lookup tables for the translation in entity_to_latex above.
 1157: 
 1158: Parameters:
 1159:   input   - Input string/document
 1160: Returns:
 1161:   input with entities replaced by latexable stuff (UTF-8 encodings or
 1162:   latex control strings to produce the entity).
 1163: 
 1164: =back
 1165: 
 1166: =cut
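
The lookup order documented above (hash first, then UTF-8 for numeric or
specially listed names, then a blank) can be sketched in a self-contained way.
Everything prefixed with demo_ is hypothetical and uses a three-entry stand-in
table rather than the real %entities hash. The single s///ge pass is an
alternative shown for illustration, not the module's own three while loops.

```perl
use strict;
use warnings;

# Tiny stand-ins for %entities and %utf_table (illustrative subset only).
my %demo_entities = (
    'ne' => '\ensuremath{\neq}',
    8800 => '\ensuremath{\neq}',
    'le' => '\ensuremath{\leq}',
);
my %demo_utf = ('thorn' => 254);

# Same fallback order as entity_to_latex(): table, then UTF-8, then ' '.
sub demo_entity_to_latex {
    my ($entity) = @_;
    return $demo_entities{$entity}       if exists $demo_entities{$entity};
    return pack("U", $entity)            if $entity =~ /^\d+$/;
    return pack("U", $demo_utf{$entity}) if exists $demo_utf{$entity};
    return ' ';
}

# Assumption: one substitution handles decimal, hexadecimal and named forms.
sub demo_replace_entities {
    my ($input) = @_;
    $input =~ s{&(?:\#(\d+)|\#[xX]([0-9a-fA-F]+)|(\w+));}{
        demo_entity_to_latex(defined($1) ? $1 : defined($2) ? hex($2) : $3)
    }ge;
    return $input;
}

print demo_replace_entities('a &ne; b &#8800; c &#x2260; d'), "\n";
```

A single pass scans the input once and never rescans replaced text, whereas
the loops in replace_entities() restart the match from the beginning of the
string after every replacement.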
