Choice of COBOL for Braille Translation
MITRE Technical Report MTR-1743
by J. K. Millen; December 1969
The MITRE Corporation

Subject: Choice of COBOL for Braille Translation
Author: Dr. Jonathan K. Millen
Dept.: D-73
Date: December 1969
Contract No.: SR 21697
Contract Sponsor: Sensory Aids Evaluation and Development Center, Massachusetts Institute of Technology
Project: 1248
Issued at: Bedford, Massachusetts

This document has been approved for public release.

ABSTRACT

One product of MITRE Project 1248 is a computer program, DOTSYS II, written in a higher-level language, for translating English text into standard English braille. The emphasis of its design is to make the program functions lucid, and the transfer to other computer systems easily accomplished. An important factor in attaining the stated goals is the choice of higher-level language. The higher-level language chosen for DOTSYS II is COBOL. This document explains why.

TABLE OF CONTENTS

Introduction
Considerations motivating the choice
Implementation on IBM 360
Visibility
Transferability: availability
Transferability: standardization
Efficiency: the benchmark
Size comparison
Speed comparison
Conclusion
Appendix I    LANGUAGE EVALUATION
Appendix II   FORTRAN BENCHMARK SHIFTING ROUTINES
Appendix III  THE COBOL BENCHMARK
References
Distribution List

LIST OF TABLES

Table I   Number of computer types whose manufacturer supplies ALGOL, COBOL, or FORTRAN
Table II  Bytes of storage required for the benchmark programs in COBOL and FORTRAN

Digitized by the Internet Archive in 2011 with funding from National Federation of the Blind (NFB)
http://www.archive.org/details/choiceofcobolforOOmill

CHOICE OF COBOL FOR BRAILLE TRANSLATION

Introduction

One product of Project 1248 is a computer program, DOTSYS II, written in a higher-level language, for translating English text into standard English braille. [1] The emphasis of its design is to make the program functions lucid, and the transfer to other computer systems easily accomplished. An important factor in attaining the stated goals is the choice of higher-level language. The higher-level language chosen for DOTSYS II is COBOL. This document explains why.

Candidates for choice were those higher-level languages which are widely used and have character-handling ability. The survey of programming languages by Sammet [2] yielded the following candidates: ALGOL, APL, COBOL, COMIT, FORTRAN IV, JOVIAL, LISP 1.5, PL/I, SNOBOL, and TRAC.

Considerations motivating the choice

Four considerations motivated the choice among the candidates: implementation on the IBM 360, visibility, transferability, and efficiency.

Visibility and transferability were brought up by R. A. J. Gildea in 1968 [3] as reasons for using a higher-level language instead of an assembly-level language. Visibility is the ability of programmers other than the author of the program to determine from reading the program where to make changes, should changes be necessary. Transferability is the ability to run the program on computer systems other than the one on which the program was debugged.
Rather than evaluate all the above languages with respect to all four considerations, the considerations were applied successively to eliminate some languages at each step until only one was left. This procedure was necessary because of time limitations. In particular, the judgment of efficiency required writing a benchmark program in each language to be compared. It was likely that doing this for more than two languages would jeopardize timely completion of the project. A table summarizing the results of the evaluation is included as Appendix I.

Implementation on IBM 360

Since DOTSYS II is required to run at least on the IBM 360, those higher-level languages were eliminated from candidacy which are not implemented on the IBM 360, or whose implementations on the IBM 360 are too recent to be evaluated properly. In this category are APL, COMIT, IPL V, LISP 1.5, and TRAC.

Visibility

Coincidentally, all of these, though to a lesser extent COMIT, have notations which are difficult to understand, rendering the languages difficult to learn and programs in them difficult to read, at least in the environment of braille translation. In short, their visibility is poor. A COMIT program for the IBM 7090 with partial Grade 2 braille translation capability exists, but the author of that program judged COMIT "...slow for the amount of table searching that must be done for braille translation." [4]

Transferability: availability

This left ALGOL, COBOL, FORTRAN IV, JOVIAL, PL/I, and SNOBOL. The next consideration applied was that of transferability. Two aspects of transferability are (1) availability of implementations of the language processor (compiler, interpreter, or other necessary software) and (2) standardization. The number of computer types for which computer manufacturers supply each language was examined as a measure of availability.
Data on the number of computer types on which ALGOL, COBOL and FORTRAN are supplied was found in Computer Characteristics Review [5] and is summarized in Table I. Clearly, FORTRAN IV is the most widespread, with COBOL second, and ALGOL a poor third.

               U.S.    Foreign
    ALGOL        21         48
    COBOL       108         87
    FORTRAN     150        120

Table I
Number of computer types whose manufacturer supplies ALGOL, COBOL, or FORTRAN

Information about implementations of PL/I was obtained from Prof. Robert Rosin, [6] who is maintaining an (unpublished) independent survey of them. According to Prof. Rosin, the IBM 360 is the only computer for which the manufacturer furnishes PL/I, although there exist various private implementations of versions for other computers, such as one at Northern Electric Co., Ottawa, for the CDC 3300. Information from sales personnel indicates that a number of manufacturers, such as General Electric and Burroughs, plan to supply a PL/I compiler within a year or two, while RCA prefers to wait until the language has been standardized. By comparison with ALGOL, COBOL, and FORTRAN IV, PL/I is at present the least available.

JOVIAL and SNOBOL are not supplied by any major computer manufacturer.

Transferability: standardization

The second aspect of transferability, standardization, was measured by the public acceptance of a language specification document. FORTRAN and COBOL both have standards published by the American National Standards Institute, [7,8] which is the official authority for industrial standardization in the United States. There is a proposed international standard for ALGOL, [9] but meanwhile the Revised ALGOL 60 Report [10] is the current accepted standard. PL/I and JOVIAL are under consideration for ANSI standardization; there has been no consideration of ANSI standardization of SNOBOL.

On the basis of transferability it was decided to reject all but FORTRAN and COBOL.
Efficiency: the benchmark

The relative efficiency of FORTRAN and COBOL was then determined by coding a benchmark program in each language, and comparing storage space and running time. The problem programmed was one used by John R. Siems to test the relative efficiency of assembly programs on the IBM 709 and IBM 360/30. We quote from his description: [11]

    A problem was required which would be of the type encountered in braille translation and which would be primarily a test of internal processing capability. The problem selected was that of locating and marking in an inkprint text all of the occurrences of the letter groups which are sometimes contracted in braille. Whether or not the contraction should be used in the specific case was not made part of the problem. Letter groups, even if overlapped, were to be indicated by parentheses.

    Text example:  READ CONE OTHER RECEIVE
    Processed:     R(EA)D (C(ON)E) O((TH)(E)R) (RECEIVE)

The FORTRAN benchmark program was written and debugged in two weeks. The COBOL language was learned and the COBOL program written and debugged in two weeks (the latter two weeks overlapped the first so that the whole period was about three weeks, working approximately half time).

Careful attention was given to coding the FORTRAN program in the most efficient way, partly because it was the author's guess before the test, as a result of considerable experience with FORTRAN, that FORTRAN would be faster. Letter groups which could be contracted in braille were stored in a table of integers, with four 8-bit characters packed in each 32-bit integer to conserve space. As a consequence of this approach, subroutines had to be written which would isolate individual characters, and it was found that this could be done by shifting. The remainder of the details would require considerable further explanation; instead, the reader is referred to Appendix II, which contains a listing of the subroutines used.
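For readers who want to experiment with the benchmark problem, the following is a minimal Python sketch of the marking task quoted above and of the packing idea used in the FORTRAN version. It is an illustration only and is not taken from either benchmark program. The names `SAMPLE_GROUPS`, `mark_letter_groups`, `pack4`, and `unpack4` are hypothetical, the seven-entry table is an assumed stand-in for the real contraction table (which the report does not list), and character codes are ASCII here, whereas the IBM 360 used EBCDIC.

```python
# Hypothetical stand-in for the contraction table; the report does not list
# its entries. These seven groups are enough to reproduce the quoted example.
SAMPLE_GROUPS = ["EA", "CON", "ONE", "TH", "THE", "ER", "RECEIVE"]


def mark_letter_groups(text, groups):
    """Parenthesize every occurrence of every group, overlaps included.

    A "(" is written before the first character of each match, and a ")"
    after the last character of each match, so overlapping groups can give
    output such as "(TH)(E)R)" for THE overlapping ER.
    """
    closes = [0] * len(text)          # closes[i]: groups ending at index i
    out = []
    for i, ch in enumerate(text):
        for group in groups:
            if text.startswith(group, i):
                out.append("(")
                closes[i + len(group) - 1] += 1
        out.append(ch)
        out.append(")" * closes[i])
    return "".join(out)


def pack4(chars):
    """Pack up to four 8-bit character codes into one 32-bit integer.

    Assumption: codes are ASCII and short groups are padded with blanks.
    """
    if len(chars) > 4:
        raise ValueError("at most four characters fit in a 32-bit word")
    word = 0
    for ch in chars.ljust(4):
        word = word * 256 + ord(ch)   # multiply by 256 = shift left 8 bits
    return word


def unpack4(word):
    """Recover the four characters by repeated division by 256."""
    chars = []
    for _ in range(4):
        word, code = divmod(word, 256)  # divide by 256 = shift right 8 bits
        chars.append(chr(code))
    return "".join(reversed(chars))


if __name__ == "__main__":
    print(mark_letter_groups("READ CONE OTHER RECEIVE", SAMPLE_GROUPS))
    print(hex(pack4("READ")), repr(unpack4(pack4("READ"))))
```

Python integers are unbounded and unsigned here, so the sign handling that the Appendix II routines need on a 32-bit two's-complement machine does not arise in this sketch.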
The COBOL program was coded in a simple, straightforward way. COBOL's ability to represent strings of characters by data names, and to break up the strings into fields by redefining their structure, proved convenient. A listing of the COBOL benchmark program is included as Appendix III.

Size comparison

The next step was to compile both programs, and observe the size of the load modules.* The storage space required for a load module falls into three categories, as shown in Table II: (1) the table of contracted letter groups; (2) program: instructions and storage explicitly requested; (3) ancillary functions: storage for execution-time routines (such as for input and output) and related work areas.

* Using Version 17 of the IBM Operating System on an IBM 360 model 50, the level G FORTRAN IV compiler and the level F COBOL compiler.

The reason for distinguishing the latter two categories is that the total space in the ancillary functions category is much more likely to vary among computer types and implementations than is the total space in the program category.

The number of bytes in each load module for each category is given in Table II. The COBOL program required less storage in all categories.

                          COBOL    FORTRAN
    Contraction Table     2,276      2,454
    Program*              3,152      7,458
    Ancillary**             832     18,784
    Total                 6,260     28,696

Table II
Bytes of storage required for the benchmark programs in COBOL and FORTRAN

* Program storage includes instructions and storage explicitly requested (except contraction table).
** Ancillary function storage includes execution-time routines (such as for input and output) and related work areas.

Speed comparison

Finally, a data base on punched cards of about 5,500 words of text from the Wall Street Journal was given as input to each program. The FORTRAN program processed the text at a rate of 676 words per minute (of central processor time); the COBOL program processed the same text at a rate of 2070 words per minute.

Conclusion

Thus COBOL was chosen.
It is implemented on the IBM 360; available on many models of computers, including those of all major manufacturers; easy to learn and read; and efficient. Even if it should, in the future, be deemed desirable to recode DOTSYS II in PL/I, it would be easy, since PL/I practically contains COBOL as a subset. Last, but not least, the ease of coding in COBOL bodes well for the completion of the project on time.

Dr. Jonathan K. Millen
Information Processing

JKM:dk

APPENDIX I

LANGUAGE EVALUATION

                  IBM/360                       Transferability
                  Implementation  Visibility  Availability  Standard  Efficiency
    ALGOL               A             A            B            A          -
    APL                 B             F            -            -          -
    COBOL               A             A            A            A          A
    COMIT               F             B            -            -          *
    FORTRAN IV          A             A            A            A          F
    IPL V               F             F            -            -          -
    JOVIAL              A             A            F            F          -
    LISP 1.5            B             F            -            -          -
    PL/I                A             A            F            F          -
    SNOBOL              A             B            F            F          -
    TRAC                B             F            -            -          -

    A = acceptable
    B = borderline
    F = unacceptable
    * = reportedly inefficient; see reference 4
    - = not evaluated

APPENDIX II

FORTRAN BENCHMARK SHIFTING ROUTINES

C
C     SHIFT RIGHT 8 BITS
C
      INTEGER FUNCTION SHIFTR(IW)
      SHIFTR = IW
      IF (IW.GE.0) GO TO 1
      SHIFTR = -(SHIFTR+1)
    1 SHIFTR = SHIFTR/256
      IF (IW.GE.0) GO TO 2
      SHIFTR = -(SHIFTR+1)+16777216
    2 RETURN
      END
C
C     SHIFT LEFT 8 BITS
C
      INTEGER FUNCTION SHIFTL(IW)
      CLEAR1(IW) = IMOD(IW,16777216)
      SHIFTL = CLEAR1(IW)
      IF (IW.GE.8388608) GO TO 1
      SHIFTL = SHIFTL*256
      RETURN
    1 SHIFTL=-((16777216-(SHIFTL+1))*256+1)-255
      RETURN
      END
C
C     POSITIVE INTEGER PART
C
      FUNCTION IMOD(IW,M)
      IMOD = MOD(IW,M)
      IF (IMOD.GE.0) GO TO 1
      IMOD = IMOD + M
    1 RETURN
      END

APPENDIX III

THE COBOL BENCHMARK

       IDENTIFICATION DIVISION.
       PROGRAM-ID. 'CBRAILLE'.
       AUTHOR. J. K. MILLEN.
       INSTALLATION. MITRE D73.
       DATE-WRITTEN. SEPT. 24, 1969.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SOURCE-COMPUTER. IBM-360 M50.
       OBJECT-COMPUTER. IBM-360 M50.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT SYSINPUT, ASSIGN TO 'SYSIN' UTILITY.
           SELECT SYSPRINT, ASSIGN TO 'SYSPRINT' UTILITY.
       DATA DIVISION.
       FILE SECTION.
       FD  SYSINPUT
           LABEL RECORDS ARE STANDARD,
           RECORDING MODE IS F,
           RECORD CONTAINS 80 CHARACTERS,
           BLOCK CONTAINS 80 CHARACTERS,
           DATA RECORD IS INPUT-RECORD.
       01  INPUT-RECORD.
           02 TEXT.
              03 CHAR OCCURS 80 TIMES, PICTURE X.
           02 TABLE-INFO REDEFINES TEXT.
              03 NUMBER PICTURE 999.
              03 REST PICTURE X(77).
       FD  SYSPRINT
           LABEL RECORDS ARE STANDARD,
           RECORDING MODE IS F,
           RECORD CONTAINS 121 CHARACTERS,
           BLOCK CONTAINS 121 CHARACTERS,
           DATA RECORD IS PRINTED.
       01  PRINTED.
           02 FILLER PICTURE A.
           02 OUT.
              03 PLACE OCCURS 120 TIMES, PICTURE X.
       WORKING-STORAGE SECTION.
       77  NALPHABET PICTURE S999, USAGE COMPUTATIONAL.
       77  INPTR PICTURE S99, USAGE COMPUTATIONAL.
       77  OUTPTR PICTURE S99, VALUE 1, USAGE COMPUTATIONAL.
       77  TEMP PICTURE X(9).
       77  LPAR PICTURE X, VALUE '('.
       77  RPAR PICTURE X, VALUE ')'.
       77  J PICTURE 999, USAGE COMPUTATIONAL.
       77  N PICTURE 99, USAGE COMPUTATIONAL.
       77  NN PICTURE 999, USAGE COMPUTATIONAL.
       77  NXTCHR PICTURE X.
       77  OUTCHR PICTURE X.
       77  NTABLE PICTURE 999, USAGE COMPUTATIONAL.
       77  NINE-BLANKS PICTURE X(9), VALUE SPACES.
       77  A-BLANK PICTURE A, VALUE SPACE.
       01  REGISTER.
           02 L.
              03 RL9 PICTURE X(9).
              03 RR1 PICTURE X.
           02 R REDEFINES L.
              03 RL1 PICTURE X.
              03 RR9 PICTURE X(9).
           02 RCHARS REDEFINES L.
              03 FIRSTCHAR PICTURE X.
              03 RCHAR OCCURS 9 TIMES, PICTURE X.
       01  TABLE.
           02 TABLE-ENTRY OCCURS 200 TIMES, PICTURE X(9).
              03 TABLE-CHAR OCCURS 9 TIMES, PICTURE X.
       01  LENGTH-TABLE.
           02 LTABLE OCCURS 200 TIMES, DEPENDING ON NTABLE,
              PICTURE S9, USAGE COMPUTATIONAL.
       01  ALPHABET.
           02 SYMBOL OCCURS 38 TIMES, PICTURE X.
       01  INDICES.
           02 EXTENT OCCURS 38 TIMES, PICTURE S999,
              USAGE COMPUTATIONAL.
       01  RPAR-COUNT-TABLE.
           02 RCS OCCURS 9 TIMES, PICTURE S9, USAGE COMPUTATIONAL.
       PROCEDURE DIVISION.
       INITIALIZATION.
           OPEN INPUT SYSINPUT.
           OPEN OUTPUT SYSPRINT.
           READ SYSINPUT, AT END GO TO FIN.
           MOVE NUMBER TO NTABLE.
           PERFORM FILL-TABLE, VARYING N FROM 1 BY 1
               UNTIL N = NTABLE.
       FILL-TABLE.
           READ SYSINPUT INTO TABLE-ENTRY (N), AT END GO TO FIN.
           PERFORM NOTHING, VARYING J FROM 1 BY 1
               UNTIL CHAR (J) = SPACE.
           SUBTRACT 1 FROM J.
           MOVE J TO LTABLE (N).
       END-FILL-TABLE.
           READ SYSINPUT, AT END GO TO FIN.
           MOVE NUMBER TO NALPHABET.
           PERFORM READ-ALPHABET, VARYING N FROM 1 BY 1
               UNTIL N = NALPHABET.
       READ-ALPHABET.
           READ SYSINPUT INTO SYMBOL (N), AT END GO TO FIN.
           READ SYSINPUT, AT END GO TO FIN.
           MOVE NUMBER TO EXTENT (N).
       END-READ-ALPHABET.
       READ-FIRST-RECORD.
           READ SYSINPUT, AT END GO TO FIN.
           MOVE 11 TO INPTR.
           MOVE TEXT TO REGISTER.
           PERFORM ZERO-RCS, VARYING N FROM 1 BY 1 UNTIL N = 9.
       ZERO-RCS.
           MOVE 0 TO RCS (N).
       END-ZERO-RCS.
           GO TO TRANSLATION.
       NOTHING.
           EXIT.
       END-INITIALIZATION.
       TRANSLATION.
           MOVE 1 TO N.
       TEST.
           IF SYMBOL (N) = RL1 THEN GO TO DO-SEARCH.
           IF N = NALPHABET THEN GO TO SHIFT.
           ADD 1 TO N.
           GO TO TEST.
       DO-SEARCH.
           COMPUTE NN = N + 1.
           PERFORM THE-SEARCH THRU END-SEARCH,
               VARYING J FROM EXTENT (N) BY 1
               UNTIL J = EXTENT (NN).
       SHIFT.
           MOVE RL1 TO OUTCHR.
           PERFORM OUTPUT-CHAR THRU END-OUTPUT.
           PERFORM CHECK-RCS THRU ENDCKRCS,
               VARYING N FROM 1 BY 1 UNTIL N = 9.
       CHECK-RCS.
           IF RCS (N) NOT = 1 THEN GO TO SUBTRACT-1.
           MOVE RPAR TO OUTCHR.
           PERFORM OUTPUT-CHAR THRU END-OUTPUT.
       SUBTRACT-1.
           IF RCS (N) NOT = 0 THEN SUBTRACT 1 FROM RCS (N).
       ENDCKRCS.
           EXIT.
       END-CHECK-RCS.
           MOVE RR9 TO TEMP.
           MOVE TEMP TO RL9.
           PERFORM INPUT-CHAR THRU END-INPUT.
           MOVE NXTCHR TO RR1.
           GO TO TRANSLATION.
       THE-SEARCH.
           IF LTABLE (J) = 0 THEN GO TO END-SEARCH.
           MOVE 1 TO N.
       COMPARE.
           IF N > LTABLE (J) THEN GO TO MARK.
           IF RCHAR (N) NOT = TABLE-CHAR (J, N)
               THEN GO TO END-SEARCH.
           ADD 1 TO N.
           GO TO COMPARE.
       MARK.
           PERFORM NOTHING, VARYING N FROM 1 BY 1
               UNTIL RCS (N) = 0.
           MOVE LTABLE (J) TO RCS (N).
           ADD 1 TO RCS (N).
           MOVE LPAR TO OUTCHR.
           PERFORM OUTPUT-CHAR THRU END-OUTPUT.
       END-SEARCH.
           EXIT.
       END-TRANSLATION.
           EXIT.
       INPUT-CHAR.
           IF INPTR = 73 THEN GO TO READ-NEW-RECORD.
           MOVE CHAR (INPTR) TO NXTCHR.
           ADD 1 TO INPTR.
           GO TO END-INPUT.
       READ-NEW-RECORD.
           READ SYSINPUT, AT END GO TO FIN.
           MOVE CHAR (1) TO NXTCHR.
           MOVE 2 TO INPTR.
       END-INPUT.
           EXIT.
       OUTPUT-CHAR.
           IF OUTPTR = 121 THEN GO TO PRINT.
           MOVE OUTCHR TO PLACE (OUTPTR).
           ADD 1 TO OUTPTR.
           GO TO END-OUTPUT.
       PRINT.
           WRITE PRINTED AFTER ADVANCING 1.
           MOVE OUTCHR TO PLACE (1).
           MOVE 2 TO OUTPTR.
       END-OUTPUT.
           EXIT.
       FIN.
           IF OUTPTR = 121 THEN GO TO CLOSE-FILES.
           PERFORM PAD, VARYING N FROM OUTPTR BY 1
               UNTIL N = 120.
       PAD.
           MOVE A-BLANK TO PLACE (N).
       CLOSE-FILES.
           WRITE PRINTED AFTER ADVANCING 1.
           CLOSE SYSINPUT.
           CLOSE SYSPRINT.
           STOP RUN.

REFERENCES

1. English Braille American Edition, 1959 (Revised 1966), American Printing House for the Blind, Louisville, Ky.
2. Jean E. Sammet, Programming Languages: History and Fundamentals, Prentice-Hall, 1969.
3. Robert A. J. Gildea, "Higher Level Computer Languages for Computer Translations", Proc. Conf. on New Processes for Braille Manufacture, 1968, American Printing House for the Blind, Louisville, Ky.
4. Robert C. Gammill, Braille Translation by Computer, U.S. Dept. of Health, Education and Welfare, (Report No. EPL 9211-1), October 1963.
5. Computer Characteristics Review, Keydata Corp., Watertown, Mass., Vol. 9, No. 1, April 1969.
6. Robert F. Rosin, (private communication), State University of New York at Buffalo.
7. United States Standard FORTRAN, American National Standards Institute, ANSI X3.9-1966.
8. United States Standard COBOL, American National Standards Institute, ANSI X3.23-1968.
9. ISO Draft Recommendation on the Programming Language ALGOL, International Organization for Standardization, Technical Committee ISO/TC 97 Subcommittee 5, Programming Languages, October 1965.
10. P. Naur, ed., "Revised Report on the Algorithmic Language ALGOL 60", Comm. ACM, Vol. 6, No. 1, January 1963, pp. 1-17.
11. John R. Siems, "Report of New Braille Translation Program at APH", Proc. Conf. on New Processes for Braille Manufacture, 1968, American Printing House for the Blind, Louisville, Ky.

DISTRIBUTION LIST

Internal

D-11
    J. J. Croke
    C. E. Duke
    J. F. Jacobs
D-12
    C. A. Zraket
D-06
    P. R. Vance
D-07
    J. H. Burrows
    A. J. Roberts
C-01
    C. W. Farr
D-73
    W. Amory
    N. A. Anschuetz
    E. H. Bensley
    J. A. Clapp
    T. L. Connors
    Dr. D. W. Fife
    R. A. J. Gildea (25)
    J. B. Glore
    O. R. Kinney
    E. L. Lafferty
    Dr. J. K. Millen (25)
    J. Mitchell
    C. M. Sheehan
    N. B. Sutherland
    Dr. D. E. Walker
    E. W. Williamson

Project

    M. Leonard, M.I.T.
    Prof. R. W. Mann, M.I.T.
    V. A. Proscia, M.I.T. (50)

External

    P. Davis, Mass. Commission for the Blind