Construction of the CD-ROM

The basic process of producing the CD-ROM is to (1) structure the files and directories on the WWW server and verify proper operation and conformity to the ISO 9660 standard, (2) record the files from the functioning WWW server with a commonly available CD-ROM writer and (3) send the resulting CD-ROM disk to IEEE where it can be tested and, accompanied by a label file that they prepare, sent out for replication. Although the procedure is straightforward, some pitfalls exist.

  1. The basic idea is to make the CD-ROM behave as nearly as possible like a WWW server when mounted locally on a machine and viewed by a local WWW browser. When accessed via the Internet, the WWW server makes any machine, file name and operating system dependencies transparent to the local browser. When the same file system is replicated on a CD-ROM and mounted locally, all of these dependencies become critical. Both Windows and Macintosh systems ignore case information in file and directory names. Thus, authors who use these operating systems can name a file or directory in all capitals and later refer to it in an HTML hyperlink path in lower case with no noticeable consequences. UNIX is case sensitive, however, so a UNIX user who tries to view an HTML document constructed in this way will find that a hyperlink with incorrect capitalization of a file name will not work. Another problem with UNIX systems has to do with the mounting of the CD-ROM at an arbitrary place in the UNIX file structure. For Windows and Macintosh systems, the CD-ROM is mounted at the root. On UNIX systems this is almost never the case.

    To avoid these problems, it is necessary (1) to adhere strictly to the ISO 9660 requirement that all file and directory names on the CD-ROM be in upper case, (2) to ensure that all occurrences of file and directory names in paths in HTML links in all files on the CD-ROM appear in upper case and (3) to ensure that all HTML hyperlinks are relative to the entry point of the document.

    When delivered from a WWW server, the server provides a default document (usually INDEX.HTM or INDEX.HTML) that is opened automatically when a directory is accessed. Since there is no server in operation when viewing the CD-ROM locally, it is necessary for the CD-ROM to contain a master index document that provides a consistent relative starting point for browsing the CD-ROM and includes links to the INDEX.HTM starting document for each paper. Since the CD-ROM is just a collection of files when mounted locally, there is nothing to prevent a user from opening any file directly with a browser and thus losing the consistent relative starting point. If this occurs, the local browser may be unable to navigate up and down the file system by following relative links and thus may display a file not found error. The problem is not an error in the construction of the CD-ROM or the document, but merely an artifact of the differences between browsing the files when a WWW browser connects to a WWW server across the Internet and browsing files locally.
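
    The three requirements above lend themselves to an automated check before recording. The following is only a minimal sketch in Python, not a tool used in producing the CD-ROM: the names LinkCollector and check_links are hypothetical, and it assumes that only href and src attributes carry file paths and that links with http, https, ftp or mailto schemes point outside the disk and can be skipped.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect the values of href and src attributes found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases attribute names but leaves values untouched.
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def check_links(html_text):
    """Return (link, problem) pairs for links that break the rules in the text."""
    collector = LinkCollector()
    collector.feed(html_text)
    problems = []
    for link in collector.links:
        parts = urlsplit(link)
        if parts.scheme in ("http", "https", "ftp", "mailto"):
            continue  # assumption: external links are not checked here
        if parts.scheme or parts.path.startswith("/"):
            # e.g. FILE:///... or a path that starts at the root
            problems.append((link, "absolute path"))
        elif parts.path != parts.path.upper():
            problems.append((link, "not upper case"))
    return problems


sample = (
    '<a href="../099/CDROM/INDEX.HTM">ok</a> '
    '<a href="paper/index.htm">bad</a> '
    '<img src="/FIGS/A.GIF">'
)
for link, problem in check_links(sample):
    print(f"{link}: {problem}")
```

    Running a check of this kind over every HTML file on the server before recording would catch lower-case and absolute links while they are still easy to correct.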

  2. ISO 9660 not only requires all file and directory names to be strictly upper case, it also requires them to be no longer than 8 characters. Permitted characters include only the 26 upper case alphabetic characters, the numerical digits 0 through 9, and the underscore character. File and directory names must not begin with a numerical digit. File names may be followed by a period and an all upper case extension of up to 3 characters. The extension typically gives some indication of the file type.

    Unfortunately, strict compliance with the upper case and 8.3 naming requirements does not eliminate all problems. UNIX and Windows systems automatically append a semicolon followed by an integer version number to all file names. In Windows (and MS-DOS) systems, the version number is always 1 and is not displayed. The drivers for reading ISO 9660 CD-ROMs on Windows and UNIX systems simply ignore the version number with no ill effect. The ISO 9660 drivers for older Macintosh systems, however, are confused by the version number, so that files referenced in HTML hyperlinks are not recognized and the links do not function on these systems even though they may work perfectly on UNIX and Windows systems.

    No easy fix exists for this long-recognized problem on the affected systems. The problem does not occur with recent versions of the Macintosh operating system. Because that correction dates from several years ago, a reasonable response is to suggest that anyone who experiences the problem update the operating system on their Macintosh to the latest version. An interesting aside is that at least some recent Macintosh ISO 9660 drivers fail to recognize file names too long to satisfy the 8.3 naming requirement, despite the fact that Macintosh systems otherwise support long file names. The drivers for Windows 95 and at least some versions of UNIX successfully accommodate files with long file names on the CD-ROM despite the violation of the ISO 9660 standard. MS-DOS systems, of course, require the 8.3 naming format.
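
    The naming rules described in this item can be expressed as a pair of regular expressions. The sketch below is illustrative only and follows the rules as stated in the text, not the full ISO 9660 specification. The function names are hypothetical, and it assumes that the extension draws on the same character set as the base name and that a trailing version suffix such as ;1 should be stripped before checking.

```python
import re

# Base name: 1 to 8 characters from A-Z, 0-9 and underscore,
# not beginning with a digit (rule as stated in the text).
BASE = r"[A-Z_][A-Z0-9_]{0,7}"
DIR_RE = re.compile(BASE)
# File names may add a period and an extension of 1 to 3 characters.
# Assumption: the extension uses the same character set as the base name.
FILE_RE = re.compile(BASE + r"(\.[A-Z0-9_]{1,3})?")


def strip_version(name):
    """Remove a trailing ';<integer>' version suffix, if one is present."""
    return re.sub(r";\d+$", "", name)


def is_valid_dir_name(name):
    """True if name satisfies the directory naming rules described in the text."""
    return DIR_RE.fullmatch(name) is not None


def is_valid_file_name(name):
    """True if name, ignoring any version suffix, satisfies the 8.3 rules."""
    return FILE_RE.fullmatch(strip_version(name)) is not None


for candidate in ["INDEX.HTM", "INDEX.HTM;1", "index.htm", "INDEX.HTML", "LONGFILENAME.HTM"]:
    print(candidate, is_valid_file_name(candidate))
```

    Of the five candidates, only INDEX.HTM and INDEX.HTM;1 pass: the others fail on case, extension length and base name length respectively.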

  3. One other cause of hyperlinks that fail when viewed on one platform but function properly when viewed on another is the use of absolute addressing in hyperlinks in HTML documents on the CD-ROM. To avoid this problem in ISO 9660 CD-ROMs intended for cross-platform viewing, all paths in references to files in HTML hyperlinks should employ relative addressing rather than absolute addressing. Suppose, for example, that the table of contents HTML document lies in a subdirectory /CONTENTS/ of the root directory and that a link in the table of contents is to refer to the entry point file, INDEX.HTM, in the subdirectory /099/CDROM/. Using absolute addressing in the table of contents document, we would refer to the file as FILE:///099/CDROM/INDEX.HTM. Using relative addressing, however, we would refer to the file as ../099/CDROM/INDEX.HTM. Relative addressing circumvents the problem that different hardware platforms refer to the root directory in different ways and hence require different path formats for absolute addressing.

    Even if relative addressing is used in all hyperlinks on the CD-ROM, a remaining difficulty stems from a limit to the accumulated length of relative addresses that can be handled by browsers. After a lengthy session of exploring a CD-ROM on which all links use relative addressing, a browser can lose track of its position. In that case, the user must manually open some file on the disk to reestablish a point of reference. Although no simple cross-platform solution is evident, the problem occurs infrequently and users who encounter it are likely to try opening some file manually, thus fixing the problem, with little aggravation or loss of time.
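
    The relative address in the table of contents example can be derived mechanically. The following minimal sketch uses Python's standard posixpath module. The helper name relative_link and the file name TOC.HTM are hypothetical (the text does not name the table of contents file), and paths are written as they would appear from the root of the CD-ROM.

```python
import posixpath


def relative_link(from_document, to_file):
    """Return the relative path from the directory holding from_document to to_file."""
    start_dir = posixpath.dirname(from_document)
    return posixpath.relpath(to_file, start=start_dir)


# TOC.HTM is a hypothetical name for the table of contents document.
print(relative_link("/CONTENTS/TOC.HTM", "/099/CDROM/INDEX.HTM"))
# prints ../099/CDROM/INDEX.HTM
```

    The posixpath module is used rather than os.path so that the result contains forward slashes, as HTML links require, regardless of the platform on which the script runs.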

  4. A final pitfall is the possibility of including a virus on the CD-ROM. The most destructive viruses are usually attached to executable files. Because such files might be submitted by contributors for archiving and distributing on the CD-ROM, virus checks are essential.

    It is extremely important to notice that separate virus checks are required for each platform. A Windows virus checker will not identify a virus attached to a Macintosh executable file, for example. If executable files for other platforms are present on the CD-ROM, it is essential to run virus checks with software for all three platforms, not just the one on which the CD-ROM is prepared.

    In recent years, so-called macro viruses have made it possible to infect files other than executable files, such as Microsoft Word files. These viruses are usually more of a nuisance than a hazard and current versions of virus checkers will identify them, so they are not cause for undue alarm.

    Experience with a macro virus on the CD-ROM for the August 1996 issue of the IEEE Transactions on Education illustrates a more fundamental problem, however, for producing and distributing CD-ROMs with archival journals. Although the editors checked the files for viruses before they were recorded on the CD-ROM, the virus checker did not identify the macro virus. After the CD-ROM had been produced and distributed, a later version of the virus checker clearly identified the presence of a macro virus in a single file that, luckily, was incidental to the content of the CD-ROM and hence would not ordinarily be viewed by a user. Even if someone sought out the file and, for some reason, opened it and contracted the virus, the consequences were mild enough that a recall or extensive notification seemed unnecessary.

    This experience points out the inevitable virus risks inherent in publishing CD-ROMs. As long as there is a delay between the appearance of a virus and its inclusion in the list of viruses detected by the best virus checkers, the risk of publishing viruses, including dangerously destructive ones, cannot be reduced to zero even by zealous care and concern.

  5. Additional information about the cross-platform standard ISO 9660 is available at:
    http://www.isomedia.com/homes/isomedia/cd/iso9660_spec.html

  6. Information about the life expectancy of data stored on CD-ROMs is available at:
    http://stargate.jpl.nasa.gov:1087/cd.html

 
