[Turkmath:7188] An article on electronic mathematics journals

Mon, 13 Sep 2010 11:35:08 EEST

An article that may interest you.

Regards
Mustafa Akgul

  The Demands on Electronic Journals in the Mathematical Sciences

Mark Steinberger

Electronic journals offer new capabilities for communicating scholarly
information to the research community. This article explores ways to
optimize that communication for journals in mathematics and related
sciences.

The most obvious benefit of electronic distribution of scholarly work is
convenience. Papers are available instantly from home or office, 24
hours a day. But that is only the tip of the iceberg. Many other
benefits come from the use of links: Internal cross-reference links make
electronic formats easier to read than paper copy, and external links
permit readers to travel through the primary and secondary literature
without leaving their desks. Electronic media also offer more indirect
methods of passing between resources: searchable databases of abstracts,
full texts of papers, and on-line reviews. Also, papers may be linked to
on-line archives of programs and supporting materials, which, since they
are not journal articles in themselves, may be updated. Links may also
be made to reviews and related material published later.

Here, I explore these issues in the context of the mathematical
sciences. One of the central issues in electronic publishing in the
mathematical sciences is the need for complex notation, requiring fonts
and typesetting commands not available in the standard Web browsers. Much of
this paper will also apply to more general fields, e.g., foreign
languages with non-Roman alphabets, or sciences that make use of
mathematical or chemical notation in the body of papers. I will also
explore the specific electronic infrastructure available in mathematics.

Economic issues are important in this analysis. The costs of academic
publishing have increased dramatically in recent years, especially in
the sciences. Commercial publishers charge much more than not-for-profit
publishers. Electronic publication cuts some of the publishing costs and
requires less start-up money than paper publication (assuming that the
publisher already has a computer and the technology infrastructure, as
most scholars do today), enabling new publishers to enter the game. Cost
containment now becomes an important issue. Without it, these new
players will be driven out of the business.

    Formats of Articles

The standard paradigm for electronic publication comes from HyperText
Markup Language, HTML. File sizes are small, and connectivity is high:
Hypertext links are freely used, both for internal cross-references and
to external resources.

HTML remains a common format for electronic journal articles, but it has
certain drawbacks. First, it is difficult to control the appearance of
the articles on the reader's screen. While cascading style sheets
provide a certain degree of control, the articles still look different
in different browsers or on different platforms, even when the reader
has all the specified fonts. And the fonts available on different
systems vary considerably.

Symbols not available in standard fonts are generally rendered as
in-line gifs, and illustrations and other graphics are generally
rendered as in-line inclusions of external files, as well. Style sheets
are also external to the HTML file, and are not saved by the browser
when it saves the source file. Thus, many HTML documents are not
portable (that is, they depend on being on the Web to present properly;
they cannot be downloaded easily to an offline computer).

And printed documents are still important. But the print quality for
HTML documents produced by most Web browsers tends to be low, and page
numbers are not standardized. One loses a familiar and important method
of identifying segments of papers, and of identifying the location of a
particular article within a journal.

Also, while links are freely available, creating them is labor
intensive. There is no macro language in HTML that can be used to
generate automatic placement of links. Similarly, apart from the
elements that can be specified in style sheets, there is no automation
available in the markup of HTML files. Table writing, for instance, can
be tortuous. A language other than HTML is therefore preferable for
authoring papers, and conversion programs from other languages to HTML
are needed.

In reaction to some of these issues, a number of journals have chosen to
use Adobe's Portable Document Format, most popularly used in Adobe
Acrobat, for their papers. PDF permits internal and external links,
giving it the same functionality on the Web as HTML. It also has fixed
typesetting and page numbers, permitting the screen and printed copy to
look the same. But placing links in PDF documents can also be labor
intensive, unless they can be placed automatically in the Postscript
file used to create the PDF. Again, conversion software is important.

On the other hand, fonts and graphics may be incorporated directly into
the PDF file, making the result portable for use on off-line computers.
The Acrobat Reader (a freely available software package for reading PDF
files) also prints excellent copy on virtually any printer, preserving
the fonts, graphics, layout, pagination, etc., specified by the
publisher. That guarantees print-journal quality output from an online
document.

Nevertheless, some readers prefer HTML format for papers. For instance,
the HTML files are smaller and transfer more quickly. And in journals
that offer both paper and electronic formats, the PDF version is often
formatted identically to the paper version. If the paper version has
large pages, small print, and multiple columns (as is the case for some
science journals), an identically formatted PDF version is likely to be
difficult to read on-screen, and even slower to load. Also, some readers
like to be able to override publisher-specified formatting and alter the
appearance to suit their preferences. So some journals offer both HTML
and PDF versions of their papers. Indeed, offering papers in multiple
formats is likely to become more common in electronic journals.

    Formats for Papers in Mathematics

The mathematical sciences have special typesetting needs, so much so
that the famous computer scientist Donald Knuth developed a typesetting
system from the ground up, in part to be able to typeset his own work.
Known as TeX (pronounced "tech"), Knuth's system has become the standard
tool for communicating mathematics. Virtually every mathematical
scientist has access to a TeX system, and uses it for typesetting
papers. (The output contains all information needed for printing the
paper. Mathematical journal and book printings, for instance, are
generally made from photoplates made from printouts of TeX files.)

Since mathematical scientists write in TeX, most mathematics journals
(both print and electronic) either prefer or require that articles be
submitted in TeX, which is then edited and reprocessed by the journal's
production department.

The input language for TeX is ASCII, with the text interspersed with
typesetting commands. It is a mark-up language with macros:
user-definable commands that can be used to automate processes like
repetitive editing functions.

Indeed, macros are used to set all the basic style parameters for an
article: page layout, formatting of headings of sectional units,
formatting of theorems, equations, and all other elements in the paper.
Much of this is done in a style file written by a typesetting designer
and loaded when the paper is processed. The author need only include
statements like

\section{Introduction}

to begin a new section. The style file then sets the headers, passes
information to the table of contents, etc.

Similarly, theorems may be specified by statements of the form

\begin{theorem} Text of theorem. \end{theorem}

which can be set to produce an automatically numbered display like

Theorem 5. Text of theorem.

Similar formatting instructions, including automatic numbering, may be
done for equations, figures, and other standard sorts of environments.
Commands are differentiated from text by the leading backslash, and may
be defined (or redefined) at various levels, including in the style
file(s) and in the input file for the document itself.
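
As a minimal sketch of the mechanism just described, the following LaTeX input declares automatically numbered environments and then uses them. It assumes the amsthm package; the lemma environment is added purely for illustration, and in practice the \newtheorem declarations would sit in the journal's style file rather than in the paper itself.

```latex
\documentclass{article}
\usepackage{amsthm} % assumption: amsthm supplies the theorem machinery

% Style-level declarations (normally kept in the style file).
\newtheorem{theorem}{Theorem}      % prints as "Theorem n." with automatic n
\newtheorem{lemma}[theorem]{Lemma} % illustrative: shares the theorem counter

\begin{document}
\section{Introduction}

\begin{theorem} Text of theorem. \end{theorem}

\begin{lemma} Text of lemma. \end{lemma} % numbered in sequence with theorems
\end{document}
```

Redefining the two \newtheorem lines changes the numbering and appearance everywhere at once, without touching the body of the paper.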

The automatic numbering is extremely useful for electronic publication,
as the counters used for numbering theorems, equations, sections, etc.,
may be used to form names for link anchors that can be set automatically
by commands embedded in the basic style macros. Anchors may also be set
by a \label command that permits the author to create a mnemonic name
for the theorem or equation. If the author inserts \label{maintheorem}
inside a theorem environment, he or she may then make references to
Theorem \ref{maintheorem} later in the work. The "\ref{maintheorem}"
will be replaced by the number of the actual theorem when the paper is
processed, and may also be set to produce a link to an anchor placed by
the \label command. Macros may also be used to create a link from the
output set by an expression like "Theorem \thmref{5}" to the anchor
automatically set in the statement of Theorem 5.
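
The linking scheme described above might be sketched as follows. This is an illustration under stated assumptions, not the macros of any particular journal: it assumes the hyperref package for \hypertarget and \hyperlink, the helper \thmanchor and the anchor names of the form thm.N are hypothetical, and the anchor is placed by hand here, whereas a style file would place it inside the theorem macro itself.

```latex
\documentclass{article}
\usepackage{amsthm}
\usepackage{hyperref} % assumption: provides \hypertarget and \hyperlink

\newtheorem{theorem}{Theorem}

% Hypothetical helper: set an anchor named thm.<current theorem number>.
\newcommand{\thmanchor}{\hypertarget{thm.\thetheorem}{}}
% Hypothetical \thmref: link the printed number to the matching anchor.
\newcommand{\thmref}[1]{\hyperlink{thm.#1}{#1}}

\begin{document}
\begin{theorem}\thmanchor\label{maintheorem}
Text of theorem.
\end{theorem}

% Reference by mnemonic name; the number is filled in when TeX runs.
By Theorem~\ref{maintheorem}, the claim follows.

% Reference by number, resolved through the counter-based anchor name.
By Theorem~\thmref{1}, the claim follows.
\end{document}
```

Both references become clickable links once the document is converted to PDF; in a plain DVI viewer they simply print as numbers.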

These anchors and links are not visible to every existing software
package for reading TeX documents, but may be converted to anchors and
links in a PDF file that can be prepared from the TeX document.

Notice that this macro capability allows the author to move virtually
all the typesetting decisions to the style file, which HTML cannot
currently do.

The ASCII input file I've been describing is converted by TeX's software
system into a file format called DVI (for "device independent"). The DVI
file contains all the typesetting and font information, but not the font
files themselves. Large software installations (with many font files)
are needed to display or print DVI files, or to convert them to Postscript.

Graphics information is not contained in the DVI file itself, and must
be bundled with the DVI file to be accessed by the software for
displaying, printing, or converting the file. Moreover, the commands
used for accessing the graphics are not native to TeX, but are part of a
system of extending TeX's capabilities via commands called "\specials".
The \special commands are passed through to the DVI file to be read by
whatever software is used to view, print, or convert the DVI to another
format. And different software packages recognize different formats of
\special commands (the format being contained in the argument to
"\special").

Because of this difference in syntax, it is difficult to distribute TeX
with graphics over the Web in a way that will be accessible to viewers
on different platforms. Also, the graphics files themselves, usually in
Postscript, are not necessarily viewable with any of the standard
software packages on some platforms. The software and hardware needed to
read or print TeX is both specialized and large. Indeed, many
mathematicians don't bother to keep it locally (instead using network
resources at their offices), which further limits accessibility of TeX
files on the Internet. Nevertheless, many of the earliest electronic
mathematics journals used TeX source files and DVI files as the formats
of their papers.
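
To make the syntax problem concrete, here are two driver-specific ways of requesting the inclusion of a figure. This is only a sketch: the file names are placeholders, and the two forms shown (one in the style read by the dvips driver, one in the style read by the emTeX drivers) are examples of the divergence rather than a complete survey.

```latex
% The same request, written for two different DVI drivers.
% File names are placeholders chosen for illustration.

% Form read by the dvips driver (Postscript figure):
\special{psfile=figure1.ps}

% Form read by the emTeX drivers (bitmap figure):
\special{em:graph figure1.pcx}
```

A DVI file containing the first line shows no figure in a viewer that understands only the second form, and vice versa, which is why a single DVI file rarely serves readers on every platform.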

As an alternative, the most common choice was Postscript, which had
other drawbacks. Postscript files are large, and while it is possible to
compress embedded graphics files, there is no uniform system of
decompression available. So documents that include large graphics files
become unwieldy. Also, Postscript readers and printer drivers are far
from universal, and are in fact quite rare in the PC world --- and many
mathematicians have trouble installing them.

Moreover, hypertext links are not native to either TeX or Postscript.
Systems have been developed to incorporate them (using \specials), but
there are at least two incompatible methods, and most TeX and Postscript
viewers can't process either of them. Thus, publishing in TeX or
Postscript is far from optimal.

But as I mentioned above, by using TeX as input it is possible to
harness the macro capabilities of the TeX input language to produce PDF
files with numerous automatically generated hypertext links, to both
internal and external resources. The internal links are especially
important in mathematical papers, where various theorems get quoted in
the proofs of subsequent results. Thus, when Theorem 2 is invoked in the
proof of Theorem 3, the reader can jump to the statement of Theorem 2
with one mouse click, and jump back with another.

As far as the appearance of the viewed or printed image is concerned
(i.e., in terms of fonts, layout, etc.), the resulting PDF file is
identical to the TeX DVI, as the typesetting is all done by TeX.

The graphical compression available for PDF is also extremely useful in
mathematics journals. Many authors use hand-drawn illustrations, which
must then be scanned to be included in an electronic format, and
high-resolution scans are often needed to get an appealing result.
Indeed, one paper with 12 megabytes of scans
<http://nyjm.albany.edu:8000/PacJ/1997/177-1-8.html> was reduced to a
PDF file of less than 500 kilobytes for the Pacific Journal of
Mathematics <http://nyjm.albany.edu:8000/PacJ/>.

Thus, it would be tempting to use PDF files as the sole format for
electronic journals in the mathematical sciences. But there is a great
deal of inertia among the readership. My own experience in this regard
comes from the New York Journal of Mathematics,
<http://nyjm.albany.edu:8000/nyjm.html> which was started in 1994, using
only DVI and Postscript formats. We introduced the PDF format in 1996,
but it didn't catch on right away. By the summer of 1997, the PDF format
was roughly equal in popularity to DVI and Postscript, despite the fact
the latter two lack links. Currently, PDF is a strong favorite, but the
other formats are still being accessed.

    MathML

While standard HTML cannot handle mathematical notation, an extension
within the XML framework is being developed that does. The idea is to
create new tags that can be embedded in (enriched) HTML documents, to
encode mathematical information. Known as MathML, it has tags for
expressing highly complex mathematical syntax. The language itself is so
complex that it is impractical to write directly in MathML, and
conversion utilities from other input languages are needed. To be cost
effective in the current environment, a good conversion tool from TeX is
needed.

Conversion from TeX, however, is somewhat problematic, as the syntactic
content of MathML is richer than that expressed by even very highly
structured versions of TeX, such as LaTeX. This is not too surprising,
as newer systems are often more ambitious than older ones. The aim is to
encode enough information in the syntax to be able to pass the
mathematical information along to other systems (e.g., computer algebra
systems) in a form that would actually permit calculations to be done.
In TeX, things like integrals are treated as typesetting phenomena,
rather than mathematical syntax, and the information is not readily
transferable to other systems. The ultimate ambitions of MathML in that
regard have not yet been settled (not surprising, given that MathML is
being developed by committee). It is an evolving system.

Given those considerations, it seems that new authoring systems will be
needed for MathML to be useful. Given the reluctance of most researchers
to learn new systems, these developments will likely take years, and
much will depend on the actions of the mathematical societies and other
influential groups. The acceptance of TeX was heavily influenced by
support from the American Mathematical Society.

There are also issues concerning displaying and printing of MathML
documents. MathML cannot be rendered by the standard browsers, so
plug-ins and/or helper programs are needed.

Current and soon-to-be-implemented methods of creating and rendering
MathML include the following, none of which can do everything needed:

    * IBM's Techexplorer
      (http://www.software.ibm.com/enetwork/techexplorer/) is a plug-in
      for standard Web browsers. It can render a subset of the current
      version of MathML, and is available on Windows platforms. IBM
      promises a Unix version, too. Users need to purchase the
      professional version of Techexplorer for printing.

    * WebEQ (http://www.Webeq.com/) is a Java-based suite of products,
      including an editor that can produce MathML code.

    * Amaya (http://www.w3.org/Amaya/) is a Windows- and Unix-based
      stand-alone browser that doesn't interface with standard Web
      browsers. It renders MathML, and has an editing mode for writing
      MathML expressions.

    * MathType (http://www.mathtype.com/) is an equation editor that
      runs on Windows and Macintosh machines, and interfaces with the
      standard PC-based word processors (but not with TeX). It can embed
      MathML expressions in HTML documents.

    * A subset of MathML can be generated by EZmath
      (http://www.w3.org/People/Raggett/EzMath/), a Windows-based editor.
      It uses its own input language, also developed for use on the Web.

    * The computer algebra systems Maple and Mathematica are developing
      support both for generating and rendering MathML expressions.

It will be very interesting to watch these developments and see which,
if any, catch on with the authors and readers of mathematical documents.
MathML may well offer a useful alternative to PDF in the future. It is
quite likely that many journals will eventually publish in both formats.

    Greater Connectivity

The issue here is methods of finding mathematical research papers by
searching for subject or keyword information in central resources, or by
following a sequence of links emanating from such resources.

In mathematics, the most powerful central resources are the reviewing
journals Mathematical Reviews (available online as MathSciNet,
http://www.ams.org/mathscinet/) and Zentralblatt fuer Mathematik
(http://www.zblmath.fiz-karlsruhe.de/MATH/subscription/subscription).
They publish reviews of the papers published in mathematical journals.

MathSciNet has an especially powerful system of internal links between
reviews, and provides direct links to the papers themselves, if they are
available on line. The reviews are classified by subject and are
searchable in multiple ways. Thus, it provides an unparalleled method of
browsing the literature for serious content (along with accurate
bibliographic data). To take advantage of MathSciNet, the New York
Journal of Mathematics has begun including links to MathSciNet reviews
from each of the bibliographic entries in its papers. If that practice
becomes widespread in electronic journals, and if more journals go on
line, then readers will be able to tour the primary and secondary
literature in their fields without leaving their desks.

A similar, but less systematic source of connectivity comes from the
"living review articles" being pioneered by the Electronic Journal of
Combinatorics, at http://www.combinatorics.org/Surveys/. Those articles
give periodically updated reviews of the literature in particular
subject areas.

Another method of finding papers is through robot-compiled indices. But
one of the drawbacks there is that very few journals distribute the TeX
input files for their articles, and most robots are unable to parse
information from PDF, DVI, or Postscript files. Thus, the only
information available, in most cases, is the material available in HTML
files at the journal site, such as abstracts. The resulting indices are
less powerful than those used by the reviewing journals, but they do
cover articles that have not yet been reviewed.

Some journals, such as the New York Journal of Mathematics and the
Pacific Journal of Mathematics, provide indices of the full text of
their TeX input files (see the NYJM index at
http://nyjm.albany.edu:8000/search/j/ghindex.html) and redirect the
output for a query to the versions available online.

Much work remains to be done on linking such databases at disparate sites.

Another potential source of connectivity comes from preprint and e-print
archives. The xxx Mathematics e-Print Archive
(http://front.math.ucdavis.edu/) at Los Alamos National Labs [1]
is a comprehensive archive of e-prints in mathematics. The author,
title, and abstract information is searchable, and the citations are
given for papers that have subsequently been published, but there are no
links, as yet, to the published versions of the papers.

That archive has a strong connection to the research community, as it
maintains e-mail lists in each of its subject areas, notifying users of
any new papers in areas they choose. [2]
The archive also keeps the full TeX source of its papers, and may at
some point implement full-text indexing. (There are other servers in
some specialty areas that offer subsets of those services, but the xxx
archive is the only one I know that attempts to cover all of mathematics.
Given the interrelationships between fields, it is useful to be able to
search a database that covers the whole of mathematics, rather than a
particular subfield.)

The xxx archive is interesting in that it is in some sense in
competition with journals. The e-prints are freely available in
perpetuity (superseded versions are available on request). Journals are
encouraged to make use of the archive by contributing the final versions
of papers and simply linking to xxx from the journal's own Web site to
provide access to the papers themselves. Such journals are called
"overlay" journals: the journal acts as an overlay to the xxx archive.

But at present, it is unlikely that overlay journals will be able to
recoup the costs of copy editing and typesetting unless they mandate
page charges. Even for journals that don't recoup their costs, the use
of overlay technology would result in a one-size-fits-all look and feel.
Many journals will be motivated to maintain their own archives and
production systems, instead.

Without links to the published version, effective cooperation between
xxx and non-overlay journals is difficult. Thus, in the current
environment, we have a cleavage between xxx, with its full-text database
and broad e-mail notification, and the general run of electronic
mathematics journals, which are served primarily by the reviewing journals.

    Additional Features

An increasing number of mathematical papers include arguments settled by
running a computer program. The electronic environment permits efficient
distribution of such programs, in a form that may be used by the reader
to verify the results. See, e.g., the programs available through the
following "associated links" files for journal articles:

    * http://nyjm.albany.edu:8000/j/1997/3-4-info.html
    * http://nyjm.albany.edu:8000/j/1998/4-1-info.html
    * http://nyjm.albany.edu:8000/j/1996/2-4-info.html

Such programs may be updated to reflect version changes in other programs.

Journals may also maintain links to reviews and other commentary, and
may archive errata files, author's elucidations, and pointers to
subsequent applications of a particular paper.

In particular, the electronic environment permits updating the
connections between published papers (which remain fixed in form) and
new work.

------------------------------------------------------------------------

Mark Steinberger <http://nyjm.albany.edu:8000/%7Emark/> is
Editor-in-Chief of the New York Journal of Mathematics
<http://nyjm.albany.edu:8000/nyjm.html>, a refereed electronic
mathematics journal founded in 1994. His mathematical research is in
algebraic and geometric topology, with a special interest in symmetry.
(See http://nyjm.albany.edu:8000/~mark/rsch.html for details.) Receiving
his Ph.D. from the University of Chicago in 1977, he has taught at
M.I.T. and Cornell, and has been an Associate Professor of Mathematics
and Statistics at the University at Albany since 1987. His other
activities in electronic publishing include membership in the steering
committee of the xxx Mathematics e-Print Archive
<http://front.math.ucdavis.edu/> at Los Alamos National Labs, and
administration of the EMJ mailing list for discussion of electronic
publishing in mathematics. You may contact him by e-mail at
mark at csc.albany.edu.

    Notes

1. The xxx e-Print archive <http://xxx.lanl.gov/> at Los Alamos National
Labs has been phenomenally successful in physics, primarily as a forum
for circulating preprints: most of the papers appear in traditional
journals later on.

Later, they established an archive in nonlinear sciences, and more
recently, archives in mathematics and computer science. The URL
http://front.math.ucdavis.edu/ is to a front end set up by the steering
committee of the mathematics archive.

2. A number of electronic journals also maintain such mailing lists.