MathWorld (mathworld.wolfram.com) is the internet’s most popular mathematical resource. Its extensive and detailed entries usually find their way into the top three (if not top one) results returned by a Google search for mathematics terms. Mathematica has long been instrumental in the authoring of MathWorld, which has been hosted by Wolfram Research since 1999. As a result of new work partially funded by a grant from the National Science Digital Library program of the National Science Foundation, MathWorld’s contents are now written entirely in notebooks that are converted to web pages by Mathematica itself. In this article, the processes that bring MathWorld from keyboard to the web will be discussed, focusing especially on the Mathematica-based tools that make this transformation possible. In addition, a number of useful, new interactive features added to MathWorld as a part of the digital library work will be discussed.
What Is MathWorld?
MathWorld, formerly known as Eric Weisstein’s World of Mathematics, is an online encyclopedia of mathematics provided by Wolfram Research as a free resource to the world mathematics community. It is not only an accessible and up-to-date collection of mathematical knowledge, but also serves as an extensive database of literature references and links to mathematics on the web. In addition to its thousands of entries, MathWorld is amply illustrated with thousands of figures and diagrams, interactive applets for visualizing three-dimensional geometric objects, and a collection of embedded webMathematica-based demonstrations. The main page of the site contains links to MathWorld Headline News, which provides in-depth coverage of breaking mathematics news. Finally, an online classroom section has recently been added, providing mathematical entry points that are accessible to students and educators.
According to reliable sources (in particular the all-knowing and all-powerful PageRank algorithm on Google), MathWorld is the most popular math website on the internet. This is also reflected by the fact that doing a Google search for arbitrary math terms generally returns MathWorld pages near the top of the list, most commonly in the number one spot. Feel free to try this experiment yourself; suggested starting points are “eigenvalue,” “square matrix,” “Khinchin’s constant,” “cylindrical algebraic decomposition,” or “Fourier transform.”
MathWorld receives hundreds of thousands of page hits each day. Analysis of the site’s readership shows that it originates from a broad and diverse set of sources, with technical companies and universities representing the largest share. As a result of its large audience, MathWorld accounts for the majority of web traffic at Wolfram Research. In addition, more Mathematica notebooks are downloaded from MathWorld than from any other website.
While many readers may realize that MathWorld is a vast collection of mathematical material, they may not fully appreciate its true extent. As of September 2006, MathWorld is equivalent to a 4,251-page book (in 9-point type and on 8.5×11″ paper) containing more than 12,500 entries, 10,000 graphics, and 100,000 cross-links. Unlike a book, however, it also contains a variety of interactive components. These include more than 400 entries with interactive Java applets for solid geometry (topics/LiveGraphics3DApplets.html), 100 entries with animated GIFs (topics/AnimatedGIFs.html), and, more recently, nearly 100 entries with interactive webMathematica examples (topics/webMathematicaExamples.html).
In addition to interactivity, MathWorld contains a large number of computational resources that may be helpful to Mathematica users. There are nearly 4,000 downloadable Mathematica notebooks on the site, each of which contains code implementing algorithms, illustrations, and computations related to a given entry. These sample notebooks can be downloaded by clicking the “Download Mathematica Notebook” link at the top of a relevant page, as illustrated in Figure 1 for Reversal.html. In this case, the notebook contains some simple code for computing the reversals of numbers (i.e., numbers formed by reversing the decimal digits of a given number and concatenating), defining the functions Reversal and PalindromicQ.
Figure 1. Navigation elements for Reversal entry.
This code can be used, for example, to find the nontrivial (i.e., nonpalindromic) numbers below a given bound whose reversals are integer multiples of themselves.
(These are the first few terms of sequence A008918 in “The On-line Encyclopedia of Integer Sequences” by Neil Sloane: oeis.org/A008918.)
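The search described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code from the downloadable notebook (which is written in Mathematica); the function names `reversal`, `is_palindromic`, and `reversal_multiples` are chosen here for readability and are assumptions.

```python
def reversal(n):
    """Return the integer formed by reversing the decimal digits of n."""
    return int(str(n)[::-1])


def is_palindromic(n):
    """True when n reads the same forwards and backwards in base 10."""
    digits = str(n)
    return digits == digits[::-1]


def reversal_multiples(limit):
    """Nonpalindromic n with 1 <= n < limit whose reversal is a multiple of n."""
    return [
        n
        for n in range(1, limit)
        if not is_palindromic(n) and reversal(n) % n == 0
    ]


if __name__ == "__main__":
    print(reversal_multiples(10**5))
```

Palindromes are excluded because every palindrome is trivially equal to its own reversal. A brute-force scan like this is adequate for small bounds; larger bounds would call for a smarter search.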
In addition to downloadable notebooks, more than 50 Mathematica packages (available from the Mathematica Library Archive at library.wolfram.com/infocenter/MathSource) provide functionality used to create MathWorld content.
Authoring MathWorld Before Mathematica
Prior to July 2005, MathWorld was authored in LaTeX, a language that is widely used for typesetting mathematics but is ill-suited for translation into modern markup languages such as HTML and XML. As a result, it is generally difficult to convert LaTeX documents into a format suitable for publication on the web [1]. The MathWorld build system accomplished the LaTeX-to-HTML conversion using a complicated multistep translation process that utilized: (1) a customized version of LaTeX2HTML; (2) external image-rasterization libraries; (3) Perl database table construction and element extraction; and (4) multiple passes of post-processing on the resulting files. This meant that the build system relied on a large number of moving parts, several software components, and multiple steps. As a result, it was relatively fragile and hence difficult to maintain, let alone extend. In addition, the equations produced by the process were limited to static GIF images (essentially, pictures of the original equations displayed in pixel form).
Pictures of equations have many severe limitations. They cannot be enlarged, rescaled, linebroken, searched, crawled, or fed into speech synthesis systems to provide accessibility to the visually impaired. In addition, because images contain no semantic information, equations so represented cannot be extracted and plugged into a software system such as Mathematica in order to perform computations. A possible solution to these limitations would be to use an XML-based representation for encoding equations that is capable of being directly rendered in browsers. Happily, such a representation exists for mathematics and is known as MathML. MathML has a W3C-recommended standard (www.w3.org/Math), native rendering support in Mozilla and in other browsers via several commercial plug-in products, and built-in export and import capabilities by mathematical software systems such as Mathematica. Unfortunately, widespread adoption of MathML has thus far been hampered by a number of problems that make it difficult to use as a display technology within an arbitrary browser running on an arbitrary operating system.
A further drawback of LaTeX-based authoring was that, while results were derived, verified, and visualized using Mathematica, all derived equations required manual (or semi-manual) transcription into LaTeX because textual content, graphics, and executable Mathematica code were all maintained separately.
Authoring MathWorld with Mathematica
Between 2002 and 2005, the MathWorld build system was redesigned and rebuilt from the ground up. This work was undertaken with partial support from a National Science Foundation Digital Library grant to add new interactive components to the site that would be of value to the hundreds of thousands of students, teachers, and researchers who routinely visit MathWorld. In order to accomplish this goal, we created a modern and robust system for building MathWorld that both allows simple authoring of mathematical content and permits easy inclusion of interactive components. In particular, the new system was designed to support interactive calculators and plotters, a mapping of MathWorld’s existing subject classification system onto the well-established Mathematics Subject Classification (MSC) scheme used by mathematicians, a rich set of metadata on each MathWorld page to allow its contents to be described and classified, a new didactic layer for navigation of its content, and the generation of pages in multiple formats, especially MathML.
The powerful capabilities of Mathematica provided an ideal system for achieving these goals. In particular, Mathematica has extensive built-in knowledge of HTML, XHTML, and MathML, as well as the ability to perform complicated sets of pattern-based transformations on structured documents. In addition to its utility as a system for building websites, Mathematica is also a powerful authoring environment and symbolic/numerical computation engine with extensive import/export capabilities. It offers WYSIWYG editing for easier authoring and greater efficiency, gives access to Mathematica notebook/palette programming as tools for authoring, and provides its own markup language standard (the notebook) to allow content to be easily authored and incorporated. It also automatically handles other issues that can be difficult for traditional authoring environments, for example, line breaking and equation numbering. And because of Mathematica’s ability to take its own structured document format and convert it to almost any desired format, the same core build system can be used in a modular way to generate content in multiple formats, in this case (X)HTML and MathML.
Getting MathWorld into Mathematica
Before creating a Mathematica-based build system, it was first necessary to undertake a one-time conversion of the MathWorld source documents from LaTeX into Mathematica notebooks. Such conversions are challenging, especially since TeX and LaTeX are presentation- rather than semantics-based and hence accurately describe only the intended positions of blobs of ink on the printed (or electronic) page rather than the actual meaning of those symbols. Because of TeX’s ubiquity as a typesetting language for mathematics, a fair amount of effort has been devoted to developing translation programs from TeX to other formats. In particular, a number of systems have been built that purport to translate TeX into MathML and Mathematica. In principle, translation from one box-based language to another is a straightforward process. Unfortunately, the practice is more often than not fraught with difficulties as soon as nontrivial cases are encountered, which inevitably happens almost immediately. In fact, after investigating a fairly exhaustive list of existing translators, we determined that none were flexible or reliable enough to handle the extensive set of source documents comprising MathWorld. It was therefore necessary to implement our own TeX-to-notebook conversion.
Fortunately, an internal software tool known as tex2nb had previously been developed at Wolfram Research that was capable of translating a subset of TeX into notebooks. While this tool had been used internally for some time, it was not under active development and required a large number of extensions, modifications, and changes to be capable of (1) parsing the entire contents of MathWorld’s source documents and (2) producing clean notebooks that were amenable to subsequent programmatic parsing. In particular, the tex2nb conversion had to be done carefully in order to preserve typesetting structures, to tag information that could not be used directly, and to provide highly structured and uniform notebooks. Happily, we were able to eventually overcome these obstacles. It thus became possible to take the entire set of source documents (equivalent to a printed book more than 4,000 pages long) and produce clean, machine-readable notebooks in an entirely automated fashion.
Mathematica users may be interested to learn that the inner workings of this tool have subsequently been integrated into Mathematica itself. In fact, Mathematica 5.1 and later do a quite passable job of importing vanilla TeX (as well as a fair bit of vanilla LaTeX) directly. To illustrate this, consider the following example, which defines one of the seven so-called “mock theta functions” of order 3. (For more details on these functions, see the MathWorld entry MockThetaFunction.html.)
Define one of the mock theta functions of order 3.
This expression can be converted into a TeX snippet using the command TeXForm.
Show this expression in TeXForm.
Converting this to a string and surrounding it with the signature “double dollar signs” that delimit a displayed equation in TeX, then importing the result into Mathematica by placing it in a notebook, shows that Mathematica does a nice job of directly importing the snippet.
Import the resulting TeX into a notebook (Figure 2).
Figure 2. TeX expression for the mock theta function imported directly into a Mathematica notebook.
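For concreteness, here is what the TeX side of such a round trip might look like. The choice of function is an assumption made purely for illustration: the text does not say which of the seven third-order mock theta functions was used, so Ramanujan's f(q) is shown, and the source below is hand-written rather than actual TeXForm output.

```latex
% Ramanujan's third-order mock theta function f(q), written as a
% displayed equation delimited by double dollar signs (plain TeX style).
% Illustrative only: hand-written, not generated by TeXForm.
$$ f(q) = \sum_{n=0}^{\infty} \frac{q^{n^2}}{\prod_{k=1}^{n} (1+q^k)^2} $$
```

A string of this shape, dollar signs included, is the kind of input that the import step described above would receive.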
After running through the translation process from to notebook, here is the source document for a “typical” entry, in this case Abundance.html. As can be seen, this notebook incorporates metadata tagging, textual content, typeset mathematics, tabular material, and references, all together in one conveniently annotated notebook.
Mathematics:Number Theory:Special Numbers:Digit-Related Numbers
Mathematics:Foundations of Mathematics:Mathematical Problems:Unsolved Problems
The abundance of a number n is the quantity A(n) = σ(n) − 2n, where σ(n) is the divisor function. Values of A(n) for n = 1, 2, … are −1, −1, −2, −1, −4, 0, −6, −1, −5, −2, −10, 4, −12, −4, −6, −1, … (Sloane’s A033880).
The following table lists special classifications given to a number n based on the value of A(n).
|A(n)|Number|Sloane|List|
|< 0|deficient number|A005100|1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, …|
|−1|almost perfect number|A000079|1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, …|
|0|perfect number|A000396|6, 28, 496, 8128, …|
|1|quasiperfect number||none known|
|> 0|abundant number|A005101|12, 18, 20, 24, 30, 36, 40, 42, 48, 54, 56, 60, …|
Values of n such that A(n) is odd are given by n = 1, 2, 4, 8, 9, 16, 18, 25, 32, … (Sloane’s A028982; i.e., the union of nonzero squares and twice the squares). Values of n such that A(n) is square are given by n = 6, 12, 28, 70, 88, 108, 168, … (Sloane’s A109510).
Kravitz has conjectured that no numbers exist whose abundance is a (positive) odd square (Guy 2004).
Guy, R. K. Unsolved Problems in Number Theory, 3rd ed. New York: Springer-Verlag, 2004.
Sloane, N. J. A. Sequences A000079/M1129, A000396/M4186, A005100/M0514, A005101/M4825, A028982, A033880, and A109510 in “The On-Line Encyclopedia of Integer Sequences.”
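The classifications in the sample entry above are easy to check numerically. The following Python sketch assumes the standard definition of abundance, A(n) = σ(n) − 2n with σ(n) the sum of the divisors of n; the function names are chosen for illustration and are not taken from the MathWorld notebook.

```python
def sigma(n):
    """Sum of all positive divisors of n (the divisor function)."""
    total = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:  # avoid counting a square root twice
                total += n // d
        d += 1
    return total


def abundance(n):
    """Abundance A(n) = sigma(n) - 2n."""
    return sigma(n) - 2 * n


def classify(n):
    """Most specific label from the table for the abundance of n."""
    a = abundance(n)
    if a == 0:
        return "perfect"
    if a == -1:
        return "almost perfect"  # a special case of deficient
    if a == 1:
        return "quasiperfect"  # a special case of abundant; none known
    return "abundant" if a > 0 else "deficient"


if __name__ == "__main__":
    for n in (6, 8, 12, 15):
        print(n, abundance(n), classify(n))
```

Trial division up to the square root of n is sufficient for small arguments like these; a sieve would be the natural choice for tabulating long runs of values.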
Building MathWorld with Mathematica
Once the contents of MathWorld are present in carefully tagged and structured notebooks, the process of website creation can begin. While Mathematica has the ability to produce web pages directly via Export and File ▶ Save As Special ▶ HTML, the MathWorld site contains a number of specialized structures that must be created during the build. For example, MathWorld contains both alphabetical (mathworld.wolfram.com/letters) and topical (mathworld.wolfram.com/topics) indices. The indexing information is contained in tagged cells for each entry, but in addition to the link trails that are displayed on individual entry pages, index pages for each topic must also be generated. Similarly, MathWorld contains a “what’s new” page (mathworld.wolfram.com/whatsnew) which lists entries that have been added or substantially modified. These custom navigation elements require special treatment in the website building process.
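To make the indexing step concrete, the sketch below derives a topical tree, an alphabetical index, and a link trail from colon-delimited classification strings like those in the sample entry. It is written in Python for illustration and is not the actual build code; the data layout, the function names, and the classification given to the second entry are all assumptions.

```python
from collections import defaultdict

ENTRIES = {
    "Abundance": [
        "Mathematics:Number Theory:Special Numbers:Digit-Related Numbers",
        "Mathematics:Foundations of Mathematics:Mathematical Problems:Unsolved Problems",
    ],
    # Hypothetical second entry, filed under an assumed topic for illustration.
    "Reversal": [
        "Mathematics:Number Theory:Special Numbers:Digit-Related Numbers",
    ],
}


def topical_index(entries):
    """Nested dict of topics; entry names are stored under the key '_entries'."""
    tree = {}
    for name in sorted(entries):
        for path in entries[name]:
            node = tree
            for topic in path.split(":"):
                node = node.setdefault(topic, {})
            node.setdefault("_entries", []).append(name)
    return tree


def alphabetical_index(entries):
    """Map each initial letter to the sorted entry names starting with it."""
    index = defaultdict(list)
    for name in sorted(entries):
        index[name[0].upper()].append(name)
    return dict(index)


def link_trail(path):
    """Render one classification string as a breadcrumb-style trail."""
    return " > ".join(path.split(":"))


if __name__ == "__main__":
    print(link_trail(ENTRIES["Abundance"][0]))
    print(alphabetical_index(ENTRIES))
```

Because both indices are computed from the same tagged data, adding or reclassifying an entry only requires editing its classification cells.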
The production of MathWorld web pages from source notebooks is implemented using a general-purpose Mathematica-based tool known as transmogrify that is currently under active development by the online documentation group at Wolfram Research. Transmogrify is in effect an XML processing tool for website creation that implements the idea of Extensible Stylesheet Language Transformations (XSLT) in top-level Mathematica code by making use of Mathematica’s XML and pattern-matching capabilities. Just as XSLT makes single-source documents possible in XML, transmogrify allows conversion of a Mathematica notebook into any type of XML. In fact, prior to being adapted for MathWorld, transmogrify had already been used to create many popular websites at Wolfram Research, including The Wolfram Functions Site (functions.wolfram.com) and the Wolfram Mathematica Documentation Center (reference.wolfram.com).
To use transmogrify, we created a set of XML-like templates that described how various MathWorld notebook structures should be translated. These templates could then be customized for HTML, XHTML+MathML, Java Server Pages (JSPs), and so on, making it easy to export to multiple formats using the same overall system. We were therefore able to develop a master Mathematica program based on transmogrify that builds the entire MathWorld website, all content pages, indices and subject trees, and additionally performs such useful functions as checking link integrity, building custom indices, and so on. This system also automatically takes care of rasterizing inline and displayed equations into GIFs using Mathematica’s built-in ability to export its typeset structures to raster formats, as well as adding equation numbers and other related matter. It is also clever enough to rebuild only needed parts of the site, making it now possible to build and push incremental updates of MathWorld several times a day.
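The template idea can be illustrated with a toy converter. The Python sketch below is not transmogrify and does not reflect its template syntax; the cell styles, the template tables, and the `transform` function are hypothetical stand-ins meant only to show how one structured source can be rendered into several formats by swapping templates.

```python
from html import escape

# Hypothetical per-format templates: notebook cell style -> output pattern.
TEMPLATES = {
    "html": {
        "Title": "<h1>{body}</h1>",
        "Text": "<p>{body}</p>",
        "Reference": '<p class="reference">{body}</p>',
    },
    "xml": {
        "Title": "<title>{body}</title>",
        "Text": "<para>{body}</para>",
        "Reference": "<bibentry>{body}</bibentry>",
    },
}

# A notebook reduced to (cell style, text) pairs, using text from the sample entry.
NOTEBOOK = [
    ("Title", "Abundance"),
    ("Text", "Kravitz has conjectured that no numbers exist whose abundance "
             "is a (positive) odd square (Guy 2004)."),
    ("Reference", "Guy, R. K. Unsolved Problems in Number Theory, 3rd ed."),
]


def transform(cells, target):
    """Render every cell with the template registered for its style."""
    templates = TEMPLATES[target]
    rendered = []
    for style, body in cells:
        if style not in templates:
            raise KeyError("no template for cell style %r" % style)
        rendered.append(templates[style].format(body=escape(body)))
    return "\n".join(rendered)


if __name__ == "__main__":
    print(transform(NOTEBOOK, "html"))
    print(transform(NOTEBOOK, "xml"))
```

Raising an error for an unknown cell style, rather than silently dropping the cell, is one way to ensure that untagged content is noticed during a build.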
New Features in MathWorld
In addition to building the MathWorld site, the transmogrify-based build system also made it possible to include a number of entirely new features on MathWorld. These include
- Qualified Dublin Core metadata, including MSC headings
- interlinking with The Wolfram Functions Site (functions.wolfram.com)
- a new MathWorld Classroom for browsing mathematical content on MathWorld based on learning prerequisites
- new interactive webMathematica examples, plotters, and calculators for many entries
- the ability to create mathematical markup language (MathML) versions of the site
- a streamlined comment and contribution system
Dublin Core Metadata
The Dublin Core Metadata Initiative (dublincore.org) is an open forum engaged in the development of interoperable online metadata standards that support a broad range of needs. In particular, qualified Dublin Core provides a rigorous and specific set of tags that can be used to precisely describe the contents of a web resource. All of the nearly 13,000 entries on MathWorld now contain Dublin Core metadata, as illustrated below for the MathWorld entry Sphere.html. As can be seen, this metadata provides a summary of a page’s content, revision history, and subject classifications in both the proprietary MathWorld scheme and in MSC, as well as information about the publisher, language, and so on. Inclusion of this metadata means that MathWorld content can be easily discovered, harvested, and re-exposed by protocols such as the Open Archives Initiative (OAI), thus allowing, for example, the type of federated searching provided by the National Science Digital Library project.
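As a rough picture of what such metadata looks like in a page header, the Python sketch below emits Dublin Core fields using the common convention of HTML meta elements with a "DC." prefix. The field values are placeholders assumed for illustration and are not the actual metadata of Sphere.html; the helper name `dublin_core_meta` is likewise hypothetical.

```python
from html import escape

# Placeholder values for illustration; not the real metadata of the entry.
SAMPLE_FIELDS = [
    ("title", "Sphere"),
    ("creator", "Weisstein, Eric W."),
    ("publisher", "Wolfram Research, Inc."),
    ("language", "en"),
    ("identifier", "http://mathworld.wolfram.com/Sphere.html"),
]


def dublin_core_meta(fields):
    """Return HTML header lines declaring the DC schema and one meta tag per field."""
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />']
    for name, value in fields:
        lines.append(
            '<meta name="DC.{}" content="{}" />'.format(name, escape(value, quote=True))
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(dublin_core_meta(SAMPLE_FIELDS))
```

Qualified Dublin Core adds refinements and encoding schemes on top of these simple elements, which this sketch does not attempt to model.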
Interlinking with The Wolfram Functions Site
The Wolfram Functions Site is a sister site of MathWorld that consists of nearly 90,000 mathematical identities and more than 10,000 visualizations of the elementary and special functions of mathematical physics (and, in particular, of those implemented in Mathematica). As such, it gives many more formulas and identities satisfied by any given function than MathWorld can. At the same time, the handbook-style presentation of The Wolfram Functions Site provides less motivation and background than does the corresponding MathWorld entry. As a result, it makes a great deal of sense to interlink these two large resources. This process has now been completed; readers of The Wolfram Functions Site will note links to MathWorld from its sidebar, while MathWorld readers will notice special links near the references section of certain entries to corresponding pages on The Wolfram Functions Site.
The MathWorld Classroom
The MathWorld Classroom is an entirely new part of MathWorld designed to help students and educators obtain streamlined definitions and didactic information for a select group of entries. Classroom pages indicate in which course and level of a typical mathematical curriculum a given entry falls, and include educational standards material in addition to examples of and prerequisites for the given concept. The pages also provide interlinking and cross-navigation between the main MathWorld entry and the Classroom entry. By analyzing mathematics curricula, textbooks, and access logs on MathWorld, we were able to compile a set of approximately 300 core entries which we targeted for inclusion in the Classroom. A concise and accessible definition was then carefully written for each such entry, and a database containing information about the educational level and relationships of that concept to others in the Classroom was constructed.
An illustration of a typical Classroom entry is shown in Figure 3. It includes (1) a cross-link to the full MathWorld entry; (2) a definition; (3) the educational level; (4) educational standards in which the entry appears; (5) examples of the entry; (6) other Classroom articles covering similar material; and (7) other Classroom articles for material that would be encountered in the same course.
Figure 3. MathWorld classroom entry for Quadrilateral.
The Classroom is easily accessible in a number of different ways. A convenient link on the sidebar of each MathWorld page takes the reader to an overview page (mathworld.wolfram.com/classroom) listing the names of mathematics courses that an American student would typically encounter from elementary school on through graduate course work. Clicking a given course lists all entries in the Classroom that would normally be encountered in that course, and clicking any one of these opens a popup window like the one illustrated in Figure 3. Classroom popup windows can also be opened from the corresponding full MathWorld entry. When browsing Quadrilateral.html, for example, a small icon appears at the top of the entry indicating that the user can “Explore this topic in the MathWorld Classroom.” Conveniently, the user can easily navigate back and forth between Classroom entries, which always open in the popup window, and main MathWorld entries, which always open in the usual browser window, since each Classroom entry contains a corresponding “Explore this topic in MathWorld” icon and link.
The MathWorld Classroom is currently being used extensively by both educators and students. While there is much additional work that could be done and additional entries that could be added to the Classroom, the current version appears to fill a unique void on the internet between sites containing extremely simple educational resources and those containing very complicated ones. On MathWorld, the two are now integrated together in a way that is especially useful to teachers and students alike.
Interactive webMathematica Examples
While MathWorld contains a huge amount of mathematical content, it is only as a result of work carried out under the National Science Digital Library grant that much of this content is now interactive. webMathematica is a web-based version of Mathematica that allows real-time mathematical computations to be incorporated into web-based content. By combining the power of webMathematica with the ease of website generation provided by MathWorld’s new transmogrify-based build system, it is straightforward to convert static plots into interactive ones, precomputed tables of values into customized ones computed on-the-fly, and so on. MathWorld currently contains 128 interactive webMathematica examples on 86 separate pages. (For a complete listing, see topics/webMathematicaExamples.html.) These examples include plotters for functions on the real line or in the complex plane, but also more complicated examples such as RiemannSum.html shown in Figure 4, which allows the reader to learn about Riemann summation by specifying arbitrary functions, endpoints, styles, and so on.
Due to the ease of creating and incorporating webMathematica examples, we plan to greatly augment the already large number of such examples over time.
Figure 4. webMathematica interactive example for Riemann Sum entry.
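The computation behind an example of this kind is short. The Python sketch below approximates an integral by left, midpoint, or right Riemann sums; it illustrates the mathematics only and is unrelated to the webMathematica code that powers the interactive page. The function name and the `rule` parameter are assumptions.

```python
def riemann_sum(f, a, b, n, rule="midpoint"):
    """Approximate the integral of f over [a, b] using n equal subintervals.

    rule selects the sample point in each subinterval:
    "left", "midpoint", or "right".
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    offsets = {"left": 0.0, "midpoint": 0.5, "right": 1.0}
    if rule not in offsets:
        raise ValueError("unknown rule: %r" % rule)
    width = (b - a) / n
    offset = offsets[rule]
    return width * sum(f(a + (i + offset) * width) for i in range(n))


if __name__ == "__main__":
    for rule in ("left", "midpoint", "right"):
        print(rule, riemann_sum(lambda x: x * x, 0.0, 1.0, 100, rule))
```

For an increasing function such as x² on [0, 1], the left and right sums bracket the true value 1/3, and the midpoint sum converges fastest as n grows.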
MathML Versions of Pages
As already mentioned, by making use of Mathematica’s built-in knowledge of XML and MathML, it is in principle relatively straightforward to use the same notebooks that currently create HTML pages with GIF images and instead construct XHTML+MathML versions of the pages using the same build system. We are currently able to do this, and as a result are presently collaborating with projects seeking to search arbitrary online mathematics. While MathML versions of MathWorld pages are very useful and suitable for this purpose, we have thus far not been able to bring them to a suitably polished state to be able to place them on the live website. There are a number of reasons for this, some of which are related to the nature of the MathWorld source documents themselves, but others of which are due to inherent limitations in the support of the MathML standard in browsers.
The limitation in MathWorld’s source documents is that, because they began life as LaTeX, there is no semantic information accompanying mathematical markup except that which can be inferred based on the typesetting structures themselves. The existing formulas therefore can be translated into presentation MathML, but only with limited semantic information. As it turns out, while the resulting MathML is sufficient for display, indexing, and searching, only a limited subset of it is of sufficient quality to allow it to perform computations. There does not appear to be any shortcut here; making the transition from presentation to semantic markup requires either rekeying from scratch (an error-prone and labor-intensive operation) or additional development of software tools for inferring and tagging ambiguous interpretations (Does the typeset structure “π(x)” represent π·x, “the quantity pi times the quantity x,” or π(x), “the prime counting function of the quantity x”?).
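The ambiguity can be made concrete by writing the markup out. In the Python sketch below, one presentation MathML fragment stands for the typeset form, while two different content MathML fragments encode the two readings. The fragments are hand-written for illustration, and using the symbol name PrimePi for the prime counting function is an assumption borrowed from Mathematica's naming.

```python
import xml.etree.ElementTree as ET

# Presentation markup: records only what appears on the page.
PRESENTATION = "<mrow><mi>&#x3c0;</mi><mo>(</mo><mi>x</mi><mo>)</mo></mrow>"

# Content markup: two different meanings for the same typeset form.
AS_PRODUCT = "<apply><times/><pi/><ci>x</ci></apply>"
AS_PRIME_COUNT = "<apply><csymbol>PrimePi</csymbol><ci>x</ci></apply>"


def displayed_text(presentation_xml):
    """Concatenate the visible tokens of a presentation fragment."""
    return "".join(ET.fromstring(presentation_xml).itertext())


def operator_of(content_xml):
    """Name of the operator applied in a content <apply> fragment."""
    head = ET.fromstring(content_xml)[0]
    return head.text if head.text else head.tag


if __name__ == "__main__":
    print(displayed_text(PRESENTATION))
    print(operator_of(AS_PRODUCT), operator_of(AS_PRIME_COUNT))
```

Both content fragments would be rendered identically, which is why the meaning cannot be recovered from the presentation markup alone.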
On the browser side, a number of partial solutions are currently available, but the fact remains that support for MathML is still very problematic. In particular, a variety of commercial and public domain tools exist for viewing MathML, but none of the solutions we investigated were capable of rendering a page of the complexity of a typical MathWorld entry in a way that would be suitable for general usage. The situation is significantly complicated by cross-platform and cross-browser incompatibilities, nontrivial configuration issues that would exceed the technical abilities of the vast majority of MathWorld readers, font limitations, and the lack of a viable solution under Mac OS X.
We continue to investigate technical issues surrounding full MathML versions of MathWorld pages and hope that as some of the technical limitations are overcome by browser and plug-in vendors, it will become possible to view MathWorld pages in MathML.
Streamlined Comment and Contribution System
Reader feedback is an indispensable part of MathWorld’s success. In fact, thousands of contributors have been instrumental in building the site into the pre-eminent mathematics resource that it is today. While maintaining the convenient navigation structures that help make the vast amount of content on MathWorld easy to navigate, a newly redesigned contribution form accessible through links at the top of each MathWorld page lets readers leave comments and suggestions about that specific page. If you have a comment or contribution you’d like to share, please consider leaving a message!
Conclusion
MathWorld is now authored and built using Mathematica and Mathematica-based tools. As outlined in this article, MathWorld continues to grow not only in size, but also with new and useful features. Even more interactive features are on the way, taking advantage of Mathematica-based technologies such as webMathematica. Finally, I would like to take this opportunity to say what a great pleasure and honor it has been to have corresponded with so many MathWorld readers and contributors over the years. Thank you all for your support, readership, comments, contributions, and continued feedback, all of which have enabled MathWorld—which is a labor of love for me—to remain a useful and up-to-date mathematical resource over the last ten years.
Acknowledgments
The author would like to thank Stephen Wolfram for many useful discussions and Wolfram Research for its continued support of free public resources that are of use not only to Mathematica users, but to the world science and mathematics communities in general. I am grateful to John Renze, who carried out much of the design and implementation of the new build system in addition to working extensively on the MathWorld Classroom. Thanks also go out to Chad Slaughter and Bill White, whose TeXnical and technical knowledge were invaluable in creating the system to translate MathWorld sources into Mathematica notebooks. Michael Trott was frequently available to lend a small part of his unsurpassed knowledge of Mathematica programming. This project could not have been completed without the help of Jean Buck and her uncanny knack for matching technical needs with the resources required to address them. Thanks also to Megan Gillette and Jeremy Davis for the elegant graphical designs that make MathWorld’s appearance worthy of its content. I would especially like to acknowledge Lee Zia and the National Science Foundation National Science Digital Library program for supporting this work through grant #0226327, and to thank William Mischo and Tim Cole at the University of Illinois for their expert knowledge and close collaboration on many aspects of this work.
[1] R. J. Fateman and R. Caspi, “Parsing TeX into Mathematics,” (May 6, 2004) www.cs.berkeley.edu/~fateman/papers/parsing_tex.pdf.
E. W. Weisstein, “Making MathWorld,” The Mathematica Journal, 2012. dx.doi.org/10.3888/tmj.10.3-3.
About the Author
Eric W. Weisstein began compiling scientific encyclopedias as a high school student nearly 20 years ago. Born in Bloomington, Indiana in 1969, he studied physics and astronomy at Cornell University and at Caltech and received his Ph.D. from Caltech in 1996. In 1995, Weisstein took the vast collection of mathematical facts that he had been accumulating since his teenage years and began to deploy them on the early internet. These pioneering efforts at organizing and presenting online content helped define a paradigm that has subsequently been followed by other large-scale informational projects on the web.
Weisstein joined Wolfram Research in 1999 and unveiled the MathWorld website at mathworld.wolfram.com later that year. As a Senior Research Fellow at Wolfram Research, Weisstein has led the development of MathWorld, continuing to expand its scope and depth and fulfilling his vision for bringing accessible mathematical and scientific knowledge to the widest possible audience. Weisstein works closely with the main development teams at Wolfram Research and is a consultant for the CBS television crime drama NUMB3RS.
Eric W. Weisstein
Senior Research Fellow
Wolfram Research, Inc.