The Mathematica Journal
Volume 9, Issue 2

AuthorTools: A Package for Document Processing
Pavi Sandhu

Functions

Introduction

Most AuthorTools functions require you to specify a notebook as the first argument. In general, you can specify a notebook either as a notebook object or a notebook file.

Notebook Objects

The Mathematica kernel refers to any open notebook via an expression of the form NotebookObject[fe, id], where fe specifies the front end in which the notebook is open and id is a unique serial number for the notebook.

The command Notebooks[] returns a list of notebook objects corresponding to all notebooks currently open in the front end.

This assigns the symbol nb to represent the first notebook object in the list.
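A minimal sketch of the two steps just described, using only built-in functions. The variable name open is an illustrative choice.

```mathematica
(* List every notebook currently open in the front end. *)
open = Notebooks[];

(* Take the first notebook object in that list and call it nb. *)
nb = First[open]
```

The order of the list returned by Notebooks[] depends on the front end's state, so the first element is not necessarily the notebook you are typing in.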

Notebook Files

Alternatively, you can identify a notebook by specifying the name and location of the notebook file. You can use the ToFileName function to construct a string specifying the full pathname of the file.
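For instance, a pathname could be assembled as follows. The folder name "MyBook" and the file name "Chapter1.nb" are hypothetical placeholders.

```mathematica
(* Build a full pathname from directory components and a file name. *)
(* "MyBook" and "Chapter1.nb" are hypothetical; substitute your own. *)
file = ToFileName[{$HomeDirectory, "MyBook"}, "Chapter1.nb"]
```

ToFileName inserts the path separator appropriate to the operating system, so the same command works across platforms.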

Loading the Package

This loads the AuthorTools package.
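A sketch of the loading step, assuming the package is installed under its standard context name AuthorTools`.

```mathematica
(* Load AuthorTools if it has not already been loaded in this session. *)
Needs["AuthorTools`"]
```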

You can then use the symbol nb as an argument to any notebook function.

Processing Multiple Notebooks

You can use AuthorTools functions to operate on either a single notebook or multiple notebooks in a directory.

To process a single notebook:

Specify the notebook as the first argument of the function you want to use.

In general, a function that acts on notebooks can accept either a notebook object or a notebook file as the first argument. The exceptions are those functions that act on the current selection in an open notebook, for example: HorizontalInsertionPointQ, AddIndexEntry, SelectionRemoveCellTags, or IndexCellOnSelection. These functions can accept a notebook object as the first argument but not a notebook file.

To process multiple notebooks:

1. Create a project file that lists all the notebooks you want to process. (See "Creating a Project File" for more information.)

2. Specify the project file as the first argument of the function you want to use. Alternatively, load the project file using the MakeProject dialog box, then use the notebook object corresponding to the dialog box as the argument of the function.

In general, a function that accepts a notebook file will also accept a project file. The exceptions are those functions that are only defined for a single notebook and hence cannot be applied to a project, such as: NotebookName, NotebookFolder, NotebookCellTags, or NotebookFileOptions. These functions can accept a notebook file as the first argument but not a project file.

General Notebook Manipulation

Getting Notebook Information

This gives the full pathname of the notebook object.

This gives the name of the notebook.

This gives the directory in which the notebook is located.

This gives all front end options set in the notebook.
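The following sketch applies three of the information functions named in this article to one notebook. The single-argument call form is an assumption based on the convention that the notebook is the first argument, and nb is assumed to be a notebook that has already been saved to disk.

```mathematica
Needs["AuthorTools`"]

(* Assumed: the first open notebook has been saved to a file. *)
nb = First[Notebooks[]];

NotebookName[nb]         (* name of the notebook *)
NotebookFolder[nb]       (* directory containing the notebook *)
NotebookFileOptions[nb]  (* front end options set in the notebook *)
```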

Some notebook functions.

Using the Notebook Cache

Any notebook file created in the front end has a cache that contains information about the notebook's cell structure. The cache appears at the end of the notebook file and can be inspected by opening the notebook in a text editor.

The notebook's cache contains information on the position, style, grouping, and size of each cell. When you open an existing notebook, the front end uses the information in the cache to read into memory only those cells that need to be displayed on the screen. This reduces the memory and time required to read and render a notebook since the entire notebook does not have to be read into memory at once.

The main component of the cache is a notebook file outline. This has the same form as the notebook expression describing the notebook but with each cell expression replaced by a condensed version. Instead of the actual contents of the cell, the condensed expression contains information on the cell's position in the notebook and the number of bytes of data it contains.

Each condensed expression is of the form: Cell[a, b, c, d, e, rules, string]. The parameters a through e are integers and denote the following:

You can view the file outline in a notebook's cache using the NotebookFileOutline function.

The NotebookLookup function extracts information from the notebook file outline. With "CellOutline" as the second argument of NotebookLookup, you get a list of all the condensed cell expressions occurring in the notebook file outline.

With "CellExpression" as the second argument of NotebookLookup, you get the complete cell expression for each cell in the notebook.

With "CellIndex" as the second argument of NotebookLookup, you get the serial number of each cell in the notebook. The first cell in the notebook has serial number 1 and so on.

With "CellIndex", the two-argument form of NotebookLookup simply generates a list of the form {1, 2, ..., n} for a notebook containing n cells. The NotebookLookup command becomes more useful if you use the optional third argument to specify a pattern. This allows you to extract information on all cells that fit the pattern. For example, the following command gives the serial number of all Section cells.

The following command returns the cell expressions of all Section cells.
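The two lookups for Section cells might be written as in the sketch below. The pattern Cell[_, "Section", ___] is an assumption about the form the third argument takes. The final line uses only built-in functions as a cross-check.

```mathematica
Needs["AuthorTools`"]
nb = First[Notebooks[]];

(* Assumed pattern form: a cell whose style is "Section". *)
sectionPattern = Cell[_, "Section", ___];

(* Serial numbers of all Section cells. *)
NotebookLookup[nb, "CellIndex", sectionPattern]

(* Complete cell expressions of all Section cells. *)
NotebookLookup[nb, "CellExpression", sectionPattern]

(* Built-in cross-check: read the whole notebook expression and filter it. *)
Cases[NotebookGet[nb], sectionPattern, Infinity]
```

The built-in version reads the entire notebook into the kernel, whereas NotebookLookup is described above as working from the file outline in the cache.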

Some functions that provide information about a notebook's contents.

Adding and Removing Cell Tags

A cell tag is a string that serves as a tag or marker to identify a specific cell in a notebook. Cell tags are used, for example, to define the targets of hyperlinks or to identify the information that appears when you choose a topic in the Help Browser. AuthorTools includes functions for adding or removing cell tags in a notebook.

NotebookCellTags returns a list of all cell tags in the specified notebook. The nth element of the output list is a list of all cell tags for the nth cell in the notebook. Each cell that does not have a cell tag is represented by an empty list.
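Because the result is a list of lists, ordinary list functions can summarize it. This is a sketch that assumes the output structure described above.

```mathematica
Needs["AuthorTools`"]
nb = First[Notebooks[]];

(* One sublist of tags per cell; untagged cells give {}. *)
tags = NotebookCellTags[nb];

(* All distinct cell tags used anywhere in the notebook. *)
Union[Flatten[tags]]

(* Number of cells that carry no cell tag. *)
Count[tags, {}]
```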

You can add cell tags to cells in a notebook using AddCellTags.

You can remove cell tags from cells in a notebook using RemoveCellTags.

Some functions for adding and removing cell tags.

Processing Multiple Notebooks

Creating a Project File

The first step in processing multiple notebooks is to create a project file. This is a file that specifies the names and location of all notebooks in a project. You will need to create a different project file for each project you are working on.

A project file is a plain text file with a .m suffix. The file must include data in the following format.

Here, name is the name of the project, directory is the full pathname of the project directory, and the remaining item is a list of all notebooks that make up the project.

There are several different ways to create a project file:

  • Enter the project data in the text fields of the MakeProject dialog box. (See MakeProject for details.)
  • Use the command WriteProjectData[file, data], where data specifies the project data as specified in the preceding box.
  • Type the project data directly into a text file and save the file with a .m suffix.

Using a Project File

Once you have created a project file, you can operate on all notebooks in the project in one step. To do this you simply specify the project file as the argument to the AuthorTools function you want to use.

For example, to generate a table of contents for a project, you can evaluate the command MakeContents[projectfile, "Book"]. The corresponding command for a single notebook is MakeContents[nb, "Book"].
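A sketch of the project form of the call. The project file location is hypothetical, and the file is assumed to exist already.

```mathematica
Needs["AuthorTools`"]

(* Hypothetical project file; it must already exist on disk. *)
projectfile = ToFileName[{$HomeDirectory, "MyBook"}, "MyBook.m"];

(* Generate a table of contents covering every notebook in the project. *)
MakeContents[projectfile, "Book"]
```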

Editing a Project File

When you evaluate an AuthorTools function with a project file as an argument, Mathematica uses Get to read in the project file. Any Mathematica commands present in the project file are evaluated at this time. Thus, the project file is a convenient place to store commands that you want to apply to a project.

For example, suppose you prepend the following command to the contents of your project file.

This sets the option SelectedCellStyles of the AuthorTools function MakeContents. If you then use MakeContents to create a table of contents for that project, the option setting specified in the project file will be automatically used, overriding the default behavior of MakeContents.
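A command of that kind might look like the following sketch. It uses the built-in SetOptions, and the particular list of styles is illustrative rather than taken from any actual project file.

```mathematica
(* Placed at the top of the project file, this is evaluated when *)
(* the file is read in with Get. The style list is illustrative. *)
SetOptions[MakeContents,
  SelectedCellStyles -> {"Title", "Section", "Subsection"}]
```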

Note: Any commands you have added to a project file will be overwritten if you use the WriteProjectData function or the MakeProject dialog box to modify the project file.

Creating a Table of Contents

This generates a table of contents for the notebook. The table of contents is saved in the same directory as the source notebook.

The second argument of the function specifies the format of the table of contents. You can choose from three different formats: "Simple", "Book", or "BookCondensed". (See MakeContents for more information on these formats.)

Note: The MakeContents command inserts cell tags into the source notebook and automatically saves the changes. If you do not want your source notebook modified, you should keep a separate copy as a back-up.

If you are going to generate a table of contents in any format other than Simple, you should first use Paginate. This function calculates the page numbers for the specified notebook and stores them as TaggingRules in the notebook.

The following command generates a table of contents in the Book format.

The option SelectedCellStyles determines which cells in the notebook are included in the table of contents. The default setting is SelectedCellStyles -> {"Title", "Section", "Subsection", "Subsubsection"}. The following command generates a table of contents in the Book format that includes only Title, Section, and Subsection cells.
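Putting the pieces of this section together, a sketch of the sequence could read as follows. Passing the option directly to MakeContents is assumed to work in the usual way for Mathematica options.

```mathematica
Needs["AuthorTools`"]
nb = First[Notebooks[]];

(* Compute page numbers first, as recommended for non-Simple formats. *)
Paginate[nb];

(* Book-format contents restricted to three heading styles. *)
MakeContents[nb, "Book",
  SelectedCellStyles -> {"Title", "Section", "Subsection"}]
```

Remember that MakeContents modifies and saves the source notebook, so run this on a copy if the original must stay untouched.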

Creating an Index

There are two steps involved in setting up an index.

1. Associate index entries with specific cells in your source notebook.

The simplest way to insert index entries is using the Edit Notebook Index dialog box, accessed from the MakeIndex palette. Alternatively, you can type and evaluate the AddIndexEntry function. The following command associates the specified main entry and subentry with the currently selected cell(s) in the notebook.

2. Generate the index using the MakeIndex function.

This generates an index for the notebook.

The second argument of MakeIndex specifies the format of the index. You can choose from four different formats: "Simple", "Book", "TwoColumn", or "BrowserIndex". (See MakeIndex for more information on these formats.)
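As a sketch, the call for a single notebook in two of the formats listed above. The notebook is assumed to contain index entries already.

```mathematica
Needs["AuthorTools`"]
nb = First[Notebooks[]];

(* Index in the plainest format. *)
MakeIndex[nb, "Simple"]

(* Index laid out in two columns. *)
MakeIndex[nb, "TwoColumn"]
```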

Note: The MakeIndex command inserts cell tags into the source notebook and automatically saves the changes. If you do not want your source notebook modified, you should keep a separate copy as a back-up.

Finding Differences

This generates a notebook that lists the differences between the two notebooks, nb1 and nb2.
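A sketch of the basic call, with both notebooks given as files. The file locations are hypothetical.

```mathematica
Needs["AuthorTools`"]

(* Two hypothetical versions of the same chapter. *)
nb1 = ToFileName[{$HomeDirectory, "MyBook"}, "Chapter1-old.nb"];
nb2 = ToFileName[{$HomeDirectory, "MyBook"}, "Chapter1-new.nb"];

(* Open a report notebook listing the differences. *)
NotebookDiff[nb1, nb2]
```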

NotebookDiff will also find the differences between two projects, directories, or lists of files. This generates a notebook that summarizes the differences in the two sets of notebooks.

By default, NotebookDiff finds all possible differences between the notebooks, including differences in cell styles or options. However, you can narrow the scope of differences reported by NotebookDiff by specifying options. For example, this excludes cells in the Input and Output style from the diffing operation.

There are several other options to NotebookDiff, for example, to ignore cells that have the same content but differ only in their cell style or only in their options.

To view the differences within two different cells, use CellDiff. This highlights the differences in content by enclosing them in colored left and right brackets. Style and option differences are also listed.

If the notebooks you are comparing are style sheets, you can use either NotebookDiff or a more specialized version called StyleSheetDiff.

NotebookDiff and StyleSheetDiff are both implemented in terms of a lower-level function called DiffReport. This compares two generic lists, creating a report of all the insertions, deletions, and updates between the two lists.
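To illustrate the idea of comparing two generic lists, here is a sketch that uses the built-in SequenceAlignment (available in later versions of Mathematica). It is not DiffReport's implementation, only an analogous technique, and the sample lists are invented.

```mathematica
(* Two invented lists standing in for the cells of two notebooks. *)
old = {"Title", "Intro", "Methods", "Results"};
new = {"Title", "Introduction", "Methods", "Discussion", "Results"};

(* Align the lists: matching runs appear as plain lists, and each *)
(* difference appears as a pair {part of old, part of new}.       *)
alignment = SequenceAlignment[old, new];

(* Keep only the differences. An empty first part is an insertion, *)
(* an empty second part a deletion, and two non-empty parts an     *)
(* update.                                                         *)
Cases[alignment, {_List, _List}]
```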

Restoring a Notebook

NotebookRestore

NotebookRestore takes a notebook containing one or more syntax errors and creates a new notebook containing all the cells that did not have a syntax error. For example, suppose nb1 represents a notebook file that has been corrupted by removing one quote from a cell style name. If you try to open this file, the front end will report a syntax error and suggest you cancel the open operation.

Using NotebookRestore, you can at least access those cells in the notebook that are not corrupted. NotebookRestore will open a new notebook window containing all the good cells from the given notebook file and insert an indicator at each place where notebook data has been deleted.

Instead of deleting the corrupt data, you can display it verbatim within the indicator by setting the option DeleteCorruptCells -> False.
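The two variants could be sketched like this. The file location is hypothetical.

```mathematica
Needs["AuthorTools`"]

(* Hypothetical damaged notebook file. *)
file = ToFileName[{$HomeDirectory, "MyBook"}, "Damaged.nb"];

(* Open the recoverable cells, with indicators where data was dropped. *)
NotebookRestore[file]

(* Same, but show the corrupt data verbatim inside each indicator. *)
NotebookRestore[file, DeleteCorruptCells -> False]
```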

All the corrupt cell indicators have the cell tag "Corrupt", so you can easily highlight them by choosing Corrupt from the list of cell tags displayed under the Find ▶ Cell Tags menu. Alternatively, you can walk through each indicator one by one using the NextCorruptCell function.

If your notebook contains large graphics cells or large blocks of typesetting, it may take a long time for Mathematica to read in the expression and try to determine if it contains any syntax errors. For that reason, there are options that allow you to skip over graphics and/or typeset cells without processing them.

SalvageCells

SalvageCells is a noninteractive way to extract the list of cells from a corrupt notebook file. This function is intended for users who want to manipulate the cell listing directly, as part of a program. SalvageCells takes the same options, and returns the same information, as NotebookRestore.

This salvages all the cells from a notebook file.

This gives the total number of cells.

This gives the total number of blocks removed from the notebook.

You can view just the bad data in a separate notebook.

Pagination

This paginates the specified notebook. If the argument is a project file, the command paginates all notebooks in the project.

This paginates all notebooks in a project. The setting StartingPages -> "Next" means that the starting page for each notebook is the next page following the last page of the previous notebook.

Other possible settings for StartingPages are "Even", "Odd", Inherited, or a list of integers. The following command paginates all notebooks in a project and sets the starting page for the first four notebooks to be 1, 25, 57, and 83.

The following paginates all notebooks in a project. The starting page number for each notebook in the project is inherited from the setting of the option StartingPageNumber for that notebook.
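The three settings discussed in this section can be sketched side by side. The project file location is hypothetical, and the project is assumed to contain at least four notebooks for the explicit list.

```mathematica
Needs["AuthorTools`"]

(* Hypothetical project file. *)
projectfile = ToFileName[{$HomeDirectory, "MyBook"}, "MyBook.m"];

(* Each notebook starts on the page after the previous one ends. *)
Paginate[projectfile, StartingPages -> "Next"]

(* Explicit starting pages for the first four notebooks. *)
Paginate[projectfile, StartingPages -> {1, 25, 57, 83}]

(* Take each starting page from the notebook's own option setting. *)
Paginate[projectfile, StartingPages -> Inherited]
```

In practice you would evaluate only one of the three Paginate calls, since each one recomputes the page numbers.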

Creating a Browser Categories File

This generates a simple browser categories file for the notebook.

The second argument of MakeCategories specifies the format of the BrowserCategories.m file. You can choose from three different formats: "Simple", "Full", or "FullNoTags".

The following command generates a full browser categories file for the notebook.
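A sketch of the two calls described in this section, following the same argument convention as the other functions in this article.

```mathematica
Needs["AuthorTools`"]
nb = First[Notebooks[]];

(* Simple browser categories file. *)
MakeCategories[nb, "Simple"]

(* Full browser categories file. *)
MakeCategories[nb, "Full"]
```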

Note: The MakeCategories command inserts cell tags into the source notebook and automatically saves the changes. If you do not want your source notebook modified, you should keep a separate copy as a back-up.

Creating Bilateral Cells

You can use the MakeBilateral function to create bilateral cells for displaying sample Mathematica calculations. Each bilateral cell consists of two columns with expository text to the left and input and output cells to the right. To create a bilateral cell, you must have a pair of input and output cells preceded by a cell containing text.

This is the integral of the sine function.
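An input of the following kind would pair with such a caption. The exact cell is an assumption made for illustration.

```mathematica
(* Indefinite integral of the sine function; evaluates to -Cos[x]. *)
Integrate[Sin[x], x]
```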

This converts the text cell and input-output pair into a bilateral cell.

Here is the resulting bilateral cell.

The cell styles that appear in the left column of the bilateral cell are determined by FirstBilateralStyles, an option to MakeBilateral. The default setting of this option includes only one style, MathCaption.

The cell styles that appear in the right column of the bilateral cell are determined by RestBilateralStyles. The default setting includes Input, Output, and Graphics cell styles.

Once you have created a bilateral cell, you can reverse the process and get back the constituent cells using the following command.

Extracting Cells from a Notebook

This exports all Section style cells to a notebook.

You can also extract all Section cells by specifying a list of two elements as the second argument. ExportNotebook[nb1, {"CellStyle", "Section"}, "Notebook"] is equivalent to ExportNotebook[nb1, "Section", "Notebook"].
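The two equivalent forms, written out as a sketch. The file location is hypothetical.

```mathematica
Needs["AuthorTools`"]

(* Hypothetical source notebook. *)
nb1 = ToFileName[{$HomeDirectory, "MyBook"}, "Chapter1.nb"];

(* Short form: the second argument names the cell style. *)
ExportNotebook[nb1, "Section", "Notebook"]

(* Explicit form: the second argument is a {criterion, value} pair. *)
ExportNotebook[nb1, {"CellStyle", "Section"}, "Notebook"]
```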

This command exports all cell groups in which the heading cell is an input cell. This enables you, for example, to extract all input-output pairs in a notebook.

This command exports all cells having the cell tag "Reference1".

This command exports all graphics cells as separate GIF files.

Printing

This sets the starting page number of the specified notebook to 15.
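One way to achieve this with built-in functionality alone is the notebook option PrintingStartingPageNumber. Whether the command used in this article takes this form is an assumption.

```mathematica
nb = First[Notebooks[]];

(* Built-in notebook option; the first printed page is numbered 15. *)
SetOptions[nb, PrintingStartingPageNumber -> 15]
```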

This sets the printing margins in the specified notebook. The option value is a list of four numbers, {l, r, b, t}, which specify the value of the left, right, bottom, and top margins in printer's points.

This modifies the notebook so the cell brackets are invisible when the notebook is printed.



     
Copyright © Wolfram Media, Inc. All rights reserved.