Combinatorics on Words « The Mathematica Journal

Veikko Keränen

Combinatorics on Words
NB CDF PDF

Suppression of Unfavorable Factors in Pattern Avoidance

We explain extensive computer-aided searches that have been carried out for many years to find new ways of constructing abelian square-free words over four letters. These structures have turned out to be very rare and hard to find. We also encountered highly nonlinear phenomena that considerably affected our calculations and usually made them hard to accomplish. However, quite recently, we gained new insight into why these structures are so very rare. Consequently, the present work has the potential to make future explorations easier. The rarity of long words that avoid abelian squares can be explained, at least partly, by using the concept of an unfavorable factor. The purpose of this article is to describe the use of Mathematica in searching for and suppressing these factors. In principle, the same programs can be used with slight modifications for other kinds of word patterns as well.

Introduction

An abelian square is a nonempty word , where and are permutations (anagrams) of each other. In 1961, Paul Erdös [1] raised the question of whether or not abelian squares can be avoided (as factors) in infinitely long words (also called strings). In 1969, Pleasants [2] solved positively the question by Erdös in the case of a five-letter alphabet, but the four-letter case remained open until 1992 when the author [3] presented an abelian square-free (a-2-free) endomorphism over the four-letter alphabet . This endomorphism was found after long computer experiments, and, for a long time, all known methods for constructing arbitrarily long a-2-free words on were based on the structure of , including Carpi’s [4] modification from 1998.

In 2002, after over 11 years of exhaustive searches, we found a completely new endomorphism of , the iteration of which produces an infinite a-2-free word. The endomorphism is not an a-2-free endomorphism itself, since it does not preserve the a-2-freeness of all words of length 7. However, can be used with to produce a-2-free DT0L-languages of unlimited size. For the latest remarkable development, the reader is referred to the author’s publications [5, 6, 7]. Nowadays, a-2-free words have also found their way to applications, for instance, in number theory, algorithmic music, and only recently in cryptography (Rivest [9] in 2008).

An important analogous open avoidability problem for the three-letter case was posed by Sami Mäkelä [10] in 2002. In this problem, shortest possible abelian squares, that is, repetitions of the form or for a letter , are allowed to occur.

Quite recently we have gained new insight into why these structures are so very rare. We are now able to explain, at least partly, the rarity of long words that avoid abelian squares by using the concept of an unfavorable factor. The purpose of this article is to describe the use of Mathematica in searching for and suppressing these factors. In principle, the same programs can be used with slight modifications for other kinds of (possibly nonabelian) word patterns as well.

The concept of an unfavorable factor, together with our programs, can be used in connection with efficient matrix methods presented by Ochem and Reix [11] to find upper bounds for the growth of the number of repetition-free words. Indeed, their approach is also directly applicable to the abelian square-freeness case. The related computations are still proceeding, and, consequently, this topic is not elaborated here.

We have carefully studied the options to make the programs efficient with regard to running time and the required memory space. At the same time, for further development, the structures need to be flexible. Indeed, it is expected that these programs will be run interactively for quite a long time. The programs are also likely to be transformed into various computational environments (such as GRID, C++, FPGA). This work has already been started. In the process of these computations, efficiency will increase due to the natural reduction process. Indeed, the long lists of words that we now have to use can most likely be considerably reduced as one moves on to study longer words.

The programs and some of the structures involved are quite complicated and would have been very hard to develop without using Mathematica. We have been using integer coding and cumulative integer lists for words incorporated into quite extensive precomputations. Moreover, certain symbolic representations for words and Mathematica‘s pattern matching properties (for lists and strings) have been extremely useful.

At the same time, we feel that it would be extremely beneficial for programmability and computational efficiency if in Mathematica there were an efficient way of saying, for example, that when matching the pattern , we are actually interested only in those cases of u and v in which their lengths are equal. Indeed, our words can contain thousands of letters, and the absence of restricted pattern matching led us to write part of the programs in a quite tedious C-like fashion. Fortunately, to our big relief, debugging turned out to be a comfortable process. In the future, the new integrated string functions of Mathematica like RegularExpression can also be used.

We define an unfavorable (or forbidden) word or factor to be an a-2-free word over a fixed alphabet (in our case , or , in which case we allow or for a letter ) if it cannot occur as a proper factor inside any infinite a-2-free word. That is to say, over , an unfavorable a-2-free word cannot be continued to an infinitely long word to the left and to the right without necessarily creating an abelian square at some point. (However, it might well be possible to extend such a word boundlessly in one direction, say to the right, without producing any abelian squares. Experiments support this conjecture, but the existence of such unfavorable factors remains an open question.)

We now give a short explanation of our search for unfavorable factors. Let the alphabet be fixed and consider words over it. We take a word (we actually need to consider all the a-2-free words of a given length) and try to extend it in an a-2-free fashion to the right and left in all possible ways up to a given upper bound for the total length. Each time, the length of the word increases only by a given fixed length. We extend alternately to the right and left, and backtrack when necessary. If the upper bounds are reached, then the original word is so-far-favorable (it may still turn out to be unfavorable after more experiments). If there is no way to reach the upper bounds, then the original word is classified, without any doubt, as unfavorable. Thus, for a given length, we obtain three kinds of words: unfavorable (bad), so-far-favorable(so-far-so-good), and favorable (good). The latter type contains words occurring as factors in a-2-free words obtained by using, for example, , , and Carpi’s modification of .

It is a remarkable phenomenon that relatively short so-far-favorable words turn out to be unfavorable factors after being “safely” extendable (to the right and left) for a long distance (and with a huge number of branches). One might have expected the long buffers to be sufficient for further growth. Due to Carpi [4] and Keränen [7], we know that the number of a-2-free words over four letters grows exponentially (greater than or equal to ) with respect to word length . But how do these words grow? We conjecture that in spite of the exponential growth, the ratio between the number of properly extendable, that is, favorable, words of length , and the number of unfavorable factors of the same length, tends to zero as the length tends to infinity! Thus, we suspect that the vast majority of a-2-free words over four (and, arguably, three) letters cannot occur as proper factors in the middle of very long (infinite) words. In a way, this would also explain why it is so extremely difficult to find a-2-free endomorphisms over four letters. At present we know, for example, that just a little over half (50.5737%) of the a-2-free words over of length 20 are indeed unfavorable. In the future, this and other similar observations could lead to a better understanding of the abelian square-free structures.

As an example, in the case of four letters, the following words together with their permutations and mirror images are unfavorable (forbidden) factors.

These words form a very interesting case, leading to a million-fold fluctuational phenomenon (with respect to the required running time and memory space, if we wish to see the tree of all possibilities). This, actually quite frequently appearing, case is illustrated (in a somewhat different setting than the one we use elsewhere in this article) online at [12]. An example is the following illustration for "abcbdbcbacbdcdacbdac". Note the top at and the death of all branches at word length 87.

Figure 1. Extend abcbdbcbacbdcdacbdac alternately to the right and left, stepping one letter at a time.

We now provide an explanation for the case shown in the preceding figure. We tried to extend the word "abcbdbcbacbdcdacbdac" to the right (blue) and to the left (red) with all possible letters over . The length of the words increases only by one letter at a time (alternately to the right and left). All possible words are gathered in a list and the number of words is plotted according to their length. For each word there are four possibilities: it can be continued with 3, 2, 1, or 0 letters. The case of 0 letters means that the word cannot be continued (in an a-2-free way) at all. If no words can be continued, then we get an empty list and the computation ends. The resulting list consists merely of unfavorable (or bad, or forbidden) factors that start from the original given word, and this starting word is “in the middle” of every nonempty word in the list. Indeed, all the computations for the words in the list of someUnfavourableFactorsOfLength20, together with a great many other similar words, do end with an empty list (i.e., with a dead end) even though the width of the bidirectional tree, and the computational time, can be huge. All this can make finding unfavorable factors really challenging.

Here are some additional illustrations of this phenomenon. This time, the number of words is depicted using logarithmic scaling.

Figure 2. Extend abacadbcbacbdcdacbab alternately to the right and left, stepping one letter at a time.

Figure 3. Extend abacadbdadcacbadcbab alternately to the right and left, stepping one letter at a time.

Figure 4. Extend abacdbacbcdcbadacdab alternately to the right and left, stepping one letter at a time.

Figure 5. Extend abacdbadbdcdbacadcab alternately to the right and left, stepping one letter at a time.

In the last illustration, the word "abcdacbabdabacdacbcdad" of length 22 turns out to be unfavorable in a monstrous way. The blue point, just before the final collapse, is at (117, 7866918). In spite of this, not a single extension is possible to the left (which would otherwise lead to words of length 118).

Figure 6. Extend abcdacbabdabacdacbcdad alternately to the right and left, stepping one letter at a time.

In passing, we mention that some unfavorable factors can even be cut shorter from the right or left ends to (shortened) factors which, after a transitional phase, have the same final trajectory.

Preliminaries

In this section we present some mathematical notations and terminology. Our terminology is more or less standard in the field of combinatorics on words. Consequently, the reader might consult this section later, if need arises.

An alphabet is a finite nonempty set of abstract symbols called letters. A word (string) over is a finite (unless otherwise indicated) string, or sequence, of letters belonging to . The set of all words (nonempty words) over is denoted by (). On the set , the associative binary operation of catenation is defined. For words and , it is the juxtaposition . The empty word, which is the neutral element of catenation, is denoted by . The algebraic structures and are called, respectively, the free monoid and the free semigroup generated by .

Let , . The length of the word , denoted by , is the number of occurrences of letters in , that is, . Let . The number of occurrences of one letter in is denoted by , or simply by if . The notation stands for the Parikh vector of , that is, . Usually we will omit the subscript and write instead of . Quite interchangeably with the Parikh vector notation, we also use formal sums , with . For example, . Thus , with a word over , is an element of the abelian free monoid generated by . We will also consider differences of Parikh vectors and differences of formal sums. Consequently, these vectors and sums are extended into elements of the abelian free group generated by . The neutral element of is denoted by .

A word is called a factor (some authors call it a subword) of a word if for some words and . If (), then is called a prefix (a suffix) of .

Let be a given integer. A -repetition (an abelian -repetition) is a nonempty word of the form (, where for all , that is, the are permutations of each other). Instead of (abelian) 2- and 3-repetitions, the terms (abelian) squares and (abelian) cubes are often used. A word or an -word (explained later) is called -repetition free (abelian -repetition free, or -free in the abelian sense), or in short -free (–-free), if it does not contain any -repetition (abelian -repetition) as a factor. A word sequence or a word set is -free (–-free) if all words in it are -free (a--free). If, for a fixed , it is possible to construct arbitrarily long (infinite) a--free (or other pattern-free) words over a given alphabet , then we say that abelian -repetitions (or those patterns) are avoidable over .

A morphism is a mapping between free monoids and with for every and in . In particular, . A morphism , being compatible with the catenation of words, is uniquely defined if the word is (effectively) given for each . If , we call an endomorphism (and usually write instead of ). For a morphism and a language , we define . A morphism is termed uniformly growing if for every and .

For a given integer , a morphism is called -free (–-free) if is -free (a--free) for every -free (a--free) word .

With regards to -systems (Aristid Lindenmayer 1925-1989), we specify the following concepts. A D0L-system is a triple , where is an alphabet, is an endomorphism, and , called the axiom, is a word over . The (word) sequence generated by consists of the words

where for . The language of is defined by . Languages (sequences) defined by a D0L-system are referred to as D0L-languages (D0L-sequences). D0L-systems provide a very convenient way for defining languages and infinite words. Furthermore, if and are -free (a--free), then the iteration of will yield a -free (a--free) D0L-sequence. An HD0L-system is a 5-tuple , where is a D0L-system, called the underlying D0L-system of , is an alphabet, and is a morphism. The HD0L-sequence generated by consists of the words

and the HD0L-language of is the set . A DT0L-system is a triple , where is a finite nonempty set of morphisms (called tables) and is a D0L-system for every . The DT0L-language of is the set or , where the compositions of morphisms are constructed from . Obviously, a DT0L-system can be regarded as a D0L-system, when contains only one (endo)morphism. For a thorough discussion of various -systems the reader is referred to Rozenberg and Salomaa [13].

An –word is an infinite sequence, from left to right, of letters of an alphabet . Thus an -word can be identified with a mapping of into . One can construct an -word, for example, by iterating an endomorphism , such that and for some , . Such a morphism is called prefix preserving for the reason that is a proper prefix of whenever . An -word is obtained as the “limit” of the sequence ; .

Suppression of Unfavorable Factors with Mathematica

In this section we give examples of the developed Mathematica program. The correctness of the program and related packages has been tested, but exhaustive computer runs are still likely to take a long time in the future. The full program and further versions of it can be downloaded from [14] or from [15], the latter of which is a general link page for the topic. In addition, the program (in original form) is included at the end of this article, and the reader is invited to experiment with it. For this purpose, one may copy and edit the Input cells of the examples presented. The program consists of initialization cells and will be activated automatically.

In the following examples, many of the function definitions are omitted (they can be found at the end of this article). Note that all of the structures, variables, and constants starting with are global. For example, the global variable state represents the state of the construction. Its values are strings, including, for example, "extendOrChangeRight", "extendOrChangeLeft", "testRight", "testLeft", "failBoth", and "succeeded".

As mentioned in the Introduction, we first fix the alphabet and consider words over it. We take a word (in the final investigation, we will actually take all of the a-2-free words of a given length) and try to extend it in an a-2-free fashion to the right and left in all possible ways up to a given upper bound for the total length. Each time, the length of the word increases only by a given fixed length. We extend alternately to the right and left, and backtrack when necessary. If the upper bounds are reached, then the original word is so-far-favorable. If there is no way to reach the upper bounds, then the original word is classified, without any doubt, to be unfavorable. At present, favorable words (for the four-letter case) consist of only those occurring as factors in a-2-free words obtained by using known a-2-free endomorphisms and substitutions such as , , Carpi’s [4] modification of , and the new examples presented in [7].

In the final program we use integer coding for letters of the alphabet and cumulative integer lists for words. This makes detecting abelian squares fast. We use two different cumulative integer lists, cumulIntListRight for the right-hand extensions and cumulIntListLeft for the left-hand extensions. This makes the addition of integers (addition of cumulative integer lists, in fact) fast, but forces us to write the testing function testA2RLpairCumulIntList in a more complicated fashion (nevertheless, still maintaining the necessary high speed). In building all the structures, quite extensive precomputations are needed.

Now we define the alphabets by using letters (strings of length 1) and integers.

This first example using cumulative Parikh vectors is symbolic (Parikh vectors are explained in the Preliminaries section).

In actual computations we use the integer representation for these cumulative Parikh vectors.

Then we transform back to the string representation over three letters.

The case of four letters can be handled in a similar way.

Figure 7. Extend the given word alternately to the right and left. The word sw is favorable or so-far-favorable.

We have also taken care that the selection of the next proper candidate is done in an efficient way. If all the possible candidates for have already been tried out, then the program backtracks to change the other end’s suffix. Indeed, the process is the same for the right- and left-hand extensions (both of their cumulative list representations grow from left to right), so we can really speak of suffixes only. The lengths for the suffix and the extension need to be fixed at the beginning of the computation, and changing (re-fixing) the lengths usually requires new quite extensive precomputations.

In the program at the end of this article, the lengths have been set as and , but they can be, and usually are, selected differently as well. Up to now, we have been using, for example, the lengths , , and , . Longer suffixes would improve the computational time efficiency considerably, but, on the other hand, the structures might need too much memory in our present environment for distributed computing. Moreover, selecting would allow the maximal avoidance of unfavorable factors in the extensions. However, the setting probably allows detecting the abelian squares more quickly, albeit the final decision for this still requires quite extensive experiments.

In the following example, we add all possible words of length 4, represented by lists that follow all possible reduced suffixes of length 4. Here the term “reduced” means that we pay attention only to the structure of the word—the other possibilities can straightforwardly be obtained by permutations, that is, by renaming letters. The example originates from the three-letter case in which short abelian repetitions of the form or for a letter are allowed. For simplicity, we show the words in their string form.

The next input serves only descriptive purposes and is not intended for actual evaluation inside this notebook.

To save memory and to make the computations as fast as possible, the final calculations in fact use symbols instead of strings. The cumulative integer lists are associated to symbols by using rules. Moreover, to save memory, the permutations are added only when needed. In the following example, we show a rule from symbols to cumulative integer lists (in an abbreviated form).

ruleSymbolsToCumulIntegerLists = {aaab -> {0, 0, 0, 1}, aaba -> {0, 0, 1, 1}, aabb -> {0, 0, 1, 2}, aabc -> {0, 0, 1, 65537}, abaa -> {0, 1, 1, 1}, abac -> {0, 1, 1, 65537}, abbb -> {0, 1, 2, 3}, abbc -> {0, 1, 2, 65538}, abca -> {0, 1, 65537, 65537}, abcb -> {0, 1, 65537, 65538}, abcc -> {0, 1, 65537, 131073}, …, acbb -> {0, 65536, 65537, 65538}, bbba -> {1, 2, 3, 3}, …, bcaa -> {1, 65537, 65537, 65537}, ccca -> {65536, 131072, 196608, 196608}, …, cbaa -> {65536, 65537, 65537, 65537}};

In our present setting, we cut the suffix

from the cumulative integer list(s) for the word constructed so far and then select the new extension

from the precomputed symbolic list for

. After this, the corresponding cumulative integer list for

is to be added to (the cumulative integer list of) the previously constructed word. This is done in order to detect the possible new abelian squares quickly. One function we use for selecting the new extensions is tryProperExtensionVisList. Next we present, as an example, one rule (of a great many) connected to it. This time the example is from the four-letter,

case.

These kinds of precomputed rules are quite fast to use, but, of course, a considerable amount of memory is needed to store all the structures.

The following function generateRLthree (or, equivalently, generateRL) constructs the extensions for a given word. Its arguments are states (strings that are also values of the global variable state). As mentioned earlier, some of these states include "extendOrChangeRight", "extendOrChangeLeft", "testRight", "testLeft", "failBoth", and "succeeded". You can see the program for generateRLthree by opening the cell brackets at the end of this article. Note that even though the program was originally developed for the three-letter case (allowing short abelian repetitions of the form or for a letter ), it works perfectly for the four-letter case as well in all our settings. Therefore, the more generic name generateRL is used for it.

Before starting the construction, we need to initialize the global (at this stage) variables depending on whether we use three or four letters. This is accomplished by using the initialise function that has the following main structure.

initialise[howManyStepsLeftAndRight_,givenWordForTest_,alphLetInt_:

alphThreeLetInt]

The variables

and

stand for extension lengths. Both of these lengths should be equal and of the form

addWordLength for a positive integer

. In our examples, and in our present program, the value for addWordLength (length of

) is equal to 4. Here is the full definition of the initialise function.

Here is an example of the three-letter case. The program consists of initialization cells and will be activated automatically.

Thus, the given word is unfavorable.

The next examples deal with the four-letter case. First, we try to extend a word of length 20.

The extra condition in the While loop guarantees that the computation will not last too long.

Well, after all, the computation was not too long. At this point, we know only that the given word is so-far-favorable.

Here are some of the inner values and structures produced by the previous computation.

The lists indexListLeft and indexListRight contain the pointer values that indicate which words from the precomputed lists should be next catenated for testing to corresponding suffixes . Variables howFarExtendedLeft and howFarExtendedRight are used for efficient a-2-freeness testing and to find the next proper word for catenation as efficiently as possible.

Here we test that the extension reached is indeed an a-2-free word.

We now extend the same word "abacabadbacbcdbacdbd" of length 20 one more step (of length 4) to the right and left.

Once again, we know only that the given word is so-far-favorable.

Let us consider the following word of length 84.

In this case we know definitely that the given word of length 84 is unfavorable. Actually, it turns out that the given word is already the longest possible extension of "abacabadbacbcdbacdbd"! Note that the givenWordForTest has the form of "cbcacbcdcabdbabcdabcacdadbdadcad" "adbcacdacabdabcbdadcdbcbadbcbdbc". However, we will not elaborate on this here.

Here is another word to consider.

Thus, for the time being, our case is so-far-favorable. The computation did not take long even though the extension was quite large ( to both sides).

Here are some of the inner values and structures produced by the previous computation.

We now test that the extension reached really is an a-2-free word.

This is correct!

The final example shows that our word "abcbdbcbacbdcdacbdac" turns out to be unfavorable. However, this time the computation takes much more time than in the previous example.

We might have expected that the long buffers (found before this final trial) of length in both directions for "abcbdbcbacbdcdacbdac" would guarantee the given word to be favorable. Surprisingly enough, this turns out not to be the case.

Conclusions

Mathematica has versatile structures that firmly support the development of concepts, which has been extremely useful for constructing prototypes and full standalone programs. The use of Mathematica has enabled us to discover phenomena that previously were either unbelievable or very hard to experiment with. Even so, it would be useful, if in the future, in Mathematica, one could also use restricted pattern matching in a fast way as explained in the Introduction. Lastly, we expect that the presented program and its future updates will be used for a long time in our research.

Acknowledgments

We gratefully acknowledge the participation of the following individuals, most of whom have been students at the Rovaniemi Polytechnic/University of Applied Sciences over the course of the research from 1990 to 2006, who have written a number of computer programs to search for strings with desirable properties, helped in a crucial way to set up the computing environments, or made interactive graphical and musical representations of the structures. Starting year is given in parentheses: Kari Tuovinen (1990); Minna Iivonen, Anja Keskinarkaus, Marko Manninen (1993); Abdeljalil Chabani, Tomi Laakso (1994); Mika Moilanen, Juha Särestöniemi (1996); Juho Alfthan (1999); Olli-Pentti Saira (2000); Marja Kenttä, Ville Mattila (2001); Lauri Autio, Marianna Mölläri (2002); Antti Eskola (2003); Antti Karhu, Veli-Matti Lahtela, Olli-Pekka Siivola (2004); Esa Nyrhinen, Sami Vuolli (2005); Esa Taskila, Mikhail Kalkov (2006).

References

[1]	P. Erdös, “Some Unsolved Problems,” Magyar Tudomanyos Akademia Matematikai Kutato Intezet Kozlemenye, 6, 1961 pp. 221-254.
[2]	P. A. B. Pleasants, “Non-Repetitive Sequences,” Proceedings of the Cambridge Philosophical Society, 68, 1970 pp. 267-274.
[3]	V. Keränen, “Abelian Squares Are Avoidable on 4 Letters,” in Proceedings of the Nineteenth International Colloquium on Automata, Languages and Programming (ICALP ’92), Vienna (W. Kuich, ed.), Lecture Notes in Computer Science, 623, London: Springer-Verlag, 1992 pp. 41-52.
[4]	A. Carpi, “On the Number of Abelian Square-Free Words on Four Letters,” Discrete Applied Mathematics, 81(1-3), 1998 pp. 155-167. doi:10.1016/S0166-218X (97)88002-X.
[5]	V. Keränen, “Mathematica in Word Pattern Avoidance Research,” CADE—2007: Computer Algebra and Differential Equations (A. Mylläri, V. Edneral, and N. Ourusoff , eds.), Acta Academiae Aboensis, Series B, Mathematica et physica, 67(2), bo: bo Akademi University Press, 2007 pp. 12-27. urn.fi/URN:ISBN:978-951-765-403-6.
[6]	V. Keränen. “A-2-Free Endomorphisms and Substitutions over 4 Letters.” (Mar 29, 2007) www.south.rotol.ramk.fi/keranen/words2007/a2f.html.
[7]	V. Keränen, “A Powerful Abelian Square-Free Substitution over 4 Letters,” Theoretical Computer Science, 2009. doi:10.1016/j.tcs.2009.05.027.
[8]	R. L. Rivest. “Abelian Square-Free Dithering for Iterated Hash Functions.” Draft, Cambridge, MA: MIT (2005) www.theory.lcs.mit.edu/~rivest/publications.html.
[9]	E. Andreeva, C. Bouillaguet, P-A. Fouque, J. J. Hoch, J. Kelsey, A. Shamir, and S. Zimmer, “Second Preimage Attacks on Dithered Hash Functions,” in Advances in Cryptology, Proceedings of the Twenty-Seventh Annual International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT’08), Istanbul (N. P. Smart, ed.), Lecture Notes in Computer Science, 4965, Berlin: Springer, 2008 pp. 270-288.
[10]	S. Mäkelä, “Patterns in Words,” M.Sc. Thesis. University of Turku, Turku, Finland, 2002 (in Finnish).
[11]	P. Ochem and T. Reix. “Upper Bound on the Number of Ternary Square-Free Words.” (2006) www.lirmm.fr/~ochem/morphisms/wowa.pdf.
[12]	V. Keränen. “Forbidden Abelian Square-Free Factors over 4 Letters.” (Jun 8, 2004) www.south.rotol.ramk.fi/keranen/research/ForbiddenFactors.html.
[13]	G. Rozenberg and A. Salomaa, The Mathematical Theory of L-systems (Pure and Applied Mathematics), New York: Academic Press, 1980.
[14]	V. Keränen. “Programs & Notebooks & Packages (Mathematica Programs for Suppression of Unfavourable Factors in Pattern Avoidance).” (Mar 22, 2006) www.south.rotol.ramk.fi/keranen/research/UnfavourableFactorsInPatternAvoidance/Programs&Notebooks&Packages.html.
[15]	V. Keränen. “Avoidable Regularities in Strings: Structures, Programs, Graphics, and Music of A-2-Free Strings,1996-2009.” (2009) www.south.rotol.ramk.fi/keranen/StructuresGraphicsMusic.html.
Veikko Keränen, “Combinatorics on Words,” The Mathematica Journal, 2010. dx.doi.org/doi:10.3888/tmj.11.3-4.

About the Author

Veikko Keränen has been Principal Lecturer in Mathematics at the Rovaniemi University of Applied Sciences since 1987. He received his Ph.D. in mathematics from the University of Oulu in 1986. Keränen is one of the founding members, with Peter Mitic and Gautam Dasgupta, of the International Mathematica Symposium (IMS). His research areas include theoretical computer science, combinatorics on words, avoidable regularities in strings, and symbolic computation. Keränen has also been involved in mathematics teaching development, for example, through the Matriculation Examination Board and the National Board of Education of Finland.

Veikko Keränen
Rovaniemi University of Applied Sciences
Jokiväylä 11
96300 Rovaniemi
Finland
veikko.keranen@ramk.fi
veikko.keranen@gmail.com