## Introduction

The biochemistry of induction and progression of cancer is complex and highly variable from tumour to tumour.
Over 300 oncogenes and tumour suppressor genes have been documented, that is, genes that when mutated, amplified,
or partially deleted are associated with malignant transformation. It has been suggested that up to six of these
oncogenic mutations may be necessary for full expression of the malignant phenotype. For example, Hanahan and
Weinberg [

We postulate that cancer is the result of dysfunction of two cell cycle checkpoints: the retinoblastoma
protein-controlled late G1 checkpoint and the mitotic spindle assembly checkpoint (the M checkpoint).
Loss of control of these checkpoints has been described in cancer by many authors [

In order to describe the dynamics of complex interactive systems it is usually necessary to make simplifying
assumptions. An approach introduced by Von Neumann [

## Results

The finite state machine was first used to examine the consequences of a model of malignant progression in which the transformation of a normal cell into an invasive, metastatic malignant cell proceeds by a series of four or more somatic mutations. This model of malignant progression, which we term the multistage somatic mutation hypothesis, is summarised in Figure 1. It may be described briefly as follows. Malignant progression requires multiple sequential somatic mutations. Each of these mutations confers a survival advantage on the cells carrying it, so that the proportion of cells in the total population that carries that mutation increases. When the number of cells bearing the first mutation is sufficient, there is a high likelihood that a second mutation will occur, which will confer a further selective advantage on the doubly mutant cells, whose population then rises until a third mutation becomes probable, and so on.

We assume that the total number of dividing epithelial stem cells in a mouse is 1 × 10⁷, that the mutation rate for somatic mutations is 3 × 10⁻⁷ per cell division [The lower limit for macroscopic detectability of a tumour is assumed to be 10⁷ cells. A computer program that models the system of Figure 1, termed FINITE4, is listed in the supplementary material. Running the program with the parameter values given above predicted that the lifetime probability of tumour incidence in a mouse was about 0.5% (Simulation 1; complete output from the simulations is provided in the supplementary information). When the assumed mutation rate was decreased to 2 × 10⁻⁷, the predicted lifetime tumour incidence fell to less than 0.1% (Simulation 2). When the mutation rate was increased to its probable upper limit of 1 × 10⁻⁶, the model still did not predict the appearance of a palpable tumour within the 1000-day lifetime (Simulation 3).
It was concluded that the dynamics of the multistage somatic mutation model are incompatible with the observed lifetime incidence of spontaneous tumours in mice of a few percent.
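
The clonal-expansion argument of the multistage model rests on simple arithmetic, which the following Python sketch illustrates. It is not part of FINITE4: the function names are hypothetical, and the sketch assumes that every stem cell divides once per round and that mutations arise independently in each cell. It shows how many new mutants are expected per round of division, and how large a mutant clone must be before one further mutation per round becomes likely.

```python
# Back-of-envelope arithmetic for the multistage model (illustrative only).
# Assumption: each cell divides once per round; mutations are independent.

STEM_CELLS = 1e7  # dividing epithelial stem cells assumed in the text


def expected_new_mutants(cells: float, mutation_rate: float) -> float:
    """Expected number of newly mutated cells when every cell divides once."""
    return cells * mutation_rate


def clone_size_for_one_hit(mutation_rate: float) -> float:
    """Clone size at which one further mutation is expected per round."""
    return 1.0 / mutation_rate


# The three mutation rates used in Simulations 1 to 3.
for rate in (3e-7, 2e-7, 1e-6):
    print(
        f"rate={rate:g}: "
        f"new mutants per round={expected_new_mutants(STEM_CELLS, rate):.1f}, "
        f"clone size for next hit={clone_size_for_one_hit(rate):.3g}"
    )
```

At the default rate, a clone must approach a few million cells before the next mutation is expected within a single round of division, which is why each stage of the multistage model requires substantial clonal expansion.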

The alternative theory of malignant progression, the Duesberg hypothesis [

Consideration of the competition between the various normal and mutant cells suggested that a mutation that significantly
decreased the cell doubling time might overcome the competitive disadvantage resulting from aneuploidy. This situation,
shown in Figure 3, was modelled (Simulation 5). The first mutation, m1, represents loss of function of the G1 checkpoint,
with a resulting change in the cell doubling time from 24 hours to 12 hours. P2 cells have a functional M checkpoint,
so they are diploid. The second mutation, m2, causes defective function of the M checkpoint. It was assumed, initially,
that the cell loss factor of P3 cells was 0.1, and that mutation rates downstream from the appearance of aneuploidy
were increased by 100-fold. The model now predicted a lifetime tumour incidence in mice of 3.7%, in rough agreement with
observation. Spontaneous tumour incidence in mice varies between strains, but is typically a single-digit percentage [

These calculations suggested that loss of the G1 checkpoint and of the M checkpoint are necessary and sufficient for neoplastic transformation and malignant progression to occur (the two checkpoints theory of cancer). Loss of the two checkpoints enables a process of Darwinian selection in which the selective pressure is provided by competition for reproductive resources and genetic variability is provided by error-prone mitosis, a consequence of loss of the M checkpoint. This malignant progression algorithm (a form of genetic algorithm) is iterative, autocatalytic, and irreversible (Figure 4).

The malignant progression process is parameter-dependent. A half-log (approximately 3.2-fold) increase in the mutation rate for loss of G1 checkpoint function resulted in a three-fold increase in the predicted tumour incidence (Simulation 6). A half-log increase in the mutation rate m3 (the frequency of mutations downstream from loss of the M checkpoint) resulted in a large increase in predicted tumour incidence, with all mice predicted to develop tumours by the age of 26 months (Simulation 7). When the cell loss factor of M checkpoint-deficient cells was decreased by 50%, the predicted lifetime probability of tumour development increased by 113% (Simulation 8). When the cell loss factor was increased by 50%, the lifetime probability of tumour development decreased by 37% (Simulation 9). If the cell loss rate resulting from aneuploid cell division (ka in Figures 2 and 3) passed beyond a threshold value of 0.9, the doubly checkpoint-defective cells could not sustain themselves and malignant progression did not occur (Simulation 10). This suggests that a possible approach to tumour prevention may be to identify agents that cause selective apoptosis of cells with a defective M checkpoint.

In populations where the cell loss factor for aneuploid cells was assumed to have the default value, the time for progression to a detectable tumour depended upon the doubling time of G1 checkpoint-deficient cells. Unlike the M checkpoint, which has all-or-none function, the G1 checkpoint may have partial loss of function (e.g. resulting from decreased expression of p16). A partially functional G1 checkpoint will result in a doubling time that is shorter than normal, but longer than that of a cell that expresses the fully transformed phenotype. If the doubling time of cells with defective G1 checkpoint function was increased from the default value of 12 hr to 21 hr (only slightly shorter than the 24 hr doubling time of untransformed cells), the predicted probability of tumour occurrence was still 58% of the value with default parameters (Simulation 11). For tumour progression to occur, it is necessary for G1 checkpoint-defective cells to have a selective advantage over normal cells, but quite a small advantage is sufficient. The other parameter that directly affected tumour progression was the maximum domain cell count (as defined in “Description of the algorithm”): when this was reduced by one-third, the predicted tumour incidence decreased by one-third (Simulation 12), and when the maximum domain cell count was increased by one-third the tumour incidence increased by one-third (Simulation 13).

In summary, five parameters determine the ability of a tumour cell population to progress: its doubling time, its cell loss factor, the mutation rates for loss of checkpoint function (m1 and m2), and the maximum domain cell count.

We also modelled the situation where the end cells did not lose anchorage dependence (Simulation 14: modelled by setting m3 and m4 to zero). The eventual proportion of P3 cells (which are transformed but anchorage dependent) will depend upon the doubling time and the cells’ loss factor, in comparison with the other cell types. Eventually, one cell type will dominate the population, but so long as the cells are limited to growth on basement membrane, they do not constitute a malignant, invasive tumour. Tumour cells that retain anchorage dependence are regarded as the earliest transformed cells in the lineage of a tumour, and are termed “cancer stem cells”. These cells may be near-diploid but they are genetically unstable. However, without further mutations they do not form tumour growths. They remain in the place where they originated. In the case of epithelial tumours (such as skin cancers, and cancers of the breast, lung and colon) they need to be attached to basement membrane in order to survive. Adhesion of normal cells (or tumour stem cells) to basement membrane is complex, involving several families of cell surface receptors and associated signalling pathways. The predominant family of adhesion molecules involved in basement membrane attachment is the integrins. If a mutation occurs in an integrin molecule or (more commonly) its associated signalling pathway (m3 in Figure 3), a tumour stem cell may become able to survive without attachment to basement membrane. Normal epithelial cells that lose basement membrane attachment will die, because their survival signals require integrin signalling. A tumour stem cell that can survive without membrane attachment no longer has to grow as a flat sheet, but can grow into a three-dimensional lump. It is now said to be an invasive tumour. These cells are shown as P4 in Figure 3. Such tumours are usually curable by surgery, because although they may be invasive they remain localised at or near their site of origin.

There are other families of adhesion molecules that are involved in attachments between cells. The predominant family of cell-cell adhesion molecules is the cadherins. If cadherin signalling becomes non-functional (mutation m4 in Figure 3) the tumour cells can now survive without cell-cell attachment (P5 in Figure 3). These cells are now able to detach from the tumour mass, and may move to other parts of the body in the bloodstream or the lymphatic system, and give rise to secondary growths at distant sites. P5 cells are said to be metastatic tumour cells.

Spontaneous tumours occur in mice, but as in most short-lived species, they are comparatively rare. However, in transgenic mice carrying a mutation that disrupts or over-rides the G1 checkpoint, such as a constitutively activated H-ras [

## Discussion

Previous discussions of tumour progression as a process of Darwinian selection [

For these events to happen with sufficient frequency to generate a tumour, the population of aneuploid cells must be large enough that, despite high cell loss and the low frequency of the mutation leading to loss of contact inhibition, one or more cells bearing this mutation will survive and replicate. This will only occur if the initial, contact-inhibited aneuploid cells have a selective advantage over their diploid precursors. This is why loss of the G1 checkpoint must precede loss of the M checkpoint, and why a premalignant stage must precede full malignancy: to establish a critical mass of premalignant, checkpoint-defective cells, so that the product of the probability of the mutation resulting in a non-contact-inhibited cell and the size of the population of cells at risk becomes great enough to overcome the unfavourable population dynamics (Figure 5). This interpretation of the required sequence of checkpoint loss is supported by the fact that while premalignant lesions (G¯M+) are common, there are no reports of aneuploid tumours that have an intact G1 checkpoint (G+M¯). The finite state machine predicted that M checkpoint deletion in the absence of G1 checkpoint dysfunction could in principle result in tumour progression, but only at combinations of high mutation rate and low cell loss factor unlikely to be encountered in practice.

All these features are captured in our malignant progression algorithm (Figure 4). The algorithm resembles a classical genetic algorithm [

It must be emphasised that the G1 checkpoint is complex, and has multiple functions: control of progression into S phase in response to growth factors, determination of whether a cell in G1 is destined to proliferate, remain static, enter apoptosis, or senesce, and responding to DNA damage by entering cell cycle arrest until the damaged DNA is repaired (or failing that, to undergo apoptosis). Although we argue here that all tumour cells have a dysfunctional G1 checkpoint, this does not necessarily mean that all these functions are lost. Unlike the M checkpoint [

These then are the defining characteristics of the malignant progression algorithm: loss of the G1 checkpoint provides a competitive survival advantage; loss of the M checkpoint provides the required genetic variability; Darwinian selection results and increases the proportion of transformed cells in the total cell population. The process is iterative, autocatalytic, and irreversible. The order of loss of the two checkpoints is essentially obligatory, and is determined by the population dynamics of the system. In all these characteristics, malignant progression follows a classical genetic algorithm. However, it differs in one important respect. As usually implemented, genetic algorithms have a fixed objective function, so the system evolves to improve the goodness of fit (Darwinian “fitness”). In the malignant progression algorithm, cells are initially selected to give maximal proliferation within the constraints imposed upon normal cells (e.g. fixed growth area), but following loss of the M checkpoint and resulting loss of anchorage dependence, this constraint is removed, and cells are free to invade other spaces – in other words, the objective function has now changed. The process thus falls into two stages, selection for resources within the constraint of anchorage dependence, and unrestrained proliferation once that constraint is removed.

In this sense, the malignant progression algorithm reproduces in microcosm certain aspects of the process of evolution. Species that are optimally adapted to their environment appear through natural selection. However, if the environmental constraints change, perhaps because of a change in climate, or food availability, or because the species extends its geographic range, the constraints alter, and the selection process may now favour different genetic variants that have an advantage under the new conditions. This is the basis of speciation, and the second phase of malignant progression, in which a tumour becomes invasive and metastatic, appears to follow similar dynamics. The origin of species by natural selection has been described as an algorithm [

Our studies have modelled the situation where normal and premalignant cells are restricted to growth on basement membrane, and must compete for space. This describes the kinetics of many epithelial tissues. It will be interesting to model other kinds of tissue kinetics, for example, the situation in intestinal villi, where cells originate in the crypts, and progress through a finite number of divisions (moving along the villi as they do so) and are finally sloughed into the gut lumen [

The two checkpoints theory of cancer has implications for selection of anticancer drug targets. Given that the multiple routes through the malignant progression process first diverge, then converge, are targets early or late in the progression cascade likely to lead to broader-spectrum drugs than targets in the middle of the process? Given the pivotal role of the M checkpoint in tumour progression, will drugs that act on this checkpoint, e.g. inhibitors of aurora kinase B [

The two checkpoints theory also has clear implications for cancer prevention strategies. Mutations to pre-malignancy are essentially inevitable. They can be increased (e.g. by X-irradiation or ultraviolet radiation exposure) but not decreased. In contrast, it may be possible to find pharmacological approaches to minimise the progression process. The computational approach used in the present study can be extended to explore ways of doing this.

## Description of the algorithm

Depending upon the status of the two checkpoints, we consider cells as having one of four genotypes: G+M+, G¯M+, G+M¯ or G¯M¯. G+M+ cells, with both checkpoints fully functional, are considered to be normal cells. G+M¯ cells and G¯M¯ cells, with a dysfunctional M checkpoint, are cancer cells [

G+ cells are assumed to have a fixed doubling time (24 hr for the purpose of our model). G¯ cells may have different doubling times, but these will be the same as for G+ cells, or shorter, because loss of G1 checkpoint function cannot slow down the cellular growth rate. For the purpose of most of our simulations we make the simplifying assumption that G¯ cells have a doubling time of 12 hr. Provided that the G¯ cells have a selective advantage over G+ cells, however slight, the overall dynamics of the system do not change.
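
The selective advantage conferred by a shorter doubling time can be made concrete with a short Python sketch. It is an illustration only, assuming unconstrained exponential growth over one time interval; the function names are hypothetical and are not taken from the author's programs.

```python
# Relative growth of G1 checkpoint-defective (G-) cells versus normal (G+) cells.
# Assumption: unconstrained exponential growth during the interval.


def proliferation_factor(interval_hr: float, doubling_time_hr: float) -> float:
    """Factor by which a population multiplies during interval_hr."""
    return 2.0 ** (interval_hr / doubling_time_hr)


def relative_advantage(
    interval_hr: float, dt_mutant_hr: float, dt_normal_hr: float = 24.0
) -> float:
    """Growth of mutant cells relative to normal cells over the same interval."""
    return proliferation_factor(interval_hr, dt_mutant_hr) / proliferation_factor(
        interval_hr, dt_normal_hr
    )


# Default case from the text: 12 hr versus 24 hr doubling time.
print(relative_advantage(24.0, 12.0))
# Partial loss of G1 checkpoint function: 21 hr versus 24 hr.
print(relative_advantage(24.0, 21.0))
```

With the default doubling times, G¯ cells gain a two-fold advantage per day. With a 21 hr doubling time the daily advantage is only about 10%, yet this compounds at every step in which the cells compete for space.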

M+ cells are assumed to replicate faithfully: their daughter cells will have the same genotype and phenotype as the parent cell except for the rare occasions (about one in every few million cell divisions) when a somatic mutation occurs. M+ cells are assumed to be anchorage-dependent and contact-inhibited: they will replicate until the space available is fully occupied, and then stop replicating. M¯ cells have a finite probability (here termed m3) of changing at each cell division. The possible changes are loss of anchorage dependence and more rapid doubling time. In addition, some fraction, ka, of M checkpoint-deficient cells is assumed to undergo apoptosis; those cells that survive the first doubling may then be apoptosis-resistant. However, cells that have lost anchorage dependence and enter the circulation are subject to destruction by various mechanisms (e.g. NK cell-mediated cytotoxicity), so the overall cell loss rate may increase at this stage.

According to this scheme, the four genotypes can have one of seven phenotypes (states) as follows, where DT = doubling time in hours, A = anchorage dependence, + or –, and where cells that have lost basement membrane dependence may or may not retain cell-cell contact-dependence (Table 1). These seven phenotypes constitute the states of the finite state machine. Not all seven phenotypes seem to occur naturally: those labelled P6 and P7 have combinations of mutations that are theoretically possible but do not seem to be observed in practice.

Those phenotypes that are found naturally are labelled P1 to P5 in Table 1. P1 cells are normal, and P2 cells are pre-malignant – i.e. they have a dysfunctional G1 checkpoint, but they are diploid. P3 cells are tumour stem cells: they lack both the G1 checkpoint and the M checkpoint, but are otherwise minimally transformed. P4 cells are aneuploid and have lost attachment-dependence: they are invasive but not metastatic. P5 cells are aneuploid, have acquired additional mutations, and show the fully malignant phenotype, i.e. they are invasive and metastatic.

The possible transitions between these cell types, with their associated probabilities, are shown in Figures 1–3.

### Transition rules

The numbers of cells in populations P1, P2, …, P5 at time t are N1(t), N2(t), …, N5(t). Start with N1(0) cells in population P1 at time zero and all other populations at zero (i.e. simulations start with all normal cells). At discrete time intervals, calculate the number of new cells in each population from the previous cell number, the time interval, and the doubling time. From the number of doublings and the transition probabilities (mutation rates), calculate the population transitions:

$$
\begin{aligned}
\Delta P_1 &= -P_1 m_1 \\
\Delta P_2 &= P_1 m_1 - P_2 m_2 \\
\Delta P_3 &= P_2 m_2 - P_3 (m_3 + k_a) \\
\Delta P_4 &= P_3 m_3 - P_4 m_4 \\
\Delta P_5 &= P_4 m_4
\end{aligned}
$$
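
The transition rules above can be sketched in Python as follows. This is a minimal illustration, not the author's program: the function name and the example parameter values are assumptions, and the rates are applied once per call rather than per doubling.

```python
from typing import List, Sequence


def transition_deltas(
    pops: Sequence[float],
    m1: float,
    m2: float,
    m3: float,
    m4: float,
    ka: float,
) -> List[float]:
    """Return the changes in P1..P5 for one application of the transition rules.

    pops holds the current cell numbers of P1..P5; m1..m4 are the mutation
    rates and ka is the cell loss factor applied to P3.
    """
    p1, p2, p3, p4, _p5 = pops
    return [
        -p1 * m1,                   # P1 loses cells to P2
        p1 * m1 - p2 * m2,          # P2 gains from P1, loses to P3
        p2 * m2 - p3 * (m3 + ka),   # P3 gains from P2, loses to P4 and to cell death
        p3 * m3 - p4 * m4,          # P4 gains from P3, loses to P5
        p4 * m4,                    # P5 gains from P4
    ]


# Illustrative values only (hypothetical, not the defaults of Table 2).
deltas = transition_deltas([1e7, 0.0, 0.0, 0.0, 0.0], 3e-7, 3e-7, 3e-5, 3e-5, 0.1)
print(deltas)
```

Note that the deltas sum to the cell loss term alone: cells moved by mutation are conserved, and only the ka term removes cells from the system.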

### Cell domains

The genetic algorithm assumes that cells compete for resources, but a particular cell does not compete with every other cell in the entire body. The area within which competition takes place is geographically restricted. Thus, when a skin stem cell is infected by papilloma virus, its G1 checkpoint is overridden and a wart develops. However, it does not cover the whole skin surface: the growth of the wart is limited to the area fed by a single afferent capillary. This area will typically contain from less than one hundred to a few hundred stem cells. We shall refer to this area of stem cells (within which competition for space and for nutrients takes place) as a domain. Similarly, it is estimated that a single intestinal crypt, fed by a single afferent capillary, contains about sixty stem cells [

### The objective function

If the total number of anchorage-dependent cells, Ctotal = N1 + N2 + N3, is greater than its allowed maximum (AP), the populations of P1, P2 and P3 are reduced in proportion:

$$
N_{i,t+1} = \frac{N_{i,t}}{C_{\text{total}}} \times \mathrm{AP}, \qquad i = 1, 2, 3
$$

There is no maximum permitted cell number for non-contact-inhibited cells.
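
The re-proportioning step can be sketched in Python as follows. This is a minimal illustration under the rule stated above; the function name `apply_domain_limit` is hypothetical.

```python
from typing import Tuple


def apply_domain_limit(
    n1: float, n2: float, n3: float, max_cells: float
) -> Tuple[float, float, float]:
    """Scale the anchorage-dependent populations P1..P3 down to the domain maximum.

    If the total is within the allowed maximum (AP in the text), the
    populations are returned unchanged; otherwise each is reduced in proportion.
    """
    total = n1 + n2 + n3
    if total <= max_cells:
        return n1, n2, n3
    scale = max_cells / total
    return n1 * scale, n2 * scale, n3 * scale


# Example: 120 anchorage-dependent cells competing for a 60-cell domain.
print(apply_domain_limit(60.0, 30.0, 30.0, 60.0))
```

Because every population is scaled by the same factor, the step preserves the relative proportions of P1, P2 and P3, so a faster-growing population gains ground at each iteration.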

**Default parameter values** were as shown in Table 2.

### Computer programs

The programs used to run simulations 1 – 18 are listed in the Supplementary Information. An outline of the algorithm is given in the appendix.

## Appendix

### The Genetic Algorithm implemented as a Finite State Machine

Initialise cell populations P1 to P6 and total cell number

↓

Calculate proliferation factors for P1 to P6

↓

Begin iterative loop

↓

Calculate population transitions from cell numbers and mutation rates

↓

Update population numbers to allow for cell transitions

↓

Calculate proliferation of P1 to P6

↓

Call objective function, and re-proportion cell numbers

↓

If tumour size > evaluation size, exit; otherwise repeat loop.
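
The outline above can be sketched as a short deterministic Python loop. This is not FINITE4 or any of the author's programs. It makes several labelled assumptions: it tracks only the five populations P1 to P5 for which transition rules are given (the outline mentions P1 to P6), it applies the mutation rates once per time step, it treats "tumour size" as P4 + P5, and all names and example parameter values are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def simulate(
    n1_start: float,
    doubling_times_hr: Sequence[float],         # one value per population P1..P5
    rates: Tuple[float, float, float, float],   # m1, m2, m3, m4
    ka: float,                                  # cell loss factor for P3 (assumes m3 + ka < 1)
    max_domain_cells: float,                    # maximum anchorage-dependent cells
    detect_size: float,                         # evaluation size for the tumour
    step_hr: float = 24.0,
    max_days: int = 1000,
) -> Tuple[Optional[float], List[float]]:
    """Return (day of detection or None, final populations P1..P5)."""
    m1, m2, m3, m4 = rates
    # Initialise cell populations: all cells start as normal (P1).
    pops = [float(n1_start), 0.0, 0.0, 0.0, 0.0]
    # Calculate the proliferation factor per time step for each population.
    factors = [2.0 ** (step_hr / dt) for dt in doubling_times_hr]

    n_steps = int(max_days * 24.0 / step_hr)
    for step in range(1, n_steps + 1):
        p1, p2, p3, p4, _p5 = pops
        # Population transitions from cell numbers and mutation rates.
        deltas = [
            -p1 * m1,
            p1 * m1 - p2 * m2,
            p2 * m2 - p3 * (m3 + ka),
            p3 * m3 - p4 * m4,
            p4 * m4,
        ]
        # Update population numbers to allow for cell transitions.
        pops = [n + d for n, d in zip(pops, deltas)]
        # Proliferation of each population.
        pops = [n * f for n, f in zip(pops, factors)]
        # Objective function: re-proportion the anchorage-dependent cells.
        anchored = sum(pops[:3])
        if anchored > max_domain_cells:
            scale = max_domain_cells / anchored
            pops[:3] = [n * scale for n in pops[:3]]
        # Exit when the tumour (assumed here to be P4 + P5) exceeds the evaluation size.
        if pops[3] + pops[4] > detect_size:
            return step * step_hr / 24.0, pops
    return None, pops


# Illustrative run with hypothetical parameter values.
day, final = simulate(
    n1_start=100.0,
    doubling_times_hr=[24.0, 12.0, 12.0, 12.0, 12.0],
    rates=(3e-7, 3e-7, 3e-5, 3e-5),
    ka=0.1,
    max_domain_cells=100.0,
    detect_size=1e7,
)
print(day, final)
```

A deterministic loop of this kind tracks expected (fractional) cell numbers, so it yields a time to detection rather than a probability of tumour incidence; estimating incidence would require a stochastic treatment of the rare mutation events.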

## Acknowledgements

The author thanks Dr Fordyce Davidson, Division of Mathematics, University of Dundee, for helpful advice and discussions.

## References

- Hanahan D, Weinberg RA. The hallmarks of cancer. Cell. 2000 Jan 7;100(1):57-70.
- Cahill DP, Kinzler KW, Vogelstein B, Lengauer C. Genetic instability and darwinian selection in tumours. Trends Cell Biol. 1999 Dec;9(12):M57-60.
- Kops GJ, Weaver BA, Cleveland DW. On the road to cancer: aneuploidy and the mitotic checkpoint. Nat Rev Cancer. 2005 Oct;5(10):773-85.
- Duesberg P, Li R, Fabarius A, Hehlmann R. Aneuploidy and cancer: from correlation to causation. Contrib Microbiol. 2006;13:16-44.
- Duesberg P. Chromosomal chaos and cancer. Sci Am. 2007 May;296(5):52-9.
- Musacchio A, Salmon ED. The spindle-assembly checkpoint in space and time. Nat Rev Mol Cell Biol. 2007 May;8(5):379-93.
- Lane DP. Cell immortalization and transformation by the p53 gene. Nature. 1984 Dec 13-19;312(5995):596-7.
- Aguda BD, Tang Y. The kinetic origins of the restriction point in the mammalian cell cycle. Cell Prolif. 1999 Oct;32(5):321-35.
- Von Neumann J (1963). General and logical theory of automata, in “John von Neumann: Collected Works”, (AH Taub, editor), vol. 5, pp. 288-328. New York, Pergamon.
- Gill A (1970). Introduction to the Theory of Finite-State Machines. New York, McGraw-Hill.
- Wagner F, Schmuki R, Wagner T, Wolstenholme P (2006). Modeling Software with Finite State Machines.
- Alon U. An Introduction to Systems Biology: Design Principles of Biological Circuits. London, Chapman & Hall 2007.
- Simons BD, Clevers H. Stem cell self-renewal in intestinal crypt. Exp Cell Res 2011; 317: 2719-2724.
- Lévi F, Filipski E, Iuriski I, Li XM, Innominate P. Cross-talks between circadian timing system and cell division cycle determine cancer biology and therapeutics. Cold Spring Harb Symp Quant Biol 2007; 72: 465-475.
- Jackson RC. The Theoretical Foundations of Cancer Chemotherapy Illustrated by Computer Models. New York, Academic Press 1992.
- Glover DM. Mitosis in Drosophila. J Cell Sci 1989; 92: 137-146.
- Harnden DG (1976), in Scientific Foundations of Oncology (T Symington and RL Carter, eds), pp. 181-190. Chicago, Heinemann.
- Hanahan D, Wagner EF, Palmiter RD. The origins of oncomice: a history of the first transgenic mice genetically engineered to develop cancer. Genes Dev. 2007 Sep 15;21(18):2258-70.
- Holland JH. Adaptation in Natural and Artificial Systems. Ann Arbor, Univ. of Michigan Press 1975.
- Holland JH. Emergence from Chaos to Order. 258 pp. Oxford University Press 1998.
- Mitchell M. An Introduction to Genetic Algorithms. Cambridge MA: MIT Press 1996.
- Mistry HB, MacCallum DE, Jackson RC, Chaplain MA, Davidson FA. Modeling the temporal evolution of the spindle assembly checkpoint and role of Aurora B kinase. Proc Natl Acad Sci U S A. 2008 Dec 23;105(51):20215-20.
- Dennett DC. Darwin’s Dangerous Idea. New York, Simon and Schuster 1995.
- Dawkins R. The Selfish Gene (2nd edition). Oxford University Press 1989.
- Snippert HJ, van der Flier LG, Sato T, van Es JH, van den Born M, Kroon-Veenboer C, Barker N, Klein AM, van Rheenen J, Simons BD, Clevers H. Intestinal crypt homeostasis results from neutral competition between symmetrically dividing Lgr5 stem cells. Cell. 2010 Oct 1;143(1):134-44.