Unorganized machines and the brain

In a fascinating and farsighted report written in 1948, Alan Turing suggested that the infant human cortex was what he called an unorganized machine (1). In what follows, I discuss Turing’s ground-breaking ideas on unorganized machines and intelligence, with working examples of such machines, which I derived in 1994.

Turing defined the class of unorganized machines as largely random in their initial construction, but capable of being trained to perform particular tasks. There is good reason to consider the cortex unorganized in this sense. The DNA that controls the construction of the central nervous system has insufficient storage capacity to specify exactly the position and connectivity of every neurone, and by not hard-wiring brain function before birth we are able to learn language and other socially important behaviours that carry great evolutionary advantage.

Turing gives two examples of artificial unorganized machines, which he claims are about the simplest possible models of the nervous system. The first type is the A-type machine – a randomly connected network of NAND gates, in which every node has two states representing 0 or 1, two inputs and any number of outputs. The second type Turing calls the B-type machine – this is derived from any A-type network by intersecting every inter-node connection with a construction of three further A-type nodes which form a connection modifier, as shown in Figure 1. Nodes are shown as circles, and in Figure 1 the connection modifier intersects a length of circuit cd. Arrowheads on connecting circuits show the direction in which binary pulses flow. The values in nodes x and y in Figure 1 affect the behaviour of the circuit they are connected to, acting like a kind of memory unit. B-type networks are therefore a special and more interesting case of A-type networks, as the connection modifiers greatly facilitate training by an external agent by allowing functional modifications to be made at any point in the network. This is an advantage that would be very unlikely to arise spontaneously from a large randomly connected A-type network.

Figure 2 below is a non-randomly connected example of a functional B-type network, which counts to 10 and then starts again. The red circles marked m show three of the connection modifiers, which have been placed on every inter-node connection in the network. For clarity, each connection modifier (Figure 1) is reduced to three small circles in Figure 2. All connection modifiers have been set to the correct values for the network to operate as a decimal counter. The blue circle marked s indicates the start node, which holds a value of 1. When this B-type machine operates, the value at s traverses the network in an anti-clockwise direction.
Figure 1: the B-type connection modifier. Figure 2: the B-type decimal counter network.
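
To make the A-type model concrete, here is a minimal simulation sketch (my own illustration, not taken from Turing’s report): every node is a two-input NAND gate, all nodes update in lockstep, and the wiring is chosen at random. The node names and the synchronous update scheme are illustrative assumptions; a B-type network would additionally route every connection through a three-node modifier like the one in Figure 1.

```python
import random

def step(state, wiring):
    """Advance an A-type network by one synchronous tick.
    state  : dict mapping node name -> 0 or 1
    wiring : dict mapping node name -> (a, b), the two nodes whose outputs it reads
    Every node is a two-input NAND gate: next value = 1 - (a AND b)."""
    return {node: 1 - (state[a] & state[b]) for node, (a, b) in wiring.items()}

# A small randomly wired A-type network: each node reads two randomly chosen nodes.
random.seed(0)
nodes = [f"n{i}" for i in range(8)]
wiring = {n: (random.choice(nodes), random.choice(nodes)) for n in nodes}
state = {n: random.randint(0, 1) for n in nodes}

for t in range(5):
    print(t, [state[n] for n in nodes])
    state = step(state, wiring)
```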

Turing also discussed input, output and training structures for B-type machines (1, 2). Figure 3 shows a B-type network capable of computing the logical functions OR and XOR (exclusive OR). Input leads i1 and i2 have their values held constant until the output stabilises at the output node o on the right. The connection modifier m1 is a standard modifier. However, m2 has had two internal connections stripped off to create two “interfering” or training inputs, t1 and t2, which can be used to dynamically alter the behaviour of the network. If t1 is set to 0 and t2 is set to 1, this B-type network performs logical OR. If t2 is then set to 0, the network switches to performing logical XOR. Turing intended interfering inputs such as t1 and t2 to be used by an external agent to train a network (although he also discussed networks capable of self-modification). Because most connection modifier settings do not result in useful network behaviour, training would involve a systematic process in which modifier settings are changed until a set is found that makes the network perform a desired task. A set of useful modifier settings would constitute a kind of fixed program for the network. This program could be overwritten as desired to change the function of the network, or critical training leads such as t1 and t2 could be used to switch from one desired function to another during operation. Very large networks would thus be capable of complex behaviour by way of such training and functional manipulation. Figures 2 and 3 are very small networks that might be used as functional components in a much larger (and more interesting) network.

Figure 3: a WB-type network that computes OR or XOR, depending on the training inputs t1 and t2.
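
Since Figure 3 itself is not reproduced here, the sketch below is a hypothetical NAND-only circuit (my own construction, not Turing’s wiring) that shows how two interfering leads can switch a network between OR and XOR in the way just described: with t1 = 0 and t2 = 1 it computes OR, and with t1 = 0 and t2 = 0 it computes XOR.

```python
def nand(a, b):
    """Two-input NAND, the only primitive used in A- and B-type nodes."""
    return 1 - (a & b)

def net(i1, i2, t1, t2):
    """Illustrative NAND-only circuit with two 'interfering' leads t1 and t2.
    t1=0, t2=1 -> OR;  t1=0, t2=0 -> XOR.
    (Reproduces the behaviour described for Figure 3, not its exact wiring.)"""
    n_ab  = nand(i1, i2)                                   # NAND(i1, i2)
    or_ab = nand(nand(i1, i1), nand(i2, i2))               # OR(i1, i2)
    gate  = nand(nand(t2, t2), nand(n_ab, n_ab))           # OR(t2, NAND(i1, i2))
    core  = nand(nand(or_ab, gate), nand(or_ab, gate))     # AND(OR, gate)
    not_t1 = nand(t1, t1)                                  # NOT t1
    return nand(nand(core, not_t1), nand(core, not_t1))    # AND(core, NOT t1)

for t2, name in [(1, "OR "), (0, "XOR")]:
    print(name, [net(i1, i2, 0, t2) for i1 in (0, 1) for i2 in (0, 1)])
```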

While simulating these networks it quickly became apparent that the syntax for B-types, as Turing described it, has a serious limitation: if a connection modifier is placed on every inter-node connection, the resulting network cannot perform all logical operations. I believe that the simplest solution to this problem, using the structures Turing defined, is to allow a mixture of A-type and B-type connections within the same network – a solution I call a WB-type network – click here for more. A WB-type network can perform all logical operations, yet retains the important capacity of being trainable. For this reason Figure 3 is a WB-type network, containing some A-type connections (without connection modifiers), as are our larger evolved Turing networks elsewhere. Copeland and Proudfoot propose another, particularly redundant and, I believe, less satisfactory solution, which requires the invention of a new connection modifier that figures nowhere in Turing’s original paper (1, 2). Inconsistencies and errors also exist in their publications on Turing’s neural networks, which I discuss elsewhere (2-4) (see Copeland and Proudfoot miss the mark).

Building brain-like networks

Figures 2 and 3, while functional B-type machines, are not constructed in the way Turing claimed was required to model the brain – primarily because they are too orderly. To develop Turing’s idea of building a brain-like B-type machine we need to mirror the brain’s own development. The proliferation of neurones during the brain’s formation involves a substantial random element, and only later is this growth fine-tuned by killing off the cells that have grown in the wrong places. This process of weeding out, called programmed cell death, is essential to the development of intelligence, and it means that we start off with many more brain cells than we actually need to function as a normal adult. The neurones that remain grow interconnecting fibres and start firing. Fibres that make successful connections with other neurones are strengthened, while those that fail to connect atrophy in another selective process known as dendritic pruning. These two processes set up the neural machinery that allows us to begin learning once we are born. Years of learning and practice further fine-tune our neural connections until we are capable of the behavioural and cognitive complexities of an adult human.

A B-type cortex would begin with a very large number of nodes and follow a developmental path with the same delicate mix of the random and the determined as a living brain. At a magnification where individual nodes and connections could be seen, the resulting very large B-type network would typically look much like a bowl of spaghetti. Such a disorderly structure is prone to forming feedback loops of varying lengths, which take varying times to traverse and thus form possible delay or memory circuits. In a large network these loops can lead to greatly varying patterns of activity, regardless of input, since activity can be perpetually recycled in a complex manner. The activity in many conventional neural networks stops when the output layer settles into a stable pattern – the equivalent of a Turing machine halting, its computation over. But just as the brain does not halt, large B-type networks will tend not to either.
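
As a rough illustration of this non-halting behaviour, the following sketch (using the same assumed synchronous NAND-node model as above) steps a randomly wired network with no external input until its global state first repeats; from that point the activity cycles indefinitely rather than settling.

```python
import random

random.seed(1)
N = 12
# Randomly wired two-input NAND nodes: a "bowl of spaghetti" full of feedback loops.
wiring = [(random.randrange(N), random.randrange(N)) for _ in range(N)]
state = tuple(random.randint(0, 1) for _ in range(N))

seen = {}          # global state -> first time step at which it occurred
t = 0
while state not in seen:
    seen[state] = t
    state = tuple(1 - (state[a] & state[b]) for (a, b) in wiring)
    t += 1

print(f"state first repeats after {t} steps; "
      f"activity then cycles with period {t - seen[state]} forever")
```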

Self-stimulating feedback loops, variable-length circuits and continuously varying patterns of activity are all known features of the central nervous system and have been implicated in many cognitive processes and in intelligence itself (5). A recent view of the brain called Dynamicism stresses the central importance of self-stimulating feedback loops in almost every aspect of brain function (6). According to this view, information is not encoded in individual cells but rather in waves of excitation which sweep the brain like ripples on a pond. A new stimulus causes new ripples, but it also interferes with the old ripples, which are memories, making the overall activity pattern distributed and complex. Your brain is continuously active, even when you are asleep, because the majority of inputs to a neurone come from feedback loops, not from the world. B-type networks, with their propensity to form loops of various lengths, may be well suited to modelling the kind of massive, widespread feedback and interacting waves of activity that modern theories like Dynamicism imply.


Turing’s genetical search

Changing the settings of the connection modifiers in a B-type network changes its function. However, in any moderately sized network there will be a very large number of possible patterns of modifier settings, only a tiny fraction of which will be useful. Any attempt to find which setting patterns constitute useful functions by exhaustively trying out all the possibilities quickly becomes intractable as the number of nodes increases. Appropriate patterns must instead be discovered by empirical means. Turing himself mentioned the method I believe to be the most promising for solving the B-type training problem: a genetic algorithm (GA), or, as Turing called it before the term GA was coined, a genetical search.

Genetic algorithms are an efficient form of search that allows a desirable set of values to be found in the very large space of all possible values for a particular problem. A GA mimics the process of natural selection by setting up a population of artificial “organisms” and allowing them to reproduce based on selection pressures defined by the user. For example, if we intended to produce a B-type network capable of binary addition by this method, we would create a population of randomly connected B-type networks and test each one on each of the four input combinations of the goal task. Each B-type network would get a score in the range of 0 to 4, depending on the number of additions it got right. Initially some individual networks in the population would score better than others, even if only by chance. These scores would become the fitness measures of the individual networks and would dictate their number of offspring. The fittest networks would be disproportionately over-represented in the next generation, while poorer-scoring networks would be under-represented or drop out of the population altogether. Artificial sexual reproduction is then typically employed, in which artificial genes are swapped between paired-off parents ranked in order of fitness, producing new networks that are composites of their parents. If this test-and-reproduce cycle is repeated for many generations, individual networks will become better at the goal task until eventually a network is created which gains a perfect score. Click here to see a successful binary addition network.
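
The sketch below is a toy version of such a genetical search, with several simplifying assumptions: it evolves the raw wiring of a small network of NAND nodes rather than the modifier settings of a fixed B-type net, it uses truncation selection with one-point crossover and mutation, and all parameters (node count, population size, mutation rate) are arbitrary. Fitness is the 0-to-4 score described above, one point for each input case where the evolved network produces the correct sum and carry bits.

```python
import random

N_NODES, STEPS, POP, GENS, MUT = 12, 8, 60, 200, 0.05
SRC = N_NODES + 2  # a gate input may read any internal node or either external input

def random_genome():
    # One gene per node: the pair of sources its NAND gate reads.
    return [(random.randrange(SRC), random.randrange(SRC)) for _ in range(N_NODES)]

def outputs(genome, i1, i2):
    """Clamp the two inputs, run the NAND network synchronously for STEPS ticks,
    then read nodes 0 and 1 as the sum and carry bits."""
    state = [0] * N_NODES
    for _ in range(STEPS):
        signals = state + [i1, i2]
        state = [1 - (signals[a] & signals[b]) for (a, b) in genome]
    return state[0], state[1]

def fitness(genome):
    # Score 0-4: one point per input case where both sum and carry are correct.
    return sum(outputs(genome, i1, i2) == ((i1 + i2) % 2, (i1 + i2) // 2)
               for i1 in (0, 1) for i2 in (0, 1))

def breed(p1, p2):
    cut = random.randrange(N_NODES)  # one-point crossover, then per-gene mutation
    return [(random.randrange(SRC), random.randrange(SRC)) if random.random() < MUT else gene
            for gene in p1[:cut] + p2[cut:]]

random.seed(2)
population = [random_genome() for _ in range(POP)]
for gen in range(GENS):
    ranked = sorted(population, key=fitness, reverse=True)
    if fitness(ranked[0]) == 4:
        print(f"perfect one-bit adder found at generation {gen}")
        break
    parents = ranked[:POP // 2]  # truncation selection: only the fitter half reproduces
    population = parents + [breed(random.choice(parents), random.choice(parents))
                            for _ in range(POP - len(parents))]
else:
    print("no perfect adder yet; best score so far:", max(fitness(g) for g in population))
```

With settings this small a perfect adder may or may not appear within the allotted generations; a larger population or more generations improves the odds, at the cost of more evaluations.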

This process is in effect a replacement for training, since it progressively discards networks that cannot perform the task until it comes across a network configuration that can – the resulting network is the end point of an empirical search. This is appropriate for constructing networks with a fixed function for simple tasks such as binary addition. However, GAs can also be used to find network configurations for more complex tasks, such as creating a network capable of human-like adaptation or learning. Although this is considerably more difficult, a GA could be used to construct a network with an appropriate structure to allow it to carry out the more subtle self-modifications involved in learning by itself or in heeding instruction. Such a B-type network would be Turing’s unorganized-machine cortex, ready to assimilate information from its environment just as a human brain is primed to learn language after birth.

Many thanks to my brother, Bruce Webster, who wrote the simulator which produced the diagrams on this page.

References

  1. Turing AM. Intelligent Machinery. In: Ince DC, editor. Collected works of A. M. Turing: Mechanical Intelligence. Elsevier Science Publishers, 1992.
  2. Webster CS. Alan Turing’s unorganized machines and artificial neural networks – his remarkable early work and future possibilities. Evolutionary Intelligence 2012; 5: 35-43.
  3. Copeland BJ, Proudfoot D. On Alan Turing’s anticipation of connectionism. Synthese 1996; 108: 361-377.
  4. Copeland BJ, Proudfoot D. Alan Turing’s forgotten ideas in computer science. Scientific American 1999; 280(4): 76-81.
  5. Dennett DC. Consciousness Explained. London: Penguin, 1993.
  6. McCrone J. Wild Minds. New Scientist 1997; 156(2112): 26-30.
