Evolved Turing neural networks
by Craig Webster and William Fleming
In a report written in 1948 Alan Turing proposed the use of a genetic algorithm (GA) to “train” a particular type of neural network he called a B-type (1). Despite the apparent advantages of these networks Turing’s proposal has remained undeveloped until now.
To our knowledge the Turing neural networks below are the first functional networks to be derived using GAs in the manner Turing described. The trial task we chose to validate our approach was single-digit binary addition with a carry. The GA-simulator representations of Turing networks are somewhat different to that of an early simulator. Input nodes are shown on the left-hand side in yellow, as are output nodes on the right. Red nodes are A-type nodes and green nodes are B-type modifiers (in turn composed of three A-type nodes) – these networks are therefore what I have called WB-type networks because they mix A- and B-type links – click here for more. Input connections to individual nodes are shown as black line-segments around nodes, and all other connections are outputs. All nodes have two states, either solid or with an unfilled circle. Click here for more background on Turing’s networks. The GAs used to generate the following networks involved conventional mutation and sexual reproduction operators adapted to the peculiarities of B-type networks. Initially a population of networks was generated with typically 200 to 300 members. Individual networks were initially composed of a random number of nodes, with random connectivity between nodes, and contained a mix of A-type and B-type connections (a WB-type). The population of networks progressed through a cycle of connection modifier interference (training), goal-task testing, ranking based on goal-task performance and sexual reproduction and mutation to form a new generation. This cycle was then repeated for how ever many generations were required to reach a desired threshold of performance at the goal task. Click here for more background on GAs.
The network in Figure 1 was evolved with a selection pressure for parsimony. That is, the size of the network was fixed at a small number of nodes when considering ability at the goal task. This network is by no means the simplest WB-type solution for binary addition. It is however, considerably simpler than the network in Figure 2 which was evolved without a selection pressure for parsimony.
The network in Figure 2 is more in the spirit of the kind of very large network, which Turing claimed could be trained to perform a number of different useful tasks (1). Clearly binary addition under utilises this network, as Figure 1 demonstrates that the task can be performed with many fewer nodes. However, given appropriate training the greater number of nodes in this network make it potentially capable of performing many more tasks than the network in Figure 1.
Figure 3 shows a network which performs a task more akin to the processes of cognition. The network compares the magnitude of two 4-digit binary numbers (hence the eight input nodes on the left) and returns a 1 (for true) on its single output node when the first number is larger than the second – else it returns a 0 (for false). This WB-type machine gets 192 of 256 possible 4-digit comparisons correct – a score of 75%. The network is the best result of 500 generations of evolution of a population of 250 networks with a selection pressure for correct responses.
Evolved design
Evolution is not a rational process and often yields solutions which are unexpected and quite different to those designed in a minimal, rational manner. As a result, evolved solutions are often very difficult to understand. For the same reason it is usually a mistake to second guess, or otherwise constrain, the “solution space” beforehand when tackling complex problems with GAs. In their discussions of Turing’s networks Copeland and Proudfoot mention only two of the four possible connection modifier settings available to training agents, and so appear to have made the mistake of limiting possible solutions a priori (2,3). The two connection modifier settings, which they fail to mention, are unstable states. Excluding them is an anti-evolutionary move, more in keeping with the minimal, rational design ethos of conventional computer circuits, than Turing’s desires to construct very large, evolved, brain-like networks (see Copeland and Proudfoot miss the mark). Our GA derived networks above and elsewhere do contain connection modifiers in unstable states. Perhaps they function as single-bit clocks – but, probably not. Evolution selects only for the function of the overall network, not for the tidy compartmentalisation of function into individual components. Neural networks of all kinds and the products of GA design have the tendency to quickly become algorithmically opaque for this reason. The advantage of GAs, of course, is that evolution often works when no rational or minimal solution is known, or where finding an optimal solution by other methods would take an unreasonably long time (for example, the Travelling Salesman Problem which can take longer than the lifetime of the universe to solve).
William Fleming is a software developer working in Auckland, New Zealand.
- References
- Turing AM. Intelligent Machinery. In: Ince DC, editor. Collected works of A. M. Turing: Mechanical Intelligence. Elsevier Science Publishers, 1992.
- Copeland BJ, Proudfoot D. On Alan Turing’s anticipation of connectionism. Synthese 1996; 108: 361-377.
- Copeland BJ, Proudfoot D. Alan Turing’s forgotten ideas in computer science. Scientific American 1999; 4(280): 76-81.
February 26th, 2012 at 3:03 am
[…] i computer erano ancora un’ipotesi, e le teorie di Turing dovettero aspettare decenni prima che Craig Webster provasse a implementarle su un moderno computer. 6) Il primo giocatore di scacchi elettronico: Ma […]