Turing Machine

  We're about to introduce a new kind of machine that will be the basis and the simplest yet powerful model of the current computers, this machine is called Turing Machine. It is structurally very similar to the machines we've talked about before.
  The Turing Machine must be capable of output since it's the most powerful computational model, this is they only way that it can communicates what it know with anything else like humans.

A Turing Machine, denoted TM, is a collection of:

  1. Alphabet \Gamma denoting those characters that can be printed on the \tt TAPE by tape \tt HEAD
  2. Alphabet \Sigma denoting input letters, it may be a subset of \Gamma
  3. The blank symbol \Delta\in\Gamma, denoting the blank symbol, which means no characters, it is different from the null string \Lambda
  4. An infinitely long \tt TAPE, marked out into square cells, each of which contains a input letter from \Sigma or \Delta
  5. A finite set of states Q
  6. Start state q_0\in Q
  7. Halt state F\sube Q
  8. A tape \tt HEAD that can do following things in one step:
    1. Read a letter from \tt TAPE
    2. Alter it with any character c\in\Gamma, if the character is \Delta, it will be considered as an erasion
    3. Move it self one cell left or right
      the \tt HEAD is at the first cell of the \tt TAPE before the machine starts, it can never move to the left of the first cell, if it is instructed to do so, the machine will crash
  9. A program, which is a function that instructs the machine to change state, alter the character on the tape and move the tape head according to the current state, the program will be depicted in a directed graph like the FAs, where the directed edges are the flow of the states, nodes are states themselves and each edge is labeled with (letter, character, direction) where the letter is the letter to be read from the tape, the character is the character that to be used to alter the letter on the tape, and the direction , which could be L or R, is the instruction to tell the tape head move left or right

the TMs do not require that every state must have all the outgoing edges for all the possible characters it can read from the tape, if a letter is read an the we're currently encountered has no appropriate outgoing edge, the machine crashes, the input is said to be accepted by the TMs if and only if the TM halts at the halt state. The turing machines are deterministic, for every input there is a unique trace, it is not allowed to have two outgoing edges from one state labeled with the same letter.
  We use a specific notion no trace the path on a TM, if a TM is currently at state q_3, and the input string is abcd where the letter that is about to be read is c, we use q_3(ab\underline{c}d) to show this, if it moves to the state q_4 and the letter about to be read becomes d, we depict it by q_4(abc\underline{d}), and all of these can form a path which we called trace from start state q_0 to halt state q_n

q_0(\underline{a}bcd)\rightarrow ... \rightarrow q_3(ab\underline{c}d)\rightarrow q_4(abc\underline{d}) \rightarrow ... \rightarrow q_n(abcd\Delta\underline{\Delta}) \rightarrow \tt HALT

Recognizing Capability

Compared to Finite Automata

For every regular language L there exist a Turing Machine accepts it

Proof

  The proof is trivial, we take an FA M that accepts L, then for every edge in the FA, we change the label to (a,a,R) and (b,b,R) correspondingly, for every halt state in the FA, we turn it into a normal state and add an extra edge labeled with (\Delta,\Delta,R) to a newly added state \tt HALT, which will be the halt state of the TM.
  This algorithm works because the trace is exactly the same as it were in the M, if we get to the halt state in M originally, we get to the \tt HALT state in the TM, and if the string is fall into an infinite loop in M(we used to use an infinite loop to denote a string that must be rejected), the corresponding TM will crash because the string will be exhausted somehow and \Delta will be encountered, however there is no edge in M that is labeled with \Delta except those who connected to the \tt HALT state

Every Turing Machine T over the \Sigma divides the set of input strings into three classes:

  1. The set of all strings leading to a \tt HALT state \text{ACCEPT}(T), this set is also the language accepted by T
  2. The set of all strings which will lead to a undefined behavior, a.k.a. crash, \text{REJECT}(T)
  3. The set of all others strings \text{LOOP}(T), i.e. strings that will loop forever while running on T

These sets could be empty, and there is a simple example for the definition above, consider the regular language (a+b)^\ast aa(a+b)^\ast, we build a TM for it:

if the input has double a, the machine halts successfully, if the input does not contain double a and end with a, the machine will crash since there is no \Delta outgoing edge from q_1, but if the input does contain double a and end with b, then it will loops forever on q_0, in this case:

  • the \text{ACCEPT}(T)=\text{all strings that have a double a}
  • the \text{REJECT}(T)=\text{all strings without double a and end with a}
  • the \text{LOOP}(T)=\text{all strings without double a and end with b}

  A fun fact that a TM can do is that it can not only accept some regular and context-free languages, but also some non context-free languages, we can encode a TM for language a^nb^na^n which has been proved to be non context-free, the idea is to encode the TM in a circuit, on each iteration we erase an a from the first a^n, a b from the b^n and a a from the second a^n, if the tape is finally all blank, then it means that the string must be in a^nb^na^n, but if it ends up with, say, one a and one b, or a single a or a single b or whatsoever makes the tape still retains some non-blank characters, then the string must not balanced on a and b and second clump of as, the encoded TM is depicted as follow:

You can try to simulate aaabbbccc on this machine, see if it will be accepted.


行成于思,毁于随