The Semantic DB Project: December 2016

Tuesday 20 December 2016

improvement of the next operator

Just a brief one today. I tweaked the next operator so that we can now specify the maximum size of the gap it can handle. Previously this was hard-coded to 3. Recall that the next operator, given a subsequence, predicts the rest of the sequence.

To demonstrate it, let's use the alphabet:

a = {A.B.C.D.E.F.G.H.I.J.K.L.M.N.O.P.Q.R.S.T.U.V.W.X.Y.Z}

Now put it to use (for brevity I omit the gm2sw-v2.py conversion and the load-into-console steps):

sa: next[0] |A>
incoming_sequence: ['A']
|B . C . D . E . F . G . H . I . J . K . L . M . N . O . P . Q . R . S . T . U . V . W . X . Y . Z>
|B>

sa: next[0] |A.E>
incoming_sequence: ['A', 'E']
nodes 1: 0.1|node 1: 0>
intersected nodes: |>
|>

sa: next[1] |A.E>
incoming_sequence: ['A', 'E']
nodes 1: 0.1|node 1: 0>
intersected nodes: |>
|>

sa: next[2] |A.E>
incoming_sequence: ['A', 'E']
nodes 1: 0.1|node 1: 0>
intersected nodes: |>
|>

sa: next[3] |A.E>
incoming_sequence: ['A', 'E']
nodes 1: 0.1|node 1: 0>
intersected nodes: 0.1|node 1: 4>
|F . G . H . I . J . K . L . M . N . O . P . Q . R . S . T . U . V . W . X . Y . Z>
|F>

sa: next[4] |A.E>
incoming_sequence: ['A', 'E']
nodes 1: 0.1|node 1: 0>
intersected nodes: 0.1|node 1: 4>
|F . G . H . I . J . K . L . M . N . O . P . Q . R . S . T . U . V . W . X . Y . Z>
|F>

So, what is happening there? In the first example, given A, we predict the rest of the alphabet. In the remaining examples, given A followed by E, we predict the rest of the sequence. But observe that we get the null result |> until we allow a skip of at least size 3, since going from A to E skips B, C and D. And that's it! A small improvement to our next operator.
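To see why 3 is the threshold in this example, count the elements that sit between A and E in the alphabet. This is a minimal Python sketch, not the project's code; skipped_between is a hypothetical helper name.

```python
import string

# The alphabet sequence from the example above.
alphabet = list(string.ascii_uppercase)

def skipped_between(sequence, first, second):
    """Number of elements strictly between `first` and `second` in `sequence`."""
    return sequence.index(second) - sequence.index(first) - 1

gap = skipped_between(alphabet, "A", "E")
print(gap)  # 3, since B, C and D are skipped
```

So next[2] cannot bridge A to E, while next[3] and next[4] can.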

Monday 19 December 2016

learning and recalling a sequence of frames

In this post, we tweak our sequence learning code to learn and recall a short sequence of random frames. Previously all our sequences have been over random SDR's with 10 on bits. This post shows we can extend this to sequences of almost arbitrary SDR's. Perhaps the only limitation is that these SDR's need to have enough on bits. Thanks to the random-column[k] operator acting on our SDR's, we have an upper bound of k^n distinct contexts, where n is the number of on bits in the given SDR, though in practice the number is smaller than this to allow for noise tolerance. This means the 10 on bits and k = 10 that we have been using are more than enough, even for the 74k sequences in the spelling dictionary example.
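The bound is simple arithmetic: each of the n on bits independently lands in one of k columns, giving k^n combinations. A quick check in Python with the values used so far (the variable names are mine, not from the project):

```python
k = 10  # columns per on bit, as in random-column[10]
n = 10  # on bits per SDR

upper_bound = k ** n
print(upper_bound)           # 10000000000
print(upper_bound > 74_000)  # True: far more contexts than dictionary sequences
```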

To learn and recall our frames, we need three new operators:

random-frame[w,h,k]
display-frame[w,h]
display-frame-sequence[w,h]

where w,h are the width and height of the frames, and k is the number of on bits. For example, here is a 15*15 frame with 10 on bits:

sa: display-frame[15,15] random-frame[15,15,10]
....#..........
.........#...#.
.......#.......
...............
...............
..............#
.......#.......
...............
...............
..........#....
...............
..#..........#.
...............
.........#.....
...............

And this is what the random-frame SDR's look like (we just store the co-ordinates of the on bits):

sa: random-frame[15,15,10]
|7: 7> + |6: 2> + |10: 8> + |4: 12> + |4: 5> + |13: 9> + |8: 8> + |12: 8> + |13: 0> + |12: 14>

And now displaying this exact frame:

sa: display-frame[15,15] (|7: 7> + |6: 2> + |10: 8> + |4: 12> + |4: 5> + |13: 9> + |8: 8> + |12: 8> + |13: 0> + |12: 14>)
.............#.
...............
......#........
...............
...............
....#..........
...............
.......#.......
........#.#.#..
.............#.
...............
...............
....#..........
...............
............#..
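The two frame operators can be sketched in a few lines of Python. This is my own approximation, not the project's implementation: it assumes the on bits are distinct and that a ket |x: y> means column x, row y, which is what the display above implies. The names random_frame and display_frame simply mirror the operator names.

```python
import random

def random_frame(w, h, k, rng=random):
    """Pick k distinct (x, y) on bits from a w-by-h grid."""
    cells = [(x, y) for y in range(h) for x in range(w)]
    return rng.sample(cells, k)

def display_frame(w, h, on_bits):
    """Render on bits as '#' and off bits as '.', one text row per y value."""
    on = set(on_bits)
    return "\n".join(
        "".join("#" if (x, y) in on else "." for x in range(w))
        for y in range(h)
    )

# The exact frame shown above, written as (x, y) pairs.
frame = [(7, 7), (6, 2), (10, 8), (4, 12), (4, 5),
         (13, 9), (8, 8), (12, 8), (13, 0), (12, 14)]
print(display_frame(15, 15, frame))
```

Under those assumptions the printed grid should match the console output above, for example the three on bits in row 8.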

OK. On to the main event. Let's learn these two sequences defined using gm notation:

$ cat gm-examples/frame-sequences.gm
seq-1 = {1.2.3.4.5}
seq-2 = {2.4.1.5.3}

We convert that to sw using gm2sw-v2.py, and then manually edit it to look like this:

-- frames:
frame |1> => random-frame[10,10,10] |>
frame |2> => random-frame[10,10,10] |>
frame |3> => random-frame[10,10,10] |>
frame |4> => random-frame[10,10,10] |>
frame |5> => random-frame[10,10,10] |>
frame |end of sequence> => random-frame[10,10,10] |>

-- learn low level sequences:
-- empty sequence
pattern |node 0: 0> => random-column[10] frame |end of sequence>

-- 1 . 2 . 3 . 4 . 5
pattern |node 1: 0> => random-column[10] frame |1>
then |node 1: 0> => random-column[10] frame |2>

pattern |node 1: 1> => then |node 1: 0>
then |node 1: 1> => random-column[10] frame |3>

pattern |node 1: 2> => then |node 1: 1>
then |node 1: 2> => random-column[10] frame |4>

pattern |node 1: 3> => then |node 1: 2>
then |node 1: 3> => random-column[10] frame |5>

pattern |node 1: 4> => then |node 1: 3>
then |node 1: 4> => random-column[10] frame |end of sequence>

-- 2 . 4 . 1 . 5 . 3
pattern |node 2: 0> => random-column[10] frame |2>
then |node 2: 0> => random-column[10] frame |4>

pattern |node 2: 1> => then |node 2: 0>
then |node 2: 1> => random-column[10] frame |1>

pattern |node 2: 2> => then |node 2: 1>
then |node 2: 2> => random-column[10] frame |5>

pattern |node 2: 3> => then |node 2: 2>
then |node 2: 3> => random-column[10] frame |3>

pattern |node 2: 4> => then |node 2: 3>
then |node 2: 4> => random-column[10] frame |end of sequence>


-- define our classes:
-- seq-1 = {1.2.3.4.5}
start-node |seq 1> +=> |node 1: 0>

-- seq-2 = {2.4.1.5.3}
start-node |seq 2> +=> |node 2: 0>

After loading that into the console we have:

$ ./the_semantic_db_console.py
Welcome!

sa: load frame-sequences.sw
sa: dump

----------------------------------------
|context> => |context: sw console>

frame |1> => |0: 9> + |5: 1> + |5: 5> + |8: 1> + |8: 2> + |7: 7> + |9: 1> + |3: 0> + |6: 3> + |2: 8>

frame |2> => |0: 6> + |3: 8> + |1: 2> + |2: 5> + |2: 2> + |0: 9> + |4: 3> + |6: 6> + |9: 4> + |9: 0>

frame |3> => |5: 7> + |9: 1> + |9: 3> + |0: 8> + |0: 0> + |8: 6> + |7: 9> + |5: 9> + |2: 5> + |7: 0>

frame |4> => |9: 3> + |1: 7> + |6: 9> + |0: 0> + |4: 6> + |8: 1> + |7: 4> + |4: 8> + |9: 1> + |4: 1>

frame |5> => |3: 3> + |5: 3> + |5: 9> + |5: 8> + |7: 1> + |7: 6> + |3: 1> + |4: 1> + |2: 2> + |5: 2>

frame |end of sequence> => |3: 8> + |7: 7> + |1: 3> + |2: 0> + |9: 2> + |0: 7> + |5: 7> + |3: 1> + |4: 9> + |3: 0>

pattern |node 0: 0> => |3: 8: 7> + |7: 7: 4> + |1: 3: 0> + |2: 0: 0> + |9: 2: 3> + |0: 7: 8> + |5: 7: 2> + |3: 1: 0> + |4: 9: 1> + |3: 0: 6>

pattern |node 1: 0> => |0: 9: 0> + |5: 1: 9> + |5: 5: 1> + |8: 1: 6> + |8: 2: 1> + |7: 7: 7> + |9: 1: 6> + |3: 0: 6> + |6: 3: 4> + |2: 8: 2>
then |node 1: 0> => |0: 6: 0> + |3: 8: 4> + |1: 2: 4> + |2: 5: 7> + |2: 2: 9> + |0: 9: 5> + |4: 3: 3> + |6: 6: 2> + |9: 4: 9> + |9: 0: 4>

pattern |node 1: 1> => |0: 6: 0> + |3: 8: 4> + |1: 2: 4> + |2: 5: 7> + |2: 2: 9> + |0: 9: 5> + |4: 3: 3> + |6: 6: 2> + |9: 4: 9> + |9: 0: 4>
then |node 1: 1> => |5: 7: 8> + |9: 1: 1> + |9: 3: 1> + |0: 8: 2> + |0: 0: 6> + |8: 6: 8> + |7: 9: 3> + |5: 9: 6> + |2: 5: 7> + |7: 0: 0>

pattern |node 1: 2> => |5: 7: 8> + |9: 1: 1> + |9: 3: 1> + |0: 8: 2> + |0: 0: 6> + |8: 6: 8> + |7: 9: 3> + |5: 9: 6> + |2: 5: 7> + |7: 0: 0>
then |node 1: 2> => |9: 3: 9> + |1: 7: 0> + |6: 9: 9> + |0: 0: 9> + |4: 6: 5> + |8: 1: 3> + |7: 4: 5> + |4: 8: 0> + |9: 1: 8> + |4: 1: 7>

pattern |node 1: 3> => |9: 3: 9> + |1: 7: 0> + |6: 9: 9> + |0: 0: 9> + |4: 6: 5> + |8: 1: 3> + |7: 4: 5> + |4: 8: 0> + |9: 1: 8> + |4: 1: 7>
then |node 1: 3> => |3: 3: 7> + |5: 3: 6> + |5: 9: 4> + |5: 8: 4> + |7: 1: 7> + |7: 6: 9> + |3: 1: 8> + |4: 1: 3> + |2: 2: 5> + |5: 2: 3>

pattern |node 1: 4> => |3: 3: 7> + |5: 3: 6> + |5: 9: 4> + |5: 8: 4> + |7: 1: 7> + |7: 6: 9> + |3: 1: 8> + |4: 1: 3> + |2: 2: 5> + |5: 2: 3>
then |node 1: 4> => |3: 8: 6> + |7: 7: 0> + |1: 3: 7> + |2: 0: 9> + |9: 2: 0> + |0: 7: 1> + |5: 7: 3> + |3: 1: 9> + |4: 9: 2> + |3: 0: 9>

pattern |node 2: 0> => |0: 6: 7> + |3: 8: 5> + |1: 2: 9> + |2: 5: 4> + |2: 2: 3> + |0: 9: 2> + |4: 3: 4> + |6: 6: 1> + |9: 4: 8> + |9: 0: 6>
then |node 2: 0> => |9: 3: 8> + |1: 7: 3> + |6: 9: 0> + |0: 0: 2> + |4: 6: 8> + |8: 1: 1> + |7: 4: 0> + |4: 8: 4> + |9: 1: 0> + |4: 1: 1>

pattern |node 2: 1> => |9: 3: 8> + |1: 7: 3> + |6: 9: 0> + |0: 0: 2> + |4: 6: 8> + |8: 1: 1> + |7: 4: 0> + |4: 8: 4> + |9: 1: 0> + |4: 1: 1>
then |node 2: 1> => |0: 9: 9> + |5: 1: 6> + |5: 5: 0> + |8: 1: 0> + |8: 2: 8> + |7: 7: 2> + |9: 1: 1> + |3: 0: 2> + |6: 3: 7> + |2: 8: 3>

pattern |node 2: 2> => |0: 9: 9> + |5: 1: 6> + |5: 5: 0> + |8: 1: 0> + |8: 2: 8> + |7: 7: 2> + |9: 1: 1> + |3: 0: 2> + |6: 3: 7> + |2: 8: 3>
then |node 2: 2> => |3: 3: 9> + |5: 3: 8> + |5: 9: 6> + |5: 8: 7> + |7: 1: 8> + |7: 6: 5> + |3: 1: 5> + |4: 1: 9> + |2: 2: 8> + |5: 2: 0>

pattern |node 2: 3> => |3: 3: 9> + |5: 3: 8> + |5: 9: 6> + |5: 8: 7> + |7: 1: 8> + |7: 6: 5> + |3: 1: 5> + |4: 1: 9> + |2: 2: 8> + |5: 2: 0>
then |node 2: 3> => |5: 7: 0> + |9: 1: 3> + |9: 3: 8> + |0: 8: 5> + |0: 0: 8> + |8: 6: 6> + |7: 9: 9> + |5: 9: 5> + |2: 5: 9> + |7: 0: 8>

pattern |node 2: 4> => |5: 7: 0> + |9: 1: 3> + |9: 3: 8> + |0: 8: 5> + |0: 0: 8> + |8: 6: 6> + |7: 9: 9> + |5: 9: 5> + |2: 5: 9> + |7: 0: 8>
then |node 2: 4> => |3: 8: 8> + |7: 7: 1> + |1: 3: 7> + |2: 0: 1> + |9: 2: 3> + |0: 7: 1> + |5: 7: 4> + |3: 1: 6> + |4: 9: 1> + |3: 0: 8>

start-node |seq 1> => |node 1: 0>

start-node |seq 2> => |node 2: 0>
----------------------------------------

One interpretation is that each ket is the co-ordinate of a synapse, e.g. |5: 1> or |5: 9: 4>. Note that frames are 2D, and random-column[k] maps frames to 3D. It is this extra dimension that allows SDR's to be used in more than one context. Indeed, we have an upper bound of k^n distinct contexts, where n is the number of on bits, though in the current example we only have two distinct sequences. Now let's display a couple of frames:
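Judging from the dump, random-column[k] appends a column index between 0 and k-1 to every ket, and extract-category strips it off again. A minimal Python sketch of that pair, assuming exactly this behaviour (the function names and the tuple representation are mine):

```python
import random

def random_column(k, on_bits, rng=random):
    """Append a random column index in range(k) to every on bit."""
    return [bit + (rng.randrange(k),) for bit in on_bits]

def extract_category(on_bits):
    """Drop the final (column) coordinate, undoing random_column."""
    return [bit[:-1] for bit in on_bits]

# The first few on bits of frame |1> from the dump above.
frame_1 = [(0, 9), (5, 1), (5, 5), (8, 1), (8, 2)]

context_a = random_column(10, frame_1)
context_b = random_column(10, frame_1)

print(extract_category(context_a) == frame_1)  # True: the 2D frame is recovered
print(extract_category(context_b) == frame_1)  # True, even if the columns differ
```

The same 2D frame can therefore appear in several sequences, each time with its own set of columns.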

sa: display-frame[10,10] frame |1>
...#......
.....#..##
........#.
......#...
..........
.....#....
..........
.......#..
..#.......
#.........

sa: display-frame[10,10] frame |2>
.........#
..........
.##.......
....#.....
.........#
..#.......
#.....#...
..........
...#......
#.........

Now a couple of frames from our first sequence. Note that extract-category is the inverse of random-column[k], and hence converts the pattern SDR back to 2D:

sa: display-frame[10,10] extract-category pattern |node 1: 0>
...#......
.....#..##
........#.
......#...
..........
.....#....
..........
.......#..
..#.......
#.........

sa: display-frame[10,10] extract-category then |node 1: 0>
.........#
..........
.##.......
....#.....
.........#
..#.......
#.....#...
..........
...#......
#.........

And finally our sequences:

sa: display-frame-sequence[10,10] start-node |seq 1>
...#......
.....#..##
........#.
......#...
..........
.....#....
..........
.......#..
..#.......
#.........

.........#
..........
.##.......
....#.....
.........#
..#.......
#.....#...
..........
...#......
#.........

#......#..
.........#
..........
.........#
..........
..#.......
........#.
.....#....
#.........
.....#.#..

#.........
....#...##
..........
.........#
.......#..
..........
....#.....
.#........
....#.....
......#...

..........
...##..#..
..#..#....
...#.#....
..........
..........
.......#..
..........
.....#....
.....#....

|end of sequence>

sa: display-frame-sequence[10,10] start-node |seq 2>
.........#
..........
.##.......
....#.....
.........#
..#.......
#.....#...
..........
...#......
#.........

#.........
....#...##
..........
.........#
.......#..
..........
....#.....
.#........
....#.....
......#...

...#......
.....#..##
........#.
......#...
..........
.....#....
..........
.......#..
..#.......
#.........

..........
...##..#..
..#..#....
...#.#....
..........
..........
.......#..
..........
.....#....
.....#....

#......#..
.........#
..........
.........#
..........
..#.......
........#.
.....#....
#.........
.....#.#..

|end of sequence>
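Recalling a sequence amounts to following the pattern/then links from the start node until the end-of-sequence marker turns up. Here is a toy Python sketch of that walk. It is an illustration under assumptions, not the display-frame-sequence code: a (frame label, context tag) pair stands in for a 3D SDR, exact matching stands in for whatever similarity measure the real operator uses, and the sequences are shortened to three frames.

```python
def recall(start_node, pattern, then, end_label="end of sequence"):
    """Walk the pattern/then chain from start_node and return the frame labels."""
    # Reverse lookup: which node stores a given context-specific SDR as its pattern.
    node_for = {sdr: node for node, sdr in pattern.items()}
    labels = []
    node = start_node
    while node is not None:
        labels.append(pattern[node][0])  # drop the context tag, like extract-category
        following = then[node]
        if following[0] == end_label:
            break
        node = node_for.get(following)   # exact match stands in for SDR similarity
    return labels

# Toy stand-ins: (frame label, context tag) plays the role of a 3D SDR.
pattern = {"node 1: 0": ("1", "a"), "node 1: 1": ("2", "a"), "node 1: 2": ("3", "a"),
           "node 2: 0": ("2", "b"), "node 2: 1": ("1", "b"), "node 2: 2": ("3", "b")}
then = {"node 1: 0": ("2", "a"), "node 1: 1": ("3", "a"),
        "node 1: 2": ("end of sequence", "a"),
        "node 2: 0": ("1", "b"), "node 2: 1": ("3", "b"),
        "node 2: 2": ("end of sequence", "b")}

print(recall("node 1: 0", pattern, then))  # ['1', '2', '3']
print(recall("node 2: 0", pattern, then))  # ['2', '1', '3']
```

The context tag is what keeps frame 2 in the first sequence from being confused with frame 2 in the second.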

Anyway, a nice proof of concept I suppose.

Friday 16 December 2016

predicting sequences

In today's post we are going to predict the parent sequence given a subsequence. This is a nice addition to the other tools we have for working with sequences, and one I've been thinking about implementing for a while now. The subsequence can either be exact, in which case it will only match the parent sequence if it is a perfect subsequence, or, as in the version we use in this post, it can skip a couple of elements and still predict the right parent sequence. Here we just consider sequences of letters, and then later words, but the back-end is general enough that it should apply to sequences of many types of objects.

Let's jump into an example. Consider these two sequences (defined using the labor-saving gm notation mentioned in my last post):

a = {A.B.C.D.E.F.G}
b = {U.V.W.B.C.D.X.Y.Z}

Then convert that to sw and load into the console:

$ ./gm2sw-v2.py gm-examples/simple-sequences.gm > sw-examples/simple-sequences.sw
$ ./the_semantic_db_console.py
Welcome!

sa: info off
sa: load simple-sequences.sw

Here is what our two sequences expand to:

full |range> => range(|1>,|2048>)
encode |end of sequence> => pick[10] full |range>

-- encode words:
encode |A> => pick[10] full |range>
encode |B> => pick[10] full |range>
encode |C> => pick[10] full |range>
encode |D> => pick[10] full |range>
encode |E> => pick[10] full |range>
encode |F> => pick[10] full |range>
encode |G> => pick[10] full |range>
encode |U> => pick[10] full |range>
encode |V> => pick[10] full |range>
encode |W> => pick[10] full |range>
encode |X> => pick[10] full |range>
encode |Y> => pick[10] full |range>
encode |Z> => pick[10] full |range>

-- encode classes:
encode |a> => pick[10] full |range>
encode |b> => pick[10] full |range>

-- encode sequence names:

-- encode low level sequences:
-- empty sequence
pattern |node 0: 0> => random-column[10] encode |end of sequence>

-- A . B . C . D . E . F . G
pattern |node 1: 0> => random-column[10] encode |A>
then |node 1: 0> => random-column[10] encode |B>

pattern |node 1: 1> => then |node 1: 0>
then |node 1: 1> => random-column[10] encode |C>

pattern |node 1: 2> => then |node 1: 1>
then |node 1: 2> => random-column[10] encode |D>

pattern |node 1: 3> => then |node 1: 2>
then |node 1: 3> => random-column[10] encode |E>

pattern |node 1: 4> => then |node 1: 3>
then |node 1: 4> => random-column[10] encode |F>

pattern |node 1: 5> => then |node 1: 4>
then |node 1: 5> => random-column[10] encode |G>

pattern |node 1: 6> => then |node 1: 5>
then |node 1: 6> => random-column[10] encode |end of sequence>

-- U . V . W . B . C . D . X . Y . Z
pattern |node 2: 0> => random-column[10] encode |U>
then |node 2: 0> => random-column[10] encode |V>

pattern |node 2: 1> => then |node 2: 0>
then |node 2: 1> => random-column[10] encode |W>

pattern |node 2: 2> => then |node 2: 1>
then |node 2: 2> => random-column[10] encode |B>

pattern |node 2: 3> => then |node 2: 2>
then |node 2: 3> => random-column[10] encode |C>

pattern |node 2: 4> => then |node 2: 3>
then |node 2: 4> => random-column[10] encode |D>

pattern |node 2: 5> => then |node 2: 4>
then |node 2: 5> => random-column[10] encode |X>

pattern |node 2: 6> => then |node 2: 5>
then |node 2: 6> => random-column[10] encode |Y>

pattern |node 2: 7> => then |node 2: 6>
then |node 2: 7> => random-column[10] encode |Z>

pattern |node 2: 8> => then |node 2: 7>
then |node 2: 8> => random-column[10] encode |end of sequence>
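The expansion follows a regular pattern: the first node's pattern is a fresh random-column encoding, every later node's pattern is the previous node's then, and the last then points at the end-of-sequence marker. A Python sketch that reproduces the learn rules for one sequence is below. It is not gm2sw-v2.py itself, just a hypothetical expand_sequence function written to match the listing above.

```python
def expand_sequence(index, elements, k=10):
    """Return sw-style learn rules for one sequence, numbered as node `index`."""
    items = list(elements) + ["end of sequence"]
    lines = []
    for pos in range(len(items) - 1):
        node = f"|node {index}: {pos}>"
        if pos == 0:
            # The first node gets its own random-column encoding.
            lines.append(f"pattern {node} => random-column[{k}] encode |{items[0]}>")
        else:
            # Later nodes reuse the previous node's `then` as their pattern.
            lines.append(f"pattern {node} => then |node {index}: {pos - 1}>")
        lines.append(f"then {node} => random-column[{k}] encode |{items[pos + 1]}>")
    return lines

for line in expand_sequence(1, "ABCDEFG"):
    print(line)
```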

Now put it to use. First up, given the first element in the sequence, predict the rest of the parent sequence:

sa: next |A>
incoming_sequence: ['A']
|B . C . D . E . F . G>
|B>

sa: next |U>
incoming_sequence: ['U']
|V . W . B . C . D . X . Y . Z>
|V>

The first letters in our sequences are distinct, so our code has no trouble finding a unique parent sequence. Note that our code also returns the list of elements that are only one step ahead of the current position, in this case |B> and |V>. Now, what if we give it a non-unique subsequence?

sa: next |B.C>
incoming_sequence: ['B', 'C']
nodes 1: 0.1|node 1: 1> + 0.1|node 2: 3>
intersected nodes: 0.1|node 1: 2> + 0.1|node 2: 4>
|D . E . F . G>
|D . X . Y . Z>
2|D>

sa: next |B.C.D>
incoming_sequence: ['B', 'C', 'D']
nodes 1: 0.1|node 1: 1> + 0.1|node 2: 3>
intersected nodes: 0.1|node 1: 2> + 0.1|node 2: 4>
nodes 1: 0.1|node 1: 2> + 0.1|node 2: 4>
intersected nodes: 0.1|node 1: 3> + 0.1|node 2: 5>
|E . F . G>
|X . Y . Z>
|E> + |X>

And since we don't uniquely know which sequence B.C.D belongs to, the one step ahead prediction is for an E or an X. But if we then prepend an A or a W, we again have unique parent sequences:

sa: next |A.B.C>
incoming_sequence: ['A', 'B', 'C']
nodes 1: 0.1|node 1: 0>
intersected nodes: 0.1|node 1: 1>
nodes 1: 0.1|node 1: 1>
intersected nodes: 0.1|node 1: 2>
|D . E . F . G>
|D>

sa: next |W.B.C>
incoming_sequence: ['W', 'B', 'C']
nodes 1: 0.1|node 2: 2>
intersected nodes: 0.1|node 2: 3>
nodes 1: 0.1|node 2: 3>
intersected nodes: 0.1|node 2: 4>
|D . X . Y . Z>
|D>

Or another example:

sa: next |B.C.D.E>
incoming_sequence: ['B', 'C', 'D', 'E']
nodes 1: 0.1|node 1: 1> + 0.1|node 2: 3>
intersected nodes: 0.1|node 1: 2> + 0.1|node 2: 4>
nodes 1: 0.1|node 1: 2> + 0.1|node 2: 4>
intersected nodes: 0.1|node 1: 3> + 0.1|node 2: 5>
nodes 1: 0.1|node 1: 3> + 0.1|node 2: 5>
intersected nodes: 0.1|node 1: 4>
|F . G>
|F>

sa: next |B.C.D.X>
incoming_sequence: ['B', 'C', 'D', 'X']
nodes 1: 0.1|node 1: 1> + 0.1|node 2: 3>
intersected nodes: 0.1|node 1: 2> + 0.1|node 2: 4>
nodes 1: 0.1|node 1: 2> + 0.1|node 2: 4>
intersected nodes: 0.1|node 1: 3> + 0.1|node 2: 5>
nodes 1: 0.1|node 1: 3> + 0.1|node 2: 5>
intersected nodes: 0.1|node 2: 6>
|Y . Z>
|Y>

So it all works as desired. Here is a quick demonstration where we skip a couple of sequence elements, as might happen in a noisy room, in this case C.D, and it still works:

sa: next |B.E>
incoming_sequence: ['B', 'E']
nodes 1: 0.1|node 1: 1> + 0.1|node 2: 3>
intersected nodes: 0.1|node 1: 4>
|F . G>
|F>

sa: next |B.X>
incoming_sequence: ['B', 'X']
nodes 1: 0.1|node 1: 1> + 0.1|node 2: 3>
intersected nodes: 0.1|node 2: 6>
|Y . Z>
|Y>
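The traces above suggest the following procedure: find every node where the first element occurs, then for each later element keep only the nodes reachable within the allowed gap, and finally read off the remainders and tally the one-step-ahead elements. A plain-Python sketch of that idea is below. It works on ordinary lists rather than SDR's, and predict, max_gap and the (name, position) node representation are my own assumptions, not the project's code.

```python
from collections import Counter

def predict(sequences, subsequence, max_gap=3):
    """Return (remainders, one-step-ahead tally) for a gap-tolerant match."""
    first, *rest = subsequence
    # Every (sequence name, position) where the first element occurs.
    nodes = [(name, i) for name, seq in sequences.items()
             for i, item in enumerate(seq) if item == first]
    for element in rest:
        # Keep positions of `element` at most max_gap skipped elements further on,
        # dropping duplicates while preserving order.
        nodes = list(dict.fromkeys(
            (name, j) for name, i in nodes
            for j in range(i + 1, min(i + 2 + max_gap, len(sequences[name])))
            if sequences[name][j] == element))
    remainders = [sequences[name][i + 1:] for name, i in nodes]
    tally = Counter(r[0] for r in remainders if r)
    return remainders, tally

sequences = {"a": list("ABCDEFG"), "b": list("UVWBCDXYZ")}

print(predict(sequences, ["B", "C"]))  # both parents match, D is predicted twice
print(predict(sequences, ["B", "E"]))  # only sequence a survives, skipping C.D
```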

That's the basics: subsequences predicting parent sequences, with tolerance for noisy omission of elements. Now let's apply it to simple sentences encoded as sequences. Consider this knowledge:

$ cat gm-examples/george.gm
A = {george.is.27.years.old}
B = {the.mother.of.george.is.jane}
C = {the.father.of.george.is.frank}
D = {the.sibling.of.george.is.liz}
E = {jane.is.47.years.old}
F = {frank.is.50.years.old}
G = {liz.is.29.years.old}
H = {the.age.of.george.is.27}
I = {the.age.of.jane.is.47}
J = {the.age.of.frank.is.50}
K = {the.age.of.liz.is.29}
L = {the.mother.of.liz.is.jane}
M = {the.father.of.liz.is.frank}
N = {the.sibling.of.liz.is.george}

Process it as usual:

$ ./gm2sw-v2.py gm-examples/george.gm > sw-examples/george-gm.sw
$ ./the_semantic_db_console.py
sa: load george-gm.sw

And first consider what sequences follow "the":

sa: next |the>
incoming_sequence: ['the']
|mother . of . george . is . jane>
|father . of . george . is . frank>
|sibling . of . george . is . liz>
|age . of . george . is . 27>
|age . of . jane . is . 47>
|age . of . frank . is . 50>
|age . of . liz . is . 29>
|mother . of . liz . is . jane>
|father . of . liz . is . frank>
|sibling . of . liz . is . george>
2|mother> + 2|father> + 2|sibling> + 4|age>

And these simple sentences enable us to ask simple questions:

sa: next |the.mother.of>
incoming_sequence: ['the', 'mother', 'of']
nodes 1: 0.1|node 2: 0> + 0.1|node 3: 0> + 0.1|node 4: 0> + 0.1|node 8: 0> + 0.1|node 9: 0> + 0.1|node 10: 0> + 0.1|node 11: 0> + 0.1|node 12: 0> + 0.1|node 13: 0> + 0.1|node 14: 0>
intersected nodes: 0.1|node 2: 1> + 0.1|node 12: 1>
nodes 1: 0.1|node 2: 1> + 0.1|node 12: 1>
intersected nodes: 0.1|node 2: 2> + 0.1|node 12: 2>
|george . is . jane>
|liz . is . jane>
|george> + |liz>

sa: next |the.age.of>
incoming_sequence: ['the', 'age', 'of']
nodes 1: 0.1|node 2: 0> + 0.1|node 3: 0> + 0.1|node 4: 0> + 0.1|node 8: 0> + 0.1|node 9: 0> + 0.1|node 10: 0> + 0.1|node 11: 0> + 0.1|node 12: 0> + 0.1|node 13: 0> + 0.1|node 14: 0>
intersected nodes: 0.1|node 8: 1> + 0.1|node 9: 1> + 0.1|node 10: 1> + 0.1|node 11: 1>
nodes 1: 0.1|node 8: 1> + 0.1|node 9: 1> + 0.1|node 10: 1> + 0.1|node 11: 1>
intersected nodes: 0.1|node 8: 2> + 0.1|node 9: 2> + 0.1|node 10: 2> + 0.1|node 11: 2>
|george . is . 27>
|jane . is . 47>
|frank . is . 50>
|liz . is . 29>
|george> + |jane> + |frank> + |liz>

sa: next |sibling.of>
incoming_sequence: ['sibling', 'of']
nodes 1: 0.1|node 4: 1> + 0.1|node 14: 1>
intersected nodes: 0.1|node 4: 2> + 0.1|node 14: 2>
|george . is . liz>
|liz . is . george>
|george> + |liz>

Or making use of sequence element skipping (in the current code up to 3 sequence elements), we can ask more compact questions:

sa: next |father.george>
incoming_sequence: ['father', 'george']
nodes 1: 0.1|node 3: 1> + 0.1|node 13: 1>
intersected nodes: 0.1|node 3: 3>
|is . frank>
|is>

sa: next |age.george>
incoming_sequence: ['age', 'george']
nodes 1: 0.1|node 8: 1> + 0.1|node 9: 1> + 0.1|node 10: 1> + 0.1|node 11: 1>
intersected nodes: 0.1|node 8: 3>
|is . 27>
|is>

Obviously the brain stores knowledge about the world using more than just rote sentences (unless you are bad at studying for exams), but I think it is not a bad first step. Who knows, maybe very young children do just store simple sequences, without "decorations"? Certainly in adults, musical knowledge of lyrics and notes feels like simple sequences. But we still don't have a good definition of what it means to understand something. To me it feels like some well-constructed network, i.e. understanding something means it is thoroughly interlinked with related existing knowledge. But how do you code that?

Finally, an important point to make is that the above is only interesting in that I'm doing it in a proposed brain-like way. Using grep it is trivial to find subsequences of parent sequences. For example:

$ grep "father.*george" george.gm
C = {the.father.of.george.is.frank}

$ grep "the.*age.*of" george.gm
H = {the.age.of.george.is.27}
I = {the.age.of.jane.is.47}
J = {the.age.of.frank.is.50}
K = {the.age.of.liz.is.29}

And the way we represent our high order sequences has a lot of similarity to linked lists.

BTW, I should mention I tried the strict version of the next operator on the spelling dictionary example, using the subsequence f.r, resulting in this prediction for the next letter:

173|e> + 157|a> + 120|o> + 114|i> + 49|u> + 8|y> + 2|.> + |t>

So pretty much just vowels.

Next post, learning and recalling a sequence of random frames.