Sunday 30 November 2014

learning indirectly

A common thing when having a conversation is to have a set of variables keeping track of who you are talking about. The simplest is "you".
So, in my scheme how can we learn something about you?
We can't simply do:
-- first define context:
sa: context learn you
sa: |you> => |Fred>
sa: age |you> => |age: 29>

Here is why:
sa: dump
----------------------------------------
|context> => |context: learn you>

 |you> => |Fred>
age |you> => |age: 29>
----------------------------------------
We really wanted Fred to be 29, not "you".

So, it took some coding! but we have a new construct:
OP SUPERPOSITION-1 => SUPERPOSITION-2
which maps to:
OP KET => SUPERPOSITION-2
for all kets in SUPERPOSITION-1

Example:
-- see the current definition of "you"
sa: "" |you>
|Fred>
-- so it is "Fred"

-- let's learn his age:
sa: age "" |you> => |age: 29>

-- see what we know:
sa: dump
----------------------------------------
|context> => |context: learn you>

 |you> => |Fred>
age |you> => |age: 29>

age |Fred> => |age: 29>
----------------------------------------
-- success!

Now, swap "you" to "Sam", and learn his age:
sa: |you> => |Sam>
sa: age "" |you> => |age: 34>

sa: dump
----------------------------------------
|context> => |context: learn you>

age |you> => |age: 29>
 |you> => |Sam>

age |Fred> => |age: 29>

age |Sam> => |age: 34>
----------------------------------------
-- again, success!

Now for another example, learn daily closing times for a shop:
sa: context shop closing time

sa: |weekday: _list> => |Monday> + |Tuesday> + |Wednesday> + |Thursday> + |Friday>
sa: |weekend: _list> => |Saturday> + |Sunday>

sa: closing-time "" |weekday: _list> => |time: 6pm>
sa: closing-time "" |weekend: _list> => |time: 4:30pm>

sa: dump
----------------------------------------
|context> => |context: shop closing time>

 |weekday: _list> => |Monday> + |Tuesday> + |Wednesday> + |Thursday> + |Friday>

 |weekend: _list> => |Saturday> + |Sunday>

closing-time |Monday> => |time: 6pm>

closing-time |Tuesday> => |time: 6pm>

closing-time |Wednesday> => |time: 6pm>

closing-time |Thursday> => |time: 6pm>

closing-time |Friday> => |time: 6pm>

closing-time |Saturday> => |time: 4:30pm>

closing-time |Sunday> => |time: 4:30pm>
----------------------------------------

More later!

Update: with my new pretty print table code, we can now do this:
sa: table[day,closing-time] ("" |weekday: _list> + "" |weekend: _list>)
+-----------+--------------+
| day       | closing-time |
+-----------+--------------+
| Monday    | time: 6pm    |
| Tuesday   | time: 6pm    |
| Wednesday | time: 6pm    |
| Thursday  | time: 6pm    |
| Friday    | time: 6pm    |
| Saturday  | time: 4:30pm |
| Sunday    | time: 4:30pm |
+-----------+--------------+
Update: we can tidy up the operator names a little so they are more like natural English:
  list-of |week days> => |Monday> + |Tuesday> + |Wednesday> + |Thursday> + |Friday>
  list-of |weekend days> => |Saturday> + |Sunday>
  closing-time list-of |week days> => |time: 6pm>
  closing-time list-of |weekend days> => |time: 4:30pm>
  table[day,closing-time] list-of (|week days> + |weekend days>)
+-----------+--------------+
| day       | closing-time |
+-----------+--------------+
| Monday    | 6pm          |
| Tuesday   | 6pm          |
| Wednesday | 6pm          |
| Thursday  | 6pm          |
| Friday    | 6pm          |
| Saturday  | 4:30pm       |
| Sunday    | 4:30pm       |
+-----------+--------------+
Update: we can also make the learn-you more like natural English:
  name-of |you> => |Fred>
  age-of name-of |you> => |age: 29>
  name-of |you> => |Sam>
  age-of name-of |you> => |age: 34>
  dump
----------------------------------------
|context> => |context: sw console>

name-of |you> => |Sam>

age-of |Fred> => |age: 29>

age-of |Sam> => |age: 34>
----------------------------------------
Update: another more interesting learning indirectly example. This time one extra operator deep:
  the-third |US President> => |Thomas Jefferson>
  the-party-of the-third |US President> => |party: Democratic-Republican>
  the-dissolution-date-of the-party-of the-third |US President> => |year: 1825>

sa: dump
----------------------------------------
|context> => |context: sw console>

the-third |US President> => |Thomas Jefferson>

the-party-of |Thomas Jefferson> => |party: Democratic-Republican>

the-dissolution-date-of |party: Democratic-Republican> => |year: 1825>
----------------------------------------
ie, inching ever closer to natural language. And I think we can get even closer still!

exponentiating operators

So, it is a common thing to apply the same operator over and over again.
say:
op op op op op |x>
So that motivates a short cut:
op^5 |x>

Let's give an example using a simple binary tree:
left |x> => |0>
right |x> => |1>
child |x> => |0> + |1>

left |0> => |00>
right |0> => |10>
child |0> => |00> + |10>

left |1> => |01>
right |1> => |11>
child |1> => |01> + |11>

left |00> => |000>
right |00> => |100>
child |00> => |000> + |100>

left |10> => |010>
right |10> => |110>
child |10> => |010> + |110>

left |01> => |001>
right |01> => |101>
child |01> => |001> + |101>

left |11> => |011>
right |11> => |111>
child |11> => |011> + |111>

Now, lets use it:
sa: child |x>
|0> + |1>

-- without using the notational short-cut:
sa: child child |x>
|00> + |10> + |01> + |11>

-- using it:
sa: child^2 |x>
|00> + |10> + |01> + |11>

sa: child^3 |x>
|000> + |100> + |010> + |110> + |001> + |101> + |011> + |111>

sa: child^4 |x>
|>
-- we are past the bottom of the tree!

Note that child^k only gives you horizontal slices through the tree.
What if you want a vertical component too?
We can do that.

First recall from maths: exp(x)
And that exp(x) has the Taylor Series:
exp(x) = 1 + x + x^2/2 + x^3/3! + x^4/4! + x^5/5! + ...

If we tidy that up (we don't need the n! denominators), we have:
exp[op,n] |x>
is equivalent to:
(1 + op + op^2 + op^3 + ... + op^n) |x>

eg:
-- equivalent to:
-- |x> + child |x> + child^2 |x>
sa: exp[child,2] |x>
|x> + |0> + |1> + |00> + |10> + |01> + |11>

sa: exp[child,3] |x>
|x> + |0> + |1> + |00> + |10> + |01> + |11> + |000> + |100> + |010> + |110> + |001> + |101> + |011> + |111>

sa: exp[child,2] |1>
|1> + |01> + |11> + |001> + |101> + |011> + |111>

sa: exp[left,2] |1>
|1> + |01> + |001>

Finally, what if you want everything, but you don't know how deep to go?
Hence:
exp-max[op] |x>
which is equivalent to:
exp[op,n] |x> such that exp[op,n] |x> == exp[op,n+1] |x>

eg:
sa: exp-max[child] |x>
|x> + |0> + |1> + |00> + |10> + |01> + |11> + |000> + |100> + |010> + |110> + |001> + |101> + |011> + |111>

sa: exp-max[child] |0>
|0> + |00> + |10> + |000> + |100> + |010> + |110>

Finally, the idea of "six degrees of separation"
If we had the data, then eg:
sa: friends^6 |Fred>
sa: exp[friends,6] |Fred>
sa: exp-max[friends] |Fred>

That's more than enough for now. More later!

Update: now with the magic of tables:
sa: load binary-tree-with-child.sw
sa: table[node,child,text] exp-max[child] |x>
+------+----------+-------------------+
| node | child    | text              |
+------+----------+-------------------+
| x    | 0, 1     | start node        |
| 0    | 00, 10   | first child node  |
| 1    | 01, 11   | second child node |
| 00   | 000, 100 | third child node  |
| 10   | 010, 110 | fourth child node |
| 01   | 001, 101 | fifth child node  |
| 11   | 011, 111 | sixth child node  |
| 000  |          |                   |
| 100  |          |                   |
| 010  |          |                   |
| 110  |          |                   |
| 001  |          |                   |
| 101  |          |                   |
| 011  |          |                   |
| 111  |          |                   |
+------+----------+-------------------+

Saturday 29 November 2014

linearity of operators

One of the powerful features of BKO is the linearity of (most) operators.
eg, say we start with:
  op1 |x> => |a> + |b> + |c>
Then:
  op2 op1 |x>
  = op2 (|a> + |b> + |c>)
  = op2 |a> + op2 |b> + op2 |c>

Let's do a simple example in the console:
-- enter in some knowledge:
sa: friends |Fred> => |Sam> + |Mary> + |Bella>
sa: age |Sam> => |age: 39>
sa: age |Mary> => |age: 42>
sa: age |Bella> => |age: 36>

-- see what we know:
-- who are Fred's friends?
sa: friends |Fred>
|Sam> + |Mary> + |Bella>

-- how old are Fred's friends?
sa: age friends |Fred>
|age: 39> + |age: 42> + |age: 36>

-- enter in some more knowledge:
sa: friends |Sam> => |Liz> + |Rob>
sa: friends |Bella> => |Joan> + |Tom>

-- who are Sam's friends?
sa: friends |Sam>
|Liz> + |Rob>

-- who are Mary's friends?
sa: friends |Mary>
|>
-- in English |> means "I don't know anything about that".
-- so instead of bugging out if you ask it something it doesn't know, it just quietly returns |> (the empty ket)

-- who are Bella's friends?
sa: friends |Bella>
|Joan> + |Tom>

And now finally:
-- who are Fred's friends of friends?
sa: friends friends |Fred>
|Liz> + |Rob> + |Joan> + |Tom>

And note that it quietly made use of:
"superposition + |> = |> + superposition = superposition"
ie, |> is the identity element for superpositions.
So even though we don't know who's Mary's friends are, the code didn't bug out, and replied best it could.
I like to think humans do something similar.

-- what are the ages of Fred's friends of friends?
sa: age friends friends |Fred>
|>
-- ie, we don't know!


Then finally, let's look at what we now know:
sa: dump
----------------------------------------
|context> => |context: sw console>

friends |Fred> => |Sam> + |Mary> + |Bella>

age |Sam> => |age: 39>
friends |Sam> => |Liz> + |Rob>

age |Mary> => |age: 42>

age |Bella> => |age: 36>
friends |Bella> => |Joan> + |Tom>
----------------------------------------
(and we forgot to set context when we started, so it used the default context "sw console")

And that's it for now!

introducing the semantic agent console

So, we have a general scheme to represent knowledge:
OP KET => SUPERPOSITION
Now we need some method to make use of it.
And hence the semantic agent console.
From here on, anything that starts with "sa: " can be assumed to be in the console.
$ ./the_semantic_db_console.py
Welcome!

sa: h

  q, quit, exit                quit the agent.
  h, help                      print this message
  context                      print list of context's
  context string               set current context to string
  reset                        reset back to completely empty console
                               Warning! you will lose all unsaved work!
  dump                         print current context
  dump exact                   print current context in exact mode
  dump multi                   print context list
  dump self                    print what we know about the default ket/sp
  dump ket/sp                  print what we know about the given ket/sp
  display                      (relatively) readable display of current context
  display ket/sp               (relatively) readable display about what we know for the ket/sp
  freq                         convert current context to frequency list
  mfreq                        convert context list to frequency list
  load file.sw                 load file.sw
  save file.sw                 save current context to file.sw
  save multi file.sw           save context list to file.sw
  files                        show the available .sw files
  cd                           change and create if necessary the .sw directory
  ls, dir, dirs                show the available directories
  create inverse               create inverse for current context
  create multi inverse         create inverse for all context in context list
  x = foo: bah                 set x (the default ket) to |foo: bah>
  id                           display the default ket/superposition
  s, store                     set x to the result of the last computation
  .                            repeat last computation
  i                            interactive history
  history                      show last 30 commands
  history n                    show last n commands
  save history                 save console history to file
  -- comment                   ignore, this is just a comment line.
  if none of the above         process_input_line(C,line,x)


personality profile for a bot called bella

This is just an example of the kind of knowledge a bot might need to try the Loebner contest.
First, we build up a data set of personality details.
Then we choose elements randomly, and we get something like this:
name |bot: Bella> => |Bella>
mother |bot: Bella> => |Mia>
father |bot: Bella> => |William>
birth-sign |bot: Bella> => |birth-sign: Cancer>
number-siblings |bot: Bella> => |number: 1>
wine-preference |bot: Bella> => |wine: Merlot>
favourite-fruit |bot: Bella> => |fruit: pineapples>
favourite-music |bot: Bella> => |music: genre: punk>
favourite-play |bot: Bella> => |play: Endgame>
hair-colour |bot: Bella> => |hair-colour: gray>
eye-colour |bot: Bella> => |eye-colour: hazel>
where-live |bot: Bella> => |location: Sydney>
favourite-holiday-spot |bot: Bella> => |location: Paris>
make-of-car |bot: Bella> => |car: Porsche>
religion |bot: Bella> => |religion: Christianity>
personality-type |bot: Bella> => |personality-type: the guardian>
current-emotion |bot: Bella> => |emotion: fear>
bed-time |bot: Bella> => |time: 8pm>

some ket label conventions

Now, as I said previously, ket labels can be anything except <, |, > and \r and \n.
But we have some conventions.

For a start we separate categories from sub-categories using ": ".
eg:
|plant: tree>
|plant: tree: elm>
also with the convention that the sub-categories are a subset of the parent category.
(though perhaps in some cases it is useful to break this convention).
So in this case, tree is clearly a subset of the concept plant.
And elm is clearly a subset of the concept tree, and plant.

Note that we are not forced to use categories (which I guess could equally validly be called "data-types").
eg:
closing-time |monday> => |5pm>
vs:
closing-time |day: monday> => |time: 5pm>
So, while categories/data-types are not mandated, they are often useful.

Next, we have _list.
We use this (again by convention) to define a list of objects.
eg, the states in Australia:
|Australia: state: _list> => |state: Western Australia> + |state: South Australia> + |state: Queensland> + |state: New South Wales> + |state: Victoria> + |state: Tasmania>

eg, shopping list:
|shopping: _list> => 4|apples> + 6|bananas> + 2|tomatos> + |coffee> + |milk> + |bread>

Finally, we have *.
|category: *> matches all members of "category".
And |*> matches everything!
(we have some "magic" behind the scenes to make this work. More on that later)

Here is an example, learning that all plants die:
will-die |plant> => |yes>
will-die |plant: *> => |yes>

Then if we ask the system:
will-die |plant: tree> we get |yes>
And if we ask:
will-die |plant: tree: elm> we get |yes>

More to come!

Update: we have conventions for operator names too!
There are two main types of operators, those that step from one ket to other kets, and those that only change the coeff of the ket. It is sometimes useful to distinguish between these two cases. eg:
op |x> => |a> + |b> + |c> is type 1.
op |x> => 3.14159|x> is type 2.
The convention (though sometimes ignored) is the for this second type to have -self appended to its name. eg, a common usage:
population |Adelaide> => |population: 1200000>
vs:
population-self |Adelaide> => 1200000|Adelaide>
Type 2 operators are very useful when working with lists of objects. Examples of this later!
The other thing to note is that in general type 1 operators do not commute, but type 2 operators always commute. op2-self op1-self |x> == op1-self op2-self |x> for example.

Friday 28 November 2014

Tim Berners-Lee flowchart in BKO

So, the standard learn rule is already quite powerful (though we add more power later).
OP KET => SUPERPOSITION

Certainly powerful enough to map a flow chart to a relatively readable BKO.
Consider the flow chart from the original html proposal:
In BKO is simply:
|context> => |context: www proposal>

describes |document: www proposal> => |"Hypertext"> + |A Proposal "Mesh">
refers-to |document: www proposal> => |Comms ACM>

describes |Comms ACM> => |"Hypertext"> 

includes |"Hypertext"> => |Linked information> + |Hypermedia>

for-example |Linked information> => |Hyper Card> + |ENQUIRE> + |A Proposal "Mesh">

describes |a proposal "mesh"> => |CERN>
unifies |a proposal "mesh"> => |ENQUIRE> + |VAX/NOTES> + |uucp News> + |CERNDOC>

examples |Computer conferencing> => |IBM GroupTalk> + |uucp News> + |VAX/NOTES> + |A Proposal "Mesh">

for-example |Hierarchical systems> => |CERN> + |CERNDOC> + |Vax/Notes> + |uucp News> + |IBM GroupTalk>

includes |CERNDOC> => |document: www proposal>

wrote |person: Tim Berners-Lee> => |document: www proposal>

 
So, if we study the flow-chart vs the BKO we observe that nodes in the chart map to kets, and arrows map to operators. eg, |document: www proposal> and "describes" respectively.
Indeed, if text was not 1D, we could actually put the operators above the => symbol, to make the mapping even clearer.

I guess the final thing to note is that all the coeffs are 1, since in this flow-chart the arrows do not specify weights.

And that's it for now!

context, learn and recall

At the very heart of this project are two functions:
context.learn(a,b,c)
context.recall(a,b)

First thing to note is the context variable.
Every thing we learn is with respect to a particular context.
In one context, everything must be (sort of) self consistent (though the nature of superpositions means we can tolerate ambiguity/inconsistency somewhat).
In different context, well, they are fully independent, so you can define anything.

The convention to define context is this learn rule:
|context> => |context: a sample context>

Perhaps a small example, consider 2 shops with different closing times on Mondays:
|context> => |context: shop 1>
closing-time |monday> => |time: 5pm>

|context> => |context: shop 2>
closing-time |monday> => |time: 6pm>

Now, if these were defined in the same context, then the second closing-time rule would over-write the first. But since they are in different context, it is perfectly fine.

Next, I have the assumption that all knowledge can be represented by triples (a,b,c):
context.learn(a,b,c)
And context.recall(a,b) returns c.

Maybe this assumption is wrong, or not the entire picture, but we can certainly go a long way using it.

For example:
|context> => |context: shop 1>
has:
a = ""
b = "context"
c = "context: shop 1"

closing-time |monday> => |time: 5pm>
has:
a = "closing-time"
b = "monday"
c = "time: 5pm"

Currently in the project we only handle a few data-types for a,b,c (and I don't currently see the need to add any more).
a must be a string.
b is a string or a ket
c is a string (auto cast to a ket), a ket, a superposition, or a stored_rule

And the back-end for context.learn() and context.recall() is currently a convoluted hash-table. But it shouldn't be too hard to swap that out for something better (eg a database or something) if we have to.

That's it for now.

Update: I rewrote the context.learn() and context.recall() code. Much simpler, more memory efficient and faster than the old one. eg, converting the geonames 15000 cities data to sw format was a full 24 times faster with the new code. And this also means loading Moby synonyms or loading geonames cities 1000 into the console, and working with it, is now practical.

Thursday 27 November 2014

Defining some terms

So, the notation I use is motivated by the bra-ket notation as used in quantum mechanics, and invented by the famous physicist Paul Dirac. Note though that the mathematics of my scheme and that of QM are vastly different.

Let's define the terms:
<x| is called a bra
|x> is called a ket
Essentially any text (currently ASCII only) can be inside bra/kets except <, |, > and \r and \n. Though we do have some conventions, more on that later.

Next, we have operators, that are again text.
The python defining valid operators is:
def valid_op(op):
  if not op[0].isalpha() and not op[0] == '!':
    return False
  return all(c in ascii_letters + '0123456789-+!?' for c in op) 
Next, we have what I call "superpositions" (again borrowed from QM). A superposition is just the sum of one or more kets.
eg:
|a> + |b> + |c>
But a full superposition can also have coeffs (I almost always write coefficients as coeffs).
3|a> + 7.25|b> + 21|e> + 9|d>

The name superpositions is partly motivated by Schrodinger's poor cat:
is-alive |cat> => 0.5 |yes> + 0.5 |no>

This BTW, is what we call a "learn rule" (though there are a couple of other variants).
They have the general form:
OP KET => SUPERPOSITION

Next, we have some math rules for all this, though for now it will suffice to mention only these:
1) <x||y> == 0 if x != y.  
2) <x||y> == 1 if x == y.
7) applying bra's is linear. <x|(|a> + |b> + |c>) == <x||a> + <x||b> + <x||c>
8) if a coeff is not given, then it is 1. eg, <x| == <x|1 and 1|x> == |x>
9) bra's and ket's commute with the coefficients. eg, <x|7 == 7 <x| and 13|x> == |x>13  
13) kets in superpositions commute. |a> + |b> == |b> + |a>
18) |> is the identity element for superpositions. sp + |> == |> + sp == sp.
19) the + sign in superpositions is literal. ie, kets add.
|a> + |a> + |a> = 3|a>
|a> + |b> + |c> + 6|b> = |a> + 7|b> + |c>

And that is it for now. Heaps more to come!

Update: I guess you could call superpositions labelled, sparse vectors. By giving each element in a vector a name, it gives us the freedom to drop elements with coeff = 0. For large sparse vectors, this is a big win. For large dense vectors, there is of course a price to pay for all those labels. And since we have labels we can change the order of the elements without changing the meaning of the superposition. Say if we want to sort by coeff size. This is harder if you use standard unlabeled vectors. I guess the other thing to note about superpositions is that they allow you to define operators with respect to vector labels, and not vector positions.

Announcing the Semantic DB Project

So, I've been tinkering/coding some ideas for knowledge representation and semantic web for about a year now. Roughly 14,000 lines of python for the whole project. Anyway, I think now is the time to try and get others interested in my project. I like to think I have some interesting ideas, let's see if others do.

In terms of describing what I'm trying to do, I currently only have this though it is rather minimalist when it comes to words. Its really just a series of examples of my notation, rather than a decent write-up. Maybe this blog will help explain my ideas.

More to come!