Your Brain, Copyright, and Lossy Compression

Last week, the New Zealand government passed a controversial copyright law related to file sharing. This was partly outrageous because of the use of urgency to pass the law without due consultation. If you watch any of the videos from that debate, it will shine a light on just how clueless the majority of NZ’s politicians are; the notable exceptions are Clare Curran and Gareth Hughes. However, this isn’t a post about the politics! Instead I want to talk about the philosophy behind copyright, and how, as technology becomes an intrinsic part of our intelligence, it makes less sense to challenge the personal storage or dispersal of information.

For a good introduction to the topic, read this post on the “colour of bits”. The post outlines the conflicting viewpoints on information: computer scientists can’t meaningfully differentiate between one copy of a copyrighted piece of data and another, yet legal pressure demands that a distinction be invented regardless (e.g. DRM). It also discusses how, if you perform a reversible mathematical transformation on the bits, you fundamentally change the data but can restore it at any moment. If you can do that, is the transformed version copyrighted too? Given that, with the right transformation, you can turn any sequence of bytes into any other, that would leave only one copyright holder: the universe.

The weird and wonderful world of UAVs

Recently, my long-time friend and colleague Ben Goertzel came to Hong Kong to help advise on the AI project I’m working on. He also happened to bring a “Parrot” quadcopter (warning, this link autoplays a youtube video), which is an awesome wifi-controlled toy with four rotors. Not much different to a radio-controlled helicopter, except it’s much cheaper and also more stable.

There are some vague plans to do autonomous control of these devices using vision processing and voice recognition, although the actual hardware used may be different, since quadcopter drones have gone hobbyist and you can build your own from scratch.

I will now leave you with two youtube videos of them in action…

Using the Kinect hacked onto a drone to allow it to build a 3D model of the environment and do its own path finding:

Others have used motion capture to track a ball, which is then juggled between two quadcopters fitted with trampolines:

Q&A about open-source AGI development

Ben Goertzel recently asked several people for comment about open-source AGI development for a couple of pieces of writing he’s working on. I thought I’d share my own responses, and I’ll update the post later with Ben’s finished product that will have responses from others too.

What are the benefits you see of the open-source methodology for an AGI project, in terms of effectively achieving the goal of AGI at the human level and ultimately beyond? How would you compare it to a traditional closed-source commercial methodology, or to a typical university research project in which software code isn’t cleaned up and architected in a manner conducive to collaborative development by a broad group of people?

I believe open source software is beneficial for AGI development for a number of reasons.

Making an AGI project OSS gives the effort persistence and allows some coherence in an otherwise fragmented research community.

Everyone has their own pet theory of AGI, and providing a shared platform on which to test these theories invites collaboration, I think. Even if the architecture of a project doesn’t fit a particular theory, learning that fact is valuable, along with knowing where the approaches diverge.

More than one commercial project with AGI-like goals has run into funding problems. If the company then dissolves, there will often be restrictions on how the code can be used, or it may even be shut away in a vault and never be seen again. Making a project OSS means that funding may come and go, but the project will continue to make incremental progress.

OSS also prompts researchers to apply effective software engineering practices. Code developed for research can often end up a mess because it is worked on by a single developer without peer review. I was guilty of this in the past, but working and collaborating with a team means I have to comment my code and make it understandable to others. Because my efforts are visible to the rest of the world, there is more incentive to design and test properly instead of doing just enough to get results and publish a paper.

How would you say OpenCog has benefitted specifically from its status as an OSS project so far?

I think OpenCog has benefited in all the ways I’ve described above.

We’re fortunate to also have had Google sponsor our project for the Summer of Code in 2008 and 2009. This initiative brought in new contributors, as well as helping us improve documentation and guides to make OpenCog more approachable to newcomers. As one might imagine, there is a steep learning curve to the ins and outs of an AGI framework!

In what ways would you say an AGI project differs from a typical OSS project? Does this make operating OpenCog significantly different from operating the average OSS project?

One of the most challenging things about building an OSS AGI project, compared to any other, is that most OSS projects have a clear end use. A music player plays music, a web server serves web pages, and a statistical library provides implementations of statistical functions.

An AGI, on the other hand, doesn’t really reach its end use until it’s complete. Thus creating packaged releases and following the traditional development cycle is not as well defined. We are working to improve this with projects that apply OpenCog to game characters and other domains, but the core framework is still a mystery to most people. It takes a certain level of investment before you can see how you might apply the server and other aspects of OpenCog in your applications.

However, a number of projects associated with OpenCog have made packaged releases. RelEx, the NLP relationship extractor, and MOSES, a probabilistic genetic programming system, are both standalone tools.

Some people have expressed worries about the implications of OSS development for AGI ethics in the long term. After all, if the code for the AGI is out there, then it’s out there for everyone, including bad guys. On the other hand, in an OSS project there are also generally going to be a lot more people paying attention to the code to spot problems. How do you view the OSS approach to AGI on balance — safer or less safe than the alternatives, and why? And how confident are you of your views on this?

I believe the concerns about OSS development of AGI are exaggerated. We are still in the infancy of AGI development, and scare-mongering that says such efforts shouldn’t happen at all won’t solve anything. Much like prohibition, making something illegal or refusing to do it will just leave it to more unscrupulous types.

I’m also completely against the idea of a group of elites developing AGI behind closed doors. Why should I trust self-appointed guardians of humanity? This technique is often used by the less pleasant rulers of modern-day societies: “Trust us – everything will be okay! Your fate is in our hands. We know better.”

The open-source development process allows developers to catch one another’s coding mistakes. By the time a project reaches fruition it typically has many contributors, and many eyes on the code will catch what smaller teams may not. It also allows other Friendly AI theorists to inspect the mechanism behind an AGI system and make specific comments about the ways in which Unfriendliness could occur. When everyone’s AGI system is created behind closed doors, such specific comments cannot be made, or proven to be correct.

Further, much of the trajectory of an AGI system will depend on its initial conditions. Indeed, even the apparent intelligence of the system may be influenced by whether it has the right environment and whether it’s bootstrapped with knowledge about the world. Just as an ultra-intelligent brain sitting in a jar with no external stimulus would be next to useless, so would a seed AI that doesn’t have a meaningful connection to the world… (despite potential claims otherwise, I can’t see a seed AI developing in an ungrounded null-space).

I’m not 100% confident of this, but I’m a rational optimist. Much like I’m a fan of open governance, I feel the fate of our future should also be open.

Are there any other relevant questions you think I should have asked?
If so feel free to pose and answer them for me 😉 …

When will the singularity occur? … would be the typical question the press would ask so that they can make bold claims about the future.

But my answer to that is NaN. 😉

Variadic template example in C++0x

I must have been living under a rock because I’d been unaware of all the cool additions being added in C++0x. I thought I’d do a series of examples showing simple use of several of these. This one is about variadic templates, namely, templates that can have any number of types associated with them.

So not only can you do:

template <typename T> int my_function(T arg) {...}

But you can do this:

template <typename T, typename... Args> int my_function(T arg, Args... the_rest) {...}

Wikipedia has a good overview, so I’ll direct you there for more. I’m essentially copying their printf example, but I had a few issues compiling it, so for completeness I’ll mention them here:

#include <stdexcept>
#include <iostream>
#include <string>

void mprintf(const char* s)
{
    while (*s) {
        if (*s == '%' && *(++s) != '%')
            throw std::runtime_error("invalid format string: missing arguments");
        std::cout << *s++;
    }
}

template<typename T, typename... Args>
void mprintf(const char* s, T value, Args... args)
{
    while (*s) {
        if (*s == '%' && *(++s) != '%') {
            std::cout << value;
            mprintf(s, args...); // call even when *s == 0 to detect extra arguments
            return;
        }
        std::cout << *s++;
    }
    throw std::logic_error("extra arguments provided to printf");
}

int main(int argc, char** argv)
{
    std::string s("test");
    std::string s2("test2");
    mprintf("% - %", s, s2);
    return 0;
}

Compile with (assuming the code above is saved as vtemplate-example.cpp):

gcc -std=c++0x -lstdc++ -o vtemplate-example vtemplate-example.cpp

If you run the executable vtemplate-example you should get the output “test - test2”.

A couple of notes:

  • I renamed printf (used in the Wikipedia example) to mprintf because the compiler complained about the call to the overloaded printf being ambiguous.
  • If you are on OSX, make sure you have gcc-4.4 installed and that you invoke it instead of the Apple gcc. If you try to use the Apple version of gcc it won’t understand the -std=c++0x option.

I hope to play with lambda functions sometime soon too.

Idea: Combine Flattr and Boxee

This is something I’ve been simmering on for a while, hoping against hope that an influx of spare time might let me implement it. Alas, it seems unlikely, so here’s a post to push it into the collective unconscious of the internet.

Boxee is a platform for media PCs based on XBMC which is free to download (they eventually plan to release a hardware appliance). Flattr is a social micropayment site that I think has got the payment system just right: you choose a fixed monthly amount, and Flattr evenly distributes it amongst the things you “flattr”, so you can freely flattr things without worrying about breaking the bank.

I would like to combine the two. I have a lot of media, but I rarely watch broadcast TV. I’m not purposefully trying to avoid the media creators being paid, but:

  • I want to watch shows when and where I feel like it, not at designated times.
  • I don’t want to be bombarded by inane advertising that is so blatantly manipulative and in your face that it usually puts me off the products.
  • I don’t want to accumulate more physical stuff by buying bits of plastic (aka DVDs)
  • I hate iTunes – I wouldn’t trust it to do… well anything, except reliably have the GUI thread lock up while trying to use it (yes, on OSX as well as Windows)

Thus I resort to downloading torrents and sharing media with friends. I want to give back to the media creators, but it has to be convenient. To me, convenience is the primary factor driving piracy: I can download almost any piece of media, usually faster than it takes me to go to the shop or DVD rental store. I also don’t have to return anything or accumulate and waste physical materials.

So my idea is to set up a trust of some sort, which creates a plugin for Boxee that in turn flattrs the media you watch. This trust would then be responsible for passing the payment on to the original creators. The difficult part is getting the money to the creators, but you could allow creators to claim content, and/or only seek out the creator once a certain threshold of money has been assigned to a piece of content.

Flattr also allows anonymous flattrs, so it’d hopefully protect people from being singled out for piracy while the law catches up with digital reality. Besides, you could also allow people to just flattr episodes they are fond of, so that there is no evidence of whether it’s just a fan or someone who “illegally” downloaded the show.

Free will and chaotic brains

My personal take on free will is that it’s an illusion, as is consciousness.

The impression of free will is very believable though, as the brain probably exhibits chaotic dynamics[1]. From any given state the brain is in, a slight change, however minute, could give rise to a very different outcome later on. This means that for any model external to an individual brain (e.g. a brain simulation, if such a thing is possible), it is impossible to completely predict the behaviour of that brain… eventually the brain’s state will diverge from the model. The important point is that this can happen even if the brain is completely deterministic. So even if the rules governing our cognition are unwavering instructions, which I think is unlikely, a system outside of the brain is still unable to predict its behaviour[2].

In addition, I believe that consciousness is due to a recursive model that represents ourselves (à la Douglas Hofstadter’s book, I Am a Strange Loop). As this is a model of the epiphenomenon of our “self”, it has incomplete knowledge of the rest of the brain – this gives our conscious minds the illusion of free will, as they can’t completely predict what it/we will do next. We think we are weighing up choices based on our knowledge and then making a “decision”, but that’s because we (our conscious minds) don’t have complete knowledge of the brain’s underlying hardware, which ultimately leads us to that choice. This lack of knowledge in our conscious minds is what we call “free will”.

[1] Or at least I’d expect it to; I don’t have references I’ve read over, but this looks promising.

[2] That is, assuming we exclude the almost impossible ideal of having perfect knowledge of the brain’s state which would include all neurochemistry as well as structure.

This post is taken from a comment I made on Leo Parker Dirac’s post “Free Will and Turing-completeness of the Brain”. It turns out to be a relatively succinct description of what the concept of free will actually is, so I thought I’d repost it here…

Don’t become a closed system

Another post from the draft pile that I finally polished into something that isn’t a series of half-formed sentences… enjoy 😉

The human body as a closed system is not sustainable, as any closed system eventually achieves an equilibrium lacking order. Entropy would increase as the second law of thermodynamics asserts itself. Flux of energy/matter is required to maintain and build order. This is a central part of Ludwig von Bertalanffy’s paper on “general systems theory” and his theory of open systems:

“the conventional formulation of physics are, in principle, inapplicable to the living organism being open system having steady state. We may well suspect that many characteristics of living systems which are paradoxical in view of the laws of physics are a consequence of this fact.”

I think, though, that a similar law applies to intelligent systems. Without stimulus the mind is not alive, and eventually a lack of synaptic firing would cause the weighting between neurons to deteriorate. This would result in a reversion to the initial states most artificial neural networks start in (they are usually initialised with random weights)… but perhaps this reversion of weights on neurons that no longer fire isn’t a bad thing. It may lead to them being re-purposed…

As one ages, it can become more difficult to pick up new information: existing synaptic channels get reinforced, and so the neuronal tributaries of our brains become less used, or require more active effort to use than taking the ready associations that come easily to our consciousness. While these tributaries may get reset to random weightings through disuse, this may allow them to later be stimulated and used to store new associations.

The NY Times earlier this year posted “How to train the aging brain”:

“There’s a place for information,” Dr. Taylor says. “We need to know stuff. But we need to move beyond that and challenge our perception of the world. If you always hang around with those you agree with and read things that agree with what you already know, you’re not going to wrestle with your established brain connections.”

Such stretching is exactly what scientists say best keeps a brain in tune: get out of the comfort zone to push and nourish your brain. Do anything from learning a foreign language to taking a different route to work.

These new scenarios make the brain utilise alternative neuronal branches:

“As adults we have these well-trodden paths in our synapses,” Dr. Taylor says. “We have to crack the cognitive egg and scramble it up. And if you learn something this way, when you think of it again you’ll have an overlay of complexity you didn’t have before — and help your brain keep developing as well.”

Not only that, but if you encourage more interesting events in your life, especially those that push and challenge you and your preconceptions, then your perception of time expands. While in the moment it may seem like time flies, retrospectively the past will seem to have taken longer. The brain collapses intervals of time where nothing much happens.

So if you don’t push your brain to learn new things, you’re cutting it off from having anything new to work with. It will also be easier to efficiently and compactly store your experiences based on what you already know. This shrinks your temporal impression of memory and, retrospectively, it will seem as though the last 5 or 10 years were but a blink. If you keep using the same arguments, and facing the same challenges, then you will become optimised and specialised at that task, but this will come at the cost of generality and breadth of understanding.

Measuring text information content through the ages…

Earlier this week I met with a linguistics PhD student from Victoria University named Myq, and we discussed a variety of topics. I shared my experience with OpenCog and suggested he check out RelEx. He discussed his work on disproving a study which investigated the number of words required in a piece of text to retain the core meaning. Basically, a lot of the words in text/speech, although useful for stringing ideas together, are not vital to the message being carried.

This got me thinking…

Since I’m working on NetEmpathy, which is currently focussed on analysing the sentiment of tweets, I’ve noticed that the density of meaning in tweets (when there is any) is very high. There’s little space for superfluous flowery text when you only have 140 characters.

Myq mentioned how academic papers are a lot like this now. The meaning is highly compressed, particularly in scientific papers: you’ve got to summarise past research, state your method so that it’s reproducible, analyse the results, etc., all in half a dozen pages. This wasn’t always the case though. In the past, academic papers were long works which meandered their way to the point. Part of this might have to do with the amount of preexisting knowledge in society; i.e., earlier on there was less global scientific knowledge available, so adequately covering the background of a subject wasn’t a major difficulty, and authors could spend more time philosophising. That’s a topic for another post though…

What I was interested in is how densely information is packed. Is this increasing?

My immediate thoughts were: text compression! and measure the entropy!.

Basically, information theory dictates that text containing less information can be represented in fewer bytes. This is what makes lossless compression possible: you assign frequent symbols shorter representations. For example, because ‘the’ is one of the most common English words, you might replace it with ‘1’ (and, crudely, replace ‘1’ with ‘the’ so that you could still use ‘1’ normally). This way you’ve reduced the size of that symbol by two thirds without loss of information. Obviously the same trick wouldn’t improve your compression factor on a spreadsheet full of numbers, though.

A guy called Douglas Biber has apparently already investigated this information content historically, but from a more linguistic and manual angle.

What I’d like to do one day is examine the compression factors of early scientific journals, recent journals, tweets, txt messages, wikipedia, etc. and see just how the theoretical information content has changed, if at all.

Another project for when I’m independently wealthy.

Sexism, Racism and the Ism of Reasoning

Note: this post is not to condone racism or sexism; it is merely an explanation of how they might come about from embodied experience and probabilistic reasoning, as well as how we might protect against them.

Things like racism or sexism, or over-generalising about a class of people, are among the more socially inappropriate things you can do. However, depending on how your logic system works, it’s not an entirely unreasonable method of thinking (the word “unreasonable” chosen purposefully), and for any other subject, where the things being reasoned about are not humans, we wouldn’t particularly care. In fact, certain subjects like religion and spirituality are held to less strict standards of reasoning… there’s actually more defence in being racist/sexist than in being a practitioner of certain religions. Perhaps this is why these occasionally go hand in hand[1].

So what do I actually mean by this? I’m going to use two methods of reasoning, deduction and induction, and explain them in terms of uncertain truth. Nothing in this world is ultimately absolute[2], and so it behooves us to include probabilistic uncertainty in any conclusion or relationship within our logic set.


Empathy in the machine

A draft post/idea from the archives that I thought it was about time I released. Funnily, this was written entirely before I started working on NetEmpathy; maybe it’s not as disconnected from AGI as I thought after all!

It is my belief that empathy is a prerequisite to consciousness.

I recently read Hofstadter’s I Am a Strange Loop, whose central theme is that recursive representations of self lead to our perception of consciousness. For some, the idea that our consciousness is somewhat of an illusion might be hard to swallow – but then, quite likely, so are all the other qualia. They seem real to us because our mind makes them real. To me, it’s not a huge hurdle to believe. I find the idea that our minds are infinitely representing themselves via self-reflection beautiful in its simplicity. Very strange things can happen when systems start self-reflecting.

For example, Gödel’s incompleteness theorem originally broke Principia Mathematica, and can do the same for any sufficiently expressive formal system when you force that system to reason about itself. One day I’ll commit to explaining this in a post, but people write entire books to make Gödel’s theorem and its consequences easy to understand!

And as an example of self-reflection and recursion being beautiful, I merely have to point to fractals, which exhibit self-similarity at arbitrary levels of recursion. Or perhaps the recursive and repeating hallucinations induced by psychedelics give us some clue about the recursive structures within the brain.

Hofstadter also, later in the book, delves into slightly murky mystical waters, which I find quite entertaining and not without merit. He says that, through our modelling of the behaviour of others, we also start representing their consciousness. The eventual conclusion, explained in much greater philosophical detail in his book, is that our “consciousness” isn’t just the sum of what’s in our head but a holistic total of ourselves and everyone’s representation of us in their heads.

I don’t think the Turing test will really be complete until a machine can model humans as individuals and make insightful comments on their motivations. OK, so that wouldn’t formally be the Turing test any more, but I think that, as a judgement of conscious intelligence, the artificial agent needs to at least be able to reflect the motivations of others and understand the representation of itself within others. Lots of recursive representations!

The development of consciousness within AI via empathy is what, in my opinion, will allow us to create friendly AI. Formal proofs won’t work, due to the computational irreducibility of complex systems. In an admittedly strained analogy, this is similar to trying to formally prove where a toy sailboat will end up after dropping it in a river upstream: trying to prove that it won’t get caught in an eddy before it reaches the ocean of friendliness (or, if you’re pessimistic, you might view the eddy as the small space of possibilities for friendly AI). Sure, computers and silicon act deterministically (for the most part), but any useful intelligence will interact with an uncertain universe. It will also have to model humans out of necessity, as humans are among the primary agents on Earth that it will need to interact with… perhaps not if it becomes all-powerful, but certainly initially. By modelling humans, it’s effectively empathising with our motivations and causing parts of our consciousness to be represented inside it[1].

Given that a machine could increase its computational capacity exponentially via Moore’s law (not to mention via potentially large investment and subsequently rapid datacenter expansion), it could eventually model many more individuals than any one human does. So if the AI held a large number of simulated human minds, which would, if accurately modelled, probably baulk at killing the originals, then any actions the AI performed would likely benefit the largest number of individuals.

Or perhaps the AI would become neurotic trying to satisfy the desires and wants of conflicting opinions.

In some ways this is similar to Eliezer’s Coherent Extrapolated Volition (as I remember it at least… it was a long time ago that I read it; I should do so again to see how/if it fits with what I’ve said here).

[1] People might claim this won’t be an issue because digital minds designed from scratch will be able to box up individual representations to prevent bleed-through of beliefs. Unfortunately, I don’t think this is a tractable design for AI, even if it were desirable. AI is about efficiency of computation and representation, so these concepts and beliefs will blend. Besides, conceptual blending is quite likely a strong source of new ideas and hypotheses in the human brain.