AI Foom Debate: Posts 29–31

29. I Heart CYC (Hanson)

Hanson endorses CYC, an AI project headed by Doug Lenat, the inventor of EURISKO.

The lesson Lenat took from EURISKO is that architecture is overrated; AIs learn slowly now mainly because they know so little. So we need to explicitly code knowledge by hand until we have enough to build systems effective at asking questions, reading, and learning for themselves. Prior AI researchers were too comfortable starting every project over from scratch; they needed to join to create larger integrated knowledge bases. This still seems to me a reasonable view, and anyone who thinks Lenat created the best AI system ever should consider seriously the lesson he thinks he learned.

Yudkowsky replies with this hilarious comment:

“So my genuine, actual reaction to seeing this post title was ‘You heart WHAT?’

Knowledge isn’t being able to repeat back English statements. This is true even of humans. It’s a hundred times more true of AIs, even if you turn the words into tokens and put the tokens in tree structures.

A basic exercise to perform with any supposed AI is to replace all the English names with random gensyms and see what the AI can still do, if anything. Deep Blue remains invariant under this exercise. Cyc, maybe, could count – it may have a genuine understanding of the word “four” – and could check certain uncomplicatedly-structured axiom sets for logical consistency, although not, of course, anything on the order of say Peano Arithmetic. The rest of Cyc is bogus. If it knows about anything, it only knows about certain relatively small and simple mathematical objects, certainly nothing about the real world.

You can’t get knowledge into a computer that way. At all. Cyc is composed almost entirely of fake knowledge (barring anything it knows about certain simply-structured mathematical objects).

As a search engine or something, Cyc might be an interesting startup, though I certainly wouldn’t invest in it. As an Artificial General Intelligence Cyc is just plain awful. It’s not just that most of it is composed of suggestively named LISP tokens, there are also the other hundred aspects of cognition that are simply entirely missing. Like, say, probabilistic reasoning, or decision theory, or sensing or acting or –

– for the love of Belldandy! How can you even call this sad little thing an AGI project?

So long as they maintained their current architecture, I would have no fear of Cyc even if there were a million programmers working on it and they had access to a computer the size of a moon, any more than I would live in fear of a dictionary program containing lots of words.

Cyc is so unreservedly hopeless, especially by comparison to EURISKO that came before it, that it makes me seriously wonder if Lenat is doing something that I’m not supposed to postulate because it can always be more simply explained by foolishness rather than conspiracy.

Of course there are even sillier projects. Hugo de Garis and Mentifex both come to mind.”

Yeah, Hugo de Garis is rather ridiculous, but even I think comparing him to Mentifex is a little harsh 🙂

Another great comment by Marcello Herreshoff:

“The human genome fits on a single CD-ROM. Yet a human baby can learn fast. If you do not attribute this feat to the baby’s brain having a good architecture, then what on earth *do* you attribute it to?

A baby doesn’t know that, say, Paris is the capital of France or that Bill Clinton is a president. Therefore, Cyc theoretically shouldn’t need that information.”

Haha, Yudkowsky again:

“Okay… look at this way. Chimpanzees share 95% of our DNA and have much of the same gross cytoarchitecture of their brains. You cannot explain to chimpanzees that Paris is the capital of France. You can train them to hold up a series of signs saying “Paris”, then “Is-Capital-Of”, then “France”. But you cannot explain to them that Paris is the capital of France.

And a chimpanzee’s cognitive architecture is hugely more sophisticated than Cyc’s. Cyc isn’t close. It’s not in the ballpark. It’s not in the galaxy holding the star around which circles the planet whose continent contains the country in which lies the city that built the ballpark.”

30. Recursive Self-Improvement (Yudkowsky)

This is probably one of the most important posts of the whole AI Foom Debate. It’s very long and quite persuasive.

First of all, what are we actually talking about when we’re discussing the possibility of AI going FOOM?

Just to be clear on the claim, “fast” means on a timescale of weeks or hours rather than years or decades; and “FOOM” means way the hell smarter than anything else around, capable of delivering in short time periods technological advancements that would take humans decades, probably including full-scale molecular nanotechnology (that it gets by e.g. ordering custom proteins over the Internet with 72-hour turnaround time).  Not, “ooh, it’s a little Einstein but it doesn’t have any robot hands, how cute”.

Yudkowsky then proposes a classification-scheme for analyzing the developmental velocity of an AI that becomes smarter and smarter.

…We’ll break down this velocity into optimization slope, optimization resources, and optimization efficiency.  We’ll need to understand cascades, cycles, insight and recursion; and we’ll stratify our recursive levels into the metacognitive, cognitive, metaknowledge, knowledge, and object level.

Quick review:

  • “Optimization slope” is the goodness and number of opportunities in the volume of solution space you’re currently exploring, on whatever your problem is;
  • “Optimization resources” is how much computing power, sensory bandwidth, trials, etc. you have available to explore opportunities;
  • “Optimization efficiency” is how well you use your resources.  This will be determined by the goodness of your current mind design – the point in mind design space that is your current self – along with its knowledge and metaknowledge (see below).

Optimizing yourself is a special case, but it’s one we’re about to spend a lot of time talking about.

When a mind solves a problem, there are several causal levels involved (though where exactly the boundaries are drawn is of course somewhat arbitrary):

  • “Metacognitive” is the optimization that builds the brain – in the case of a human, natural selection; in the case of an AI, either human programmers or, after some point, the AI itself.
  • “Cognitive”, in humans, is the labor performed by your neural circuitry, algorithms that consume large amounts of computing power but are mostly opaque to you.  You know what you’re seeing, but you don’t know how the visual cortex works.  The Root of All Failure in AI is to underestimate those algorithms because you can’t see them…  In an AI, the lines between procedural and declarative knowledge are theoretically blurred, but in practice it’s often possible to distinguish cognitive algorithms and cognitive content.
  • “Metaknowledge”:  Discoveries about how to discover, “Science” being an archetypal example, “Math” being another.  You can think of these as reflective cognitive content (knowledge about how to think).
  • “Knowledge”:  Knowing how gravity works.
  • “Object level”:  Specific actual problems like building a bridge or something.

There are several classes of phenomena that could lead to a hard takeoff:

  • Roughness:  A search space can be naturally rough – have unevenly distributed slope. With constant optimization pressure, you could go through a long phase where improvements are easy, then hit a new volume of the search space where improvements are tough.  Or vice versa.  Call this factor roughness.
  • Resource overhangs:  Rather than resources growing incrementally by reinvestment, there’s a big bucket o’ resources behind a locked door, and once you unlock the door you can walk in and take them all.

A good example for a resource overhang would be if the AI suddenly gained access to the internet.

Here are other factors that could lead to a hard takeoff, all of them discussed in previous posts, but I include them because they are nicely summarized:

  • Cascades are when one development leads the way to another – for example, once you discover gravity, you might find it easier to understand a coiled spring.
  • Cycles are feedback loops where a process’s output becomes its input on the next round.  As the classic example of a fission chain reaction illustrates, a cycle whose underlying processes are continuous, may show qualitative changes of surface behavior – a threshold of criticality – the difference between each neutron leading to the emission of 0.9994 additional neutrons versus each neutron leading to the emission of 1.0006 additional neutrons.  k is the effective neutron multiplication factor and I will use it metaphorically.
  • Insights are items of knowledge that tremendously decrease the cost of solving a wide range of problems – for example, once you have the calculus insight, a whole range of physics problems become a whole lot easier to solve.  Insights let you fly through, or teleport through, the solution space, rather than searching it by hand – that is, “insight” represents knowledge about the structure of the search space itself.
  • Recursion is the sort of thing that happens when you hand the AI the object-level problem of “redesign your own cognitive algorithms”.
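
The criticality threshold in the “cycles” item above is easy to check numerically. A minimal sketch: the multiplication factors 0.9994 and 1.0006 come from the quote, while the 10,000-generation horizon is an arbitrary choice of mine.

```python
# Toy illustration of the criticality threshold in a feedback cycle:
# each generation, the population is multiplied by an effective factor k.
# k slightly below 1 dies out; k slightly above 1 grows without bound.

def run_cycle(k, generations):
    """Multiply a unit population by k once per generation."""
    population = 1.0
    for _ in range(generations):
        population *= k
    return population

subcritical = run_cycle(0.9994, 10_000)    # each event triggers < 1 successor
supercritical = run_cycle(1.0006, 10_000)  # each event triggers > 1 successor

print(f"k=0.9994 after 10,000 generations: {subcritical:.4f}")
print(f"k=1.0006 after 10,000 generations: {supercritical:.1f}")
```

A difference of 0.0012 in k, invisible on any single generation, separates near-extinction from runaway growth, which is exactly the point of the fission metaphor.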

So, let us elaborate on the last factor – recursion:

Imagine you go to an AI programmer and ask him to write a program that plays chess (the object level). The programmer will use his own knowledge of computer science (the knowledge level), the knowledge of science, rationality, etc. (the metaknowledge level) and his neural algorithms which are largely invisible to him (the cognitive level). Those algorithms were of course designed by natural selection (the metacognitive level).

If you go to a sufficiently sophisticated AI – nothing that currently exists – and ask it to perform the same task, roughly the same thing would happen. The only difference is that the metacognitive level would be the human programmer rather than natural selection.

But now imagine that you hand the AI the problem,

…,”Write a better algorithm than X for storing, associating to, and retrieving memories”.  At first glance this may appear to be just another object-level problem that the AI solves using its current knowledge, metaknowledge, and cognitive algorithms.  And indeed, in one sense it should be just another object-level problem.  But it so happens that the AI itself uses algorithm X to store associative memories, so if the AI can improve on this algorithm, it can rewrite its code to use the new algorithm X+1.

This means that the AI’s metacognitive level – the optimization process responsible for structuring the AI’s cognitive algorithms in the first place – has now collapsed to identity with the AI’s object level.

Now this is something huge. Imagine what would happen if reading scientific books made you literally more intelligent, i.e. enhanced your neural algorithms on the cognitive level, not merely increased your knowledge.
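
To make that collapse vivid, here is a deliberately toy sketch of my own (not a model from the post). “Power” stands in for optimization efficiency; the only difference between the two agents is whether each step’s optimization pressure is also turned back on the optimizer itself.

```python
# Two toy optimizers. An ordinary agent applies a fixed amount of
# optimization power to its object-level solution each step. A recursive
# agent additionally spends each step improving "power" itself -- the
# object level feeding back into the metacognitive level.

def object_level_only(power, steps):
    """Solution quality when power stays fixed: plain exponential in steps."""
    solution = 1.0
    for _ in range(steps):
        solution *= power
    return solution

def recursive(power, steps, gain=0.1):
    """Each step also rewrites the optimizer, so power itself grows."""
    solution = 1.0
    for _ in range(steps):
        solution *= power
        power *= 1 + gain  # self-rewrite: the exponent itself now grows
    return solution

print(object_level_only(1.1, 50))  # exponential growth
print(recursive(1.1, 50))          # superexponential growth
```

The coefficients (1.1 starting power, 10% self-improvement gain, 50 steps) are arbitrary; the qualitative gap between the two curves is the point.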

…Inventing science is not rewriting your neural circuitry.  There is a tendency to completely overlook the power of brain algorithms, because they are invisible to introspection.  It took a long time historically for people to realize that there was such a thing as a cognitive algorithm that could underlie thinking.

…All you can see is the knowledge and the metaknowledge, and that’s where all your causal links go; that’s all that’s visibly important.

To see the brain circuitry vary, you’ve got to look at a chimpanzee, basically.  Which is not something that most humans spend a lot of time doing, because chimpanzees can’t play our games.

But what would happen if we could rewrite our own cognitive algorithms? Let’s look at history – more specifically, at evolution, the only optimization process that ever improved cognitive algorithms from the level of chimpanzees to that of humans.

Evolution, the blind idiot god, achieved great things, although there was no insight involved and efficiency and resources stayed roughly the same.

…Why is this important?  Because it shows that with constant optimization pressure from natural selection and no intelligent insight, there were no diminishing returns to a search for better brain designs up to at least the human level.  There were probably accelerating returns (with a low acceleration factor).  There are no visible speedbumps, so far as I know.

And remember we’re speaking about natural selection, a blind and dumb process that isn’t recursive at all.

The conclusion:

…With history to date, we’ve got a series of integrals looking something like this:

Metacognitive = natural selection, optimization efficiency/resources roughly constant

Cognitive = Human intelligence = integral of evolutionary optimization velocity over a few hundred million years, then roughly constant over the last ten thousand years

Metaknowledge = Professional Specialization, Science, etc. = integral over cognition we did about procedures to follow in thinking, where metaknowledge can also feed on itself, there were major insights and cascades, etc.

Knowledge = all that actual science, engineering, and general knowledge accumulation we did = integral of cognition+metaknowledge(current knowledge) over time, where knowledge feeds upon itself in what seems to be a roughly exponential process

Object level = stuff we actually went out and did = integral of cognition+metaknowledge+knowledge(current solutions); over a short timescale this tends to be smoothly exponential to the degree that the people involved understand the idea of investments competing on the basis of interest rate, but over medium-range timescales the exponent varies, and on a long range the exponent seems to increase

If you were to summarize that in one breath, it would be, “with constant natural selection pushing on brains, progress was linear or mildly accelerating; with constant brains pushing on metaknowledge and knowledge and object-level progress feeding back to metaknowledge and optimization resources, progress was exponential or mildly superexponential”.

Now fold back the object level so that it becomes the metacognitive level.

And note that we’re doing this through a chain of differential equations, not just one; it’s the final output at the object level, after all those integrals, that becomes the velocity of metacognition.

-> Intelligence explosion.
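
The “fold back the object level” step can be made concrete with a crude discrete-time sketch. Everything here is a modeling assumption of mine: arbitrary units, made-up coefficients, and metaknowledge collapsed into knowledge for brevity. The question asked is how many steps each regime needs before the object level first reaches a given target.

```python
# Discrete-time sketch of the chain of integrals. In the historical regime,
# the metacognitive velocity (natural selection) stays constant. In the
# folded-back regime, once the object-level output exceeds that baseline,
# it takes over as the velocity of metacognition.

def steps_to_reach(target, folded_back=False, dt=0.01, max_steps=100_000):
    """Count time steps until the object level first reaches `target`."""
    metacognitive_velocity = 1.0   # natural selection: roughly constant
    cognition = 0.0                # integral of the metacognitive velocity
    knowledge = 0.01               # feeds on itself, scaled by cognition
    for step in range(1, max_steps + 1):
        cognition += metacognitive_velocity * dt
        knowledge += cognition * knowledge * dt   # roughly exponential
        object_level = cognition * knowledge
        if folded_back and object_level > metacognitive_velocity:
            # the final output at the object level becomes
            # the velocity of metacognition
            metacognitive_velocity = object_level
        if object_level >= target:
            return step
    return max_steps

for target in (1e6, 1e9, 1e12):
    plain = steps_to_reach(target)
    folded = steps_to_reach(target, folded_back=True)
    print(f"target {target:g}: constant metacognition {plain} steps, "
          f"folded-back loop {folded} steps")
```

Under constant metacognitive pressure each thousandfold gain costs a comparable number of additional steps; with the loop folded back, the later targets arrive almost immediately after the earlier ones – a finite-time blow-up rather than a steady exponential.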

31. Whither Manufacturing? (Hanson)

We shouldn’t be too sure about the feasibility of full-scale nanotech. Fully automated, highly local manufacturing is also far from certain, which would make amassing resources and producing hardware – and thus achieving world domination – more difficult, though it might still be rather trivial for a superintelligent AI.

