Read "Why AM and Eurisko Appear to Work". Write a few-paragraph essay on what "theorem discovery" in AI might have to do with data analysis and prediction (with or without machine learning).
From: AAAI-83 Proceedings. Copyright ©1983, AAAI (www.aaai.org). All rights reserved.
Why AM and Eurisko Appear to Work
John Seely Brown
Cognitive and Instructional
Palo Alto, Ca.
Douglas B. Lenat
Heuristic Programming Project
Of central importance is the RECUR-ALG slot,
which contains a recursive algorithm for computing
LIST-EQUAL of two input lists x and y. That algorithm
recurs along both the CAR and CDR directions of the
list structure, until it finds the leaves (the atoms), at
which point it checks that each leaf in x is identically
equal to the corresponding node in y. If any recursive
call on LIST-EQUAL signals NIL, the entire result is
NIL; otherwise the result is T. During one AM task, it
sought for examples of LIST-EQUAL in action, and a
heuristic accommodated by picking
random pairs of
examples of LIST, plugging them in for x and y, and
running the algorithm. Needless to say, very few of those
executions returned T (about 2%, as there were about 50
examples of LIST at the time). Another heuristic noted
that this was extremely low (though nonzero), so it might
be worth defining new predicates by slightly generalizing
LIST-EQUAL; that is, copy its algorithm and weaken it
so that it returns T more often. When that task was
chosen from the agenda, another heuristic said that one
way to generalize a definition with two conjoined
recursive calls was simply to eliminate one of them
entirely, or to replace the AND by an OR. In one run
(in June, 1976) AM then defined these three new predicates:
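The recursive algorithm and the 2% figure can be illustrated with a minimal sketch. It is written in Python rather than Lisp, with nested tuples standing in for Lisp lists and the empty tuple standing in for NIL. The helper names and the 50 example lists are hypothetical, and the sketch assumes that the examples are distinct and that pairs are drawn with replacement, which gives a hit rate of 1/50.

```python
from itertools import product

NIL = ()  # assumption: the empty tuple stands in for Lisp's NIL


def atom(x):
    """True for anything that is not a non-empty list (NIL counts as an atom)."""
    return not (isinstance(x, tuple) and len(x) > 0)


def car(x):
    return x[0]


def cdr(x):
    return x[1:]


def list_equal(x, y):
    """Recur along both CAR and CDR; compare leaves once an atom is reached."""
    if atom(x) or atom(y):
        return x == y
    return list_equal(car(x), car(y)) and list_equal(cdr(x), cdr(y))


# Hypothetical stand-in for AM's ~50 examples of LIST: 50 distinct nested lists.
examples = [("A", (str(i),), "B") for i in range(50)]

# Fraction of ordered pairs (drawn with replacement) on which LIST-EQUAL returns T.
pairs = list(product(examples, repeat=2))
hit_rate = sum(list_equal(x, y) for x, y in pairs) / len(pairs)
print(hit_rate)
```

Only the 50 pairs that match a list with itself succeed out of 2500 ordered pairs, which is the kind of low but nonzero rate that prompted the generalization heuristic.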
Seven years ago, the AM program was constructed as
an experiment in learning by discovery. Its source of
power was a large body of heuristics, rules which guided
fruitful topics of investigation,
to perform, toward plausible
hypotheses and definitions.
Other heuristics evaluated
those discoveries for utility and "interestingness", and
they were added to AM's vocabulary of concepts. AM's
ultimate limitation apparently was due to its inability to
discover new, powerful, domain-specific heuristics for the
various new fields it uncovered. At that time, it seemed
straightforward to simply add Heuretics (the study of
heuristics) as one more field in which to let AM explore,
observe, define, and develop. That task — learning new
heuristics by discovery — turned out to be much more
difficult than was realized initially, and we have just now
achieved some successes at it. Along the way, it became
clearer why AM had succeeded in the first place, and
why it was so difficult to use the same paradigm to
discover new heuristics.
This paper discusses those
recent insights. They spawn questions about "where the
meaning really resides" in the concepts discovered by
AM. This leads to an appreciation of the crucial and
unique role of representation in theory formation, a role
involving the relationship between Form and Content.
What AM Really Did
In essence, AM was an automatic programming
system, whose primitive actions produced modifications
to pieces of Lisp code, predicates which represented the
characteristic functions of various math concepts.
For instance, AM had a frame that represented the concept
LIST-EQUAL, a predicate that checked any two Lisp list
structures to see whether or not they were equal (printed
out the same way). That frame had several slots:
The first of these, L-E-1, has had the recursion in the
CAR direction removed. All it checks for now is that,
when elements are stripped off each list, the two lists
become null at exactly the same time. That is, L-E-1 is
now the predicate we might call Same-Length.
(EQUAL x y))
((OR (ATOM x) (ATOM y))
(EQ x y))
The second of these, L-E-2, has had the CDR
recursion removed. When run on two lists of atoms, it
checks that the first elements of each list are equal.
When run on arbitrary lists, it checks that they have the
same number of leading left parentheses, and then that
the atom that then appears in each is the same.
(CAR x) (CAR y))
(CDR x) (CDR y))))))
The third of these is more difficult to characterize in
words. It is of course more general than both L-E-1 and
L-E-2; if x and y are equal in length then L-E-3 would
Algorithms slot of the frame labelled "C". This would
typically take about 4-8 lines to write down, of which
only 1-3 lines were the "meat" of the function. Syntactic
mutation of such tiny Lisp programs led to meaningful,
related Lisp programs, which in turn were often the
characteristic function for some meaningful, related math
concept. But taking a two-page program (as many of the
AM heuristics were coded) and making a small syntactic
mutation is doomed to almost always giving garbage as
the result. It's akin to causing a point mutation in an
organism's DNA (by bombarding it with radiation, say):
in the case of a very simple microorganism, there is a
reasonable chance of producing a viable, altered mutant.
In the case of a higher animal, however, such point
mutations are almost universally deleterious.
return T, as it would if they had the same first element,
etc. This disjunction propagates to all levels of the list
structure, so that L-E-3 would return true for
x = (A (B C D) E F) and y = (Q (B)) or even y = (Q
(W X Y)). Perhaps this predicate is most concisely
described by its Lisp definition.
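The three derived predicates can be sketched as follows. This is a Python transliteration under stated assumptions, not the paper's Lisp: nested tuples stand in for Lisp lists, the empty tuple stands in for NIL, and the function names l_e_1, l_e_2, and l_e_3 are illustrative. Each function differs from LIST-EQUAL only in its recursive step, following the descriptions in the text.

```python
def atom(x):
    """True for anything that is not a non-empty list (the empty tuple plays NIL)."""
    return not (isinstance(x, tuple) and len(x) > 0)


def l_e_1(x, y):
    """CAR recursion removed: only the CDR chain is followed (Same-Length)."""
    if atom(x) or atom(y):
        return x == y
    return l_e_1(x[1:], y[1:])


def l_e_2(x, y):
    """CDR recursion removed: only the CAR chain is followed (same leading element)."""
    if atom(x) or atom(y):
        return x == y
    return l_e_2(x[0], y[0])


def l_e_3(x, y):
    """AND replaced by OR: a match along either branch is enough."""
    if atom(x) or atom(y):
        return x == y
    return l_e_3(x[0], y[0]) or l_e_3(x[1:], y[1:])


x = ("A", ("B", "C", "D"), "E", "F")
print(l_e_1(x, ("P", "Q", "R", "S")))      # two lists of the same length
print(l_e_2(x, ("A", "Z")))                # two lists with the same leading element
print(l_e_3(x, ("Q", ("B",))))             # x = (A (B C D) E F), y = (Q (B))
print(l_e_3(x, ("Q", ("W", "X", "Y"))))    # x = (A (B C D) E F), y = (Q (W X Y))
```

All four calls return True under this encoding. The last two succeed for different reasons: the first because a leading B is found one level down, the second because the sublists (B C D) and (W X Y) run out at the same time.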
A few points are important to make from this
example. First, note that AM does not make changes at
random, it is driven by empirical findings (such as the
rarity of LIST-EQUAL returning T) to suggest specific
directions in which to change particular concepts (such as
deciding to generalize LIST-EQUAL).
Having reached this eminently reasonable goal, it then
reverts to a more or less syntactic mutation process to
(changing AND to OR, eliminating a
conjunct from an AND, etc.) See [Green et al., 74] for
on this style of code synthesis and
representations fine-grained enough to capture all the
nuances of the concepts they stand for (at least, all the
properties we can think of), but we rarely worry about
making those representations
too flexible, too fine-grained.
But that is a real problem: such a "too-fine-grained" representation creates syntactic distinctions that
don't reflect semantic distinctions, that is, distinctions that are
meaningful in the domain.
For instance, in coding a
piece of knowledge for MYCIN, in which an iteration
was to be performed, it was once necessary to use several
rules to achieve the desired effect. The physicians (both
the experts and the end-users) could not make head or
tail of such rules individually, since the doctors didn't
break their knowledge down below the level at which
iteration was a primitive.
As another example, in
representing a VLSI design heuristic H as a two-page
Lisp program, enormous structure and detail were
added — details that are meaningless as far as capturing
its meaning as a piece of VLSI knowledge (e.g., lots of
named local variables being bound and updated; many
which were conceptually
primitive part of H were coded as several lines of Lisp
which contained dozens of distinguishable (and hence
mutable) function calls; etc.)
Those details were
meaningful (and necessary) to H's implementation on a
Of course, we can never directly
mutate the meaning of a concept, we can only mutate the
structural form of that concept as embedded in some
scheme. Thus, there is never any
detail? that is a consequence of the
rather than some genuine part of the
Second, note that all three derived predicates are at
least a priori plausible and interesting and valuable.
They are not trivial (such as always returning T, or
always returning what LIST-EQUAL returns), and even
the strangest of them (L-E-3) is genuinely worth
exploring for a minute.
Third, note that one of the three (L-E-2) is familiar
and useful (same leading element), and another one (L-E-1) is familiar and of the utmost significance (same
length). AM quickly derived from L-E-1 a function we
would call LENGTH and a set of canonical lists of each
possible length ( ( ), (T), (T T), (T T T), (T T T T), etc.;
i.e., a set isomorphic to the natural numbers). By
restricting list operations (such as APPEND) to these
canonical lists, AM derived the common arithmetic
functions (in this case, addition), and soon began
exploring elementary number theory. So these simple
mutations sometimes led to dramatic discoveries.
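The step from canonical lists to arithmetic can be made concrete with a small sketch. It assumes Python tuples as a stand-in for Lisp lists, and the names canonical, length, and append are illustrative rather than AM's own definitions. Restricted to canonical lists, APPEND behaves like addition on their lengths.

```python
def canonical(n):
    """Canonical list of length n: (), (T), (T T), (T T T), ..."""
    return ("T",) * n


def length(x):
    """Count elements by stripping them off until the list is empty."""
    count = 0
    while x != ():
        x = x[1:]
        count += 1
    return count


def append(x, y):
    """APPEND for tuple-encoded lists."""
    return x + y


two, three = canonical(2), canonical(3)
total = append(two, three)
print(total, length(total))
```

Appending the canonical lists for 2 and 3 gives the canonical list for 5, so the list operation already contains the arithmetic function once its inputs are restricted.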
Why was that?
Originally, we attributed it to the power of heuristic search (in defining
specific goals such as "generalize LIST-EQUAL") and to
the density of worthwhile math concepts. Recently, we
have come to see that it is, in part, the density of
worthwhile math concepts as represented in Lisp that is
the crucial factor.
The Significance of AM's Representation
But there are even more serious representation
issues. In terms of the syntax of a given language, it is
straightforward to define a collection of mutators that
produce minimal generalizations of a given Lisp function
to its implementation
structure (e.g., removing a conjunct, replacing AND by
OR, finding a NOT and specializing its argument, etc.)
Structural generalizations produced in this way can be
guaranteed to generalize the extension of function, and
that necessarily produces a generalization of its intension,
its meaning. Therein lies the lure of the AM and Eurisko
paradigm. We now understand that that lure conceals a
dangerous barb: minimal generalizations defined over a
function's structural encoding need not bear much
to minimal intensional
especially if these functions are computational objects as
opposed to mathematical entities.
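Two of the mutators named above (removing a conjunct, replacing AND by OR) can be sketched on a toy boolean expression language. This is an illustration under stated assumptions, not AM's mutator set: expressions are nested Python tuples, the names evaluate, mutate, and extension are hypothetical, and mutation is applied only at positions not under a NOT, so that every mutant's extension contains the original's. The NOT-specializing mutator is left out.

```python
from itertools import product


def evaluate(expr, env):
    """Evaluate a tiny boolean expression tree against a variable assignment."""
    if isinstance(expr, str):          # a leaf is a variable name
        return env[expr]
    op, *args = expr
    if op == "and":
        return all(evaluate(a, env) for a in args)
    if op == "or":
        return any(evaluate(a, env) for a in args)
    if op == "not":
        return not evaluate(args[0], env)
    raise ValueError(f"unknown operator: {op}")


def mutate(expr):
    """Yield syntactic generalizations of expr (positive positions only)."""
    if isinstance(expr, str):
        return
    op, *args = expr
    if op == "and":
        yield ("or", *args)                      # replace AND by OR
        if len(args) > 1:
            for i in range(len(args)):           # remove one conjunct
                rest = args[:i] + args[i + 1:]
                yield rest[0] if len(rest) == 1 else ("and", *rest)
    if op in ("and", "or"):                      # do not descend under NOT
        for i, a in enumerate(args):
            for m in mutate(a):
                yield (op, *args[:i], m, *args[i + 1:])


def extension(expr, variables):
    """Set of assignments (as tuples of booleans) on which expr is true."""
    return {vals for vals in product([False, True], repeat=len(variables))
            if evaluate(expr, dict(zip(variables, vals)))}


variables = ("p", "q", "r")
original = ("and", "p", ("and", "q", "r"))
base = extension(original, variables)
for mutant in mutate(original):
    print(mutant, base <= extension(mutant, variables))
```

Every mutant printed here is true wherever the original is true, which is the extensional guarantee the text describes. Nothing in the sketch says whether a given mutant is a meaningful generalization, which is the barb the paragraph warns about.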
It was only because of the intimate relationship
between Lisp and Mathematics that the mutation
composition, argument elimination, function substitution,
etc.) turned out to yield a high "hit rate" of viable, useful
new math concepts when applied to previously-known,
useful math concepts (concepts represented as Lisp
functions). But no such deep relationship existed between
Lisp and Heuretics, and when the basic automatic
(mutations) operators were applied to
viable, useful heuristics, they almost always produced
useless (often worse than useless) new heuristic rules.
To rephrase that: a math concept C was represented
in AM by its characteristic function, which in turn was
represented as a piece of Lisp code stored on the
primary retrospective lesson we have gleaned from our
study of AM. We have applied it to getting Eurisko to
discover heuristics, and are beginning to get Eurisko to
discover such new languages, to automatically modify its
vocabulary of slots. To date, there are three cases in
which Eurisko has successfully and fruitfully split a slot
into more specialized subslots. One of those cases was in
the domain of designing three dimensional VLSI circuits,
where the Terminals slot was automatically split into
Since 1976, one of us has attempted to get
EURISKO (the descendant of AM; see [Lenat 82,83a,b])
to learn new heuristics the same way it learns new math
concepts. For five years, that effort achieved mediocre
results. Gradually, the way we represented heuristics
changed, from two opaque lumps of Lisp code (a one-page long IF part and a one-page long THEN part) into
a new language in which the statement of heuristics is
more natural: it appears more spread out (dozens of slots
replacing the IF and THEN), but the length of the values
in each IF and THEN is quite small, and the total size of
all those values put together is still much smaller (often
an order of magnitude) than the original two-page lumps
The central argument here is the following:
(1) "Theories" deal with the meaning, the content of a
body of concepts, whereas "theory formation" is of
necessity limited to working on form, on the structures
that represent those concepts in some scheme.
(2) This makes the mapping between form and content
quite important to the success of a theory formation
effort (be it by humans or machines).
(3) Thus it's important to find a representation in which
the form<-->content mapping is as natural (i.e., efficient)
as possible, a representation that mimics (analogically)
the conceptual underpinnings
of the task domain being
This is akin to Brian Smith's
of the desire to achieve
alignment between the syntax and semantics of a
(4) Exploring "theory formation" therefore frames, and
forces us to study, the mapping between form and content.
(5) This is especially true for those of us in AI who wish
to build theory formation
programs, because that
mapping is vital to the ultimate successful performance
of our programs.
It is not merely the shortening of the code that is
important here, but rather the fact that this new
vocabulary of slots provides a functional decomposition of
the original two-page program. A single mutation in the
now "macro expands" into many
coordinated small mutations at the Lisp code level;
conversely, most meaningless small changes at the Lisp
level can't even be expressed in terms of changes to the
higher-order language. This is akin to the way biological
evolution makes use of the gene as a meaningful
functional unit, and gets great mileage from rearranging
A heuristic in EURISKO is now — like a math
concept always was in AM — a collection of about twenty
or more slots, each filled with a line or two worth of code
(or often just an atom or two). By employing this new
language, the old property that AM satisfied fortuitously
is once again satisfied: the primitive syntactic mutation
operators usually now produce meaningful
variants of what they operate on. Partly by design and
partly by evolution, a language has been constructed in
which heuristics are represented naturally, just as Church
and McCarthy made the lambda calculus and Lisp a
language in which math characteristic functions could be
represented naturally. Just as the Lisp<-->Math "match"
helped AM to work, to discover math concepts, the new
"match" helps Eurisko to discover heuristics.
Where does the meaning reside?
We speak of our programs knowing something, e.g.
AM's knowing about the List-Equal concept. But in what
sense does AM know it? Although this question may
seem a bit adolescent, we believe that in the realm of
theory formation (and learning systems), answers to this
question are crucial, for otherwise what does it mean to
say that the system has "discovered" a new concept? In
fact, many of the controversies over AM stem from
confusions about this one issue (admittedly, confusions
in our own understanding of this issue as well as others').
In getting Eurisko to work in domains other than
mathematics, we have also been forced to develop a rich
set of slots for each domain (so that any one value for a
slot of a concept will be small) and provide a frame that
contains information about that slot (so it can be used
meaningfully by the program).
This combination of
small size, meaningful functional decomposition, plus
explicitly stored information about each type of slot,
enables the AM-Eurisko scheme to function adequately.
It has already done so for domains such as the design of
three dimensional VLSI chips, the design of fleets for a
futuristic naval wargame, and for Interlisp programming.
In AM and Eurisko, a concept C is simultaneously represented in two
fundamentally different ways. The first way is via its
characteristic function (as stored on the Algorithms and
Domain/Range slots of the frame for C). This provides a
meaning relative to the way it is interpreted, but since
there is a single unchanging EVAL, this provides a
unique interpretation of C. The second way a concept is
specified is more declaratively, via slots that contain
constraints on the meaning: Generalizations,
For instance, if we specify that D is a
of C (i.e., D is an entry on C's
by the semantics of
"Generalizations" all entries on C's Examples slot ought
to cause D's Algorithm to return T.
squeeze the set of possible meanings of C but rarely to a
single point. That is. multiple interpretations based just
on these underdetermined
constraints are still possible.
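The Generalizations constraint described above can be checked mechanically, and doing so also shows why it underdetermines meaning. The sketch below is a hedged illustration: the dictionary layout, the violations helper, and the ALWAYS-T frame are assumptions made for the example, and only the slot names Algorithm, Examples, and Generalizations come from the text.

```python
def list_equal(x, y):
    return x == y


def same_length(x, y):
    return len(x) == len(y)


def always_t(x, y):
    return True


# Hypothetical frames; each Examples entry is a pair of arguments for the Algorithm.
frames = {
    "LIST-EQUAL": {
        "Algorithm": list_equal,
        "Examples": [(("A", "B"), ("A", "B")), ((), ())],
        "Generalizations": ["SAME-LENGTH", "ALWAYS-T"],
    },
    "SAME-LENGTH": {
        "Algorithm": same_length,
        "Examples": [(("A",), ("B",))],
        "Generalizations": [],
    },
    "ALWAYS-T": {
        "Algorithm": always_t,
        "Examples": [],
        "Generalizations": [],
    },
}


def violations(frames, name):
    """Return (generalization, example) pairs that break the constraint:
    every example of C should make each generalization D return true."""
    concept = frames[name]
    bad = []
    for d_name in concept["Generalizations"]:
        d_algorithm = frames[d_name]["Algorithm"]
        for example in concept["Examples"]:
            if not d_algorithm(*example):
                bad.append((d_name, example))
    return bad


print(violations(frames, "LIST-EQUAL"))
```

Both SAME-LENGTH and the trivial ALWAYS-T pass the check, so the constraint narrows the candidates for a generalization of LIST-EQUAL without selecting a single one.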
Notice that each scheme has its own unique advantage.
The characteristic function provides a complete and
We believe that such a natural representation should
be sought by anyone building an expert system for
domain X; if what is being built is intended to form new
theories about X, then it is a necessity, not a luxury. That
is, it is necessary to find a way of representing X's
concepts as a structure whose pieces are each relatively
small and unstructured.
In many cases, an existing
representation will suffice, but if the "leaves" are large,
simple methods will not suffice to transform and
combine them into new, meaningful "leaves". This is the
AM (and any AI program) is merely a model, and by
watching it we place a particular interpretation on that
model, though many alternatives may exist.
representation of a concept by a Lisp encoding of its
characteristic function may very well admit only one
interpretation (given a fixed EVAL, a fixed set of data
structures for arguments,
But most human
observers looked not at that function but rather at the
underconstrained declarative information stored on slots
Generalizations, ISA, Examples, and so on. We find it
provocative that the most useful heuristics in Eurisko, the ones which provide the best control guidance, have
triggering conditions which are also based only on these
same underconstraining slots.
that can both be executed
efficiently and operated on. The descriptive information
instead provides the grist to guide
control of the mutators,
as well as jogging the
imagination of human users of the program by forcing
them to do the disambiguation themselves! Both of these
uses capitalize on the ambiguities. We will return to this
point in a moment but first let us consider how meaning
resides in the characteristic function of a concept.
It is beyond the scope of this paper to detail how
meaning per se resides in a procedural encoding of a
characteristic function. But two comments are in order.
First, it is obvious that the meaning of a characteristic
function is always relative to the interpreter (theory) for
the given language in which the function is. In this case,
the interpreter can be succinctly specified by the EVAL of