Re: Classical Categorisation

From: Harnad, Stevan (harnad@cogsci.soton.ac.uk)
Date: Tue Jan 23 1996 - 09:26:22 GMT


> From: "Smith, Wendy" <WS93PY@psy.soton.ac.uk>
> Date: Tue, 23 Jan 1996 09:28:22 GMT
> Subject: Classical Categorisation
>
> Miller (1956) demonstrated that by "chunking" information, we could
> store more information than by not chunking it.

What is chunking? You have to explain (to your kid brother) what
both abstraction and recoding are, and what their advantages are
(indeed, why they are necessary if we are to make any generalisations
about KINDS of experiences at all). Note that abstraction happens at the
level of sensory features (selecting the relevant ones, ignoring the
irrelevant ones) and recoding occurs at the symbolic level. Both create
"chunks," i.e., categories with names. The rest has to do with the
advantages of names and language (theft vs. honest toil).

> Those individuals with
> memories which retained everything (eg S described by Luria) at first
> seem gifted with their abilities of recall. However, on closer
> examination, they are actually handicapped. They are less able to
> understand the material they are presented, because they cannot extract
> the relevant items. Each situation is therefore unique, because they
> cannot generalise, based on these items. Categorisation would appear to
> involve the ability to generalise, select invariants, and reduce
> information. The question of interest to psychologists is how do we do
> this?

"Reduce" information is not self-explanatory: Explain how a lot of
sensory variation is irrelevant and confusing, and this needs to be
filtered out, to focus on the regularities on whose basis you can do
the right thing (whether it be eating, fleeing, or just naming).

> The classical view of categorisation can be considered as the defining
> attribute view, based mainly on the work of Frege. Briefly, it suggests
> that a category can be described by a set of defining attributes. The
> attributes are singly necessary and jointly sufficient to allow an item
> to be identified as a member of a category. This means that the
> boundaries are clearly defined: what is and is not a category member is
> clear cut. It also means that each category member is equally
> representative of the category.

That is a bit of a red herring, since we don't know what
"representative" means (or whether it matters: it of course turns into
the later obsession with "typicality"): What is true is that every
member of the category is indeed a member; membership is all-or-none;
there are no 65% members.

Note that this is true of most of the categories studied ("bird" and
"game" being classical examples). There ARE graded categories, with
membership a matter of degree, but they are usually relative
categories, such as "big" (typically adjectives rather than nouns).
The two should not be confused or conflated, but they have been, along
with "fuzzy boundaries," which usually refers to cases about which
people are unsure ("is a whale a fish?" no; "is gravity a wave?"
physics will eventually let us know), or about which there is no fact
of the matter as to whether or not they are members ("is skywriting a
game?" "is it better to be a Scorpio than an Aquarius?").

> It was also considered that the
> categories could be organised hierarchically, in which the more
> specific instances include all the attributes of the more abstract
> levels, and include extra attributes which define it more narrowly. For
> example, let us say that a "bird" has the defining attributes "beak,
> feathers, wings". Each of these attributes is necessary for
> inclusion. If an item does not have a beak, it is not a bird, even if
> it has feathers and wings. The attributes are jointly sufficient. In
> this example, if an item has a beak, feathers and wings, it is a bird -
> even if it has four legs and barks. Also, this member is just as
> representative as a sparrow, or a penguin. A penguin has all the
> defining attributes of a bird, but in addition it has the attributes of
> "black and white, swims"

Ah me, all this nonsense (not your fault) that has issued from the
Roschian legacy! How to sort this out: First, of course, there is a
"hierarchy" in categorisation, because as we move from the concrete to
the more abstract and general, the way we do it is by ABSTRACTING some
features and dropping others. But, as you know from the Ugly
Duckling Theorem, this can go in all directions and to all levels.
There are no special, privileged levels, except the ones that happen to
be there by default. It all depends on what you need to tell apart from
what, and which features will let you do that reliably.

Next, there is the mixing of the physical/metaphysical and the
psychological (questions about what there really is in the world vs.
questions about what you can know about what there is in the world, and
how): For, say, biological purposes, sparrows and penguins are EQUALLY
birds (100%!). "Representativeness" has NOTHING to do with that; it's
another matter.
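
A minimal sketch of the all-or-none point, in Python. The three
"defining" attributes are only the toy ones from the quoted bird
example (an assumption for illustration, not real zoological criteria),
and the names BIRD and is_member are hypothetical:

```python
# Toy defining attributes from the quoted "bird" example (illustrative only).
BIRD = frozenset({"beak", "feathers", "wings"})

def is_member(item_features, defining_features):
    """All-or-none: True only if every defining feature is present."""
    return defining_features <= set(item_features)

sparrow = {"beak", "feathers", "wings", "small", "brown"}
penguin = {"beak", "feathers", "wings", "black and white", "swims"}
bat = {"wings", "fur"}

print(is_member(sparrow, BIRD))  # True
print(is_member(penguin, BIRD))  # True
print(is_member(bat, BIRD))      # False
```

The function can only answer yes or no; the penguin's extra features
neither add to nor subtract from its membership.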

The question of physical/metaphysical features ("what features does
something have to have in order to be a bird?": this is the kind of
question SCIENCE must answer for us) is not the same as the question of
psychological features ("how do we manage to tell what's a bird and
what isn't?" e.g., "what SENSORY features do we use to tell whether
what we are looking at is or is not a bird?").

So objects have an infinite number of "features" (the Ugly Duckling
Theorem -- talking about another bird!) and science studies their
features to sort out what's what for scientists. Amongst those features,
if you like, are the kinds of things that the shapes of birds can do to
the shapes of the shadows on our retinas. The features of their shadows
on our retinas are the ones we use to tell what's a bird and what's not.
The other features are irrelevant for that task, and that task is
categorisation.

> There were criticisms of the defining attributes view. First, it
> considered all attributes within the definition as equally salient.
> However, in speed tests and descriptive tasks, some attributes appeared
> more relevant than others (Conrad, 1972). Second, it considered all
> members of a category to be equally representative. However, when
> people were asked to rate typicality, some members were consistently
> voted more typical. Typical members were also categorised and learned
> faster than atypical members.

All this about typicality and speed and ease of learning is all well and
good, but unless it casts light on how we actually manage to DO it -- it
being CATEGORISATION, and not the judging of typicality -- it is merely
changing the subject, isn't it? For BEFORE you can say how typical a
bird this is, you have to be able to say it's a bird in the first place!

> It was difficult to determine what the
> defining attributes were for many categories, the most celebrated
> example being "games" (Wittgenstein). This led to the proposal that
> members of a category shared family resemblances rather than a set of
> defining features.

If the "defining" features are meant to be "metaphysical," having to do with
what birds really are, then they're none of our business, unless we are
avian zoologists. If they are psychological, to do only with how we manage
to sort things correctly as birds vs. nonbirds (by sight, in the first
instance), then it is no surprise that people can't tell you what they
are by introspection: It is cognitive science that must figure out what
they are, and how the device between their ears manages to find and use
them.

About "family resemblances," see my earlier replies on earlier threads
discussing "Rosch: Categorisation." In brief, "family resemblances" just
boil down to feature combinations, sometimes in an either/or
combination. That still makes them features, perfectly "classical"
features. And games: Well, there are some things about which no one
knows whether they are games; others about which not everyone agrees
(and there is no one around to say what's right or wrong, since we
decide ourselves what we consider a game); and then there are the cases
where every adult on the planet can correctly say whether they are or
are not games. Those are the ones we should be considering, because
those are the ones we can categorise. Anyone have a better idea than
that they, like everything else, are sorted on the basis of features --
not features we can identify by introspection, but then what else is
new? Introspection can hardly tell us anything...

> Second, the notion of clear cut boundaries between
> categories was called into question. Sometimes, people could not agree
> whether an X belonged in one category or another.

Where people cannot agree, there may still be a fact of the matter: For
example, science will in the end tell us what is and is not a bird. And
in other cases, such as whether or not "skywriting" is a game, there may
be no fact of the matter, because it's more or less a social decision
whether or not to treat it as a game: In all cases where the "boundary
uncertainty" is for reasons like this, there is no boundary uncertainty,
because people cannot categorise. (Remember, the boundary question in
cognitive science is psychological; it has to do with the features people
use when they categorise: where they CANNOT categorise, there is no question
for cognitive science to answer!)

As I've said before, the very notion of categorisation is an all-or-none
(otherwise known as "categorical") one: A thing either is in a given
category or not. Its degree of TYPICALITY in that category should not be
confused with its degree of MEMBERSHIP. Typicality might be at 40% while
membership is rock-solid at 100% (vs 0% -- all or none). Metaphysical
uncertainty is irrelevant; it has to do with what things really are, not
how we are able to sort and label them NOW (because in such cases the
answer is: we CANNOT sort and label them!).

Now there are "categories" that really do seem to have "degrees of
membership," and the example I gave was "big" (in contrast to "bird,"
which is all or none): Every concrete object is "big" -- to a degree,
even a flea. Big is a relative category. You have to set the scale.
If a planet is the biggest thing you consider and a flea the smallest,
then people are small. If you consider only animal species, then I
suppose we're medium big.

With big, though, the problem is not a "fuzzy boundary," but the fact
that there is no boundary at all, or just an arbitrary one, for big is
just one pole of a dimension that has small on the other end of it, and
everything that has any size at all is BOTH small and big, but to
different degrees.

Cold/hot is a more interesting case, both metaphysically and
psychologically. Metaphysically, although the temperature dimension is
continuous, there are some "critical points," like the boiling and
freezing points for water, where something special happens (physics
calls it a "phase transition"). There you have real boundaries along
the continuous dimension, but, as I've said, the physical and
metaphysical stuff is none of our business.

Psychologically, some things feel hot and some things feel cold, and
some things feel neutral in between. There the boundary also exists,
but it's somewhat fuzzy: the threshold for moving from neutral to, say,
warm, has all of the usual inexactness and variability of a
psychophysical threshold. This kind of fuzziness -- arising from
psychophysical thresholds -- can always be found for sensory stimuli,
and it DOES have some relevance to categorisation, but not the
relevance that the champions of category boundary fuzziness in the
Roschian tradition have in mind: They infer from typicality judgments
and boundary uncertainty, plus some metaphysical/psychological
confusion, that membership in a category is a matter of degree (with
the absurd outcome that even fish are birds, to a degree). In reality,
(nonrelational) categories are all-or-none, and boundary uncertainty
occurs either because of inability to categorise YET (learning still
underway), inability to categorise EVER (in which case, woe is us if
our lunch depends on it!), or no category there at all (because there
are no consequences that matter that follow from sorting one way or
another). The rest is just relational categories (big/small) and
threshold psychophysics (the limits of our sensory discriminating
apparatus).

> However, these criticisms can be regarded as a manifestation of how the
> questions are asked, they are looking at the metaphysical aspects of
> categories rather than the psychological aspect of how we categorise.

Good; I was sure this was coming, but, under time pressure, am
commenting as I go along.

> Rating for typicality is a similarity judgement, and not
> categorisation. Judging the salient properties in the manner above can
> also only be performed after the categories have been established.

Spot on with both these points!

> Both
> judgements can be influenced by the categorisation - this is known as
> categorical perception. The knowledge of the categories will affect a
> person's judgement of the categories. Although the defining attributes
> view may have had limitations, the criticisms above did not address the
> important ones, such as how the defining attributes were decided upon.

Kid brother would have some trouble with this quick flotation of
categorical perception, but for present purposes, it's the right dark
allusion to make, and the details can come when the focus is on CP
rather than classical categorisation. Note, though, that "defining"
was always a red herring: Defining how? Metaphysically? Perceptually?
Lexically (as in a dictionary)? For the last of these, it's a bit ahead
of the game, because you have to have categories before you can use
their names to define other categories (the symbol grounding problem).

> An alternative view of categorisation was proposed to account for these
> so-called deficits, especially the typicality effects and the fuzziness
> of the boundaries. Theories within this view were called "prototype
> theories" (eg Rosch). The prototype theories all suggested that
> categorisation was based on a prototype. A prototype is an ideal, or
> central, member. Although there may be necessary attributes, membership
> does not depend on them as a defining set. They are not "jointly
> sufficient". Membership is determined by the similarity of an item's
> properties to those of a prototype for that category. Members of a
> category will therefore show a typicality gradient. The boundaries
> between categories are not clearly defined, but "fuzzy". Again, the
> categories were arranged hierarchically, and Rosch suggested there was
> a basic level of categorisation at which all manner of operations
> (perception, language etc) will converge.

ProtoTYPE-matching is fine as a model for typicality judgment, which is
always a matter of degree, but it's not a very successful model for
categorisation, which is feature-based and all-or-none. Note as a matter
of logic, though, that Saddam Hussein can always DEFINE a category
(something you must bow down and worship or else be shot) as the idol
Baal, and certain continuous deformations of it, up to a (fuzzy?)
boundary, beyond which you must not bow down and worship it, but instead
attack and destroy it (or be shot). This template-deformation-based
category would then be a very special case of a feature-based category,
in which the (perfectly classical) feature consists of all the
within-boundary deformations, and the rest is not in the category.
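
One way to picture this special case, as a rough Python sketch. The
template, the boundary value, the use of Euclidean distance as the
"deformation" and the typicality formula are all assumptions chosen for
illustration; none of them comes from a particular model:

```python
import math

# Hypothetical template and boundary; the numbers are arbitrary.
TEMPLATE = (0.0, 0.0)
BOUNDARY = 1.0

def deformation(item):
    """Distance of an item from the template (Euclidean, by assumption)."""
    return math.dist(item, TEMPLATE)

def typicality(item):
    """Graded: 1.0 at the template, falling towards 0 as deformation grows."""
    return 1.0 / (1.0 + deformation(item))

def is_member(item):
    """All-or-none: the one classical feature is 'deformation within boundary'."""
    return deformation(item) <= BOUNDARY

for item in [(0.0, 0.0), (0.6, 0.0), (3.0, 4.0)]:
    print(item, typicality(item), is_member(item))
```

Typicality varies continuously across the three items, but is_member
still returns only True or False, which is what keeps the category
classical.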

The empirical question is: How many of the tens of thousands of
categories we can sort and name are like that? If many or most, then
template-matching would be a good model for categorisation (note that it
would still be classical, as long as the boundary, be it ever so fuzzy,
separated two categories in an all-or-none fashion). The empirical
failure of template-matching models of pattern recognition to do very
much successful categorisation suggests that this special case of
feature-based models is not a very useful or representative one...

> There are criticisms of the prototype view. First, not all categories
> have prototype characteristics, particularly abstract categories such
> as "instinct" (Hampton, 1981).

No need to defend too much, because prototypes mostly bear on typicality,
so even if there are more and less typical instincts, it's irrelevant to
the real question: categorisation.

> Second, when categorising, people do not
> just look for properties which co-occur together, but for properties
> which co-occur with the consequences of getting the categorisation
> right. For example, there are two large animals, which look identical
> in appearance, except for one feature. One animal has sharp canines,
> because it is a carnivore (and likes eating humans), and the other has
> flat teeth for grinding leaves. The correlation between fur, eye
> colour, ear size and tail length etc are all largely irrelevant as far
> as the consequences of miscategorisation are concerned. The only
> feature which matters is teeth, and that it co-occurs with whether the
> animal will eat humans, not that it co-occurs with all the other
> features.

Correct, but since you have not yet made the case FOR co-occurrence
(this is your first mention of it), the case against it is somewhat lost
on your kid brother...

> However, by far the biggest problem with the prototype view is that it
> is approaching the problem from the wrong direction: it is looking at
> the metaphysical problem of what constitutes a category. Typicality
> judgements may be useful for answering this type of question, but it is
> not the same task as categorisation, and presupposes that a category
> already exists.

And has been found by people; i.e., they can sort what is and isn't in
it correctly.

> The real question that should be asked by psychologists
> is how we categorise. The only situation in which prototypes help
> answer that question is in continuous categories, and categorisation of
> unidimensional, continuous stimuli is not very successful outside a few
> special cases where inborn feature detectors appear to be present.

Also in relational categories as well as the template-deformation case
I mentioned; it is possible that facial expressions have templates, and
recognition is based on closeness to a template. It is not clear how
general a model this would be, and important to recognise that template
deformation and matching is just a very special case of feature
matching (and perfectly classical).

> Before we address how we categorise, perhaps we need to ask why we
> categorise, why is it useful? The answer is quite simple: there are
> consequences of miscategorisation. Miscategorising toadstools as
> mushrooms has the consequence of poisoning, therefore being able to
> categorise mushrooms and toadstools is useful in avoiding the
> consequence of miscategorisation.

Fine, though kid brother would be puzzled about how the mushroom world
and its consequences relate to the overall discussion, based only on
what you've said...

> So, how do humans categorise? As far as correlations between properties
> are concerned (prototype theories) this is not useful.

Explain how feature correlation is related to prototype theory.

> The Ugly
> Duckling Theorem illustrates that when all properties are considered,
> everything becomes infinitely unique, and of no use in determining
> possible consequences, or reducing uncertainty about what action to
> take.

Not quite. Watanabe points out that every PAIR of objects is equally
similar, sharing exactly the same number of features (if you can count
them all as equal, from the most concrete to the most abstract and
arbitrary, such as not being blue on Sundays), because there are an
infinite number of them. You are conflating the infinite uniqueness of
individual objects (and of our every perceptual instant of experience of
them), which figures in Luria's Mnemonist and Borges's Funes, with
the equal similarity between all things, because of arbitrary pairings
of their unique, infinite feature sets.
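
A finite toy version of the equal-similarity point can be checked
directly. The sketch below (in Python) assumes, as the theorem does,
that every possible predicate counts as a feature and that all count
equally; with a finite set of objects, a predicate is then just a
subset of them. The four objects and the function names are
hypothetical:

```python
from itertools import combinations

def all_predicates(objects):
    """Every subset of the objects, read as 'the things this predicate is true of'."""
    objects = list(objects)
    return [frozenset(c)
            for r in range(len(objects) + 1)
            for c in combinations(objects, r)]

def shared_count(a, b, predicates):
    """Number of predicates that are true of both a and b."""
    return sum(1 for p in predicates if a in p and b in p)

objects = ["swan", "duckling", "toadstool", "flea"]
predicates = all_predicates(objects)
counts = {pair: shared_count(pair[0], pair[1], predicates)
          for pair in combinations(objects, 2)}

for pair, n in counts.items():
    print(pair, n)  # every pair shares 4 of the 16 predicates
```

With n objects every pair shares 2^(n-2) predicates, so the swan is no
more similar to the duckling than to the toadstool until some features
are weighted more heavily than others.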

> Instead, some of the properties have to be selected, and others
> ignored when a person categorises. Furthermore, the properties are
> selected by their correlation with the consequences of categorising one
> way or another. The properties which get the categorisation "right" are
> the ones which are selected. "Right" is determined by the consequences.
> If there are no consequences, the categories are trivial, and perhaps
> better described as subjective tastes.

Good, but to drive it home to kid brother you have to point out that
because every time you see something it will differ in many ways, you
have to ignore all the differences to recognise it's the same thing.
That's selection and abstraction. Same is true for recognising that it's
the same KIND (category) of thing; in fact, recognising that it is the
same thing is already a categorisation: e.g., all the sensory shadows of
Ed.

The consequences tell you whether you've sorted correctly or
incorrectly. The magic device in your head (a neural net, perhaps?)
finds the winning features, under the guidance of the feedback,
learning to pick out the features that matter and ignore the ones that
don't, from among the huge number of possible features.
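
As a very rough illustration of feedback-guided feature selection,
here is a simple perceptron in Python. It is only a stand-in for
whatever the device in the head actually does, not a claim about it.
The four binary features are hypothetical, loosely based on the
look-alike animals in the quoted example, and the consequence is
assumed to depend on the teeth alone:

```python
from itertools import product

# Hypothetical binary features for the two look-alike animals.
FEATURES = ["fur", "eye_colour", "ear_size", "sharp_teeth"]

# Every feature combination; the consequence (1 = dangerous) depends on teeth alone.
samples = [(x, x[3]) for x in product([0, 1], repeat=len(FEATURES))]

def predict(weights, bias, x):
    """Sort as dangerous (1) when the weighted feature sum is positive."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

def train(samples, max_epochs=100):
    """Perceptron rule: weights change only after a mis-sorting."""
    weights, bias = [0] * len(FEATURES), 0
    for _ in range(max_epochs):
        mistakes = 0
        for x, label in samples:
            error = label - predict(weights, bias, x)  # feedback: -1, 0 or +1
            if error:
                mistakes += 1
                weights = [w + error * xi for w, xi in zip(weights, x)]
                bias += error
        if mistakes == 0:  # a full pass with no mis-sorting: stop
            break
    return weights, bias

weights, bias = train(samples)
print(dict(zip(FEATURES, weights)), bias)
```

Once the sorting is error-free, the weight on sharp_teeth has to
outweigh the weights on the other three features put together; the
feedback alone has picked out the feature that matters.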

> Categories can be arranged hierarchically, but this is arbitrary, and
> so is the entry point. This often depends upon context. The "basic
> level" of Rosch's theory is really a default context. The uniformity of
> response probably springs from uniformity of experience within a
> culture.

All good, though again too fast for kid-bro. This is just an outline,
though, and I assume you know what you mean. In the exam you should be
more explicit, striking a good balance between breadth and depth.

There are lots of ways to sort things. The outcome depends on your
sample and on the consequences of sorting. Some of the categories we all
have are inborn, part of our evolutionary legacy; some are because our
world and needs are similar (we all get sick if we eat toadstools); and
some because we construct the same systems of social consequences
(atheism is ill-viewed in our planet's mostly theistic societies).

> Clear boundaries are also important. Categorisation means responding
> differentially to certain categories of input; therefore boundaries
> signal a change in behaviour and are all or nothing. Categorisation
> involves selecting which features are invariant, and disregarding the
> rest. This is more reminiscent of the defining attribute theory than
> the prototype theories. Feature detection would appear a more plausible
> mechanism for categorisation than prototype matching.

Prototype matching is actually a special case of feature detection,
not an alternative to it.

> In conclusion, the classical view of categorisation was of defining
> attributes. Several things were thought to be wrong with this. The
> main problems were that it was difficult to establish the defining
> attributes in many cases, some category members were found to be more
> typical than others, and were categorised and learned faster, and it
> was thought that categories were based on family resemblances rather
> than features. The alternatives proposed were the prototype theories.
> Although these may address the metaphysical problem of what a category
> is, they do not fare well in explanatory strength of how we actually
> categorise.

They don't do very well with the metaphysical problem either, but never
mind...


