﻿<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>BlogJava-Shaird's Java Space - Article Category - AI:General</title><link>http://www.blogjava.net/Shaird/category/29090.html</link><description>      -- Restarting the study of JAVA</description><language>zh-cn</language><lastBuildDate>Sun, 03 Feb 2008 14:09:22 GMT</lastBuildDate><pubDate>Sun, 03 Feb 2008 14:09:22 GMT</pubDate><ttl>60</ttl><item><title>[Repost] Ontologies – Description and Applications</title><link>http://www.blogjava.net/Shaird/articles/178899.html</link><dc:creator>Shaird</dc:creator><author>Shaird</author><pubDate>Fri, 01 Feb 2008 13:44:00 GMT</pubDate><guid>http://www.blogjava.net/Shaird/articles/178899.html</guid><wfw:comment>http://www.blogjava.net/Shaird/comments/178899.html</wfw:comment><comments>http://www.blogjava.net/Shaird/articles/178899.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/Shaird/comments/commentRss/178899.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/Shaird/services/trackbacks/178899.html</trackback:ping><description><![CDATA[<h3>Abstract</h3>
<p>The word &#8220;ontology&#8221; has gained considerable popularity within the AI community. An ontology is usually viewed as a high-level description consisting of concepts that organize the upper parts of a knowledge base. However, the meaning of the term &#8220;ontology&#8221; tends to be somewhat vague, as the term is used in different ways. In this paper we attempt to clarify the meaning of ontology, including the philosophical views, and show why ontologies are useful and important. We give an overview of ontology structures in several particular systems. A field proposed within ontological efforts, &#8220;ontological engineering&#8221;, is also described. Several particular uses of ontologies are discussed. These include systems and ideas to support knowledge base sharing and reuse, both for computers and humans; ontology-based communication in multi-agent systems; applications of ontologies for natural language processing; and applications in document search and the enrichment of knowledge bases, both particularly for the World Wide Web environment, and in the construction of educational systems, particularly intelligent tutoring systems. </p>
<br />
<h2>1 Introduction</h2>
<p>The word &#8220;ontology&#8221; has gained considerable popularity within the AI community. An ontology is usually viewed as a high-level description consisting of concepts that organize the upper parts of a knowledge base. However, the meaning of the term &#8220;ontology&#8221; tends to be somewhat vague, as the term is used in different ways. In this paper we attempt to clarify the meaning of ontology and show why ontologies are useful and important. We discuss several particular uses of ontologies, such as knowledge base reuse, knowledge sharing, communication in multi-agent systems, and applications of ontologies for WWW applications, for natural language processing, and for intelligent tutoring systems. </p>
<h3>1.1 Motivation</h3>
<p>In the history of AI research, two types of research have been identified [31, 8]: form-oriented research (mechanism theories) and content-oriented research (content theories). The former deals with logic and knowledge representation, while the latter deals with the content of knowledge. The former has clearly dominated AI to date; recently, however, content-oriented research has attracted growing attention, because solving many real-world problems, such as knowledge reuse, facilitation of agent communication, media integration through understanding, and large-scale knowledge bases, requires not only advanced theories and reasoning methods but also sophisticated treatment of the content of knowledge. Formal theories such as predicate logic provide us with a powerful tool to guarantee sound reasoning and thinking. They even enable us to discuss the limits of our reasoning in a principled way. However, they cannot answer questions such as what knowledge we should have for solving given problems, what knowledge is at all, what properties a specific piece of knowledge has, and so on. Sometimes the AI community gets excited by mechanisms such as neural nets, fuzzy logic, genetic algorithms, or constraint propagation. These mechanisms are proposed as the &#8220;secret&#8221; of making intelligent machines. At other times, it is realized that, however wonderful the mechanism, it cannot do much without a good content theory of the domain on which it is to work. Moreover, we often recognize that once a good content theory is available, many different mechanisms might be used equally well to implement effective systems, all using essentially the same content. </p>
<p>The importance of content-oriented research is being recognized more and more nowadays. Unfortunately, there seem to be no widely recognized, sophisticated methodologies for content-oriented research at present; until recent years, the major results have been only the development of knowledge bases. </p>
<p>The reasons for this can be [31]:</p>
<ul>
<li>content-oriented research tends to be ad hoc;</li>
<li>there is no methodology that enables research results to be accumulated.</li>
</ul>
<p>It is necessary to overcome these difficulties in content-oriented research, and ontologies are proposed for that purpose. Ontology engineering, as proposed in e.g. [31], is a research methodology which gives us the design rationale of a knowledge base, a kernel conceptualization of the world of interest, strict definitions of the meanings of basic concepts, and sophisticated theories and technologies enabling the accumulation of knowledge that is indispensable for modeling the real world.</p>
<p>Interest in ontologies has also grown as researchers and system developers have become more interested in reusing or sharing knowledge across systems. Currently, one key impediment to sharing knowledge is that different systems use different concepts and terms for describing domains. These differences make it difficult to take knowledge out of one system and use it in another. If we could develop ontologies that could be used as the basis for multiple systems, they would share a common terminology that would facilitate sharing and reuse. Developing such reusable ontologies is an important goal of ontology research. Similarly, if we could develop tools that support merging ontologies and translating between them, sharing would be possible even between systems based on different ontologies. </p>
<h3>1.2 Philosophical View</h3>
<p>The term ontology was taken from philosophy. According to Webster&#8217;s Dictionary, an ontology is:</p>
<ul>
<li>a branch of metaphysics relating to the nature and relations of being;</li>
<li>a particular theory about the nature of being or the kinds of existence.</li>
</ul>
<p>Ontology (the &#8220;science of being&#8221;) is a word, like metaphysics, that is used in many different senses. It is sometimes considered to be identical to metaphysics, but we prefer to use it in a more specific sense: as that part of metaphysics that specifies the most fundamental categories of existence, the elementary substances or structures out of which the world is made. Ontology thus analyzes the most general and abstract concepts or distinctions that underlie every more specific description of any phenomenon in the world, e.g. time, space, matter, process, cause and effect, system. Recently, the term &#8220;ontology&#8221; has been taken up by researchers in Artificial Intelligence, who use it to designate the building blocks out of which models of the world are made. </p>
<p>An agent (e.g. an autonomous robot) using a particular model will only be able to perceive the part of the world that its ontology is able to represent. In this sense, only the things in its ontology can exist for that agent. In that way, an ontology becomes the basic level of a knowledge representation scheme. An example is a set of link types for a semantic network representation, based on a set of &#8220;ontological&#8221; distinctions: changing&#8211;invariant and general&#8211;specific. </p>
<h2>2 What is an Ontology?</h2>
<p>The term &#8220;ontology&#8221; is used in many different ways. In this section we discuss what an ontology is in terms of several definitions that are currently used. </p>
<h3>2.1 Common Definitions</h3>
<p>The most widespread definitions of ontology are given below.</p>
<ol>
<li>Ontology is a term in philosophy whose meaning is &#8220;theory of existence&#8221;.</li>
<li>Ontology is an explicit specification of conceptualization [21].</li>
<li>Ontology is a theory of vocabulary or concepts used for building artificial systems [31].</li>
<li>Ontology is a body of knowledge describing some domain (e.g. a common sense knowledge domain in CYC [45]).</li>
</ol>
<p>Definition 1 is radically different from all the others (including the additional ones discussed below). We will shortly discuss some implications of its meaning for the definition of &#8220;ontology&#8221; for AI purposes. The second definition is generally proposed as the definition of what an ontology is for the AI community. It may be classified as &#8220;syntactic&#8221;, but its precise meaning depends on the understanding of the terms &#8220;specification&#8221; and &#8220;conceptualization&#8221;. The third definition is a proposal for a definition within the knowledge engineering community. The fourth definition differs from the previous two: it views the ontology as an inner body of knowledge, not as the way to describe the knowledge. Although these definitions are compact, they are not sufficient for an in-depth understanding of what an ontology is. We will try to give more comprehensive definitions and insights. </p>
<h3>2.1.1 Ontology as a Philosophical Term</h3>
<p>Following [24], we will use the convention that an uppercase initial letter &#8220;O&#8221; distinguishes &#8220;Ontology&#8221; as a philosophical discipline from other usages of the term. Ontology is a branch of philosophy that deals with the nature and the organization of reality. It tries to answer questions like &#8220;what is existence&#8221;, &#8220;what properties can explain the existence&#8221;, etc. Aristotle defined Ontology as the science of being as such. Unlike the special sciences, each of which investigates a class of beings and their determinations, Ontology regards &#8220;all the species of being qua being and the attributes that belong to it qua being&#8221; (Aristotle, Metaphysics, IV, 1). In this sense Ontology tries to answer the question &#8220;what is being?&#8221; or, in a meaningful reformulation, &#8220;what are the features common to all beings?&#8221;. This is what is today called &#8220;General Ontology&#8221;, in contrast with various Special or Regional Ontologies (e.g. Biological, Social). From this, Formal Ontology is defined as an area that has to determine the conditions of the possibility of the object in general and the individuation requirements that every object&#8217;s constitution has to satisfy. According to [24], Formal Ontology can be defined as the systematic, formal, axiomatic development of the logic of all forms and modes of being. Hence, Formal Ontology is concerned not so much with the existence of certain objects as with the rigorous description of their forms of being, i.e. their structural features. In practice, Formal Ontology can be intended as the theory of the distinctions that can be applied independently of the state of the world, i.e. the distinctions:</p>
<ul>
<li>among the entities of the world (physical objects, events, regions, ...);</li>
<li>among the meta-level categories used to model the world (concept, property, quality, state, role, part, ...).</li>
</ul>
<p>In this sense, Formal Ontology, as a discipline, may be relevant to both Knowledge Representation and Knowledge Acquisition [24].</p>
<h3>2.1.2 Ontology as a Specification of Conceptualization</h3>
<p>The second definition of ontology mentioned above, an explicit specification of conceptualization, is briefly described in [20]. The definition comes from [22], where the ontology is used in the context of knowledge sharing. According to Thomas Gruber, an explicit specification of conceptualization means that an ontology is a description (like a formal specification of a program) of the concepts and relationships that can exist for an agent or a community of agents. This definition is consistent with the usage of ontology as a set of concept definitions, but more general. In this sense, an ontology is important for the purpose of enabling knowledge sharing and reuse. An ontology is in this context a specification used for making ontological commitments. Practically, an ontological commitment is an agreement to use a vocabulary (i.e. ask queries and make assertions) in a way that is consistent (but not complete) with respect to the theory specified by an ontology. Agents are then built that commit to ontologies, and ontologies are designed so that knowledge can be shared with and among those agents.</p>
<p>The body of knowledge is based on a conceptualization: the objects, concepts, and other entities that are assumed to exist in some area of interest, and the relationships that hold among them. A conceptualization is an abstract, simplified view of the world that we wish to represent for some purpose. Every knowledge base, knowledge-based system, or knowledge-level agent is committed to some conceptualization, explicitly or implicitly.</p>
<p>For these systems, what &#8220;exists&#8221; is that which can be represented. When the knowledge of a domain is represented in a declarative formalism, the set of objects that can be represented is called the universe of discourse. This set of objects, and the describable relationships among them, are reflected in the representational vocabulary with which a knowledge-based program represents knowledge. Thus, in the context of AI, we can describe the ontology of a program by defining a set of representational terms. In such an ontology, definitions associate the names of entities in the universe of discourse (e.g. classes, relations, functions, or other objects) with human-readable text describing what the names mean, and with formal axioms that constrain the interpretation and well-formed use of these terms. Formally, it can be said that an ontology is a statement of a logical theory [20].</p>
<p>Ontologies are often equated with taxonomic hierarchies of classes, without class definitions and the subsumption relation. Ontologies need not be limited to these forms. Ontologies are also not limited to conservative definitions, that is, definitions in the traditional logic sense that only introduce terminology and do not add any knowledge about the world. To specify a conceptualization, one needs to state axioms that do constrain the possible interpretations for the defined terms.</p>
<p>Pragmatically, a common ontology defines the vocabulary with which queries and assertions are exchanged among agents. Agents sharing a vocabulary need not share a knowledge base. An agent that commits to an ontology is not required to answer all queries that can be formulated in the shared vocabulary. In short, a commitment to a common ontology is a guarantee of consistency, but not completeness, with respect to queries and assertions using the vocabulary defined in the ontology. </p>
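<p>The notion of ontological commitment described above can be sketched concretely. The following minimal Python sketch (all names here are invented for illustration; this is not any real ontology toolkit) shows an agent that only makes and answers assertions in a shared vocabulary: its answers are consistent with the ontology, but not complete.</p>

```python
# A minimal sketch of "ontological commitment" (hypothetical names):
# a shared ontology fixes a vocabulary, and an agent that commits to it
# only makes, and answers, assertions expressed in that vocabulary.

VOCABULARY = {"Thing", "PhysicalObject", "Process"}
RELATIONS = {"type-of"}

class CommittedAgent:
    def __init__(self):
        self.assertions = set()

    def tell(self, subj, rel, obj):
        # commitment = consistency with the shared vocabulary
        if subj not in VOCABULARY or obj not in VOCABULARY or rel not in RELATIONS:
            raise ValueError(f"outside shared vocabulary: {(subj, rel, obj)!r}")
        self.assertions.add((subj, rel, obj))

    def ask(self, subj, rel, obj):
        # consistent but not complete: an unknown fact is "unknown", not false
        return True if (subj, rel, obj) in self.assertions else None

agent = CommittedAgent()
agent.tell("PhysicalObject", "type-of", "Thing")
```

<p>Note that an unknown fact is answered with &#8220;unknown&#8221; (None) rather than false: the commitment guarantees consistency, not completeness.</p>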
<h3>2.1.3 Ontology as a Representational Vocabulary</h3>
<p>The third definition of ontology proposed above says that it is in fact a representational vocabulary [8, 31]. The vocabulary can be specialized to some domain or subject matter. More precisely, it is not the vocabulary as such that qualifies as an ontology, but the conceptualization that the terms in the vocabulary are intended to capture. Thus, translating the terms in an ontology from one language to another, for example from Czech to English, does not change the ontology conceptually.</p>
<p>In engineering design, one might discuss the ontology of an electronic-devices domain, which might include vocabulary that describes conceptual elements (transistors, operational amplifiers, and voltages) and the relations between these elements: operational amplifiers are a type-of electronic device, and transistors are component-of operational amplifiers. Identifying such a vocabulary and the underlying conceptualization generally requires careful analysis of the kinds of objects and relations that can exist in the domain.</p>
<p>The term ontology is sometimes used to refer to a body of knowledge describing some domain (see below), typically a common sense knowledge domain, using a representational vocabulary. For example, CYC [45] often refers to its knowledge representation of some area of knowledge as its ontology. In other words, the representational vocabulary provides a set of terms with which one can describe the facts in some domain, while the body of knowledge using that vocabulary is a collection of facts about a domain. However, this distinction is not as clear as it might first appear. In the electronic-device example, the fact that a transistor is a component-of an operational amplifier, or that the latter is a type-of electronic device, is just as much a fact about its domain as a CYC fact about some aspect of space, time, or numbers. The distinction is that the former emphasizes the use of ontology as a set of terms for representing specific facts in an instance of the domain, while the latter emphasizes the view of ontology as a general set of facts to be shared.</p>
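<p>The electronic-devices example can be sketched in a few lines of Python (the terms and relation names are taken from the example above; the code itself is only an illustration, not any particular ontology system). It keeps the two relations separate and treats only type-of as a taxonomy:</p>

```python
# Illustrative sketch of the electronic-devices vocabulary discussed above.
# "type-of" links form a taxonomy and are followed transitively;
# "component-of" is a different relation and is not treated that way here.

TYPE_OF = {"OperationalAmplifier": "ElectronicDevice"}
COMPONENT_OF = {"Transistor": "OperationalAmplifier"}

def is_type_of(term, ancestor):
    """True if 'term' reaches 'ancestor' by following type-of links upward."""
    while term in TYPE_OF:
        term = TYPE_OF[term]
        if term == ancestor:
            return True
    return False

def is_component_of(part, whole):
    """Direct component-of link only (no transitivity assumed here)."""
    return COMPONENT_OF.get(part) == whole
```

<p>With these definitions, an operational amplifier is a type-of electronic device, while a transistor is a component-of an operational amplifier but not a type-of electronic device, which is exactly the distinction the example draws.</p>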
<h3>2.1.4 Ontology as a Body of Knowledge</h3>
<p>Sometimes, ontology is defined as a body of knowledge describing some domain, typically a common sense knowledge domain, using a representational vocabulary as described above. In this case, an ontology is not only the vocabulary, but the whole &#8220;upper&#8221; knowledge base (including the vocabulary that is used to describe this knowledge base). The typical example of this usage is the project CYC (<a class="external" title="http://www.cyc.com/" href="http://www.cyc.com/">http://www.cyc.com/</a>, [45]), which defines its knowledge base as an ontology for any other knowledge-based system. CYC is the name of a very large, multi-contextual knowledge base and inference engine. The development of CYC started during the early 1980s, headed by Douglas Lenat. CYC is an attempt to do symbolic AI on a massive scale. It is neither based on numerical methods such as statistical probabilities, nor on neural networks or fuzzy logic. All of the knowledge in CYC is represented declaratively in the form of logical assertions. CYC contains over 400,000 significant assertions [45], which include simple statements of fact, rules about what conclusions to draw if certain statements of fact are satisfied (true), and rules about how to reason with certain types of facts and rules. New conclusions are derived by the inference engine using deductive reasoning. The CYC team doesn&#8217;t believe there is any shortcut toward being intelligent or creating an artificial-intelligence-based agent; addressing the need for a large body of knowledge with content and context may only be done by manually organizing and collating information. </p>
<p>This knowledge includes heuristic, rule-of-thumb problem-solving strategies, as well as facts that can only be known to a machine if it is told them. Much of the useful common sense knowledge needed for life is prescientific and has therefore not been analyzed in detail. Thus a large part of the work of the CYC project is to formalize common relationships and fill in the gaps between the highly systematized bodies of knowledge used by specialists. It is necessary to divide such a large knowledge base into smaller pieces to enable reasoning in reasonable time. Because of this, the CYC knowledge base uses a special context space [29], which is divided along 12 dimensions into smaller pieces (contexts) whose contents have something in common and can be used to reason about a specific problem in that context. It is possible to &#8220;lift&#8221; an assertion from one context to another when the problem requires it. The CYC common sense knowledge can be used as the body of a knowledge base for any knowledge-intensive system. In this sense, this body of knowledge can be viewed as an ontology of the knowledge base of the system. </p>
<h3>2.2 Other Ontology Definitions</h3>
<p>As we can see from the above discussion, the exact definition of ontology is not obvious; however, the definitions have much in common. In addition to the above definitions there are many other proposals for ontology definitions. Some other definitions collected in [24] are:</p>
<ol>
<li>informal conceptual system</li>
<li>formal semantic account</li>
<li>representation of a conceptual system via a logical theory, (a) characterized by specific formal properties, or (b) characterized only by its specific purposes</li>
<li>vocabulary used by a logical theory</li>
<li>(meta-level) specification of a logical theory</li>
</ol>
<p>Definitions 1 and 2 conceive an ontology as a conceptual &#8220;semantic&#8221; entity, either formal or informal, while according to interpretations 3, 4 and 5 it is a specific &#8220;syntactic&#8221; object. According to interpretation 1, an ontology is the conceptual system which may be assumed to underlie a particular knowledge base. Under interpretation 2, instead, the ontology that underlies a knowledge base is expressed in terms of suitable formal structures at the semantic level. In both cases, we may say that &#8220;the ontology of knowledge base A is different from that of knowledge base B&#8221;.</p>
<p>Under interpretation 3, an ontology is nothing else than a logical theory. The issue is whether such a theory needs to have particular formal properties in order to be an ontology or, rather, whether it is the intended purpose which lets us consider a logical theory as an ontology. The latter position can be supported by arguing that an ontology is an annotated and indexed set of assertions about something: &#8220;leaving off the annotations and indexing, this is a collection of assertions: what in logic is called a theory&#8221; (Pat Hayes&#8217; statement in [24]). According to interpretation 4, an ontology is not viewed as a logical theory, but just as the vocabulary used by a logical theory. Such an interpretation collapses into 3(a) if an ontology is thought of as a specification of a vocabulary consisting of a set of logical definitions. We may anticipate that Gruber&#8217;s interpretation (specification of conceptualization) collapses into 3(a) as well when a conceptualization is intended as a vocabulary. Finally, under interpretation 5, an ontology is seen as a specification of a logical theory in the sense that it specifies the &#8220;architectural components&#8221; (or primitives) used within a particular domain theory. </p>
<h2>3 Ontology Structure</h2>
<p>From the overview above we can see that an ontology can be perceived in basically two ways. The first approach is an ontology as a representational vocabulary, where the conceptual structure of terms should remain unchanged during translation. The other approach, discussed in this section, is an ontology as the body of knowledge describing a domain, in particular a common sense domain. An ontology can be divided in several ways, and we describe some of the proposals here. Particularly interesting is the so-called &#8220;upper ontology&#8221;, which is intended to serve as the upper part of the ontology of practically all knowledge-based systems. Some of the ways of dividing presented here are intended to be merged to form an upper ontology standard in the IEEE Standard Upper Ontology Study Group [39]. The pages linked from [39] give many other examples that could be used as some kind of an upper ontology (figure 1). </p>
<p>Figure 1: How ontologies differ in their analyses of the most general concepts [8]</p>
<p>It is interesting that many authors agree that the upper class of the ontology is &#8220;thing&#8221;; however, even at the second level they do not agree on the separation, as can be seen in figure 1. The initiative [39] tries to unify these views. </p>
<h3>3.1 CYC</h3>
<p>The ontology of CYC is based on several terms that form the fundamental vocabulary of the CYC knowledge base. The universal set is #$Thing (see figure 1): the set of everything. Every CYC constant in the knowledge base is a member of this collection. In the prefix notation of the language CycL [10], we express that fact as (#$isa CONST #$Thing). Likewise, every collection in the knowledge base is a subset of the collection #$Thing; in CycL, that fact is expressed as (#$genls COL #$Thing). The set #$Thing has subsets such as PathGeneric, Intangible, Individual, SimpleSegmentOfPath, PathSimple, MathematicalOrComputationalThing, IntangibleIndividual, Product, TemporalThing, SpatialThing, Situation, EdgeOnObject, FlowPath, ComputationalObject, and Microtheory, plus about 1500 more public subsets and about 13600 unpublished subsets. </p>
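<p>The two CycL facts quoted above can be mimicked in a small Python sketch. This is a toy model for illustration only, not CycL or the real CYC inference engine; the constant #$ExampleConstant is invented, while the collection names are taken from the list above:</p>

```python
# Toy model of the two CycL facts quoted above: #$isa relates an instance
# to a collection, #$genls relates a collection to its supersets.

GENLS = {  # collection -> direct supersets
    "#$IntangibleIndividual": {"#$Intangible", "#$Individual"},
    "#$Intangible": {"#$Thing"},
    "#$Individual": {"#$Thing"},
}
ISA = {  # constant -> collections it is directly a member of
    "#$ExampleConstant": {"#$IntangibleIndividual"},
}

def all_genls(col):
    """Transitive closure of #$genls: every superset of a collection."""
    seen, stack = set(), [col]
    while stack:
        for parent in GENLS.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def isa(const, col):
    """(#$isa CONST COL): membership in COL or any collection above it."""
    direct = ISA.get(const, set())
    return col in direct or any(col in all_genls(c) for c in direct)
```

<p>The helper all_genls follows #$genls links transitively, so every collection reaches #$Thing, and isa propagates membership upward through those same links.</p>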
<p>#$Individual is the collection of all things that are not sets or collections. Thus, #$Individual includes (among other things) physical objects, temporal subabstractions of physical objects, numbers, relations, and groups (#$Group). An element of #$Individual may have parts or a structure (including parts that are discontinuous), but no instance of #$Individual can have elements or subsets. </p>
<p>#$Collection is the collection of all CYC collections. CYC collections are natural kinds or classes, as opposed to mathematical sets. Their elements have some common attribute(s). Each CYC collection is like a set in so far as it may have elements, subsets, and supersets, and may not have parts or spatial or temporal properties. Sets, however, differ from collections in that a mathematical set may be an arbitrary set of things which have nothing in common (#$Set-Mathematical). In contrast, the elements of a collection will all have in common some feature(s), some &#8216;intensional&#8217; qualities. In addition, two instances of #$Collection can be co-extensional (i.e. have all the same elements) without being identical, whereas if two arbitrary sets had the same elements, they would be considered equal. </p>
<p>#$Individual and #$Collection are disjoint collections. No CYC constant can be an instance of both. </p>
<p>#$Predicate is the set of all CYC predicates. Each element of #$Predicate is a truth-functional relationship in CYC which takes some number of arguments. Each of those arguments must be of some particular type. Informally, one can think of elements of #$Predicate as functions that always return either true or false. More formally, when an element of #$Predicate is applied to the legal number and type of arguments, an expression is formed which is a well-formed formula (wff) in CycL. Such expressions are called atomic formulas if they contain variables, or ground atomic formulas (gaf) if they contain no variables. </p>
<p>#$isa: &lt;#$ReifiableTerm&gt; &lt;#$Collection&gt; expresses the ISA relationship. (#$isa EL COL) means that EL is an element of the collection COL. CYC knows that #$isa distributes over #$genls. That is, if one asserts (#$isa EL COL) and (#$genls COL SUPER), CYC will infer that (#$isa EL SUPER). Therefore, in practice one only manually asserts a small fraction of the #$isa assertions; the vast majority are inferred automatically by CYC. </p>
<p>#$genls: &lt;#$Collection&gt; &lt;#$Collection&gt; expresses a similar relationship for collections (generalization). (#$genls COL SUPER) means that SUPER is one of the supersets of COL. Both arguments must be elements of #$Collection. Again, as with #$isa, CYC knows that #$genls is transitive; therefore, in practice one only manually asserts a small fraction of the #$genls assertions, since the rest are inferred automatically. More details about the structure of the CYC ontology and about how the CYC knowledge base is constructed can be found at <a class="external" title="http://www.cyc.com" href="http://www.cyc.com/">http://www.cyc.com</a>. </p>
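<p>The interplay of the two relations can be sketched as follows (a minimal, hypothetical Python sketch, not CycL and not how CYC is actually implemented; the constants are illustrative):</p>

```python
from collections import defaultdict

# Sketch of why most #$isa facts never need to be asserted by hand:
# #$genls is transitive, and #$isa distributes over #$genls.
genls = defaultdict(set)      # collection -> directly asserted supersets
isa_facts = defaultdict(set)  # constant   -> directly asserted collections

def assert_genls(col, sup):
    genls[col].add(sup)

def assert_isa(el, col):
    isa_facts[el].add(col)

def supersets(col):
    """Transitive closure of #$genls above one collection."""
    seen, stack = set(), [col]
    while stack:
        for sup in genls[stack.pop()]:
            if sup not in seen:
                seen.add(sup)
                stack.append(sup)
    return seen

def holds_isa(el, col):
    """(#$isa EL COL): asserted directly, or inferred because
    (#$isa EL C) holds and COL is in the #$genls closure of C."""
    return col in isa_facts[el] or any(
        col in supersets(c) for c in isa_facts[el])

assert_genls("Dog", "Mammal")
assert_genls("Mammal", "Thing")
assert_isa("Fido", "Dog")
print(holds_isa("Fido", "Thing"))  # True, though never asserted
```

Only one #$isa fact is asserted here; membership in every superset follows automatically, which is exactly the economy the paragraph above describes.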
<p><br />
</p>
<h3>&nbsp;</h3>
<h3>3.2 Russell &amp; Norvig&#8217;s General Ontology Russell &amp; Norvig 的大本体</h3>
<p>Yet another view of general ontology structure is presented in Russell &amp; Norvig&#8217;s book [38]. Every category of their ontology (see figure 2) is discussed in detail with example axioms. An example of this ontology in KIF [18] can be found at <a class="external" title="http://ltsc.ieee.org/suo/ontologies/Russell-Norvig.txt" href="http://ltsc.ieee.org/suo/ontologies/Russell-Norvig.txt">http://ltsc.ieee.org/suo/ontologies/Russell-Norvig.txt</a>. </p>
<p>在 Russell &amp; Norvig 的书 [38] 中提出了另一种关于大本体结构的观点。他们本体中的每个类别（见图2）都结合示例公理进行了详细讨论。 </p>
<p>这种本体的 KIF [18] 表示可以在 <a class="external" title="http://ltsc.ieee.org/suo/ontologies/Russell-Norvig.txt" href="http://ltsc.ieee.org/suo/ontologies/Russell-Norvig.txt">Russell-Norvig.txt</a><span class="urlexpansion"> (<em>http://ltsc.ieee.org/suo/ontologies/Russell-Norvig.txt</em>)</span> 找到。 </p>
<p>(Figure 2) </p>
<p>Figure 2: Russell &amp; Norvig&#8217;s general ontology structure [38] 图2：Russell &amp; Norvig的大本体结构 [38] </p>
<h3>&nbsp;</h3>
<h3>3.3 Ontology Engineering</h3>
<p>3.3 本体工程 </p>
<p>Ontology engineering is a field in artificial intelligence or computer science that is concerned with ontology creation and usage. Report [31], which proposes and comments on this field, declares that the ultimate purpose of ontology engineering should be &#8220;to provide a basis of building models of all things in which computer science is interested&#8221;. </p>
<p>本体工程是人工智能或者计算机科学的一个领域, 它关注于本体的建立和使用. 在Report [31]中提出了这一新的领域并对其进行了注解，它宣称本体工程的终极目标应该是"为计算机科学感兴趣的所有事物提供一个建立模型的基础". </p>
<h3>3.3.1 Structure of Usage</h3>
<h3>3.3.1 用法的结构</h3>
<p>An ontology can be divided into the following subcategories from the knowledge reuse and ontology engineering point of view, according to [31]. This is a structure of ontologies from the point of view of their usage rather than a division of one general ontology. Some examples are included. <br />
根据 [31]，从知识重用和本体工程的角度，本体可以被分成以下子类。与其说是对一个通用本体的划分，不如说是按用途组织的本体结构。其中包括一些例子。 </p>
<ul>
    <li>Workplace Ontology 工作场所本体<br />
    This is an ontology for the workplace which affects task characteristics by specifying several boundary conditions which characterize and justify problem solving behaviour in the workplace. Workplace and task ontologies collectively specify the context in which domain knowledge is intended and used during problem solving. Examples from circuit troubleshooting: fidelity, efficiency, precision, high reliability.</li>
    <li>Task Ontology<br />
    Task ontology is a system of vocabulary for describing the problem solving structure of all existing tasks, domain independently. It does not cover the control structure. It covers components or primitives of unit inferences taking place during performing tasks. Task knowledge in turn specifies domain knowledge by giving roles to each object and the relations between them. Examples from scheduling tasks: schedule recipient, schedule resource, goal, constraint, availability, load, select, assign, classify, remove, relax, add.</li>
    <li>Domain Ontology<br />
    Domain ontology can be either task dependent or task independent. Task independent ontology usually relates to activities of objects.
    <ul>
        <li>Task-dependent ontology: A task structure requires not all the domain knowledge but some specific domain knowledge in a certain specific organization. This special type of domain knowledge can be called task-domain ontology because it depends on the task. Examples from job-shop scheduling: job, order, line, due date, machine availability, tardiness, load, cost.</li>
        <li>Task-independent ontology
        <ul>
            <li>Activity-related ontology
            <ul>
                <li>Object ontology: covers the structure, behaviour and function of the object. Examples from circuit boards: component, connection, line, chip, pin, gate, bus, state, role.</li>
                <li>Activity ontology. Examples from enterprise ontology: use, consume, produce, release, state, resource, commit, enable, complete, disable.</li>
            </ul></li>
            <li>Activity-independent ontology
            <ul>
                <li>Field ontology: related to the theories and principles which govern the domain. It contains primitive concepts appearing in the theories and the relations, formulas, and units constituting the theories and principles.</li>
                <li>Units ontology. Examples: mole, kilogram, meter, ampere, radian.</li>
                <li>Engineering mathematics ontology. Examples: linear algebra, physical quantity, physical dimension, unit of measure, scalar quantity, physical components.</li>
            </ul></li>
        </ul></li>
    </ul></li>
    <li>General or Common Ontology. Examples: things, events, time, space, causality or behaviour, function etc.</li>
</ul>
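<p>The task-ontology vocabulary above (job, resource, constraint, assign, load) can be sketched as follows (a minimal, hypothetical Python sketch; the class and attribute names are illustrative, not taken from [31]):</p>

```python
from dataclasses import dataclass, field

# Hypothetical mini task ontology for scheduling, following the
# vocabulary listed above: jobs, schedule resources, and the unit
# inference "assign" that relates them.
@dataclass
class Resource:
    name: str
    load: int = 0          # current load on this schedule resource

@dataclass
class Job:
    name: str
    due: int               # due date (a task-domain concept)

@dataclass
class Schedule:
    assignments: dict = field(default_factory=dict)

    def assign(self, job: Job, res: Resource):
        """'assign' as a unit inference of the scheduling task ontology."""
        self.assignments[job.name] = res.name
        res.load += 1

m1 = Resource("machine-1")
s = Schedule()
s.assign(Job("order-42", due=5), m1)
print(s.assignments, m1.load)  # {'order-42': 'machine-1'} 1
```

The point of the sketch is the separation the report draws: the task vocabulary (assign, load) is domain independent, while the concrete jobs and machines belong to the task-domain ontology.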
<h3>3.3.2 Ontology Engineering Subfields</h3>
<p>We can also divide the ontology or ontologies from the point of view of ontology engineering as a field. The subjects which should be covered by ontology engineering are demonstrated in [31]. They include basic issues in philosophy, knowledge representation, ontology design, standardization, EDI, reuse and sharing of knowledge, media integration, etc., which are the essential topics in future knowledge engineering. Of course, they should be constantly refined through further development of ontology engineering. </p>
<ul>
    <li>Basic Subfield
    <ul>
        <li>Philosophy (Ontology, Meta-mathematics): The ontology which philosophers have discussed since Aristotle is discussed, as well as logic and meta-mathematics.</li>
        <li>Scientific philosophy: Investigation of ontology from the physics point of view, e.g., time, space, process, causality, etc.</li>
        <li>Knowledge representation: Basic issues in knowledge representation, especially the representation of ontological stuff.</li>
    </ul></li>
    <li>Subfield of Ontology Design
    <ul>
        <li>General (Common) ontology: General ontologies such as time, space, process, causality, part/whole relation, etc. are designed. Both in-depth investigation of the meaning of every concept and relation and the formal representation of ontologies are discussed.</li>
        <li>Domain ontologies: Various ontologies in, say, Plant, Electricity, Enterprise, etc. are designed.</li>
    </ul></li>
    <li>Subfield of Common Sense Knowledge
    <ul>
        <li>Parallel to general ontology design, common sense knowledge is investigated and collected, and knowledge bases of common sense are built.</li>
    </ul></li>
    <li>Subfield of Standardization
    <ul>
        <li>EDI (Electronic Data Interchange) and data element specification: Standardization of primitive data elements which should be shared among people for enabling fully automatic EDI.</li>
        <li>Basic semantic repository: Standardization of primitive semantic elements which should be shared among people for enabling knowledge sharing.</li>
        <li>Conceptual schema modeling facility (CSMF)</li>
        <li>Components for qualitative modeling: Standardization of functional components such as pipe, valve, pump, boiler, register, battery, etc. for qualitative model building.</li>
    </ul></li>
    <li>Subfield of Data or Knowledge Interchange
    <ul>
        <li>Translation of ontology: Methodologies for translating one ontology into another are developed.</li>
        <li>Database transformation: Transformation of data in a database into another of a different conceptual schema.</li>
        <li>Knowledge base transformation: Transformation of a knowledge base into another built on a different ontology.</li>
    </ul></li>
    <li>Subfield of Knowledge Reuse
    <ul>
        <li>Task ontology: Design of ontology for describing and modeling human ways of problem solving.</li>
        <li>T-domain ontology: Task-dependent domain ontology is designed under some specific task context.</li>
        <li>Methodology for knowledge reuse: Development of methodologies for knowledge reuse using the above two ontologies.</li>
    </ul></li>
    <li>Subfield of Knowledge Sharing
    <ul>
        <li>Communication protocol: Development of communication protocols between agents which can behave cooperatively under a specified goal.</li>
        <li>Cooperative task ontology: Task ontology design for cooperative communication.</li>
    </ul></li>
    <li>Subfield of Media Integration
    <ul>
        <li>Media ontology: Ontologies of the structural aspects of documents, images, movies, etc. are designed.</li>
        <li>Common ontologies of media content: Ontologies common to all media, such as those of human behavior, story, etc., are designed.</li>
        <li>Media integration: Development of a meaning representation language for media, and media integration through understanding media representation.</li>
    </ul></li>
    <li>Subfield of Ontology Design Methodology
    <ul>
        <li>Methodology</li>
        <li>Support environment</li>
    </ul></li>
    <li>Subfield of Ontology Evaluation
    <ul>
        <li>Evaluation of designed ontologies is made using real-world problems by forming a consortium.</li>
    </ul></li>
</ul>
 <img src ="http://www.blogjava.net/Shaird/aggbug/178899.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/Shaird/" target="_blank">Shaird</a> 2008-02-01 21:44 <a href="http://www.blogjava.net/Shaird/articles/178899.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>(转)语言习得的联结主义模式</title><link>http://www.blogjava.net/Shaird/articles/176653.html</link><dc:creator>Shaird</dc:creator><author>Shaird</author><pubDate>Sun, 20 Jan 2008 21:35:00 GMT</pubDate><guid>http://www.blogjava.net/Shaird/articles/176653.html</guid><wfw:comment>http://www.blogjava.net/Shaird/comments/176653.html</wfw:comment><comments>http://www.blogjava.net/Shaird/articles/176653.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/Shaird/comments/commentRss/176653.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/Shaird/services/trackbacks/176653.html</trackback:ping><description><![CDATA[<div><span class="article-label">作者:</span><span class="article-content"><a href="http://www.mostai.com/modules/article/view.article.php/44#article-writer">李平</a></span> </div>
<div><span class="article-label">来源:</span><span class="article-content">当代语言学, Contemporary Linguistics, 编辑部邮箱 2002年 03期</span> </div>
<div class="article-summary"><span class="article-label">摘要:</span><span class="article-content">语言学是认知科学的一个重要分支。本文探讨近年来对认知科学产生了重大影响的联结主义理论及方法,介绍联结主义的基本概念,在语言学及语言习得中的应用,以及它给语言研究提供的新思路。<br />
Linguistics is an important branch in cognitive science. The paper explores the connectionist theory and methods, introducing the basic concepts in connectionism, its application in language acquisition, and the new insights to linguistic research.</span></div>
<br />
<br />
<div class="article-text"><a id="heading1" name="heading1"></a>
<h1><font face="宋体" size="5">1.引言：认知科学与语言学</font></h1>
<br />
<font face="宋体" size="3">&nbsp;&nbsp;&nbsp;&nbsp;认知科学的发展日新月异。从上世纪五十年代到今天虽然只有短短的几十年，科学家们对人脑的构造及 功能已经有了比较深入的认识。语言学在这个认识过程中起了十分重要的作用。特别是心理语言学，由于它跨学科的特征，使我们能通过对人们使用语言和学习语言 的心理机制来透视人脑处理信息的普遍特征。本文拟从语言习得的角度来探讨目前风靡一时的联结主义模式(connectionist models)（注：Connectionism又称为neural networks（神经网络），国内有学者译作&#8220;连接主义&#8221;。但笔者认为&#8220;联结主义&#8221;能够更好地反映这个理论的特征。），并以此讨论认知科学及语言学的一 般性问题。<br />
<br />
&nbsp;&nbsp;&nbsp;&nbsp;从上世纪五十年代末期到今天，Chomsky的理论一直在语言学中占主导地位。Chomsky对传统的语言学理论提出了挑战， 认为语言知识从根本上是一种心理机制，而这种机制的根本又是形式语法系统。也就是说，人脑是通过一个内存的规则系统（形式语法）来反映语言的。过去几十年 中，Chomsky不断更新他对形式语法系统的描述，从原有的&#8220;转换生成语法&#8221;到今天的&#8220;最简方案&#8221;，虽然其间有不少变化，但不离其对规则的基本诉求。心 理学家和心理语言学家们同样对规则系统深信不疑，认为只有规则系统才能够有效地反映人脑的高级抽象活动。这种认识乃是基于认知科学家的一个基本假设：人脑 是处理符号系统(symbol system)的机器(Newell 1980)。这个假设对认知科学起了很大的影响：一旦我们将人脑当作符号系统，我们就可以很方便地描述这个机器对符号加工与处理的方式。从某个角度来看， 我们可以拿这部机器与计算机作比较：描述人脑的过程跟描述计算机的软件操作过程一样。<br />
<br />
&nbsp;&nbsp;&nbsp;&nbsp;这种将人脑看作符号系统的观点与心理学的模块理论 (modular theory)有着密不可分的关系。18世纪Franz Gall提出了模块理论的最初假设。但那时的假设强调人的性格特征与脑骨骼的外型特征的关系，因而缺乏科学根据。现代心理学对模块理论表述最完备的莫过于 Jerry Fodor(1983)。Fodor认为人脑的认知系统是由许许多多的模块组成的。这些模块有的负责语法、有的负责视觉、有的负责听觉，任务专一 (domain-specific)，互相独立(autonomous)。对于心理语言学来说，最重要的是这些模块在语言的加工过程中不能同时互动 (parallel interaction)。例如，当你听到&#8220;小明和小张在切蛋糕&#8221;这句话时，模块理论假设，我们是由语音系统开始，然后对词汇，再对语法，最后对语义进行 加工。这是由低层到高层的一个过程(bottom-up process)，次序严谨，不能打乱。再者，在对语法加工的同时，语音和语义都不能起作用：每一层面的信息都是自给自足的 (informationally encapsulated)。模块理论的线性次序，及其分明的层次，对认知科学家具有极强的吸引力。但是，近十几年来它也受到了强烈的挑战。对模块理论及 其在大脑中的表征，读者可参看Uttal(2001)较系统的阐述及批判。对其挑战的主要理论当属联结主义了。<br />
<br />
&nbsp;&nbsp;&nbsp;&nbsp;我们知道，符号系统的观点及模块理论的假设是建立在将人脑比作电脑的基础之上的。这种比拟的优点是，我们能够有效地讨论人脑在信息处理时的操作过程及加工特征（如线性次序，模块结构，加工流程图等等）。但它最大的缺点是难以在生物及神经学上找到对应的关系(neurally implausible)。人脑内有上千亿神经元，而且这些神经元之间的联结关系比起电脑中几百或上千的电极管要复杂得多。还有，电极管每秒可以进行几百万或几千万次运算，而神经元每秒则只可以发送或接收几百次电子化学的脉冲。因此，如果人脑是按线性次序来操作，每秒不过能计算一百次左右(100-step rule, Feldman and Ballard 1982)。显而易见，每秒一百个操作步骤是不能够完成复杂的认知过程的。例如，词语的加工过程至少精确到十分之一秒。最后，数字电脑只能接收单一的、清楚的符号信号(all or none)，没有所谓的中介状态(partial status)。这与人脑的灵活性及可塑性有极大的差别。所有这些原因都给联结主义的观点铺下了基石。<br />
<br />
&nbsp;&nbsp;&nbsp;&nbsp;联结主义的一些初期理论就已经与模 块理论的基本假设针锋相对了。最著名的要算&#8220;互动激活&#8221;(interactive activation)理论。Rumelhart和McClelland(1981)提出了互动激活的基本假说。根据这个假说，语言加工的过程既包含从下 至上的过程(bottom-up process)，也包含从上至下的过程(top-down process)。与模块理论的假说相反，这两种过程可以在同一时间互动。举例来说，当你听到&#8220;小明和小张在切蛋糕&#8221;这句话时，既可有语音至词汇至语法至 语义的过程，也可以有语境的作用由上至下帮助听者理解语义、语法、词汇及语音。这两种过程可以从听者对在噪音的干扰下仍能完整地理解句子的情况中看出来。 如果&#8220;蛋糕&#8221;的&#8220;糕&#8221;字突然受到干扰（例如在电话交谈中），听者的理解系统可以自动修补并添加&#8220;糕&#8221;的字音。Rumelhart和McClelland还 举例说明，如果英文字母R或K的右上角被遮盖（类似h），读者可以根据词的周围语境(WOR-)自动修补，达到理解K而不是R。这种语境效应或词优效应 (word-superiority effect)对互动激活的理论提供了有力的支持。<br />
<br />
</font><br />
<a id="heading2" name="heading2"></a>
<h1><font face="宋体" size="5">2.联结主义的基本特征</font></h1>
<br />
<font face="宋体" size="3">&nbsp;&nbsp;&nbsp;&nbsp;互 动激活的假说给联结主义用于语言分析中打下了基础。但严格地说，它还不能算是联结主义的模型。按照Rumelhart,McClelland，和PDP (parallel distributed processing)Group(1986)的PDP理论，联结主义有以下两个基本特征。首先，在知识的表征(representation)方面，它 强调&#8220;分布表征&#8221;(distributed representation)。分布表征与传统认知理论对知识的表征有很大的不同。上面我们提到，传统认知理论将人脑看作是符号处理系统，因而它采用的 是&#8220;方位表征&#8221;法(localist representation)。这种表征的基本特点是一个信息加工的单位（或单元）只表达一个概念（例如语素、字或词），而一个概念也只由一个单位来表 达。这样，表达单位不能进一步分解为更小的单位，因为它与概念间有清楚的一对一的关系。分布表征与此不同：它强调一个概念由多个单元互相作用的关系来表 达。例如，英文大写字母F和E之间的不同在于后者多了一横。照方位表征法，F和E是分别由两个不同的单元来表达的。照分布表征法，F和E可以由多个同样的 单元来表达，所不同的某些单元在表达E时被激活，但在表达F时被抑制。这样一来，我们如果仅看这些个别的单元，它们既不表达F，也不表达E。F和E的知识 是由多个单元之间激活的关系来表达的。<br />
<br />
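<p>下面用一个极简的 Python 示意来说明方位表征与分布表征的区别（四个特征单元的划分纯属假设，并非文中讨论的原始模型）：</p>

```python
import numpy as np

# 假设用四个共享的特征单元表示字母，单元含义依次为：
# [顶横, 中横, 底横, 竖线]
F = np.array([1, 1, 0, 1])
E = np.array([1, 1, 1, 1])

# 方位表征下 F、E 各占一个独立单元；分布表征下二者共享同一组单元，
# 区别仅在于"底横"单元在表达 E 时被激活、在表达 F 时被抑制。
diff = E - F
print(diff)  # [0 0 1 0]：单看任何一个单元都既不表达 F 也不表达 E
```

可见知识不在单个单元中，而在多个单元激活模式之间的关系中，这正是上文"分布表征"的含义。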
&nbsp;&nbsp;&nbsp;&nbsp;联结主义区别传统认知理论的第二个基本特征在于它对知识学习的看法。这也是本文需详细介绍的。长期 以来，心理语言学家认为，学习语言就是一个学习规则的过程。这种观点，如前所述，是与Chomsky的语言学理论密不可分。联结主义则认为， Chomsky理论提供了有效的规则系统来描述语言本身，但这个系统不能描述学习的过程。由于联结主义采用分布表征，它认为知识学习的过程就是学习分布表 征的过程。换句话说，学习是经过调节单元与单元之间的关系来完成的，而调节单元与单元之间的关系又是经过改变单元与单元之间的权值(weight)来完成 的。那么什么是权值呢？权值是表达单元与单元之间联结的强度。权值数越高，单元之间的联结就越强。一旦联结网络中相应的单元都由适当的权值联系好了，知识 的表达和学习的过程也就完成了。以上述简化的例子而言，如果我们已经学会了F这个字，那么学习E时只需要将最下部分的单元激活并给予高强度权值，将其与网 络中其它单元联结起来，我们便学会了E。<br />
<br />
&nbsp;&nbsp;&nbsp;&nbsp;很显然，联结主义的这些特点，与传统的认知理论相比，有较强的&#8220;生理可解性&#8221; (biological plausibility)。单元、激活、抑制，以及联结强度等概念，都能在人脑中找到直接的对应。反观传统的认知理论，符号、规则、语言树形图等概念则 相当地抽象，难以简单地对应于特定的生物机制。联结主义的目的就是要通过对前类概念的描述达到对后类概念的解释。<br />
<br />
&nbsp;&nbsp;&nbsp;&nbsp;联结主义的思想早在上世 纪四十年代初期就出现了(McCulloch and Pitts 1943)。McCulloch等人认为神经网络的结构可以解释数理逻辑的功能。与当今的联结主义网络不同的是，他们的网络的输出只能是二进制的（on或 off），且单元间联结的强度不能通过学习而改变。在McCulloch和Pitts之后有许多人对联结主义的思想加以改进，其间以五十年代末期的视觉网 络(perceptron)最引人注目(Rosenblatt 1958)。视觉网络虽然克服了许多McCulloch-Pitts网络的问题（如不限于二进制输出，也可以在学习中改变单元的联结强度），但与所有早期 的联结主义模型一样，都只能解决简单的&#8220;线性可分&#8221;(linearly separable)的问题，如逻辑上的&#8220;和&#8221;(and)与&#8220;或&#8221;(or)问题。对于&#8220;线性不可分&#8221;（或称&#8220;非线性&#8221;）问题，比如逻辑上的&#8220;排它或&#8221; （exclusive or，简称xor）问题，它们则是一筹莫展。例如：<br />
<br />
&nbsp;&nbsp;&nbsp;&nbsp;(1)○　○&#8594;A　　（箭头表示&#8220;归类为&#8221;）<br />
<br />
&nbsp;&nbsp;&nbsp;&nbsp;(2)○　□&#8594;B<br />
<br />
&nbsp;&nbsp;&nbsp;&nbsp;(3)□　○&#8594;B<br />
<br />
&nbsp;&nbsp;&nbsp;&nbsp;(4)□　□&#8594;A<br />
<br />
&nbsp;&nbsp;&nbsp;&nbsp; 在这些例子中，(1)和(4)之间的差别最大（两个圆形对两个方块），(2)和(3)则都是由一个圆形加一个方块组成，只是次序不同。之所以说这个问题是 &#8220;线性不可分&#8221;，是因为它要求将差别最大的单位归为相同的范畴(A)。这种分类法不能简单地在问题的平面上用直线切开，但对于人来说，我们能够灵活地使用 非线性方法解决此类问题。从上世纪六十年代起，研究者们就开始考虑如何能使联结主义网络解决非线性问题。Rumelhart,McClelland和 PDP Group(1986)的PDP理论对解决这类问题提出了有效的方法。<br />
<br />
&nbsp;&nbsp;&nbsp; PDP理论的联结主义网络一般由三个层次组成：输入层(input layer)、内隐层(hidden layer)和输出层(output layer)。输入层接受输入的表征（如汉字的字形），输出层提供输出应有的表征（如汉字的分类），而内隐层则存储网络所学习到的知识表征（如汉字在各个 不同学习阶段的形体）。网络学习由输入层开始，至内隐层，再达到输出层。这个学习过程是一个调节网络中各单元的激活程度及单元之间的联结强度的过程。 PDP理论对解决非线性问题最大的贡献在于它对内隐层与其它层次之间的调节方法（或称算法）。联结主义网络中至今最有影响的算法可能要推&#8220;反馈学习法&#8221; （back-propagation，简称BP算法，Rumelhart Hinton and Williams 1986）。按照BP算法，网络每次学习输入与输出的关系时，同时也接受一个&#8220;指导信号&#8221;(teacher)。这个指导信号乃是网络应该提供的正确的输 出。如果网络所产生的输出信号与指导信号有差别，那么这个差别的大小就会计算为网络的误差率。误差率然后反馈至网络，使相关的单元与单元之间的权值得到改 变。这样不断改变的结果使网络能最后正确地产生所有的输出。最重要的是在这个不断地调节过程中，单元间的权值及内隐层单元的激活能够最有效地反映输出与输 入间的关系，从而有效地反映输入层单位间的内在关系（注：由于篇幅及内容的限制，我们在这里撇开许多技术上的细节，着眼于提供与语言学有关的理论描述。想 进一步了解PDP理论的读者可阅读Rumelhart等人1986年的两卷PDP论著。对联结主义与认知科学有兴趣的读者可阅读Bechtel and Abrahamsen(1991),Ellis and Humphreys(1999),Spitzer(1999)的论文。对技术细节或数学模型有兴趣的读者可阅读Andersen(1995), Dayhoff(1990),Fausett(1994)，以及Hertz,Krogh and Palmer(1991)的论文。）。<br />
<br />
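<p>上述 xor 问题与 BP 算法可以用如下极简的 Python/numpy 示意（网络规模、学习率等参数均为假设，并非 PDP 原始模型）：</p>

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)   # xor 的"指导信号"

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1 = rng.normal(0.0, 1.0, (2, 4)); b1 = np.zeros(4)   # 输入层 -> 内隐层
W2 = rng.normal(0.0, 1.0, (4, 1)); b2 = np.zeros(1)   # 内隐层 -> 输出层

def forward():
    h = sigmoid(X @ W1 + b1)          # 内隐层激活
    return h, sigmoid(h @ W2 + b2)    # 输出层激活

_, out = forward()
mse0 = float(np.mean((out - y) ** 2))   # 训练前的误差率

lr = 0.5
for _ in range(10000):
    h, out = forward()
    d2 = (out - y) * out * (1 - out)    # 输出与指导信号之差反馈回网络
    d1 = (d2 @ W2.T) * h * (1 - h)      # 误差继续反馈至内隐层
    W2 -= lr * (h.T @ d2); b2 -= lr * d2.sum(0)
    W1 -= lr * (X.T @ d1); b1 -= lr * d1.sum(0)

_, out = forward()
mse = float(np.mean((out - y) ** 2))
print(f"{mse0:.3f} -> {mse:.3f}")   # 误差随权值的不断调节而下降
```

没有内隐层时这个线性不可分的问题无解；加入内隐层并用误差反馈调节权值后，网络便能逼近 xor 的四个输出。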
&nbsp;&nbsp;&nbsp;&nbsp; 综上所述，联结主义理论与传统的认知理论有很大的区别。它所运用的基本概念都与人脑的生物机制有一定程度上的对应关系，例如单元对应于神经元，单元的联结 对应于神经元的联结，权值对应于联结强度，激活与抑制对应于神经元间电生理活动的方式。如何能够利用单元、联结、权值、激活与抑制这些概念去更好地解释传 统认知理论中的重大问题乃是联结主义理论成功的关键。由于语言习得是认知科学中的重大课题，下面我们讨论联结主义是如何解释语言习得的。<br />
<br />
</font><br />
<a id="heading3" name="heading3"></a>
<h1><font face="宋体" size="5">3.联结主义网络在语言学及语言习得中的运用</font></h1>
<br />
<font face="宋体" size="3">&nbsp;&nbsp;&nbsp;&nbsp; 由于内隐层和BP算法的出现，联结主义网络不单能够解决简单的非线性问题，如xor，而且能够原则上解决任何非线性问题。语言现象也许是最复杂的和最有代 表性的非线性问题之一。联结主义网络打开了语言研究的一扇新大门。Rumelhart等人1986年的两卷PDP论著为联结主义作出了划时代的贡献，而语 言又是其中讨论得最多的一个环节。在联结主义看来，我们现有的语法规则及语义范畴都只能作为有效的语言学理论描述，但不能作为心理表征的机制。换句话说， 语言学理论有实用价值，但没有心理现实性。这种观点自1986年由Rumelhart等人在PDP一书中提出后，引起了极大争论。这个争论一直持续到今 天，并无最后定论。<br />
<br />
&nbsp;&nbsp;&nbsp;&nbsp;那么PDP理论怎样看待语法规则及语义范畴呢？Rumelhart等人在PDP一书中有许多章节涉及到语言学的问题， 包括言语感知，句子理解，语言习得等等，我们不能在这里具体一一加以讨论。但其中对语言学最有影响的一章就是Rumelhart与McClelland提 出的英语过去时态的PDP模型。以下我们简单地介绍一下这个模型。<br />
<br />
</font><br />
<a id="heading4" name="heading4"></a>
<h2><font face="宋体" size="3">&nbsp;&nbsp;&nbsp; 3.1联结主义网络对语法规则的学习</font></h2>
<br />
<font face="宋体" size="3">&nbsp;&nbsp;&nbsp;&nbsp;众所周知， Chomsky批判传统的行为主义心理学最有力的证据，就是儿童并非简单模仿成人语言，而是利用对规则的掌握进行类推。比如，儿童学习到一定阶段时，会说 breaked作为break的过去式，而不是说正确的不规则形态broke(Brown 1973;Bybee and Slobin 1982;Kuczaj 1977)。breaked在成人语言中根本不存在，模仿学说的理论显然难以自圆其说。根据Chomsky的理论，语言习得研究者一直认为，最有效地解释 儿童&#8220;泛化&#8221;(overgeneralization)的方法就是假定儿童在学习的某一阶段已经掌握了一个抽象的内在规则，如&#8220;在任何动词后加-ed成为 该动词的过去式&#8221;，或&#8220;在任何名词后加-s成为该名词的复数形式&#8221;。由于内在规则的普遍适用性，儿童便把不规则的动词当作规则动词来处理 (regularization)，产生breaked,comed，或falled等错误。要纠正这些错误，儿童必须逐字学习，加以校正。这个逐字学习 的过程与规则的掌握过程完全不同。所以，Pinker等人认为儿童在掌握英语过去时态时，有两种不同的学习机制在起作用(Pinker 1991,1999;Pinker and Prince 1988)：一种是学习一般性的形态规则，由此能产生泛化的结果。另一种则是&#8220;联想学习&#8221;(associative learning)，将不规则动词的形态与其基本形式逐个对应起来。前者负责一般规律，后者负责单个例外。<br />
<br />
&nbsp;&nbsp;&nbsp; PDP理论与这种观点截然相反。Rumelhart与McClelland的英语过去时态模型强调儿童学习过去式只有一种机制在起作用，那就是联结主义的 机制。Rumelhart等人使用了一个简单的联结主义网络来模拟儿童的学习过程，发现该网络能产生&#8220;U-形学习效应&#8221;。所谓U-形学习效应是指儿童在早 期的学习过程中基本不犯语法错误，如正确地使用broke,came或fell。在中期的学习阶段，错误大量出现，如不正确地使用breaked, comed或falled。儿童在后期的学习阶段才逐步将错误消除(Bowerman 1982)。这种效应以前人们一直借助于规则的学习来解释：儿童早期没有学到规则，中期学到规则以后泛化使用规则，后期逐字调节，对规则的使用范围加以改 正。在Rumelhart等人的联结主义网络中并无任何规则的表征，但网络却显现出规则的效应。这个网络是怎样达到这种效应的呢？在这个模拟中，网络收到 的是每个动词词根的语音特征，然后与它的过去式的语音特征加以匹配。每次匹配的同时，网络中的联结权值得以改变。正是这些联结权值使网络对动词的基本形态 与它的过去式之间的关系有了详尽的了解。这些关系反映过去式形态变化的基本规律（flow,glow,slow都是带-ed作为过去式），从而指导网络在 学习新动词时的类推行为（如blow也应带-ed作为过去式）。在这种学习过程中，网络能有效地将规则动词与不规则动词区别对待。但在同时，这个过程所产 生的结果既有将不规则动词当作规则动词的情况（regularization，如blowed），也有将规则动词当作不规则动词的情况 （irregularization，如ment作为mend的过去式）。后一种情况的产生是由于网络学到了一些不规则动词中的&#8220;次规律&#8221;(sub- regularities)，比如lend,send,spend的过去式分别是lent,sent,spent。这种情况似乎难以用上述Pinker等 人提出的&#8220;规则与例外&#8221;的双机制来解释。<br />
<br />
&nbsp;&nbsp;&nbsp; Rumelhart与McClelland的模型的一个核心的思想就是语言学规则是&#8220;浮现特征&#8221;(emergent properties)。也就是说，联结主义网络通过单元、激活、抑制，与联结等特征能够有效地表达语言行为，而这种表达的有效程度仿佛其背后有语言学规 则在支配。由上述所见，单一的联结主义机制既能反映儿童对规则过去式的掌握，也能反映其对不规则过去式的掌握。规则本身不需要在系统中明确表征，但却通过 网络学习浮现而出。我们可以通过一个简单的例子来了解浮现特征这个概念(Bates 1984)。联结主义中的规则行为可以与蜂窝的六角形状来加以比较。从单个的蜜蜂的行为来看，蜂窝的六角形状似乎不可思议。但如果我们分析其动态物理的特 征，那么六角形则是恰到好处。每个蜜蜂在构造蜂窝时都只需要一小点蜜，但当多个蜜蜂从多个角度将蜜一点一滴地挤入蜂窝，当许多柔软的小圆形的蜜受到多角度 同时挤压时，整体蜂窝的形状便自然而然地成为六角形。在这种情况下，我们说六角形是浮现特征，而不需要假设蜜蜂拥有一个制造六角形的规则系统。最近，语言 学家和心理语言学家对浮现特征从多个角度给予了讨论，一些相关的论点在MacWhinney(1999)一书中有所介绍。<br />
<br />
&nbsp;&nbsp;&nbsp; Rumelhart等人的PDP模型的出现引发了一系列的争论，尤其是它与Pinker等人的双机制的争论直到今天仍僵持不下。Pinker等人对 Rumelhart与McClelland的模型提出了许多问题，尤其是认为它在词汇的表征上，在模拟的程序上，以及在语音语义的关系上都不能反映儿童学 习过去式中的许多细节。后人对Rumelhart与McClelland的模型作了较大的修改（包括结构上的，表征上的及训练程序上的修改， Plunkett and Marchman 1991,1993;MacWhinney and Leinbach 1991)，发现虽然原有的模型确有缺陷，但扩充后的模型仍支持原有模型的基本观点。从这些争论中我们可以回到本文开头讨论的问题而看到一个基本的对立， 那就是应该怎样看待人脑的构造与功能：人脑到底是一个模块的符号处理系统呢，还是一个多元的分布处理系统？<br />
<br />
</font><br />
<a id="heading5" name="heading5"></a>
<h2><font face="宋体" size="3">&nbsp;&nbsp;&nbsp; 3.2联结主义网络对语义范畴的学习</font></h2>
<br />
<font face="宋体" size="3">&nbsp;&nbsp;&nbsp;&nbsp; 自Rumelhart和McClelland模型问世以来，联结主义在语言习得中的研究主要注重在对语法规则和语音结构的表征上，很少在语义方面下功夫。 理由很简单：语义太复杂。因此即使偶尔有涉及语义的联结主义网络，也只是随机抽取语义特征，而后加以轻描淡写。但是联结主义的分布表征及学习的特点其实对 解决语义方面的问题有极大的帮助。有鉴于此，笔者在上世纪九十年代开始研究如何用联结主义网络来学习语义范畴。<br />
<br />
&nbsp;&nbsp;&nbsp; Li(1993)及Li和MacWhinney(1996)从&#8220;隐型范畴&#8221;(cryptotype)着手研究语义的习得问题。隐型范畴在语义学中是个棘手 的问题。Whorf(1956)在1936年对隐型范畴作了如下的&#8220;定义&#8221;：隐型范畴是微妙的，看不见也摸不着的，不能以一个简单的标志加以命名的。这样 的定义似乎叫人对隐型范畴最好敬而远之。以英语的前缀un-为例，很多动词可以带un-（如unbuckle,undress,unfasten, untie），但也有很多动词不能带un-（如*unbuild,*unkick,*unmove,*unpush）。Whorf认为有一个隐型范畴在支 配un-的使用。问题就在于语言学家不能清楚地描述隐型范畴。隐型范畴必须通过其它的型态标记（如前缀un-）来负面定义。<br />
<br />
&nbsp;&nbsp;&nbsp; Bowerman(1982)对Whorf提出的隐型范畴在语言习得中的作用做了探讨。她认为儿童在学习动词前缀un-时经历一个与学习过去时态一样的U -型效应。儿童在第一阶段正确地使用带un-的动词，因为他们尚未将动词词根与前缀区分开来。在第二阶段时大量的泛化使用un-错误开始出现（如 *unhold,*unpress,*unsqueeze等）。在这个阶段重要的是儿童已经认识到了un-的隐型范畴，因此与隐型范畴相似的动词都被用来 带un-。最后阶段儿童才纠正错误。Bowerman这样的解释十分合理，但最大的问题是没有说明儿童是怎样获得un-的隐型范畴的。<br />
<br />
&nbsp;&nbsp;&nbsp; Li(1993)及Li和MacWhinney(1996)模拟了联结主义网络学习隐型范畴的过程。网络的任务是按照能否带un-给动词加以分类。我们的 假设是隐型范畴之所以&#8220;隐型&#8221;，乃是由于(a)隐型范畴涉及复杂的语义关系；(b)隐型范畴涉及动词的许多语义特征；(c)不同的语义特征在隐型范畴中有 不同的激活程度；(d)语义特征之间存在着不是互相排斥而是相互交叉的情况。联结主义网络所使用的分布表征及非线性学习给我们研究隐型范畴提供了最理想的 工具。我们的模拟结果表明，当网络学到一定的词汇量时，隐型范畴在网络的内隐层浮现而出。更重要的是，当网络继续学习新词时，隐型范畴指导它进行类推，产 生类似儿童在第二阶段时泛化使用un-的错误。这些结果表明，联结主义网络可以通过学习带un-动词的语义特征之间的复杂关系以及这些特征与前缀共现的规 律来形成隐型范畴的表征。通过对网络内隐层的统计分析，我们可以看到带un-的动词有一定的特点，而不带un-的动词有另外的特点。这些结果进一步说明学 习隐型范畴或un-不是一个简单的规则学习过程，而是逐步累计相关特征的计算过程。这个计算过程考察词义，词型，以及词缀之间在所学语料中共现的频率与规 律。我们的结果与前面讨论的联结主义网络学习语法规则的结果十分一致。两者都说明联结主义模型的单一机制能学习语言中的许多现象。<br />
<br />
</font><br />
<a id="heading6" name="heading6"></a>
<h2><font face="宋体" size="3">&nbsp;&nbsp;&nbsp; 3.3联结主义网络对语言先天性的看法</font></h2>
<br />
<font face="宋体" size="3">&nbsp;&nbsp;&nbsp;&nbsp; 在前面我们提到Chomsky的理论对语言学产生了深远的影响。Chomsky对于规则系统的阐述可谓尽善尽美。但其理论的另一个核心是规则的&#8220;先天性&#8221; (innateness)。这个问题在语言学中有很多详尽的讨论（李行德1992），本文不多加赘述。与Chomsky理论相反，联结主义理论强调学习的 重要性，强调网络从语言素材中抽取规律的能力。但与简单的经验主义(empiricism)不同，联结主义并不否定先天性。这一点在Elman及 Bates等人的《对先天性的再思考》(Rethinking Innateness)一书中有详细的讨论。在这里我们只简略地介绍一下Elman(1990)等人的观点。<br />
<br />
&nbsp;&nbsp;&nbsp; Elman(1996)等人认为，前人对先天性的认识局限于单个层次，但先天性本身有三个层次值得研究。第一个层次是表征上的层次 (representational)。这个层次的先天性是指人脑具有先天固有的神经系统，而且这个系统中的神经元之间的关系早已确定为表达特定的范畴与 概念。后天的经验或学习对这个系统的影响甚微。第二个层次是结构上的层次(architectural)。这个层次的先天性是指人脑的构造对信息的加工或 问题的解决有什么样的限制。人脑在局部或整体都有一些构造特征，比如单个神经元的信息处理速度限制在每秒100个步骤左右，比现有的数字计算机慢了许多 （如前所述）。第三个层次是发展速度上的层次(timing of maturational events)。这个层次的先天性是指人脑的各个区域有不同的发展进程，如脑功能侧化(hemispheric lateralization)及神经元的再生(neurogenesis)。语言习得的&#8220;关键期&#8221;(critical period)就可能是由于人脑可塑性的降低而导致的，反映发展速度上的先天性。这三个层次上的先天性都在前人的讨论之列，但在语言学家眼中（从 Chomsky到Pinker，再到Bickerton），先天性大多停留在第一个层次上。有趣的是，Elman等人从神经生物学出发，特地反驳第一层次 上的先天性。他们指出，人脑的DNA本身并无足够用来表达人类所需的多如牛毛的具体概念与范畴，况且人脑的后天可塑性也与固有神经系统的看法不一致。因 此，Elman等人认为结构上和发展速度上的先天性更为合理及有效，而且这两个层次上的先天性可以直接在联结主义网络中得到反映与表达（如网络的结构、关 系及学习速度等）。<br />
<br />
&nbsp;&nbsp;&nbsp;&nbsp;先天与后天，自然与哺育的争论，自古希腊哲学家开始一直到今天都没有完整的答案。语言学家、心理学家、及认知科学家现 在开始寻找新的角度来探讨这个问题。包括Elman等人在内的一些学者认为，单靠内在机制或外在因素都不足以解答人与环境之间的复杂且丰富的相互作用关 系。因此我们应该仔细研究人与环境之间相互作用下所产生的&#8220;浮现特征&#8221;。这些浮现特征从联结主义的角度来看正是网络与学习材料之间相互作用的结果。 Nelson(1999)将这种观点推到一个新的层次，认为人的神经系统本身会随着学习经验的增加而加以改变或得到发展。也就是说，内在的神经机制本身也 不是一成不变的。显而易见，在这种情况下再坚持谈内在与外在或先天与后天谁更重要就显得毫无意义了。<br />
<br />
</font><br />
<a id="heading7" name="heading7"></a>
<h1><font face="宋体" size="5">4.自组联结主义网络与语言习得</font></h1>
<br />
<font face="宋体" size="3">&nbsp;&nbsp;&nbsp;&nbsp; 联结主义自Rumelhart等人的PDP论著问世以来已经在语言学、心理语言学、神经语言学以及语言习得中引起了一波又一波的研究高潮(Ellis and Humphreys 1999)。但迄今为止这些研究大都局限于以下三个方面。首先，大部分涉及语言的联结主义模型都只探讨语法或语音等语言形态方面的特征(formal properties of language)，而很少研究语义或语用方面的特征。这一点在前面我们已经提到，主要原因是后者的研究难度较大。第二，以前的研究大都只使用极少数量的语言素材，从几十到几百词汇不等。最著名的联结主义网络之一，Elman(1990)的&#8220;简单回馈网络&#8221;(simple recurrent network)只用了29个名词和动词。但这些网络能否适用于广泛的、大量的语言素材则是个问题（所谓scalability的问题）。这如同语言学家用几个例句能否解释大量语言学现象一样。第三，大部分研究语言的网络都只采用了典型的反馈学习法（BP算法）。BP算法网络，如前所述，有特定的指导信号反馈网络，使相关的权值加以改变。它是一种属于&#8220;有指导学习&#8221;(supervised learning)的网络。这种网络在研究语言习得方面的可行性很值得怀疑(Li 1999)。虽然儿童学习语言时也有成人指导和儿童模仿的成分，但自从Chomsky批判行为主义的语言学说以来，语言学家们大都认为儿童学习语言时不需要或不接受&#8220;错误反馈&#8221;(negative evidence, Bowerman 1988)。换句话说，语言习得基本上是一个无需指导的学习过程。<br />
<br />
&nbsp;&nbsp;&nbsp;&nbsp; 最近几年笔者及合作研究者试图突破以上几方面的限制，研究一种无需指导的自组联结主义网络(self-organizing connectionist network)来探讨语言习得(Li 1999,2000;Li and Shirai 2000;Li and Farkas 2001)。这种网络属于非指导学习(unsupervised learning)的神经网络。自组联结主义网络相比传统的BP网络对语言习得而言有更大的心理现实性及生物有效性。在这种网络中，学习通常是在二维平面 图中进行的（又称&#8220;自组网图&#8221;，self-organizning maps或简称SOM;Kohonen 1982,1989,1995）。网图中的每个单元都能对一个或多个输入单位加以反射。在学习的最初阶段，输入单位随机激活网图中的一个单元，这个单元就 成为该输入单位的反射代表。随着网络的不断学习，该单元及其周围的单元对权值不断加以调节，使网图在下次处理同样的输入时能够激活同样的或邻近的单元。这 样不断调节的过程就使网图上的每个单元只对某些特征相似的输入加以反射，从而使得网图能够利用有限的二维平面来表达多维的输入特征。<br />
<br />
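<p>自组网图（SOM）的学习过程可以用如下极简的 Python 示意（网图大小、邻域函数与参数均为假设，并非 Kohonen 的原始实现）：</p>

```python
import numpy as np

rng = np.random.default_rng(1)
grid = 8                         # 8x8 的二维自组网图
W = rng.random((grid, grid, 3))  # 每个单元对三维输入各有一组权值
W0 = W.copy()                    # 记录训练前的权值以便对比

data = rng.random((2000, 3))
T = len(data)
ii, jj = np.mgrid[0:grid, 0:grid]

for t, x in enumerate(data):
    lr = 0.5 * (1 - t / T)                 # 学习率逐步下降
    sigma = max(3.0 * (1 - t / T), 0.5)    # 邻域半径逐步收缩
    d = ((W - x) ** 2).sum(-1)             # 找出"反射"该输入的获胜单元
    bi, bj = np.unravel_index(d.argmin(), d.shape)
    nb = np.exp(-((ii - bi) ** 2 + (jj - bj) ** 2) / (2 * sigma ** 2))
    W += lr * nb[..., None] * (x - W)      # 获胜单元及邻近单元向输入靠拢

def quant_error(weights):
    # 平均量化误差：每个输入与最接近它的单元之间的距离
    return float(np.mean([((weights - x) ** 2).sum(-1).min() for x in data]))

print(f"{quant_error(W0):.4f} -> {quant_error(W):.4f}")
```

这种无指导的调节过程正文已作说明：特征相似的输入最终由相同或邻近的单元来反射，网图因而能用二维平面表达多维的输入特征。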
&nbsp;&nbsp;&nbsp; Miikkulainen(1993,1997)将多个网图连接起来，用以学习语音、语义及字型的关系。每个网图本身只表达语音、语义或字型，但网图与网 图之间通过赫伯学习法(Hebbian learning)来联结，以模拟各语言层面可能产生的相互作用。赫伯学习法<span style="background-color: rgb(255,255,255)">(</span><span style="color: black; background-color: rgb(255,255,255)">Hebb</span><span style="background-color: rgb(255,255,255)"> </span>1949)是一种有生物基础的规则。它的主要原则是两个神经元如果同时激活，它们之间的联结强度就会相应提高。笔者与实验室的研究人员近年利用这种多重网 图模型来模拟语言习得中的一些具体问题。我们的模型一个最大的特征就是它能通过自组学习，对大量的语言素材进行加工，从词与词在句中共现的机率中提取语法 语义范畴。这种提取是根据最近<span style="color: black; background-color: rgb(255,255,255)">自然语言</span>处理中对大语料库加工的相关理论而产生的。Burgess和Lund(1997,1999)提出了hyperspace analogue to language(HAL)的理论，认为<span style="color: black; background-color: rgb(255,255,255)">自然语言</span>素 材中词与词之间的关系提供了足够的语义信息。Landauer和Dumais(1997)也提出了类似的理论(Latent semantic analysis)，认为语义可从词与篇章的关系中提取。在一系列的研究中，我们发现如果儿童分析成人话语中词与词的共现关系及其频率，可以获得词的语义 及语法关系(Li,Burgess and Lund 2000)。这个结论与最近研究幼儿切分话语单位的结论是一致的(Saffran et al.,1996,1997)。同时，我们还提出了词汇的发展模型(DevLex)，用以不断学习新的词汇表征(Farkas and Li 2001)。DevLex不限于固定的词汇，而是通过语料的增加而相应地增加新词，并可以不断增加网络中的单元数目及网图数目(Farkas and Li 2002)。这种逐步增加的过程可以更适当地反映儿童语言学习或成人外语学习的过程。<br />
<br />
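<p>赫伯学习法的基本原则（两个单元如果同时激活，其间的联结强度就相应提高）可以用几行 Python 来示意（两个网图的激活数值纯属假设）：</p>

```python
import numpy as np

# Hebb (1949)：联结权值的增量正比于两端单元激活的乘积。
eta = 0.1
pre = np.array([1.0, 0.0, 1.0])    # 一个网图上的激活（如语音）
post = np.array([0.0, 1.0, 1.0])   # 另一个网图上的激活（如语义）
W = np.zeros((3, 3))               # 两个网图之间的联结权值

for _ in range(10):
    W += eta * np.outer(pre, post)  # 同时激活的单元对联结不断增强

print(np.round(W, 2))  # 只有 pre、post 均激活的单元对获得非零权值
```

这正是正文所说的多重网图之间联结的调节方式：共同激活的模式被逐步"绑定"在一起。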
&nbsp;&nbsp;&nbsp;&nbsp;我们将DevLex模型运用到语言习得中的几 个具体问题上，比如前面提到的语义隐型范畴与前缀un-的关系(Li 1999)，英语时态的学习(Li 2000;Li and Shirai 2000)，中英双语的词汇表征（Li and Farkas 2001;Li 2001及Li的综述，2002）。结果表明模型能有效地提取及表达语法语义范畴。在中英双语的模拟中，两种语言的词汇及语音都被网络自然地分离开来。在 前缀与时态的模拟中，语义范畴的出现指导着形态标记的使用，从而产生儿童语言中类推或泛化的现象。总而言之，我们的模型克服了传统联结主义模型的局限：它 利用自组而非反馈网络，学习大量自然语料，解决语义语法问题，从而达到更自然地反映语言习得本质的目的。<br />
<br />
</font><br />
<a id="heading8" name="heading8"></a>
<h1><font face="宋体" size="5">5.结语</font></h1>
<br />
<font face="宋体" size="3">&nbsp;&nbsp;&nbsp;&nbsp;从 以上四个部分的讨论中，读者可以看到联结主义近十几年来对西方语言学、心理学及认知科学产生的巨大影响。可惜的是，联结主义应用在中文上的研究寥寥无几。 除了陈鹰和彭聃龄(1994)对汉字认知以及笔者对语言习得的研究外，基本上找不到其它的文献。这与中国语言文字科学的发展是极不相称的。笔者希望通过本 文起到抛砖引玉的作用，使国内学人将语言学与联结主义的研究推向一个高峰。<br />
<br />
</font><br />
<a id="heading9" name="heading9"></a>
<h1><font face="宋体" size="3">【参考文献】:</font></h1>
<br />
<font face="宋体" size="2">&nbsp;&nbsp;&nbsp; 1　Anderson,J.1995.An Introduction to Neural Networks.Cambridge,MA:MIT Press.<br />
<br />
&nbsp;&nbsp;&nbsp; 2　Bates,E.1984.Bioprograms and the innateness hypothesis.Behavioral and Brain Sciences,7,188-190.<br />
<br />
&nbsp;&nbsp;&nbsp; 3　Bechtel,W.and Abrahamsen,A.1991.Connectionism and the Mind.Cambridge,MA:Blackwell.<br />
<br />
&nbsp;&nbsp;&nbsp; 4　Bowerman,M.1982.Reorganizational processes in lexical and syntactic development.In E.Wanner and L. Gleitman,eds.,Language Acquisition:The State of the Art.Cambridge:Cambridge University Press.<br />
<br />
&nbsp;&nbsp;&nbsp; 5　——.1988.The"no negative evidence"problem:how do children avoid constructing an overly general grammar?In J.Hawkins,ed.,Explaining Language Universals.New York:Basil Blackwell.<br />
<br />
&nbsp;&nbsp;&nbsp; 6　Brown,R.1973.A First Language.Cambridge,MA:Harvard University Press.<br />
<br />
&nbsp;&nbsp;&nbsp; 7　Burgess,C.and Lund,K.1997.Modelling parsing constraints with high-dimensional context space. Language and Cognitive Processes,12,1-34.<br />
<br />
&nbsp;&nbsp;&nbsp; 8　——.1999.The dynamics of meaning in memory.In E.Dietrich and A.Markman,eds.,Cognitive Dynamics:Conceptual and Representational Change in Humans and Machines(pp.17-56).Mahwah, NJ:Erlbaum.<br />
<br />
&nbsp;&nbsp;&nbsp; 9　Bybee,J.and Slobin,D.1982.Rules and schemes in the development and use of the English past tense. Language 58:265-289.<br />
<br />
&nbsp;&nbsp;&nbsp; 10　Dayhoff,J.1990.Neural Network Architectures:An Introduction.New York:Van Nostrand Reinhold.<br />
<br />
&nbsp;&nbsp;&nbsp; 11　Ellis,R.and Humphreys,G.1999.Connectionist Psychology: A Text with Readings.Psychology Press: Taylor and Francis.<br />
<br />
&nbsp;&nbsp;&nbsp; 12　Elman,J.1990.Finding structure in time.Cognitive Science,14,179-211.<br />
<br />
&nbsp;&nbsp;&nbsp; 13　Elman,J.Bates,E.,Johnson,M.,Karmiloff-Smith,A.,Parisi,D.,and Plunkett,K.1996.Rethinking Innateness:A Connectionist Perspective on Development.Cambridge,MA:MIT Press.<br />
<br />
&nbsp;&nbsp;&nbsp; 14　Farkas,I.and Li,P.2001.A self-organizing neural network model of the acquisition of word meaning.In E.M.Altmann,A.Cleeremans,C.D.Schunn,and W.D.Gray,eds.,Proceedings of the Fourth International Conference on Cognitive Modeling,pp.67-72.Mahwah,NJ:Lawrence Erlbaum.<br />
<br />
&nbsp;&nbsp;&nbsp; 15　——.2002.Modeling the development of the lexicon with a growing self-organizing map.In H.J.Caulfield et al.,eds.,Proceedings of the Sixth Joint Conference on Information Science,pp.553-556.Association for Intelligent Machinery,Inc.<br />
<br />
&nbsp;&nbsp;&nbsp; 16　Fausett,L.1994.Fundamentals of Neural Networks.Englewood Cliffs,NJ:Prentice Hall.<br />
<br />
&nbsp;&nbsp;&nbsp; 17　Feldman,J.A.and Ballard,D.1982.Connectionist models and their properties.Cognitive Science,6,205-254.<br />
<br />
&nbsp;&nbsp;&nbsp; 18　Fodor,J.1983.The Modularity of Mind.Cambridge,MA:MIT Press.<br />
<br />
&nbsp;&nbsp;&nbsp; 19　Hebb,D.1949.The Organization of Behavior:A Neuropsychological Theory.New York,NY:Wiley.<br />
<br />
&nbsp;&nbsp;&nbsp; 20　Hertz,J.,Krogh,A.and Palmer,R.1991.Introduction to the Theory of Neural Computation.Redwood City,CA:Addison-Wesley.<br />
<br />
&nbsp;&nbsp;&nbsp; 21　Kohonen,T.1982.Self-organized formation of topologically correct feature maps.Biological Cybernetics,43,59-69.<br />
<br />
&nbsp;&nbsp;&nbsp; 22　——.1989.Self-organization and Associative Memory.Heidelberg:Springer-Verlag.<br />
<br />
&nbsp;&nbsp;&nbsp; 23　——.1995.Self-organizing Maps.Heidelberg:Springer-Verlag.<br />
<br />
&nbsp;&nbsp;&nbsp; 24　Kuczaj,S.1977.The acquisition of regular and irregular past tense forms.Journal of Verbal Learning and Verbal Behavior 16:589-600.<br />
<br />
&nbsp;&nbsp;&nbsp; 25　Landauer,T.,Dumais,S.1997.A solution to Plato's problem:the latent semantic analysis theory of acquisition,induction,and representation of knowledge.Psychological Review,104,211-240.<br />
<br />
&nbsp;&nbsp;&nbsp; 26　Li,P.1993.Cryptotypes,form-meaning mappings,and overgeneralizations.In E.V.Clark,ed.,Proceedings of the 24th Child Language Research Forum,pp.162-178.Center for the Study of Language and Information,Stanford University.<br />
<br />
&nbsp;&nbsp;&nbsp; 27　——.1999.Generalization,representation,and recovery in a self-organizing feature-map model of language acquisition.In M.Hahn and S.C.Stoness,eds.,Proceedings of the Twenty First Annual Conference of the Cognitive Science Society,pp.308-313.Mahwah,NJ:Lawrence Erlbaum.<br />
<br />
&nbsp;&nbsp;&nbsp; 28　——.2000.The acquisition of lexical and grammatical aspect in a self-organizing feature-map model.In L.Gleitman and Aravind K.Joshi,eds.,Proceedings of the Twenty Second Annual Conference of the Cognitive Science Society.Mahwah,NJ:Lawrence Erlbaum.<br />
<br />
&nbsp;&nbsp;&nbsp; 29　——.2001.Language acquisition in a self-organizing neural network model.In P.Quinlan,ed., Connectionism and Developmental Theory.Philadelphia and Brighton:Psychology Press.<br />
<br />
&nbsp;&nbsp;&nbsp; 30　——.2002.Emergent semantic structures and language acquisition:A Dynamic Perspective.In H.Kao, C.K.Leong,and G.D.,Guo,eds.,Cognitive Neuroscience Studies of the Chinese Language.Hong Kong,China:Hong Kong University Press.<br />
<br />
&nbsp;&nbsp;&nbsp; 31　Li,P.Burgess,C.and Lund,K.2000.The acquisition of word meaning through global lexical cooccurrences.In E.Clark,ed.,Proceedings of the Thirtieth Stanford Child Language Research Forum, Cambridge,MA:Cambridge University Press.<br />
<br />
&nbsp;&nbsp;&nbsp; 32　Li,P.and Farkas,I. 2001.A self-organizing connectionist model of bilingual processing.In R.Heredia and J.Altarriba,eds.,Bilingual Sentence Processing.North-Holland:Elsevier Science Publisher.<br />
<br />
&nbsp;&nbsp;&nbsp; 33　Li,P.and MacWhinney,B.1996.Cryptotype,overgeneralization,and competition:A connectionist model of the learning of English reversive prefixes.Connection Science,8,1-28.<br />
<br />
&nbsp;&nbsp;&nbsp; 34　Li,P.and Shirai,Y.2000.The Acquisition of Lexical and Grammatical Aspect.Berlin and New York: Mouton de Gruyter.<br />
<br />
&nbsp;&nbsp;&nbsp; 35　MacWhinney,B.1999.The Emergence of Language.Mahwah,NJ:Lawrence Erlbaum.<br />
<br />
&nbsp;&nbsp;&nbsp; 36　MacWhinney,B.and Leinbach,J.1991.Implementations are not conceptualizations: Revising the verb learning model.Cognition,40,121-157.<br />
<br />
&nbsp;&nbsp;&nbsp; 37　McCulloch,W.and Pitts,W.1943.A logical calculus of the ideas immanent in nervous activity.Bulletin of Mathematical Biophysics,7,115-133.<br />
<br />
&nbsp;&nbsp;&nbsp; 38　Miikkulainen,R.1993.Subsymbolic Natural Language Processing:An Integrated Model of Scripts,Lexicon,and Memory.Cambridge,MA:MIT Press.<br />
<br />
&nbsp;&nbsp;&nbsp; 39　——.1997.Dyslexic and category-specific aphasic impairments in a self-organizing feature map model of the lexicon.Brain and Language,59,334-366.<br />
<br />
&nbsp;&nbsp;&nbsp; 40　Nelson,C.1999.Neural plasticity and human development. Current Directions in Psychological Science 8,42-45.<br />
<br />
&nbsp;&nbsp;&nbsp; 41　Newell,A.1980.Physical symbol systems.Cognitive Science,4,135-183.<br />
<br />
&nbsp;&nbsp;&nbsp; 42　Pinker,S.1991.Rules of language.Science,253:530-535.<br />
<br />
&nbsp;&nbsp;&nbsp; 43　——.1999.Out of the minds of babies.Science,283:40-41.<br />
<br />
&nbsp;&nbsp;&nbsp; 44　Pinker,S.,Prince,A.1988.On language and connectionism:analysis of a parallel distributed processing model of language acquisition.Cognition,28,73-193.<br />
<br />
&nbsp;&nbsp;&nbsp; 45　Plunkett,K.and Marchman,V.1991. U-shaped learning and frequency effects in a multi-layered perceptron: implications for child language acquisition.Cognition,38,43-102.<br />
<br />
&nbsp;&nbsp;&nbsp; 46　——.1993.From rote learning to system building: acquiring verb morphology in children and connectionist nets.Cognition,48,21-69.<br />
<br />
&nbsp;&nbsp;&nbsp; 47　Rosenblatt,F.1958.The perceptron:A probabilistic model for information storage and organization in the brain.Psychological Review,65,386-408.<br />
<br />
&nbsp;&nbsp;&nbsp; 48　Rumelhart,D.,Hinton,G.and Williams,R.1986.Learning internal representations by error propagation.In:David E.Rumelhart,James L.McClelland and the PDP Research Group,eds.,Parallel Distributed Processing:Explorations in the Microstructures of Cognition,Vol.1:Foundations.Cambridge,MA:MIT Press.<br />
<br />
&nbsp;&nbsp;&nbsp; 49　Rumelhart,D.,James L.McClelland and the PDP Research Group,eds.1986.Parallel Distributed Processing.Explorations in the Microstructure of Cognition,Vol.1:Foundations.Cambridge,MA: MIT Press.<br />
<br />
&nbsp;&nbsp;&nbsp; 50　Rumelhart,D.and McClelland,J.1986.On learning the past tenses of English verbs.In:James L.McClelland,David E.Rumelhart and the PDP Research Group,eds.,Parallel Distributed Processing:Explorations in the Microstructures of Cognition,Vol.2:Psychological and Biological Models.Cambridge,MA:MIT Press.<br />
<br />
&nbsp;&nbsp;&nbsp; 51　Saffran,J.,Aslin,R.and Newport,E.1996.Statistical learning by 8-month-old infants.Science,274,1926-1928.<br />
<br />
&nbsp;&nbsp;&nbsp; 52　Saffran,J.,Newport,E.,Aslin,R.,Tunick,R.and Barrueco,S.1997.Incidental language learning: Listening(and learning)out of the corner of your ear.Psychological Science,8,101-105.<br />
<br />
&nbsp;&nbsp;&nbsp; 53　Spitzer,M.1999.The Mind within the Net.Cambridge,MA:MIT Press.<br />
<br />
&nbsp;&nbsp;&nbsp; 54　Uttal,W.2001.The New Phrenology:The Limits of Localizing Cognitive Processes in the Brain. Cambridge,MA:MIT Press.<br />
<br />
&nbsp;&nbsp;&nbsp; 55　Whorf,B.1956.Language,Thought,and Reality(edited by John Carroll).Cambridge,MA:MIT Press.<br />
<br />
&nbsp;&nbsp;&nbsp; 56　陈鹰、彭聃龄.1994.A connectionist model of Chinese character recognition and cognition (in Chinese).In H.-W.Chang,J.-T.Huang,C.-W.Hue,and O.Tzeng,eds.,Advances in the Study of Chinese Language Processing,Vol.1,Taipei:National Taiwan University Press,211-240.<br />
<br />
&nbsp;&nbsp;&nbsp; 57　李行德.1992.The psychological reality of grammar (in Chinese).《国外语言学》No.3,pp.25-34.</font></div>
<div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/Shaird/" target="_blank">Shaird</a> 2008-01-21 05:35</div>]]></description></item></channel></rss>