ANTONIO SAN MARTÍN PIZARRO - PHD THESIS - UNIVERSITY OF GRANADA
LA REPRESENTACIÓN DE LA VARIACIÓN CONTEXTUAL MEDIANTE DEFINICIONES TERMINOLÓGICAS FLEXIBLES
Representing Contextual Variation by Means of Flexible Terminological Definitions
Supervisors: Pamela Faber Benítez and Pilar León Araúz
Definitions are one of the most important components of any high-quality terminological resource as well as a privileged medium for knowledge representation since they offer a direct natural-language explanation of the content of a concept. The adequacy of the definitions thus largely determines the overall usefulness of the terminological resource for the user.
This study has been motivated by the observation that terminological definitions often do not meet the needs of users. This is partly due to certain preconceptions about the purpose of definitions as well as the nature of meaning itself. These preconceptions thus affect how terminological definitions are created.
Traditionally, defining a term has been seen as stating the necessary and sufficient characteristics that make up the meaning of the term. This approach, known as the Aristotelian definition (§3.2), presupposes the existence of a stable meaning, independent of the context in which the term is used. In addition, it is assumed that meaning (or semantic knowledge) is independent of world knowledge (or encyclopedic knowledge). The key premises on which the traditional approach to definitions is based have been refuted in the field of cognitive linguistics (§2.1): i) it is not possible to determine the necessary and sufficient features of a concept because conceptual boundaries are fuzzy; ii) concepts have prototypical features not shared by all members of the category; iii) it is not possible to make a distinction between semantic and encyclopedic knowledge, nor between semantic and pragmatic knowledge.
From a cognitive point of view, encyclopedic knowledge plays a central role in the study of meaning because concepts are always embedded in frames, structures based on encyclopedic knowledge that give concepts their sense (Fillmore 1982a). Moreover, meaning is not considered a stable entity; it is constructed in each usage event in accordance with the context (§2.1.2). As a consequence, meaning and context are inseparable.
In this doctoral thesis, we apply these premises of cognitive linguistics to terminological definitions and present a proposal called the flexible terminological definition (§3.6). This consists of a set of definitions of the same concept made up of a general definition (in this case, one encompassing the entire environmental domain) along with additional definitions describing the concept from the perspective of the subdomains in which it is relevant.
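The structure of a flexible definition described above can be sketched as a small data type. This is only an illustrative sketch, not how the thesis or EcoLexicon stores definitions: the class name `FlexibleDefinition`, the method `for_domain`, the subdomain label, and the placeholder texts are all hypothetical, and falling back to the general definition when no subdomain definition exists is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class FlexibleDefinition:
    """One concept with a general definition plus subdomain-specific ones."""
    concept: str
    general: str  # definition valid for the whole (environmental) domain
    by_subdomain: Dict[str, str] = field(default_factory=dict)

    def for_domain(self, subdomain: Optional[str] = None) -> str:
        # Assumption: when no subdomain is chosen, or the concept has no
        # definition for that subdomain, the general definition is returned.
        if subdomain is None:
            return self.general
        return self.by_subdomain.get(subdomain, self.general)


# Placeholder texts only; these are not definitions taken from the thesis.
chlorine = FlexibleDefinition(
    concept="CHLORINE",
    general="<general environmental definition>",
    by_subdomain={
        "water treatment": "<definition from the water-treatment perspective>",
    },
)

print(chlorine.for_domain("water treatment"))
print(chlorine.for_domain("geology"))
```

Keeping the general definition separate from the subdomain entries mirrors the description above: the subdomain definitions are additions to the general one, not replacements for it.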
Our proposal assumes that by eliminating the artificial boundaries between semantic and pragmatic knowledge, the representation of contextual variation in the terminological definition will no longer be a mere possibility. Given the ubiquity of context and its effects on cognition and language, the representation of the traits activated by concepts in accordance with the context becomes a necessity if one aspires to fully meet the user’s needs. This also entails the inclusion of prototypical characteristics in the definition, i.e. characteristics that are not always applicable to the concept, but which are relevant in a given context.
Similarly, encyclopedic knowledge in the terminological definition is no longer forbidden. It now forms an integral part of the definition. The role that the defined concept plays in the frames it activates should be, as far as possible, part of the definition. Our proposal is specifically based on frame-based terminology (§2.2), in addition to the theories of grounded cognition (§2.1.1), frame semantics, prototype theory, and the theory theory.
We took as a starting point the application of frame-based terminology to the representation of specialized knowledge and to terminological definitions in EcoLexicon (a terminological knowledge base on the environment created in accordance with frame-based terminology). In fact, our proposal for a flexible definition was inspired by the recontextualization of EcoLexicon (§3.6.1), as a result of which conceptual maps only show the relevant information for the subdomain of the environment chosen by the user. Recontextualization in EcoLexicon represents contextual variation and avoids information overload, thus increasing knowledge acquisition by the users.
EcoLexicon follows the principle proposed by Meyer, Bowker, and Eck (1992: 159) according to which, for a terminological knowledge base to be truly useful, it must reflect the same conceptual organization as in the human brain. Since terminological definitions are a kind of knowledge representation (Faber 2002), in this doctoral thesis, we assume that the creation of terminological definitions should also be based on the organization of the human conceptual system.
Since context is a determining factor in the construction of the meaning of lexical units (including terms), we assume that terminological definitions can, and should, reflect the effects of context, even though definitions have traditionally been treated as the expression of meaning void of any contextual effect.
The main objective of this thesis is to analyze the effects of contextual variation on specialized environmental concepts with a view to their representation in terminological definitions. Specifically, we focused on contextual variation based on thematic constraints, i.e. how the different areas of knowledge comprising the vast domain of the environment conceptualize the same concepts differently, and how this can be reflected in the definition. One of the foundations of our proposal is the notion in cognitive linguistics that lexical units only have meaning in real usage events (§2.1.2). Outside of any usage event, a lexical unit does not have any meaning, only semantic potential.
A term’s semantic potential is the raw material for its definition, not its object. If the semantic potential were the object, defining a term would involve describing all the conceptual content that the term could activate. This is not viable since a term’s semantic potential corresponds to a vast, immeasurable quantity of information that is never fully activated in real usage events.
Lexical units have not only semantic potential, but also associated conventional and contextual constraints. These constraints cause some conceptual content to be activated more often than other content, giving rise to what Croft and Cruse (2004: 110) call pre-meanings. Pre-meanings are conceptual units that appear between the semantic potential and the meaning in the conceptualization process. The object of the definition is thus a subset of the semantic potential that corresponds to a pre-meaning. The pre-meaning that becomes the object of a given definition depends on the contextual constraints applied to the definition. In all cases, this subset corresponds to a portion of a single concept and the frames that it can activate. By contextual constraints, we mean any situational factors that affect meaning construction and, indirectly, the content of terminological definitions.
Given that the object of the definition (the pre-meaning) is an abstraction of the meanings that a lexical unit has under certain contextual constraints, we can state that the context associated with a pre-meaning is also a sort of abstraction from real contexts. Accordingly, we use the term pre-context for the set of contextual constraints that limit the semantic potential of a lexical unit in a relatively predictable way, giving rise to pre-meanings.
Context comprises the linguistic context, discursive context, sociocultural context, and spatial-temporal context. The pre-context for terminological definitions includes linguistic constraints, thematic constraints, cultural constraints, ideological constraints, and diachronic constraints.
Thematic constraints (i.e. discourse topic) allow for more accurate predictions than other contextual factors about the way that the semantic potential of a given lexical unit is restricted. Thematic constraints reduce the semantic potential of a lexical unit according to the topic at issue during a communicative act and the point of view taken. Our proposal of a flexible terminological definition relies on this type of constraint.
Domains, in terms of a knowledge field, allow for the systematic characterization of thematic constraints in terminological definitions. They can be understood as macroframes that guide knowledge organization and categorization in a given conceptual area. In this doctoral thesis, we have used a simplified version of the domain classification that was created specifically for EcoLexicon.
This work focuses on the phenomenon of contextual variation as opposed to lexical ambiguity (polysemy and homonymy). Contextual variation is the phenomenon that occurs when a concept does not always activate the same traits in usage events and the relevance of these traits varies. Lexical ambiguity, by contrast, is the phenomenon that occurs when a lexical unit is associated with more than one concept (Cruse 2011: 100).
To accomplish the objectives of this doctoral thesis, we conducted an empirical study (§5) consisting of the analysis of a set of contextually variable concepts and the creation of a flexible definition for two of them. Each of these two concepts presented different contextual profiles.
To select the concepts to be analyzed, term extraction was performed on 14 corpora from different environmental subdomains, compiled specifically for this doctoral thesis. The extraction was limited to simple nouns, and the results were compared so as to retain only those terms appearing (with a set frequency) in more than three domains. Polysemous terms were discarded manually.
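The cross-corpus filtering step can be illustrated with a short sketch. It assumes that per-corpus term frequencies are already available as dictionaries; the function name `select_candidate_terms`, the frequency threshold of 5, and the toy corpora and counts are invented for illustration, since the actual threshold and data are not given here. "More than three domains" is encoded as at least four.

```python
from collections import Counter


def select_candidate_terms(freq_by_domain, min_freq=5, min_domains=4):
    """Return terms that reach min_freq in at least min_domains corpora.

    freq_by_domain maps a corpus (subdomain) name to a dict of term -> frequency.
    min_freq is a hypothetical stand-in for the "set frequency" in the text.
    """
    domains_per_term = Counter()
    for freqs in freq_by_domain.values():
        for term, freq in freqs.items():
            if freq >= min_freq:
                domains_per_term[term] += 1
    return sorted(t for t, n in domains_per_term.items() if n >= min_domains)


# Toy, invented frequencies for five hypothetical subdomain corpora.
toy = {
    "corpus_A": {"sediment": 12, "erosion": 7, "aquifer": 30},
    "corpus_B": {"sediment": 40, "erosion": 1},
    "corpus_C": {"sediment": 6, "erosion": 25},
    "corpus_D": {"sediment": 9, "erosion": 2},
    "corpus_E": {"sediment": 3, "erosion": 8},
}

print(select_candidate_terms(toy))  # ['sediment']
```

The manual removal of polysemous terms mentioned above is not modeled; it would be a separate step applied to the returned list.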
To extract the knowledge needed for the conceptual analysis and the writing of the flexible terminological definitions, the methodology of frame-based terminology (with certain additions) was followed (§4.2.2). This methodology consists of a combined top-down and bottom-up approach. The top-down method includes mainly the analysis of definitions from other terminological resources, whereas the bottom-up approach comprises corpus analysis.
For more efficient knowledge extraction from the corpora, we employed hypernymic knowledge patterns (Meyer 2001: 290) coded as word-sketches for SketchEngine. This allows for the extraction of superordinate concept candidates for the choice of genus in definitions. Moreover, we created a word-sketch for the extraction of contextonyms, which are the lexical units that tend to co-occur with a given lexical unit in linguistic contexts (Ji, Ploux, and Wehrli 2003; Ji and Ploux 2003). In this work, the analysis of contextonyms was used to determine the semantic traits activated by a concept in a given domain.
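The idea behind contextonyms, lexical units that tend to co-occur with a given unit, can be approximated with a plain window-based co-occurrence count. The sketch below is a deliberate simplification and does not reproduce the word-sketch grammar used in the study or the method of Ji and Ploux; the function name `contextonym_candidates`, the window size, the stopword list, and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter


def contextonym_candidates(text, target, window=4, top_n=10, stopwords=frozenset()):
    """Count words occurring within `window` tokens of each `target` occurrence."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != target:
            continue
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        # Skip stopwords and the target itself so only content neighbours remain.
        counts.update(t for t in left + right if t not in stopwords and t != target)
    return counts.most_common(top_n)


# Invented sample text; a real study would run this over a subdomain corpus.
sample = "Chlorine is added to water as a disinfectant. Chlorine in water forms acids."
stop = frozenset({"is", "to", "as", "a", "in"})

print(contextonym_candidates(sample, "chlorine", stopwords=stop)[0])  # ('water', 3)
```

Running the same function over corpora from different subdomains and comparing the top-ranked neighbours gives a rough view of which traits a concept tends to activate in each domain.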
As a result of the first part of our empirical study (the analysis of all the terms in our working list) (§5.1), we divided our notion of domain-dependent contextual variation into three different phenomena: i) modulation (similar to Cruse’s modulation); ii) perspectivization (related to Cruse’s ways-of-seeing); iii) subconceptualization (akin to Cruse’s microsenses and local sub-senses). These phenomena are additive in that all concepts experience modulation, some concepts also undergo perspectivization, and finally, a small number of concepts are additionally subjected to subconceptualization.
Modulation (§5.2.1) is the type of contextual variation that only alters minor characteristics of a concept which are neither necessary nor prototypical. These alterations are not represented in a terminological definition. Perspectivization (§5.2.2), in turn, changes the level of prototypicality of certain traits of a concept in relation to the general environmental pre-meaning. Finally, subconceptualization (§5.2.3) is the type of contextual variation in which the extension of the concept is modified in relation to the general environmental pre-meaning.
In the second part of our empirical study (§5.4), we created two flexible terminological definitions, one for a concept with subconceptualizations (POLLUTANT) and another for a concept with perspectives (CHLORINE). In this section, we presented guidelines on how to build them, from the extraction of knowledge to the actual writing of the definition. These guidelines ensure that the definition actually reflects how the defined concept is construed in different environmental domains, which might differ from the viewpoint adopted in the environment as a whole or other environmental subdomains.
This doctoral thesis contributes to the improvement of the quality of terminological definitions because, with our approach, users are presented with a definition tailored to the domain that they have chosen, which increases the likelihood that the definition will offer them the information they need. Furthermore, flexible terminological definitions provide a knowledge representation that better resembles the human conceptual system than traditional terminological definitions. As a consequence, a flexible definition not only provides more relevant information, but it also accomplishes this in a way that potentially facilitates and enhances knowledge acquisition.