ARTIFICIAL INTELLIGENCE - JNTUK R19 - UNIT 4 - Knowledge Representation
4.1. KNOWLEDGE REPRESENTATION
INTRODUCTION:
Knowledge representation is an important issue in both cognitive science & AI. In cognitive science, it is concerned with the way in which information is stored & processed by humans, while in AI, the main focus is on storing knowledge or information in such a manner that programs can process it & achieve human intelligence. In AI, knowledge representation is an important area because intelligent problem solving can be achieved and simplified by choosing an appropriate knowledge representation technique. Representing knowledge in some specific ways makes certain problems easier to solve. The fundamental goal of knowledge representation is to represent knowledge in a manner that facilitates the process of inferencing (drawing conclusions) from it.
Any knowledge representation system should possess properties such as learning, efficiency in acquisition, representational adequacy & inferential adequacy.
·Learning: the capability to acquire new knowledge, behaviours, understanding etc.
·Efficiency in acquisition: the ability to acquire new knowledge using automatic methods wherever possible, rather than relying on human intervention.
·Representational adequacy: the ability to represent the required knowledge.
·Inferential adequacy: the ability to manipulate knowledge so as to produce new knowledge from the existing knowledge.
APPROACHES TO KNOWLEDGE REPRESENTATION:
Knowledge structures represent objects, facts, relationships & procedures. The main function of these knowledge structures is to provide expertise and information so that a program can operate in an intelligent way. Knowledge structures are usually composed of both traditional & complex structures such as semantic network, frames, scripts, conceptual dependency etc.
1. Relational Knowledge: Relational knowledge comprises objects consisting of attributes & associated values. The simplest way of storing facts is the relational method, in which each fact is stored in a row of a relational table, as in a relational database. A table is a set of data values organized using a model of vertical columns identified by attribute names & horizontal rows identified by the corresponding values.
From such a table we can easily obtain answers to queries such as:
·What is the age of John?
·How much does Mary earn?
·What is the qualification of Mike?
However, we cannot obtain answers for queries such as “Does a person having PhD qualification earn more?” So, inferencing new knowledge is not possible from such structures.
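The relational approach can be sketched in a few lines of Python. Since the original table is not reproduced here, the rows below (names, ages, salaries, qualifications) are hypothetical values chosen to match the queries in the text:

```python
# Relational knowledge: each fact is a row of attribute-value pairs,
# as in a relational database table. The data below is illustrative.
people = [
    {"name": "John", "age": 35, "qualification": "BSc", "salary": 40000},
    {"name": "Mary", "age": 29, "qualification": "MSc", "salary": 52000},
    {"name": "Mike", "age": 41, "qualification": "PhD", "salary": 60000},
]

def lookup(name, attribute):
    """Answer simple factual queries like 'What is the age of John?'"""
    for row in people:
        if row["name"] == name:
            return row[attribute]
    return None

print(lookup("John", "age"))            # direct lookup works
print(lookup("Mike", "qualification"))  # so does this one
# But a query like "Does a person with a PhD earn more?" requires
# inference that the flat table cannot supply by itself.
```

The lookup only retrieves stored facts; nothing in the structure supports deriving new ones, which is exactly the limitation the text points out.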
2. Inheritable Knowledge: In the inheritable knowledge approach, data is stored in a hierarchy of classes, and elements inherit values from other members of their class. The relation between an instance and its class is called the instance relation. Each frame represents a collection of attributes and their values. Objects and values are shown as boxed nodes, with arrows pointing from objects to their values. For example:
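The boxed-node diagram is graphical, but the same inheritance behaviour can be sketched with ordinary Python classes. The class and attribute names here are illustrative assumptions, not part of the original figure:

```python
# Inheritable knowledge: values flow down a class hierarchy.
class LivingThing:
    can_breathe = True        # attribute stored at the top level

class Human(LivingThing):     # Human isa LivingThing
    legs = 2

class Man(Human):             # Man isa Human
    pass

john = Man()                  # john inst Man
print(john.legs)              # inherited from Human
print(john.can_breathe)       # inherited from LivingThing
```

Nothing about `legs` or `can_breathe` is stored on `john` itself; both values are found by climbing the class hierarchy, which is the point of this representation.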
3. Inferential Knowledge: Inferential capability can be achieved if knowledge is represented in the form of formal logic, which guarantees correctness. For example, suppose there are two statements:
a. Marcus is a man
b. All men are mortal
These can be represented as:
man(Marcus)
∀x: man(x) → mortal(x)
4. Procedural Knowledge: Procedural knowledge is encoded in the form of procedures which carry out specific tasks based on relevant knowledge. For example, an interpreter for a programming language interprets a program on the basis of the available knowledge regarding the syntax & semantics of the language. The advantages of this approach are that domain-specific knowledge can be easily represented & side effects of actions may also be modeled. However, there is a problem of completeness (all cases may not be represented) & consistency (all deductions may not be correct).
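The interpreter example can be made concrete with a toy sketch: the "knowledge" of a tiny language's syntax & semantics is encoded directly as a procedure. The mini-language (prefix tuples for arithmetic) is an assumption chosen for illustration:

```python
# Procedural knowledge: the "how" is encoded directly as a procedure.
def evaluate(expr):
    """Interpret a tiny prefix-notation arithmetic language."""
    if isinstance(expr, (int, float)):
        return expr                      # literals evaluate to themselves
    op, left, right = expr
    if op == "+":
        return evaluate(left) + evaluate(right)
    if op == "*":
        return evaluate(left) * evaluate(right)
    raise ValueError(f"unknown operator {op!r}")

print(evaluate(("+", 2, ("*", 3, 4))))  # 14
```

The `ValueError` branch illustrates the completeness problem: any operator the procedure does not anticipate is simply not handled.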
KNOWLEDGE REPRESENTATION USING SEMANTIC NETWORK:
The basic idea behind a semantic network is that the meaning of a concept is derived from its relationships with other concepts; information is stored by interconnecting nodes with labeled arcs. For example, consider the following knowledge:
·Humans, animals & birds are living things that can breathe & eat. All birds can fly. Every man & woman is a human with 2 legs. A cat has fur & is an animal. All animals have skin & can move. A giraffe is an animal with long legs & is tall. A parrot is a bird & is green in colour.
We can represent such knowledge using a structure called a Semantic Network (or Semantic Net). It is conveniently represented in a graphical notation where nodes represent concepts or objects & arcs represent relations between concepts. The relations are represented by bold directed links.
·isa: This relation connects 2 classes, where one concept is a kind or subclass of the other. For example, "Man isa Human" means Man is a subclass of the Human class.
·inst: This relation relates a specific member to its class; for example, John is an instance of Man.
Fig: Knowledge Representation using Semantic Network
Other relations such as {can, has, colour, height} are known as property relations and are represented by dotted lines. For example, the query “Does a parrot breathe?” can be easily answered ‘Yes’ even though this property is not associated directly with parrot: it inherits the property from its superclass Living_Thing.
Fig: Concepts connected with Prop Links
Inheritance in Semantic Net:
The hierarchical structure of knowledge representation allows knowledge to be stored at the highest possible level of abstraction, which reduces the size of the knowledge base. Since a semantic net is stored as a hierarchical structure, the inheritance mechanism is built in & facilitates inferencing of information associated with nodes in the semantic net.
Algorithm:
Input: an object & the property to be found in the Semantic Net.
Output: returns Yes if the object has the desired property, else returns No.
Procedure:
·Find the object in the semantic net;
·Found = False;
·While (object ≠ root) AND (NOT Found) Do
{
    If the attribute is attached to the object then Found = True;
    else {object = isa(object, class) or object = inst(object, class)}
};
·If Found = True then report ‘Yes’ else report ‘No’;
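The procedure above can be sketched in Python over a semantic net stored as dictionaries. The relation data is a small, partly assumed fragment of the Living_Thing hierarchy described earlier (the instance name `polly` is hypothetical):

```python
# Semantic net stored as dictionaries of links.
isa  = {"Bird": "Living_Thing", "Parrot": "Bird"}   # subclass links
inst = {"polly": "Parrot"}                          # instance links
prop = {"Living_Thing": {"breathe": "yes"},         # property links
        "Bird": {"fly": "yes"},
        "Parrot": {"colour": "green"}}

def has_property(obj, attribute):
    """Climb inst/isa links until the property is found or the root is passed."""
    node = inst.get(obj, obj)         # move from an instance to its class
    while node is not None:
        if attribute in prop.get(node, {}):
            return True               # property found at this level
        node = isa.get(node)          # otherwise climb one isa link
    return False

print(has_property("polly", "breathe"))  # True, inherited from Living_Thing
print(has_property("polly", "fly"))      # True, inherited from Bird
```

Note that `breathe` is stored once, at Living_Thing, yet is available to every node below it; this is the storage saving the text describes.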
Semantic Net can be implemented in any programming language along with inheritance procedure implemented explicitly in that language. Prolog language is very convenient for representing an entire semantic structure in the form of facts & inheritance rules.
Prolog Facts: The facts in Prolog would be written as shown below
Table: Prolog Facts
Inheritance Rules in Prolog: In a class hierarchy, a subclass of a class is also a member of all superclasses connected through isa links. Similarly, an instance of a subclass is also an instance of all superclasses connected by isa links, and a property of a class can be inherited by its subclasses.
Various queries can be answered by the above inheritance program as follows
Table: Various Queries for Inheritance Program
EXTENDED SEMANTIC NETWORKS FOR KR:
Logic and semantic networks are two different formalisms that can be used for knowledge representation. A simple semantic network is represented as a directed graph whose nodes represent concepts or objects and whose arcs represent relationships between them. It can only express collections of variable-free assertions. The English sentences “John gives an apple to Mike” and “John and Mike are human” may be represented in a semantic network as shown:
Fig: Semantic Net
Here ‘E’ represents an event which is an act of giving, whose actor is John, whose object is an apple, and whose recipient is Mike. It should be noted that a semantic net can hold semantic information about a situation, such as that the actor of the giving event is John and the object is an apple, in the sentence John gives an apple to Mike. The relationships in the network shown can be expressed in clausal form of logic as follows:
actor(E, john). object(E, apple). action(E, give). recipient(E, mike). isa(john, human). isa(mike, human).
Predicate relations corresponding to labels on the arcs of semantic networks always have two arguments; therefore, the entire semantic net can be coded using binary representations. Such a representation is advantageous when additional information is added to the system. For example, for the sentence John gave an apple to Mike in the kitchen, it is easy to add location(E, kitchen) to the set of facts given above.
In first-order predicate logic, a predicate relation can have n arguments, where n >= 1. For example, the sentence John gives an apple to Mike is easily represented in predicate logic by give(john, mike, apple). Here, john, mike, and apple are arguments, while give represents a predicate relation. The predicate logic representation has greater advantages compared to the semantic net representation, as it can express general propositions in addition to simple assertions. For example, the sentence John gives an apple to everyone he likes is expressed in predicate logic by the clause:
give(john, X, apple) ← likes (john, X)
Here, the symbol X is a variable representing any individual. The arrow represents the logical connective “implied by”. The left side of ← contains the conclusion(s), while the right side contains the condition(s).
Despite all these advantages, it is not convenient to add new information in an n-ary representation of predicate logic. For example, if we have the 3-ary relationship give(john, mike, apple) representing John gives an apple to Mike, then to capture the additional information about the kitchen in the sentence John gave an apple to Mike in the kitchen, we would have to replace the 3-ary representation give(john, mike, apple) with a new 4-ary representation such as give(john, mike, apple, kitchen).
Further, a clause in logic can have several conditions, all of which must hold for the conclusion to be true. For example,
·The sentence if John gives something he likes to a person, then he also likes that person can be expressed in clausal representation as
likes(john, X) ← give(john, X, Y), likes(john, Y)
·The sentence every human is either male or female is expressed by the following clause
male(X), female(X) ← human(X)
In conventional semantic networks, we cannot express the clausal form of logic. To overcome this, R. Kowalski and his colleagues (1979) proposed the Extended Semantic Network (ESNet), which combines the advantages of both logic and semantic networks. An ESNet can be interpreted as a variant syntax for the clausal form of logic. It has the same expressive power as predicate logic, with well-defined semantics, inference rules, and a procedural interpretation. It also incorporates the advantage of using binary relations, as in semantic networks, rather than the n-ary relations of logic.
Binary predicate symbols in clausal logic are represented by labels on the arcs of the ESNet. An atom of the form love(john, mary) is an arc labeled love with its two end nodes representing john and mary. The direction of the arc (link) indicates the order of the arguments of the predicate symbol which labels the arc, as follows:
Conclusions and conditions of clausal form are represented in ESNet by different kinds of arcs. Arcs denoting conditions are drawn with dotted arrow lines and are called denial links (⇢), while arcs denoting conclusions are drawn with continuous arrow lines and are known as assertion links (→). For example, the clausal representation grandfather(X, Y) ← father(X, Z), parent(Z, Y) for grandfather in logic can be represented in ESNet as given below. Here, X and Y are variables; grandfather(X, Y) is the consequent (conclusion), and father(X, Z) and parent(Z, Y) are the antecedents (conditions).
Fig: ESNet Representation
Similarly, the clausal rule male(X), female(X) ← human(X) can be represented using binary relations as
isa(X, male), isa(X, female) ← isa(X, human).
Fig: ESNet Representation
Inference Rules:
·The representation of the inference for every action of giving, there is an action of taking in clausal logic is action(f(X), take) ← action(X, give). The interpretation of this rule is that the taking event is a function of the giving event. In the ESNet representation, functional terms such as f(X) are represented by a single node. The representation of the statement action(f(X), take) ← action(X, give) is shown below.
Fig: ESNet Representation
·The inference rule that an actor who performs a taking action is also the recipient of this action can be easily represented in clausal logic and in ESNet, as given below. Here, E is a variable representing an event of a taking action.
recipient(E, X) ← action(E, take), actor(E, X)
Fig: ESNet Representation
·A contradiction in ESNet can be represented as shown. Here, P part_of X is a conclusion and P part_of Y is a condition, where Y is linked with X via an isa link. Such a representation is contradictory, and hence there is a contradiction in the ESNet.
Fig: Contradiction in ESNet
Deduction in Extended Semantic Networks:
·Forward reasoning inference mechanism (bottom-up approach)
·Backward reasoning inference mechanism (top-down approach)
Forward reasoning inference mechanism:
Given an ESNet, apply reduction (resolution) using the modus ponens rule of logic {i.e., given (A ← B) and B, conclude A}. For example, consider the following set of clauses:
isa(X, human) ← isa(X, man)
isa(john, man)
Fig: Forward reasoning inference mechanism
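The forward (bottom-up) mechanism can be sketched in Python: starting from the known facts, modus ponens is applied repeatedly until no new facts appear. Facts are stored as (individual, class) pairs, a simplified encoding chosen for this sketch:

```python
# Forward reasoning: repeatedly apply modus ponens (given A <- B and B,
# conclude A) until the fact set stops growing.
facts = {("john", "man")}        # isa(john, man)
rules = [("human", "man")]       # isa(X, human) <- isa(X, man)

changed = True
while changed:
    changed = False
    for conclusion, condition in rules:
        for individual, cls in list(facts):
            if cls == condition and (individual, conclusion) not in facts:
                facts.add((individual, conclusion))  # new derived fact
                changed = True

print(("john", "human") in facts)  # True: isa(john, human) was derived
```

One pass of the loop derives isa(john, human); the second pass adds nothing, so the reasoning terminates.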
Backward reasoning inference mechanism:
In this mechanism, we prove a conclusion (or goal) from a given ESNet by adding the denial of the conclusion to the network and showing that the resulting set of clauses in the network gives a contradiction.
Fig: Backward reasoning inference mechanism
After adding the denial link to the ESNet, we get the reduction of the ESNet as follows:
Fig: Reduction of ESNet
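The top-down direction can be sketched the same way: to prove a goal, find a rule whose conclusion matches it and recursively prove the condition. As in the forward sketch, the (individual, class) encoding is an assumption:

```python
# Backward reasoning: work from the goal back toward known facts.
known = {("john", "man")}         # isa(john, man)
rule_set = [("human", "man")]     # isa(X, human) <- isa(X, man)

def prove(individual, cls):
    """Try to establish isa(individual, cls) from facts and rules."""
    if (individual, cls) in known:
        return True                            # goal is already a fact
    for conclusion, condition in rule_set:
        if conclusion == cls and prove(individual, condition):
            return True                        # reduced subgoal proved
    return False

print(prove("john", "human"))  # True
print(prove("john", "bird"))   # False: no rule or fact supports it
```

In the ESNet formulation this corresponds to adding the denial of isa(john, human) and deriving a contradiction; the recursive goal reduction here reaches the same answer.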
KNOWLEDGE REPRESENTATION USING FRAMES:
A frame is a collection of attributes (usually called slots) and associated values (and possible constraints on values) that describes some entity in the world.
Frames as Sets and Instances:
Set theory provides a good basis for understanding frame systems. Each frame represents either a class (a set) or an instance (an element of a class). The frame system is shown below:
In this example, the frames Person, Adult-Male, ML-Baseball-Player, Fielder, and ML-Baseball-Team are all classes. The frames Pee-Wee-Reese and Brooklyn-Dodgers are instances.
Hence, the isa relation is used to define the subset relation. The set of adult males is a subset of the set of people. The set of baseball players is a subset of the set of adult males, and so forth.
Our instance relation corresponds to the relation element-of. Pee-Wee-Reese is an element of the set of fielders. Thus he is also an element of all of the supersets of fielders, including baseball players and people.
Both the isa and instance relations have inverse attributes, which we call subclasses and all-instances. Because a class represents a set, there are two kinds of attributes that can be associated with it: attributes about the set itself, and attributes that are to be inherited by each element of the set. We indicate the difference between these two by prefixing the latter with an asterisk (*).
For example, consider the class ML-Baseball-Player. We have shown only two properties of it as a set: it is a subset of the set of adult males, and it has cardinality 624. We have listed five properties that all ML-Baseball-Players have (height, bats, batting-average, team, and uniform-color), and we have specified default values for the first three of them. By providing both kinds of slots, we allow a class both to define a set of objects and to describe a prototypical object of the set.
We can view a class as two things simultaneously: a subset (isa) of a larger class, and an instance of a class of sets, from which it inherits its set-level properties.
To capture this we distinguish ordinary classes, whose elements are individual entities, from metaclasses (special classes) whose elements are themselves classes. A class is now an element of a class (its metaclass) as well as a subclass of one or more other classes. A class inherits set-level properties from the class of which it is an instance, and passes inheritable properties down from its superclasses to its instances.
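Under these conventions, a minimal frame system can be sketched with dictionaries: plain slots describe the class as a set, while slots prefixed with '*' are defaults inherited by members. The names follow the baseball example in the text; the specific default values (heights, batting style) are illustrative assumptions:

```python
# Frames as dictionaries with '*'-prefixed inheritable slots.
frames = {
    "ML-Baseball-Player": {
        "isa": "Adult-Male",
        "cardinality": 624,            # property of the set itself
        "*height": 6.1,                # assumed default for members
        "*bats": "equal-to-handed",    # assumed default for members
    },
    "Fielder": {"isa": "ML-Baseball-Player"},
    "Pee-Wee-Reese": {"instance": "Fielder", "height": 5.10},
}

def get_slot(frame, slot):
    """Look up a slot, climbing instance/isa links for '*' defaults."""
    f = frames[frame]
    if slot in f:
        return f[slot]                 # value stored locally wins
    parent = f.get("instance") or f.get("isa")
    while parent is not None and parent in frames:
        pf = frames[parent]
        if "*" + slot in pf:
            return pf["*" + slot]      # inherited default
        parent = pf.get("isa")
    return None

print(get_slot("Pee-Wee-Reese", "height"))  # own slot overrides the default
print(get_slot("Pee-Wee-Reese", "bats"))    # default inherited from the class
```

Pee-Wee-Reese's own height overrides the class default, while his batting style falls through two levels of hierarchy to the `*bats` default, mirroring how frame defaults behave.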
4.2. ADVANCED KNOWLEDGE REPRESENTATION TECHNIQUES
CONCEPTUAL DEPENDENCY (CD):
Conceptual dependency is a theory of how to represent the kind of knowledge about events that is usually contained in natural language sentences. The goal is to represent the knowledge in a way that
·Facilitates drawing inferences from the sentences.
·Is independent of the language in which the sentences were originally stated.
Representing Knowledge:
Because of the two concerns mentioned above, the CD representation of a sentence is built not out of primitives corresponding to the words used in the sentence, but rather out of conceptual primitives that can be combined to form the meanings of words in any particular language. It was first proposed by Schank. Unlike semantic nets, CD provides both a structure and a specific set of primitives, out of which representations of particular pieces of information can be constructed. For example, we can represent the sentence
“I gave the man a book”
where the symbols have the following meanings:
·Arrows indicate the direction of dependency.
·The double arrow indicates a two-way link between actor and action.
·p indicates past tense.
·ATRANS is one of the primitive acts used by the theory; it indicates transfer of possession.
·O indicates the object case relation.
·R indicates the recipient case relation.
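The dependency diagram itself is graphical, but the same conceptualization can be captured as a nested structure. The dictionary layout below is an illustrative encoding, not Schank's notation:

```python
# "I gave the man a book" as a CD conceptualization: the primitive act
# ATRANS (transfer of possession) with its case relations filled in.
conceptualization = {
    "act": "ATRANS",                       # primitive act
    "tense": "p",                          # p = past
    "actor": "I",                          # two-way actor/action link
    "object": "book",                      # O: object case relation
    "recipient": {"to": "man", "from": "I"},  # R: recipient case relation
}
print(conceptualization["act"])
```

Because the structure is built from primitives rather than English words, the same dictionary would represent the equivalent sentence in any language.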
In CD, representations of actions are built from a set of primitive acts. Examples of primitive acts are:
ATRANS – Transfer of an abstract relationship (Eg: give)
PTRANS – Transfer of the physical location of an object (Eg: go)
PROPEL – Application of a physical force to an object (Eg: push)
MOVE – Movement of a body part by its owner (Eg: kick)
GRASP – Grasping of an object by an actor (Eg: clutch)
INGEST – Ingestion of an object by an animal (Eg: eat)
EXPEL – Expulsion of something from the body of an animal (Eg: cry)
MTRANS – Transfer of mental information (Eg: tell)
MBUILD – Building new information out of old (Eg: decide)
SPEAK – Production of sounds (Eg: say)
ATTEND – Focusing of a sense organ toward a stimulus (Eg: listen)
A second set of CD building blocks is the set of allowable dependencies among the conceptualizations described in a sentence. There are four primitive conceptual categories from which dependency structures can be built:
ACTs – Actions
PPs – Objects (picture producers)
AAs – Modifiers of actions (action aiders)
PAs – Modifiers of PPs (picture aiders)
Rule 1: describes the relationship between an actor and the event he or she causes. This is a two-way dependency since neither actor nor event can be considered primary. The letter p above the dependency link indicates past tense.
Ex: John ran.
Rule 2: describes the relationship between a PP and a PA that is being asserted to describe it. Many state descriptions, such as height, are represented in CD as numeric scales.
Ex: John is tall.
Rule 3: describes the relationship between two PPs, one of which belongs to the set defined by the other.
Ex: John is a Doctor
Rule 4: describes the relationship between a PP and an attribute that has already been predicated of it. The direction of the arrow is toward the PP being described.
Ex: A nice boy.
Rule 5: describes the relationship between two PPs, one of which provides a particular kind of information about the other. The three most common types of information to be provided in this way are possession (shown as POSS-BY), location (shown as LOC), and physical containment (shown as CONT). The direction of the arrow is again toward the concept being described.
Ex: John’s dog.
Rule 6: describes the relationship between an ACT and the PP that is the object of that ACT. The direction of the arrow is toward the ACT since the context of the specific ACT determines the meaning of the object relation.
Ex: John pushed the cart.
Rule 7: describes the relationship between an ACT and the source and the recipient of the ACT.
Ex: John took the book from Mary.
Rule 8: describes the relationship between an ACT and the instrument with which it is performed. The instrument must always be a full conceptualization (i.e., it must contain an ACT), not just a single physical object.
Ex: John ate ice cream with a spoon.
Rule 9: describes the relationship between an ACT and its physical source and destination.
Ex: John fertilized the field.
Rule 10: describes the relationship between a PP and a state in which it started and another in which it ended.
Ex: The Plants grew
Rule 11: describes the relationship between one conceptualization and another that causes it. Notice that the arrows indicate dependency of one conceptualization on another and so point in the opposite direction of the implication arrows. The two forms of the rule describe the cause of an action and the cause of a state change.
Ex: Bill shot Bob.
Rule 12: describes the relationship between a conceptualization and the time at which the event it describes occurred.
Ex: John ran yesterday.
Rule 13: describes the relationship between one conceptualization and another that is the time of the first. The example for this rule also shows how CD exploits a model of the human information processing system; see is represented as the transfer of information between the eyes and the conscious processor.
Ex: While going home, I saw a frog
Rule 14: describes the relationship between a conceptualization and the place at which it occurred.
Ex: I heard a frog in the woods.
The set of conceptual tenses proposed by Schank includes:
·p – past
·f – future
·t – transition
·ts – start transition
·tf – finished transition
·k – continuing
·? – interrogative
·/ – negative
·delta – timeless
·c – conditional
SCRIPT STRUCTURE:
A script is a structure that describes a stereotyped sequence of events in a particular context. A script consists of a set of slots; associated with each slot may be some information about what kinds of values it may contain, as well as a default value to be used if no other information is available. Scripts are similar in structure to frames. The important components of a script are:
·Entry Conditions: conditions that must be satisfied before events in the script can occur.
·Results: conditions that will, in general, be true after the events described in the script have occurred.
·Props: slots representing objects that are involved in the events described in the script.
·Roles: persons involved in the events.
·Track: variations on the script. Different tracks of the same script will share many but not all components.
·Scenes: the actual sequences of events that occur. Events are represented in conceptual dependency formalism.
Example 1: Going to a Theatre
Example 2: Going to a Restaurant
Example 3: Robbery in a Bank
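The example scripts above are normally drawn as figures. As a condensed sketch, the restaurant script's components can be written out as a dictionary; the slot contents below are abbreviated and partly assumed:

```python
# A sketch of the restaurant script's main components.
restaurant_script = {
    "track": "coffee shop",
    "props": ["tables", "menu", "food", "money"],
    "roles": ["customer", "waiter", "cook", "cashier"],
    "entry_conditions": ["customer is hungry", "customer has money"],
    "results": ["customer has less money", "customer is not hungry"],
    "scenes": ["entering", "ordering", "eating", "paying", "exiting"],
}

# Scripts support prediction: once 'ordering' is observed,
# the remaining scenes are expected to follow in order.
current = restaurant_script["scenes"].index("ordering")
print(restaurant_script["scenes"][current + 1:])
```

The last two lines illustrate the first advantage listed below: a partial observation lets the script predict the events still to come.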
Advantages:
·Ability to predict events.
·A single coherent interpretation may be built up from a collection of observations.
Disadvantage:
·Less general than frames.
CYC THEORY:
CYC is a theory designed for describing world knowledge (commonsense knowledge) so as to be useful in AI applications. CYC is more comprehensive, whereas CD is a more specific theory for representing events. CYC was developed by Lenat & Guha for capturing commonsense knowledge. The CYC structure contains representations of events, objects, attitudes, space, time, motion etc., and tends to be huge. CYC contains large Knowledge Bases (KBs). Some of the reasons for large KBs are as follows:
1. Brittleness: Specialized knowledge bases are brittle; it is hard to encode new situations, and performance degrades. Commonsense-based knowledge bases should have a firmer foundation.
2. Form & Content: The knowledge representations seen so far may not be sufficient for AI applications whose main focus is comprehension. Commonsense strategies could point out where difficulties in content may affect the form, and temporarily focus on the content of KBs rather than on their form.
3. Shared Knowledge: Small knowledge-base systems should allow greater communication among themselves, with common bases and assumptions.
It is a huge task to build such a large KB. There are some methods & languages in AI which can be used for acquiring this knowledge automatically. CYC knowledge is encoded using a special frame-based language called CYCL. CYCL generalizes the notion of inheritance so that properties can be inherited along any link, rather than only “isa” and “instance” links. In addition to frames, CYCL contains a constraint language that allows the expression of arbitrary first-order logical expressions.
CASE GRAMMARS:
Case Grammar theory was proposed by the American Linguist Charles J. Fillmore in 1968 for representing linguistic knowledge that removed the strong distinction between syntactic and semantic knowledge of a language. He initially introduced 6 cases (called thematic cases or roles):
·AGENTIVE (Agent)
·OBJECTIVE (Object)
·INSTRUMENTAL (Instrument)
·DATIVE (which covers EXPERIENCER)
·FACTIVE (which covers the result of an action)
·LOCATIVE (Location of an action)
The ultimate goal of case grammar theory was to extract the deep meaning of sentences & express it in the form of the cases mentioned above. Different meanings lead to different case structures, but different syntactic structures with the same meaning map to a similar structure. For example, in the sentences The door was broken by John with a hammer, Using a hammer, John broke the door, and John broke the door with a hammer, the hammer (instrument), John (actor) and the door (object) play the same semantic roles in each sentence. Here the act is the “breaking of the door”, and all three sentences will have the same case frame.
The case frame contains semantic relations rather than syntactic ones. The semantic roles such as agent, action, object & instrument are extracted from the sentence directly & stored in a case frame, which represents the semantic form of a sentence. For example, the sentences John ate an apple and An apple was eaten by John will produce the same case frame. Some optional cases such as TIME, BENEFICIARY, FROM_LOC (Source), TO_LOC (Destination), CO_AGENT & TENSE are introduced to capture more surface knowledge.
·Agent (instigator of the action)
·Object (entity on which the action is performed; for example, in The door broke, door is the Object)
·Dative (animate entity affected by the action; for example, in John killed Mike, Mike fills the Dative case)
·Experiencer (animate subject in an active sentence with no agent; for example, in John cried and Mike laughs, John & Mike fill the Experiencer case)
·Beneficiary (animate entity that benefits from the action; for example, in I gave an apple to Mike, Mike is the Beneficiary case, whereas in I gave an apple to Mike for Mary, Mary is the Beneficiary case & Mike fills the Dative case)
·Location (place of the action; for example, in The man was killed in the garden, garden is the Location case. In John went to school from home, home is the Source_Loc case, whereas school is the Destination_Loc case)
·Instrument (entity used for performing the action; for example, in John ate an ice-cream with a spoon, spoon is the Instrument case)
·Co-Agent (when 2 people perform an action together, the second person fills the Co_Agent case; for example, in John and Mike lifted the box or John lifted the box with Mike, John fills the Agent case & Mike the Co_Agent case)
·Time (the time of the action; for example, in John went to market yesterday at 4 o’clock, 4 o’clock is the Time case)
·Tense (time of the event, i.e., present, past or future)
Let us generate a case frame for a sentence using case structure. The case frame for John gave an apple to Mike in the kitchen (or Mike was given an apple by John in the kitchen) is as follows:
Case Frame
Cases        | Values
Action       | Give
Agent        | John
Objective    | Apple
Beneficiary  | Mike
Time         | Past
Location     | Kitchen
Table: Sample Case Frame
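The case frame in the table can be built directly as a dictionary. Both surface forms (active "John gave..." and passive "Mike was given...") map to this single semantic structure:

```python
# The sample case frame as a dictionary: semantic roles, not syntax.
case_frame = {
    "action": "give",
    "agent": "John",          # instigator, whichever voice is used
    "objective": "apple",
    "beneficiary": "Mike",
    "time": "past",
    "location": "kitchen",
}
# The agent is John regardless of active or passive phrasing.
print(case_frame["agent"])
```

This is the payoff of case grammar: a question like "Who performed the action?" is answered by one lookup, with no need to parse the sentence's syntactic form again.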
SEMANTIC WEB:
The Semantic Web provides a common framework that allows data and knowledge to be shared and reused across application, enterprise, and community boundaries.
The development of the Semantic web is a collaborative effort led by W3C with participation from a large number of researchers and industrial partners.
It defines standards for exchanging knowledge and for sharing conceptualizations.
Basic standards:
• RDF - Resource Description Framework, representation of information/data for the purpose of sharing
– Based on XML - Extensible Markup Language format - a general purpose specification for building custom markup languages
• OWL – Web Ontology Language, a language for sharing vocabularies: sets of terms supporting web searches and other applications (built on top of RDF)
In terms of knowledge representation and reasoning, the Semantic Web lets us:
• Represent the knowledge
• Support search queries on knowledge and matches
• Support inference
Differences from other KR systems:
• Multiple sources of information and knowledge built for potentially different purposes
• Ambiguities may arise (the same term with two different meanings or two different terms with the same meaning)
• Dynamically changing environment – knowledge is added at fast pace so it should be robust to handle that
Benefits:
• knowledge integration,
• knowledge construction and storage
• knowledge searching
• knowledge inference.
Semantic web: knowledge integration:
The benefit of large amounts of information and knowledge on the web stands or falls on data/knowledge integration.
Technical challenges:
• Location: where the data/knowledge resides. The location of a Semantic Web resource is defined by the Uniform Resource Identifier (URI). A URI is simply a formatted string that identifies - via name, location, or any other characteristic - a resource. A standard web link is a form of a URI. URI allows us to label a Semantic Web source with a findable, unique location.
• Query Protocol: We need to interact with web resources. We need a communication language. The protocol for the Semantic Web uses standards such as http to form a flexible, easily understood, request/response exchange.
• Format: The data must be in a comprehensive and translatable format. The Semantic Web uses a standard format - the OWL Web Ontology Language. It is based on the Resource Description Framework (RDF) standard and Extensible Markup Language (XML).
Three steps of integration:
• Aggregation: – Combines the Semantic Web data sources into one unified, virtual data source.
• Mapping/Binding: – Associates similar references with each other and builds upon data in existing references. For example synonyms are identified.
• Rules: – Enables more sophisticated alignment and enrichment such as conditional logic that adds information based on the condition of other data
Knowledge construction and storage:
Knowledge is stored in databases:
• Originally one database was used by one or many applications
• Multiple databases can be used by multiple applications
Problem:
• Applications take advantage of pieces of knowledge stored in databases
• Each application has its own view so that the knowledge in the database is fragmented and relations are lost
Semantic web approach: Do not fragment data. Knowledge is built upon all data. A central term is enriched with relations, attributes, and constraints defining its context. Relations and attributes are themselves terms and define semantic links between objects (entities), instead of the plain links of HTML.
• Restrictions: define a term relative to other terms or limits. For example, an available contractor must have the constraint of availability. This adds to the useful vocabulary by adding a new term that is a condition of an existing term.
• Properties: data elements that describe another data element. A person has a property called "livesAt" populated with her home address. Similarly, the person may have an additional property called "worksAt" populated with her work address. A data type address is hence used to describe two data elements – work and home address.
• Collections: abstractions built by combining terms together. The concept referred to by a particular term may be related to several other terms simultaneously. For instance, the term "contractor" could describe a collection of things called "skilled manual laborers" and also things called "construction workers".
Knowledge construction and enrichment:
• Horizontal: – add new attributes and peer relationships. Examples include adding a birthday to person (new attribute) and adding a boss relationship between two workers (new peer relationships)
• Vertical: – via inheritance. Inheritance provides all the context of the base term plus whatever else we want to add. E.g. a person has a name, birthday, and sex. A worker is a type of person that also adds a workplace and boss. An update to person, such as adding a birthplace, automatically adds birthplace to all workers.
• Constraints: – new constraints can be introduced to further refine the context. For example parent is defined as a person with a daughter and/or a son. Constraints can get quite rich with logic such as tall person is a person with height greater than six feet.
• Distributed: – build on knowledge anywhere in your network. Knowledge and data that resides elsewhere are referenced uniquely with the help of its URI.
What knowledge can be expressed?
• Commonality: Declaring two data items equivalent simplifies data. This could occur at the structural ontology level in declaring person and contractor the same. This also extends to specific instances: you can declare Joe Smith at a given URI equal to J Smith at another URI. So simply declaring two items equal adds knowledge.
• OWL uses the "equivalentClass" keyword to establish a connection between two unique URIs, declaring them equal or synonyms.
• Inheritance: This adds knowledge by declaring that a data element is a form of another data element but not exactly equivalent; one element is a subset of the other. All people have names and addresses, but not all people are, e.g., managers. Thus we could declare a manager a type of person but still distinct from people in general. This clarifies a term without duplicating similar information.
• OWL: – Inheritance is implemented using subClassOf keyword
Knowledge inference:
Search identifies and returns answers that are explicitly represented in available knowledge resources.
Inference addresses questions for which the answer is not directly available and encoded.
• This is where KR&R experience helps
Problems:
• Synonyms (are these two things equivalent?)
• Ambiguities - different contexts for the same term may lead to different and contradictory answers
The Semantic Web enables you to have knowledge your own way. It does not force you to adopt someone else's view of data or knowledge.
This is accomplished using:
• Integration of knowledge and existing resources
• Construction of a new knowledge
• Support for inferences
Basically the options you have are:
• Start from scratch and build everything you need for your application
• Tap into available resources that you can tailor to your needs; you reuse not only the information, you can also reuse and integrate the semantics