Instance-Based Methods

- Instance-based learning:
  - Store the training examples and delay the processing ("lazy evaluation") until a new instance must be classified
- Typical approaches
  - k-nearest neighbor approach
    - Instances represented as points in a Euclidean space
  - Locally weighted regression
    - Constructs a local approximation of the target function
  - Case-based reasoning
    - Uses symbolic representations and knowledge-based inference
The k-Nearest Neighbor Algorithm

- All instances correspond to points in the n-dimensional space
- The nearest neighbors are defined in terms of Euclidean distance
- The target function can be discrete- or real-valued
- For discrete-valued target functions, k-NN returns the most common value among the k training examples nearest to xq (see the sketch below)
- Voronoi diagram: the decision surface induced by 1-NN for a typical set of training examples
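A minimal sketch of the discrete-valued case in Python, assuming numeric feature vectors and simple majority voting; the function name knn_classify and the toy data are illustrative, not from the original notes:

    import numpy as np
    from collections import Counter

    def knn_classify(X_train, y_train, x_q, k=3):
        """Classify x_q by majority vote among its k nearest neighbors."""
        # Euclidean distance from the query point to every training instance
        dists = np.linalg.norm(X_train - x_q, axis=1)
        # Indices of the k closest training examples
        nearest = np.argsort(dists)[:k]
        # Return the most common class label among them
        return Counter(y_train[i] for i in nearest).most_common(1)[0][0]

    # Two classes in 2-D space
    X = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.5, 4.8]])
    y = np.array(["A", "A", "B", "B"])
    print(knn_classify(X, y, np.array([1.1, 1.0]), k=3))  # -> "A"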
Discussion on the k-NN Algorithm

- The k-NN algorithm for continuous-valued target functions
  - Calculate the mean value of the k nearest neighbors
- Distance-weighted nearest neighbor algorithm (sketched below)
  - Weight the contribution of each of the k neighbors according to its distance to the query point xq
    - giving greater weight to closer neighbors
  - Applies similarly to real-valued target functions
- Robust to noisy data by averaging the k nearest neighbors
- Curse of dimensionality: the distance between neighbors can be dominated by irrelevant attributes
  - To overcome it, stretch the axes or eliminate the least relevant attributes
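A sketch of distance-weighted k-NN for a real-valued target, using inverse-square weights (one common weighting choice; the scheme and toy data here are assumptions):

    import numpy as np

    def knn_regress_weighted(X_train, y_train, x_q, k=3, eps=1e-8):
        """Predict the distance-weighted mean of the k nearest neighbors,
        with weight w_i = 1 / d(x_q, x_i)^2 so closer neighbors count more."""
        dists = np.linalg.norm(X_train - x_q, axis=1)
        nearest = np.argsort(dists)[:k]
        # eps avoids division by zero when x_q coincides with a training point
        w = 1.0 / (dists[nearest] ** 2 + eps)
        return np.dot(w, y_train[nearest]) / w.sum()

    # Noisy samples of f(x) = 2x
    X = np.array([[0.0], [1.0], [2.0], [3.0]])
    y = np.array([0.1, 1.9, 4.2, 5.8])
    print(knn_regress_weighted(X, y, np.array([1.5]), k=2))  # close to 3.0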
Case-Based Reasoning

- Also uses lazy evaluation and analysis of similar instances
- Difference: instances are not "points in a Euclidean space"
- Example: the water faucet design problem in CADET (Sycara et al., 1992)
- Methodology
  - Instances represented by rich symbolic descriptions (e.g., function graphs)
  - Multiple retrieved cases may be combined
  - Tight coupling between case retrieval, knowledge-based reasoning, and problem solving
- Research issues
  - Indexing based on a syntactic similarity measure and, when retrieval fails, backtracking and adapting to additional cases (see the sketch below)
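To make the contrast with Euclidean k-NN concrete, here is a toy retrieval step using Jaccard similarity over symbolic feature sets. This is a stand-in: CADET actually matches function graphs, and the case base below is invented for illustration:

    def jaccard(a, b):
        """Syntactic similarity between two cases described as feature sets."""
        return len(a & b) / len(a | b)

    def retrieve(case_base, query, threshold=0.5):
        """Return stored cases similar to the query; a full CBR system
        would then combine and adapt the retrieved cases."""
        scored = [(jaccard(feats, query), name)
                  for name, feats in case_base.items()]
        return sorted([s for s in scored if s[0] >= threshold], reverse=True)

    case_base = {
        "faucet-A": {"inlet", "outlet", "mixing-valve", "temp-control"},
        "pump-B":   {"inlet", "outlet", "impeller"},
    }
    print(retrieve(case_base, {"inlet", "outlet", "temp-control"}))
    # faucet-A ranks highest (similarity 0.75)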
Remarks on Lazy vs. Eager Learning

- Instance-based learning: lazy evaluation
- Decision-tree induction and Bayesian classification: eager evaluation
- Key differences
  - A lazy method may consider the query instance xq when deciding how to generalize beyond the training data D
  - An eager method cannot, since it has already committed to a global approximation by the time it sees the query
- Efficiency: lazy methods spend less time training but more time predicting
- Accuracy
  - A lazy method effectively uses a richer hypothesis space, since it uses many local linear functions to form its implicit global approximation to the target function
  - An eager method must commit to a single hypothesis that covers the entire instance space
Genetic Algorithms

- GA: based on an analogy to biological evolution
- Each rule is represented by a string of bits
- An initial population is created, consisting of randomly generated rules
  - e.g., IF A1 AND NOT A2 THEN C2 can be encoded as the bit string 100 (the two leftmost bits encode attributes A1 and A2; the rightmost bit encodes the class)
- Based on the notion of survival of the fittest, a new population is formed to consist of the fittest rules and their offspring
- The fitness of a rule is measured by its classification accuracy on a set of training examples
- Offspring are generated by crossover and mutation (see the sketch below)
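A generic GA loop under stated assumptions: fixed-length 8-bit rules, top-half survivor selection, single-point crossover, and a stand-in fitness (count of 1-bits) where a real system would score classification accuracy on training examples:

    import random

    RULE_LEN, POP_SIZE, GENERATIONS, MUTATION_RATE = 8, 20, 50, 0.01

    def fitness(rule):
        # Stand-in for classification accuracy: maximize the number of 1-bits
        return sum(rule)

    def crossover(a, b):
        # Single-point crossover: splice two parent rules together
        point = random.randint(1, RULE_LEN - 1)
        return a[:point] + b[point:]

    def mutate(rule):
        # Flip each bit independently with a small probability
        return [bit ^ 1 if random.random() < MUTATION_RATE else bit
                for bit in rule]

    population = [[random.randint(0, 1) for _ in range(RULE_LEN)]
                  for _ in range(POP_SIZE)]

    for _ in range(GENERATIONS):
        # Survival of the fittest: the top half become parents
        population.sort(key=fitness, reverse=True)
        parents = population[:POP_SIZE // 2]
        # Refill the population with mutated offspring of random parents
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(POP_SIZE - len(parents))]
        population = parents + children

    print(max(population, key=fitness))  # fittest rule found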
Rough Set Approach

- Rough sets are used to approximately, or "roughly," define equivalence classes
- A rough set for a given class C is approximated by two sets: a lower approximation (objects certain to be in C) and an upper approximation (objects that cannot be described as not belonging to C); see the sketch below
- Finding the minimal subsets (reducts) of attributes for feature reduction is NP-hard, but a discernibility matrix can be used to reduce the computational effort
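Given a partition of the objects into equivalence classes (objects indiscernible on the chosen attributes), the two approximations can be computed directly; the partition and target set below are invented for illustration:

    def rough_approximations(equiv_classes, C):
        """Lower/upper approximation of target set C from a partition
        of the universe into equivalence classes."""
        lower, upper = set(), set()
        for block in equiv_classes:
            if block <= C:   # block lies entirely inside C: certainly in C
                lower |= block
            if block & C:    # block overlaps C: possibly in C
                upper |= block
        return lower, upper

    classes = [{1, 2}, {3, 4}, {5, 6}]  # indiscernible object groups
    print(rough_approximations(classes, C={1, 2, 3}))
    # lower = {1, 2}; upper = {1, 2, 3, 4}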
Fuzzy Set Approaches

- Fuzzy logic uses truth values between 0.0 and 1.0 to represent the degree of membership (e.g., via a fuzzy membership graph)
- Attribute values are converted to fuzzy values
  - e.g., income is mapped into the discrete categories {low, medium, high}, with a fuzzy membership value calculated for each (see the sketch below)
- For a given new sample, more than one fuzzy value may apply
- Each applicable rule contributes a vote for membership in the categories
- Typically, the truth values for each predicted category are summed
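A sketch of the fuzzification step using triangular membership functions; the category breakpoints (in thousands of dollars) are invented for illustration:

    def triangular(x, a, b, c):
        """Membership rises from 0 at a to 1 at b, falls back to 0 at c."""
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

    def fuzzify_income(income):
        # Overlapping categories, so one income can belong to several
        return {
            "low":    triangular(income, -1, 0, 40),
            "medium": triangular(income, 20, 50, 80),
            "high":   triangular(income, 60, 100, 201),
        }

    print(fuzzify_income(30))
    # {'low': 0.25, 'medium': 0.33..., 'high': 0.0} -- both low and
    # medium apply, matching "more than one fuzzy value may apply"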