Saturday, October 8, 2016

Instance-Based Methods

- Instance-based learning:
  - Store training examples and delay processing ("lazy evaluation") until a new instance must be classified
- Typical approaches:
  - k-nearest neighbor approach
    - Instances represented as points in a Euclidean space
  - Locally weighted regression
    - Constructs a local approximation to the target function (a sketch follows this list)
  - Case-based reasoning
    - Uses symbolic representations and knowledge-based inference
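
Locally weighted regression is only named above, so here is a minimal sketch of the idea, assuming a Gaussian kernel and a linear local model (the function name, the bandwidth tau, and the test data are illustrative choices, not from the slides): at each query point, fit a linear model in which training examples are weighted by their distance to the query.

```python
import numpy as np

def locally_weighted_regression(X, y, x_query, tau=1.0):
    """Fit a linear model around x_query, weighting each training
    example by a Gaussian kernel of its distance to the query."""
    m = X.shape[0]
    Xb = np.hstack([np.ones((m, 1)), X])                 # add intercept column
    xq = np.concatenate([[1.0], np.atleast_1d(x_query)])
    d2 = np.sum((X - x_query) ** 2, axis=1)              # squared distances
    W = np.diag(np.exp(-d2 / (2 * tau ** 2)))            # Gaussian weights
    # Weighted normal equations: (Xb^T W Xb) beta = Xb^T W y
    beta = np.linalg.solve(Xb.T @ W @ Xb, Xb.T @ W @ y)
    return xq @ beta

# Example: noisy sine data, local linear prediction at x = 1.0
rng = np.random.default_rng(0)
X = rng.uniform(0, 2 * np.pi, size=(100, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, 100)
print(locally_weighted_regression(X, y, np.array([1.0]), tau=0.5))
```

Because the fit is redone for every query, the global approximation is implicit: it is whatever the collection of local fits produces, which is exactly the lazy-evaluation trade-off discussed later in this post.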
The k-Nearest Neighbor Algorithm
- All instances correspond to points in the n-dimensional space
- The nearest neighbors are defined in terms of Euclidean distance
- The target function may be discrete- or real-valued
- For discrete-valued targets, k-NN returns the most common value among the k training examples nearest to the query point xq (see the sketch below)
- Voronoi diagram: the decision surface induced by 1-NN for a typical set of training examples
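
As a concrete illustration of the discrete-valued case just described, here is a minimal Python sketch (the function name and toy data are made up for the example):

```python
import numpy as np
from collections import Counter

def knn_classify(X_train, y_train, x_q, k=3):
    """Return the most common label among the k training points
    nearest to the query x_q, using Euclidean distance."""
    dists = np.linalg.norm(X_train - x_q, axis=1)   # distance to every point
    nearest = np.argsort(dists)[:k]                 # indices of k closest
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]

X = np.array([[1, 1], [1, 2], [4, 4], [5, 4]])
y = np.array(["a", "a", "b", "b"])
print(knn_classify(X, y, np.array([1.5, 1.5]), k=3))  # -> "a"
```

Note that all the computation happens at query time; "training" is just storing X and y, which is the lazy evaluation mentioned above.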
Discussion on the k-NN Algorithm
- The k-NN algorithm for continuous-valued target functions
  - Calculate the mean value of the k nearest neighbors
- Distance-weighted nearest neighbor algorithm
  - Weight the contribution of each of the k neighbors according to its distance from the query point xq, giving greater weight to closer neighbors
  - The same idea applies to real-valued target functions (a sketch of both cases follows this list)
- Robust to noisy data, since predictions average over the k nearest neighbors
- Curse of dimensionality: the distance between neighbors can be dominated by irrelevant attributes
  - To overcome it, stretch the axes or eliminate the least relevant attributes
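
The sketch below covers both variants from this slide, assuming the common 1/d² weighting scheme (the exact weighting and the shortcut taken when the query coincides with a training point are illustrative choices):

```python
import numpy as np
from collections import defaultdict

def weighted_knn(X_train, y_train, x_q, k=5, classify=True):
    """Distance-weighted k-NN: each of the k nearest neighbors
    contributes with weight 1/d^2, so closer neighbors count more.
    classify=True  -> weighted majority vote (discrete targets)
    classify=False -> weighted mean          (real-valued targets)"""
    dists = np.linalg.norm(X_train - x_q, axis=1)
    nearest = np.argsort(dists)[:k]
    if dists[nearest[0]] == 0:        # query coincides with a training point
        return y_train[nearest[0]]
    weights = 1.0 / dists[nearest] ** 2
    if classify:
        votes = defaultdict(float)
        for i, w in zip(nearest, weights):
            votes[y_train[i]] += w
        return max(votes, key=votes.get)
    return np.dot(weights, y_train[nearest]) / weights.sum()
```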
Case-Based Reasoning
- Also uses lazy evaluation and analysis of similar instances
- Difference: instances are not "points in a Euclidean space"
- Example: the water faucet design problem in CADET (Sycara et al., 1992)
- Methodology
  - Instances are represented by rich symbolic descriptions (e.g., function graphs)
  - Multiple retrieved cases may be combined
  - Tight coupling between case retrieval, knowledge-based reasoning, and problem solving
- Research issues
  - Indexing based on syntactic similarity measures and, when that fails, backtracking and adapting to additional cases
Remarks on Lazy vs. Eager Learning
- Instance-based learning: lazy evaluation
- Decision-tree and Bayesian classification: eager evaluation
- Key differences
  - A lazy method may consider the query instance xq when deciding how to generalize beyond the training data D
  - An eager method cannot, since it has already committed to a global approximation by the time it sees the query
- Efficiency: lazy methods spend less time training but more time predicting
- Accuracy
  - Lazy: effectively uses a richer hypothesis space, since many local linear functions form an implicit global approximation to the target function
  - Eager: must commit to a single hypothesis that covers the entire instance space
Genetic Algorithms
- GA: based on an analogy to biological evolution
- Each rule is represented by a string of bits
- An initial population is created, consisting of randomly generated rules
  - e.g., IF A1 AND NOT A2 THEN C2 can be encoded as 100 (the two leftmost bits encode attributes A1 and A2; the rightmost bit encodes the class)
- Based on the notion of survival of the fittest, a new population is formed consisting of the fittest rules and their offspring
- The fitness of a rule is measured by its classification accuracy on a set of training examples
- Offspring are generated by crossover and mutation (a toy sketch follows this list)
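
The toy sketch below assembles these pieces; the top-half selection scheme, single-point crossover, mutation rate, and the stand-in fitness function are all illustrative assumptions rather than fixed parts of the method:

```python
import random

def evolve(population, fitness, generations=50, p_mut=0.01):
    """Toy GA over bit-string rules. `fitness` stands in for a rule's
    classification accuracy on a set of training examples."""
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: len(ranked) // 2]        # survival of the fittest
        offspring = []
        while len(offspring) < len(population):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, len(a))       # single-point crossover
            child = a[:cut] + b[cut:]
            child = "".join(bit if random.random() > p_mut    # bit-flip mutation
                            else str(1 - int(bit)) for bit in child)
            offspring.append(child)
        population = offspring
    return max(population, key=fitness)

# Toy run: fitness is the fraction of bits matching a hidden target string
target = "10110"
fit = lambda r: sum(x == t for x, t in zip(r, target)) / len(target)
pop = ["".join(random.choice("01") for _ in range(5)) for _ in range(20)]
print(evolve(pop, fit))   # converges toward "10110"
```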
Rough Set Approach
- Rough sets are used to approximately, or "roughly," define equivalence classes
- A rough set for a given class C is approximated by two sets: a lower approximation (instances certain to belong to C) and an upper approximation (instances that cannot be described as not belonging to C); a small sketch follows this list
- Finding the minimal subsets (reducts) of attributes for feature reduction is NP-hard, but a discernibility matrix can be used to reduce the computational effort
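
As a small illustration of the two approximations (the data layout and function name are assumptions for the example): group the samples into equivalence classes of indiscernible attribute values, then test each class against the target concept.

```python
from collections import defaultdict

def approximations(samples, attrs, target_class):
    """samples: list of (attribute_tuple, label) pairs; attrs: positions
    that define the indiscernibility relation."""
    blocks = defaultdict(list)               # equivalence classes
    for values, label in samples:
        blocks[tuple(values[i] for i in attrs)].append(label)
    lower, upper = set(), set()
    for key, labels in blocks.items():
        if all(l == target_class for l in labels):
            lower.add(key)                   # certainly in the class
        if any(l == target_class for l in labels):
            upper.add(key)                   # cannot be ruled out
    return lower, upper

samples = [(("young", "yes"), "C"), (("young", "yes"), "not-C"),
           (("old", "no"), "C")]
print(approximations(samples, attrs=[0, 1], target_class="C"))
# lower: {('old', 'no')}   upper: {('old', 'no'), ('young', 'yes')}
```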
Fuzzy Set Approaches
- Fuzzy logic uses truth values between 0.0 and 1.0 to represent the degree of membership (often depicted with a fuzzy membership graph)
- Attribute values are converted to fuzzy values
  - e.g., income is mapped into the discrete categories {low, medium, high}, with fuzzy membership degrees calculated for each (see the sketch after this list)
- For a given new sample, more than one fuzzy value may apply
- Each applicable rule contributes a vote for membership in the categories
- Typically, the truth values for each predicted category are summed
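
A minimal sketch of the income example, assuming trapezoidal membership functions with made-up breakpoints:

```python
def trapezoid(x, a, b, c, d):
    """Membership that rises on [a, b], is 1.0 on [b, c], falls on [c, d]."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

INF = float("inf")

# Illustrative fuzzy sets for income; the breakpoints are assumptions
income_sets = {
    "low":    lambda x: trapezoid(x, -INF, -INF, 20_000, 40_000),
    "medium": lambda x: trapezoid(x, 20_000, 40_000, 60_000, 80_000),
    "high":   lambda x: trapezoid(x, 60_000, 80_000, INF, INF),
}

income = 30_000
print({name: f(income) for name, f in income_sets.items()})
# -> {'low': 0.5, 'medium': 0.5, 'high': 0.0}
```

Here an income of 30,000 belongs to both low and medium with degree 0.5 each, showing how more than one fuzzy value can apply; each rule that fires would then contribute its degree as a vote, and the degrees for each predicted category are summed.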
