Introduction

The LearnOneRule procedure used in CN2 is normally implemented as a general-to-specific beam search. This is not the only possible search strategy, but it is reported to be the most efficient. The procedure generates complexes starting from the most general one, then greedily specializes them until the desired performance is reached.

Terminology

A complex is a conjunction of attribute=value specifications. It forms the condition part of a rule, as in "if condition then predict class". For the examples on the sequential covering page, all the most general complexes are:
Specializing a complex means conjoining the complex with one more attribute=value pair. For example:
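As a minimal sketch, a complex can be represented as a mapping from attributes to values, and specialization simply adds one more attribute=value test (the attribute names A and D here are illustrative, not from any particular dataset):

```python
# A complex as a dict of attribute=value tests (illustrative names).
complex_ = {"A": "A1"}

# Specializing: conjoin one more attribute=value pair not already tested.
specialized = {**complex_, "D": "D1"}

# specialized now represents the condition A=A1 AND D=D1
print(specialized)  # → {'A': 'A1', 'D': 'D1'}
```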
Algorithm
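The beam search described above can be sketched as follows. This is a hypothetical implementation, not the exact CN2 pseudocode: `evaluate` is assumed to be a caller-supplied scoring function (e.g. Laplace accuracy) where higher is better, and complexes are represented as dicts of attribute=value tests:

```python
def learn_one_rule(examples, attributes, beam_width, threshold, evaluate):
    """General-to-specific beam search for the best complex (sketch).

    attributes maps each attribute name to its list of possible values;
    evaluate(complex_, examples) scores a complex, higher is better.
    """
    beam = [dict()]   # start from the most general complex (empty condition)
    best = dict()
    while beam:
        # Specialize every complex in the beam with one more attribute=value pair
        candidates = []
        for complex_ in beam:
            for attr, values in attributes.items():
                if attr in complex_:
                    continue  # attribute already tested in this complex
                for value in values:
                    candidates.append({**complex_, attr: value})
        if not candidates:
            break
        # Keep only the beam_width best-scoring specializations
        candidates.sort(key=lambda c: evaluate(c, examples), reverse=True)
        beam = candidates[:beam_width]
        if evaluate(beam[0], examples) > evaluate(best, examples):
            best = beam[0]
        if evaluate(best, examples) >= threshold:
            break  # desired performance reached
    return best
```

A usage example: with a toy dataset of three examples over attributes A and B and a Laplace-accuracy evaluator, the search finds the complex A=A1 as soon as it exceeds the threshold.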
Visualization

Below is an illustrative figure of beam search. In the first step, the 2 best complexes are found, namely A=A1 and B=B2. Neither satisfies the threshold, so the next-level complexes are expanded and 2 best complexes are found again, e.g. A=A1 & D=D1 and B=B2 & A=A2. The procedure continues until a complex that satisfies the threshold is found.

There are two options for the evaluation function:

1. Entropy. This is the original option. The evaluation is based on the entropy of the set of examples covered by the complex. For example, consider a hypothesis covering 8 positive and 2 negative examples: p1 = P(positive) = 8/(8+2) = 0.8; p2 = P(negative) = 2/(8+2) = 0.2; Entropy = -p1 * log2(p1) - p2 * log2(p2) ≈ 0.72.

2. Laplace accuracy. Suppose that a complex covers y positive and n negative examples. Then its accuracy for positive-class prediction is estimated as (y + 1) / (y + n + 2). The advantage of the Laplace accuracy function over entropy shows when comparing two hypotheses: one covering 100 positive and 1 negative example, the other covering 3 positive and 0 negative. Using entropy, the latter (with entropy = 0) is considered better. However, the Laplace accuracy of (100+, 1-) is 101/103 ≈ 98%, while that of (3+, 0-) is 4/5 = 80%, so the former is judged better. Using Laplace accuracy, a complex's performance is evaluated as that of a hypothesis predicting the majority class among the examples it covers.
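The two evaluation functions above can be sketched directly from their definitions (the function names are mine; the formulas follow the worked examples in the text):

```python
import math

def entropy(p, n):
    """Entropy (base 2) of a set with p positive and n negative examples."""
    total = p + n
    result = 0.0
    for count in (p, n):
        if count > 0:  # by convention, 0 * log(0) contributes nothing
            frac = count / total
            result -= frac * math.log2(frac)
    return result

def laplace_accuracy(y, n, num_classes=2):
    """Laplace-corrected accuracy estimate for a complex covering
    y positive and n negative examples."""
    return (y + 1) / (y + n + num_classes)

print(round(entropy(8, 2), 2))          # → 0.72
print(round(laplace_accuracy(100, 1), 2))  # → 0.98
print(laplace_accuracy(3, 0))           # → 0.8
```

Note that entropy(3, 0) = 0, the best possible score under entropy, while Laplace accuracy still penalizes the small coverage of (3+, 0-).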