Naive Bayes Theorem
1 Conditional Probability - the probability of an event given that another event has already occurred
2 Independent Events - e.g. tossing a coin, where one toss does not affect the next
3 Dependent Events - as shown in the image below
P(A) -> probability that A has already been picked
P(B) -> probability of B
Conditional Probability
These are all dependent events.
P(B|A) -> probability of B given that A has already occurred: P(B|A) = P(A intersection B) / P(A)
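As a small worked example of P(B|A) = P(A intersection B) / P(A), here is a sketch that assumes a hypothetical bag of 3 red and 2 blue marbles drawn without replacement. The bag contents and variable names are illustrative assumptions, not taken from the original image.

```python
from fractions import Fraction

# Hypothetical bag: 3 red and 2 blue marbles, drawn without replacement.
red, blue = 3, 2
total = red + blue

# A: the first marble is red. B: the second marble is red.
p_a = Fraction(red, total)
p_a_and_b = Fraction(red, total) * Fraction(red - 1, total - 1)

# Conditional probability: P(B|A) = P(A ∩ B) / P(A)
p_b_given_a = p_a_and_b / p_a

print(p_b_given_a)  # 1/2
```

Once one red marble is gone, 2 of the 4 remaining marbles are red, which matches the computed value of 1/2.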
P(A) is the prior probability of A, i.e. the probability of the event before evidence is seen. The evidence is an attribute value of an unknown instance (here, it is event B).
P(A|B) is the posterior probability of A, i.e. the probability of the event after the evidence B is seen.
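For reference, Bayes' theorem ties the prior and the posterior together. It is written here in LaTeX, using the same symbols A and B as above:

```latex
\[
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
\]
```

Here P(B|A) is the likelihood of the evidence given A, and P(B) is the overall probability of the evidence.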
Let's see how the Naive Bayes algorithm works on a classification problem:
Let's say we have n features {x1, x2, x3, x4, ..., xn} and an output {y}.
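With n features and an output y, the usual way to write the classifier is to apply Bayes' theorem and then assume the features are conditionally independent given y (the "naive" step). This is the standard textbook form, shown as a sketch using amsmath's aligned environment:

```latex
\[
\begin{aligned}
P(y \mid x_1, \dots, x_n)
  &= \frac{P(y)\, P(x_1, \dots, x_n \mid y)}{P(x_1, \dots, x_n)} \\
  &\approx \frac{P(y) \prod_{i=1}^{n} P(x_i \mid y)}{P(x_1, \dots, x_n)} \\
\hat{y} &= \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)
\end{aligned}
\]
```

The denominator is the same for every class, so it can be dropped when only the most likely class is needed.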
Now let's see how Naive Bayes behaves on text data:
F1 The | F2 Food | F3 Delicious | F4 Bad | Output |
1      | 1       | 1            | 0      | 1      |
1      | 1       | 0            | 1      | 0      |
0      | 1       | 0            | 1      | 0      |
0      | 1       | 1            | 0      | 1      |
0      | 0       | 0            | 1      | 0      |
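The table can be turned into a tiny classifier by counting. The sketch below assumes each column is a binary feature (word present or absent) and estimates P(y) and P(x_i | y) directly from the five rows. The helper name `score` and the input `new_review` are illustrative assumptions.

```python
from fractions import Fraction

# Rows from the table: (The, Food, Delicious, Bad, Output)
rows = [
    (1, 1, 1, 0, 1),
    (1, 1, 0, 1, 0),
    (0, 1, 0, 1, 0),
    (0, 1, 1, 0, 1),
    (0, 0, 0, 1, 0),
]

def score(x, label):
    """Unnormalised P(y) * product of P(x_i | y), estimated by counting rows."""
    in_class = [r for r in rows if r[-1] == label]
    result = Fraction(len(in_class), len(rows))       # prior P(y)
    for i, value in enumerate(x):
        matches = sum(1 for r in in_class if r[i] == value)
        result *= Fraction(matches, len(in_class))    # likelihood P(x_i | y)
    return result

# Hypothetical new review containing "The", "Food", "Delicious" but not "Bad".
new_review = (1, 1, 1, 0)
scores = {label: score(new_review, label) for label in (0, 1)}
prediction = max(scores, key=scores.get)

print(scores[0], scores[1], prediction)  # 0 1/5 1
```

The score for class 0 is exactly zero because Delicious = 1 never occurs together with Output = 0 in the table, so one zero factor wipes out the whole product.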
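If a feature value never appears together with a class in the training data, its estimated conditional probability is zero, and the whole product for that class becomes zero. This is often known as the zero-frequency problem. To solve it, we can use a smoothing technique. One of the simplest smoothing techniques is called Laplace estimation (add-one smoothing).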
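A minimal sketch of Laplace estimation, assuming binary features as in the table: add a pseudo-count to every possible value before dividing. The function name `laplace_estimate` and the parameters `n_values` and `alpha` are illustrative choices.

```python
from fractions import Fraction

def laplace_estimate(count, class_total, n_values=2, alpha=1):
    """Smoothed P(x_i = v | y) = (count + alpha) / (class_total + alpha * n_values)."""
    return Fraction(count + alpha, class_total + alpha * n_values)

# From the table: Delicious = 1 appears in 0 of the 3 rows with Output = 0.
unsmoothed = Fraction(0, 3)
smoothed = laplace_estimate(0, 3)

print(unsmoothed, smoothed)  # 0 1/5
```

With alpha = 1 no estimate can be exactly zero, so a single unseen feature value no longer eliminates a class outright.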
The assumptions made by Naive Bayes are not generally correct in real-world situations. In fact, the independence assumption almost never holds exactly, but the classifier often works well in practice.
On the other hand, Naive Bayes is known to be a poor probability estimator, so its probability outputs should not be taken too seriously.
Naive Bayes can handle missing data. Attributes are handled separately by the algorithm at both model construction time and prediction time. As such, if a data instance has a missing value for an attribute, that value can be ignored while preparing the model, and ignored when a probability is calculated for a class value.
It does this by taking the possible outcomes
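The missing-value handling described above can be sketched as follows. This assumes missing values are marked with `None`, reuses the rows of the table with one hypothetical gap added for illustration, and applies add-one smoothing to the counts. None of these choices come from the original text.

```python
from fractions import Fraction

# Table rows (The, Food, Delicious, Bad, Output); None marks a missing value.
# The None in the second row is a hypothetical gap added for illustration.
rows = [
    (1, 1, 1, 0, 1),
    (1, None, 0, 1, 0),
    (0, 1, 0, 1, 0),
    (0, 1, 1, 0, 1),
    (0, 0, 0, 1, 0),
]

def score(x, label, alpha=1):
    """Unnormalised class score that skips missing attribute values."""
    in_class = [r for r in rows if r[-1] == label]
    result = Fraction(len(in_class), len(rows))            # prior P(y)
    for i, value in enumerate(x):
        if value is None:
            continue  # missing at prediction time: leave this attribute out
        # Missing at training time: count only rows where attribute i is known.
        known = [r for r in in_class if r[i] is not None]
        matches = sum(1 for r in known if r[i] == value)
        result *= Fraction(matches + alpha, len(known) + 2 * alpha)
    return result

query = (None, 1, None, 1)  # only "Food" and "Bad" are observed
scores = {label: score(query, label) for label in (0, 1)}
prediction = max(scores, key=scores.get)

print(scores[0], scores[1], prediction)  # 6/25 3/40 0
```

Because each attribute contributes its own independent factor, dropping a factor for a missing value leaves the rest of the calculation unchanged.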

