Multinomial Naive Bayes

Multinomial Naive Bayes is one of the two classic Naive Bayes variants used in text classification.

The smoothed probability P(x_i | y) of feature i appearing in a sample belonging to class y is estimated as:

P(x_i | y) = (Ny_i + alpha) / (Ny + alpha * n)


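  • Ny_i = Sum(x_i) is the number of times feature i appears in the samples of class y in the training set.
  • Ny = Sum(Ny_i) is the total count of all features for class y.
  • n is the number of features (the vocabulary size in text classification).
  • alpha is the smoothing parameter; alpha = 1 is Laplace smoothing and 0 < alpha < 1 is Lidstone smoothing. With alpha > 0, a feature never seen in class y still gets a non-zero probability.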
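
The estimate above can be sketched in a few lines of plain Python. This is only an illustration: the function name smoothed_likelihoods and the toy counts are hypothetical, and the input is assumed to be a list holding Ny_i for each feature of a single class.

```python
def smoothed_likelihoods(class_counts, alpha=1.0):
    """Return P(x_i | y) for every feature i of one class.

    class_counts[i] is Ny_i, the count of feature i over all
    training samples of class y (hypothetical input layout).
    """
    n = len(class_counts)       # number of features
    total = sum(class_counts)   # Ny, total feature count for the class
    return [(c + alpha) / (total + alpha * n) for c in class_counts]


# Hypothetical counts for a 3-word vocabulary in one class.
counts = [3, 0, 1]
probs = smoothed_likelihoods(counts, alpha=1.0)
print(probs)
```

With these counts, Ny = 4 and n = 3, so the three estimates are 4/7, 1/7 and 2/7. The second feature never occurs in the class yet keeps a non-zero probability, and the estimates still sum to 1.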

Training is typically very fast, since it amounts to counting features per class, and the classifier can give reasonably good predictions when:

  • The training set is relatively small (dozens of samples per class).
  • Training and test data are well balanced (the different classes are roughly equally represented in the data distribution).
  • The dataset consists mainly of documents of similar length.
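
To show why training is cheap, here is a minimal from-scratch sketch of fitting and predicting with count vectors. It is an illustration under stated assumptions, not a reference implementation: the names train and predict and the toy data are hypothetical, and the class prior is taken as the empirical class frequency, which the text above does not specify. Log probabilities are used to avoid numeric underflow on long documents.

```python
import math
from collections import defaultdict


def train(samples, labels, alpha=1.0):
    """Fit log priors and smoothed log likelihoods from count vectors."""
    n = len(samples[0])                            # number of features
    feature_counts = defaultdict(lambda: [0] * n)  # Ny_i for each class
    class_counts = defaultdict(int)                # samples per class
    for x, y in zip(samples, labels):
        class_counts[y] += 1
        for i, value in enumerate(x):
            feature_counts[y][i] += value
    model = {}
    for y, counts in feature_counts.items():
        total = sum(counts)                        # Ny
        # Assumption: prior = fraction of training samples in class y.
        log_prior = math.log(class_counts[y] / len(samples))
        log_lik = [math.log((c + alpha) / (total + alpha * n)) for c in counts]
        model[y] = (log_prior, log_lik)
    return model


def predict(model, x):
    """Return the class with the highest log posterior score for x."""
    def score(y):
        log_prior, log_lik = model[y]
        return log_prior + sum(v * l for v, l in zip(x, log_lik))
    return max(model, key=score)


# Hypothetical toy data: columns count the words "offer", "meeting", "free".
X = [[2, 0, 1], [1, 0, 2], [0, 3, 0], [0, 2, 1]]
y = ["spam", "spam", "ham", "ham"]
model = train(X, y)
prediction = predict(model, [1, 0, 1])
print(prediction)
```

Training is a single pass over the data that only accumulates counts, which is why it scales well. Note that each feature count multiplies its log likelihood in the score, so much longer documents produce more extreme scores; this is one reason similar document lengths help.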