Passive aggressive sliding window
A sequence tagging approach translated into a local classifier, trained with the Passive Aggressive margin-based online training algorithm.
The Passive Aggressive algorithm stays passive on correct classifications, leaving the model unchanged even when their score is low, but reacts "aggressively" to incorrect classifications, updating the model to correct the mistake.
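The behaviour described above can be sketched as a single update step. This is a minimal, mistake-driven illustration under stated assumptions, not the Platform implementation: the sparse dict representation, the `pa_update` name, the aggressiveness cap `C`, and the toy labels and features are all hypothetical. Note that the textbook Passive Aggressive formulation also updates on correct predictions whose margin is below 1; the sketch follows the description in this section instead.

```python
def dot(w, x):
    """Sparse dot product between a weight dict and a feature dict."""
    return sum(w.get(f, 0.0) * v for f, v in x.items())


def pa_update(weights, x, gold, C=1.0):
    """One mistake-driven Passive Aggressive (PA-I style) step.

    weights: dict mapping label -> {feature: weight}
    x: sparse feature dict; gold: the correct label.
    Returns the label predicted before the update.
    """
    scores = {label: dot(w, x) for label, w in weights.items()}
    predicted = max(scores, key=scores.get)
    if predicted == gold:
        return predicted  # passive: correct prediction, model unchanged
    sq_norm = sum(v * v for v in x.values())
    if sq_norm == 0.0:
        return predicted  # an empty feature vector gives nothing to update
    # Aggressive: step size that restores a margin of 1, capped by C (assumed).
    loss = 1.0 + scores[predicted] - scores[gold]
    tau = min(C, loss / (2.0 * sq_norm))
    for f, v in x.items():
        weights[gold][f] = weights[gold].get(f, 0.0) + tau * v
        weights[predicted][f] = weights[predicted].get(f, 0.0) - tau * v
    return predicted


# Hypothetical toy example: two labels, one token with two active features.
weights = {"OTHER": {}, "PER": {}}
x = {"word=alice": 1.0, "is_capitalized": 1.0}
first = pa_update(weights, x, "PER")   # wrong at first, so the model is updated
second = pa_update(weights, x, "PER")  # now correct, so nothing changes
```

After the first call the same example is classified correctly with a margin of 1, which is why the second call leaves the weights untouched.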
A specific non-balanced configuration of the algorithm is used, mainly due to the distribution of classes which tends to be highly unbalanced in the entity extraction use case. The "OTHER" class often dominates the distribution, and with a balanced configuration, the rare classes would tend to be predicted too often, generating many false positives.
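To see why a balanced configuration favours rare classes, the sketch below computes the common inverse-frequency class weights. It assumes that "balanced" means weighting each class by n_samples / (n_classes * class_count), a widely used heuristic; the exact scheme used in Platform is not stated here, and the label counts are made up for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * c) for label, c in counts.items()}


# Hypothetical distribution: 90% of tokens are OTHER, 10% are PER.
labels = ["OTHER"] * 90 + ["PER"] * 10
class_weights = balanced_class_weights(labels)
ratio = class_weights["PER"] / class_weights["OTHER"]
```

Under this assumption an update triggered by a PER example would weigh nine times more than one triggered by an OTHER example, which pushes the model toward over-predicting the rare class.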
As in the SVM case, past predictions are used as context features, adding a soft constraint to the local prediction.
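The sliding window with past predictions as context can be sketched as follows. This is an illustrative outline only: the window size, the history length, the feature naming scheme, and the `toy_predict` stand-in for the trained classifier are assumptions, not the actual feature set.

```python
def window_features(tokens, i, prev_labels, size=2, history=2):
    """Features for token i: surrounding words plus previously predicted labels."""
    feats = {"bias": 1.0}
    for offset in range(-size, size + 1):
        j = i + offset
        word = tokens[j].lower() if 0 <= j < len(tokens) else "<pad>"
        feats[f"w[{offset}]={word}"] = 1.0
    # Past predictions enter as ordinary features: a soft constraint, not a hard one.
    for k in range(1, history + 1):
        label = prev_labels[i - k] if i - k >= 0 else "<start>"
        feats[f"y[-{k}]={label}"] = 1.0
    return feats


def tag(tokens, predict):
    """Greedy left-to-right tagging; `predict` maps a feature dict to a label."""
    labels = []
    for i in range(len(tokens)):
        labels.append(predict(window_features(tokens, i, labels)))
    return labels


def toy_predict(feats):
    """Hypothetical stand-in for the trained local classifier."""
    if "w[0]=alice" in feats:
        return "PER"
    if "w[0]=smith" in feats and "y[-1]=PER" in feats:
        return "PER"
    return "OTHER"


tagged = tag(["Alice", "Smith", "spoke"], toy_predict)
```

In this toy run "Smith" is tagged as PER only because the previous prediction was PER, which is the kind of influence the context features are meant to provide.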
One of the advantages of this model is that it can be used in an online training pipeline. It can digest batches of data to perform partial training, so in Platform it's available for both Auto-ML and Online-ML experiments.
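Partial training on successive batches can be pictured with a small self-contained learner. The `OnlinePA` class, its `partial_fit` method, the parameter `C`, and the toy batches are hypothetical names chosen for this sketch; they do not describe the Platform API.

```python
class OnlinePA:
    """Tiny multiclass Passive Aggressive learner trained batch by batch."""

    def __init__(self, labels, C=1.0):
        self.C = C  # cap on the step size (assumed hyperparameter)
        self.weights = {label: {} for label in labels}

    def _score(self, label, x):
        w = self.weights[label]
        return sum(w.get(f, 0.0) * v for f, v in x.items())

    def predict(self, x):
        return max(self.weights, key=lambda label: self._score(label, x))

    def partial_fit(self, batch):
        """Update on one batch of (features, gold) pairs; returns the update count."""
        updates = 0
        for x, gold in batch:
            predicted = self.predict(x)
            sq_norm = sum(v * v for v in x.values())
            if predicted == gold or sq_norm == 0.0:
                continue  # passive: nothing to correct
            loss = 1.0 + self._score(predicted, x) - self._score(gold, x)
            tau = min(self.C, loss / (2.0 * sq_norm))
            for f, v in x.items():
                self.weights[gold][f] = self.weights[gold].get(f, 0.0) + tau * v
                self.weights[predicted][f] = self.weights[predicted].get(f, 0.0) - tau * v
            updates += 1
        return updates


# The model state persists between calls, so batches can arrive over time.
model = OnlinePA(["OTHER", "PER"])
batch_1 = [({"w=alice": 1.0}, "PER"), ({"w=the": 1.0}, "OTHER")]
batch_2 = [({"w=bob": 1.0}, "PER"), ({"w=alice": 1.0}, "PER")]
updates_1 = model.partial_fit(batch_1)
updates_2 = model.partial_fit(batch_2)
```

Because each call only adjusts the existing weights, nothing has to be retrained from scratch when a new batch arrives.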
It converges faster than SGD and therefore needs fewer epochs to train.