# Byte Pair Encoding
BPE tokenization takes the ordered list of merges learned during vocabulary construction and applies them to new text in the same order in which they were learned.
![[BPE Algorithm.png]]
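As a rough illustration, here is a minimal Python sketch of the application step: split a word into characters, then replay the learned merges in order. The merge list, the example word, and the greedy left-to-right scan are illustrative assumptions; real tokenizers add details such as byte-level pre-tokenization and end-of-word markers.

```python
def bpe_tokenize(word: str, merges: list[tuple[str, str]]) -> list[str]:
    """Split a word into characters, then apply merges in training order."""
    tokens = list(word)
    for left, right in merges:  # same order as during vocabulary construction
        i = 0
        while i < len(tokens) - 1:
            if tokens[i] == left and tokens[i + 1] == right:
                # Replace the adjacent pair with the merged token.
                tokens[i:i + 2] = [left + right]
            else:
                i += 1
    return tokens

# Hypothetical merges learned from a corpus where "low" was frequent.
merges = [("l", "o"), ("lo", "w"), ("e", "r")]
print(bpe_tokenize("lower", merges))  # ['low', 'er']
```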
## WordPiece
[[BERT]]'s vocabulary is constructed with the WordPiece algorithm, a close variant of BPE. Instead of merging the most frequent token bigram, however, each candidate merge is scored by how much it improves the likelihood of an n-gram language model trained on a version of the corpus incorporating that merge.
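A common way to state that criterion is the unigram approximation score(a, b) = count(ab) / (count(a) · count(b)), which favors pairs whose merged form is frequent relative to its parts. The sketch below is a toy illustration of this scoring rule; the corpus and names are assumptions, not BERT's actual training setup.

```python
from collections import Counter

# Toy corpus, already split into symbols (illustrative, not BERT's data).
corpus = [list("hugging"), list("hug"), list("bug")]

symbol_counts: Counter = Counter()
pair_counts: Counter = Counter()
for word in corpus:
    symbol_counts.update(word)
    pair_counts.update(zip(word, word[1:]))

def wordpiece_score(pair: tuple[str, str]) -> float:
    """Unigram approximation of the likelihood gain from merging `pair`."""
    a, b = pair
    return pair_counts[pair] / (symbol_counts[a] * symbol_counts[b])

# BPE would merge the most frequent pair, ('u', 'g'); WordPiece's score
# instead prefers ('i', 'n'), whose parts occur nowhere else.
best = max(pair_counts, key=wordpiece_score)
print(best, wordpiece_score(best))
```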
---
## References
1. Bostrom, K., & Durrett, G. (2020). Byte Pair Encoding is Suboptimal for Language Model Pretraining. https://arxiv.org/abs/2004.03720