Comparative Analysis of Transformer-Based Models for Text-To-Speech Normalization