Audio samples of linguistic-speech regularization


1. List of speech samples with filled pauses (FPs)

Method | Explanation of FP-included speech samples | Example
TrueFP | Synthesized from text with ground-truth FPs (i.e., the FPs actually used in the natural speech) | I explain uh a theory.
PredFP | Synthesized from text with predicted FPs | I uh explain a theory.
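
The two methods differ only in where the FP is placed in the input text. The following minimal sketch reproduces the two example sentences from a fluent base sentence. The helper `insert_filled_pause` and the word positions are hypothetical and for illustration only; they are not the FP prediction model or the synthesis pipeline used for the samples on this page.

```python
def insert_filled_pause(text, position, fp="uh"):
    """Insert the filled pause `fp` after the first `position` words of `text`."""
    words = text.split()
    return " ".join(words[:position] + [fp] + words[position:])


fluent = "I explain a theory."

# TrueFP: the position is taken from the natural utterance (assumed here: after word 2).
true_fp_text = insert_filled_pause(fluent, 2)

# PredFP: the position is chosen by some predictor (assumed here: after word 1).
pred_fp_text = insert_filled_pause(fluent, 1)

print(true_fp_text)  # I explain uh a theory.
print(pred_fp_text)  # I uh explain a theory.
```

Both strings would then be passed to the same synthesizer, so any difference heard between TrueFP and PredFP samples comes from the FP word and its position alone.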


2. List of models

Model | Explanation | α | β
Baseline | Trained without regularization | 0.0 | --
Proposed | Trained with regularization for probabilistically sampled FPs | 1.0 | 4.0


3. Audio samples

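Speaker | Utterance | Sample | Ground-truth (natural speech) | Baseline | Proposed
A | Sample1 | TrueFP | [audio] | [audio] | [audio]
A | Sample1 | PredFP | -- | [audio] | [audio]
A | Sample2 | TrueFP | [audio] | [audio] | [audio]
A | Sample2 | PredFP | -- | [audio] | [audio]
A | Sample3 | TrueFP | [audio] | [audio] | [audio]
A | Sample3 | PredFP | -- | [audio] | [audio]
B | Sample1 | TrueFP | [audio] | [audio] | [audio]
B | Sample1 | PredFP | -- | [audio] | [audio]
B | Sample2 | TrueFP | [audio] | [audio] | [audio]
B | Sample2 | PredFP | -- | [audio] | [audio]
B | Sample3 | TrueFP | [audio] | [audio] | [audio]
B | Sample3 | PredFP | -- | [audio] | [audio]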