Method | Explanation of FP-included speech samples | Example |
---|---|---|
TrueFP | Synthesized from text with ground-truth FPs (i.e., the FPs the speaker actually used) | I explain uh a theory. |
PredFP | Synthesized from text with predicted FPs | I uh explain a theory. |
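
The two sample types differ only in where the FP sits in the input text. The following is a minimal sketch of that difference, assuming a simple word-level representation. `insert_fp` and `render` are hypothetical helpers written for illustration, not part of the system that produced these samples, and the positions are hard-coded here rather than read from annotations or produced by a prediction model.

```python
def insert_fp(words, position, fp="uh"):
    """Return a copy of `words` with the filled pause `fp` inserted before index `position`."""
    if not 0 <= position <= len(words):
        raise ValueError("position out of range")
    return words[:position] + [fp] + words[position:]


def render(words):
    """Join words into a sentence string ending with a period."""
    return " ".join(words) + "."


words = ["I", "explain", "a", "theory"]

# TrueFP: position taken from what the speaker actually said (hard-coded assumption).
true_fp_text = render(insert_fp(words, 2))

# PredFP: position that a prediction model might choose (hard-coded assumption).
pred_fp_text = render(insert_fp(words, 1))

print(true_fp_text)  # I explain uh a theory.
print(pred_fp_text)  # I uh explain a theory.
```

Both strings would then be passed to the same synthesizer, so any audible difference between a TrueFP and a PredFP sample comes from the FP placement alone.
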

Model | Explanation | α | β |
---|---|---|---|
Baseline | Trained without regularization | 0.0 | -- |
Proposed | Trained with regularization for probabilistically sampled FPs | 1.0 | 4.0 |
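
Speaker | Utterance | Sample | Ground-truth (Natural speech) | Baseline | Proposed |
---|---|---|---|---|---|
A | Sample1 | TrueFP | (audio) | (audio) | (audio) |
A | Sample1 | PredFP | -- | (audio) | (audio) |
A | Sample2 | TrueFP | (audio) | (audio) | (audio) |
A | Sample2 | PredFP | -- | (audio) | (audio) |
A | Sample3 | TrueFP | (audio) | (audio) | (audio) |
A | Sample3 | PredFP | -- | (audio) | (audio) |
B | Sample1 | TrueFP | (audio) | (audio) | (audio) |
B | Sample1 | PredFP | -- | (audio) | (audio) |
B | Sample2 | TrueFP | (audio) | (audio) | (audio) |
B | Sample2 | PredFP | -- | (audio) | (audio) |
B | Sample3 | TrueFP | (audio) | (audio) | (audio) |
B | Sample3 | PredFP | -- | (audio) | (audio) |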