Anonymous submission to ISCA SSW13 (2025)
The process of refining speaking style over a series of sessions is called the "direction-cycle". The following table summarizes the inputs and outputs of the two compared methods, Actor-Guided (oracle) and Iterative (ours).
Conditions | Session 1 input: speech prompt | Session 1 input: Direction 1 | Session 1 output: Speech 1 | Session 2 input: speech prompt | Session 2 input: Direction 2 | Session 2 output: Speech 2 | Session 3 input: speech prompt | Session 3 input: Direction 3 | Session 3 output: Speech 3 |
---|---|---|---|---|---|---|---|---|---|
Actor-Guided (oracle) | Recorded-1 | ✗ | Guided-1 | Recorded-2 | ✗ | Guided-2 | Recorded-3 | ✗ | Guided-3 |
Iterative (ours) | Recorded-0 | ✓ | Iterative-1 | Iterative-1 | ✓ | Iterative-2 | Iterative-2 | ✓ | Iterative-3 |
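The data flow in the table can be sketched in a few lines of Python. This is only an illustration of which inputs feed which session, not the system's implementation: `label_synth`, `actor_guided`, and `iterative` are hypothetical names, and the synthesizer is a placeholder that returns a text label instead of audio.

```python
from typing import Callable, List, Optional

# Hypothetical synthesizer signature: (speech_prompt, direction_or_None) -> speech.
Synth = Callable[[str, Optional[str]], str]


def label_synth(prompt: str, direction: Optional[str]) -> str:
    """Placeholder for a real TTS model: returns a label describing its inputs."""
    return f"synth(prompt={prompt}, direction={direction})"


def actor_guided(recordings: List[str], synth: Synth = label_synth) -> List[str]:
    """Session k is prompted with the recording Recorded-k; no direction text is passed."""
    return [synth(recording, None) for recording in recordings]


def iterative(recorded_0: str, directions: List[str], synth: Synth = label_synth) -> List[str]:
    """Session 1 is prompted with Recorded-0; each later session reuses the previous output."""
    outputs: List[str] = []
    prompt = recorded_0
    for direction in directions:
        speech = synth(prompt, direction)
        outputs.append(speech)
        prompt = speech  # the output becomes the next session's speech prompt
    return outputs
```

Calling `iterative("Recorded-0", ["Direction 1", "Direction 2", "Direction 3"])` mirrors the Iterative (ours) row: only the first session sees a recording, and every later session is conditioned on the preceding synthesized speech plus the new direction.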
If you have trouble playing the audio samples, try refreshing the page.
Recorded-0 (Audio before being refined)
Transcription: Our tennis club advisor is pretty strict, you know. (Japanese sentence: テニス部の顧問が厳しいんだよね。)
Direction-cycle1: Please begin with a slight exhale. To convey a tired feeling, add a bit of breath to your voice at the very start of your line and speak with a softer volume. (Japanese sentence: 少し息を吐くように始めてください。疲れた感じが伝わるように、発話の冒頭で声に少し息を混ぜて、声量を抑えて話しましょう。)
Recorded-1 (refinement target) | Guided-1 | Iterative-1 | Iterative-0 (speech prompt of Iterative-1) |
---|---|---|---|
Direction-cycle2: Let’s slow the tempo down a bit more. By inserting slight pauses between words and reducing your speaking pace, you’ll let the weariness come through in your voice. (Japanese sentence: テンポをもう少しゆっくりにしましょう。言葉の間に少し間を置いて、話すテンポを落とすことで、疲れが声に表れるようにしましょう。)
Recorded-2 (refinement target) | Guided-2 | Iterative-2 | Iterative-1 (speech prompt of Iterative-2) |
---|---|---|---|
Direction-cycle3: Let’s lower your endings. Weaken the ends of your phrases and speak as though you’re feeling a bit drained. (Japanese sentence: 語尾を落としましょう。フレーズの最後を弱くし、少し力が抜けたような感じで話してみてください。)
Recorded-3 (refinement target) | Guided-3 | Iterative-3 | Iterative-2 (speech prompt of Iterative-3) |
---|---|---|---|
Recorded-0 (Audio before being refined)
Transcription: Would you mind spending time with me? (Japanese sentence: 私と一緒にいるの、嫌かな…?)
Direction-cycle1: Let’s lower your voice slightly and speak a little more slowly to evoke a calm, sultry quality. (Japanese sentence: 声を少し低めにし、少しゆっくり話すことで、落ち着いた色っぽさを出してみてください。)
Recorded-1 (refinement target) | Guided-1 | Iterative-1 | Iterative-0 (speech prompt of Iterative-1) |
---|---|---|---|
Direction-cycle2: Please use a soft, breath-filled tone at a subdued volume. Pay special attention to gently lengthening the ends of your phrases. (Japanese sentence: 息を含んだ柔らかい口調で、抑えめな音量にしてみてください。特に語尾はやさしく伸ばすように意識してください。)
Recorded-2 (refinement target) | Guided-2 | Iterative-2 | Iterative-1 (speech prompt of Iterative-2) |
---|---|---|---|
Direction-cycle3: Let’s leave a bit of lingering resonance in your voice, speak as if you’re directly addressing someone, and take slightly longer pauses throughout. (Japanese sentence: 声に少し余韻を残すように、語りかける感じで全体的にもう少し間をとってみましょう。)
Recorded-3 (refinement target) | Guided-3 | Iterative-3 | Iterative-2 (speech prompt of Iterative-3) |
---|---|---|---|
Recorded-0 (Audio before being refined)
Transcription: I never thought you wouldn’t even be able to handle basic reporting and communication. (Japanese sentence: 報連相すらできないとは思わなくてね。)
Direction-cycle1: Let’s lower your pitch just a bit and speak in a more composed tone. Overall, aim to convey a nuance of calmly stating your lines. (Japanese sentence: 声の高さをほんの少し低くして、少し落ち着いたトーンで話してください。全体として、冷静に言い放つようなニュアンスに近づけましょう。)
Recorded-1 (refinement target) | Guided-1 | Iterative-1 | Iterative-0 (speech prompt of Iterative-1) |
---|---|---|---|
Direction-cycle2: Let’s make each phrase end cleanly to keep your delivery consistent. Be especially careful not to let your endings become vague. (Japanese sentence: 各語句の最後をはっきりと切るようにして、言葉に一貫性を持たせてください。特に、語尾が曖昧にならないように気をつけてください。)
Recorded-2 (refinement target) | Guided-2 | Iterative-2 | Iterative-1 (speech prompt of Iterative-2) |
---|---|---|---|
Direction-cycle3: Let’s slow the pace down slightly and give each word a solid weight. Overall, balance your delivery so that the emphasis is distributed evenly. (Japanese sentence: 速度をややゆっくりにして、一言一言をしっかりと重みを持たせるように発話してください。全体として、強調する部分が均等になるように調整しましょう。)
Recorded-3 (refinement target) | Guided-3 | Iterative-3 | Iterative-2 (speech prompt of Iterative-3) |
---|---|---|---|
Recorded-0 (Audio before being refined)
Transcription: I’ve stumbled across quite the bargain. (Japanese sentence: なかなかの掘り出し物を見つけてしまった。)
Direction-cycle1: From the start of the sentence to the end, gradually raise your voice. To convey a rising excitement, finish the last word at your highest pitch. (Japanese sentence: 文の始まりから終わりにかけて、声を少しずつ上げていってください。高揚感を出すために、最後の単語を一番高めの声で終えてください。)
Recorded-1 (refinement target) | Guided-1 | Iterative-1 | Iterative-0 (speech prompt of Iterative-1) |
---|---|---|---|
Direction-cycle2: At the very start of your line, take a slight inhale to create a subtly breathless effect. This will intensify the feeling of excitement. (Japanese sentence: 発話の最初で少し息を吸い込むようにして、少し息苦しい感じを演出してください。これによって、興奮した感じが強調されます。)
Recorded-2 (refinement target) | Guided-2 | Iterative-2 | Iterative-1 (speech prompt of Iterative-2) |
---|---|---|---|
Direction-cycle3: Let’s pick up the overall speaking speed. In particular, rush slightly in the second half to highlight the excitement of having found it. (Japanese sentence: 発話速度を全体的に速めてみてください。特に、後半の部分は少し急ぎながら話すことで、見つけた時の喜びを強調します。)
Recorded-3 (refinement target) | Guided-3 | Iterative-3 | Iterative-2 (speech prompt of Iterative-3) |
---|---|---|---|