# | uttid | text | baseline
basket_config_path: quality/tts/tortoise-baskets/dsat_to_en_5projects_cleared_721.json
data_meta: null
exp_name: yt4_baseline_lats
lang: en
meta:
  basket_generation_config:
    basket_lang: en
    basket_path: /home/polovick/v2v_diff/ml/projects/ai-voice-cloning/dsat-basket-extended-refs-dur.json
    batch_size: 1
    gpus: 2
    inference:
      diff_steps: 400
      exp: /home/polovick/v2v_diff/ml/projects/ai-voice-cloning/yt4_baseline_lats
      gpt_generate_args:
        do_sample: true
        num_return_sequences: 50
      override_conditioning_features:
        c50: 0.0
        pitch_std: 100.0
        snr: 100.0
      reranking_options:
        mode: MBR
        top_k: 1
      target_len_rate: 0.75
      vocoder: univnet
    num_workers: 1
    output_dir: dsat-cleared/yt4_baseline_lats__2024-07-30_03-28-45
    ticket: QUALITY-54
  basket_generation_git_hash: e0df79f1213deffbae77e909499694944e0746da
model_data_type: tts-cloning
ticket: QUALITY-54
version: 2024-07-30_03-28-45
encodec-inhousediff-sameinfer
basket_config_path: quality/tts/tortoise-baskets/dsat_to_en_5projects_cleared_721.json
data_meta: null
exp_name: yt4_langbycond_revgrad1_encodec-opt-bigbatch-diffcodes__diff_nonorm_2codes_pretrained
lang: en
meta:
  basket_generation_config:
    basket_lang: en
    basket_path: quality/tts/tortoise-baskets/dsat_to_en_5projects_cleared_721.json
    batch_size: 1
    gpus: 1
    inference:
      condition_sample_rate: 24000
      diff_on_codes: false
      diff_steps: 400
      diffusion_exp: /mount/s3/tts-binary-data-nb/dimdi-y/yt4_langbycond_revgrad1-contrastive_encodec-opt-bigbatch_normloudness_delaypattern8/diff_nonorm_2codes_pretrained
      exp: /mount/s3/tts-binary-data-nb/dimdi-y/yt4_langbycond_revgrad1_encodec-opt-bigbatch-diffcodes
      gpt_generate_args:
        do_sample: true
        num_return_sequences: 50
        prefix_allowed_tokens_fn: encodec_interleaved_layers
        repetition_penalty_activation_span: 4.0
        repetition_penalty_span: 50.0
        use_cache: true
      out_sample_rate: 24000
      override_conditioning_features:
        c50: 0.0
        pitch_std: 100.0
        snr: 100.0
      reranking_options:
        cdist_time_downsampling_factor: 2
        mode: MBR
        sakoe_chiba_radius: 24
        top_k: 1
      vocoder: bigvgan
    num_workers: 1
    output_dir: es_en_clean-dsat_mapping_encodec_mbrlat_inhdiff/yt4_langbycond_revgrad1_encodec-opt-bigbatch-diffcodes__diff_nonorm_2codes_pretrained__2024-09-18_17-36-55
    ticket: TTS-393
  basket_generation_git_hash: 75e464c6d886d92ef5b904f41695658ca2bc7545
model_data_type: tts-cloning
ticket: TTS-393
version: 2024-09-18_17-36-55
xcodec160k_1code_inhdiff20k
basket_config_path: quality/tts/tortoise-baskets/dsat_to_en_5projects_cleared_721.json
data_meta: null
exp_name: yt4_langbycond_revgrad1_xcodec-opt-bigbatch__diffusion_yt4_xcodec
lang: en
meta:
  basket_generation_config:
    basket_lang: en
    basket_path: quality/tts/tortoise-baskets/dsat_to_en_5projects_cleared_721.json
    batch_size: 1
    gpus: 1
    inference:
      condition_sample_rate: 24000
      diff_on_codes: true
      diff_steps: 400
      diffusion_exp: /mount/s3/tts-binary-data-nb/dimdi-y/diffusion_yt4_xcodec
      exp: /mount/s3/tts-binary-data-nb/dimdi-y/yt4_langbycond_revgrad1_xcodec-opt-bigbatch
      gpt_generate_args:
        do_sample: true
        num_return_sequences: 50
        repetition_penalty_activation_span: 4.0
        repetition_penalty_span: 50.0
        use_cache: true
      out_sample_rate: 24000
      override_conditioning_features:
        c50: 0.0
        pitch_std: 100.0
        snr: 100.0
      reranking_options:
        mode: MBR
        sakoe_chiba_radius: 16
        top_k: 1
      vocoder: bigvgan
    num_workers: 1
    output_dir: es_en_clean-dsat_mapping_xcodec_mbrlat_inhdiff/yt4_langbycond_revgrad1_xcodec-opt-bigbatch__diffusion_yt4_xcodec__2024-10-02_15-50-47
    ticket: TTS-393
  basket_generation_git_hash: 75e464c6d886d92ef5b904f41695658ca2bc7545
model_data_type: tts-cloning
ticket: TTS-393
version: 2024-10-02_15-50-47
wavtokeniser
basket_config_path: quality/tts/tortoise-baskets/dsat_to_en_5projects_cleared_721.json
data_meta: null
exp_name: yt4_langbycond_revgrad1_wavtokenizer-opt-bigbatch_t5enc
lang: en
meta:
  basket_generation_config:
    basket_lang: en
    basket_path: quality/tts/tortoise-baskets/dsat_to_en_5projects_cleared_721.json
    batch_size: 1
    gpus: 1
    inference:
      condition_sample_rate: 24000
      diff_on_codes: true
      diff_steps: 400
      exp: /mount/s3/tts-binary-data-nb/dimdi-y/yt4_langbycond_revgrad1_wavtokenizer-opt-bigbatch_t5enc
      gpt_generate_args:
        do_sample: true
        num_return_sequences: 50
        repetition_penalty_activation_span: 1.0
        repetition_penalty_span: 100.0
        use_cache: true
      out_sample_rate: 24000
      override_conditioning_features:
        c50: 0.0
        pitch_std: 100.0
        snr: 100.0
      reranking_options:
        mode: MBR
        sakoe_chiba_radius: 24
        top_k: 1
      vocoder: none
    num_workers: 1
    output_dir: es_en_clean-dsat_mapping_xcodec_mbrlat_codecdec/yt4_langbycond_revgrad1_wavtokenizer-opt-bigbatch_t5enc__2024-10-29_09-42-52
    ticket: TTS-392
  basket_generation_git_hash: 75e464c6d886d92ef5b904f41695658ca2bc7545
model_data_type: tts-cloning
ticket: TTS-392
version: 2024-10-29_09-42-52
wavtokeniser_normloud
basket_config_path: quality/tts/tortoise-baskets/dsat_to_en_5projects_cleared_721.json
data_meta: null
exp_name: yt4_langbycond_revgrad1_wavtokenizer-opt-bigbatch_t5enc
lang: en
meta:
  basket_generation_config:
    basket_lang: en
    basket_path: quality/tts/tortoise-baskets/dsat_to_en_5projects_cleared_721.json
    batch_size: 1
    gpus: 1
    inference:
      condition_sample_rate: 24000
      diff_on_codes: true
      diff_steps: 400
      exp: /mount/s3/tts-binary-data-nb/dimdi-y/yt4_langbycond_revgrad1_wavtokenizer-opt-bigbatch_t5enc
      force_reference_std: -23
      gpt_generate_args:
        do_sample: true
        num_return_sequences: 50
        repetition_penalty_activation_span: 1.0
        repetition_penalty_span: 100.0
        use_cache: true
      out_sample_rate: 24000
      override_conditioning_features:
        c50: 0.0
        pitch_std: 100.0
        snr: 100.0
      reranking_options:
        mode: MBR
        sakoe_chiba_radius: 24
        top_k: 1
      vocoder: none
    num_workers: 1
    output_dir: es_en_clean-dsat_mapping_xcodec_mbrlat_codecdec/yt4_langbycond_revgrad1_wavtokenizer-opt-bigbatch_t5enc__2024-10-29_09-58-08
    ticket: TTS-392
  basket_generation_git_hash: 75e464c6d886d92ef5b904f41695658ca2bc7545
model_data_type: tts-cloning
ticket: TTS-392
version: 2024-10-29_09-58-08
wavtokeniser_normloud_lesspenalty
basket_config_path: quality/tts/tortoise-baskets/dsat_to_en_5projects_cleared_721.json
data_meta: null
exp_name: yt4_langbycond_revgrad1_wavtokenizer-opt-bigbatch_t5enc
lang: en
meta:
  basket_generation_config:
    basket_lang: en
    basket_path: quality/tts/tortoise-baskets/dsat_to_en_5projects_cleared_721.json
    batch_size: 1
    gpus: 1
    inference:
      condition_sample_rate: 24000
      diff_on_codes: true
      diff_steps: 400
      exp: /mount/s3/tts-binary-data-nb/dimdi-y/yt4_langbycond_revgrad1_wavtokenizer-opt-bigbatch_t5enc
      force_reference_std: -23
      gpt_generate_args:
        do_sample: true
        num_return_sequences: 50
        repetition_penalty: 1.5
        repetition_penalty_activation_span: 5.0
        repetition_penalty_span: 100.0
        use_cache: true
      out_sample_rate: 24000
      override_conditioning_features:
        c50: 0.0
        pitch_std: 100.0
        snr: 100.0
      reranking_options:
        mode: MBR
        sakoe_chiba_radius: 24
        top_k: 1
      vocoder: none
    num_workers: 1
    output_dir: es_en_clean-dsat_mapping_xcodec_mbrlat_codecdec/yt4_langbycond_revgrad1_wavtokenizer-opt-bigbatch_t5enc__2024-10-29_13-45-07
    ticket: TTS-392
  basket_generation_git_hash: 75e464c6d886d92ef5b904f41695658ca2bc7545
model_data_type: tts-cloning
ticket: TTS-392
version: 2024-10-29_13-45-07
90 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0_0004 | Hello?
91 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0_0005 | What do you want?
92 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0_0006 | Well, we came here because you said the townhouse was open to everyone....
93 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0_0007 | Well, all workers.
94 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0_0008 | She is not.
95 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F3/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F3_0009 | I can help if you need anything.
96 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0_0010 | Help with what?
97 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F3/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F3_0011 | I don't know.
98 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F3/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F3_0012 | I can teach.
99 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0_0013 | Of what?
100 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F3/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F3_0014 | French, for example.
101 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0_0015 | De fran...
102 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0_0016 | Ouch!
103 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0_0017 | But here they don't even know how to read Spanish.
104 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F3/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F3_0018 | Well, I can teach them.
105 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F2/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F2_0019 | Yes, he has studied to be a teacher.
106 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0_0020 | Already.
107 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0_0021 | Come!
108 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0_0022 | Good morning.
109 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F4/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F4_0023 | Good morning.
110 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0_0024 | Girls, there is going to be a small change.
111 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0_0025 | Amparo leaves with other students and you stay with the new teacher.
112 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0_0026 | Be good.
113 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F3/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F3_0027 | But it is not necessary that...
114 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0_0028 | We're leaving, girls! Your students.
115 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0_0029 | Hello.
116 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0_0030 | Well, I'm Amelia.
117 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F4/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F4_0031 | Where are you from?
118 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F0_0032 | From... Madrid
119 | ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F5/ORIGINALSPAVERSION-93a5cd-VoiceActing_es_F5_0033 | Madrid is very large.