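Changelog

August changes

  • The original code did not specify the number of training epochs, so max_epochs was added in train.py (line 109). Running the code requires a recent version of libstdcxx-ng, which should be added to requirements.txt. New parameter max_epochs, default 300. New parameter wandb_project, the wandb project name for the run. Added a seed parameter so that results can be compared across experiments.
  • 9.14: Tried adding SiT as a model architecture in VPIDM; this requires installing the timm package with pip.
    [9.17 addendum] SiT requires a fixed square input of input_size*input_size. To make it work with frequency-domain speech data of shape frame_length*n_fft, switched to dynamic positional encoding. This ended up blowing up the parameter count: GPU memory runs out even with batch_size set to 1.
    [9.18 addendum] By modifying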
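
The training options listed in the first August entry (max_epochs, wandb_project, seed) could be exposed roughly as follows. This is a minimal sketch using only the standard library: the helper names build_parser and apply_seed are hypothetical, the None defaults for wandb_project and seed are assumptions, and the actual train.py may wire these options differently.

```python
import argparse
import random


def build_parser():
    """Build a parser with the three options named in the changelog."""
    parser = argparse.ArgumentParser(description="Training options (illustrative)")
    # Default of 300 is taken from the changelog entry.
    parser.add_argument("--max_epochs", type=int, default=300,
                        help="maximum number of training epochs")
    # Default of None is an assumption; the changelog does not state one.
    parser.add_argument("--wandb_project", type=str, default=None,
                        help="wandb project name for this run")
    parser.add_argument("--seed", type=int, default=None,
                        help="random seed, so runs can be compared fairly")
    return parser


def apply_seed(seed):
    """Seed Python's RNG. A real training script would also seed NumPy
    and the deep learning framework in the same place."""
    if seed is not None:
        random.seed(seed)


# Example invocation with an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["--seed", "42", "--wandb_project", "demo"])
apply_seed(args.seed)
print(args.max_epochs, args.wandb_project, args.seed)
```

Fixing the seed matters for the comparison use case: two runs that differ only in architecture then share the same random initialization and data order, as far as the seeded generators control them.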

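September changes

  • [9.14] Tried adding SiT as a model architecture in VPIDM; this requires installing the timm package with pip.
  • [9.17] SiT requires a fixed square input of input_size*input_size, but frequency-domain speech data has shape frame_length*n_fft, so the following changes were made.
    1. Used to_2tuple to turn the single input_size parameter into a two-element tuple.
    2. Rewrote the PatchEmbed class, extending some of its parameters to tuples so that it accepts non-square data.
    3. Rewrote get_2d_sincos_pos_embed to generate a pos_embed of the correct size from the dynamically computed H_patches and W_patches, so that it accepts non-square input.
    4. In unpatchify, handled reassembly for differing patch counts along height and width by taking (H_patches, W_patches) as input, so the model can restore the patches into the full non-square image.
    5. In forward, added steps that convert the input data to real values and back to complex values.
  • [9.23] Changing the parameter initialization method lets the model converge, but the results are poor.
  • cm-75: depth=12, hidden_size=768, patch_size=2, num_heads=12, batch_size=6, sit_copy6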
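
The non-square adaptations from the 9.17 entry can be illustrated with a small NumPy sketch. It is not the code used in this project: the function bodies are written from scratch, patchify and sincos_1d are hypothetical helpers, and the local to_2tuple only mimics the behavior of the timm helper of the same name. The sketch builds a sine/cosine positional embedding for an H_patches x W_patches grid, splits a complex spectrogram into real and imaginary channels, and checks that unpatchify restores the original rectangular array.

```python
import numpy as np


def to_2tuple(x):
    """Return x as an (h, w) pair; a scalar is repeated for both axes."""
    return tuple(x) if isinstance(x, (tuple, list)) else (x, x)


def sincos_1d(embed_dim, positions):
    """1-D sine/cosine embedding: M positions -> array of shape (M, embed_dim)."""
    assert embed_dim % 2 == 0
    omega = 1.0 / 10000 ** (np.arange(embed_dim // 2, dtype=np.float64) / (embed_dim / 2))
    angles = np.einsum("m,d->md", positions.reshape(-1).astype(np.float64), omega)
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)


def get_2d_sincos_pos_embed(embed_dim, grid_hw):
    """Embedding for an H_patches x W_patches grid -> (H_patches * W_patches, embed_dim)."""
    h_patches, w_patches = to_2tuple(grid_hw)
    assert embed_dim % 4 == 0
    rows, cols = np.meshgrid(np.arange(h_patches), np.arange(w_patches), indexing="ij")
    # Half of the channels encode the row index, the other half the column index.
    return np.concatenate([sincos_1d(embed_dim // 2, rows),
                           sincos_1d(embed_dim // 2, cols)], axis=1)


def patchify(x, patch_size):
    """(C, H, W) -> (H_patches * W_patches, ph * pw * C)."""
    ph, pw = to_2tuple(patch_size)
    c, h, w = x.shape
    hp, wp = h // ph, w // pw
    x = x.reshape(c, hp, ph, wp, pw).transpose(1, 3, 2, 4, 0)  # (hp, wp, ph, pw, c)
    return x.reshape(hp * wp, ph * pw * c)


def unpatchify(tokens, grid_hw, patch_size, channels):
    """Inverse of patchify for a possibly non-square patch grid."""
    hp, wp = to_2tuple(grid_hw)
    ph, pw = to_2tuple(patch_size)
    x = tokens.reshape(hp, wp, ph, pw, channels).transpose(4, 0, 2, 1, 3)  # (c, hp, ph, wp, pw)
    return x.reshape(channels, hp * ph, wp * pw)


# A toy 6 x 10 complex "spectrogram" stands in for frame_length x n_fft data.
rng = np.random.default_rng(0)
spec = rng.standard_normal((6, 10)) + 1j * rng.standard_normal((6, 10))
x = np.stack([spec.real, spec.imag])              # complex -> 2 real channels
tokens = patchify(x, patch_size=2)                # 3 x 5 grid of 2 x 2 patches
pos_embed = get_2d_sincos_pos_embed(8, (3, 5))    # one row per patch
restored = unpatchify(tokens, (3, 5), 2, channels=2)
spec_back = restored[0] + 1j * restored[1]        # 2 real channels -> complex
```

Because the sine/cosine table is computed from the grid shape rather than learned, it adds no parameters of its own; only its number of rows changes with the input size.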

October changes

Comparison table

#
t= tensor([0.9600], device='cuda:0'):
drift_real mean: 0.001028, std: 1.387510
drift min: -6.255919, max: 5.645583
drift_imag mean: -0.003041, std: 1.389403
drift_imag min: -5.551543, max: 5.848334
diffusion mean: 1.946401, std: nan
diffusion min: 1.946401, max: 1.946401
#
t= tensor([0.9200], device='cuda:0'):
drift_real mean: 0.000242, std: 1.350083
drift min: -5.983840, max: 5.690376
drift_imag mean: -0.003913, std: 1.347464
drift_imag min: -5.779256, max: 5.787510
diffusion mean: 1.903549, std: nan
diffusion min: 1.903549, max: 1.903549
#
t= tensor([0.8800], device='cuda:0'):
drift_real mean: 0.001425, std: 1.304999
drift min: -6.003798, max: 6.229289
drift_imag mean: -0.003935, std: 1.309390
drift_imag min: -5.515249, max: 5.315574
diffusion mean: 1.858913, std: nan
diffusion min: 1.858913, max: 1.858913
#
t= tensor([0.8400], device='cuda:0'):
drift_real mean: -0.000629, std: 1.259740
drift min: -5.315304, max: 6.192216
drift_imag mean: 0.000887, std: 1.264473
drift_imag min: -5.324466, max: 4.936073
diffusion mean: 1.812459, std: nan
diffusion min: 1.812459, max: 1.812459
#
t= tensor([0.8000], device='cuda:0'):
drift_real mean: 0.000784, std: 1.216656
drift min: -5.250953, max: 6.184686
drift_imag mean: -0.001661, std: 1.217426
drift_imag min: -5.093382, max: 5.483801
diffusion mean: 1.764160, std: nan
diffusion min: 1.764160, max: 1.764160
#
t= tensor([0.7600], device='cuda:0'):
drift_real mean: 0.002392, std: 1.167682
drift min: -5.635061, max: 5.103218
drift_imag mean: 0.002525, std: 1.169464
drift_imag min: -4.827188, max: 4.963645
diffusion mean: 1.713991, std: nan
diffusion min: 1.713991, max: 1.713991
#
t= tensor([0.7200], device='cuda:0'):
drift_real mean: 0.001672, std: 1.116133
drift min: -4.601490, max: 4.841025
drift_imag mean: -0.000539, std: 1.120077
drift_imag min: -4.573432, max: 5.087404
diffusion mean: 1.661936, std: nan
diffusion min: 1.661936, max: 1.661936
#
t= tensor([0.6800], device='cuda:0'):
drift_real mean: 0.002910, std: 1.062951
drift min: -5.032801, max: 4.544854
drift_imag mean: -0.000708, std: 1.065635
drift_imag min: -4.604265, max: 4.528798
diffusion mean: 1.607982, std: nan
diffusion min: 1.607982, max: 1.607982
#
t= tensor([0.6400], device='cuda:0'):
drift_real mean: 0.000675, std: 1.009798
drift min: -4.441285, max: 4.203805
drift_imag mean: 0.000950, std: 1.012471
drift_imag min: -4.139987, max: 4.678671
diffusion mean: 1.552119, std: nan
diffusion min: 1.552119, max: 1.552119
#
t= tensor([0.6000], device='cuda:0'):
drift_real mean: 0.001295, std: 0.956149
drift min: -4.137105, max: 3.852574
drift_imag mean: 0.001996, std: 0.956204
drift_imag min: -4.118358, max: 4.533556
diffusion mean: 1.494342, std: nan
diffusion min: 1.494342, max: 1.494342
#
t= tensor([0.5600], device='cuda:0'):
drift_real mean: 0.002181, std: 0.897933
drift min: -4.365409, max: 3.855125
drift_imag mean: 0.000585, std: 0.899173
drift_imag min: -3.638911, max: 4.058249
diffusion mean: 1.434645, std: nan
diffusion min: 1.434645, max: 1.434645
#
t= tensor([0.5200], device='cuda:0'):
drift_real mean: 0.001265, std: 0.838747
drift min: -3.582977, max: 3.580992
drift_imag mean: 0.001077, std: 0.839164
drift_imag min: -3.670391, max: 3.394095
diffusion mean: 1.373023, std: nan
diffusion min: 1.373023, max: 1.373023
#
t= tensor([0.4800], device='cuda:0'):
drift_real mean: 0.001136, std: 0.780881
drift min: -3.638036, max: 3.298007
drift_imag mean: 0.002820, std: 0.782339
drift_imag min: -3.280854, max: 3.344223
diffusion mean: 1.309467, std: nan
diffusion min: 1.309467, max: 1.309467
#
t= tensor([0.4400], device='cuda:0'):
drift_real mean: 0.002041, std: 0.722704
drift min: -2.978718, max: 2.985824
drift_imag mean: 0.002030, std: 0.723352
drift_imag min: -3.132399, max: 2.797324
diffusion mean: 1.243960, std: nan
diffusion min: 1.243960, max: 1.243960
#
t= tensor([0.4000], device='cuda:0'):
drift_real mean: 0.000621, std: 0.664624
drift min: -3.110819, max: 2.997027
drift_imag mean: 0.001773, std: 0.664240
drift_imag min: -2.646359, max: 2.830245
diffusion mean: 1.176469, std: nan
diffusion min: 1.176469, max: 1.176469
#
t= tensor([0.3600], device='cuda:0'):
drift_real mean: 0.001700, std: 0.606232
drift min: -2.742868, max: 2.802910
drift_imag mean: -0.001700, std: 0.605417
drift_imag min: -2.734457, max: 2.802108
diffusion mean: 1.106941, std: nan
diffusion min: 1.106941, max: 1.106941
#
t= tensor([0.3200], device='cuda:0'):
drift_real mean: 0.002795, std: 0.546745
drift min: -2.702860, max: 2.266898
drift_imag mean: -0.002274, std: 0.548131
drift_imag min: -2.467861, max: 2.577819
diffusion mean: 1.035286, std: nan
diffusion min: 1.035286, max: 1.035286
#
t= tensor([0.2800], device='cuda:0'):
drift_real mean: 0.001690, std: 0.490185
drift min: -2.142412, max: 2.194646
drift_imag mean: -0.002413, std: 0.491714
drift_imag min: -2.091719, max: 2.043411
diffusion mean: 0.961359, std: nan
diffusion min: 0.961359, max: 0.961359
#
t= tensor([0.2400], device='cuda:0'):
drift_real mean: 0.001244, std: 0.433069
drift min: -1.773319, max: 1.891073
drift_imag mean: -0.001816, std: 0.433641
drift_imag min: -1.945599, max: 1.986796
diffusion mean: 0.884932, std: nan
diffusion min: 0.884932, max: 0.884932
#
t= tensor([0.2000], device='cuda:0'):
drift_real mean: 0.001100, std: 0.375882
drift min: -1.637365, max: 1.696351
drift_imag mean: -0.001723, std: 0.374826
drift_imag min: -1.775090, max: 1.532630
diffusion mean: 0.805636, std: nan
diffusion min: 0.805636, max: 0.805636
#
t= tensor([0.1600], device='cuda:0'):
drift_real mean: 0.002833, std: 0.319433
drift min: -1.461934, max: 1.443037
drift_imag mean: -0.002973, std: 0.317190
drift_imag min: -1.577771, max: 1.452456
diffusion mean: 0.722879, std: nan
diffusion min: 0.722879, max: 0.722879
#
t= tensor([0.1200], device='cuda:0'):
drift_real mean: 0.002107, std: 0.262938
drift min: -1.290313, max: 1.291217
drift_imag mean: -0.002558, std: 0.260862
drift_imag min: -1.150687, max: 1.123912
diffusion mean: 0.635657, std: nan
diffusion min: 0.635657, max: 0.635657
#
t= tensor([0.0800], device='cuda:0'):
drift_real mean: 0.001805, std: 0.207426
drift min: -1.087441, max: 1.107205
drift_imag mean: -0.002439, std: 0.204845
drift_imag min: -0.857314, max: 0.864121
diffusion mean: 0.542166, std: nan
diffusion min: 0.542166, max: 0.542166
#
t= tensor([0.0400], device='cuda:0'):
drift_real mean: 0.001622, std: 0.151578
drift min: -1.096648, max: 1.074418
drift_imag mean: -0.002582, std: 0.149009
drift_imag min: -0.656835, max: 0.631456
diffusion mean: 0.438765, std: nan
diffusion min: 0.438765, max: 0.438765
##################################################
File: p232_161.wav, N=25
SI-SDR: 24.52938173290932
ESTOI: 0.9642957507514722
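
A note on the `std: nan` entries in the log above: in every block the diffusion min and max are identical, which suggests the diffusion term is a single scalar per time step. If the statistics were printed with torch.std, which applies Bessel's correction by default, a one-element input has no defined sample standard deviation and the result is nan, so these values most likely do not indicate a numerical problem. The pure-Python sketch below reproduces that behavior; summarize is a hypothetical helper, not the logging code that produced the output.

```python
import math


def summarize(values):
    """Mean, sample standard deviation (n - 1 denominator), min and max."""
    n = len(values)
    mean = sum(values) / n
    if n > 1:
        std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    else:
        # With one sample the n - 1 denominator is zero: the std is undefined.
        std = float("nan")
    return {"mean": mean, "std": std, "min": min(values), "max": max(values)}


drift_like = summarize([-1.0, 0.0, 1.0])   # many elements: finite std
diffusion_like = summarize([1.946401])     # one scalar: std is nan, min == max
print(drift_like)
print(diffusion_like)
```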