The raw audio output from Piper is always garbled/distorted; I cannot get any clean output. Audio played through Wyoming Satellite is clear and clean, which suggests that some cleanup or post-processing is being done on the audio.
When I run this:
echo -e '{"type": "synthesize", "data": {"text": "Test from Satellite"}}\n' | nc -v IPADDRESS 10200 > working_tts_satellite.pcm
The resulting PCM file is always distorted, and ffprobe cannot read it:
ffprobe -show_format -show_streams -i test_output.pcm
ffprobe version 5.1.6-0+deb12u1+rpt1 Copyright (c) 2007-2024 the FFmpeg developers
  built with gcc 12 (Debian 12.2.0-14)
  configuration: --prefix=/usr --extra-version=0+deb12u1+rpt1 --toolchain=hardened --incdir=/usr/include/aarch64-linux-gnu --enable-gpl --disable-stripping --disable-mmal --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libglslang --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librist --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sand --enable-sdl2 --disable-sndio --enable-libjxl --enable-neon --enable-v4l2-request --enable-libudev --enable-epoxy --libdir=/usr/lib/aarch64-linux-gnu --arch=arm64 --enable-pocketsphinx --enable-librsvg --enable-libdc1394 --enable-libdrm --enable-vout-drm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-libplacebo --enable-librav1e --enable-shared
  libavutil      57. 28.100 / 57. 28.100
  libavcodec     59. 37.100 / 59. 37.100
  libavformat    59. 27.100 / 59. 27.100
  libavdevice    59.  7.100 / 59.  7.100
  libavfilter     8. 44.100 /  8. 44.100
  libswscale      6.  7.100 /  6.  7.100
  libswresample   4.  7.100 /  4.  7.100
  libpostproc    56.  6.100 / 56.  6.100
test_output.pcm: Invalid data found when processing input
Maybe because the PCM contains JSON information?
xxd test_output.pcm | head -50
00000000: 7b22 7479 7065 223a 2022 6175 6469 6f2d  {"type": "audio-
00000010: 7374 6172 7422 2c20 2276 6572 7369 6f6e  start", "version
00000020: 223a 2022 312e 352e 3322 2c20 2264 6174  ": "1.5.3", "dat
00000030: 615f 6c65 6e67 7468 223a 2036 317d 0a7b  a_length": 61}.{
00000040: 2272 6174 6522 3a20 3232 3035 302c 2022  "rate": 22050, "
00000050: 7769 6474 6822 3a20 322c 2022 6368 616e  width": 2, "chan
00000060: 6e65 6c73 223a 2031 2c20 2274 696d 6573  nels": 1, "times
00000070: 7461 6d70 223a 206e 756c 6c7d 7b22 7479  tamp": null}{"ty
00000080: 7065 223a 2022 6175 6469 6f2d 6368 756e  pe": "audio-chun
00000090: 6b22 2c20 2276 6572 7369 6f6e 223a 2022  k", "version": "
000000a0: 312e 352e 3322 2c20 2264 6174 615f 6c65  1.5.3", "data_le
000000b0: 6e67 7468 223a 2036 312c 2022 7061 796c  ngth": 61, "payl
000000c0: 6f61 645f 6c65 6e67 7468 223a 2032 3034  oad_length": 204
000000d0: 387d 0a7b 2272 6174 6522 3a20 3232 3035  8}.{"rate": 2205
000000e0: 302c 2022 7769 6474 6822 3a20 322c 2022  0, "width": 2, "
000000f0: 6368 616e 6e65 6c73 223a 2031 2c20 2274  channels": 1, "t
00000100: 696d 6573 7461 6d70 223a 206e 756c 6c7d  imestamp": null}
00000110: 4500 2900 2c00 1500 0f00 1600 0900 1000  E.).,...........
00000120: 1b00 1c00 2a00 1f00 2400 2a00 2700 1a00  ....*...$.*.'...
00000130: 1500 0500 0d00 0600 0400 0b00 0f00 0500  ................
00000140: 1000 0000 0700 f8ff ebff 0700 e6ff e0ff  ................
00000150: e7ff edff f4ff dfff d8ff d7ff d4ff d7ff  ................
00000160: dcff c3ff c6ff dcff d5ff deff d7ff e4ff  ................
00000170: d4ff edff eeff e2ff edff f1ff f4ff fcff  ................
00000180: fdff 0500 feff fcff ffff 0b00 1300 0900  ................
00000190: 0c00 1200 1700 1b00 1f00 1f00 1e00 3200  ..............2.
000001a0: 3300 2900 2400 2700 2500 3400 2800 3900  3.).$.'.%.4.(.9.
000001b0: 2d00 3900 4500 4000 4400 3800 3700 5000  -.9.E.@.D.8.7.P.
000001c0: 4800 3d00 4300 3b00 3500 4100 3300 3800  H.=.C.;.5.A.3.8.
000001d0: 4500 3900 4a00 4900 3300 3b00 3400 3400  E.9.J.I.3.;.4.4.
000001e0: 3b00 2e00 2300 3b00 2900 2800 2c00 2300  ;...#.;.).(.,.#.
000001f0: 1d00 2400 2800 2c00 2500 3100 2700 2500  ..$.(.,.%.1.'.%.
00000200: 2f00 2700 2100 2400 2400 1d00 1d00 2800  /.'.!.$.$.....(.
00000210: 1a00 2600 2a00 1600 1a00 2200 1600 2100  ..&.*....."...!.
00000220: 1c00 1500 1000 1b00 0d00 1100 0e00 0a00  ................
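If the JSON is indeed the problem, the framing visible in the dump can be undone before playback. In the dump, each event is a newline-terminated JSON header, followed by data_length bytes of extra JSON and then payload_length bytes of binary audio. The Python sketch below assumes that layout holds for the whole capture; extract_pcm and write_wav are hypothetical helper names, not part of Piper or Wyoming, and the fallback format values (22050 Hz, 2-byte samples, 1 channel) are simply the ones shown in the dump.

```python
import io
import json
import wave


def extract_pcm(raw: bytes):
    """Return (pcm_bytes, audio_format) from a captured event stream.

    Assumed framing, inferred from the hex dump: a JSON header line ending
    in a newline, then data_length bytes of JSON, then payload_length bytes
    of raw audio. Only "audio-chunk" payloads are kept.
    """
    stream = io.BytesIO(raw)
    pcm = bytearray()
    audio_format = {}
    while True:
        line = stream.readline()
        if not line:
            break  # end of capture
        if not line.strip():
            continue  # skip stray blank lines between events
        try:
            header = json.loads(line)
        except ValueError:
            break  # truncated or non-JSON header, stop parsing
        data = stream.read(header.get("data_length") or 0)
        payload = stream.read(header.get("payload_length") or 0)
        if header.get("type") == "audio-chunk":
            if data:
                audio_format = json.loads(data)
            pcm.extend(payload)
    return bytes(pcm), audio_format


def write_wav(target, pcm: bytes, audio_format: dict) -> None:
    """Wrap the PCM in a WAV container; defaults are the values seen in the dump."""
    with wave.open(target, "wb") as wav:
        wav.setnchannels(audio_format.get("channels", 1))
        wav.setsampwidth(audio_format.get("width", 2))
        wav.setframerate(audio_format.get("rate", 22050))
        wav.writeframes(pcm)
```

For example, reading test_output.pcm in binary mode, passing its bytes to extract_pcm, and handing the result to write_wav with an output path should give a file that ordinary players and ffprobe can open, provided the capture really follows the framing assumed here.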
The Piper container is started with:
docker run -itd --restart unless-stopped -p 10200:10200 -v /home/user/piperdata:/data rhasspy/wyoming-piper \
    --voice en_US-lessac-medium
What magic is done to clean up the audio?