Garbled / Static RAW output on piper docker linux #732

Open
simonsays-techtalk opened this issue Feb 15, 2025 · 0 comments

The RAW audio output from Piper is always garbled/distorted; I cannot get any clean output. Audio from the Wyoming satellite is clear and clean, which suggests that some cleanup or post-processing is being done on the audio.

When I run this:
echo -e '{"type": "synthesize", "data": {"text": "Test from Satellite"}}\n' | nc -v IPADDRESS 10200 > working_tts_satellite.pcm

The PCM file is always distorted, and ffprobe cannot read it:
ffprobe -show_format -show_streams -i test_output.pcm
ffprobe version 5.1.6-0+deb12u1+rpt1 Copyright (c) 2007-2024 the FFmpeg developers
  built with gcc 12 (Debian 12.2.0-14)
  configuration: --prefix=/usr --extra-version=0+deb12u1+rpt1 --toolchain=hardened --incdir=/usr/include/aarch64-linux-gnu --enable-gpl --disable-stripping --disable-mmal --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libglslang --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librist --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sand --enable-sdl2 --disable-sndio --enable-libjxl --enable-neon --enable-v4l2-request --enable-libudev --enable-epoxy --libdir=/usr/lib/aarch64-linux-gnu --arch=arm64 --enable-pocketsphinx --enable-librsvg --enable-libdc1394 --enable-libdrm --enable-vout-drm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-libplacebo --enable-librav1e --enable-shared
  libavutil      57. 28.100 / 57. 28.100
  libavcodec     59. 37.100 / 59. 37.100
  libavformat    59. 27.100 / 59. 27.100
  libavdevice    59.  7.100 / 59.  7.100
  libavfilter     8. 44.100 /  8. 44.100
  libswscale      6.  7.100 /  6.  7.100
  libswresample   4.  7.100 /  4.  7.100
  libpostproc    56.  6.100 / 56.  6.100
test_output.pcm: Invalid data found when processing input
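A note on the ffprobe error: a headerless PCM file carries no container information, so ffprobe cannot identify it unless the format is given explicitly (for example `-f s16le -ar 22050 -ac 1`). Another option is to wrap the samples in a WAV container. The sketch below does that with Python's standard `wave` module. The function name `pcm_to_wav` is hypothetical, and the defaults (22050 Hz, 2 bytes per sample, 1 channel) are assumptions taken from the `audio-start` event visible in this issue's hex dump. It only gives clean audio if the input bytes are pure samples with no protocol framing mixed in.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, rate: int = 22050, width: int = 2, channels: int = 1) -> bytes:
    """Wrap headerless PCM samples in a WAV container and return the file bytes.

    Defaults are assumptions based on the audio-start event in this issue.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)   # 1 = mono
        wav.setsampwidth(width)      # bytes per sample, 2 = 16-bit
        wav.setframerate(rate)       # samples per second
        wav.writeframes(pcm)
    return buf.getvalue()


# Usage: two 16-bit samples copied from the start of the PCM region in the dump.
wav_bytes = pcm_to_wav(b"\x45\x00\x29\x00")
```

The returned bytes can be written to a `.wav` file, which ffprobe and ordinary players should then recognize without extra flags.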

Maybe because the PCM contains JSON information?
xxd test_output.pcm | head -50
00000000: 7b22 7479 7065 223a 2022 6175 6469 6f2d  {"type": "audio-
00000010: 7374 6172 7422 2c20 2276 6572 7369 6f6e  start", "version
00000020: 223a 2022 312e 352e 3322 2c20 2264 6174  ": "1.5.3", "dat
00000030: 615f 6c65 6e67 7468 223a 2036 317d 0a7b  a_length": 61}.{
00000040: 2272 6174 6522 3a20 3232 3035 302c 2022  "rate": 22050, "
00000050: 7769 6474 6822 3a20 322c 2022 6368 616e  width": 2, "chan
00000060: 6e65 6c73 223a 2031 2c20 2274 696d 6573  nels": 1, "times
00000070: 7461 6d70 223a 206e 756c 6c7d 7b22 7479  tamp": null}{"ty
00000080: 7065 223a 2022 6175 6469 6f2d 6368 756e  pe": "audio-chun
00000090: 6b22 2c20 2276 6572 7369 6f6e 223a 2022  k", "version": "
000000a0: 312e 352e 3322 2c20 2264 6174 615f 6c65  1.5.3", "data_le
000000b0: 6e67 7468 223a 2036 312c 2022 7061 796c  ngth": 61, "payl
000000c0: 6f61 645f 6c65 6e67 7468 223a 2032 3034  oad_length": 204
000000d0: 387d 0a7b 2272 6174 6522 3a20 3232 3035  8}.{"rate": 2205
000000e0: 302c 2022 7769 6474 6822 3a20 322c 2022  0, "width": 2, "
000000f0: 6368 616e 6e65 6c73 223a 2031 2c20 2274  channels": 1, "t
00000100: 696d 6573 7461 6d70 223a 206e 756c 6c7d  imestamp": null}
00000110: 4500 2900 2c00 1500 0f00 1600 0900 1000  E.).,...........
00000120: 1b00 1c00 2a00 1f00 2400 2a00 2700 1a00  ....*...$.*.'...
00000130: 1500 0500 0d00 0600 0400 0b00 0f00 0500  ................
00000140: 1000 0000 0700 f8ff ebff 0700 e6ff e0ff  ................
00000150: e7ff edff f4ff dfff d8ff d7ff d4ff d7ff  ................
00000160: dcff c3ff c6ff dcff d5ff deff d7ff e4ff  ................
00000170: d4ff edff eeff e2ff edff f1ff f4ff fcff  ................
00000180: fdff 0500 feff fcff ffff 0b00 1300 0900  ................
00000190: 0c00 1200 1700 1b00 1f00 1f00 1e00 3200  ..............2.
000001a0: 3300 2900 2400 2700 2500 3400 2800 3900  3.).$.'.%.4.(.9.
000001b0: 2d00 3900 4500 4000 4400 3800 3700 5000  -.9.E.@.D.8.7.P.
000001c0: 4800 3d00 4300 3b00 3500 4100 3300 3800  H.=.C.;.5.A.3.8.
000001d0: 4500 3900 4a00 4900 3300 3b00 3400 3400  E.9.J.I.3.;.4.4.
000001e0: 3b00 2e00 2300 3b00 2900 2800 2c00 2300  ;...#.;.).(.,.#.
000001f0: 1d00 2400 2800 2c00 2500 3100 2700 2500  ..$.(.,.%.1.'.%.
00000200: 2f00 2700 2100 2400 2400 1d00 1d00 2800  /.'.!.$.$.....(.
00000210: 1a00 2600 2a00 1600 1a00 2200 1600 2100  ..&.*....."...!.
00000220: 1c00 1500 1000 1b00 0d00 1100 0e00 0a00  ................
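The dump suggests the socket output is not raw PCM but a sequence of framed events: a JSON header line ending in a newline, then `data_length` bytes of JSON, then `payload_length` bytes of binary audio (the samples begin at offset 0x110, right after the second JSON block). A minimal sketch of separating the audio from the framing follows. The layout is inferred from the bytes shown in this issue rather than from the protocol's reference documentation, and the names `read_event` and `extract_pcm` are hypothetical.

```python
import io
import json


def read_event(fp):
    """Read one framed event from a binary file-like object.

    Returns (header, data, payload), or None at end of stream.
    Framing is an assumption inferred from the hex dump in this issue.
    """
    line = fp.readline()
    if not line.strip():
        return None
    header = json.loads(line)
    # Optional JSON block of data_length bytes follows the header line.
    data_length = header.get("data_length") or 0
    if data_length:
        data = json.loads(fp.read(data_length))
    else:
        data = header.get("data") or {}
    # Optional binary payload of payload_length bytes follows the data block.
    payload_length = header.get("payload_length") or 0
    payload = fp.read(payload_length) if payload_length else b""
    return header, data, payload


def extract_pcm(fp):
    """Concatenate audio-chunk payloads; return (format from audio-start, pcm)."""
    audio_format = {}
    pcm = bytearray()
    while (event := read_event(fp)) is not None:
        header, data, payload = event
        if header.get("type") == "audio-start":
            audio_format = data
        elif header.get("type") == "audio-chunk":
            pcm += payload
    return audio_format, bytes(pcm)


# Demo on a tiny hand-built stream laid out like the dump (one 4-byte chunk).
fmt = b'{"rate": 22050, "width": 2, "channels": 1, "timestamp": null}'
start = json.dumps({"type": "audio-start", "data_length": len(fmt)}).encode() + b"\n" + fmt
chunk = (
    json.dumps({"type": "audio-chunk", "data_length": len(fmt), "payload_length": 4}).encode()
    + b"\n" + fmt + b"\x45\x00\x29\x00"
)
audio_format, pcm = extract_pcm(io.BytesIO(start + chunk))
```

For a saved capture, the same function could be called on `open("test_output.pcm", "rb")`. If the inferred framing is right, the returned bytes are the bare samples in the format reported by `audio-start` (here 22050 Hz, 16-bit, mono), and playing the file without stripping the JSON would explain the static.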

Piper container is started with:
docker run -itd --restart unless-stopped -p 10200:10200 -v /home/user/piperdata:/data rhasspy/wyoming-piper \
    --voice en_US-lessac-medium

What magic is done to clean up the audio?
