The Quantum Dispatch

ESP32 Walkie-Talkie: G.722 HD Voice Rides ESP-NOW Packets

PCMFlowG722 turns ESP32 boards into an HD-voice walkie-talkie over ESP-NOW, fitting a 160-byte G.722 frame into one packet. No router needed.

Alex Circuit | May 31, 2026 | 4 min read

How an ESP32 Walkie-Talkie Library Squeezes HD Voice Into ESP-NOW Packets

If you have ever wanted to build your own off-grid intercom, here is a maker project worth getting excited about. A freshly released open-source library called PCMFlowG722 turns ordinary ESP32 boards into a real-time, two-way walkie-talkie that carries crisp HD voice over Espressif's connectionless ESP-NOW protocol. No Wi-Fi access point, no router, no cloud account. Just two boards, each with a microphone and a speaker, and a clever bit of audio engineering doing the heavy lifting. Version 0.2.0, published by Tanaka Masayuki under the permissive MIT license, landed on May 25, 2026, and it is the kind of tidy, hackable building block that makes hardware tinkering genuinely fun.

Why the G.722 Codec Is the Star of the Show

The trick that makes this whole thing work is the choice of codec. PCMFlowG722 is a G.722 wideband-codec add-on to the existing PCMFlow Arduino library, and G.722 is a wonderful fit here. It delivers roughly 7 kHz of audio bandwidth at a 16 kHz sampling rate, running at a fixed 64 kbps. In plain terms, that means voice that sounds noticeably fuller and more natural than the narrowband telephone audio many DIY projects settle for. You get the "S" sounds, the breathy consonants, the presence that makes a conversation feel like a conversation.
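To put those numbers side by side, here is a small sketch of the bitrate arithmetic. The 16 kHz and 64 kbps figures come from the paragraph above; the 16-bit mono PCM baseline is an assumption made for the comparison, and the constant and function names are purely illustrative, not part of PCMFlowG722.

```cpp
#include <cstdint>

constexpr std::uint32_t kSampleRateHz     = 16000; // G.722 sampling rate (from the article)
constexpr std::uint32_t kG722BitRate      = 64000; // fixed G.722 bitrate (from the article)
constexpr std::uint32_t kPcmBitsPerSample = 16;    // assumption: 16-bit mono PCM input

// Bitrate of the uncompressed PCM stream.
constexpr std::uint32_t pcmBitRate() { return kSampleRateHz * kPcmBitsPerSample; }

// Average number of coded bits G.722 spends per input sample.
constexpr std::uint32_t g722BitsPerSample() { return kG722BitRate / kSampleRateHz; }

// How many times smaller the coded stream is than the raw one.
constexpr std::uint32_t compressionRatio() { return pcmBitRate() / kG722BitRate; }

static_assert(pcmBitRate() == 256000, "16 kHz x 16 bits = 256 kbps");
static_assert(g722BitsPerSample() == 4, "64 kbps / 16 kHz = 4 bits per sample");
static_assert(compressionRatio() == 4, "256 kbps / 64 kbps = 4:1");
```

Under that assumption, G.722 carries wideband speech in a quarter of the bits raw PCM would need.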

What really delights the engineer in me is how economical it all is. The codec footprint is only about 10 to 12 KB of flash and roughly 512 bytes of RAM per direction. That is a rounding error on a modern microcontroller, leaving plenty of headroom for the rest of your project.

The Packet Math: Why 160 Bytes Changes Everything

Here is the clever frame-fitting puzzle at the heart of the design, and it is worth walking through. ESP-NOW is a lightweight, connectionless transport with a maximum payload of 250 bytes per packet. That is a tight budget. A single 20 ms slice of raw 16 kHz PCM audio holds 320 samples, and at 16 bits per sample it weighs in at 640 bytes. That simply does not fit, so you would be forced to split every chunk across three packets, adding latency and complexity.

Now run the same 20 ms through G.722, and that frame compresses to exactly 160 bytes: 64 kbps for 20 ms is 1,280 bits. It drops neatly inside one ESP-NOW packet with 90 bytes to spare. One frame, one packet, no fragmentation. That clean alignment between the codec's frame size and the transport's payload ceiling is the elegant detail that makes a low-latency, real-time ESP32 walkie-talkie practical rather than fiddly. It is a genuinely satisfying piece of engineering.
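The packet math can be checked at compile time. The sketch below is a rough illustration rather than code from the library: the sample rate, frame length, bitrate, and 250-byte limit are the figures quoted in this article, 16-bit mono PCM is assumed, and every name is made up for the example.

```cpp
#include <cstddef>

constexpr std::size_t kSampleRateHz      = 16000; // from the article
constexpr std::size_t kFrameMs           = 20;    // one audio frame
constexpr std::size_t kPcmBytesPerSample = 2;     // assumption: 16-bit mono PCM
constexpr std::size_t kG722BitsPerSecond = 64000; // from the article
constexpr std::size_t kEspNowMaxPayload  = 250;   // payload limit cited in the article

// Samples captured in one frame.
constexpr std::size_t samplesPerFrame() { return kSampleRateHz * kFrameMs / 1000; }

// Size of one frame as raw PCM.
constexpr std::size_t rawPcmFrameBytes() { return samplesPerFrame() * kPcmBytesPerSample; }

// Size of the same frame after G.722 encoding.
constexpr std::size_t g722FrameBytes() { return kG722BitsPerSecond * kFrameMs / 1000 / 8; }

// Packets needed to carry one frame (ceiling division).
constexpr std::size_t packetsNeeded(std::size_t frameBytes, std::size_t maxPayload) {
    return (frameBytes + maxPayload - 1) / maxPayload;
}

static_assert(rawPcmFrameBytes() == 640, "raw PCM frame is 640 bytes");
static_assert(g722FrameBytes() == 160, "G.722 frame is 160 bytes");
static_assert(packetsNeeded(rawPcmFrameBytes(), kEspNowMaxPayload) == 3, "raw PCM needs 3 packets");
static_assert(packetsNeeded(g722FrameBytes(), kEspNowMaxPayload) == 1, "G.722 needs 1 packet");
```

If you change the frame length or payload limit, the assertions fail at build time, which makes this a handy guard when experimenting with different settings.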

The library keeps things simple operationally: it is half-duplex, working in a familiar push-to-talk style, so one side transmits while the other listens. For an intercom or field-radio project, that is exactly the right call.
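Push-to-talk logic of this kind boils down to a two-state machine. Here is a minimal sketch of the idea; the `PushToTalk` type and its methods are hypothetical and do not reflect how PCMFlowG722 or its example sketch is actually written.

```cpp
// Hypothetical names throughout: none of these come from PCMFlowG722.
enum class LinkState { Listening, Transmitting };

struct PushToTalk {
    LinkState state = LinkState::Listening; // idle boards listen by default

    // Call once per loop iteration with the current button level.
    // Returns true when the state changed on this call.
    bool update(bool buttonPressed) {
        const LinkState next =
            buttonPressed ? LinkState::Transmitting : LinkState::Listening;
        const bool changed = (next != state);
        state = next;
        return changed;
    }

    // While transmitting: capture, encode, and send frames.
    bool shouldEncodeAndSend() const { return state == LinkState::Transmitting; }

    // While listening: receive, decode, and play frames.
    bool shouldReceiveAndPlay() const { return state == LinkState::Listening; }
};
```

Because only one of the two paths is active at a time, a board never has to encode and decode in the same instant, which keeps the main loop simple.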

What You Get and What You Can Build

PCMFlowG722 ships with a friendly developer surface: dedicated G722Encoder and G722Decoder classes, plus a ready-to-run EspNowTransceiver example sketch that wires the codec to the radio for you. Drop it onto two boards and you are talking.

Hardware support is broad. The library targets the ESP32 family across the board, including the S3, C3, C6, and the newer P4, and it reaches well beyond Espressif silicon to the RP2040 and RP2350, Teensy 4.x, STM32 F4 and up, and the nRF52. The one notable exception is classic 8-bit AVR chips, which lack the muscle for the job. The author has demonstrated the project running on the M5Stack Core2, a tidy all-in-one platform that already has the speaker, mic, and display you would want for a finished gadget.

For makers, the use cases practically design themselves: off-grid intercoms between rooms or buildings, campsite and workshop communicators, robotics audio links, and weekend radio experiments where you want real wideband voice without standing up any network infrastructure.

This is the sort of release that rewards a Saturday afternoon and a soldering iron. The math is elegant, the footprint is tiny, and the result is HD voice flying through the air between two boards you built yourself. Hard to ask for more from a single MIT-licensed library.

Sources: CNX Software (May 30, 2026); Adafruit Blog (May 27, 2026); GitHub, tanakamasayuki/PCMFlowG722 (May 25, 2026)