Blog

Insights, tips, and updates from VoicePing

Whisper in Production: Real-Time Dual-Language Switching, the Failures We Hit, and the Architecture That Works
ASR Whisper

How VoicePing engineered Bilingual Mode for automatic, low-latency language switching inside a single WebSocket stream powered by customized Whisper V2 models.

Akira Noda - VoicePing
9 min
Evaluating Speaker Diarization Models: A Practical Comparison
Speaker Diarization NeMo

Technical comparison of NeMo MSDD and Pyannote 3.1 across 6 real-world test scenarios

Ashar Mirza - VoicePing
4 min
The Power of Real-Time Translation to Accelerate Global Innovation: VoicePing × Plug and Play Japan
AI Translation

Plug and Play Japan, a global accelerator that promotes N-to-N collaboration between startups and major corporations, holds its annual "Plug and Play Japan Summit," which typically attracts around 2,200 participants. With many international speakers and attendees, language barriers were a major challenge, which is why the real-time translation tool "VoicePing" was brought in.

VoicePing Editorial
8 min
How a Go Rewrite Made Our WebSocket Proxy 100x More Efficient
Go WebSocket

Rewriting a Python WebSocket proxy in Go with lock-free connection pooling and event-driven reconciliation

Akira Noda - VoicePing
16 min
Part 2: Scaling Translation Inference: +82% Throughput
vLLM Translation

How we improved vLLM inference throughput by 82% using AsyncLLMEngine and right-sized continuous batching

Ashar Mirza - VoicePing
5 min
Part 1: The Bottleneck to Scale Our Translation Inference Servers
FastAPI vLLM

Identifying the architectural bottlenecks in a FastAPI + multiprocessing setup that prevented efficient GPU utilization

Ashar Mirza - VoicePing
7 min