Model Optimization / CUDA Data Science Engineer
Europe, Ukraine
We seek a highly skilled CUDA API Developer to design and implement a high-performance API server for text-to-speech (TTS) applications. The ideal candidate will have expertise in leveraging CUDA for GPU-accelerated computing, audio processing, and efficient data encoding. You will be responsible for integrating advanced TTS models, optimizing performance on GPUs, and converting audio output into Opus frames for real-time streaming and storage.
Responsibilities:
- Develop a robust API server in C++ to process TTS requests efficiently.
- Integrate CUDA-accelerated TTS models for fast inference.
- Implement audio processing pipelines to sanitize and normalize PCM data.
- Use libopus to encode PCM audio into Opus frames for streaming.
- Optimize GPU resource management to handle concurrent requests.
- Design and implement REST or gRPC endpoints for TTS services.
- Benchmark, test, and debug the system to optimize latency and throughput.
- Document the system architecture and provide maintenance support.
Requirements:
- Proficiency in C++ development, with experience in building performance-critical applications.
- Strong knowledge of CUDA and GPU programming for accelerated computing.
- Experience with audio processing libraries (e.g., libopus, FFmpeg).
- Familiarity with TTS models such as Tacotron and FastSpeech, or with custom implementations.
- Understanding of REST and gRPC APIs.
- Strong problem-solving and debugging skills.
- Knowledge of multi-threading and GPU resource management.
Nice to Have:
- Experience with TensorRT for model optimization.
- Knowledge of streaming protocols like WebSocket or HTTP/2.
- Background in real-time systems or multimedia applications.
- Experience with containerization tools like Docker for deployment.
Ready to rumble?
Send your CV or contact us here.