Audio and text data processing.
In collaboration with our partner Humanitas, I carried out a proof of concept on the use of recent AI technologies (mainly LLMs, including Zephyr, OpenZephyrChat, and Mixtral) to build a chatbot system that integrates with internal tools; a sketch of this LLM layer follows the list below. The project was then extended to also support:
- Audio transcription, useful for processing meeting minutes, for example.
- Real-time audio transcription, for immediate feedback during a conversation.
- Voice synthesis, to assist visually impaired people.
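As a rough illustration of the chatbot layer, a minimal sketch of querying one of these models through the Hugging Face transformers pipeline could look like the following. The model identifier (HuggingFaceH4/zephyr-7b-beta) and the prompts are assumptions for illustration, not the exact configuration used in the proof of concept.

```python
from transformers import pipeline

# Assumed checkpoint: a publicly released Zephyr model; the proof of concept
# may have used a different Zephyr/Mixtral variant or serving stack.
chat = pipeline("text-generation", model="HuggingFaceH4/zephyr-7b-beta", device_map="auto")

messages = [
    {"role": "system", "content": "You are an assistant connected to internal tools."},
    {"role": "user", "content": "Summarise yesterday's meeting minutes."},
]

# Build the prompt with the model's own chat template, then generate a reply.
prompt = chat.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
reply = chat(prompt, max_new_tokens=256, do_sample=False)
print(reply[0]["generated_text"])
```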
The solutions chosen for the audio part are OpenAI Whisper for audio transcription, and pyttsx3 (backed by espeak) for speech synthesis.
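A minimal sketch of these two pieces, assuming the openai-whisper and pyttsx3 packages are installed and using meeting.wav as a placeholder file name:

```python
import whisper
import pyttsx3

# Transcribe a recorded meeting (file name is a placeholder).
model = whisper.load_model("base")      # larger models trade speed for accuracy
result = model.transcribe("meeting.wav")
print(result["text"])

# Read the transcript back with pyttsx3 (uses the espeak backend on Linux).
engine = pyttsx3.init()
engine.say(result["text"])
engine.runAndWait()
```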
A demo is available at https://chronos.spiderweak.fr. The demonstration is quite slow because it runs on a device with limited resources.
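For the real-time transcription mentioned above, one simple approach is to capture short microphone chunks and feed them to Whisper in a loop. The sketch below assumes the sounddevice package and a 5-second chunk size, which are illustrative choices rather than the ones used in the demo.

```python
import sounddevice as sd
import whisper

model = whisper.load_model("base")
SAMPLE_RATE = 16_000   # Whisper expects 16 kHz mono audio
CHUNK_SECONDS = 5      # assumed chunk length; shorter chunks lower latency but hurt accuracy

while True:
    # Record one chunk from the default microphone and block until it is full.
    # Naive approach: audio arriving between chunks is dropped.
    audio = sd.rec(int(CHUNK_SECONDS * SAMPLE_RATE), samplerate=SAMPLE_RATE,
                   channels=1, dtype="float32")
    sd.wait()
    # Whisper accepts a float32 NumPy array directly.
    text = model.transcribe(audio.flatten(), fp16=False)["text"].strip()
    if text:
        print(text)
```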
This project is the first step towards three other internal projects for the partner:
- Voice assistant integrated with a VoIP server (Asterisk)
- Personal assistant for people with visual impairments
- Drone control interface (Mission Planner)