Audio and text data processing.

In collaboration with our partner Humanitas, I carried out a proof of concept on using recent AI technologies (mainly LLMs, including Zephyr, OpenZephyrChat and Mixtral) to build a chatbot system that integrates with internal tools. The project was then extended to also support:

  • Audio transcription, useful for producing meeting minutes, for example.
  • Real-time audio transcription, useful for getting immediate feedback during a conversation.
  • Voice synthesis, to assist visually impaired people.

For the audio part, the chosen solutions are OpenAI Whisper for transcription and pyttsx3 (and espeak) for voice synthesis.
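
As an illustration of how these two libraries can be combined, here is a minimal sketch. It is not the code of the project: the function names, the "base" model size, the speech rate and the MM:SS formatting are assumptions made for the example. It only uses the documented entry points of each library (whisper.load_model and model.transcribe for Whisper; pyttsx3.init, say and runAndWait for pyttsx3).

```python
def format_timestamp(seconds):
    """Render a duration in seconds as MM:SS (fractions are truncated)."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def format_segments(segments):
    """Turn Whisper-style segments into timestamped lines of minutes.

    Each segment is expected to be a mapping with at least the keys
    "start" (seconds) and "text", as found in result["segments"].
    """
    lines = [
        f"[{format_timestamp(seg['start'])}] {seg['text'].strip()}"
        for seg in segments
    ]
    return "\n".join(lines)


def transcribe(path, model_name="base"):
    """Transcribe an audio file and return timestamped text.

    Assumption: the openai-whisper package and ffmpeg are installed.
    The import is done lazily so the helpers above work without it.
    """
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(path)
    return format_segments(result["segments"])


def speak(text, rate=150):
    """Read text aloud through the default pyttsx3 driver.

    Assumption: pyttsx3 and a speech backend (such as espeak) are installed.
    """
    import pyttsx3

    engine = pyttsx3.init()
    engine.setProperty("rate", rate)  # speech rate in words per minute
    engine.say(text)
    engine.runAndWait()
```

With both libraries installed, calling transcribe("meeting.wav") on a hypothetical recording would return the timestamped minutes as a string, which can then be passed to speak() to have them read aloud.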

A demo is available at https://chronos.spiderweak.fr. It is quite slow because it runs on a device with limited resources.

This project is the first step towards three other internal projects at the partner:

  • Voice assistant integrated into a VoIP server (Asterisk)
  • Personal assistant for people with visual impairments
  • Drone control interface (Mission Planner)
Antoine BERNARD
Postdoctoral Fellow @ Polytechnique Montréal

Tech enthusiast, interested in computer networks, distributed processing and a bit of AI.