🇷🇴🇫🇷 On the 3rd and 4th of October, my colleagues from the AI Multimedia Lab and I proudly represented Romania at FrancoTech Paris 2024.
📽️ My exhibit featured a new module that I developed and integrated into the ContinualBot Framework. It supports chatting with video, a feature I introduced at this event for the first time. More to come!
▶️ The system takes a YouTube video URL and extracts distinct key frames from the video. These key frames are then used to generate both brief and detailed descriptions, following a pre-configured schema of dimensions. The detailed descriptions, along with their corresponding timestamps, are stored in a vector database and a graph database.
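For the curious, here is a minimal sketch of what "extracting distinct key frames" can look like. It is an illustration only, not the ContinualBot implementation: the `Frame` type, the `select_key_frames` function, the mean-absolute-difference criterion, and the `threshold` value are all assumptions made for this example, and the frames are tiny synthetic lists rather than decoded video.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    timestamp: float      # seconds from the start of the video
    pixels: List[float]   # flattened grayscale intensities in [0, 1]


def mean_abs_diff(a: List[float], b: List[float]) -> float:
    """Average per-pixel absolute difference between two frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_key_frames(frames: List[Frame], threshold: float = 0.2) -> List[Frame]:
    """Keep a frame only if it differs enough from the last kept frame.

    Assumption: a simple difference threshold stands in for whatever
    criterion the real system uses to decide that a frame is "distinct".
    """
    key_frames: List[Frame] = []
    for frame in frames:
        if not key_frames or mean_abs_diff(frame.pixels, key_frames[-1].pixels) > threshold:
            key_frames.append(frame)
    return key_frames


# Synthetic four-pixel "video" with two scene changes.
frames = [
    Frame(0.0, [0.0, 0.0, 0.0, 0.0]),   # first frame is always kept
    Frame(1.0, [0.05, 0.0, 0.0, 0.0]),  # nearly identical, skipped
    Frame(2.0, [1.0, 1.0, 1.0, 1.0]),   # scene change, kept
    Frame(3.0, [1.0, 0.9, 1.0, 1.0]),   # nearly identical, skipped
    Frame(4.0, [0.0, 1.0, 0.0, 1.0]),   # scene change, kept
]

key_timestamps = [f.timestamp for f in select_key_frames(frames)]
print(key_timestamps)
```

Each kept frame carries its timestamp, so any description generated from it can later be traced back to a moment in the video.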
🗨️ Users can literally talk to a video by entering text prompts that query specific information about it, such as:
- Identifying objects or text
- Finding scenes or objects and their corresponding timestamps
- Asking for details about a specific scene
- Comparing items
- Requesting clarifications about a scene if something is unclear
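To give a feel for how a query such as "find a scene and its timestamp" can be answered from stored descriptions, here is a toy sketch. Everything in it is hypothetical: the sample descriptions, the `embed` and `search` helpers, and the bag-of-words similarity, which only stands in for the learned embeddings and database a real deployment would likely rely on.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A stand-in for a learned text embedding."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical (timestamp in seconds, frame description) pairs.
descriptions = [
    (12.0, "a red car parked outside a bakery"),
    (47.5, "a whiteboard with the text hello world"),
    (93.0, "two people shaking hands in an office"),
]

# In-memory stand-in for a vector store: each record keeps its timestamp.
records = [(ts, desc, embed(desc)) for ts, desc in descriptions]


def search(query: str, top_k: int = 1):
    """Return the top_k (timestamp, description) pairs most similar to the query."""
    q = embed(query)
    ranked = sorted(records, key=lambda r: cosine(q, r[2]), reverse=True)
    return [(ts, desc) for ts, desc, _ in ranked[:top_k]]


hits = search("when does the red car appear")
print(hits)
```

Because the timestamp travels with each description, the best match points straight to the moment in the video the user asked about.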
🙏 I am grateful for the opportunity and for the amazing ecosystem we have at AIM.