I am frequently asked by clients and potential new colleagues about our team’s AI capabilities.

I believe that more than two hours of video content provides a more comprehensive answer than a verbal or written response, or even a slide/PDF presentation.

💻 During this session, we built a RAG system, starting with a Java LLM wrapper based on Spring AI with OpenAI integration, running locally on Windows.
💻 We then transitioned to Python with Langchain, using a Llama 2 LLM locally on Linux with a FAISS vector DB. We subsequently packaged everything into a Docker image and deployed it on AWS, demonstrating infrastructure-as-code capabilities and the portability of our architecture.
💻 In the next phase, we again used Langchain but with JavaScript, locally on macOS, replacing FAISS with PgVector as the vector DB to leverage a well-known DBMS (PostgreSQL).
💻 We then took the same Docker container, initially built and deployed on AWS, and shifted everything to Azure. We switched the vector DB to Qdrant and pivoted from Llama back to OpenAI. At this stage, we delved deeper into chat history, context awareness, and semantic search.
💻 Between phases, we introduced theoretical concepts (such as data chunking, embeddings, retrieval, vector databases, semantic search, and neural network quantization) and reviewed the steps we had just taken, providing technical explanations and the motivations behind our choices.
💻 Afterwards, we covered the QA aspect, focusing on methodologies and best practices for data protection, enterprise privacy, and digital assurance in AI development.
💻 Finally, we introduced how users can interact with the LLM and RAG through Prompt Engineering, covering key concepts such as context, tone, temperature, prompt tactics, Chain-of-Thought, and Thread-of-Thought.
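To make the theoretical concepts above concrete, here is a framework-free Python sketch of the chunking → embedding → retrieval loop. The bag-of-words embedding is a deliberately naive stand-in for a real embedding model (such as OpenAI's), and the brute-force cosine search stands in for a vector DB like FAISS, PgVector, or Qdrant; the sample documents are illustrative only.

```python
import math
import re
from collections import Counter

def chunk(text, size=80):
    """Naive chunking: split a document into fixed-size character slices."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text):
    """Toy embedding: bag-of-words term counts (a real system calls an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, index, k=1):
    """Semantic search: rank stored chunks by similarity to the query embedding."""
    q = embed(query)
    return sorted(index, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

docs = [
    "Spring AI wraps OpenAI models in Java.",
    "FAISS is an in-memory vector database.",
    "PgVector adds vector search to PostgreSQL.",
]
index = [c for d in docs for c in chunk(d)]  # small docs stay whole chunks
print(retrieve("which vector DB runs inside PostgreSQL?", index))
```

A production pipeline swaps each piece for its real counterpart (token-aware splitters, a trained embedding model, an approximate-nearest-neighbor index) without changing the overall shape.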
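As a taste of the prompt-engineering levers mentioned above, here is a hypothetical prompt builder showing how context, tone, temperature, and a Chain-of-Thought tactic can be encoded into a chat-style request; the message/temperature shape follows the common chat-completion convention, but the builder itself is an illustrative assumption, not code from the session.

```python
def build_prompt(question, context="", tone="concise", chain_of_thought=False):
    """Assemble a chat request that encodes context, tone, and an optional CoT tactic."""
    system = f"You are a helpful assistant. Answer in a {tone} tone."
    if context:
        # Context awareness: ground the model in retrieved passages.
        system += f"\nUse only this context:\n{context}"
    user = question
    if chain_of_thought:
        user += "\nLet's think step by step."  # classic Chain-of-Thought trigger
    return {
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
        "temperature": 0.2,  # low temperature -> more deterministic answers
    }

request = build_prompt(
    "Which vector DB did we deploy on Azure?",
    context="On Azure we switched the vector DB to Qdrant.",
    chain_of_thought=True,
)
print(request["messages"][1]["content"])
```

The returned dict is what you would pass (suitably adapted) to an LLM client; raising the temperature trades determinism for more varied phrasing.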

I would like to thank Ivan Shelonik, Daniel – Mihai Gorgan, Saikumar U, Mariana Batiuk, Maks Lypivskyi for a great collaboration and Iryna Rud for moderating this amazing event!

I am grateful for having such great colleagues at Ciklum and working together as one team!

Watch the video recording here:
Architecting Scalable AI RAG Systems: From Startup To Enterprise. A Live Coding Session