This project demonstrates how to deploy an open-source large language model (Mistral 7B) on a Kubernetes cluster with real-time monitoring through Prometheus and Grafana. A Streamlit-based chatbot UI lets users send queries to the model and receive responses. The system covers key aspects of modern AI infrastructure: containerization, orchestration, and performance monitoring. The project also compares the open-source model with commercial models on speed, cost, and quality.
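As a rough illustration of how the speed and cost sides of such a comparison could be computed, here is a minimal sketch. It is not the project's actual code: `RunStats`, `tokens_per_second`, and `cost_usd` are hypothetical names, and the sketch assumes you have already measured latency and output token counts for each model and know a per-1,000-token price (or an amortized cluster cost for the self-hosted model). Quality is not covered, since it usually needs human or rubric-based evaluation.

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    """Measurements for one model on one prompt (hypothetical structure)."""
    latency_s: float          # wall-clock seconds for the full response
    output_tokens: int        # number of tokens generated
    usd_per_1k_tokens: float  # API price, or amortized cost if self-hosted


def tokens_per_second(run: RunStats) -> float:
    """Throughput: generated tokens divided by wall-clock latency."""
    if run.latency_s <= 0:
        raise ValueError("latency_s must be positive")
    return run.output_tokens / run.latency_s


def cost_usd(run: RunStats) -> float:
    """Cost of one response at the given per-1,000-token price."""
    return run.output_tokens / 1000 * run.usd_per_1k_tokens
```

Running both functions over the same prompts for each model gives per-model throughput and cost figures that can be compared side by side.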