In this demo, I run Qwen 2.5-0.5B directly inside LOOM, my cross-language AI runtime that supports Go, Python, C#, WASM, and WebGPU, all without PyTorch or TensorFlow. The model weights are loaded from Hugging Face and executed through LOOM's unified tensor engine, demonstrating transformer inference across:

* 🧩 Native Go backend
* ⚙️ Python Flask front-end
* 🌐 Browser interface
* 📱 Mobile and WebGPU targets

This marks a milestone for model-agnostic AI: a single runtime can load and run any transformer architecture (BERT, Qwen, Llama, or Mistral) using the same core engine.

🔹 Framework: LOOM (OpenFluke project)
🔹 Model: Qwen/Qwen2.5-0.5B
🔹 Layers: 24 transformer blocks (96 internal layers)
🔹 Runtime: Go + C-ABI + WebGPU (a sketch of the C-ABI boundary follows the tags below)
🔹 Output: Native generation via CPU and Web UI

#AIFramework #Transformer #Qwen #LLM #OpenSourceAI #WebGPU #GoLang #Python #MachineLearning #HuggingFace #LoomAI #OpenFluke #NeuralNetwork #ModelConversion #InferenceEngine #CrossPlatformAI #BERT #Mistral #Llama #EdgeAI #MobileAI
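
For anyone curious how a single Go engine can back Python, C#, and other hosts, here is a minimal sketch of the C-ABI boundary mentioned above. The `Generate` function and its signature are hypothetical, not LOOM's actual API; what the sketch does show accurately is the standard cgo `//export` pattern for compiling Go into a shared library callable from any language with a C FFI.

```go
// Illustrative only: Generate is a hypothetical entry point, not LOOM's
// real API. The pattern itself (cgo //export + c-shared build mode) is
// the standard way to expose a Go runtime over the C ABI.
package main

import "C"

//export Generate
func Generate(prompt *C.char) *C.char {
	input := C.GoString(prompt) // copy the C string into Go memory
	// A real runtime would tokenize the prompt, run the transformer
	// blocks, and decode tokens here; this placeholder just echoes.
	return C.CString("echo: " + input) // caller must free this buffer
}

// main is required (but unused) when building with -buildmode=c-shared.
func main() {}
```

Built with `go build -buildmode=c-shared -o libdemo.so`, the library can then be loaded from Python via `ctypes.CDLL("./libdemo.so")` (setting `restype` to `c_char_p` before calling `Generate`), which is the kind of bridge that lets a Flask front-end drive a native Go backend.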