We support a range of high-performance open-source models, each optimized for different use cases and budgets.
Qwen3-8B — €0.04–€0.13 per 1M tokens
Fast and efficient 8B-parameter model, perfect for general-purpose AI chat and document processing.
Qwen3-14B — €0.06–€0.24 per 1M tokens
Advanced 14B-parameter model with superior reasoning capabilities for complex tasks and analysis.
Phi-3-mini-128k-instruct — €0.09 per 1M tokens
Microsoft's efficient model with a 128k-token context window, ideal for long-document processing and analysis.
Mistral-8B — €0.09 per 1M tokens
Reliable 8B-parameter model from Mistral AI, excellent for production workloads and enterprise use.
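As a rough illustration of how per-1M-token pricing translates into job cost (the token count below is an example, not a quote; actual billing may distinguish input and output tokens):

```python
def inference_cost_eur(tokens: int, rate_per_million_eur: float) -> float:
    """Cost in EUR for a job, given a per-1M-token rate."""
    return tokens / 1_000_000 * rate_per_million_eur

# Example: processing 250k tokens at the €0.04 per-1M-token rate
print(inference_cost_eur(250_000, 0.04))  # 0.01
```

At the upper Qwen3-14B rate of €0.24, the same 250k-token job would cost €0.06.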
All our inference servers are located within the European Union, ensuring GDPR compliance and data sovereignty.
Benefits of EU-based infrastructure:
Full GDPR compliance and data protection
Low latency for European users
Data never leaves EU jurisdiction
Compliance with financial services regulations
24/7 monitoring and support
Get up to 70% off inference costs with our flexible spot-instance rental program.
Perfect for fine-tuning and batch processing
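To see what "up to 70% off" means in practice, here is a minimal sketch of the effective rate under a spot discount (the 70% figure is the advertised maximum; actual discounts may vary):

```python
def spot_rate_eur(on_demand_rate_eur: float, discount: float = 0.70) -> float:
    """Effective per-1M-token rate after a spot discount (0.70 = 70% off)."""
    return on_demand_rate_eur * (1 - discount)

# Example: a €0.09 per-1M-token rate at the full 70% discount
print(round(spot_rate_eur(0.09), 4))  # 0.027
```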
Get started with our EU-based inference infrastructure and choose the model that fits your needs and budget.
📩 Questions? Contact us at hello@localassistant.ai
Human-first. Confidential by default.
© 2025 LocalAssistant.AI - Made in Luxembourg 🇪🇺