24.11.2025
GPT-OSS 120B on AI Cube Pro: Run OpenAI's Open-Source Model Locally
In August 2025, OpenAI released GPT-OSS 120B, their first open-weight model since GPT-2 – and it's impressive. The model achieves near o4-mini performance but...
Our Managed AI Servers provide you with the perfect infrastructure for hosting AI models and LLMs in your own environment.
With our powerful GPU servers, you can run compute-intensive AI applications while maintaining complete control over your data.
Our Managed AI Servers are fully configured and optimized for maximum performance and reliability.
Perfect for inference and small to medium-sized models
For large models and model training
We handle the complete management: installation, updates, monitoring, backups, and personal support.
Tested with leading open-source LLMs: Gemma, DeepSeek, Llama, Mistral, Qwen, Phi and many more models for diverse use cases.
State-of-the-art models from Meta. Available in 8B, 70B, and 405B. Excellent tool support.
Currently the most powerful model running on a single GPU. Integrated vision support.
Open reasoning models with performance at the level of o3 and Gemini 2.5 Pro. Thinking & tool support.
MoE architecture for efficient Large Language Models
Compact, efficient model from Microsoft
Multilingual LLMs from Alibaba Cloud
Ollama offers ease of use for fast prototyping, while vLLM delivers maximum performance for production environments.
Upon request, we install and configure both solutions on your server so you can use the optimal engine for your requirements (optional).
$ ollama run gemma3:27b
$ ollama run deepseek-r1:32b
$ vllm serve meta-llama/Llama-3.3-70B-Instruct
$ vllm serve mistralai/Mixtral-8x7B-Instruct-v0.1
vLLM is a highly optimized inference engine that was developed specifically for production environments with high throughput requirements. Ideal for APIs, batch processing, and applications with many simultaneous users.
Optimized for maximum token generation with concurrent requests
Efficient processing of multiple requests simultaneously
Ideal for production environments with high requirements
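As a sketch of what "many simultaneous users" looks like from the application side, the snippet below sends several prompts in parallel to a vLLM server's OpenAI-compatible completions endpoint. The host, port, and model name are assumptions – adjust them to your own deployment started with `vllm serve`:

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib import request

# Assumed endpoint of a local vLLM server started with `vllm serve ...`;
# replace host, port, and model name with your deployment's values.
VLLM_URL = "http://localhost:8000/v1/completions"
MODEL = "mistralai/Mixtral-8x7B-Instruct-v0.1"


def build_payload(prompt: str, max_tokens: int = 64) -> dict:
    """Build a request body for vLLM's OpenAI-compatible /v1/completions route."""
    return {"model": MODEL, "prompt": prompt, "max_tokens": max_tokens}


def complete(prompt: str) -> str:
    """Send one completion request and return the generated text."""
    data = json.dumps(build_payload(prompt)).encode()
    req = request.Request(VLLM_URL, data=data,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"]


def complete_many(prompts: list[str]) -> list[str]:
    """Fire all prompts concurrently; vLLM batches them on the GPU,
    so total latency stays close to that of a single request."""
    with ThreadPoolExecutor(max_workers=len(prompts)) as pool:
        return list(pool.map(complete, prompts))
```

With a running server, `complete_many(["...", "...", "..."])` returns all generations in roughly the time one request would take, which is the continuous-batching behavior the bullet points above describe.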
OpenWebUI provides a user-friendly web interface for Ollama that makes working with AI models much easier.
With features such as chat history, model management, and prompt templates, you optimize your interactions with the AI models.
Your data stays in Germany. Full control over your AI models and generated data.
Dedicated GPU resources without sharing. Optimized for low latency and high throughput.
We handle installation, updates, and maintenance. You simply use your AI models.
Start small and grow with your requirements. Upgrades possible at any time.
Full API access for integration into your applications and workflows.
Use any open-source models. No vendor lock-ins or restrictions.
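To illustrate the API access mentioned above, here is a minimal sketch of calling a self-hosted Ollama instance over its HTTP API. The host and model name are assumptions; Ollama listens on port 11434 by default:

```python
import json
from urllib import request

# Assumed: an Ollama instance reachable on its default port 11434;
# replace host and model name with your own deployment.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> dict:
    """Build a non-streaming request body for Ollama's /api/generate route."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(model: str, prompt: str) -> str:
    """Call the Ollama HTTP API and return the model's full response text."""
    data = json.dumps(build_request(model, prompt)).encode()
    req = request.Request(OLLAMA_URL, data=data,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)["response"]
```

The same endpoint works from any language or workflow tool that can issue HTTP requests, which is what makes integration into existing applications straightforward.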
Explore our AI Server offerings
Own hardware, no vendor lock-in - local AI inference from €4,299.90
Host Large Language Models securely and GDPR-compliant in Germany
Powerful GPU servers with NVIDIA RTX for AI applications
Fully managed AI servers with 24/7 monitoring
Real-world examples and scenarios for AI servers
Understand the differences between inference and training servers
Custom RAG solutions to connect your knowledge databases to AI models
09.11.2025
In times of rising cloud costs, data sovereignty challenges and vendor lock-in, the topic of local AI inference is becoming increasingly important for companies. With...
09.11.2025
More and more companies are considering running Large Language Models (LLMs) on their own hardware rather than via cloud APIs. The reasons for this are...
CTO, EVA Real Estate, UAE
"I recently worked with Timo and the WZ-IT team, and honestly, it turned out to be one of the best tech decisions I have made for my business. Right from the start, Timo took the time to walk me through every step in a simple and calm way. No matter how many questions I had, he never rushed me. The results speak for themselves. With WZ-IT, we reduced our monthly expenses from $1,300 down to $250. This was a huge win for us."
Data Manager, ARGE, Germany
"With Timo and Robin, you're not only on the safe side technically - you also get the best human support! Whether it's quick help in everyday life or complex IT solutions: the guys from WZ-IT think along with you, act quickly and speak a language you understand. The collaboration is uncomplicated, reliable and always on an equal footing. That makes IT fun - and above all: it works! Big thank you to the team! (translated)"
"WZ-IT manages our Proxmox cluster reliably and professionally. The team handles continuous monitoring and regular updates for us and responds very quickly to any issues or inquiries. They also configure new nodes, systems, and applications that we need to add to our cluster. With WZ-IT's proactive support, our cluster and the business-critical applications running on it remain stable, and high availability is consistently ensured. We value the professional collaboration and the noticeable relief it brings to our daily operations."
CEO, Aphy B.V., Netherlands
"Timo and Robin from WZ-IT set up a RocketChat server for us - and I couldn't be more satisfied! From the initial consultation to the final implementation, everything was absolutely professional, efficient, and to my complete satisfaction. I particularly appreciate the clear communication, transparent pricing, and the comprehensive expertise that both bring to the table. Even after the setup, they take care of the maintenance, which frees up my time enormously and allows me to focus on other important areas of my business - with the good feeling that our IT is in the best hands. I can recommend WZ-IT without reservation and look forward to continuing our collaboration! (translated)"
"We have had very good experiences with Mr. Wevelsiep and WZ-IT. The consultation was professional, clearly understandable, and at fair prices. The team not only implemented our requirements but also thought along and proactively. Instead of just processing individual tasks, they provided us with well-founded explanations that strengthened our own understanding. WZ-IT took a lot of pressure off us with their structured approach - that was exactly what we needed and is the reason why we keep coming back. (translated)"
"Robin and Timo provided excellent support during our migration from AWS to Hetzner! We received truly competent advice and will gladly return to their services in the future. (translated)"
"WZ-IT set up our Jitsi Meet server anew - professional, fast, and reliable. (translated)"
Whether a specific IT challenge or just an idea – we look forward to the exchange. In a brief conversation, we'll evaluate together if and how your project fits with WZ-IT.
Timo Wevelsiep & Robin Zins
CEOs of WZ-IT