WZ-IT Logo
WZ-IT AI Cube - Die kompakte und lokale KI-Lösung für Unternehmen
GDPR Compliant
NVIDIA RTX Blackwell
Support from Germany
MadeinGermany

The compact and local AI solution for businesses

Run your AI infrastructure locally, without cloud and without huge server racks!

Ready to use with pre-installed software

100% data sovereignty in your network

One-time investment instead of monthly fees

Why Local AI Infrastructure?

Cloud services offer convenience – but also dependency. With an AI Cube, you retain full control over your data, models, and systems. Whether chatbots, RAG systems, or internal AI automation: Your sensitive data stays within your company, while computing power is directly on-site.

The AI Cube is owned by your company – no monthly fees, no token limits, no vendor lock-in. You decide which software runs, which models are used, and how your AI infrastructure grows.

Data Sovereignty

Your models and data never leave your corporate network. Complete control over sensitive information.

Full Control

No API limits, no external updates, no restrictions. You decide every aspect of your AI infrastructure.

Performance

Minimal latency through local inference. No delays from cloud connections.

Cost Efficiency

No token or pay-per-use fees. One-time investment instead of ongoing costs.

Ownership vs. Rental

The AI Cube is completely yours. No monthly subscriptions, no vendor dependency.

Optional Managed Service

If desired, we handle operation, maintenance, and updates – you focus on your projects.

Local AI Usage

Local GPT with our AI Cube

Use Open WebUI for a ChatGPT-like experience – completely local on your own hardware

Open WebUI Screenshot - ChatGPT-like interface

The AI Cube can be delivered with Open WebUI based on customer requirements – an intuitive, user-friendly interface that enables a local ChatGPT-like experience. No cloud dependency, no API keys, no token limits – just you and your AI models.

ChatGPT-like Interface

Familiar and intuitive user interface for natural conversations with your local AI models

Completely Local

All data and conversations stay on your hardware – no connection to external servers required

Multi-Model Support

Switch seamlessly between different AI models within the same interface

No Token Fees

Unlimited usage without pay-per-use fees or monthly API costs

Open WebUI can be pre-installed and delivered ready to use upon request. Simply plug in, power on, and immediately interact with your local AI models – like ChatGPT, but completely under your control.

Vorinstalliert
Sofort einsatzbereit
100% lokal

How Our Customers Successfully Use the AI Cube

Our customers benefit from the locally operated AI solution – independent, secure and efficient. Here are two exemplary use cases.

Case Study: Law Firm

RAG-based Document Research

!Challenge

A medium-sized law firm with numerous mandates and a large file archive found that research for precedent cases, briefs, and internal evidence was often very time-consuming – several hours per case. Additionally, sensitive client data was present that should not go to external cloud systems.

Solution with AI Cube

  • RAG solution for knowledge database search: All briefs, judgments and internal documents in searchable knowledge database
  • Lawyers ask questions in natural language and immediately receive relevant document sections with source citations
  • Infrastructure remains completely in the firm's own network, operation and maintenance by the firm's IT service provider

Result

Drastically Reduced Research Time

Lawyers can argue and decide faster

Strengthened Knowledge Base

New employees access proven documents much faster

Case Study: Private Clinics Network (Psychiatric Facilities)

Knowledge Database for Medical Protocols

!Challenge

A clinic network with multiple locations must manage large amounts of medical protocols, SOPs, training materials and internal reports. Documentation was fragmented and difficult to access – especially when it came to quick decision support and quality checks.

Solution with AI Cube

  • Knowledge platform with BookStack as knowledge source (integration programmed by us), connected to RAG pipeline with Open WebUI + vLLM
  • Employees can ask questions directly with immediate citation of the source
  • AI Cube runs locally in the corporate network, operation and maintenance by us

Result

Drastically Reduced Access Time

Relevant documents are accessed immediately

Strengthened Quality & Compliance

Employees at different locations consistently access the same knowledge pool

New

The new Blackwell Architecture is here!

Even more performance, even more VRAM – the next generation of AI Cubes with NVIDIA Blackwell GPUs

Hardware for Purchase

Hardware Options for Your AI Projects

Proven configurations for every use case

Entry Model

AI Cube Basic

NVIDIA RTX PRO 4000 Blackwell

VRAM

24 GB

Performance

46.9 TFLOPS

CUDA Cores

8.960

Recommended Use:

Chatbots, Code Assistance, Text Inference

  • Ideal for models up to 13B parameters
  • Fast real-time inference
  • Perfect for 24/7 operation
  • Mini-ITX form factor
from €3,999.90
Learn More
Enterprise Model

AI Cube Pro

NVIDIA RTX PRO 6000 Blackwell

VRAM

96 GB

Performance

125 TFLOPS

CUDA Cores

24.064

Recommended Use:

Large LLM Models, Multi-GPU Workloads

  • For models up to 120B+ parameters (e.g. GPT-OSS 120B)
  • 96 GB VRAM for largest models
  • Enterprise-Grade Performance
  • Maximum Scalability
from €12,999.90
Learn More

Included in Delivery

Pre-installed Software (Ollama, vLLM, Open WebUI) – plug in & infer
Operating System & GPU Drivers
Setup Documentation
German Support

*Custom configurations available upon request.

Which AI Cube is right for you?

Answer 3 quick questions – we'll suggest the perfect model. Free & no commitment.

3 Questions

Instant Recommendation

100% Free

Test first, buy later

Unsure whether the AI Cube meets your requirements? No problem! We offer similar GPU configurations as managed hosting in the cloud – without minimum contract term.

Advantages of managed hosting for testing:

  • Similar hardware configurations as the AI Cubes
  • No minimum contract term – cancel monthly
  • Perfect for testing your workloads and determining the right size
  • When satisfied: Simply switch to AI Cube and save long-term
More than just Hardware

Your AI Cube & WZ-IT
Possibilities are endless together

With us you get not only powerful hardware, but also a competent partner for your entire AI infrastructure

Infrastructure Setup

From planning to implementation – we build your complete AI infrastructure and integrate the AI Cube seamlessly.

Custom Development

Tailored software solutions, RAG pipelines, APIs and integrations – perfectly matched to your requirements.

Innovative Solutions

Together we develop new AI applications for your specific use cases – from idea to production readiness.

Support & Maintenance

Continuous support, updates and optimizations – so your AI infrastructure always runs optimally.

Success Story: From Hardware to Complete Solution

A clinic network purchased the AI Cube Ultra for local AI inference. We not only delivered the hardware, but also programmed a custom RAG pipeline that uses BookStack as a knowledge source and is integrated into Open WebUI. The result: employees can access medical protocols and SOPs in seconds – fully GDPR compliant and without cloud.

Let's realize your AI vision together

Software Stack & Compatibility

Ready to Use with Leading Open-Source Frameworks

Pre-installed Software:

Ollama – for simple model management
vLLM – for high-performance inference
Open WebUI – for visual interaction
Docker / Podman – for containerized deployments
REST API Access – for integration

Compatible with:

Llama 3.1 (7B–70B)
Gemma 3 (2B–27B)
DeepSeek-R1
Mistral
Phi-4
Qwen
Custom Models
Ollama

Ollama

Simple model management with one-command installation. Perfect for rapid prototyping and smaller projects.

$ ollama run llama3.1:70b
vLLM

vLLM

High-performance inference with PagedAttention for production workloads with high throughput.

$ vllm serve llama3.1:70b

Technical Specifications

More technical details on request

KomponenteAI Cube BasicAI Cube Pro
Graphics CardNVIDIA RTX PRO 4000 Blackwell (24 GB GDDR7)NVIDIA RTX PRO 6000 Blackwell (96 GB GDDR7)
Network1 GbE (10 GbE optional)1 GbE (10 GbE optional)
Dimensions & Weight292×185×372 mm (H×W×D), approx. 8 kg292×185×372 mm (H×W×D), approx. 8 kg
CertificationCE, RoHS, GDPR-compliantCE, RoHS, GDPR-compliant
SecuritySecure Boot, TPM 2.0, WireGuard VPNSecure Boot, TPM 2.0, WireGuard VPN

AI Cubes (Purchase) vs Managed AI Server (Rental)

Find the Right Model for Your Business

AI Cubes – Purchase

  • Complete hardware ownership
  • CapEx: One-time investment from €3,999.90
  • Full data sovereignty – hardware stays with you
  • No recurring fees (except optional support)
  • Ideal for long-term projects

Managed AI Server – Rental

  • OpEx: Monthly payment from €499/month
  • Fast start without capital commitment
  • 24/7 monitoring & maintenance included
  • Scalable: upgrade or downgrade anytime
  • Ideal for flexible or experimental projects

Why AI Cube?

All benefits at a glance

On-Prem LLM Hosting vs. Cloud API: Costs & Risks

Cloud-based LLM APIs like OpenAI, Anthropic, or Google Gemini are convenient – but expensive and risky. At high volumes, costs can quickly spiral out of control: 1 million tokens per day via cloud APIs can easily cost €15,000 per month or more. With an AI Cube, you pay once from €3,999.90 and run unlimited inferences – no token fees, no monthly bills.

Additionally, on-premise LLM hosting gives you full control over your data. Sensitive information – customer data, internal documents, proprietary content – never leaves your corporate network. You're independent of API downtimes, price increases, or sudden service changes.

How the WZ-IT AI Cube Works

1

Analysis & Consultation

We jointly evaluate your requirements and use cases. In a free consultation, we determine which hardware configuration is optimal for your models and use cases.

2

Hardware Selection & Configuration

Based on model size and requirements, we select the appropriate GPU configuration. We fully configure the system and install Ollama, vLLM, Open WebUI, and other software according to your preferences.

3

Delivery & Setup

The Cube is delivered pre-installed and tested. After plugging it in, it can be operational within minutes. We support you in integrating it into your network.

4

Operation & Support (Optional)

You operate the Cube independently with full root access – or leave maintenance, monitoring, and updates to us through our optional managed service.

Typical Use Cases

Enterprises & Government

For sensitive data that cannot go to the cloud. Run internal chatbots, document analysis, or code assistants completely locally and GDPR-compliant.

Development & Research

Test and develop AI applications locally without cloud dependency. Ideal for rapid prototyping, model fine-tuning, and experimental projects.

On-Premise Deployment

Integrate AI capabilities directly into your existing infrastructure. No internet connection required, complete control over your data.

No Dependencies. No Vendor Lock-in.

With AI Cubes, you retain full decision-making freedom: you can install your own models, migrate existing setups, or integrate software solutions of your choice – without license binding, API constraints, or external control. All components are open-source based and documented.

100% Open Source Stack

Frequently Asked Questions

Industry-leading companies rely on us

  • Keymate
  • SolidProof
  • Rekorder
  • Führerscheinmacher
  • ARGE
  • NextGym
  • Paritel
  • EVADXB
  • Boese VA
  • Maho Management
  • Aphy
  • Negosh
  • Millenium
  • Yonju
  • Mr. Clipart

What do our customers say?

Sonja Aßer

Sonja Aßer

Data Manager, ARGE

ARGE
"With Timo and Robin, you're not only on the safe side technically - you also get the best human support! Whether it's quick help in everyday life or complex IT solutions: the guys from WZ-IT think along with you, act quickly and speak a language you understand. The collaboration is uncomplicated, reliable and always on an equal footing. That makes IT fun - and above all: it works! Big thank you to the team! (translated) "
"

Timo and Robin from WZ-IT set up a RocketChat server for us - and I couldn't be more satisfied! From the initial consultation to the final implementation, everything was absolutely professional, efficient, and to my complete satisfaction. I particularly appreciate the clear communication, transparent pricing, and the comprehensive expertise that both bring to the table. Even after the setup, they take care of the maintenance, which frees up my time enormously and allows me to focus on other important areas of my business - with the good feeling that our IT is in the best hands. I can recommend WZ-IT without reservation and look forward to continuing our collaboration! (translated)

S
Sebastian Maier
CEO Yonju GmbH
Yonju
"

We have had very good experiences with Mr. Wevelsiep and WZ-IT. The consultation was professional, clearly understandable, and at fair prices. The team not only implemented our requirements but also thought along and proactively. Instead of just processing individual tasks, they provided us with well-founded explanations that strengthened our own understanding. WZ-IT took a lot of pressure off us with their structured approach - that was exactly what we needed and is the reason why we keep coming back. (translated)

M
Matthias Zimmermann
CEO Annota GmbH
"

Robin and Timo provided excellent support during our migration from AWS to Hetzner! We received truly competent advice and will gladly return to their services in the future. (translated)

S
Simon Deutsch
CEO WiseWhile UG
"

WZ-IT set up our Jitsi Meet Server anew - professional, fast, and reliable. (translated)

M
Mails Nielsen
CEO SolidProof (FutureVisions Deutschland UG)
SolidProof

Let's Talk About Your Idea

Whether a specific IT challenge or just an idea – we look forward to the exchange. In a brief conversation, we'll evaluate together if and how your project fits with WZ-IT.

Trusted by leading companies

  • Keymate
  • SolidProof
  • Rekorder
  • Führerscheinmacher
  • ARGE
  • NextGym
  • Paritel
  • EVADXB
  • Boese VA
  • Maho Management
  • Aphy
  • Negosh
  • Millenium
  • Yonju
  • Mr. Clipart
E-Mail
[email protected]
1/3 – Topic Selection33%

What is your inquiry about?

Select one or more areas where we can support you.