WZ-IT Logo
WZ-IT AI Cube - Die kompakte und lokale KI-Lösung für Unternehmen
GDPR Compliant
NVIDIA RTX Blackwell
Support from Germany
MadeinGermany

The local plug-and-play AI solution for businesses

Prevent data leakage by employees using ChatGPT & Co. – run your AI infrastructure locally, without cloud and without huge server racks!

Ready to use with pre-installed software

100% data sovereignty in your network

One-time investment instead of monthly fees

Europe-wide personal delivery & commissioning

Trusted by leading companies

  • Keymate
  • SolidProof
  • Rekorder
  • Führerscheinmacher
  • ARGE
  • NextGym
  • Paritel
  • EVADXB
  • Boese VA
  • Maho Management
  • Aphy
  • Negosh
  • Millenium
  • Yonju
  • Mr. Clipart

Why Local AI Infrastructure?

Cloud services offer convenience – but also dependency. With an AI Cube, you retain full control over your data, models, and systems. Whether chatbots, RAG systems, or internal AI automation: Your sensitive data stays within your company, while computing power is directly on-site.

The AI Cube is owned by your company – no monthly fees, no token limits, no vendor lock-in. You decide which software runs, which models are used, and how your AI infrastructure grows.

Data Sovereignty

Your models and data never leave your corporate network. Complete control over sensitive information.

Full Control

No API limits, no external updates, no restrictions. You decide every aspect of your AI infrastructure.

Performance

Minimal latency through local inference. No delays from cloud connections.

Cost Efficiency

No token or pay-per-use fees. One-time investment instead of ongoing costs.

Ownership vs. Rental

The AI Cube is completely yours. No monthly subscriptions, no vendor dependency.

Optional Managed Service

If desired, we handle operation, maintenance, and updates – you focus on your projects.

ROI Calculator

Cloud vs. On-Premises: When Does AI Cube Pay Off?

At 500 tokens/s continuous load, the AI Cube Pro pays for itself in under 4 months

OpenAI GPT-5 mini

Cloud API

Monthly$3,564
Yearly$42,768
Tokens/Mo.5.18B

Input: $0.25/1M • Output: $2.00/1M • 500 t/s output, 1,500 t/s input (3:1 ratio)

AI Cube Pro

On-Premises

One-time€13,599
Token costs€0
Token Limit

96 GB VRAM • 500+ t/s output • Unlimited usage

<4
Months Break-Even
€30K+
Savings/Year
100%
Data Control
Local AI Usage

Local GPT with our AI Cube

Use Open WebUI for a ChatGPT-like experience – completely local on your own hardware

Open WebUI Screenshot - ChatGPT-like interface

The AI Cube can be delivered with Open WebUI based on customer requirements – an intuitive, user-friendly interface that enables a local ChatGPT-like experience. No cloud dependency, no API keys, no token limits – just you and your AI models.

ChatGPT-like Interface

Familiar and intuitive user interface for natural conversations with your local AI models

Completely Local

All data and conversations stay on your hardware – no connection to external servers required

Multi-Model Support

Switch seamlessly between different AI models within the same interface

No Token Fees

Unlimited usage without pay-per-use fees or monthly API costs

Open WebUI can be pre-installed and delivered ready to use upon request. Simply plug in, power on, and immediately interact with your local AI models – like ChatGPT, but completely under your control.

Vorinstalliert
Sofort einsatzbereit
100% lokal

How Our Customers Successfully Use the AI Cube

Our customers benefit from the locally operated AI solution – independent, secure and efficient. Here are two exemplary use cases.

Case Study: Law Firm

RAG-based Document Research

!Challenge

A medium-sized law firm with numerous mandates and a large file archive found that research for precedent cases, briefs, and internal evidence was often very time-consuming – several hours per case. Additionally, sensitive client data was present that should not go to external cloud systems.

Solution with AI Cube

  • RAG solution for knowledge database search: All briefs, judgments and internal documents in searchable knowledge database
  • Lawyers ask questions in natural language and immediately receive relevant document sections with source citations
  • Infrastructure remains completely in the firm's own network, operation and maintenance by the firm's IT service provider

Result

Drastically Reduced Research Time

Lawyers can argue and decide faster

Strengthened Knowledge Base

New employees access proven documents much faster

Case Study: Private Clinics Network (Psychiatric Facilities)

Knowledge Database for Medical Protocols

!Challenge

A clinic network with multiple locations must manage large amounts of medical protocols, SOPs, training materials and internal reports. Documentation was fragmented and difficult to access – especially when it came to quick decision support and quality checks.

Solution with AI Cube

  • Knowledge platform with BookStack as knowledge source (integration programmed by us), connected to RAG pipeline with Open WebUI + vLLM
  • Employees can ask questions directly with immediate citation of the source
  • AI Cube runs locally in the corporate network, operation and maintenance by us

Result

Drastically Reduced Access Time

Relevant documents are accessed immediately

Strengthened Quality & Compliance

Employees at different locations consistently access the same knowledge pool

Reseller Program

Your clients need AI hardware?

As a reseller, you offer local AI solutions – we deliver the hardware and service

Want to not only use local AI solutions yourself, but also resell them to your customers? As a reseller, you receive preferred terms, technical support, and fully pre-installed systems. For Enterprise and Pro customers, we deliver personally.

Attractive Purchase Terms

Direct margin advantages for resellers and integrators.

White-Label Option

On request, we deliver the AI Cube completely neutral – ideal for system integrators who want to operate under their own brand.

Pre-installed AI Software

Ollama, vLLM, Open WebUI – ready to use for your end customers.

Technical Priority Support

Direct contact with us for questions about integration, RAG, models & hardware.

Custom Configurations

Custom models, RAG pipelines, GPU layouts, and network setups for specific customer requirements.

Expand Your Service Portfolio

You can now offer your customers their own local AI solutions – without having to develop hardware yourself.

Become a Reseller Partner

Contact us for a non-binding conversation about terms, technical details, and your individual requirements.

Enterprise & Pro Service

On-Site Service for Maximum Security & Comfort

For our AI Cube Pro customers, we offer personal delivery and professional commissioning in Germany and the Netherlands. For Enterprise customers, this service is available Europe-wide.

Secure Delivery

Directly to your company premises or to your customers – personally

Physical Installation

Professional installation and cabling on-site

Initial Setup

Operating system, GPU drivers, container environment and security configuration (VPN, firewall, backup)

Validation & Acceptance

Performance test, stability check and GDPR compliance review before commissioning

All-Inclusive Package

For Enterprise & Pro Customers

Our on-site service ensures that your AI Cube runs optimally from the start – without you having to worry about installation or configuration.

Perfect for companies that value:

Highest quality standards
Compliance & Data Protection
Clean Integration
AI Cube Pro: DE & NL
Enterprise: Europe-wide
New

We have replaced the Ada Generation!

Our AI Cubes now use NVIDIA RTX PRO Blackwell GPUs – the latest generation with more VRAM, higher efficiency, and better performance. Benefit from the latest technology for your local AI infrastructure.

Hardware for Purchase

Hardware Options for Your AI Projects

Proven configurations for every use case

Aufgrund von steigenden Speicherpreisen mussten wir unsere Preise anpassen, um weiterhin den gewohnten Support und Unterstützung gewährleisten zu können.

Entry Model

AI Cube Basic

NVIDIA RTX PRO 4000 Blackwell

VRAM

24 GB

Performance

46.9 TFLOPS

CUDA Cores

8.960

Recommended Use:

Chatbots, Code Assistance, Text Inference

GPT-OSS 20B Performance

50

token/s

Batch Size 1

  • Ideal for models up to 20B parameters
  • Fast real-time inference
  • Perfect for 24/7 operation
  • Mini-ITX form factor
  • < 6 months ROI vs. cloud APIs
  • Trade-In available
from €4,299.90
excl. VAT
Learn More
Enterprise Model

AI Cube Pro

NVIDIA RTX PRO 6000 Blackwell

VRAM

96 GB

Performance

125 TFLOPS

CUDA Cores

24.064

Recommended Use:

Large LLM Models, Training

GPT-OSS 20B Performance

200

token/s

Batch Size 1

  • For models up to 120B+ parameters (e.g. GPT-OSS 120B)
  • 96 GB VRAM for largest models
  • Enterprise-Grade Performance
  • < 4 months ROI vs. cloud APIs
  • Personal delivery & commissioning (DE & NL)
  • Trade-In available
from €13,599.90
excl. VAT
Learn More
Custom Configuration

AI Cube Custom

Multi-GPU Setups (e.g. H200, RTX Blackwell)

VRAM

Configurable

Performance

Configurable

CUDA Cores

Configurable

Recommended Use:

Multi-GPU Workloads, High-Performance Training

  • Multi-GPU with NVLink (2-8 GPUs)
  • NVIDIA H200 or RTX Blackwell
  • Extended storage & network options
  • Rack-Mount or Tower chassis
On Request
Learn More

Included in Delivery

Pre-installed Software (Ollama, vLLM, Open WebUI) – plug in & infer
Operating System & GPU Drivers
Setup Documentation
German Support
Interactive Demo

How fast is the AI Cube?

Test different token speeds and see the difference

Token Speed Simulator

Experience the difference of various token rates

50 tok/s
10 tok/s300 tok/s

At 50 tok/s, generating takes:

1.0s

Chat response

(~50 tokens)

3.0s

Email

(~150 tokens)

40.0s

Report

(~2000 tokens)

* Token rates vary depending on model size and query complexity

Upgrade Program

Upgrade & Trade-In – When Your AI Cube Needs to Grow

Your requirements are increasing — e.g. larger models, more concurrent users or more intensive AI workloads? With our trade-in program, you can easily exchange your existing AI Cube for a more powerful model — whether from Basic to Pro or from Pro to Custom.

Upgrade affordably

No complete new purchase — credit towards your new system

Planning security

Start small and upgrade as needed

Sustainable & secure

Secure data deletion and environmentally friendly recycling

How it works

1

Express interest

Contact us

2

Evaluation

We assess your device and determine a fair residual value

3

Receive credit

Discount on your new AI Cube Pro or Custom

More than just Hardware

Your AI Cube & WZ-IT
Possibilities are endless together

With us you get not only powerful hardware, but also a competent partner for your entire AI infrastructure

Infrastructure Setup

From planning to implementation – we build your complete AI infrastructure and integrate the AI Cube seamlessly.

Custom Development

Tailored software solutions, RAG pipelines, APIs and integrations – perfectly matched to your requirements.

Innovative Solutions

Together we develop new AI applications for your specific use cases – from idea to production readiness.

Support & Maintenance

Continuous support, updates and optimizations – so your AI infrastructure always runs optimally.

Timo Wevelsiep & Robin Zins - CEOs of WZ-IT

Timo Wevelsiep & Robin Zins

CEOs of WZ-IT

Success Story: From Hardware to Complete Solution

A clinic network purchased the AI Cube Pro for local AI inference. We not only delivered the hardware, but also programmed a custom RAG pipeline that uses BookStack as a knowledge source and is integrated into Open WebUI. The result: employees can access medical protocols and SOPs in seconds – fully GDPR compliant and without cloud.

Let's realize your AI vision together

Software Stack & Compatibility

Ready to Use with Leading Open-Source Frameworks

Pre-installed Software:

Ollama – for simple model management
vLLM – for high-performance inference
Open WebUI – for visual interaction
Docker / Podman – for containerized deployments
REST API Access – for integration

Compatible with:

Llama 3.3
Gemma 3
DeepSeek-R1
Ministral 3
Qwen 3
Phi-4
Custom Models
Ollama

Ollama

Simple model management with one-command installation. Perfect for rapid prototyping and smaller projects.

$ ollama run llama3.1:70b
vLLM

vLLM

High-performance inference with PagedAttention for production workloads with high throughput.

$ vllm serve llama3.1:70b
Performance Benchmarks

Datacenter Performance for Your Office

Real performance metrics of our AI Cubes with large open-source models – measured in tokens per second at batch size 1

ModellAI Cube Basic
RTX PRO 4000 (24 GB)
AI Cube Pro
RTX PRO 6000 (96 GB)
GPT-OSS 20B
~20 Milliarden Parameter
50 token/s
200 token/s
GPT-OSS 120B
~120 Milliarden Parameter
Not enough VRAM
150 token/s

All values were measured with batch size 1 and represent inference speed for interactive use cases. Actual performance may vary depending on model configuration and prompt length. Higher batch sizes increase throughput for parallel requests.

Technical Specifications

More technical details on request

KomponenteAI Cube BasicAI Cube Pro
Graphics CardNVIDIA RTX PRO 4000 Blackwell (24 GB GDDR7)NVIDIA RTX PRO 6000 Blackwell (96 GB GDDR7)
Network1 GbE (10 GbE optional)1 GbE (10 GbE optional)
Dimensions & Weight292×185×372 mm (H×W×D), approx. 8 kg292×185×372 mm (H×W×D), approx. 8 kg
CertificationCE, RoHS, GDPR-compliantCE, RoHS, GDPR-compliant
SecuritySecure Boot, TPM 2.0, WireGuard VPNSecure Boot, TPM 2.0, WireGuard VPN

AI Cubes (Purchase) vs Managed AI Server (Rental)

Find the Right Model for Your Business

AI Cubes – Purchase

  • Complete hardware ownership
  • CapEx: One-time investment from €4,299.90
  • Full data sovereignty – hardware stays with you
  • No recurring fees (except optional support)
  • Ideal for long-term projects

Managed AI Server – Rental

  • OpEx: Monthly payment from €499/month
  • Fast start without capital commitment
  • 24/7 monitoring & maintenance included
  • Scalable: upgrade or downgrade anytime
  • Ideal for flexible or experimental projects

Why AI Cube?

All benefits at a glance

On-Prem LLM Hosting vs. Cloud API: Costs & Risks

Cloud-based LLM APIs like OpenAI, Anthropic, or Google Gemini are convenient – but expensive and risky. At high volumes, costs can quickly spiral out of control: 1 million tokens per day via cloud APIs can easily cost €15,000 per month or more. With an AI Cube, you pay once from €4,299.90 and run unlimited inferences – no token fees, no monthly bills.

Additionally, on-premise LLM hosting gives you full control over your data. Sensitive information – customer data, internal documents, proprietary content – never leaves your corporate network. You're independent of API downtimes, price increases, or sudden service changes.

How the WZ-IT AI Cube Works

1

Analysis & Consultation

We jointly evaluate your requirements and use cases. In a free consultation, we determine which hardware configuration is optimal for your models and use cases.

2

Hardware Selection & Configuration

Based on model size and requirements, we select the appropriate GPU configuration. We fully configure the system and install Ollama, vLLM, Open WebUI, and other software according to your preferences.

3

Delivery & Setup

The Cube is delivered pre-installed and tested. After plugging it in, it can be operational within minutes. We support you in integrating it into your network.

4

Operation & Support (Optional)

You operate the Cube independently with full root access – or leave operation, maintenance, and updates to us. We remain your contact for extensions, support, and new requirements.

Typical Use Cases

Enterprises & Government

For sensitive data that cannot go to the cloud. Run internal chatbots, document analysis, or code assistants completely locally and GDPR-compliant.

Development & Research

Test and develop AI applications locally without cloud dependency. Ideal for rapid prototyping, model fine-tuning, and experimental projects.

On-Premise Deployment

Integrate AI capabilities directly into your existing infrastructure. No internet connection required, complete control over your data.

Industry Solutions

AI Cube for Your Industry

Tailored AI solutions for specific requirements

For Law Firms

GDPR-compliant document research, contract analysis and client communication. Attorney-client privilege maintained.

Learn more

For Clinics & Practices

Local AI for patient data, protocol analysis and medical knowledge databases.

Coming soon

For Financial Services

Compliance-conform AI for risk assessment, document analysis and advisory support.

Coming soon

Your industry not listed? We create custom solutions for your requirements.

No Dependencies. No Vendor Lock-in.

With AI Cubes, you retain full decision-making freedom: you can install your own models, migrate existing setups, or integrate software solutions of your choice – without license binding, API constraints, or external control. All components are open-source based and documented.

100% Open Source Stack

Frequently Asked Questions about AI Cube

Answers to the most important questions about your local AI solution

Topics

Hardware & Technology

The AI Cube is a plug-and-play AI hardware for businesses — ideal for running LLMs, transcriptions, or data-intensive workloads locally in your own network, without cloud dependency and fully GDPR-compliant.

We offer standard setups (AI Cube Basic / Pro) as well as custom systems: multi-GPU, large VRAM cards, rack-mount servers, or clusters with NVLink — depending on model size, user count, and workload.

The AI Cube Basic requires approx. 150–250W, the Pro approx. 350–450W. Both run on standard 230V and don't require special power supply. Individual builds are assessed separately.

Yes — since you own the hardware, you can replace or expand RAM, storage (NVMe/SSD), or GPU yourself at any time. We're happy to assist if needed — but you have full control over your hardware.

Privacy & Compliance

Yes — the AI Cube runs entirely locally. There's no communication with external cloud servers, no data transfer outside your network. This ensures maximum data sovereignty and GDPR compliance.

The AI Cube stores data exclusively locally. With TPM 2.0, Secure Boot, and optionally encrypted SSD/NVMe, we ensure maximum protection. For sensitive data, we recommend encrypted filesystem and restrictive access control.

Delivery & Service

Yes — on request, we deliver the AI Cube as plug-and-play: with pre-installed software, GPU drivers, and basic configuration. After powering on, you can start working with AI models immediately — no complex setup required.

Yes — for AI Cube Pro, we offer personal delivery and professional on-site setup in Germany and the Netherlands. For enterprise customers, this service is available Europe-wide.

Our technician delivers the AI Cube, connects it to power and network, and configures VPN/firewall on request. This is followed by a functional test and optional onboarding. We also offer training and documentation.

Software & Usage

The AI Cube supports common open-source frameworks and models — e.g., Llama, Mistral, Qwen, Gemma, DeepSeek, multimodal and transcription models. The pre-installed environment allows quick start.

Yes — depending on hardware configuration, multiple models can run in parallel. For intensive or parallel use, we recommend more powerful or customized hardware configurations.

Beyond chatbots and RAG systems: audio/video transcription, document indexing, data processing, code assistance, automation of internal processes — ideal for privacy-critical or compliance-relevant scenarios.

Costs & Economics

The entry configuration (AI Cube Basic) starts at approx. €4,299.90 (excl. VAT). Compared to cloud solutions, you save long-term — no ongoing token or API costs, no vendor lock-in.

When data privacy, control, consistent performance, and long-term planning are important — e.g., with sensitive data, compliance requirements, or frequent AI use.

Yes. We support migration: data and model transfer, re-setup on your on-prem system — without external dependency.

The AI Cube is owned by your company (one-time payment from €4,299.90 excl. VAT), while our AI servers are rented (from €499/month excl. VAT with managed service). The Cube is suitable for long-term planning, rented servers for flexible projects.

Maintenance & Support

Our pre-configured models are designed to be low-maintenance. If needed, we offer managed service: regular security patches, monitoring, updates — keeping your infrastructure stable and secure.

Yes — the AI Cube is compatible with common corporate networks. On request, we configure VPN, firewall, and connectivity so the Cube integrates securely and seamlessly.

In addition to hardware, we optionally offer managed service, maintenance, updates, monitoring, and support — especially for enterprise customers. Hardware, software, and support from a single source.

On request, we provide a backup concept: regular snapshots, redundant or external storage options, remote backup — keeping you protected even in case of hardware failure.

Regions & Reseller

We deliver Europe-wide — with special focus on Germany, the Ruhr area, and the Netherlands. This means short delivery times, regional service, and direct support.

Our AI Cubes are custom-built in our workshop in Dortmund. Each AI Cube is an individual configuration optimized for hardware and use case.

Yes — we offer a reseller program with attractive purchasing conditions, technical support, and optional white-label license. Ideal for system integrators and IT service providers.

More questions? We are happy to help!

Still have questions? Contact us!

Industry-leading companies rely on us

  • Keymate
  • SolidProof
  • Rekorder
  • Führerscheinmacher
  • ARGE
  • NextGym
  • Paritel
  • EVADXB
  • Boese VA
  • Maho Management
  • Aphy
  • Negosh
  • Millenium
  • Yonju
  • Mr. Clipart

What do our customers say?

Aleksandr Shuliko

Aleksandr Shuliko

CTO, EVA Real Estate, UAE

EVA Real Estate
"I recently worked with Timo and the WZ-IT team, and honestly, it turned out to be one of the best tech decisions I have made for my business. Right from the start, Timo took the time to walk me through every step in a simple and calm way. No matter how many questions I had, he never rushed me. The results speak for themselves. With WZ-IT, we reduced our monthly expenses from $1,300 down to $250. This was a huge win for us."
Sonja Aßer

Sonja Aßer

Data Manager, ARGE, Germany

ARGE
"With Timo and Robin, you're not only on the safe side technically - you also get the best human support! Whether it's quick help in everyday life or complex IT solutions: the guys from WZ-IT think along with you, act quickly and speak a language you understand. The collaboration is uncomplicated, reliable and always on an equal footing. That makes IT fun - and above all: it works! Big thank you to the team! (translated) "
Pascal Hakkers

Pascal Hakkers

CEO, Aphy B.V., Netherlands

Aphy
"WZ-IT manages our Proxmox cluster reliably and professionally. The team handles continuous monitoring and regular updates for us and responds very quickly to any issues or inquiries. They also configure new nodes, systems, and applications that we need to add to our cluster. With WZ-IT's proactive support, our cluster and the business-critical applications running on it remain stable, and high availability is consistently ensured. We value the professional collaboration and the noticeable relief it brings to our daily operations."
Gabriel Sanz Señor

Gabriel Sanz Señor

CEO, Odiseo Solutions, Spain

Odiseo Solutions
"Counting on WZ-IT team was crucial, their expertise and solutions gave us the pace to deploy in production our services, even suggesting and performing improvements over our configuration and setup. We expect to keep counting on them for continuous maintenance of our services and implementation of new solutions."
"

Timo and Robin from WZ-IT set up a RocketChat server for us - and I couldn't be more satisfied! From the initial consultation to the final implementation, everything was absolutely professional, efficient, and to my complete satisfaction. I particularly appreciate the clear communication, transparent pricing, and the comprehensive expertise that both bring to the table. Even after the setup, they take care of the maintenance, which frees up my time enormously and allows me to focus on other important areas of my business - with the good feeling that our IT is in the best hands. I can recommend WZ-IT without reservation and look forward to continuing our collaboration! (translated)

Sebastian Maier
Sebastian Maier
CEO Yonju GmbH
Yonju
"

We have had very good experiences with Mr. Wevelsiep and WZ-IT. The consultation was professional, clearly understandable, and at fair prices. The team not only implemented our requirements but also thought along and proactively. Instead of just processing individual tasks, they provided us with well-founded explanations that strengthened our own understanding. WZ-IT took a lot of pressure off us with their structured approach - that was exactly what we needed and is the reason why we keep coming back. (translated)

Matthias Zimmermann
Matthias Zimmermann
CEO Annota GmbH
Annota
"

Robin and Timo provided excellent support during our migration from AWS to Hetzner! We received truly competent advice and will gladly return to their services in the future. (translated)

S
Simon Deutsch
CEO WiseWhile UG
"

WZ-IT set up our Jitsi Meet Server anew - professional, fast, and reliable. (translated)

Mails Nielsen
Mails Nielsen
CEO SolidProof (FutureVisions Deutschland UG)
SolidProof

Let's Talk About Your Idea

Whether a specific IT challenge or just an idea – we look forward to the exchange. In a brief conversation, we'll evaluate together if and how your project fits with WZ-IT.

Trusted by leading companies

  • Keymate
  • SolidProof
  • Rekorder
  • Führerscheinmacher
  • ARGE
  • NextGym
  • Paritel
  • EVADXB
  • Boese VA
  • Maho Management
  • Aphy
  • Negosh
  • Millenium
  • Yonju
  • Mr. Clipart
Timo Wevelsiep & Robin Zins - CEOs of WZ-IT

Timo Wevelsiep & Robin Zins

CEOs of WZ-IT

1/3 – Topic Selection33%

What is your inquiry about?

Select one or more areas where we can support you.