itria

ai, uncompromised

Introduction

Itria is a revolutionary iOS application that brings state-of-the-art on-device AI directly to your iPhone and iPad. Powered by Apple's Metal GPU acceleration and the open-source llama.cpp framework, Itria runs powerful language models entirely locally — no internet connection required, no data sent to the cloud, ever.

Itria is built for:

  • 👨‍💻 Developers debugging code or architecting systems
  • ✍️ Writers brainstorming stories or refining prose
  • 🔒 Privacy advocates demanding zero data collection
  • 🎓 Students solving problems offline on flights or commutes
  • 🧠 Creatives exploring ideas with an intelligent partner

In each case, Itria gives you cutting-edge AI capabilities while keeping your conversations completely private and under your control.

[Image: Itria Splash Screen]

What is Itria?

Itria is a native iOS application that brings state-of-the-art on-device AI to your pocket. Unlike cloud-based assistants that send your data to remote servers, Itria processes everything locally on your device's GPU through Metal.

The Privacy-First Approach

  • Zero Cloud Dependency: Once you download a model, Itria works offline indefinitely. No API keys, no subscription needed for core features, no hidden data collection.
  • Local Processing Only: All AI inference happens on your iPhone or iPad using Metal GPU acceleration. Your conversations never leave your device.
  • Open Ecosystem: Itria supports the GGUF format, giving you access to thousands of community-trained models from HuggingFace and other repositories.

Who Is Itria For?

  • Developers & Engineers: Get instant code assistance, architecture reviews, and debugging help without exposing proprietary code to cloud services.
  • Privacy Advocates: Anyone who wants AI capabilities without surrendering their data to tech giants.
  • Creatives & Writers: Brainstorm stories, refine prose, and explore ideas with a creative partner that respects your intellectual privacy.
  • Students & Researchers: Analyze documents, solve math problems, and conduct research offline — perfect for flights, commutes, or secure environments.

Features Overview

Core Capabilities

| Feature | Description | Benefit |
| --- | --- | --- |
| 20+ Preloaded Models | Curated selection of text and vision models ready to download instantly | Start using AI immediately, with no hunting for models |
| Vision Model Support | Analyze images with Qwen2.5-VL, InternVL3, Moondream2, and more | Get OCR, chart interpretation, and visual reasoning on photos you take |
| Metal GPU Acceleration | Fully optimized for Apple Silicon using SIMD matrix multiplication | Fast token generation (15-30 tok/s on modern devices) |
| HuggingFace Integration | Browse and download any GGUF model directly from within the app | Unlimited model selection from the open-source community |
| Specialist Personas | Architect (coding), Muse (writing), Oracle (math), Zen (concise) | Tailored AI behavior for different use cases (Pro feature) |
| Voice Input | Speak your prompts with automatic transcription and send | Hands-free interaction, perfect for on-the-go usage |
| Background Downloads | Download large models (2-8 GB) while using other apps | No need to keep the app open during downloads |
| Haptic Feedback | Subtle vibrations for token generation and interactions | Tactile confirmation of AI activity |
| Custom GGUF Import | Import models from Files app or cloud storage | Use your own fine-tuned or experimental models |
| Advanced Generation Controls | Tune temperature, top-p, top-k, repeat penalty, context size | Fine-grained control over AI behavior and creativity |
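
The generation controls in the last row all reshape the probability distribution the model samples its next token from. As a rough illustration only, the sketch below applies temperature, top-k, and top-p to a list of raw scores (logits) in plain Python. The function names, the default values, and the order in which the filters run are assumptions chosen for clarity; they are not taken from Itria or llama.cpp, and repeat penalty and context size are left out.

```python
import math
import random


def filter_logits(logits, temperature=0.8, top_k=40, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k and top-p filtering.

    Hypothetical helper for illustration; not Itria's or llama.cpp's code.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Temperature: divide logits before softmax; lower values sharpen the distribution.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]  # subtract max for numerical stability
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-k: keep only the k most likely tokens.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # Top-p (nucleus): keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize the surviving tokens so their probabilities sum to 1.
    kept_total = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_total for i in kept}


def sample_token(logits, rng=random, **settings):
    """Draw one token index from the filtered distribution."""
    dist = filter_logits(logits, **settings)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]
```

In this sketch, a lower temperature or a smaller top-k or top-p narrows the set of candidate tokens and makes output more predictable, while higher values allow more varied output.
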

See Itria in Action

Browse through these screenshots to explore the app's interface and features. Each view showcases a different aspect of how Itria brings powerful AI to your device.

Model Categories

Itria supports three main categories of models:

  • Text-Only Models: Llama 3.1, Mistral, Phi-3, Gemma, and more — perfect for chat, coding, and reasoning tasks.
  • Vision-Language Models (VLMs): Qwen2.5-VL, InternVL3, SmolVLM — analyze images, extract text, interpret charts and diagrams.
  • Specialized Models: Code-focused models (CodeLlama, DeepSeek-Coder), multilingual models, and domain-specific fine-tunes.

Supported Models

Itria comes with a curated selection of models optimized for iOS devices. All models are quantized (Q4_K_M or similar) to balance quality and memory usage.

Browse through Itria's extensive model library to find the perfect AI for your needs. Each model is carefully selected for performance on iOS devices, with options ranging from lightweight 2 GB models for older devices to powerful 13B+ models for the latest iPhone and iPad Pro.

The built-in HuggingFace integration lets you search and download any GGUF model directly within the app. Every model is distributed as a quantized GGUF file. Q4_K_M is the sweet spot: 95%+ of full-precision quality at roughly 25-30% of the 16-bit memory cost. Large models download in the background while you use other apps.
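
The memory saving from quantization can be checked with simple arithmetic. The sketch below is an approximation, not a measurement: it assumes a model with exactly 8 billion weights and an effective rate of about 4.8 bits per weight for Q4_K_M (the real figure varies by model and is an assumption here), and it ignores file metadata. The function name is hypothetical.

```python
def estimated_size_gb(params_billion, bits_per_weight):
    """Approximate weight storage in GB (10^9 bytes) at a given precision."""
    # (params_billion * 1e9 weights) * (bits / 8 bytes each) / 1e9 bytes per GB
    return params_billion * bits_per_weight / 8


full_precision = estimated_size_gb(8.0, 16)   # 16-bit weights
quantized = estimated_size_gb(8.0, 4.8)       # assumed ~4.8 bits per weight

print(f"16-bit: {full_precision:.1f} GB")
print(f"4-bit class: {quantized:.1f} GB")
print(f"ratio: {quantized / full_precision:.0%}")
```

Under these assumptions an 8B model shrinks from about 16 GB to under 5 GB, which is in line with the download sizes listed in the model tables.
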

Model Library

Recommended Text Models

| Model | Size | RAM Required | Best For |
| --- | --- | --- | --- |
| Llama 3.1 8B | 4.7 GB | ~6 GB | General-purpose chat, coding, reasoning |
| Mistral 7B | 4.4 GB | ~5.5 GB | Fast, capable general assistant |
| Phi-3 Mini 3.8B | 2.3 GB | ~4 GB | Quick responses on older devices |
| Gemma 2 9B | 5.6 GB | ~7 GB | Creative writing, nuanced conversations |

Recommended Vision Models

| Model | Size | RAM Required | Best For |
| --- | --- | --- | --- |
| Qwen2.5-VL 7B | 4.7 GB | ~7.5 GB | Best overall vision model: OCR, charts, documents |
| InternVL3 8B | 4.7 GB | ~7 GB | Math, science images, academic diagrams |
| SmolVLM 2B | 1.1 GB | ~2.5 GB | Fast image captioning on older devices |
| Moondream2 | 2.8 GB | ~4.5 GB | Ultra-fast visual Q&A, simple image descriptions |
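
To make the two tables easier to apply, here is a small sketch that filters the recommended models by the RAM and storage you have available. The sizes and RAM figures are copied from the tables above; the `Model` class, the `models_that_fit` function, and the idea of ranking by RAM requirement are illustrative assumptions, not part of Itria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    download_gb: float  # "Size" column
    ram_gb: float       # "RAM Required" column
    vision: bool


# Values taken from the Recommended Text Models and Recommended Vision Models tables.
CATALOG = [
    Model("Llama 3.1 8B", 4.7, 6.0, False),
    Model("Mistral 7B", 4.4, 5.5, False),
    Model("Phi-3 Mini 3.8B", 2.3, 4.0, False),
    Model("Gemma 2 9B", 5.6, 7.0, False),
    Model("Qwen2.5-VL 7B", 4.7, 7.5, True),
    Model("InternVL3 8B", 4.7, 7.0, True),
    Model("SmolVLM 2B", 1.1, 2.5, True),
    Model("Moondream2", 2.8, 4.5, True),
]


def models_that_fit(usable_ram_gb, free_storage_gb, vision=None):
    """Return models that fit the given RAM and storage, largest RAM need first.

    Pass vision=True or vision=False to restrict the result to one category.
    """
    fits = [
        m for m in CATALOG
        if m.ram_gb <= usable_ram_gb
        and m.download_gb <= free_storage_gb
        and (vision is None or m.vision == vision)
    ]
    return sorted(fits, key=lambda m: m.ram_gb, reverse=True)


for model in models_that_fit(usable_ram_gb=6.0, free_storage_gb=10.0, vision=False):
    print(model.name)
```

Treat the result as a starting point: the RAM actually available to an app is lower than the device's total memory, so leave some headroom.
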

How to Download Models

  1. Open Itria and navigate to the "Models" tab
  2. Browse the curated list or search HuggingFace directly
  3. Tap a model to see details (size, RAM requirements, speed estimate)
  4. Tap "Download" — the model will download in the background
  5. Once complete, select the model and start chatting

Getting Started

Prerequisites

  • Device: iPhone XS/iPad Air (3rd gen) or later recommended. iOS 16.0+ required.
  • Storage: Allow 5-10 GB free space for models (each model is 2-8 GB)
  • Internet: Required only for initial model downloads

Quick Start Guide

  1. Download Itria from the App Store
  2. Open the app — you'll see the Models tab by default
  3. Choose a model: Start with "Llama 3.1 8B" for general use, or "Moondream2" if you want to try vision
  4. Wait for download — models are large but download in the background
  5. Select your model from the dropdown and start chatting
  6. Customize settings: Adjust temperature, context size, and other parameters in Settings

Pro Tips

Itria is designed to work out of the box, but a few key settings unlock its full potential. Dialing in GPU layers, context size, and the right model for your device makes the difference between a sluggish assistant and one that keeps up with your thoughts.

  • GPU Layers: In Settings, set GPU Layers to 99 (All) for maximum speed. The app will automatically use your device's Metal GPU.
  • Context Size: Larger context (4096-8192 tokens) enables longer conversations but uses more RAM. Start with 2048 if you have an older device.
  • Vision Models: Tap the image icon in the chat to upload photos. Ask questions like "What text is in this image?" or "Explain this chart."
  • Personas: Pro users can switch between Architect (coding), Muse (writing), Oracle (math), and Zen (concise) for tailored responses.
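
One reason a larger context size uses more RAM is the key/value cache, which grows linearly with the number of tokens kept in context. The estimate below is a sketch that assumes Llama 3.1 8B's published architecture (32 layers, 8 key/value heads, head dimension 128) and a 16-bit cache. Whether Itria stores the cache at that precision is an assumption, and the function name is hypothetical.

```python
def kv_cache_bytes(n_ctx, n_layers=32, n_kv_heads=8, head_dim=128, bytes_per_value=2):
    """Estimate key/value cache size in bytes for a given context length."""
    # Two tensors (keys and values) per layer, one vector per KV head per token.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return per_token * n_ctx


for n_ctx in (2048, 4096, 8192):
    mib = kv_cache_bytes(n_ctx) / 2**20
    print(f"{n_ctx} tokens: {mib:.0f} MiB")
# prints:
# 2048 tokens: 256 MiB
# 4096 tokens: 512 MiB
# 8192 tokens: 1024 MiB
```

Under these assumptions, moving from 2048 to 8192 tokens adds roughly 768 MiB on top of the model weights, which is why a smaller context is the safer default on devices with less memory.
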

[Image: Getting Started Guide]

Privacy & Security

Itria is built on the principle that AI should enhance your capabilities without compromising your privacy. Here's how we deliver on that promise:

No Data Collection

  • No personal information: We do not collect, store, or transmit any personally identifiable information (PII)
  • No analytics: Zero third-party analytics, telemetry, or tracking SDKs are integrated
  • No cloud storage: All processing occurs locally on your device — your conversations never leave your iPhone or iPad
  • No network needed for inference: Core functionality works without network access. The network is used only for model downloads and for browsing HuggingFace.

Data Ownership & Control

  • Your data, your control: All chat history and settings are stored locally in iOS Keychain and UserDefaults
  • No background transmission: The app cannot send data without your explicit action
  • Local model storage: Downloaded models are stored in the app's sandbox — you can delete them anytime

Technical Privacy Measures

  • Sandboxed execution: Models run in iOS's secure sandbox with no access to other apps or system data
  • In-memory processing: All inference calculations happen in RAM and are cleared when the app closes
  • No remote code execution: The app cannot fetch or execute code from external servers after installation

Transparency

Itria is built on open-source technologies (llama.cpp, GGUF format) that are auditable by anyone. The app's privacy policy is simple: we don't touch your data. If you're concerned about privacy, Itria gives you the most control possible.

Every aspect of Itria's privacy architecture is designed around the principle that your data belongs to you, not to a corporation. The app uses iOS's built-in Keychain for secure storage of sensitive settings, and all model files are stored in the app's isolated sandbox — completely inaccessible to other apps or Apple itself.

Delete the app and everything disappears. Export your chats via the Files app if you want to keep them. Itria respects your data sovereignty at every level.

[Image: Privacy Settings]

Troubleshooting Guide

Common Issues

| Symptom | Likely Cause | Solution |
| --- | --- | --- |
| App crashes on launch | Corrupted cache or iOS bug | Restart your device. If the issue persists, delete and reinstall Itria. Chat history is stored locally, so export your chats via the Files app first if you want to keep them. |
| Model fails to load | Insufficient RAM or corrupted download | Check that your device has enough RAM for the model (see the RAM Required column in the model tables). Close other apps. Delete the model and re-download it. |
| Very slow generation | GPU offloading not enabled | Go to Settings > Performance and ensure GPU Layers is set to 99 (All). Your device must support Metal (iPhone XS/iPad Air 3 or later). |
| Download stuck at 99% | Network interruption | Pause and resume the download. If still stuck, cancel and restart the download with a stable Wi-Fi connection. |
| Vision model returns errors | Image too large or unsupported format | Resize images to under 2048px on the longest side. Use JPG or PNG format only. |
| App uses too much battery | Large context size or long conversations | Reduce Context Size in Settings (try 2048 instead of 8192). Longer conversations naturally use more power. |
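
For the image-size fix in the table above, the target dimensions can be worked out before resizing. This is a minimal sketch: the function name is hypothetical, and treating 2048 pixels as an inclusive cap on the longest side is an assumption about how the limit is applied.

```python
def fit_within(width, height, max_side=2048):
    """Scale (width, height) down so the longest side is at most max_side.

    Keeps the aspect ratio and never enlarges an image that already fits.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    # Integer math with floor division keeps both sides at or below max_side.
    new_width = max(1, width * max_side // longest)
    new_height = max(1, height * max_side // longest)
    return new_width, new_height


print(fit_within(4032, 3024))  # a 12-megapixel photo
print(fit_within(1200, 800))   # already small enough, returned unchanged
```

A 4032 x 3024 photo comes out at 2048 x 1536, which keeps the original 4:3 aspect ratio.
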

Performance Optimization

  • Close background apps: Free up RAM before loading large models (8B+ parameter models need 6-8 GB available)
  • Use quantized models: Q4_K_M models offer the best balance of speed and quality for iOS devices
  • Avoid multitasking: Running Itria alongside video editing or gaming apps can cause memory pressure
  • Keep iOS updated: Newer iOS versions include Metal performance improvements that benefit Itria

Frequently Asked Questions

Q: Do I need an internet connection to use Itria?

No, once you download a model, Itria works entirely offline. You only need internet for the initial model download and for browsing HuggingFace in the app.

Q: Can I sync my chats across devices?

Currently, all data is stored locally on each device. We're considering cloud sync as a Pro feature, but it would be end-to-end encrypted with your choice of storage provider.

Q: How much storage do I need?

Each model is 2-8 GB depending on size and quantization. We recommend at least 10 GB free space to download a few models comfortably.

Q: Which models work best on older devices?

For iPhone XS/XR or iPad Air 3, try Phi-3 Mini (3.8B) or SmolVLM (2B). They need roughly 4 GB and 2.5 GB of RAM respectively and still deliver impressive performance.

Q: Can I use my own custom models?

Yes! Import any GGUF file from the Files app or cloud storage. Itria supports all standard GGUF variants (Q4_K_M, Q5_K_M, Q8_0, etc.).

Q: Is there a Pro version?

Itria is free to download with all core features. Pro ($4.99/month or $29.99/year) unlocks Specialist Personas (Architect, Muse, Oracle, Zen), priority support, and early access to new models.

Q: How does the vision model work?

Vision models like Qwen2.5-VL and InternVL3 can analyze any image you upload. Ask questions like "What text is in this image?" (OCR), "Explain this graph" (chart interpretation), or "What objects are in this photo?" (object detection).

Q: Can I use Itria for coding?

Absolutely! Select the Architect persona or use a code-focused model like CodeLlama. Itria can write, debug, and explain code in any language — right from your device, with no code ever leaving it. The Pro version includes specialized coding optimizations and persona presets tuned for architecture and debugging workflows.

[Image: Coding Assistant]

Technical Specifications

System Requirements

  • Minimum: iOS 16.0
  • Recommended: iOS 17.0 or later
  • Development Framework: SwiftUI
  • Programming Language: Swift 5.9+
  • Inference Engine: llama.cpp with Metal GPU acceleration
  • Model Format: GGUF (all quantization levels)
  • GPU Requirements: Apple A12 Bionic (iPhone XS) or later for full Metal support
  • RAM Requirements: 4 GB minimum, 8 GB recommended for 7B+ models

Performance Benchmarks

Token generation speeds vary by device and model size. Here are typical results on iPhone 15 Pro (A17 Pro):

  • Llama 3.1 8B (Q4_K_M): ~25-30 tokens/second
  • Mistral 7B (Q4_K_M): ~28-32 tokens/second
  • Phi-3 Mini 3.8B (Q4_K_M): ~40-45 tokens/second
  • Qwen2.5-VL 7B (vision): ~15-20 tokens/second (prefill + generation)

These benchmarks demonstrate why Apple Silicon is ideal for on-device AI. The A17 Pro's GPU can process thousands of parallel operations simultaneously, delivering desktop-like performance in a mobile form factor.

Older devices will see proportionally lower speeds — an iPhone 13 Pro might achieve 15-20 tok/s with Llama 3.1 8B, while an iPhone XS (A12) could manage 8-12 tok/s. Match the model size to your device's capabilities for optimal performance.
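
To translate these token rates into waiting time, divide the length of the reply by the generation speed. The sketch below uses the Llama 3.1 8B ranges quoted above; the 300-token reply length and the function name are illustrative assumptions, and the estimate ignores the time spent processing the prompt before generation starts.

```python
def generation_seconds(n_tokens, tokens_per_second):
    """Time to generate n_tokens at a steady rate, excluding prompt processing."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return n_tokens / tokens_per_second


# Llama 3.1 8B speed ranges (tok/s) from the benchmark figures above.
SPEED_RANGES = {
    "iPhone 15 Pro": (25, 30),
    "iPhone 13 Pro": (15, 20),
    "iPhone XS": (8, 12),
}

REPLY_TOKENS = 300  # assumed length of a medium-sized answer

for device, (slow, fast) in SPEED_RANGES.items():
    best = generation_seconds(REPLY_TOKENS, fast)
    worst = generation_seconds(REPLY_TOKENS, slow)
    print(f"{device}: {best:.0f}-{worst:.0f} s for {REPLY_TOKENS} tokens")
```

The same reply that takes about 10 to 12 seconds on an iPhone 15 Pro takes 25 seconds or more on an iPhone XS, which is a practical argument for choosing a smaller model on older hardware.
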

[Image: Performance Benchmarks]

Data Storage

  • Chat History: Stored in iOS Keychain (encrypted)
  • Settings: UserDefaults with encryption for sensitive values
  • Models: App sandbox storage (~2-8 GB per model)
  • No cloud backup: All data is local-only by design

Get in Touch

We'd love to hear from you, whether you have questions, feedback, or just want to share how you're using Itria.

Feature Requests & Bug Reports

For detailed bug reports or feature requests, please include:

  • Your device model and iOS version
  • The model you were using (name and size)
  • Steps to reproduce the issue
  • Screenshots or logs if applicable
Joshua Hrisko

Maker Portal is a blog-centric company intended for young innovators interested in real-world applications of engineering. Resources include physical products, mobile applications, software development, e-learning, and blog-style articles. The maker-based approach is explored through written articles on topics ranging from Raspberry Pi, heat transfer, acoustics, robotics, data analysis, Arduino, sensor design, Python programming, and much more. Difficulty levels vary by topic, and there is an extensive focus on open-source software implementation; however, there will also be articles focused on software design. The intention is to demonstrate applications of engineering that are repeatable at the intermediate level without requiring colossal resources.

https://makersportal.com/