Cracked AI Engineering

Llama 3.2 3B Instruct vs Google Gemma 3

Google Gemma 3 offers a 125K-token context window versus 16K tokens for Llama 3.2 3B Instruct, and it supports vision. Compare full specs and pricing to choose the best model for your use case.

Quick Overview

Llama 3.2 3B Instruct

Provider: Inference

16K-token context • $0.02 input / $0.02 output per 1M tokens

Google Gemma 3

Provider: Inference

125K-token context • $0.15 input / $0.30 output per 1M tokens

Detailed Comparison

Specification                    Llama 3.2 3B Instruct   Google Gemma 3
Provider                         Inference               Inference
Context Window                   16K tokens              125K tokens
Max Output Tokens                4K tokens               4K tokens
Input Pricing (per 1M tokens)    $0.02                   $0.15
Output Pricing (per 1M tokens)   $0.02                   $0.30
Release Date                     Jan 2025                Jan 2025
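
To make the price gap concrete, the per-token rates listed above can be turned into a rough cost estimate. The sketch below is illustrative only: the `PRICES` dictionary and the `estimate_cost` helper are hypothetical names introduced here, the workload of 10M input and 2M output tokens is an assumed example, and real bills may differ if the provider applies rounding, minimums, or discounts.

```python
# Per-1M-token prices in USD, copied from the comparison above.
PRICES = {
    "Llama 3.2 3B Instruct": {"input": 0.02, "output": 0.02},
    "Google Gemma 3": {"input": 0.15, "output": 0.30},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for the given token counts (hypothetical helper)."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000  # prices are quoted per 1M tokens


if __name__ == "__main__":
    # Assumed workload: 10M input tokens and 2M output tokens.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 10_000_000, 2_000_000):.2f}")
```

Under these assumptions the same workload comes to roughly $0.24 on Llama 3.2 3B Instruct and $2.10 on Google Gemma 3, which is the cost difference the recommendation below refers to.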

Capabilities

Capability          Llama 3.2 3B Instruct   Google Gemma 3
Text Generation
Function Calling
Vision
File Attachments

Which Model Should You Choose?

Choose Llama 3.2 3B Instruct if:

  • Cost efficiency is a priority

Choose Google Gemma 3 if:

  • You need a larger context window
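
The two criteria above can be combined into a simple routing rule: use the cheaper model whenever the request fits in its context window, and fall back to the larger one otherwise. The sketch below is a minimal illustration, not a recommended production router. `MODELS` and `pick_model` are hypothetical names, "K" is assumed to mean 1,000 tokens, and the context window is assumed to cover both the prompt and the generated output.

```python
from typing import Optional

# (name, context window in tokens, input price per 1M tokens in USD).
# Limits come from the comparison above; "K" is assumed to mean 1,000 tokens.
MODELS = [
    ("Llama 3.2 3B Instruct", 16_000, 0.02),
    ("Google Gemma 3", 125_000, 0.15),
]


def pick_model(prompt_tokens: int, reserved_output_tokens: int = 4_000) -> Optional[str]:
    """Return the cheapest model whose context fits the request, or None (hypothetical helper)."""
    # Assumption: prompt and output tokens share the same context window.
    needed = prompt_tokens + reserved_output_tokens
    for name, context, _price in sorted(MODELS, key=lambda m: m[2]):
        if needed <= context:
            return name
    return None  # the request is too large for either model


if __name__ == "__main__":
    for tokens in (8_000, 50_000, 200_000):
        print(tokens, "->", pick_model(tokens))
```

The default of 4,000 reserved output tokens mirrors the 4K max output listed for both models; a request that fits neither window returns `None` so the caller can decide whether to truncate or split the prompt.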