Released Apr 21, 2026 · 262,144 context · $0/M input tokens · $0/M output tokens
Ling-2.6-flash is an instruct model from inclusionAI with 104B total parameters and 7.4B active parameters, designed for real-world agents that require fast responses, strong execution, and high token efficiency. It delivers performance comparable to state-of-the-art models of similar scale while significantly reducing token usage across coding, document processing, and lightweight agent workflows.
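As a minimal sketch, a model listed on OpenRouter can be called through its OpenAI-style chat completions API. The model slug below (`inclusionai/ling-2.6-flash:free`) is an assumption inferred from the page title, not confirmed by this listing; check the page for the exact identifier.

```python
import json

def build_request(prompt: str, max_tokens: int = 512) -> dict:
    """Build an OpenAI-style chat-completions payload for OpenRouter.

    NOTE: the model slug is assumed from this page's title; verify it
    against the listing before sending real requests.
    """
    return {
        "model": "inclusionai/ling-2.6-flash:free",  # assumed slug
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_request("Summarize this document in one sentence.")
print(json.dumps(payload, indent=2))
```

The payload would be POSTed to `https://openrouter.ai/api/v1/chat/completions` with an `Authorization: Bearer <key>` header; that endpoint and header scheme follow OpenRouter's documented API.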
Recent activity on Ling-2.6-flash (free) — total usage per day on OpenRouter: 31.7B prompt tokens, 152M completion tokens.
Prompt tokens measure input size, reasoning tokens show internal thinking before a response, and completion tokens reflect total output length.
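The token categories above can be tallied from the `usage` object that OpenAI-style responses (including OpenRouter's) return. This is a sketch assuming that shape, and assuming reasoning tokens are already counted inside the completion total, as the description of completion tokens suggests.

```python
def total_tokens(usage: dict) -> int:
    """Sum billable tokens from an OpenAI-style `usage` object.

    Assumes completion_tokens already includes any internal
    reasoning tokens, per the page's note that completion tokens
    reflect total output length.
    """
    return usage.get("prompt_tokens", 0) + usage.get("completion_tokens", 0)

# Example usage with a hypothetical response's usage field:
print(total_tokens({"prompt_tokens": 120, "completion_tokens": 48}))
```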