Technology
Context window
The context window is the maximum number of tokens an LLM can process in a single request: the system prompt, user prompt, conversation history, retrieved context, and generated output combined. Typical limits: Claude 4.5, 200K tokens; GPT-4o, 128K; Gemini 2.5 Pro, 1M.
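Because the limit covers every component combined, it helps to budget tokens before sending a request. A minimal sketch, assuming a rough heuristic of ~4 characters per token for English text (a real count requires the model's own tokenizer); the limit and reserved-output figures are illustrative:

```python
# Rough token-budget check before sending a request.
# ASSUMPTION: ~4 characters per token, a common English-text heuristic;
# exact counts require the provider's tokenizer.

CONTEXT_LIMIT = 200_000  # e.g. a 200K-token model

def estimate_tokens(text: str) -> int:
    """Crude estimate: ~4 characters per token for English prose."""
    return max(1, len(text) // 4)

def fits_in_context(system: str, user: str, history: list[str],
                    retrieved: list[str], max_output: int = 4_000) -> bool:
    """Check that all request components plus reserved output fit the limit."""
    used = (estimate_tokens(system) + estimate_tokens(user)
            + sum(estimate_tokens(m) for m in history)
            + sum(estimate_tokens(c) for c in retrieved))
    return used + max_output <= CONTEXT_LIMIT

print(fits_in_context("You are helpful.", "Summarise this.", [], ["some context"]))
```

Reserving room for the output matters: a request that fills the entire window leaves the model no space to respond.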
More detail
Why it matters: a longer context means more grounding data per request, which means less hallucination. Aiprosol's RAG over customer playbooks fits ~150 pages in a single Claude request. For workflows that hit context limits, the pattern is: chunk + retrieve → feed only the most-relevant chunks → maintain a running summary in the message thread.
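The chunk-and-retrieve step above can be sketched as follows. This is a toy illustration, not Aiprosol's actual pipeline: word-overlap scoring stands in for a real embedding retriever, and the chunk size and top-k values are made up:

```python
# Minimal sketch of chunk + retrieve: split the document, score each
# chunk against the query, and feed only the top-k chunks to the model.
# ASSUMPTION: naive word-overlap scoring stands in for embedding search.

def chunk(text: str, size: int = 200) -> list[str]:
    """Split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(query: str, chunks: list[str], k: int = 3) -> list[str]:
    """Return the k chunks sharing the most words with the query."""
    q = set(query.lower().split())
    return sorted(chunks,
                  key=lambda c: len(q & set(c.lower().split())),
                  reverse=True)[:k]

# Hypothetical playbook text for illustration.
playbook = "refund policy refunds allowed within 30 days escalation contact the on-call lead"
top_chunks = retrieve("refund policy days", chunk(playbook, size=6), k=2)
```

Only `top_chunks` goes into the prompt, so a playbook far larger than the context window can still back every answer.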
