Bigger context window + max_tokens setting
Kinkazma
1 – Please update the system parameters to reflect the extended context windows of modern LLMs, which now reach 256k and even 1 million tokens.
1.2 — Context Limit → Add a setting that lets the user specify the context limit as a number of tokens.
To ensure broad compatibility with upcoming models, the setting should ideally accept values from 256 to 4,000,000 tokens.
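A rough sketch of what such a setting could look like, in Python; the `ContextSettings` class, its default value, and the clamping behavior are illustrative assumptions, not part of any existing codebase:

```python
from dataclasses import dataclass

# Illustrative bounds taken from the request: 256 to 4,000,000 tokens.
MIN_CONTEXT_TOKENS = 256
MAX_CONTEXT_TOKENS = 4_000_000


@dataclass
class ContextSettings:
    # Assumed default; the request does not specify one.
    context_limit: int = 128_000

    def set_context_limit(self, tokens: int) -> None:
        # Clamp user input into the supported range instead of rejecting it.
        self.context_limit = max(MIN_CONTEXT_TOKENS,
                                 min(tokens, MAX_CONTEXT_TOKENS))


settings = ContextSettings()
settings.set_context_limit(1_000_000)   # a 1M-token model
assert settings.context_limit == 1_000_000
settings.set_context_limit(10_000_000)  # clamped to the 4M ceiling
assert settings.context_limit == 4_000_000
```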
2 – Add a user-controllable max_tokens output parameter
It would be valuable to introduce a user-configurable max_tokens limit for generated responses, allowing:
• concise outputs (e.g., SMS, micro-summaries),
• or conversely, extended and exhaustive responses (e.g., technical reports, documentation generation).
This parameter should:
• be configurable directly by the user through the UI/UX,
• be overridable during assistant calls or manual regenerations,
• remain optional, with a default fallback value when not explicitly set (a rough sketch of this resolution order follows the list).
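A minimal sketch of that resolution order, assuming a hypothetical `resolve_max_tokens` helper; the default of 1024 is an illustrative placeholder, since the request leaves the fallback value open:

```python
from typing import Optional

DEFAULT_MAX_TOKENS = 1024  # assumed fallback; the request does not fix a value


def resolve_max_tokens(call_override: Optional[int] = None,
                       ui_setting: Optional[int] = None) -> int:
    """Per-call override beats the user's UI setting, which beats the default."""
    if call_override is not None:
        return call_override
    if ui_setting is not None:
        return ui_setting
    return DEFAULT_MAX_TOKENS


# A manual regeneration requests a long answer despite a short UI default.
assert resolve_max_tokens(call_override=8192, ui_setting=256) == 8192
assert resolve_max_tokens(ui_setting=256) == 256
assert resolve_max_tokens() == DEFAULT_MAX_TOKENS
```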