1 – Please update the system parameters to reflect the extended context capacities of modern LLMs, which now support context windows of 256k and even 1 million tokens.

1.2 – Context Limit → Add a function to set the context limit as a number of tokens. To ensure broad compatibility with upcoming models, the supported range should ideally span 256 to 4,000,000 tokens.

2 – Add a user-controllable max_tokens output parameter

It would be valuable to introduce a user-configurable setting for the max_tokens limit of generated responses, allowing:
• concise outputs (e.g., SMS, micro-summaries),
• or, conversely, extended and exhaustive responses (e.g., technical reports, documentation generation).

This parameter should (see the sketch after this list):
• be configurable directly by the user through the UI/UX,
• be overridable during assistant calls or manual regenerations,
• remain optional, with a default fallback value when not explicitly set.
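
To make the two requests concrete, here is a minimal sketch of how such settings could be validated and resolved. It is illustrative only: the names GenerationSettings, resolve_max_tokens, and DEFAULT_MAX_OUTPUT_TOKENS are hypothetical and not part of any existing API; the 256 to 4,000,000 token bounds come from item 1.2 above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical bounds taken from item 1.2: 256 .. 4,000,000 tokens.
MIN_CONTEXT_TOKENS = 256
MAX_CONTEXT_TOKENS = 4_000_000

# Illustrative fallback used when the user leaves max_tokens unset (item 2).
DEFAULT_MAX_OUTPUT_TOKENS = 1024


@dataclass
class GenerationSettings:
    """User-level settings exposed through the UI (names are illustrative)."""
    context_limit: int = 128_000          # item 1.2: context window in tokens
    max_tokens: Optional[int] = None      # item 2: optional output cap

    def __post_init__(self) -> None:
        # Validate the context limit against the requested compatibility range.
        if not MIN_CONTEXT_TOKENS <= self.context_limit <= MAX_CONTEXT_TOKENS:
            raise ValueError(
                f"context_limit must be between {MIN_CONTEXT_TOKENS} "
                f"and {MAX_CONTEXT_TOKENS} tokens"
            )
        if self.max_tokens is not None and self.max_tokens < 1:
            raise ValueError("max_tokens must be a positive integer")


def resolve_max_tokens(settings: GenerationSettings,
                       per_call_override: Optional[int] = None) -> int:
    """Precedence: per-call override > user setting > default fallback."""
    if per_call_override is not None:
        return per_call_override
    if settings.max_tokens is not None:
        return settings.max_tokens
    return DEFAULT_MAX_OUTPUT_TOKENS


if __name__ == "__main__":
    settings = GenerationSettings(context_limit=256_000)        # a 256k-token model
    print(resolve_max_tokens(settings))                         # -> 1024 (fallback)
    print(resolve_max_tokens(settings, per_call_override=64))   # -> 64 (SMS-length reply)
```

The precedence order (per-call override, then the user setting, then the fallback) mirrors the "overridable during assistant calls or manual regenerations" and "default fallback value" requirements in item 2.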