Chat models
Features (natively supported)
All ChatModels implement the Runnable interface, which comes with default implementations of all methods, i.e. `invoke`, `batch`, and `stream`. This gives all ChatModels basic support for invoking, streaming, and batching, which by default is implemented as follows:
- Streaming support defaults to returning an `AsyncIterator` of a single value, the final result returned by the underlying ChatModel provider. This obviously doesn't give you token-by-token streaming, which requires native support from the ChatModel provider, but ensures your code that expects an iterator of tokens can work for any of our ChatModel integrations.
- Batch support defaults to calling the underlying ChatModel in parallel for each input. The concurrency can be controlled with the `maxConcurrency` key in `RunnableConfig`.
- Map support defaults to calling `.invoke` across all instances of the array which it was called on.
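A minimal sketch of these defaults, assuming the ChatOpenAI integration with an `OPENAI_API_KEY` in the environment (import paths may differ by version):

```typescript
import { ChatOpenAI } from "langchain/chat_models/openai";
import { HumanMessage } from "langchain/schema";

const model = new ChatOpenAI({ temperature: 0 });

// invoke: a single input produces a single message result.
const message = await model.invoke([new HumanMessage("Tell me a joke.")]);
console.log(message.content);

// stream: an async iterator of chunks. With native support this is
// token-by-token; otherwise it yields one chunk containing the final result.
const stream = await model.stream([new HumanMessage("Tell me a joke.")]);
for await (const chunk of stream) {
  console.log(chunk.content);
}

// batch: each input is run in parallel; maxConcurrency caps the parallelism.
const results = await model.batch(
  [[new HumanMessage("Tell me a joke.")], [new HumanMessage("Tell me a poem.")]],
  { maxConcurrency: 2 }
);
console.log(results.length); // 2
```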
Each ChatModel integration can optionally provide native implementations to truly enable invoke, streaming, or batching requests. The table below shows, for each integration, which features have native support.
| Model | Invoke | Stream | Batch |
| --- | --- | --- | --- |
| ChatAnthropic | ✅ | ✅ | ✅ |
| ChatBaiduWenxin | ✅ | ❌ | ✅ |
| ChatCloudflareWorkersAI | ✅ | ✅ | ✅ |
| ChatFireworks | ✅ | ✅ | ✅ |
| ChatGooglePaLM | ✅ | ❌ | ✅ |
| ChatLlamaCpp | ✅ | ✅ | ✅ |
| ChatMinimax | ✅ | ❌ | ✅ |
| ChatOllama | ✅ | ✅ | ✅ |
| ChatOpenAI | ✅ | ✅ | ✅ |
| PromptLayerChatOpenAI | ✅ | ✅ | ✅ |
| PortkeyChat | ✅ | ✅ | ✅ |
| ChatYandexGPT | ✅ | ❌ | ✅ |
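Even for a model without native streaming (for example ChatBaiduWenxin above), calling `.stream()` still works: the default implementation yields a single chunk containing the final result. A rough sketch of that fallback behavior, where the import path and credential setup are assumptions that may differ by version:

```typescript
import { ChatBaiduWenxin } from "langchain/chat_models/baiduwenxin";

// Assumes Baidu Wenxin credentials are already configured (e.g. via environment variables).
const model = new ChatBaiduWenxin();

// No native stream support, so the default implementation yields one chunk:
// the final result returned by the provider.
const stream = await model.stream("Tell me a joke.");
for await (const chunk of stream) {
  console.log(chunk.content);
}
```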