Broken out as a separate issue from #5719. Per #5719 (comment):

The best current proposal seems to be adding another type, StreamingDataContent, which does not inherit from DataContent but instead inherits from AIContent directly. The person instantiating this type would pass a suitable factory/callback to the constructor that returns the stream. The person consuming this type would call content.TryTakeStream(out var stream), which may or may not give them a stream (depending on whether it is capable of supplying more than one reader); if it does, responsibility for disposing that stream instance passes to the consumer.
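Concretely, the shape might look something like this (a minimal sketch only: TryTakeStream and the factory constructor follow the proposal above, while the single-use "take" semantics and the MediaType property are assumptions):

```csharp
using System;
using System.IO;
using System.Threading;
using Microsoft.Extensions.AI;

public class StreamingDataContent : AIContent
{
    private Func<Stream>? _streamFactory;

    public StreamingDataContent(Func<Stream> streamFactory, string mediaType)
    {
        _streamFactory = streamFactory;
        MediaType = mediaType;
    }

    // MediaType isn't in the proposal text; assumed here for parity with DataContent.
    public string MediaType { get; }

    // Returns false once no further reader can be supplied. Here the factory is
    // consumed on first take (one possible reading of the proposal); a factory
    // that can reopen its source could allow multiple takes instead.
    // The caller owns, and must dispose, any stream handed out.
    public bool TryTakeStream(out Stream? stream)
    {
        var factory = Interlocked.Exchange(ref _streamFactory, null);
        if (factory is null)
        {
            stream = null;
            return false;
        }

        stream = factory();
        return true;
    }
}
```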
I'm unclear on how this interacts with JSON serialization. Is there at least one practical case where this actually avoids buffering a large file in memory? I know it probably doesn't help with receiving any data from JSON-over-HTTP. But maybe it does help when sending JSON-over-HTTP. If there are no cases today where this actually avoids buffering in practice, then we might choose not to prioritize doing this now.
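For the sending direction, the escape hatch would be to bypass the JSON serializer for the payload itself, e.g. a custom HttpContent that writes the base64 body in chunks straight to the transport. A rough sketch, not tied to any particular SDK, with an illustrative JSON shape:

```csharp
using System;
using System.Buffers.Text;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

// Streams {"data":"<base64>"} to the wire without holding the whole payload in memory.
sealed class StreamingJsonContent : HttpContent
{
    private readonly Func<Stream> _streamFactory;

    public StreamingJsonContent(Func<Stream> streamFactory)
    {
        _streamFactory = streamFactory;
        Headers.ContentType = new MediaTypeHeaderValue("application/json");
    }

    protected override async Task SerializeToStreamAsync(Stream target, TransportContext? context)
    {
        await target.WriteAsync(Encoding.UTF8.GetBytes("{\"data\":\""));

        using Stream source = _streamFactory();

        // Encode in 3-byte-aligned chunks so no base64 padding appears mid-stream.
        byte[] input = new byte[3 * 1024];
        byte[] output = new byte[Base64.GetMaxEncodedToUtf8Length(input.Length)];
        int read;
        while ((read = await source.ReadAtLeastAsync(input, input.Length, throwOnEndOfStream: false)) > 0)
        {
            Base64.EncodeToUtf8(input.AsSpan(0, read), output, out _, out int written);
            await target.WriteAsync(output.AsMemory(0, written));
        }

        await target.WriteAsync(Encoding.UTF8.GetBytes("\"}"));
    }

    protected override bool TryComputeLength(out long length)
    {
        length = -1;
        return false; // unknown length => chunked transfer encoding
    }
}
```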
For OpenAI, as of today, we just map our exchange types to the OpenAI-dotnet SDK's types, so from the consumer's perspective there's no way to avoid buffering the contents into memory.
The SDK uses System.ClientModel, which works with BinaryContent and IJsonModel; it creates a Utf8JsonWriter backed by an array pool, which it uses to serialize ChatCompletionOptions with the ChatMessages attached.
The only way I found to intercept serialization was to extend ChatCompletionOptions and override JsonModelWriteCore, and even then we would have to use reflection since Messages is internal.
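Something along these lines; untested, and the JsonModelWriteCore signature follows the usual System.ClientModel pattern, so it may vary by SDK version:

```csharp
using System.ClientModel.Primitives;
using System.Reflection;
using System.Text.Json;
using OpenAI.Chat;

public class InterceptingChatCompletionOptions : ChatCompletionOptions
{
    protected override void JsonModelWriteCore(Utf8JsonWriter writer, ModelReaderWriterOptions options)
    {
        // Messages is internal, so reflection is the only way to reach it from here.
        object? messages = typeof(ChatCompletionOptions)
            .GetProperty("Messages", BindingFlags.Instance | BindingFlags.NonPublic)?
            .GetValue(this);

        // A real implementation would write the messages itself, streaming any
        // large content parts; falling back to the base writer defeats the point.
        base.JsonModelWriteCore(writer, options);
    }
}
```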
But even if we could intercept the JSON writing, we would still have the problem that the writer is backed by an array pool, because it is using ModelBinaryContent.
As an alternative, I was testing with the OpenAI Files API, which the SDK supports, but I didn't get great results. My idea was that when someone opted into using StreamingDataContent, we could upload the files first and then reference them by ID when chatting with the model.
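Roughly this wiring; the client and method names are from the SDK's file client, but the exact constructors, overloads, and purpose value may differ by version:

```csharp
using System.ClientModel;
using System.IO;
using OpenAI.Files;

// Upload the payload once, then reference it by ID in the chat request
// instead of inlining base64 data.
OpenAIFileClient files = new("<api-key>");

await using FileStream payload = File.OpenRead("large-image.png");
ClientResult<OpenAIFile> uploaded = await files.UploadFileAsync(
    payload, "large-image.png", FileUploadPurpose.Vision);

// The chat messages would then carry uploaded.Value.Id rather than the bytes.
```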
I'm not sure whether my issues with OpenAI are specific to Azure and GitHub, which are the endpoints I have access to. I filed openai/openai-dotnet#396.
I can continue investigating how to do this in the other adapters (Ollama and Azure AI Inference).
We need a proof of concept that we could implement streaming data using existing API shapes.
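For reference, the consuming side of the sketch at the top might look like this (hypothetical usage, built on the assumed TryTakeStream semantics):

```csharp
using System.IO;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

// Inside a chat client implementation: forward streamed content to the
// transport without buffering it all; the consumer disposes the stream.
static async Task ForwardContentsAsync(ChatMessage message, Stream destination)
{
    foreach (AIContent content in message.Contents)
    {
        if (content is StreamingDataContent streaming &&
            streaming.TryTakeStream(out Stream? stream) &&
            stream is not null)
        {
            await using (stream)
            {
                await stream.CopyToAsync(destination);
            }
        }
    }
}
```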