simonw/async.md Secret

## async.md

      
          
    Designing Sync and Async APIs for Model Plugins

human (Nov 5, 2024, 06:23 PM)
I have an API I designed that looks like this:
model = get_model("name-of-model")
response = model.prompt("hi there")
for word in response:
    print(word, end="")
I want to create an asyncio alternative design.
If I aim to support both sync and async access, I can think of two options.
The first is to have a model.aprompt(...) method on the existing class:
async for word in model.aprompt("hi there"):
    print(word, end="")
The second is to have a whole different kind of model class, accessed like this:
model = get_async_model("name-of-model")
async for word in model.prompt("hi there"):
    print(word, end="")

In my system there are going to be many models, implemented by plugins. I care about convenience for plugin authors too.
Discuss the trade-offs of the two approaches.
assistant (Nov 5, 2024, 06:23 PM)
Let me analyze the trade-offs between these two API design approaches for supporting both synchronous and asynchronous access to your models.
Approach 1: Dual Methods (prompt and aprompt)
model = get_model("name-of-model")
# Sync usage
for word in model.prompt("hi there"): ...
# Async usage
async for word in model.aprompt("hi there"): ...
Pros:

Single class means simpler object structure and type hierarchy
Plugin authors only need to implement one class per model
Model identity is clearer - same model instance can be used either way
Follows precedent from libraries that pair sync and async methods by name (e.g., Django's ORM offers aget() alongside get())
Easy to add async capability to existing models incrementally

Cons:

Models that are inherently sync-only or async-only still need to handle both methods
Having two methods doing essentially the same thing violates the "one obvious way" Python principle
Plugin authors need to implement both methods or provide appropriate error messages
Method naming convention (aprompt) feels a bit awkward
Risk of accidentally using wrong method version in mixed sync/async code
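To make the "implement both methods or provide appropriate error messages" point concrete, here is a minimal sketch of how a dual-method base class might behave. The class names `Model` and `SyncOnlyEcho` and the error wording are hypothetical, chosen only for illustration.

```python
import asyncio
from typing import AsyncIterator, Iterator


class Model:
    """Hypothetical dual-method base class; both methods fail loudly by default."""

    def prompt(self, text: str) -> Iterator[str]:
        raise NotImplementedError(f"{type(self).__name__} does not support sync prompts")

    def aprompt(self, text: str) -> AsyncIterator[str]:
        raise NotImplementedError(f"{type(self).__name__} does not support async prompts")


class SyncOnlyEcho(Model):
    """A plugin that implements only the sync side."""

    def prompt(self, text: str) -> Iterator[str]:
        yield from text.split()


async def try_async(model: Model) -> str:
    # Calling aprompt() on a sync-only plugin raises immediately.
    try:
        async for word in model.aprompt("hi there"):
            print(word)
    except NotImplementedError as exc:
        return str(exc)
    return "ok"


sync_words = list(SyncOnlyEcho().prompt("hi there"))
async_result = asyncio.run(try_async(SyncOnlyEcho()))
```

The failure only shows up at runtime, which is the weakness the cons above describe: nothing in the class's type says that `SyncOnlyEcho` lacks async support.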

Approach 2: Separate Classes (get_model vs get_async_model)
# Sync usage
model = get_model("name-of-model")
for word in model.prompt("hi there"): ...
# Async usage
model = get_async_model("name-of-model") 
async for word in model.prompt("hi there"): ...
Pros:

Cleaner separation of concerns - each class only needs to implement what it can actually do
Method names stay consistent and clear (prompt means same thing in both contexts)
Type system can better enforce correct usage
Plugin authors can choose to implement only sync or only async if that's all they need
Less risk of accidentally mixing sync/async code incorrectly
Follows the principle that async code is "contagious" and should be explicit

Cons:

Duplicate class hierarchy needs to be maintained
Plugin authors need to understand and potentially implement two parallel class structures
More complex factory system needed to manage two kinds of models
If a model supports both modes, code duplication might be needed
Less obvious that sync and async versions of same model are related
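As a rough illustration of the "more complex factory system" cost, the sketch below keeps two separate registries, one per kind of model. The registry names and the echo models are assumptions made for this example; the real plugin mechanism is not shown in this discussion.

```python
import asyncio
from typing import AsyncIterator, Iterator


class EchoModel:
    def prompt(self, text: str) -> Iterator[str]:
        yield from text.split()


class AsyncEchoModel:
    async def prompt(self, text: str) -> AsyncIterator[str]:
        for word in text.split():
            await asyncio.sleep(0)  # stand-in for real async I/O
            yield word


# Hypothetical registries: a name may appear in one or both.
SYNC_MODELS = {"echo": EchoModel}
ASYNC_MODELS = {"echo": AsyncEchoModel}


def get_model(name: str) -> EchoModel:
    try:
        return SYNC_MODELS[name]()
    except KeyError:
        raise LookupError(f"No sync model named {name!r}") from None


def get_async_model(name: str) -> AsyncEchoModel:
    try:
        return ASYNC_MODELS[name]()
    except KeyError:
        raise LookupError(f"No async model named {name!r}") from None


async def collect(name: str) -> list[str]:
    return [word async for word in get_async_model(name).prompt("hi there")]


sync_words = list(get_model("echo").prompt("hi there"))
async_words = asyncio.run(collect("echo"))
```

Sharing the model name across both registries is one way to keep the relationship between the sync and async versions visible.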

Recommendation:
I would lean toward Approach 2 (separate classes) for several reasons:

It better reflects the fundamental difference between sync and async code in Python
It provides cleaner separation of concerns for plugin authors
It makes async usage more explicit, reducing risk of accidentally mixing sync/async code
The type system can better help catch errors

To mitigate the downsides, I would suggest:

Create clear base classes and mixins to minimize code duplication for plugin authors
Use naming conventions that make the relationship between sync/async versions clear
Consider providing helper utilities to implement one version from the other for simple cases

Example implementation structure:
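import asyncio
from typing import AsyncIterator, Iterator

class BaseModel:
    """Shared functionality for all models"""

class SyncModel(BaseModel):
    """Base class for synchronous models"""
    def prompt(self, text: str) -> Iterator[str]:
        raise NotImplementedError

class AsyncModel(BaseModel):
    """Base class for asynchronous models"""
    # Plain "def": subclasses implement this as an async generator, so
    # calling it returns an async iterator directly, not a coroutine.
    def prompt(self, text: str) -> AsyncIterator[str]:
        raise NotImplementedError

# Helper for implementing sync version from async
class SyncFromAsyncModel(SyncModel):
    def __init__(self, async_impl: AsyncModel) -> None:
        self._async_impl = async_impl

    def prompt(self, text: str) -> Iterator[str]:
        async def collect() -> list[str]:
            return [word async for word in self._async_impl.prompt(text)]
        # Buffers the whole response (no streaming); asyncio.run() also
        # fails if an event loop is already running in this thread.
        return iter(asyncio.run(collect()))

# Helper for implementing async version from sync
class AsyncFromSyncModel(AsyncModel):
    def __init__(self, sync_impl: SyncModel) -> None:
        self._sync_impl = sync_impl

    async def prompt(self, text: str) -> AsyncIterator[str]:
        # Run the blocking call in a worker thread; this buffers the
        # whole response instead of streaming it word by word.
        words = await asyncio.to_thread(lambda: list(self._sync_impl.prompt(text)))
        for word in words:
            yield word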
This structure gives plugin authors flexibility while maintaining clean separation and providing helpers for common cases. It also makes the async nature of code explicit at the type level, which is generally considered a good practice in Python's async ecosystem.
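For the async-from-sync direction, one possible approach is to pull each item from the blocking iterator in a worker thread so the output still streams. This is only a sketch under that assumption; `iterate_in_thread` and `slow_words` are hypothetical names, and it relies on `asyncio.to_thread` (Python 3.9+).

```python
import asyncio
from typing import AsyncIterator, Iterator

_DONE = object()  # sentinel marking exhaustion of the sync iterator


async def iterate_in_thread(sync_iter: Iterator[str]) -> AsyncIterator[str]:
    """Yield items from a blocking iterator without blocking the event loop."""
    while True:
        # next() runs in a worker thread; passing a default keeps
        # StopIteration from having to cross the thread boundary.
        item = await asyncio.to_thread(next, sync_iter, _DONE)
        if item is _DONE:
            break
        yield item


def slow_words(text: str) -> Iterator[str]:
    # Stand-in for a sync model's streaming response.
    for word in text.split():
        yield word


async def main() -> list[str]:
    return [word async for word in iterate_in_thread(slow_words("hi there"))]


words = asyncio.run(main())
```

Each item costs one thread hop, so this trades some overhead for keeping the event loop responsive while the sync model produces output.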
human (Nov 5, 2024, 06:24 PM)
The most convincing arguments here relate to the type system. Illustrate that.
assistant (Nov 5, 2024, 06:24 PM)
Let me demonstrate how the type system provides clearer guarantees and catches potential errors with both approaches.
Approach 1 (Dual Methods):
from typing import Iterator, AsyncIterator, Protocol

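class Model(Protocol):
    def prompt(self, text: str) -> Iterator[str]: ...
    # Plain "def": an async generator returns an AsyncIterator when
    # called, not a coroutine, so the protocol must not say "async def".
    def aprompt(self, text: str) -> AsyncIterator[str]: ...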

async def process_text_async(model: Model, text: str) -> None:
    # Type checker won't prevent this mistake:
    for word in model.prompt(text):  # Blocking call in async context! 
        print(word)

def process_text_sync(model: Model, text: str) -> None:
    # Writing "async for" here would be a SyntaxError, so Python itself
    # rejects that mistake. The type checker won't flag this one, though:
    stream = model.aprompt(text)  # Async iterator created but never consumed

# Even worse, this is valid but probably not what we want:
async def confusing(model: Model, text: str) -> None:
    # Mixing sync and async randomly
    for word in model.prompt(text):
        print(word)
    async for word in model.aprompt(text):
        print(word)
Approach 2 (Separate Classes):
from typing import Iterator, AsyncIterator, Protocol

class SyncModel(Protocol):
    def prompt(self, text: str) -> Iterator[str]: ...

class AsyncModel(Protocol):
    # Plain "def": implementations are async generators, so calling
    # prompt() returns an async iterator directly, not a coroutine.
    def prompt(self, text: str) -> AsyncIterator[str]: ...

async def process_text_async(model: AsyncModel, text: str) -> None:
    # Type error: an async iterator cannot drive a plain "for" loop
    for word in model.prompt(text):  # Error: AsyncIterator[str] is not iterable
        print(word)

# def process_text_sync(model: SyncModel, text: str) -> None:
#     async for word in model.prompt(text):  # SyntaxError: "async for" in a plain "def"
#         print(word)
# (Commented out: Python rejects this before any type checker runs.)

# This won't type check - clear error about wrong model type
def wrong_model(model: SyncModel) -> None:
    async def inner() -> None:
        async for word in model.prompt("hi"):  # Type error: Iterator[str] is not an async iterable
            print(word)

# Can't accidentally mix sync/async:
async def cant_mix(model: AsyncModel, text: str) -> None:
    for word in model.prompt(text):  # Type error: needs "async for"
        print(word)
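One detail affects how protocols like these should be declared. In Python, an `async def` function that contains `yield` returns an async iterator when called, while one without `yield` returns a coroutine that must be awaited first. The sketch below, using hypothetical function names, shows the difference at runtime.

```python
import asyncio
import inspect
from typing import AsyncIterator


async def as_generator(text: str) -> AsyncIterator[str]:
    for word in text.split():
        yield word  # "yield" makes this an async generator function


async def as_coroutine(text: str) -> AsyncIterator[str]:
    return as_generator(text)  # no "yield": calling this returns a coroutine


gen = as_generator("hi there")
coro = as_coroutine("hi there")

gen_is_async_generator = inspect.isasyncgen(gen)
coro_is_coroutine = inspect.iscoroutine(coro)
coro.close()  # avoid a "never awaited" warning for the unused coroutine


async def consume() -> list[str]:
    direct = [w async for w in as_generator("hi there")]
    # The coroutine version needs an extra "await" before iteration.
    awaited = [w async for w in await as_coroutine("hi there")]
    return direct + awaited


words = asyncio.run(consume())
```

For a method meant to be consumed directly with `async for`, a protocol can therefore declare it with a plain `def` returning `AsyncIterator[str]`.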
Let's demonstrate a more complex real-world scenario where the type system helps prevent bugs:
from typing import AsyncIterator, Iterator, Protocol
from contextlib import asynccontextmanager
import asyncio

# Approach 1: Single class with both methods
class DualModel(Protocol):
    def prompt(self, text: str) -> Iterator[str]: ...
    def aprompt(self, text: str) -> AsyncIterator[str]: ...
    
    def batch_process(self, texts: list[str]) -> list[list[str]]: ...
    async def batch_aprocess(self, texts: list[str]) -> list[list[str]]: ...

async def pipeline1(model: DualModel, texts: list[str]) -> list[list[str]]:
    results = []
    for text in texts:
        # Type checker won't catch this performance bug:
        # We're using sync method in async context!
        words = list(model.prompt(text))  # Blocking call!
        results.append(words)
    
    # Could accidentally mix like this:
    async for word in model.aprompt(texts[0]):
        results[0].append(word)
    
    # No error even though this defeats the purpose of async:
    return model.batch_process(texts)  # Blocking call!

# Approach 2: Separate classes
class SyncModel(Protocol):
    def prompt(self, text: str) -> Iterator[str]: ...
    def batch_process(self, texts: list[str]) -> list[list[str]]: ...

class AsyncModel(Protocol):
    def prompt(self, text: str) -> AsyncIterator[str]: ...
    async def batch_process(self, texts: list[str]) -> list[list[str]]: ...

async def pipeline2(model: AsyncModel, texts: list[str]) -> list[list[str]]:
    results = []
    for text in texts:
        # Type error: Cannot use sync iteration
        words = list(model.prompt(text))  # Error: AsyncIterator[str] is not iterable
        results.append(words)
    
    # Type error: Cannot mix sync/async iteration
    for word in model.prompt(texts[0]):  # Error: Wrong iteration style
        results[0].append(word)
    
    # Type error: Must await async call
    return model.batch_process(texts)  # Error: Missing await

# Resource management is also clearer:
class ResourceModel(AsyncModel):
    @asynccontextmanager
    async def session(self):
        # setup
        yield self
        # cleanup

async def safe_usage(model: ResourceModel):
    async with model.session() as session:
        async for word in session.prompt("hi"):
            print(word)

# With Approach 1, this would be valid but wrong:
def unsafe_usage(model: DualModel):
    # Type checker won't catch that we should be using async context:
    words = list(model.prompt("hi"))  # Missing resource cleanup!
The separate classes approach (Approach 2) provides several type-system benefits:

Clear separation between sync and async contexts in function signatures
Cannot accidentally use sync methods in async contexts or vice versa
Resource management patterns are clearer and safer
Better IDE support with more specific type hints
Clearer error messages about missing await or wrong iteration style
Cannot accidentally mix sync and async operations
Forces conscious decisions about whether code should be sync or async

These type system guarantees become even more valuable in larger codebases where sync/async boundaries need to be clearly maintained and where accidental blocking calls could cause serious performance issues.
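Type checks only see the model type a function declares, so a runtime guard can complement them for the accidental-blocking case. The following is a sketch of one such guard, not part of the design discussed above; `GuardedSyncModel` and `in_async_context` are hypothetical names. It relies on `asyncio.get_running_loop()` raising `RuntimeError` when no event loop is running.

```python
import asyncio
from typing import Iterator


def in_async_context() -> bool:
    """Return True when called from code running inside an event loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return False
    return True


class GuardedSyncModel:
    def prompt(self, text: str) -> Iterator[str]:
        # Not a generator function, so this check runs at call time.
        if in_async_context():
            raise RuntimeError("sync prompt() called from a running event loop")
        return iter(text.split())


async def misuse() -> str:
    try:
        GuardedSyncModel().prompt("hi there")
    except RuntimeError as exc:
        return str(exc)
    return "no error"


sync_words = list(GuardedSyncModel().prompt("hi there"))
misuse_result = asyncio.run(misuse())
```

Whether such a guard is worth having depends on whether callers ever legitimately run sync models inside a loop, for example via a worker thread, where the check would not trigger.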