LanguageModelSession#
This page documents the session management class for interacting with the model.
Note
Swift Equivalent: This Python API corresponds to the LanguageModelSession class in the Swift Foundation Models Framework.
LanguageModelSession#
- class apple_fm_sdk.LanguageModelSession[source]#
Bases: _ManagedObject

Represents a language model session for foundation model interactions.

A LanguageModelSession manages the lifecycle of a session with a foundation model, maintaining the session history (transcript), handling tool calls, and providing both synchronous and streaming response capabilities.

The session is thread-safe for sequential requests but does not support concurrent requests. If a request is in progress, attempting another request will wait for the first to complete.
Session Lifecycle:
Creation: Initialize with optional instructions, model configuration, and tools
Active Use: Make requests via respond() or stream_response()
Cleanup: Automatically handled via context manager or explicit cleanup
Concurrent Request Handling:
Sessions use an internal lock to prevent concurrent requests. If you need to handle multiple requests simultaneously, create multiple session instances.
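Since each session serializes its requests, running prompts in parallel means fanning out over independent session instances. A minimal sketch of that pattern, using a stand-in FakeSession in place of fm.LanguageModelSession so it runs without the SDK:

```python
import asyncio

class FakeSession:
    # Stand-in for fm.LanguageModelSession: one request at a time.
    async def respond(self, prompt: str) -> str:
        await asyncio.sleep(0.05)   # simulate model latency
        return f"answer to: {prompt}"

async def fan_out(prompts):
    # One session per prompt, so requests run concurrently instead of
    # queuing behind a single session's internal lock.
    async def ask(p):
        return await FakeSession().respond(p)
    return await asyncio.gather(*(ask(p) for p in prompts))

results = asyncio.run(fan_out(["What is a swift?", "What is a swallow?"]))
print(results)
```

With the real SDK, replace FakeSession() with fm.LanguageModelSession(); the gather structure is unchanged.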
Examples
Basic session creation and usage:
import apple_fm_sdk as fm

# Create a simple session
session = fm.LanguageModelSession()
response = await session.respond("Hello, how are you?")
print(response)
Session with instructions:
import apple_fm_sdk as fm

# Guide the model's behavior with instructions
session = fm.LanguageModelSession(
    instructions="You are a helpful bird expert. Provide concise, "
                 "accurate information about birds."
)
response = await session.respond("What is a Swift?")
Session with custom model and tools:
import apple_fm_sdk as fm
from my_tools import CalculatorTool, WeatherTool

model = fm.SystemLanguageModel(
    temperature=0.7,
    top_p=0.9
)
session = fm.LanguageModelSession(
    instructions="You are a helpful assistant with access to tools.",
    model=model,
    tools=[CalculatorTool(), WeatherTool()]
)
response = await session.respond("What's the weather like in Cupertino?")
See also
SystemLanguageModel: For model configuration
Tool: For creating custom tools
Transcript: For accessing session history
- __init__(instructions=None, model=None, tools=None)[source]#
Create a language model session.
- Parameters:
instructions (Optional[str]) – Optional system instructions to guide the model’s behavior throughout the session. These instructions persist across all requests in the session. Example: “You are a helpful coding assistant.”
model (Optional[SystemLanguageModel]) – Optional specialized system model configuration. If not provided, uses default SystemLanguageModel() with standard settings. Use this to customize temperature, top_p, and other generation parameters.
tools (Optional[list[Tool]]) – Optional list of Tool instances that the model can invoke during generation. Tools enable the model to perform actions like calculations, API calls, or database queries. The model will automatically decide when to use tools based on the session context.
- Raises:
FoundationModelsError – If session creation fails
Note
The session maintains a transcript of all interactions, which can be accessed via the transcript property. This transcript is automatically updated after each request.
- property is_responding: bool#
Check if the session is currently responding to a request.
- Returns:
True if the session is currently processing a request, False otherwise
- Return type:
bool
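As a sketch of how is_responding can guard against queuing a second request behind an in-flight one (session here is assumed to be any object exposing this property and a respond() coroutine):

```python
# Guard sketch: skip the request instead of blocking behind the
# session's internal lock when a response is already in flight.
async def respond_if_idle(session, prompt):
    if session.is_responding:
        return None          # caller may retry later
    return await session.respond(prompt)
```

The None return is one possible policy; a real application might instead queue the prompt or create a second session.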
- async respond(prompt: str) → str[source]#
- async respond(prompt: str, *, generating: type[Generable]) → Type[Any]
- async respond(prompt: str, *, generating: Generable) → Type[Any]
- async respond(prompt: str, *, schema: GenerationSchema) → GeneratedContent
- async respond(prompt: str, *, json_schema: dict) → GeneratedContent
Get a response to a prompt with optional guided generation.
This function supports multiple response modes:
Basic text response: Returns a plain string
Guided generation with Generable: Returns a typed Python object
Guided generation with schema: Returns structured GeneratedContent
Guided generation with JSON schema: Returns structured GeneratedContent
The session automatically updates its transcript after each response, maintaining the full session history.
- Parameters:
prompt (str) – The input prompt string to send to the model
generating (Optional[Union[Type[Generable], Generable]]) – Optional Generable type or instance for type-safe guided generation. When provided, the response will be constrained to match the structure of the Generable type and automatically converted to an instance of that type.
schema (Optional[GenerationSchema]) – Optional GenerationSchema for explicit schema-based guided generation. Use this for custom schemas that don’t map to a Generable type.
json_schema (Optional[dict]) – Optional JSON schema dictionary for guided generation. The schema should follow JSON Schema specification.
- Returns:
Plain text response if no generation constraints are specified, an instance of the generating type if the generating parameter is provided, or structured content if schema or json_schema is provided
- Return type:
Union[str, Any, GeneratedContent]
- Raises:
FoundationModelsError – If the response fails or times out
ValueError – If both generating and schema are provided, or if the generating type is not a valid Generable
asyncio.CancelledError – If the request is cancelled
Examples
Basic text response:
import apple_fm_sdk as fm

session = fm.LanguageModelSession()
response = await session.respond("What is the capital of France?")
print(response)  # Plain string response
Guided generation with Generable type:
import apple_fm_sdk as fm

@fm.generable()
class Cat:
    name: str
    age: int
    profile: str

session = fm.LanguageModelSession()
cat = await session.respond(
    "Generate a cat named Maomao",
    generating=Cat
)
print(f"{cat.name} is {cat.age} years old")
Multi-turn session:
import apple_fm_sdk as fm

session = fm.LanguageModelSession(
    instructions="You are a helpful expert on architecture."
)

# First turn
response1 = await session.respond("What is the tallest building in the world?")
print(response1)

# Second turn - context is maintained
response2 = await session.respond(
    "What's the architectural style of that building?"
)
print(response2)
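The json_schema mode is not illustrated above; here is a sketch of how such a call might look. The bird_schema dictionary and describe_bird helper are hypothetical names introduced for illustration, and the session is passed in as a parameter (the function is only defined, not executed):

```python
# Hypothetical schema following the JSON Schema specification.
bird_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "wingspan_cm": {"type": "number"},
    },
    "required": ["name", "wingspan_cm"],
}

async def describe_bird(session, species: str):
    # session is assumed to be an fm.LanguageModelSession; the result
    # is structured GeneratedContent constrained to bird_schema.
    return await session.respond(
        f"Describe a {species}",
        json_schema=bird_schema,
    )
```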
Note
Only one of generating, schema, or json_schema can be specified
The session maintains session context across multiple respond() calls
Concurrent calls to respond() on the same session will be serialized
For streaming responses, use stream_response() instead
See also
stream_response(): For streaming text responses
Generable: For creating typed response structures
GenerationSchema: For custom schemas
- async stream_response(prompt)[source]#
Stream response chunks for a prompt (text only).
This function provides real-time streaming of the model’s response, yielding text snapshots as they become available. Each yielded value represents the complete response text generated so far, rather than the delta from the previous chunk.
Streaming Behavior:
Yields complete text snapshots (not deltas) as generation progresses
The final yield contains the complete response
Automatically updates the session transcript after completion
Does not support guided generation (text responses only)
Can be cancelled mid-stream using asyncio cancellation
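Because each yield is a full snapshot, a caller that wants incremental deltas can derive them by tracking the length already seen. A sketch of that conversion, where fake_stream stands in for session.stream_response() so the example runs without the SDK:

```python
import asyncio

async def snapshots_to_deltas(snapshots):
    # Emit only the newly generated suffix of each full-text snapshot.
    seen = 0
    async for snapshot in snapshots:
        delta = snapshot[seen:]
        seen = len(snapshot)
        if delta:
            yield delta

async def fake_stream():
    # Stand-in for session.stream_response(prompt): yields snapshots.
    for snapshot in ["Once", "Once upon", "Once upon a time"]:
        yield snapshot

async def main():
    return [d async for d in snapshots_to_deltas(fake_stream())]

deltas = asyncio.run(main())
print(deltas)  # ['Once', ' upon', ' a time']
```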
- Parameters:
prompt (str) – The input prompt string to send to the model
- Yields:
Progressive snapshots of the response text. Each snapshot contains the full text generated so far, rather than only the new tokens.
- Yield type:
str
- Raises:
FoundationModelsError – If streaming fails or encounters an error
asyncio.CancelledError – If the stream is cancelled
Examples
Basic streaming:
import apple_fm_sdk as fm

session = fm.LanguageModelSession()
async for chunk in session.stream_response("Tell me a story"):
    # Each chunk is the full text so far, so overwrite the line in place
    print(f"\r{chunk}", end="", flush=True)
print()
Cancelling a stream:
import asyncio
import apple_fm_sdk as fm

session = fm.LanguageModelSession()

async def stream_with_timeout():
    try:
        async for chunk in session.stream_response("Write a long essay"):
            print(chunk)
            # Simulate some processing
            await asyncio.sleep(0.1)
    except asyncio.CancelledError:
        print("Stream cancelled")
        raise

# Cancel after 5 seconds
task = asyncio.create_task(stream_with_timeout())
await asyncio.sleep(5)
task.cancel()
Streaming with error handling:
import apple_fm_sdk as fm

session = fm.LanguageModelSession()
try:
    async for chunk in session.stream_response("Hello"):
        print(chunk)
except fm.FoundationModelsError as e:
    print(f"Streaming error: {e}")
Note
Streaming currently only supports basic text responses
For guided generation, use respond() instead
Each snapshot contains the full text, rather than only new tokens
The session transcript is updated only after streaming completes
Breaking out of the async for loop early will properly clean up resources
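The early-break behavior noted above can be sketched as a helper that stops consuming once enough text has arrived. Here stream is any async iterator of snapshot strings (for a real session, session.stream_response(prompt)); fake_stream is a stand-in so the sketch runs without the SDK:

```python
import asyncio

async def take_until(stream, limit: int) -> str:
    # Consume full-text snapshots and stop once `limit` characters
    # have been generated; breaking out of the loop cleans up the stream.
    text = ""
    async for snapshot in stream:
        text = snapshot
        if len(text) >= limit:
            break
    return text

async def fake_stream():
    # Stand-in for session.stream_response(prompt).
    for snapshot in ["He", "Hello", "Hello world", "Hello world!"]:
        yield snapshot

result = asyncio.run(take_until(fake_stream(), 8))
print(result)  # Hello world
```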
See also
respond(): For non-streaming responses with guided generation support