# Text Generation API

Functions for deterministic text generation.
## generate()

Generate deterministic text from a prompt.

```python
def generate(
    prompt: str,
    return_logprobs: bool = False,
    eos_string: str = "[EOS]"
) -> Union[str, Tuple[str, Optional[Dict[str, Any]]]]
```
### Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `prompt` | `str` | required | Input text to generate from |
| `return_logprobs` | `bool` | `False` | Return log probabilities with the text |
| `eos_string` | `str` | `"[EOS]"` | Custom end-of-sequence string |
### Returns

- `str`: the generated text (512 tokens max) when `return_logprobs=False`
- `Tuple[str, Optional[Dict[str, Any]]]`: the generated text and its log probabilities when `return_logprobs=True`
### Examples
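A usage sketch based on the signature above; the prompts are illustrative, and the import path follows the `steadytext.generate` calls used later in this document.

```python
import steadytext

# Basic generation: deterministic text, capped at 512 tokens
text = steadytext.generate("Explain what a frecency cache is")
print(text)

# Request log probabilities alongside the text; the second element
# may be None, per the Optional[Dict] return type above
text, logprobs = steadytext.generate(
    "Explain what a frecency cache is",
    return_logprobs=True,
)

# Stop generation at a custom end-of-sequence string
answer = steadytext.generate("Q: What is 2 + 2?\nA:", eos_string="\n")
```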
## generate_iter()

Generate text iteratively, yielding tokens as produced.

```python
def generate_iter(
    prompt: str,
    eos_string: str = "[EOS]",
    include_logprobs: bool = False
) -> Iterator[Union[str, Tuple[str, Optional[Dict[str, Any]]]]]
```
### Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `prompt` | `str` | required | Input text to generate from |
| `eos_string` | `str` | `"[EOS]"` | Custom end-of-sequence string |
| `include_logprobs` | `bool` | `False` | Yield log probabilities with tokens |
### Returns

Yields:

- `str`: individual tokens/words when `include_logprobs=False`
- `Tuple[str, Optional[Dict[str, Any]]]`: a token and its log probabilities when `include_logprobs=True`
### Examples
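A streaming sketch based on the signature above; the prompt is illustrative.

```python
import steadytext

# Stream tokens as they are produced and print them incrementally
for token in steadytext.generate_iter("Write a short poem about rivers"):
    print(token, end="", flush=True)
print()

# With include_logprobs=True, each item is a (token, logprobs) tuple,
# where logprobs may be None per the Optional[Dict] type above
for token, logprobs in steadytext.generate_iter(
    "Write a short poem about rivers",
    include_logprobs=True,
):
    print(token, end="", flush=True)
```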
## Advanced Usage

### Deterministic Behavior

Both functions return identical results for identical inputs:
```python
import steadytext

# These will always be identical
result1 = steadytext.generate("hello world")
result2 = steadytext.generate("hello world")
assert result1 == result2  # Always passes!

# Streaming produces the same tokens in the same order
tokens1 = list(steadytext.generate_iter("hello world"))
tokens2 = list(steadytext.generate_iter("hello world"))
assert tokens1 == tokens2  # Always passes!
```
### Caching

Results are automatically cached using a frecency cache (LRU + frequency):
```python
import steadytext

# First call: generates and caches the result
text1 = steadytext.generate("common prompt")  # ~2 seconds

# Second call: returns the cached result
text2 = steadytext.generate("common prompt")  # ~0.1 seconds

assert text1 == text2  # Same result, much faster
```
### Fallback Behavior

When models can't be loaded, deterministic fallbacks are used:
```python
import steadytext

# Even without models, these always return the same results
text = steadytext.generate("test prompt")  # Hash-based fallback
assert len(text) > 0  # Always has content

# The fallback is also deterministic
text1 = steadytext.generate("fallback test")
text2 = steadytext.generate("fallback test")
assert text1 == text2  # Same fallback result
```
### Performance Tips

**Optimization Strategies**

- **Preload models**: Call `steadytext.preload_models()` at startup, as shown in the sketch after this list.
- **Batch processing**: Use `generate()` for multiple prompts rather than streaming individual tokens.
- **Cache warmup**: Pre-generate common prompts to populate the cache.
- **Memory management**: Models stay loaded once initialized (singleton pattern).
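A startup sketch combining the preload and cache-warmup tips; the prompt list is illustrative, and `preload_models()` is assumed to take no required arguments.

```python
import steadytext

# Load models once at startup (the singleton pattern keeps them resident)
steadytext.preload_models()

# Warm the frecency cache with prompts you expect to serve frequently
COMMON_PROMPTS = [
    "Summarize the following text:",
    "Translate to French:",
]
for prompt in COMMON_PROMPTS:
    steadytext.generate(prompt)  # result is cached for later calls
```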