LLM Troubleshooting
Rate Limits
429 Too Many Requests
Exceeded 1000 requests/minute.
Solution:
import time
def call_with_backoff(func):
for i in range(5):
try:
return func()
except RateLimitError:
time.sleep(2 ** i)
Model Issues
Model Unavailable
Temporary issue with model.
Solution: Try different model or retry.
Slow Responses
Solutions:
- Use smaller model (mixtral-8x7b)
- Reduce max_tokens
- Check if streaming helps
Response Quality
Irrelevant Responses
Solutions:
- Improve system prompt
- Add examples in messages
- Lower temperature (0.3-0.5)
- Try different model
Repetitive Text
Solutions:
- Increase frequency_penalty (0.5-1.0)
- Add stop sequences
- Reduce temperature