Skip to main content

LLM Troubleshooting

Rate Limits

429 Too Many Requests

Exceeded 1000 requests/minute.

Solution:

import time

def call_with_backoff(func):
for i in range(5):
try:
return func()
except RateLimitError:
time.sleep(2 ** i)

Model Issues

Model Unavailable

Temporary issue with model.

Solution: Try different model or retry.

Slow Responses

Solutions:

  1. Use smaller model (mixtral-8x7b)
  2. Reduce max_tokens
  3. Check if streaming helps

Response Quality

Irrelevant Responses

Solutions:

  1. Improve system prompt
  2. Add examples in messages
  3. Lower temperature (0.3-0.5)
  4. Try different model

Repetitive Text

Solutions:

  1. Increase frequency_penalty (0.5-1.0)
  2. Add stop sequences
  3. Reduce temperature