Response time matters in AI chat experiences. Users expect near-instant replies, and every millisecond you add to the response pipeline increases the chance they’ll notice a delay. Since ChatAds sits between your LLM call and the final response, we’ve designed the API to be as fast as possible — but there are ways to make it even faster depending on your use case.

Optimization Checklist

1. Set extraction_mode=fast

This is the single biggest improvement, dropping keyword extraction from ~500ms to under 100ms. The tradeoff is slightly less accurate term extraction — it may miss nuanced product references — but for most messages with clear product mentions, it performs well.
{
  "message": "The Bose QuietComfort Ultra earbuds are great for commuting.",
  "extraction_mode": "fast"
}

2. Set resolution_mode=fast

Resolution is the step that turns an extracted keyword into an affiliate URL. In standard mode, the API cascades through multiple lookup sources when the first one misses; fast mode does a single-pass lookup without the cascade. It's rarely above 600ms, and often much faster.
{
  "message": "The Bose QuietComfort Ultra earbuds are great for commuting.",
  "extraction_mode": "fast",
  "resolution_mode": "fast"
}

3. Add a timeout to your ChatAds call

If you’d rather show no ad than wait too long, wrap the API call in a timeout. This way your chat response is never blocked — if ChatAds doesn’t respond in time, you simply skip the affiliate link.
const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), 500);

try {
  const response = await fetch("https://api.getchatads.com/v1/chatads/messages", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "x-api-key": "YOUR_API_KEY",
    },
    body: JSON.stringify({ message: aiResponse }),
    signal: controller.signal,
  });
  const data = await response.json();
  // Insert affiliate link into response
} catch (err) {
  // An AbortError means we timed out; either way, send the
  // response without an affiliate link rather than blocking the chat
} finally {
  clearTimeout(timeout);
}

4. Call ChatAds in parallel

If you’re doing any post-processing after the LLM responds — formatting, safety checks, logging — fire the ChatAds call at the same time. The latencies overlap instead of stacking, so as long as ChatAds finishes before your other work does, it adds no perceived delay.
const llmResponse = await openai.chat.completions.create({ ... });

// Run ChatAds and your other post-processing in parallel
let [chatadsResult, formattedResponse] = await Promise.all([
  fetch("https://api.getchatads.com/v1/chatads/messages", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "x-api-key": "YOUR_API_KEY",
    },
    body: JSON.stringify({ message: llmResponse.choices[0].message.content }),
  }).then(r => r.json()),

  formatAndSanitize(llmResponse),
]);

// Insert affiliate link if available
if (chatadsResult.data?.offers?.length) {
  const offer = chatadsResult.data.offers[0];
  formattedResponse = formattedResponse.replace(
    offer.link_text,
    `<a href="${offer.url}">${offer.link_text}</a>`
  );
}
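
One caveat with Promise.all: if the ChatAds fetch rejects (a network error, say), the whole Promise.all rejects and your formatted response is lost with it. A small guard keeps the two branches independent — this is a sketch, and `safely` is a hypothetical helper, not part of any SDK:

```javascript
// Resolve to null instead of rejecting, so one failed branch
// cannot take down the whole Promise.all.
const safely = (promise) => promise.catch(() => null);

// With the parallel pattern above, wrap only the ChatAds branch:
//   const [chatadsResult, formattedResponse] = await Promise.all([
//     safely(chatAdsFetch),          // null on failure — just skip the ad
//     formatAndSanitize(llmResponse) // a real failure here should still surface
//   ]);
```

With the guard in place, the existing `chatadsResult.data?.offers?.length` check only needs a null check in front of it (`chatadsResult?.data?...`) to handle the failure case.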