Create log
POST /api/request-logs/
curl --request POST \
  --url https://api.keywordsai.co/api/request-logs/ \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "input": {},
  "output": {},
  "log_type": "<string>",
  "model": "<string>",
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123,
    "prompt_tokens_details": {},
    "cache_creation_prompt_tokens": 123
  },
  "cost": 123,
  "latency": 123,
  "time_to_first_token": 123,
  "tokens_per_second": 123,
  "metadata": {},
  "customer_identifier": "<string>",
  "customer_params": {
    "customer_identifier": "<string>",
    "name": "<string>",
    "email": "<string>"
  },
  "thread_identifier": "<string>",
  "custom_identifier": "<string>",
  "group_identifier": "<string>",
  "trace_unique_id": "<string>",
  "span_workflow_name": "<string>",
  "span_name": "<string>",
  "span_parent_id": "<string>",
  "tools": [
    {
      "type": "<string>",
      "function": {
        "name": "<string>",
        "description": "<string>",
        "parameters": {}
      }
    }
  ],
  "tool_choice": {},
  "response_format": {},
  "temperature": 123,
  "top_p": 123,
  "frequency_penalty": 123,
  "presence_penalty": 123,
  "max_tokens": 123,
  "stop": {},
  "status_code": 123,
  "error_message": "<string>",
  "warnings": {},
  "status": "<string>",
  "stream": true,
  "prompt_id": "<string>",
  "prompt_name": "<string>",
  "is_custom_prompt": true,
  "timestamp": "<string>",
  "start_time": "<string>",
  "full_request": {},
  "full_response": {},
  "prompt_unit_price": 123,
  "completion_unit_price": 123,
  "keywordsai_api_controls": {
    "block": true
  },
  "positive_feedback": true
}
'


This guide shows you how to log any type of LLM request to Keywords AI using the universal input/output design that supports all span types.
Log size limit: 20MB
Each log payload has a maximum size of 20MB, including the input, output, and all other fields combined. Logs exceeding this limit will be rejected.

Input/Output

Keywords AI uses universal input and output fields across all span types.
  • Chat completions: Messages arrays
  • Embeddings: Text strings or arrays
  • Transcriptions: Audio metadata → text
  • Speech: Text → audio
  • Workflows/Tasks: Any custom data structure
  • Agent operations: Complex nested objects
How it works:
  1. You provide input and output fields in any structure (string, object, array, etc.)
  2. Set log_type to indicate span type ("chat", "embedding", "workflow", etc.)
  3. Keywords AI automatically extracts type-specific fields for backward compatibility
  4. Your data is stored efficiently and retrieved with both universal and type-specific fields
For complete log_type specifications, see log types.
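For example, a workflow span can carry arbitrary structured input and output; the field values below are illustrative:
{
  "log_type": "workflow",
  "span_name": "resolve_ticket",
  "input": {
    "query": "Help with order #12345",
    "context": {"user_id": "123"}
  },
  "output": {
    "resolution": "refund_issued",
    "steps_taken": 3
  }
}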

Legacy field support

For backward compatibility, Keywords AI still supports legacy fields:
prompt_messages
array
Legacy field. Use input instead.
completion_message
object
Legacy field. Use output instead.
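A legacy-style payload might look like the sketch below (illustrative values); new integrations should prefer input and output:
{
  "prompt_messages": [
    {"role": "user", "content": "Hello"}
  ],
  "completion_message": {
    "role": "assistant",
    "content": "Hi! How can I help?"
  },
  "model": "gpt-4o-mini"
}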

Request body

Core fields

input
string | object | array
Universal input field for the span. Structure depends on log_type:
  • Chat: JSON string of messages array or messages array directly
  • Embedding: Text string or array of strings
  • Workflow/Task: Any JSON-serializable structure
  • Transcription: Audio file reference or metadata object
  • Speech: Text string or TTS configuration object
See log types for complete specifications.
"input": "[{\"role\":\"system\",\"content\":\"You are helpful.\"},{\"role\":\"user\",\"content\":\"Hello\"}]"
"input": "Keywords AI is an LLM observability platform"
"input": "{\"query\":\"Help with order #12345\",\"context\":{\"user_id\":\"123\"}}"
output
string | object | array
Universal output field for the span. Structure depends on log_type:
  • Chat: JSON string of completion message or message object directly
  • Embedding: Array of vector embeddings
  • Workflow/Task: Any JSON-serializable result structure
  • Transcription: Transcribed text string
  • Speech: Audio file reference or base64 audio data
"output": "{\"role\":\"assistant\",\"content\":\"Hello! How can I help you?\"}"
"output": "[0.123, -0.456, 0.789, ...]"
log_type
string
default:"chat"
Type of span being logged. Determines how input and output are parsed. Supported types:
  • "chat" - Chat completion requests (default)
  • "completion" - Legacy completion requests
  • "response" - OpenAI Response API
  • "embedding" - Embedding generation
  • "transcription" - Speech-to-text
  • "speech" - Text-to-speech
  • "workflow" or "agent" - Workflow/agent execution
  • "task" or "tool" - Task/tool execution
  • "function" - Function call
  • "generation" - Generation span
  • "handoff" - Agent handoff
  • "guardrail" - Safety check
  • "custom" - Custom span type
If not specified, defaults to "chat". For chat types, the system automatically extracts prompt_messages and completion_message from input and output for backward compatibility. For complete specifications of each type, see log types.
model
string
The model used for the inference. Optional but recommended for chat/completion/embedding types.
"model": "gpt-4o-mini"

Telemetry

Performance metrics and cost tracking for monitoring LLM efficiency.
usage
object
Token usage information for the request.
prompt_tokens
integer
Number of tokens in the prompt/input.
completion_tokens
integer
Number of tokens in the completion/output.
total_tokens
integer
Total tokens (prompt + completion).
prompt_tokens_details
object
Detailed breakdown of prompt tokens (e.g., cached tokens).
cache_creation_prompt_tokens
integer
For Anthropic models: tokens used to create the cache.
{
  "usage": {
    "prompt_tokens": 150,
    "completion_tokens": 85,
    "total_tokens": 235,
    "prompt_tokens_details": {
      "cached_tokens": 10
    }
  }
}
cost
float
Cost of the inference in US dollars. If not provided, it is calculated automatically based on model pricing.
latency
float
Total request latency in seconds. Previously called generation_time; both field names are supported for backward compatibility.
time_to_first_token
float
Time to first token (TTFT) in seconds. Useful for streaming responses and voice AI applications.
Previously called ttft. Both field names are supported.
tokens_per_second
float
Generation speed in tokens per second.
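Putting the telemetry fields together, a log might include the following (hypothetical values):
{
  "cost": 0.00125,
  "latency": 2.35,
  "time_to_first_token": 0.42,
  "tokens_per_second": 36.2
}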

Metadata

Custom tracking and identification parameters for advanced analytics and filtering.
metadata
object
You can add any key-value pairs to this field for your own reference. Useful for custom analytics and filtering.
{
  "metadata": {
    "language": "en",
    "environment": "production",
    "version": "v1.0.0",
    "feature": "chat_support",
    "user_tier": "premium"
  }
}
customer_identifier
string
An identifier for the customer that invoked this request. Helps with visualizing user activities. See customer identifier details.
"customer_identifier": "user_123"
customer_params
object
Extended customer information (alternative to individual customer fields).
customer_identifier
string
Customer identifier.
name
string
Customer name.
email
string
Customer email.
{
  "customer_params": {
    "customer_identifier": "customer_123",
    "name": "John Doe",
    "email": "john.doe@example.com"
  }
}
thread_identifier
string
A unique identifier for the conversation thread. Useful for multi-turn conversations.
custom_identifier
string
Same functionality as metadata, but indexed for faster querying.
"custom_identifier": "ticket_12345"
group_identifier
string
An identifier used to group related logs together.
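For example, the three identifiers can appear together on one log (illustrative values):
{
  "thread_identifier": "thread_abc123",
  "custom_identifier": "ticket_12345",
  "group_identifier": "session_789"
}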

Workflow & tracing

Parameters for distributed tracing and workflow tracking.
trace_unique_id
string
Unique identifier for the trace. Used to link multiple spans together in distributed tracing.
span_workflow_name
string
Name of the workflow this span belongs to.
span_name
string
Name of this specific span/task within the workflow.
span_parent_id
string
ID of the parent span. Used to build the trace hierarchy.
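A child span in a traced workflow might be logged like this (illustrative IDs):
{
  "trace_unique_id": "trace_7f3a9c",
  "span_workflow_name": "customer_support",
  "span_name": "generate_reply",
  "span_parent_id": "span_1b2c3d"
}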

Advanced parameters

Tool calls and function calling

tools
array
A list of tools the model may call. Currently, only functions are supported as tools.
type
string
required
The type of the tool. Currently, only function is supported.
function
object
required
name
string
required
The name of the function.
description
string
A description of what the function does.
parameters
object
The parameters the function accepts.
"tools": [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["location"]
            }
        }
    }
]
tool_choice
string | object
Controls which (if any) tool is called by the model. Can be "none", "auto", or an object specifying a specific tool.
"tool_choice": {
    "type": "function",
    "function": {
        "name": "get_current_weather"
    }
}

Response configuration

response_format
object
Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs.
  • Text: { "type": "text" } - Default response format
  • JSON Schema: { "type": "json_schema", "json_schema": {...} } - Structured outputs
  • JSON Object: { "type": "json_object" } - Legacy JSON format
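For example, a structured-output request can be logged with a schema like the sketch below (the schema contents are illustrative):
"response_format": {
    "type": "json_schema",
    "json_schema": {
        "name": "weather_report",
        "schema": {
            "type": "object",
            "properties": {
                "temperature": {"type": "number"},
                "conditions": {"type": "string"}
            },
            "required": ["temperature", "conditions"]
        }
    }
}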

Model configuration

temperature
number
default:1
Controls randomness in the output (0-2). Higher values produce more random responses.
top_p
number
default:1
Nucleus sampling parameter. Alternative to temperature.
frequency_penalty
number
Penalizes tokens based on their frequency in the text so far.
presence_penalty
number
Penalizes tokens based on whether they appear in the text so far.
max_tokens
integer
Maximum number of tokens to generate.
stop
array[string]
Stop sequences where generation will stop.
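An illustrative sampling configuration (hypothetical values):
{
  "temperature": 0.7,
  "top_p": 1,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "max_tokens": 1024,
  "stop": ["\n\n"]
}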

Error handling and status

status_code
integer
default:200
The HTTP status code for the request. Default is 200 (success).
All valid HTTP status codes are supported: 200, 201, 400, 401, 403, 404, 429, 500, 502, 503, 504, etc.
error_message
string
Error message if the request failed. Defaults to an empty string.
warnings
string | object
Any warnings that occurred during the request.
status
string
Request status. Common values: "success", "error".
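For example, a rate-limited request could be logged as follows (illustrative values):
{
  "status_code": 429,
  "status": "error",
  "error_message": "Rate limit exceeded, please retry after 60 seconds"
}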

Additional configuration

stream
boolean
default:false
Whether the response was streamed.
prompt_id
string
ID of the prompt template used. See Prompts documentation.
prompt_name
string
Name of the prompt template.
is_custom_prompt
boolean
default:false
Whether the prompt is a custom prompt. Set to true if using a custom prompt_id.
timestamp
string
ISO 8601 timestamp when the request completed.
"timestamp": "2025-01-01T10:30:00Z"
start_time
string
ISO 8601 timestamp when the request started.
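For example, paired with the timestamp example above (illustrative value):
"start_time": "2025-01-01T10:29:58Z"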
full_request
object
The full request object. Useful for logging additional configuration parameters.
Tool calls and other nested objects will be automatically extracted from full_request.
full_response
object
The full response object from the model provider.
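For example, full_request can carry provider parameters that have no dedicated top-level field (contents are illustrative):
"full_request": {
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.7,
    "logit_bias": {"1234": -100}
}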

Pricing configuration

prompt_unit_price
number
Custom price per 1M prompt tokens. Used for self-hosted or fine-tuned models.
"prompt_unit_price": 0.0042  // $0.0042 per 1M tokens
completion_unit_price
number
Custom price per 1M completion tokens. Used for self-hosted or fine-tuned models.
"completion_unit_price": 0.0042  // $0.0042 per 1M tokens

API controls

keywordsai_api_controls
object
Control the behavior of the Keywords AI logging API.
block
boolean
default:true
Whether the API call blocks until the log is fully processed. If false, the server immediately returns an initialization status without waiting for log completion.
{
  "keywordsai_api_controls": {
    "block": true
  }
}
positive_feedback
boolean
Whether the user liked the output. true means positive feedback.