OpenAI Realtime SIP Integration
This guide will walk you through integrating Bandwidth's Voice Network with OpenAI's Realtime SIP Interface. This integration allows you to leverage OpenAI's advanced AI capabilities in your call flows.
This integration is only available on the Universal Platform.
Please reach out to your Bandwidth CSM to confirm that your account and trunk configuration are correctly enabled to interconnect with OpenAI's Realtime API via SIP Connector.
What you'll need
- A Universal Platform account + a phone number associated with a Voice Configuration Package
- Your OpenAI Project ID
- Your OpenAI API Key
- Docker
- A publicly accessible server to host your webhook + websocket application (e.g., using ngrok)
- (Optional) Our sample application to get started: bandwidth-samples/openai-realtime-sip-python
Call Flow
Before we dive in, let's walk through what an inbound call flow looks like with this integration.
This flow demonstrates a call that is answered by an AI agent and then transferred to a human agent. Let's break it down:
1. A user calls your Bandwidth number.
2. Bandwidth routes the call to OpenAI's SIP endpoint.
3. OpenAI sends a `realtime.call.incoming` event to your webhook URL.
4. Your application makes a `POST /calls/{callId}/accept` request to accept the call and give the agent its context.
5. OpenAI responds with a `200 OK` response.
6. Your application asynchronously establishes a WebSocket connection with OpenAI to start the stream, enabling you to send commands to the agent during the call.
7. Your application responds with a `200 OK` response to the initial event sent in step 3.
   - This is very important: OpenAI will not connect the call until your application responds with a `200 OK` to the initial `realtime.call.incoming` event.
8. The user and the AI agent can now converse.
9. When the user is ready to speak to a human agent, your application makes a `POST /calls/{callId}/refer` request to transfer the call.
10. OpenAI sends a SIP REFER to Bandwidth to transfer the call.
11. OpenAI responds with a `200 OK` response to your application after Bandwidth accepts the refer request.
12. Bandwidth refers the call, and the user and the human agent can now converse.

Note: `REFER` to a `tel_uri` is not supported yet. You must `REFER` to a `sip_uri` that points to your Bandwidth trunk, which will then route the call to the desired Bandwidth number. The IP address needed for the `REFER` can be found in the `contact` header of the initial `POST` request that OpenAI sends in step 3.
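To make that last point concrete, the host portion of a SIP `Contact` header value can be pulled out with a small parser. This is an illustrative sketch (the helper name and the exact header value shown are our own assumptions, not part of either API):

```python
import re
from typing import Optional

def sip_host_from_contact(contact: str) -> Optional[str]:
    """Extract the host portion of a SIP Contact header value,
    e.g. '<sip:+19195551234@203.0.113.10:5060;transport=udp>' -> '203.0.113.10'.
    """
    match = re.search(r"sip:(?:[^@>;\s]+@)?([^:;>\s]+)", contact)
    return match.group(1) if match else None

# Example (illustrative header value):
print(sip_host_from_contact("<sip:+19195551234@203.0.113.10:5060;transport=udp>"))
# -> 203.0.113.10
```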
Let's Build It!
For convenience, we have provided a sample application to get you started. You can find it here: bandwidth-samples/openai-realtime-sip-python. The sample application is built with Python and FastAPI, but you can use any language or framework you prefer, such as NodeJS + Express or Java + Spring.
To run the sample application, simply clone the repository and follow the instructions in the README.
The following sections will walk you through the sample application code to help you understand how it works.
Set Up our Environment
Let's first clone the sample application:
```shell
git clone https://github.com/Bandwidth-Samples/openai-realtime-sip-python
cd openai-realtime-sip-python
```
The application provides a docker compose file to help you get started quickly, but you can also run the application via your local Python environment if you prefer.
First, ensure you have a `.env` file in the root of the project with the following variables:
```shell
export OPENAI_API_KEY="your_openai_api_key_here"
export OPENAI_SIGNING_SECRET="your_openai_signing_secret_here"
export REFER_TO="+19195554321"
export LOG_LEVEL="DEBUG"
export LOCAL_PORT=3000
```
Using Docker
```shell
docker compose up --build
```
Using Local Python Environment
```shell
python -m venv .venv
source .venv/bin/activate
cd app
pip install -r requirements.txt
python main.py
```
A successful startup should log the following:
```
INFO: Will watch for changes in these directories: ['/app']
INFO: Uvicorn running on http://0.0.0.0:3000 (Press CTRL+C to quit)
INFO: Started reloader process [1] using WatchFiles
INFO: Started server process [8]
INFO: Waiting for application startup.
INFO: Application startup complete.
```
The application runs on port `3000` by default, but this can be overridden by setting the `LOCAL_PORT` environment variable.
Creating our FastAPI Server
The sample application uses FastAPI to create a simple web server that can handle incoming HTTP requests from OpenAI.
The sample application also provides a `models` directory that contains Pydantic models for the various OpenAI webhook events. We won't define those models here, but you can find them in the `models` directory of the sample application.
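To give a feel for what those models do, here is a simplified, dict-based sketch of the two accessors the webhook handler relies on. The payload shape shown is an assumption for illustration only; the authoritative Pydantic definitions live in the sample repo:

```python
# Illustrative stand-ins for the sample app's Pydantic model methods.
# Assumed payload shape: {"type": "...", "data": {"call_id": "..."}}.

def is_incoming_call(payload: dict) -> bool:
    # True only for the webhook event this guide cares about
    return payload.get("type") == "realtime.call.incoming"

def get_call_id(payload: dict) -> str:
    # The call id is needed to build the /accept and /refer URLs
    return payload["data"]["call_id"]

event = {"type": "realtime.call.incoming", "data": {"call_id": "rtc_abc123"}}
print(is_incoming_call(event), get_call_id(event))
```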
```python
# main.py
#!/usr/bin/env python3
# ...imports...

# Set our Environment Variables
try:
    OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]
    OPENAI_SIGNING_SECRET = os.environ["OPENAI_SIGNING_SECRET"]
    REFER_TO = os.environ["REFER_TO"]
    LOG_LEVEL = os.environ["LOG_LEVEL"]
    LOCAL_PORT = int(os.environ.get("LOCAL_PORT", 3000))
except KeyError as e:
    print(f"environment variable {e} not set")
    exit(1)

app = FastAPI()

# Health Check
@app.get("/health", status_code=http.HTTPStatus.NO_CONTENT)
def health():
    return

# Handle Inbound Call Event from OpenAI
@app.post("/webhooks/openai/realtime/call/inbound", status_code=http.HTTPStatus.OK)
def handle_inbound_call(payload: RealtimeCallIncoming) -> Response:
    return Response()

def start_server(port: int) -> None:
    uvicorn.run(
        "main:app",
        host="0.0.0.0",
        port=port,
        log_level="debug",
        reload=True,
    )

if __name__ == "__main__":
    start_server(LOCAL_PORT)
```
The above code creates a FastAPI application with two endpoints:
- A health check endpoint at `/health` that returns a `204 No Content` response.
- A webhook endpoint at `/webhooks/openai/realtime/call/inbound` that handles incoming call events from OpenAI. Right now, it simply returns a `200 OK` response.

The brunt of the logic will be added to the `handle_inbound_call` function.
Handle Inbound Call Event
When a user calls your Bandwidth number, Bandwidth will route the call to OpenAI's SIP endpoint. OpenAI will then send a `realtime.call.incoming` event to your webhook URL. Let's implement the logic to handle this event.
```python
# main.py
# Now we need some of our constants
AUTH_HEADER = {"Authorization": f"Bearer {OPENAI_API_KEY}"}
OPENAI_REALTIME_CALLS_BASE_URL = "https://api.openai.com/v1/realtime/calls/"
GREETING = "Hello! How can I assist you today?"
AGENT_PROMPT = "You are a helpful customer support agent."
REFER_TOOL = Tool(
    type="function",
    name="refer",
    description="Transfer the call to another person whenever the caller requests to be transferred or to speak to a person.",
)
CALL_ACCEPTANCE_REQUEST = CallAcceptanceRequest(
    type="realtime",
    instructions=AGENT_PROMPT,
    model="gpt-4o-realtime-preview",
    tools=[REFER_TOOL],
)

@app.post("/webhooks/openai/realtime/call/inbound", status_code=http.HTTPStatus.OK)
def handle_inbound_call(payload: RealtimeCallIncoming) -> Response:
    if payload.is_incoming_call():
        # Grab the relevant info from the incoming event and stash it in a variable
        call_id = payload.get_call_id()
        acceptance_response = requests.post(
            OPENAI_REALTIME_CALLS_BASE_URL + call_id + "/accept",
            headers={**AUTH_HEADER, "Content-Type": "application/json"},
            json=CALL_ACCEPTANCE_REQUEST.model_dump(),
        )
        if acceptance_response.status_code != http.HTTPStatus.OK:
            return Response(status_code=http.HTTPStatus.INTERNAL_SERVER_ERROR)
    return Response()
```
Let's look at some of our constants in more detail:
- `AUTH_HEADER`: This is the authorization header that we will use to authenticate our requests to OpenAI. It contains our API key.
- `OPENAI_REALTIME_CALLS_BASE_URL`: This is the base URL for the OpenAI Realtime Calls API.
- `GREETING`: This is a simple greeting message that the agent will say when the call is answered. We will use this later when we implement the websocket connection.
- `AGENT_PROMPT`: This is the prompt that we will use to instruct the agent on how to behave during the call. The sample app includes a more detailed example prompt for demonstration purposes.
- `REFER_TOOL`: This is a tool that we will provide to the agent to allow it to transfer the call to another person.
- `CALL_ACCEPTANCE_REQUEST`: This is the request body that we will send to OpenAI when we accept the call.
Now let's walk through what we just did:
- We defined some constants that we will use later in the function.
- We check whether the incoming event is indeed an incoming call using the `is_incoming_call` method on the `RealtimeCallIncoming` model.
- We extract the `call_id` from the incoming event using the `get_call_id` method on the `RealtimeCallIncoming` model.
- We make a `POST /calls/{callId}/accept` request to OpenAI to accept the call and provide the agent with its context.
- We check whether the response from OpenAI is a `200 OK` response. If not, we return a `500 Internal Server Error` response.
- Finally, we return a `200 OK` response to OpenAI to let them know that we have successfully handled the event.
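If you want to sanity-check the acceptance payload without the Pydantic models, the JSON body that `CALL_ACCEPTANCE_REQUEST.model_dump()` produces should look roughly like the plain dict below. The field set is taken from the constants above; treat this as a sketch, not the authoritative schema:

```python
import json

# Plain-dict equivalent of CALL_ACCEPTANCE_REQUEST.model_dump()
accept_body = {
    "type": "realtime",
    "model": "gpt-4o-realtime-preview",
    "instructions": "You are a helpful customer support agent.",
    "tools": [
        {
            "type": "function",
            "name": "refer",
            "description": (
                "Transfer the call to another person whenever the caller "
                "requests to be transferred or to speak to a person."
            ),
        }
    ],
}
print(json.dumps(accept_body, indent=2))
```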
Establish WebSocket Connection
```python
# main.py
# We need a couple more constants now
OPENAI_REALTIME_WEBSOCKET_URL = "wss://api.openai.com/v1/realtime"
CREATE_RESPONSE_REQUEST = CreateResponseRequest(
    response=OpenAIResponse(instructions=f"Say to the caller: '{GREETING}'")
)

# This function will handle our websocket connection asynchronously
# OpenAI uses websockets to stream information about the conversation
# We can interact with the call and instruct the agent in real-time using this connection
async def websocket_task(call_id: str, sip_host: str = None) -> None:
    try:
        async with websockets.connect(
            f"{OPENAI_REALTIME_WEBSOCKET_URL}?call_id={call_id}",
            additional_headers=AUTH_HEADER,
        ) as websocket:
            await websocket.send(CREATE_RESPONSE_REQUEST.model_dump_json())
            while True:
                response = await websocket.recv()
                realtime_message = RealtimeMessage(**json.loads(response))
                # Handle different message types from the OpenAI WebSocket Connection
                # There are many different types of messages that OpenAI can send us
                # For now we are only concerned with the "response.output_item.done" message type
                # This message type indicates that the agent has finished processing a response
                # and is ready for us to take action (e.g., transfer the call)
                match realtime_message.type:
                    case "response.output_item.done":
                        if realtime_message.item.type == "function_call":
                            if realtime_message.item.name == "refer":
                                requests.post(
                                    f"{OPENAI_REALTIME_CALLS_BASE_URL}{call_id}/refer",
                                    headers=AUTH_HEADER,
                                    json={"target_uri": f"sip:{REFER_TO}@{sip_host}"},
                                )
                    case _:
                        pass
    except Exception as e:
        print(f"WebSocket error: {e}")

@app.post("/webhooks/openai/realtime/call/inbound", status_code=http.HTTPStatus.OK)
def handle_inbound_call(payload: RealtimeCallIncoming) -> Response:
    if payload.is_incoming_call():
        # Grab the relevant info from the incoming event and stash it in a variable
        call_id = payload.get_call_id()
        # we need this for our REFER request
        sip_host = payload.get_sip_host()
        acceptance_response = requests.post(
            OPENAI_REALTIME_CALLS_BASE_URL + call_id + "/accept",
            headers={**AUTH_HEADER, "Content-Type": "application/json"},
            json=CALL_ACCEPTANCE_REQUEST.model_dump(),
        )
        if acceptance_response.status_code != http.HTTPStatus.OK:
            inspect(acceptance_response.json(), title="OpenAI API Error")
            return Response(status_code=http.HTTPStatus.INTERNAL_SERVER_ERROR)
        # New step - Start our websocket connection in a new thread
        threading.Thread(
            target=asyncio.run,
            args=(websocket_task(call_id, sip_host),),
            daemon=True,
        ).start()
    return Response()
```
In the above code, we added a new function called `websocket_task` that establishes a WebSocket connection with OpenAI. This function is called in a new thread after we accept the call.
Let's break down what we did:
- We defined a new constant called `OPENAI_REALTIME_WEBSOCKET_URL` that contains the URL for the OpenAI Realtime WebSocket endpoint.
- We defined a new constant called `CREATE_RESPONSE_REQUEST` that contains the request body that we will send to OpenAI to create a response.
- We created a new asynchronous function called `websocket_task` that takes the `call_id` and `sip_host` as parameters.
- Inside the `websocket_task` function, we establish a WebSocket connection with OpenAI using the `websockets` library.
- We send a `CREATE_RESPONSE_REQUEST` to OpenAI to instruct the agent to say our greeting message to the caller.
- We enter a loop where we listen for messages from OpenAI.
- We use a `match` statement to handle different types of messages from OpenAI.
  - In this case, we are only concerned with the `response.output_item.done` message type, which indicates that the agent has finished processing a response and is ready for us to take action.
  - If the message contains a `function_call` item with the name `refer`, we make a `POST /calls/{callId}/refer` request to OpenAI to transfer the call to the number specified in the `REFER_TO` environment variable.
- We handle any exceptions that may occur during the WebSocket connection and log them to the console.
- Finally, we start the `websocket_task` function in a new thread after we accept the call.
- We also extract the `sip_host` from the incoming event using the `get_sip_host` method on the `RealtimeCallIncoming` model. This is needed for our `REFER` request.
- We pass the `sip_host` to the `websocket_task` function so that it can be used in the `REFER` request.
- We return a `200 OK` response to OpenAI to let them know that we have successfully handled the event.
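The refer decision inside `websocket_task` is easy to unit-test if you factor it into a pure function that never touches the network. Here is a hedged sketch: the dict shape mirrors the `RealtimeMessage` fields used above, and the helper name is our own invention:

```python
from typing import Optional

def refer_target(message: dict, refer_to: str, sip_host: str) -> Optional[str]:
    """Return the REFER target URI when the message is a completed 'refer'
    function call; otherwise return None."""
    if message.get("type") != "response.output_item.done":
        return None
    item = message.get("item") or {}
    if item.get("type") == "function_call" and item.get("name") == "refer":
        return f"sip:{refer_to}@{sip_host}"
    return None

msg = {"type": "response.output_item.done",
       "item": {"type": "function_call", "name": "refer"}}
print(refer_target(msg, "+19195554321", "203.0.113.10"))
# -> sip:+19195554321@203.0.113.10
```

Factoring the decision out this way also keeps the websocket loop itself small: it only parses messages and posts the request when `refer_target` returns a URI.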
Connect to your Public Webhook URL
Now that we have our application running locally, we need to expose it to the internet so that OpenAI can send webhook events to it. You can use a tool like ngrok to create a secure tunnel to your local server.
In a new terminal window, run the following command:
```shell
ngrok http 3000
```
This will give you a public URL that you can use to configure your OpenAI project.
Configure your OpenAI Project
Now that you have a public URL for your webhook, you need to configure your OpenAI project to use it.
- Log in to the OpenAI Dashboard.
- Navigate to the "Webhooks" section of your project settings.
- Add a new webhook with the following details:
  - Name: `Realtime Inbound Call - SIP`
  - URL: `https://{your-ngrok-id}.ngrok-free.app/webhooks/openai/realtime/call/inbound`
  - Events: Select `realtime.call.incoming`
Configure your Bandwidth Voice Configuration Package
Finally, you need to configure your Bandwidth Voice Configuration Package to route inbound calls to OpenAI's SIP endpoint.
- Log in to the Bandwidth Dashboard.
- Under `Service Management`, select `Voice Configuration` and create a new Voice Configuration Package or edit an existing one.
- Add a new Route and select `Route to SIP URI`.
- In the `SIP URI` field, enter `sip:{my_project_id}@sip.api.openai.com`, where `{my_project_id}` is your OpenAI Project ID, e.g., `sip:proj_12345abc@sip.api.openai.com`.
- Save your changes.
- Assign the Voice Configuration Package to your Bandwidth phone number.
Test the Integration
Now that everything is set up, you can test the integration by calling your Bandwidth phone number!
You should hear the AI agent greet you and be able to have a conversation with it.
When you're ready to speak to a human agent, simply ask to be transferred, and the call will be transferred to the number specified in the `REFER_TO` environment variable.