Transcription Notifications
Overview
This feature requires a cloud-based Edge, so only Genesys Cloud Voice (GCV) and Bring Your Own Carrier Cloud (BYOCC) deployments are supported. BYOC Premises (BYOCP) deployments are not supported.
Users of this feature will need to create a Notification channel in order to subscribe to any Notification topics.
Public API Notification endpoints for creating channels and subscribing to topics can be found here: Notifications API
This feature introduces the new v2.conversations.{id}.transcription topic, which is listed on this page: Public Notification API Topics
Clients that wish to subscribe to this topic will need the conversation:transcription:view permission.
Note that this permission is not division-aware or otherwise limited in scope: it allows the client to subscribe to transcripts from any live conversation within the organization.
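As a rough illustration of the subscription step, the sketch below builds the topic name for a conversation and a request body that lists it. The helper names are hypothetical, and the body shape (a JSON array of objects with an id field) is an assumption about the channel subscription endpoint; the Notifications API reference is the authoritative source for the schema.

```python
import json


def transcription_topic(conversation_id: str) -> str:
    """Return the transcription topic name for one conversation."""
    return f"v2.conversations.{conversation_id}.transcription"


def subscription_body(conversation_ids: list) -> str:
    """Build a JSON body listing one transcription topic per conversation.

    Assumption: the subscription endpoint accepts an array of {"id": topic}.
    """
    topics = [{"id": transcription_topic(c)} for c in conversation_ids]
    return json.dumps(topics)


if __name__ == "__main__":
    print(subscription_body(["504ffdc4-03d2-4ca2-9a5c-8704647f4835"]))
```

Because the permission is organization-wide, a client would typically decide for itself which conversation ids to pass in.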
Flow of Messages
As the transcription request for a conversation is being processed, TranscriptionMessages are dispatched to the Transcription Notification topic whose id value corresponds to the conversationId of the conversation.
As speech audio for the conversation is detected and processed, and transcription engines return transcript results, the transcripts are collected and periodically grouped into TranscriptionMessages which get published to the Notification topic. Depending on various factors, these messages could be emitted within a window (usually about 35 seconds) from the moment that speech is detected. These messages are sent with status SESSION_ONGOING.
In addition to the periodic messages containing transcripts, at the start of each transcription session and at an interval of every 5 minutes after that while the session is still active, TranscriptionMessages without any transcript data and only with the SESSION_ONGOING status will be dispatched. These messages are sent regardless of whether the session processes any audio or receives any transcripts. This means that the absence of any messages on the topic for over 5 minutes would indicate that no transcription session is in progress for the corresponding conversation (i.e. the session either has not started or has already ended).
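The five-minute rule above can be turned into a simple liveness check. The following is a minimal sketch, not a prescribed client design: the class name, the grace period, and the use of a monotonic clock are all assumptions made for illustration.

```python
import time
from typing import Dict, Optional

# Status-only messages are described as arriving every 5 minutes while a session is active.
HEARTBEAT_INTERVAL_S = 5 * 60


class SessionLivenessTracker:
    """Remembers when the last message arrived for each conversation."""

    def __init__(self, grace_s: float = 30.0) -> None:
        # grace_s is a hypothetical allowance for delivery delay.
        self.timeout_s = HEARTBEAT_INTERVAL_S + grace_s
        self.last_seen: Dict[str, float] = {}

    def on_message(self, conversation_id: str, now: Optional[float] = None) -> None:
        """Record the arrival of any TranscriptionMessage for the conversation."""
        self.last_seen[conversation_id] = time.monotonic() if now is None else now

    def is_presumed_active(self, conversation_id: str, now: Optional[float] = None) -> bool:
        """Return True if a message arrived within the timeout window."""
        current = time.monotonic() if now is None else now
        seen = self.last_seen.get(conversation_id)
        return seen is not None and (current - seen) <= self.timeout_s
```

A client would call on_message for every message received on the topic and poll is_presumed_active when deciding whether to keep waiting for transcripts.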
TranscriptionMessage Details
All of the fields that can be included in TranscriptionMessages are detailed in the tables below.
Root object
Note that transcripts in the transcripts array are not guaranteed to be ordered in time. The offsetMs values of the transcripts' alternatives can be used to order them by the time at which the corresponding audio occurred in the conversation.
Field Name | Type | Required | Description |
eventTime | string (Timestamp) | yes | ISO-8601 time at which the producer published the event that was used to generate this message. |
organizationId | string (UUID) | yes | Genesys Cloud organization id of the organization that the conversation belongs to. |
conversationId | string (UUID) | yes | Genesys Cloud conversation id of the conversation that the transcript is for. |
communicationId | string (UUID) | no | Genesys Cloud communication id. Null if unavailable. |
sessionStartTimeMs | long | no | Epoch time in milliseconds at which the transcription session started. Null if unavailable. |
transcriptionStartTimeMs | long | no | Epoch time in milliseconds at which the audio stream for the conversation being transcribed started (offsetMs is 0 at this time). Null if unavailable. |
transcripts | TranscriptResult[] | no | The transcripts included in this message. It can be null if the message is only for conveying status information. |
status | TranscriptionRequestStatus | yes | An object representing the status of the transcription request session at a given point in time. |
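Since the transcripts array is not guaranteed to be time-ordered, a consumer may want to sort it before use. The sketch below orders transcripts by the offsetMs of their first alternative. The function names are hypothetical, and the fallback to an offsetMs on the transcript itself is an assumption based on the sample payloads in this document, where the offset appears at that level.

```python
def transcript_offset_ms(transcript: dict) -> int:
    """Return the offset used to order one TranscriptResult."""
    alternatives = transcript.get("alternatives") or []
    if alternatives and alternatives[0].get("offsetMs") is not None:
        return alternatives[0]["offsetMs"]
    # Assumption: some payloads carry offsetMs on the result rather than the alternative.
    return transcript.get("offsetMs") or 0


def ordered_transcripts(message: dict) -> list:
    """Return the message's transcripts sorted by audio offset (oldest first)."""
    return sorted(message.get("transcripts") or [], key=transcript_offset_ms)
```

Status-only messages, where transcripts is null or absent, simply yield an empty list.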
TranscriptionRequestStatus object
Field Name | Type | Required | Description |
offsetMs | long | yes | Offset in milliseconds at which this status is valid from the start of the audio stream. |
status | string (enum) | yes | Current status of the transcription session. Possible values are "UNKNOWN", "SESSION_ONGOING", and "SESSION_ENDED"; more values may be added in the future. |
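Because the set of status values may grow, a client should tolerate values it does not recognize. The sketch below maps the status string onto a local enum and falls back to UNKNOWN for anything else. The enum and function names are hypothetical; the underscore spelling follows the example payloads in this document, and the spelling of the ended status is assumed by analogy.

```python
from enum import Enum


class SessionStatus(Enum):
    UNKNOWN = "UNKNOWN"
    SESSION_ONGOING = "SESSION_ONGOING"
    SESSION_ENDED = "SESSION_ENDED"  # spelling assumed by analogy with SESSION_ONGOING


def parse_status(message: dict) -> SessionStatus:
    """Read the session status, treating missing or unrecognized values as UNKNOWN."""
    raw = (message.get("status") or {}).get("status")
    try:
        return SessionStatus(raw)
    except ValueError:
        # Raised for None and for any value not defined in the local enum.
        return SessionStatus.UNKNOWN
```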
TranscriptResult object
Field Name | Type | Required | Description |
utteranceId | string (UUID) | yes | Unique id associated with this transcript. |
isFinal | boolean | yes | Whether this is a final result (as opposed to an interim result) from the transcription engine. |
channel | string (enum) | yes | Channel of the participant that uttered the phrase corresponding to this transcript. Possible values include "UNKNOWN", "EXTERNAL", "INTERNAL". |
alternatives | TranscriptAlternative[] | yes | The alternatives for this transcript result. Will always contain at least one alternative corresponding to the lexical transcription. |
engineId | string | yes | Id corresponding to the transcription engine that generated this result. |
dialect | string | yes | Dialect configured for the transcription engine at the time this result was generated. |
speechTextAnalyticsProgramId | string | no | ProgramId configured for the transcription engine at the time this result was generated. Null if no programId was used. |
agentAssistantId | string (UUID) | no | AssistantId configured for Agent Assist at the time this result was generated. Null if Agent Assist was not configured for this segment. |
agentAssistEnabled | boolean | yes | Whether Agent Assist was enabled at the time this transcript was generated. Also false if Agent Assist processing failed. |
voiceTranscriptionEnabled | boolean | yes | Whether Voice Transcription was enabled at the time this transcript was generated. |
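One common use of these fields is to keep only final results and group them by speaker channel. The sketch below does that, taking the first entry of alternatives as the text to keep. The function name is hypothetical, and treating the first alternative as the preferred one is an assumption rather than documented behavior.

```python
def final_texts_by_channel(message: dict) -> dict:
    """Group the text of final transcripts by channel, skipping interim results."""
    grouped = {}
    for transcript in message.get("transcripts") or []:
        if not transcript.get("isFinal"):
            continue  # interim result; a later result may revise it
        alternatives = transcript.get("alternatives") or []
        if not alternatives:
            continue
        # Assumption: the first alternative is the one to display.
        text = alternatives[0].get("transcript", "")
        grouped.setdefault(transcript.get("channel", "UNKNOWN"), []).append(text)
    return grouped
```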
TranscriptAlternative object
Field Name | Type | Required | Description |
confidence | double | no | Confidence score of this alternative from 0 to 1. Alternatives are sorted by confidence in the alternatives array. |
offsetMs | long | yes | Offset in milliseconds of this transcript alternative from the beginning of the audio stream. |
durationMs | long | yes | Duration in milliseconds of this transcript alternative. |
transcript | string | yes | Transcript alternative generated by the transcription engine. |
decoratedTranscript | string | no | Decorated transcript generated by the transcription engine. Can be null if not provided by the transcription engine. |
words | TranscriptWord[] | no | Breakdown of words for this transcript alternative. Can be null if not provided by the transcription engine. |
decoratedWords | TranscriptWord[] | no | Breakdown of decorated words for this transcript alternative. Can be null if not provided by the transcription engine. |
TranscriptWord object
Field Name | Type | Required | Description |
confidence | double | no | Confidence score of this word from 0 to 1. |
offsetMs | long | yes | Offset in milliseconds at which this word was uttered from the beginning of the audio stream. Words are sorted by offsetMs within their containing array. |
durationMs | long | yes | Duration in milliseconds of the word. |
word | string | yes | Word generated by the transcription engine. |
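Word-level confidence can be useful when the alternative-level score is missing or uninformative (the sample payloads in this document show an alternative confidence of 0.0 alongside populated word scores). The sketch below averages the word confidences of one alternative. The function name is hypothetical, and using a plain mean is an illustrative choice, not a documented scoring method.

```python
from statistics import fmean
from typing import Optional


def mean_word_confidence(alternative: dict) -> Optional[float]:
    """Average the per-word confidence scores of a TranscriptAlternative.

    Returns None when the alternative has no words or no word carries a score,
    since both words and confidence are optional fields.
    """
    scores = [
        word["confidence"]
        for word in alternative.get("words") or []
        if word.get("confidence") is not None
    ]
    return fmean(scores) if scores else None
```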
Example - With transcripts
{
  "eventTime": "2021-05-27T19:40:46.871Z",
  "organizationId": "376d79ab-87f6-4fa2-87e7-aaecfd873d8a",
  "conversationId": "504ffdc4-03d2-4ca2-9a5c-8704647f4835",
  "communicationId": "f4fee41b-8484-4f4a-bc12-e3b18642aff0",
  "transcripts": [
    {
      "utteranceId": "d04adf44-1d1e-490b-9764-847fde7ff574",
      "isFinal": false,
      "channel": "EXTERNAL",
      "alternatives": [
        {
          "confidence": 0.0,
          "transcript": "you know when they thought uh",
          "words": [
            {
              "confidence": 0.951,
              "offsetMs": 120018,
              "durationMs": 59,
              "word": "you"
            },
            {
              "confidence": 0.991,
              "offsetMs": 120315,
              "durationMs": 79,
              "word": "know"
            },
            {
              "confidence": 0.971,
              "offsetMs": 120493,
              "durationMs": 79,
              "word": "when"
            },
            {
              "confidence": 0.987,
              "offsetMs": 120651,
              "durationMs": 79,
              "word": "they"
            },
            {
              "confidence": 0.939,
              "offsetMs": 120849,
              "durationMs": 139,
              "word": "thought"
            },
            {
              "confidence": 0.714,
              "offsetMs": 121324,
              "durationMs": 40,
              "word": "uh"
            }
          ]
        }
      ],
      "engineId": "r2d2",
      "dialect": "en-US",
      "offsetMs": 120018,
      "durationMs": 1346,
      "agentAssistEnabled": false,
      "voiceTranscriptionEnabled": true
    },
    {
      "utteranceId": "eb885c99-329b-4705-8d6f-6a9a8d37879b",
      "isFinal": false,
      "channel": "EXTERNAL",
      "alternatives": [
        {
          "confidence": 0.0,
          "transcript": "they thought that niro",
          "words": [
            {
              "confidence": 0.864,
              "offsetMs": 122077,
              "durationMs": 79,
              "word": "they"
            },
            {
              "confidence": 0.979,
              "offsetMs": 122451,
              "durationMs": 138,
              "word": "thought"
            },
            {
              "confidence": 0.962,
              "offsetMs": 122648,
              "durationMs": 79,
              "word": "that"
            },
            {
              "confidence": 0.675,
              "offsetMs": 122825,
              "durationMs": 276,
              "word": "niro"
            }
          ]
        }
      ],
      "engineId": "r2d2",
      "dialect": "en-US",
      "offsetMs": 122077,
      "durationMs": 1024,
      "agentAssistEnabled": false,
      "voiceTranscriptionEnabled": true
    }
  ],
  "status": {
    "offsetMs": 122077,
    "status": "SESSION_ONGOING"
  }
}
Example - Status only
{
  "eventTime": "2021-05-27T19:41:22.883Z",
  "organizationId": "376d79ab-87f6-4fa2-87e7-aaecfd873d8a",
  "conversationId": "fbf17d33-7e62-4626-a424-5ae0b49440cd",
  "communicationId": "2a56e2fa-c784-4c08-a387-f6abc38cf3c3",
  "status": {
    "offsetMs": 0,
    "status": "SESSION_ONGOING"
  }
}
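The two examples differ only in whether transcripts is present, so a consumer can branch on that field. The sketch below parses a raw message and summarizes it. The function name and the output format are hypothetical; an empty transcripts array is treated the same as a missing or null one, which is an assumption.

```python
import json


def describe(raw: str) -> str:
    """Summarize a raw TranscriptionMessage as a one-line string."""
    message = json.loads(raw)
    status = message["status"]  # status is a required field
    transcripts = message.get("transcripts") or []  # null, absent, or empty
    kind = f"{len(transcripts)} transcript(s)" if transcripts else "status-only"
    return f"{status['status']} at {status['offsetMs']} ms: {kind}"


if __name__ == "__main__":
    sample = '{"status": {"offsetMs": 0, "status": "SESSION_ONGOING"}}'
    print(describe(sample))
```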