Voice Transcription API Response

Stephanie_Fong · April 15, 2021, 11:27pm

I'm using the PureCloud::SpeechTextAnalyticsApi to pull down the voice transcription for a call.

I used the conversation_id and communication_id to get the transcript_url, and then successfully made a GET request to the transcript_url and am now working with the Transcript object described here:
https://developer.genesys.cloud/api/rest/v2/speechtextanalytics/transcript_url

We want to keep a record of what was said during the call, and who said it. It looks like the "phrases" property has all the "text" values we need. My issue now is that the only property that helps us attribute a phrase's "text" to the person who said it is the "participantPurpose" value (either "external" or "internal").

In the case of a phone call where the customer was transferred and spoke to multiple internal Agents, how would we go about attributing a given phrase (with participantPurpose=internal) to the correct agent?

Stephanie_Fong · April 19, 2021, 7:04pm

2 more questions I had regarding voice transcription:

Is there an attribute anywhere that tells us whether a call has finished being transcribed, so we know when the transcript is ready to be pulled down?
The transcription url returns a list of phrases, with the only time-related attribute being startTimeMs. How do we go about using this to figure out a timestamp? For example, one of the values I received was:
"startTimeMs"=>1618594874451

anon11147534 · April 21, 2021, 8:47am

Hi,

For your first question, there doesn't appear to be a way to map a transcript to an exact participant. Please feel free to request this feature using our ideas portal.

I'm not sure about the second question because I'm not familiar with the API. It's possible that polling could be used to check when it has finished transcribing.
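
If you go the polling route, a generic retry loop could look like the sketch below. It assumes the transcript URL request fails or returns nothing until the transcript is ready, which I have not confirmed. `poll_until` and `fetch_transcript_url` are hypothetical names, not part of the SDK.

```ruby
# Hypothetical helper: run the block until it returns a truthy value
# or the attempts run out. Returns the block's result, or nil on timeout.
def poll_until(attempts: 10, delay: 30)
  attempts.times do |i|
    result = yield
    return result if result
    # Wait between attempts, but not after the final one.
    sleep(delay) if i < attempts - 1
  end
  nil
end

# Usage sketch. fetch_transcript_url is a hypothetical wrapper around the
# SDK call that returns nil (for example by rescuing the API error)
# while the transcript is not available yet:
#
#   url = poll_until(attempts: 20, delay: 60) do
#     fetch_transcript_url(conversation_id, communication_id)
#   end
```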

For the last question, that value is the number of milliseconds since the Unix epoch (January 1, 1970, 00:00 UTC). I'm sure Ruby has a utility function to turn that into a human-readable timestamp.
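
As a minimal sketch, the conversion can be done with Ruby's built-in `Time.at`, splitting the value into whole seconds and leftover microseconds so no precision is lost. The helper name `epoch_ms_to_time` is just an illustration.

```ruby
require 'time' # adds Time#iso8601

# Convert epoch milliseconds (as found in startTimeMs) to a UTC Time.
def epoch_ms_to_time(epoch_ms)
  seconds = epoch_ms / 1000
  microseconds = (epoch_ms % 1000) * 1000
  Time.at(seconds, microseconds).utc
end

timestamp = epoch_ms_to_time(1618594874451)
puts timestamp.iso8601(3) # prints 2021-04-16T17:41:14.451Z
```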

Jerome.Saint-Marc · April 21, 2021, 4:28pm

Hello,

Just an idea regarding your first question.
I have only used Speech Analytics and transcripts once or twice.
It is not straightforward, but you could possibly identify the agent by requesting the conversation details [GET /api/v2/analytics/conversations/{conversationId}/details].

In the transcript, retrieve the startTimeMs from the agent's phrase (participantPurpose = internal).

Then, in the conversation details, look for the agent participants (participants with purpose="agent").
Find a segment with segmentType="interact" (participant -> sessions array -> segments array) for which the startTimeMs (from the transcript phrase) falls between the segment's segmentStart and segmentEnd (after converting those UTC datetimes to Unix/Epoch timestamps in milliseconds).

This should work for calls where a single agent is interacting with the customer at a time (first agent, transfer to different agents).
But it won't be enough if there is a conference (customer and 2 or more agents).

Note: It might be easier to do the processing the other way around.
I mean extracting the agent participants and their segmentStart/segmentEnd (segmentType="interact") and converting these datetimes to Unix/Epoch timestamps in milliseconds. Then go through the transcript phrases with participantPurpose="internal" and check which "bucket" each one falls into.
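
Here is a rough Ruby sketch of that bucket approach, working on the parsed JSON (plain Hashes) of the conversation details and the transcript phrases. The field names purpose, sessions, segments, segmentType, segmentStart, segmentEnd, participantPurpose and startTimeMs are the ones mentioned above. Using participantId as the agent identifier is my assumption, and the helper names are hypothetical.

```ruby
require 'time'

# Convert an ISO 8601 UTC datetime string to epoch milliseconds.
def to_epoch_ms(iso8601)
  (Time.iso8601(iso8601).to_r * 1000).round
end

# Build one bucket per "interact" segment of each agent participant.
def agent_buckets(details)
  agents = details.fetch('participants', []).select { |p| p['purpose'] == 'agent' }
  agents.flat_map do |participant|
    segments = participant.fetch('sessions', []).flat_map { |s| s.fetch('segments', []) }
    interact = segments.select { |seg| seg['segmentType'] == 'interact' && seg['segmentStart'] }
    interact.map do |seg|
      {
        agent: participant['participantId'], # assumed identifier field
        start_ms: to_epoch_ms(seg['segmentStart']),
        # A segment without segmentEnd is treated as still open.
        end_ms: seg['segmentEnd'] ? to_epoch_ms(seg['segmentEnd']) : Float::INFINITY
      }
    end
  end
end

# Return the agent for an internal phrase. Returns nil for external phrases
# and when zero or several agents match (for example during a conference).
def agent_for_phrase(phrase, buckets)
  return nil unless phrase['participantPurpose'] == 'internal'

  matches = buckets.select { |b| phrase['startTimeMs'].between?(b[:start_ms], b[:end_ms]) }
  agents = matches.map { |b| b[:agent] }.uniq
  agents.length == 1 ? agents.first : nil
end
```

Build the buckets once per conversation with `agent_buckets(details)`, then call `agent_for_phrase(phrase, buckets)` for each phrase. A nil result for an internal phrase means the phrase needs manual review, which covers the conference case mentioned above.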

Regards,

system · May 22, 2021, 4:30pm

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.