Speaker embeddings

by meganoob1337 - opened Apr 30

Discussion

meganoob1337

Apr 30

Is there any way to get speaker embeddings, so that speakers can be matched to known speakers from previous transcripts?

I guess probably not, but wanted to ask to make sure :)

Thank you for the model, and sorry for the questions!

konszvi

IBM Granite org May 1

Hi,
The model isn't able to produce speaker embeddings. The speaker numbers are based only on the order of appearance.
One possible way to keep speaker IDs consistent between segments is to concatenate audio segments of the known speakers in front of the segment to be decoded.
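
A minimal sketch of that concatenation idea, assuming mono float waveforms at one shared sample rate. The function name `build_prompted_audio`, the 16 kHz default, and the short silence gap between clips are illustrative assumptions, not something the model requires or documents. Since speaker numbers follow order of appearance, the known speakers would be expected to take the first labels in the order their clips are prepended.

```python
import numpy as np

def build_prompted_audio(known_clips, target, sample_rate=16000, gap_seconds=0.5):
    """Prepend one reference clip per known speaker (fixed order) to the target audio.

    Returns the concatenated waveform and the offset, in seconds, at which the
    target audio starts, so the prefix can be cut from the transcript afterwards.
    """
    # Silence inserted after each reference clip (an assumption, to separate speakers).
    gap = np.zeros(int(sample_rate * gap_seconds), dtype=np.float32)
    parts = []
    for clip in known_clips:
        parts.append(np.asarray(clip, dtype=np.float32))
        parts.append(gap)
    prefix_len = sum(len(p) for p in parts)
    parts.append(np.asarray(target, dtype=np.float32))
    return np.concatenate(parts), prefix_len / sample_rate
```

Anything transcribed before the returned offset belongs to the reference clips and can be dropped from the final transcript.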

meganoob1337

May 1

Hey, thank you for your answer. I may test this: if I can get timestamped segments mapped to speakers, I could generate speaker embeddings for them with pyannote or similar, and check whether that can replace my current pipeline (whisper/canary + pyannote, followed by a check against known speaker embeddings so the transcript has names).
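
The "check against known speaker embeddings" step could look roughly like the sketch below. It assumes the embeddings have already been computed by some external model (pyannote or similar) and are plain NumPy vectors. The function name `match_speaker` and the 0.7 threshold are hypothetical and would need tuning for a real embedding model.

```python
import numpy as np

def match_speaker(embedding, known, threshold=0.7):
    """Return (name, score) for the known speaker with the highest cosine similarity.

    `known` maps speaker names to reference embeddings. The name is None when
    the best score falls below `threshold` (i.e. an unknown speaker).
    """
    e = embedding / np.linalg.norm(embedding)
    best_name, best_score = None, -1.0
    for name, ref in known.items():
        score = float(np.dot(e, ref / np.linalg.norm(ref)))
        if score > best_score:
            best_name, best_score = name, score
    return (best_name if best_score >= threshold else None), best_score
```

Each diarized segment's embedding would be passed through this to replace a generic speaker number with a name where a match is found.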

meganoob1337 changed discussion status to closed May 1
