Speaker embeddings

#2
by meganoob1337 - opened

Is there any way to get speaker embeddings, so that speakers can be matched to known speakers from previous transcripts?

I guess probably not, but wanted to ask to make sure :)

Thank you for the model, and sorry for the questions!

IBM Granite org

Hi,
The model isn't able to produce speaker embeddings. The speaker numbers are based only on the order of appearance.
One possible way to keep speaker IDs consistent between segments is to concatenate audio segments of known speakers before the segment to be decoded.
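
A minimal sketch of that concatenation idea, in plain Python on lists of samples. Everything here is an assumption for illustration: the helper name `build_prefixed_audio`, the 16 kHz sample rate, the 0.5 s silence gap between clips, and the guess that speaker numbers start at 1. It only prepares the audio and the bookkeeping; it does not call the model.

```python
from typing import Dict, List, Sequence, Tuple


def build_prefixed_audio(
    known_clips: Sequence[Tuple[str, Sequence[float]]],
    target: Sequence[float],
    sample_rate: int = 16000,   # assumed sample rate
    gap_seconds: float = 0.5,   # assumed silence between clips
) -> Tuple[List[float], Dict[int, str], float]:
    """Put one reference clip per known speaker in front of the target audio.

    Because speaker numbers follow order of appearance, the n-th known
    speaker should receive the n-th speaker number (assumed to start at 1).
    Returns the combined audio, a number-to-name map, and the time in
    seconds at which the target segment starts.
    """
    gap = [0.0] * int(sample_rate * gap_seconds)
    audio: List[float] = []
    label_to_name: Dict[int, str] = {}

    for index, (name, clip) in enumerate(known_clips, start=1):
        audio.extend(clip)
        audio.extend(gap)
        label_to_name[index] = name

    # Anything transcribed before this offset belongs to the reference clips.
    target_offset = len(audio) / sample_rate
    audio.extend(target)
    return audio, label_to_name, target_offset
```

After decoding the combined audio, the part of the transcript that comes from the reference clips would be discarded, and any speaker number not in the map would be treated as a new, unknown speaker.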

Hey, thank you for your answer. I may try testing it out: if I can get timestamped segments mapped to speakers, I could generate speaker embeddings for them with pyannote or similar, and check whether that can replace my current pipeline (whisper/canary + pyannote, followed by a check against known speaker embeddings so the transcript has names).
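
For reference, the "check against known speaker embeddings" step could look roughly like the sketch below. It assumes the embeddings have already been computed elsewhere (e.g. with pyannote) and are plain lists of floats; the function names and the 0.7 threshold are hypothetical and would need tuning.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_speaker(
    embedding: Sequence[float],
    known: Dict[str, Sequence[float]],
    threshold: float = 0.7,  # hypothetical cut-off, not a recommended value
) -> Optional[str]:
    """Return the best-matching known speaker name, or None if no score reaches the threshold."""
    best_name: Optional[str] = None
    best_score = threshold
    for name, reference in known.items():
        score = cosine_similarity(embedding, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name
```

A segment whose embedding matches nobody returns `None` and would keep its generic speaker number in the transcript.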

meganoob1337 changed discussion status to closed
