This is a sentence-transformers model finetuned from tomaarsen/clap-htsat-fused-librispeech on the librispeech_asr dataset. It maps sentences and paragraphs to a dense vector space (the output dimensionality is not recorded in this card) and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'get_text_features', 'method_output_name': None}, 'audio': {'method': 'get_audio_features', 'method_output_name': None}}, 'module_output_name': 'sentence_embedding', 'architecture': 'ClapModel'})
)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("tomaarsen/clap-htsat-fused-librispeech-cont-4-epochs-128bs")
# Run inference
sentences = [
    'THERE ARE NATURES TOO TO WHOSE SENSE OF JUSTICE THE PRICE EXACTED LOOMS UP MONSTROUSLY ENORMOUS ODIOUS OPPRESSIVE WORRYING HUMILIATING EXTORTIONATE INTOLERABLE THOSE ARE THE FANATICS',
    'HE BEGAN TO WISH THAT HE HAD COMPROMISED IN SOME WAY OR OTHER THAT HE HAD SENT THE MONEY PERHAPS HE COULD DO IT UP HERE',
    'HERE THE HOLY PRELATE OF FERNS MET HIM AND RELATED A VISION IN WHICH HE HAD BEEN INSTRUCTED TO DEMAND THE ABOLITION OF THE IMPOST',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000, -0.1652, -0.0721],
#         [-0.1652,  1.0000,  0.6024],
#         [-0.0721,  0.6024,  1.0000]])
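
The printed matrix is symmetric with ones on the diagonal, which is what pairwise cosine similarity produces. The following is a minimal pure-Python sketch of that computation, assuming `model.similarity` is configured for cosine similarity here; the toy vectors and the helper names `cosine_similarity` and `similarity_matrix` are hypothetical and stand in for real embeddings and library code.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """Pairwise cosine similarities, one row per input vector."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Toy 3-dimensional vectors standing in for real embeddings (hypothetical values).
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0],
    [0.0, 2.0, 1.0],
]
for row in similarity_matrix(toy_embeddings):
    print([round(x, 4) for x in row])
```

Because cosine similarity ignores vector length, only the direction of each embedding affects the score.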
Evaluated on librispeech-eval and librispeech-test with InformationRetrievalEvaluator:

| Metric | librispeech-eval | librispeech-test |
|---|---|---|
| cosine_accuracy@1 | 0.616 | 0.66 |
| cosine_accuracy@3 | 0.81 | 0.838 |
| cosine_accuracy@5 | 0.875 | 0.9 |
| cosine_accuracy@10 | 0.93 | 0.94 |
| cosine_precision@1 | 0.616 | 0.66 |
| cosine_precision@3 | 0.27 | 0.2793 |
| cosine_precision@5 | 0.175 | 0.18 |
| cosine_precision@10 | 0.093 | 0.094 |
| cosine_recall@1 | 0.616 | 0.66 |
| cosine_recall@3 | 0.81 | 0.838 |
| cosine_recall@5 | 0.875 | 0.9 |
| cosine_recall@10 | 0.93 | 0.94 |
| cosine_ndcg@10 | 0.7732 | 0.8051 |
| cosine_mrr@10 | 0.7227 | 0.7613 |
| cosine_map@100 | 0.7263 | 0.7645 |
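
In the table above, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k, which is what one would expect if every query has exactly one relevant document. The sketch below illustrates those per-query definitions under that assumption; `retrieval_metrics` and the document ids are hypothetical and not part of the library.

```python
def retrieval_metrics(ranked_ids, relevant_ids, k):
    """Accuracy, precision, recall and reciprocal rank at cutoff k for one query."""
    top_k = ranked_ids[:k]
    hits = [doc_id for doc_id in top_k if doc_id in relevant_ids]
    accuracy = 1.0 if hits else 0.0
    precision = len(hits) / k
    recall = len(hits) / len(relevant_ids)
    reciprocal_rank = 0.0
    for rank, doc_id in enumerate(top_k, start=1):
        if doc_id in relevant_ids:
            reciprocal_rank = 1.0 / rank
            break
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "rr": reciprocal_rank,
    }


# Hypothetical query: the single relevant document is ranked second.
metrics = retrieval_metrics(["d7", "d3", "d9"], {"d3"}, k=3)
print(metrics)

# With one relevant document per query, mean precision@k is accuracy@k / k.
# The accuracy values below are the librispeech-eval figures from the table.
for k, accuracy_at_k in [(3, 0.81), (5, 0.875), (10, 0.93)]:
    print(k, round(accuracy_at_k / k, 4))
```

The reported scores are these per-query values averaged over all queries in the evaluation set.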
Columns: audio and text

| | audio | text |
|---|---|---|
| type | dict | string |
| details | | |

Samples:

| audio | text |
|---|---|
| {'path': '374-180298-0000.flac', 'array': array([ 6.92203816e-04, 8.04404495e-04, 8.03834875e-04, ..., | CHAPTER SIXTEEN I MIGHT HAVE TOLD YOU OF THE BEGINNING OF THIS LIAISON IN A FEW LINES BUT I WANTED YOU TO SEE EVERY STEP BY WHICH WE CAME I TO AGREE TO WHATEVER MARGUERITE WISHED |
| {'path': '374-180298-0001.flac', 'array': array([-9.33515839e-05, -1.25754057e-04, -1.44482241e-04, ..., | MARGUERITE TO BE UNABLE TO LIVE APART FROM ME IT WAS THE DAY AFTER THE EVENING WHEN SHE CAME TO SEE ME THAT I SENT HER MANON LESCAUT FROM THAT TIME SEEING THAT I COULD NOT CHANGE MY MISTRESS'S LIFE I CHANGED MY OWN |
| {'path': '374-180298-0002.flac', 'array': array([-2.47883319e-04, -2.91854434e-04, -2.82971043e-04, ..., | I WISHED ABOVE ALL NOT TO LEAVE MYSELF TIME TO THINK OVER THE POSITION I HAD ACCEPTED FOR IN SPITE OF MYSELF IT WAS A GREAT DISTRESS TO ME THUS MY LIFE GENERALLY SO CALM |

Loss: CachedMultipleNegativesSymmetricRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "mini_batch_size": 64,
    "gather_across_devices": false
}
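
To make the `scale` and `similarity_fct` parameters concrete, here is a minimal sketch of the idea behind a symmetric in-batch ranking loss: cosine similarities are multiplied by the scale, and cross-entropy is applied along both rows and columns with the matching pair on the diagonal. This is an illustration under those assumptions, not the library implementation; it leaves out the gradient caching that the "Cached" variant and `mini_batch_size` relate to, and the function names and similarity values are hypothetical.

```python
import math


def cross_entropy(logits, target_index):
    """Negative log-softmax of the target entry, computed stably."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[target_index]


def symmetric_ranking_loss(similarities, scale=20.0):
    """Mean of row-wise and column-wise cross-entropy over a square matrix.

    similarities[i][j] is the cosine similarity between item i of one column
    (e.g. audio) and item j of the other (e.g. text); the matching pair sits
    on the diagonal and all other entries act as in-batch negatives.
    """
    n = len(similarities)
    scaled = [[scale * s for s in row] for row in similarities]
    transposed = [[scaled[i][j] for i in range(n)] for j in range(n)]
    forward = sum(cross_entropy(scaled[i], i) for i in range(n)) / n
    backward = sum(cross_entropy(transposed[j], j) for j in range(n)) / n
    return (forward + backward) / 2


# Hypothetical cosine similarities for a batch of three audio/text pairs.
well_aligned = [[0.9, 0.1, 0.0], [0.2, 0.8, 0.1], [0.0, 0.1, 0.9]]
uninformative = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]
print(symmetric_ranking_loss(well_aligned))
print(symmetric_ranking_loss(uninformative))  # equals ln(3) for a uniform matrix
```

A larger scale sharpens the softmax, so the same gap between the matching pair and the negatives yields a lower loss.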
Columns: audio and text

| | audio | text |
|---|---|---|
| type | dict | string |
| details | | |

Samples:

| audio | text |
|---|---|
| {'path': '2277-149896-0000.flac', 'array': array([ 0.00179741, 0.00170625, 0.00120927, ..., -0.00144462, | HE WAS IN A FEVERED STATE OF MIND OWING TO THE BLIGHT HIS WIFE'S ACTION THREATENED TO CAST UPON HIS ENTIRE FUTURE |
| {'path': '2277-149896-0001.flac', 'array': array([ 0.00111104, 0.00081758, 0.00021103, ..., -0.00138193, | HE WOULD HAVE TO PAY HER THE MONEY WHICH SHE WOULD NOW REGULARLY DEMAND OR THERE WOULD BE TROUBLE IT DID NOT MATTER WHAT HE DID |
| {'path': '2277-149896-0002.flac', 'array': array([0.00080266, 0.00088462, 0.00083408, ..., 0.00105488, 0.00083673, | HURSTWOOD WALKED THE FLOOR MENTALLY ARRANGING THE CHIEF POINTS OF HIS SITUATION |

Loss: CachedMultipleNegativesSymmetricRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "mini_batch_size": 64,
    "gather_across_devices": false
}
Non-default hyperparameters:

eval_strategy: steps
per_device_train_batch_size: 128
per_device_eval_batch_size: 128
learning_rate: 2e-05
num_train_epochs: 4
warmup_ratio: 0.1
bf16: True
batch_sampler: no_duplicates

All hyperparameters:

overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 128
per_device_eval_batch_size: 128
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 2e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 4
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.1
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
use_cpu: False
seed: 42
data_seed: None
jit_mode_eval: False
bf16: True
fp16: False
half_precision_backend: None
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
hub_revision: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_for_metrics: []
eval_do_concat_batches: True
mp_parameters:
auto_find_batch_size: False
full_determinism: False
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: no
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
liger_kernel_config: None
eval_use_gather_object: False
average_tokens_across_devices: True
prompts: None
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional
router_mapping: {}
learning_rate_mapping: {}

Training logs:

| Epoch | Step | Training Loss | Validation Loss | librispeech-eval_cosine_ndcg@10 | librispeech-test_cosine_ndcg@10 |
|---|---|---|---|---|---|
| -1 | -1 | - | - | 0.2578 | 0.3112 |
| 0.0801 | 83 | 1.8184 | 2.0262 | 0.2650 | - |
| 0.1602 | 166 | 1.8023 | 2.0307 | 0.2663 | - |
| 0.2403 | 249 | 1.7706 | 1.8957 | 0.3116 | - |
| 0.3205 | 332 | 1.7092 | 1.8817 | 0.2935 | - |
| 0.4006 | 415 | 1.6373 | 1.8190 | 0.3382 | - |
| 0.4807 | 498 | 1.6326 | 1.8886 | 0.3072 | - |
| 0.5608 | 581 | 1.5066 | 1.8244 | 0.3356 | - |
| 0.6409 | 664 | 1.47 | 1.6148 | 0.3962 | - |
| 0.7210 | 747 | 1.3779 | 1.5519 | 0.4049 | - |
| 0.8012 | 830 | 1.3566 | 1.4406 | 0.4418 | - |
| 0.8813 | 913 | 1.3229 | 1.4122 | 0.4560 | - |
| 0.9614 | 996 | 1.2295 | 1.3453 | 0.4777 | - |
| 1.0415 | 1079 | 1.1413 | 1.3783 | 0.4647 | - |
| 1.1216 | 1162 | 1.0143 | 1.2593 | 0.4813 | - |
| 1.2017 | 1245 | 0.9226 | 1.3579 | 0.4552 | - |
| 1.2819 | 1328 | 0.8701 | 1.1575 | 0.5407 | - |
| 1.3620 | 1411 | 0.8354 | 1.0661 | 0.5742 | - |
| 1.4421 | 1494 | 0.7969 | 1.0900 | 0.5615 | - |
| 1.5222 | 1577 | 0.7667 | 0.9902 | 0.6099 | - |
| 1.6023 | 1660 | 0.7354 | 1.0506 | 0.5770 | - |
| 1.6824 | 1743 | 0.6864 | 0.9822 | 0.5971 | - |
| 1.7625 | 1826 | 0.6407 | 0.9009 | 0.6293 | - |
| 1.8427 | 1909 | 0.6193 | 0.8974 | 0.6319 | - |
| 1.9228 | 1992 | 0.5999 | 0.8587 | 0.6571 | - |
| 2.0029 | 2075 | 0.5631 | 0.8723 | 0.6448 | - |
| 2.0830 | 2158 | 0.5036 | 0.8252 | 0.6558 | - |
| 2.1631 | 2241 | 0.4913 | 0.8168 | 0.6585 | - |
| 2.2432 | 2324 | 0.4722 | 0.7609 | 0.6969 | - |
| 2.3234 | 2407 | 0.4558 | 0.7469 | 0.6923 | - |
| 2.4035 | 2490 | 0.4425 | 0.6988 | 0.7048 | - |
| 2.4836 | 2573 | 0.4307 | 0.7233 | 0.6907 | - |
| 2.5637 | 2656 | 0.4047 | 0.6843 | 0.7170 | - |
| 2.6438 | 2739 | 0.3956 | 0.6634 | 0.7251 | - |
| 2.7239 | 2822 | 0.3846 | 0.6762 | 0.7214 | - |
| 2.8041 | 2905 | 0.3781 | 0.6236 | 0.7428 | - |
| 2.8842 | 2988 | 0.3511 | 0.6418 | 0.7397 | - |
| 2.9643 | 3071 | 0.3408 | 0.6076 | 0.7537 | - |
| 3.0444 | 3154 | 0.3324 | 0.6056 | 0.7553 | - |
| 3.1245 | 3237 | 0.3029 | 0.6142 | 0.7437 | - |
| 3.2046 | 3320 | 0.2983 | 0.6205 | 0.7451 | - |
| 3.2847 | 3403 | 0.288 | 0.5939 | 0.7618 | - |
| 3.3649 | 3486 | 0.2841 | 0.5538 | 0.7750 | - |
| 3.4450 | 3569 | 0.2796 | 0.5916 | 0.7637 | - |
| 3.5251 | 3652 | 0.2781 | 0.5686 | 0.7671 | - |
| 3.6052 | 3735 | 0.2762 | 0.5639 | 0.7726 | - |
| 3.6853 | 3818 | 0.2635 | 0.5395 | 0.7825 | - |
| 3.7654 | 3901 | 0.2657 | 0.5386 | 0.7781 | - |
| 3.8456 | 3984 | 0.2652 | 0.5323 | 0.7821 | - |
| 3.9257 | 4067 | 0.2637 | 0.5405 | 0.7797 | - |
| -1 | -1 | - | - | 0.7732 | 0.8051 |
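
The hyperparameters for this run list learning_rate: 2e-05, lr_scheduler_type: linear, and warmup_ratio: 0.1. As a rough sketch of what that combination usually means, the function below ramps the learning rate linearly from zero to the peak over the first 10% of steps and then decays it linearly back to zero. It illustrates the schedule shape only and is not the trainer's code; `linear_schedule_lr` is a hypothetical helper, and `TOTAL_STEPS` is an assumed value because the card does not state the total number of optimizer steps.

```python
import math


def linear_schedule_lr(step, total_steps, peak_lr=2e-05, warmup_ratio=0.1):
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: scale up proportionally to the step count.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale down proportionally to the steps that remain.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# TOTAL_STEPS is an assumption for illustration; the card does not state it.
TOTAL_STEPS = 4000
for step in (0, 200, 400, 2200, 4000):
    print(step, linear_schedule_lr(step, TOTAL_STEPS))
```

Under these assumed numbers the peak is reached at step 400, and the rate is back to half the peak at step 2200.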
Carbon emissions were measured using CodeCarbon.
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
Base model: laion/clap-htsat-fused