contradiction-psb-lds
Introduction
contradiction-psb-lds is a model designed to identify contradictory sentences in patents using PatentSBERTa. It maps sentences and paragraphs to a 768-dimensional dense vector space, which makes it useful for tasks like clustering and semantic search.
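As a rough illustration of the semantic-search use case, the sketch below encodes a query sentence and ranks candidate sentences by cosine similarity. It is a minimal sketch: the example sentences are hypothetical, and only the model name comes from this card.

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('nategro/contradiction-psb-lds')

# Hypothetical patent-style sentences for illustration
query = "The fastening element is permanently fixed to the housing."
candidates = [
    "The fastening element is detachably connected to the housing.",
    "A sensor measures the temperature of the fluid.",
]

# Encode to 768-dimensional vectors and rank candidates by cosine similarity
query_emb = model.encode(query, convert_to_tensor=True)
cand_embs = model.encode(candidates, convert_to_tensor=True)
scores = util.cos_sim(query_emb, cand_embs)[0]

for sentence, score in sorted(zip(candidates, scores), key=lambda p: p[1].item(), reverse=True):
    print(f"{score.item():.4f}  {sentence}")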
Architecture
The model is a SentenceTransformer consisting of:
- A Transformer module with the MPNetModel architecture, a maximum sequence length of 512, and case-sensitive tokenization.
- A Pooling layer that applies CLS-token pooling over the 768-dimensional word embeddings, as sketched below.
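Loading the checkpoint with SentenceTransformer('nategro/contradiction-psb-lds') already produces this two-module stack. Purely as a sketch of the architecture listed above, the same stack can be assembled explicitly with sentence_transformers.models; the configuration values are taken from the list, and nothing else is assumed.

from sentence_transformers import SentenceTransformer, models

# Transformer module (MPNet) with the configuration listed above
word_embedding_model = models.Transformer(
    'nategro/contradiction-psb-lds',
    max_seq_length=512,
)

# CLS-token pooling over the 768-dimensional word embeddings
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),  # 768
    pooling_mode='cls',
)

model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
print(model)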
Training
The training parameters include:
- DataLoader: torch.utils.data.dataloader.DataLoader with a length of 1128 and a batch size of 16.
- Loss: sentence_transformers.losses.CosineSimilarityLoss.
- Optimizer: AdamW with a learning rate of 2e-05.
- Scheduler: WarmupLinear with 113 warmup steps.
- Training ran for 1 epoch with gradient clipping and weight decay applied; see the sketch after this list.
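A minimal sketch of what a run with these parameters could look like via model.fit. The training pairs are placeholders (the actual data behind the 1128-batch DataLoader is not part of this card), the AI-Growth-Lab/PatentSBERTa starting checkpoint is an assumption based on the License section below, and the gradient-clipping and weight-decay values shown are the library defaults.

from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Assumed starting checkpoint (see the License section)
model = SentenceTransformer('AI-Growth-Lab/PatentSBERTa')

# Placeholder pairs with similarity labels in [0, 1]; the actual
# training data is not published in this card
train_examples = [
    InputExample(texts=['sentence A', 'sentence B'], label=0.0),
    InputExample(texts=['sentence C', 'sentence D'], label=1.0),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
train_loss = losses.CosineSimilarityLoss(model)

# 1 epoch, AdamW at 2e-05, WarmupLinear with 113 warmup steps;
# max_grad_norm and weight_decay are the library defaults
model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    scheduler='WarmupLinear',
    warmup_steps=113,
    optimizer_params={'lr': 2e-05},
    weight_decay=0.01,
    max_grad_norm=1.0,
)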
Guide: Running Locally
- Installation:
  - Install sentence-transformers: pip install -U sentence-transformers
  - Install transformers for the Hugging Face usage example below: pip install transformers
- Usage with SentenceTransformers:
  from sentence_transformers import SentenceTransformer

  sentences = ["This is an example sentence", "Each sentence is converted"]

  model = SentenceTransformer('nategro/contradiction-psb-lds')
  embeddings = model.encode(sentences)
  print(embeddings)
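  Since the model was trained with a cosine-similarity loss, a natural follow-up is to score the two example embeddings against each other. This snippet reuses the embeddings variable from above; util.cos_sim ships with sentence-transformers.

  from sentence_transformers import util

  # Cosine similarity between the two example embeddings
  score = util.cos_sim(embeddings[0], embeddings[1])
  print(score)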
- Usage with Hugging Face Transformers:
  from transformers import AutoTokenizer, AutoModel
  import torch

  # CLS pooling: take the hidden state of the first ([CLS]) token
  def cls_pooling(model_output, attention_mask):
      return model_output[0][:, 0]

  sentences = ['This is an example sentence', 'Each sentence is converted']

  tokenizer = AutoTokenizer.from_pretrained('nategro/contradiction-psb-lds')
  model = AutoModel.from_pretrained('nategro/contradiction-psb-lds')

  encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

  with torch.no_grad():
      model_output = model(**encoded_input)

  sentence_embeddings = cls_pooling(model_output, encoded_input['attention_mask'])

  print("Sentence embeddings:")
  print(sentence_embeddings)
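  The cls_pooling helper mirrors the CLS-token Pooling layer described under Architecture, so both usage paths produce the same sentence embeddings.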
- Cloud GPUs: for better performance, consider using cloud GPU services such as AWS, Google Cloud, or Azure.
License
The model leverages the pre-trained model AI-Growth-Lab/PatentSBERTa. Check the Hugging Face model page for specific licensing details.