Trial Space
ksg-dfci
Introduction
The TrialSpace model is a sentence-transformers model that maps sentences and paragraphs into a 1024-dimensional dense vector space. Its embeddings can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, and clustering.
Architecture
- Model Type: Sentence Transformer
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 1024 dimensions
- Similarity Function: Cosine Similarity
- Base Model: dunzhang/stella_en_1.5B_v5
- Library: Sentence-transformers
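To make the similarity function listed above concrete, here is a minimal sketch of cosine similarity in plain Python. The helper name cosine_similarity is hypothetical and not part of the model or the library. In practice the sentence-transformers library provides its own utilities for this, so the sketch only illustrates what is being computed.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Values close to 1 mean the two embeddings point in nearly the same direction, values near 0 mean they are unrelated, and negative values mean they point in opposing directions.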
Training
The model was trained on a dataset of 1,395,384 samples using losses such as OnlineContrastiveLoss and MultipleNegativesRankingLoss. It targets text from medicine, clinical trials, and cancer research, where it serves as a feature extractor.
Guide: Running Locally
To run the TrialSpace model locally, follow these steps:
- Clone the repository from Hugging Face.
- Install the necessary dependencies using `pip install sentence-transformers`.
- Load the model using the SentenceTransformer library in Python.
- Input sentences to obtain their embeddings.
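The loading and embedding steps above can be sketched as follows. Several details are assumptions rather than facts from this card: the model ID ksg-dfci/TrialSpace is inferred from the names at the top of this page, trust_remote_code=True is included because Stella-based models may ship custom modeling code, and the helper names load_model and embed are hypothetical. Check the model's Hugging Face page for the exact loading instructions.

```python
from typing import List, Optional

# Assumed Hugging Face model ID, inferred from the names in this card.
MODEL_ID = "ksg-dfci/TrialSpace"


def load_model(model_id: str = MODEL_ID, device: Optional[str] = None):
    """Load the model; pass device="cuda" to use a GPU, or None to auto-select."""
    # Imported lazily so this file can be imported before the dependency is installed.
    from sentence_transformers import SentenceTransformer

    # trust_remote_code=True is an assumption (the base model may need custom code).
    return SentenceTransformer(model_id, device=device, trust_remote_code=True)


def embed(sentences: List[str], model=None):
    """Encode sentences into unit-length embeddings, one row per sentence."""
    if model is None:
        model = load_model()
    # normalize_embeddings=True makes a dot product equal to cosine similarity.
    return model.encode(sentences, normalize_embeddings=True)


def main() -> None:
    # Illustrative sentences, not taken from the training data.
    sentences = [
        "Phase II trial of immunotherapy in metastatic lung cancer.",
        "Patients with advanced lung cancer receiving immune checkpoint inhibitors.",
    ]
    embeddings = embed(sentences)
    # Per the architecture section, each row should have 1024 dimensions.
    print(embeddings.shape)
```

Calling main() downloads the model weights on first use, which is a sizeable download given the 1.5B-parameter base model.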
For optimal performance, consider using cloud GPUs such as those provided by AWS or Google Cloud.
License
The TrialSpace model is released under the CC BY-NC 2.0 license, which allows for non-commercial use with appropriate attribution.