gliner-multitask-large-v0.5

knowledgator

Introduction

GLiNER-Multitask is a model for extracting many kinds of information from text, guided by user-defined labels and prompts. It uses a bidirectional transformer encoder, similar to BERT, which is intended to provide strong generalization while keeping inference computationally efficient. Supported tasks include named entity recognition (NER), relation extraction, summarization, sentiment extraction, key-phrase extraction, question answering, and open information extraction.

Architecture

GLiNER-Multitask uses a compact architecture centered on a bidirectional transformer encoder. This design is reported to achieve state-of-the-art performance on zero-shot NER benchmarks, and it adapts to a range of natural language processing tasks.

Training

The model is trained on synthetic multi-task data provided by Knowledgator. Because the framework is prompt-tunable, the task is selected at inference time through the labels and prompts supplied by the user, so a single model covers NER, relation extraction, summarization, sentiment extraction, and other information extraction tasks.
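As a rough illustration of what a prompt-driven task can look like, the sketch below prepends a question to the text and asks for spans under a single generic label. The helper name `run_prompt_task`, the `"answer"` label, the newline separator, and the `StubModel` class are assumptions made for illustration; only the `predict_entities(text, labels)` call is taken from the guide in this document. The stub stands in for the real model so the example runs without downloading weights.

```python
from typing import Any, Dict, List, Protocol


class SpanModel(Protocol):
    """Anything exposing the predict_entities(text, labels) call used in this guide."""

    def predict_entities(self, text: str, labels: List[str]) -> List[Dict[str, Any]]: ...


def run_prompt_task(model: SpanModel, prompt: str, text: str, label: str) -> List[str]:
    """Prepend a task prompt to the text and return the extracted span texts."""
    # Assumption: the task is expressed by concatenating prompt and text,
    # and the spans of interest are requested under one generic label.
    model_input = prompt + "\n" + text
    spans = model.predict_entities(model_input, [label])
    return [span["text"] for span in spans if span["label"] == label]


class StubModel:
    """Hypothetical stand-in so the sketch runs without downloading weights."""

    def predict_entities(self, text, labels):
        # Pretend the model found one span for the first requested label.
        return [{"text": "Bill Gates", "label": labels[0], "score": 0.9}]


answers = run_prompt_task(
    StubModel(),
    prompt="Who founded Microsoft?",
    text="Microsoft was founded by Bill Gates...",
    label="answer",
)
print(answers)
```

To try this with the real model, pass the object returned by `GLiNER.from_pretrained(...)` in place of `StubModel()`. The prompt wording and label that work best may differ from the ones assumed here.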

Guide: Running Locally

  1. Install the GLiNER Library:

    pip install gliner
    
  2. Load the Model in Python:

    from gliner import GLiNER
    model = GLiNER.from_pretrained("knowledgator/gliner-multitask-large-v0.5")
    
  3. Perform Tasks:

    • For Named Entity Recognition:

      text = "Microsoft was founded by Bill Gates..."
      # Entity types to look for; any free-form label can be used.
      labels = ["founder", "computer", "software", "position", "date"]
      entities = model.predict_entities(text, labels)
      for entity in entities:
          print(entity["text"], "=>", entity["label"])
      
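    • For Relation Extraction:

      text = "Microsoft was founded by Bill Gates..."
      # Each label pairs a source entity with a relation, written as "entity <> relation".
      labels = ["Microsoft <> founder", "Microsoft <> inception date"]
      # Each returned span is the target of the relation named by its label.
      entities = model.predict_entities(text, labels)
      for entity in entities:
          print(entity["label"], "=>", entity["text"])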
      
  4. Utilize Cloud GPUs: For faster inference, especially on long texts or large batches, consider running the model on a cloud GPU from a provider such as AWS, Google Cloud, or Azure.
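The relation-extraction output from step 3 is a flat list of spans whose labels carry both the source entity and the relation. A minimal sketch of regrouping that output into (subject, relation, object) triples follows. The helper `to_triples` and the sample values are hypothetical and not part of the GLiNER library; the sketch assumes each result is a dictionary with `"label"` and `"text"` keys, as in the loops above.

```python
from typing import Any, Dict, List, Tuple


def to_triples(entities: List[Dict[str, Any]], sep: str = "<>") -> List[Tuple[str, str, str]]:
    """Turn spans labelled 'subject <> relation' into (subject, relation, object) triples."""
    triples = []
    for entity in entities:
        label = entity["label"]
        if sep not in label:
            continue  # not a relation-style label; skip it
        # Split only on the first separator so relation names may contain spaces.
        subject, relation = (part.strip() for part in label.split(sep, 1))
        triples.append((subject, relation, entity["text"]))
    return triples


# Hypothetical output, shaped like the dictionaries printed in step 3.
sample = [
    {"label": "Microsoft <> founder", "text": "Bill Gates"},
    {"label": "Microsoft <> inception date", "text": "April 4, 1975"},
]
for triple in to_triples(sample):
    print(triple)
```

Keeping the subject and relation as separate fields makes the results easier to load into a table or graph than the combined label string.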

License

GLiNER-Multitask is released under the Apache-2.0 license, allowing for both personal and commercial use, as long as the terms of the license are followed.
