LUKE Japanese Large
studio-ousia
Introduction
LUKE-JAPANESE-LARGE is a Japanese version of LUKE (Language Understanding with Knowledge-based Embeddings), a pre-trained model designed to produce contextualized representations of words and entities. It treats words and entities as independent tokens and is equipped with Wikipedia entity embeddings. It is suitable for tasks such as named entity recognition, entity typing, relation classification, and question answering.
Architecture
LUKE-JAPANESE-LARGE is a transformer model that produces knowledge-enhanced contextualized representations. It assigns independent tokens to words and entities and computes context-aware representations for both. The full model ships with Wikipedia entity embeddings; these are not needed for general NLP tasks, but they make the model well suited to knowledge-intensive tasks such as entity typing and relation classification.
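As a rough illustration of how words and entities are encoded as separate tokens, the sketch below passes a character-level entity span alongside the input text. It assumes the Hugging Face transformers library (with sentencepiece installed) and the hub ID studio-ousia/luke-japanese-large; the sentence and span are made up for illustration.

```python
from transformers import AutoTokenizer

# Assumed Hugging Face hub ID; check the model card for the exact name.
tokenizer = AutoTokenizer.from_pretrained("studio-ousia/luke-japanese-large")

text = "イチローは野球選手です。"  # "Ichiro is a baseball player."
entity_spans = [(0, 4)]            # character span of the mention "イチロー"

encoding = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")

# Word tokens and entity tokens are encoded as separate inputs.
print(encoding["input_ids"].shape)   # subword token ids
print(encoding["entity_ids"].shape)  # ids of the entity tokens attached to the given spans
```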
Evaluation
The model was evaluated on the JGLUE dev set and compared with various baselines:
- LUKE Japanese large achieved:
  - MARC-ja: 0.965
  - JSTS (Pearson/Spearman): 0.932/0.902
  - JNLI: 0.927
  - JCommonsenseQA: 0.893
- Baseline models such as Tohoku BERT large and Waseda RoBERTa large scored slightly lower across these tasks.
Guide: Running Locally
To run LUKE-JAPANESE-LARGE locally, follow these steps:
- Install Dependencies: Ensure you have Python, PyTorch, and the Hugging Face transformers library installed.
- Clone Repository: Clone the LUKE repository from GitHub.
- Download Model: Access the model from Hugging Face's model hub.
- Run Inference: Use the model for tasks such as named entity recognition or question answering (a minimal sketch follows below).
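The following is a minimal end-to-end sketch of the steps above. It assumes the model is pulled from the Hugging Face hub as studio-ousia/luke-japanese-large and that dependencies were installed with something like `pip install torch transformers sentencepiece`; the example text and entity span are placeholders.

```python
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "studio-ousia/luke-japanese-large"  # assumed hub ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)
model.eval()

text = "イチローは野球選手です。"
entity_spans = [(0, 4)]  # character span of the mention "イチロー"

inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Contextualized representations for word tokens and entity tokens.
word_states = outputs.last_hidden_state            # (1, num_word_tokens, hidden_size)
entity_states = outputs.entity_last_hidden_state   # (1, num_entities, hidden_size)
print(word_states.shape, entity_states.shape)
```

For task-specific use such as named entity recognition, transformers also offers LUKE task heads (for example, LukeForEntitySpanClassification), which typically require fine-tuning before they produce useful predictions.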
For optimal performance, consider using a cloud GPU service such as AWS or Google Cloud Platform.
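Continuing from the sketch above, moving the model and inputs onto a GPU (local or on a cloud instance) is a small change:

```python
# Use a GPU if one is available, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
inputs = {k: v.to(device) for k, v in inputs.items()}
```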
License
LUKE-JAPANESE-LARGE is released under the Apache License 2.0, allowing for use and modification under defined terms.