GPT-NEO 1.3B Vietnamese News

Introduction

The GPT-NEO 1.3B Vietnamese News model is a language model for text generation in Vietnamese. It is based on the GPT-Neo architecture and works as a causal language model: it generates text one token at a time, with each token conditioned on the tokens before it.

Architecture

The model uses the GPT-Neo architecture, a decoder-only transformer suited to causal language modeling. It is implemented in PyTorch and generates text in Vietnamese.
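
To make "causal" concrete, the sketch below builds the kind of lower-triangular attention mask that causal language models rely on: position i may look at positions 0..i but never at later ones. This is a toy illustration in plain Python, not GPT-Neo's actual attention code, and the helper name causal_mask is hypothetical.

```python
def causal_mask(seq_len):
    """Return a seq_len x seq_len matrix of booleans.

    Entry [i][j] is True when position i is allowed to attend to
    position j, which in a causal model means j <= i.
    """
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


# Print the mask for a 4-token sequence: 1 = visible, 0 = hidden.
for row in causal_mask(4):
    print(" ".join("1" if allowed else "0" for allowed in row))
```

Each row has one more visible position than the row above it, which is why the model can be used to extend a prompt token by token.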

Training

Details regarding the training process of the GPT-NEO 1.3B Vietnamese News model are currently unavailable. For further information, contact the contributors via the provided email addresses.

Guide: Running Locally

To run the model locally, follow these steps:

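Install Dependencies: Ensure the transformers and torch libraries are installed in your Python environment (for example, with pip install transformers torch). Recent versions of transformers may also require the accelerate package when low_cpu_mem_usage=True is used.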

Load the Model:

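import torch  # required for the device setup step
from transformers import AutoTokenizer, AutoModelForCausalLM

# Download the tokenizer and model weights (or load them from the local cache)
tokenizer = AutoTokenizer.from_pretrained("VietAI/gpt-neo-1.3B-vietnamese-news")
model = AutoModelForCausalLM.from_pretrained("VietAI/gpt-neo-1.3B-vietnamese-news", low_cpu_mem_usage=True)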

Set Up Device: Move the model to a GPU if available.

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

Generate Text: Use a prompt to generate text.

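prompt = "Tiềm năng của trí tuệ nhân tạo"  # Example prompt: "The potential of artificial intelligence"
# Tokenize the prompt and move the token ids to the same device as the model
input_ids = tokenizer(prompt, return_tensors="pt")['input_ids'].to(device)
# Sample up to 100 tokens in total (prompt included), drawing from the 20 most likely tokens at each step
gen_tokens = model.generate(input_ids, max_length=100, do_sample=True, temperature=0.9, top_k=20)
# Convert the generated token ids back to text
gen_text = tokenizer.batch_decode(gen_tokens)[0]
print(gen_text)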
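
The temperature and top_k arguments control how each next token is sampled. As a rough illustration of the idea, the sketch below applies top-k filtering and temperature scaling to a handful of made-up scores using only the Python standard library. It is a simplified stand-in, not the transformers implementation; the function name sample_top_k and the toy logits are assumptions made for the example.

```python
import math
import random


def sample_top_k(logits, top_k=20, temperature=0.9, rng=random):
    """Sample one token id from `logits` with top-k filtering and temperature."""
    # Keep only the ids of the top_k highest-scoring tokens.
    k = min(top_k, len(logits))
    top_ids = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Temperature below 1 sharpens the distribution; above 1 flattens it.
    scaled = [logits[i] / temperature for i in top_ids]
    # Numerically stable softmax weights over the surviving candidates.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    # Draw one candidate in proportion to its weight.
    return rng.choices(top_ids, weights=weights, k=1)[0]


# Hypothetical scores for a 5-token vocabulary.
toy_logits = [2.0, 1.0, 0.5, -1.0, -3.0]
rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sample_top_k(toy_logits, top_k=2, temperature=0.9, rng=rng) for _ in range(10)]
print(samples)  # only ids 0 and 1 can appear, because top_k=2
```

A smaller top_k or a lower temperature makes the output more predictable; larger values make it more varied.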

Cloud GPUs: For better performance, consider using cloud-based GPU services such as AWS, Google Cloud, or Azure.
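
When choosing a GPU, a back-of-the-envelope estimate of the memory needed just to hold the weights can help. The sketch below assumes roughly 1.3 billion parameters (taken from the "1.3B" in the model name, not from a published specification) and counts weights only; activations and other runtime overhead come on top. The helper weight_memory_gib is hypothetical.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Approximate memory for model weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


NUM_PARAMS = 1.3e9  # assumption: inferred from the "1.3B" in the model name

# Compare common numeric precisions.
for dtype_name, size_in_bytes in [("float32", 4), ("float16", 2)]:
    gib = weight_memory_gib(NUM_PARAMS, size_in_bytes)
    print(f"{dtype_name}: about {gib:.1f} GiB for weights")
```

Halving the bytes per parameter halves the estimate, which is why lower-precision loading is a common way to fit a model on a smaller GPU.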

License

The license information for the GPT-NEO 1.3B Vietnamese News model is not explicitly mentioned. For licensing details, please contact the model contributors via email.