LIT-6B (hakurei/lit-6B)
Introduction
LIT-6B is a large-scale, fine-tuned model designed for generating fictional storytelling text. It is based on the GPT-J 6B architecture and specifically tailored for creating novel-like outputs.
Architecture
The model architecture is based on GPT-J, a 6 billion parameter auto-regressive language model. It was initially trained on a diverse dataset known as The Pile. The fine-tuning process involved 2GB of data from a variety of sources, including light novels, erotica, and annotated literature.
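The underlying architecture can be verified from the checkpoint's configuration without downloading the full 6B parameter weights. The sketch below uses the standard Transformers AutoConfig API; the GPT-J style field names (n_layer, n_head, n_embd) are an assumption about how this particular config is exposed.

from transformers import AutoConfig

# Fetch only the configuration file, not the model weights
config = AutoConfig.from_pretrained('hakurei/lit-6B')

# GPT-J style configs expose their sizes under these field names
print(config.model_type)                              # expected to report a GPT-J model type
print(config.n_layer, config.n_head, config.n_embd)   # layer, head, and embedding sizes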
Training
Training data was sourced from diverse origins such as Project Gutenberg. The dataset includes annotated prompts to guide the model's text generation toward specific styles and themes. Annotations can include details like title, author, genre, and style, which help the model generate text in a desired format.
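For illustration, an annotated prompt places the metadata in a bracketed header followed by a *** separator before the story text, matching the prompt used in the generation example later in this guide. The build_prompt helper below is hypothetical, and any fields beyond Title, Author, and Genre are assumptions.

# Hypothetical helper that assembles an annotated prompt in the
# "[ Key: Value; ... ]***" style used by the generation example below.
def build_prompt(title, author, genre, opening):
    header = f"[ Title: {title}; Author: {author}; Genre: {genre} ]"
    return header + '***' + opening

print(build_prompt('The Dunwich Horror', 'H. P. Lovecraft', 'Horror', 'When a traveler'))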
Guide: Running Locally
To run the LIT-6B model locally, you can use the Hugging Face Transformers library. Below are the steps to set up and generate text:
- Install Transformers Library:
  pip install transformers
- Load the Model and Tokenizer:
  from transformers import AutoTokenizer, AutoModelForCausalLM

  model = AutoModelForCausalLM.from_pretrained('hakurei/lit-6B')
  tokenizer = AutoTokenizer.from_pretrained('hakurei/lit-6B')
- Generate Text:
  prompt = '''[ Title: The Dunwich Horror; Author: H. P. Lovecraft; Genre: Horror ]***When a traveler'''

  input_ids = tokenizer.encode(prompt, return_tensors='pt')
  output = model.generate(
      input_ids,
      do_sample=True,
      temperature=1.0,
      top_p=0.9,
      repetition_penalty=1.2,
      max_length=len(input_ids[0]) + 100,
      pad_token_id=tokenizer.eos_token_id,
  )
  generated_text = tokenizer.decode(output[0])
  print(generated_text)
For optimal performance, it is recommended to utilize cloud GPUs, such as those available through Google Cloud or AWS.
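As a sketch of what GPU usage might look like, the weights can be loaded in half precision (roughly 12 GB for the fp16 weights of a 6B parameter model) and moved to a CUDA device. This is a general Transformers/PyTorch pattern rather than something specified by the model card, and it assumes a CUDA-capable GPU with sufficient memory is available.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the weights in float16 to roughly halve memory use, then move them to the GPU
model = AutoModelForCausalLM.from_pretrained('hakurei/lit-6B', torch_dtype=torch.float16)
model = model.to('cuda')
tokenizer = AutoTokenizer.from_pretrained('hakurei/lit-6B')

prompt = '''[ Title: The Dunwich Horror; Author: H. P. Lovecraft; Genre: Horror ]***When a traveler'''
input_ids = tokenizer.encode(prompt, return_tensors='pt').to('cuda')

output = model.generate(input_ids, do_sample=True, temperature=1.0, top_p=0.9,
                        repetition_penalty=1.2, max_length=len(input_ids[0]) + 100,
                        pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(output[0]))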
License
LIT-6B is released under the MIT License, which allows for free use, modification, and distribution of the software.