83.5K Downloads Updated 8 months ago
A commercial-friendly small language model by NVIDIA optimized for roleplay, RAG QA, and function calling.
Updated 8 months ago
8 months ago
ed76ab18784f · 2.7GB
Readme
Nemotron-Mini-4B-Instruct is a model for generating responses for roleplaying, retrieval augmented generation, and function calling. It is a small language model (SLM) optimized through distillation, pruning and quantization for speed and on-device deployment.
This instruct model is optimized for roleplay, RAG QA, and function calling in English. It supports a context length of 4,096 tokens. This model is ready for commercial use.