Introduction
The GemmaPro model is based on Gemma3 4B and uses Q4 quantization. After weight reorganization, the model is only about 2.5 GB and runs smoothly with just 4 GB of RAM, which means most office computers and higher-end phones can run it.
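The quoted size is easy to sanity-check with rough arithmetic. A minimal sketch, assuming a ~4 billion parameter count and a ~25% overhead for quantization scales, embeddings, and metadata (both figures are ballpark assumptions, not official numbers):

```python
# Rough sanity check of the ~2.5 GB figure (illustrative arithmetic only;
# the 4B parameter count and 25% overhead are assumptions).
params = 4e9                              # Gemma3 4B: roughly 4 billion weights
bytes_per_weight = 4 / 8                  # Q4 quantization: 4 bits per weight
raw_gb = params * bytes_per_weight / 1e9  # packed weights alone
total_gb = raw_gb * 1.25                  # plus scales, embeddings, metadata
print(f"raw: {raw_gb:.1f} GB, with overhead: {total_gb:.1f} GB")
```

The remaining headroom up to the 4 GB RAM requirement is plausibly consumed by the KV cache and runtime buffers during inference.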
Any GPU with 4 GB or more of VRAM can load the model fully. Even on computers without a GPU, inference reaches 7-8 tokens per second, which is close to normal human speaking speed, so conversations feel smooth with no noticeable delay. With a GPU, generation is faster still.
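The speaking-speed comparison can also be checked with quick arithmetic. A sketch, assuming conversational speech of about 150 words per minute and roughly 1.3 tokens per word (both are ballpark assumptions):

```python
# Compare CPU generation speed with a conversational speech rate.
# Ballpark assumptions: ~150 words/min speech, ~1.3 tokens per word.
speech_wpm = 150
tokens_per_word = 1.3
speech_tps = speech_wpm / 60 * tokens_per_word  # tokens/s a speaker "emits"
model_tps = 7                                   # lower end of the quoted range
print(f"speech ~{speech_tps:.2f} tok/s, model >= {model_tps} tok/s")
```

Under these assumptions, speech works out to roughly 3 tokens per second, so 7-8 tokens per second comfortably keeps pace with a spoken conversation.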
GemmaPro Model Features
- Small size and low resource requirements
- Stronger reasoning, with deeper and more logical answers than the original Gemma3 4B
- Supports both a normal mode and a reasoning mode
- Handles Chinese-language tasks smoothly
Model Download Location
Hugging Face - SciMaker/GemmaPro