Saturday, April 5, 2025
GemmaPro
Author:
Yuwei Yang
GemmaPro is a large language model run by the AI program of the SciMaker community. Its goal is to let most people use a large language model easily on a local computer or mobile device.
Introduction
The GemmaPro model is based on Gemma3 4B and uses Q4 (4-bit) quantization. After weight reorganization, the model is only about 2.5 GB and needs just 4 GB of RAM to run smoothly, so most office computers and high-end phones should be able to run it.
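As a rough sanity check on the size figure above, the following sketch estimates the weight footprint of a quantized model from its parameter count and the average number of bits stored per weight. The function name `estimate_weight_size_gb`, the round figure of 4 billion parameters, and the 4.5 bits per weight (4-bit values plus one 16-bit scale per block of 32 weights, a common block-quantization layout) are assumptions for illustration; the article does not say which Q4 variant GemmaPro uses.

```python
def estimate_weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: roughly 4 billion parameters, as the "4B" in the model name suggests.
N_PARAMS = 4e9

fp16_gb = estimate_weight_size_gb(N_PARAMS, 16)   # unquantized half precision
q4_gb = estimate_weight_size_gb(N_PARAMS, 4.5)    # assumed 4-bit block quantization

print(f"FP16 weights: ~{fp16_gb:.2f} GB")
print(f"Q4 weights:   ~{q4_gb:.2f} GB")
```

Under these assumptions the estimate is in the same range as the roughly 2.5 GB quoted above. The remaining difference could come from tensors kept at higher precision or from file metadata, and the runtime also needs memory beyond the weights themselves, which fits the 4 GB RAM figure.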
Any GPU with 4 GB or more of VRAM can load the model fully. On a computer without a GPU, inference runs at about 7-8 tokens per second, which is close to normal human speaking speed, so conversations feel smooth, without noticeable delay. With a GPU, inference is faster still.
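To put the CPU throughput figure in concrete terms, this small sketch converts a generation rate into the time needed to produce a full reply. The function name `reply_time_seconds` and the reply length of 200 tokens are hypothetical values chosen for illustration, not measurements from GemmaPro.

```python
def reply_time_seconds(n_tokens: int, tokens_per_second: float) -> float:
    """Time to generate n_tokens at a steady generation rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return n_tokens / tokens_per_second


REPLY_TOKENS = 200  # hypothetical reply length, not a measured value

# The 7-8 tokens per second range is the CPU-only figure quoted in the text.
for rate in (7.0, 8.0):
    seconds = reply_time_seconds(REPLY_TOKENS, rate)
    print(f"{rate:.0f} tokens/s -> {seconds:.1f} s for {REPLY_TOKENS} tokens")
```

At these rates a 200-token reply takes roughly 25 to 29 seconds in total. Because chat interfaces usually stream text as it is generated, the reader can follow along at about the pace the tokens arrive instead of waiting for the whole reply.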
GemmaPro Model Features
Model Download Location
Hugging Face - SciMaker/GemmaPro
About Author
GemmaPro|Yuwei Yang