Gemma 2, the next generation of Google Gemma models, is now optimized with TensorRT-LLM and packaged as NVIDIA NIM inference microservice.
Gemma 2, the next generation of Google Gemma models, is now optimized with TensorRT-LLM and packaged as NVIDIA NIM inference microservice.
Gemma 2, the next generation of Google Gemma models, is now optimized with TensorRT-LLM and packaged as NVIDIA NIM inference microservice. Generative AI, LLMs, Speech Generation NVIDIA Technical Blog