(nim_plugin)= # NIM ```{eval-rst} .. tags:: Inference, NVIDIA ``` Serve optimized model containers with NIM in a Flyte task. [NVIDIA NIM](https://www.nvidia.com/en-in/ai/), part of NVIDIA AI Enterprise, provides a streamlined path for developing AI-powered enterprise applications and deploying AI models in production. It includes an out-of-the-box optimization suite, enabling AI model deployment across any cloud, data center, or workstation. Since NIM can be self-hosted, there is greater control over cost, data privacy, and more visibility into behind-the-scenes operations. With NIM, you can invoke the model's endpoint as if it is hosted locally, minimizing network overhead. ## Installation To use the NIM plugin, run the following command: ``` pip install flytekitplugins-inference ``` ## Example usage For a usage example, see {doc}`NIM example usage `. ```{note} NIM can only be run in a Flyte cluster as it must be deployed as a sidecar service in a Kubernetes pod. ``` ```{toctree} :maxdepth: -1 :hidden: serve_nim_container ```