github.com/NVIDIA/TensorRT-LLM
Preview meta tags from the github.com website.
Linked Hostnames (28)
- 155 links to github.com
- 25 links to developer.nvidia.com
- 8 links to nvidia.github.io
- 4 links to blogs.nvidia.com
- 4 links to docs.github.com
- 3 links to drive.google.com
- 3 links to huggingface.co
- 2 links to arxiv.org
Search Engine Appearance
GitHub - NVIDIA/TensorRT-LLM: TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in performant way.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in performant way. - NVIDIA/TensorRT-LLM
Bing
Same title and description as above.
DuckDuckGo
Same title and description as above.
General Meta Tags (46)
- title: GitHub - NVIDIA/TensorRT-LLM: TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in performant way.
- charset: utf-8
- route-pattern: /:user_id/:repository
- route-controller: files
- route-action: disambiguate
Open Graph Meta Tags (9)
- og:image: https://opengraph.githubassets.com/1cc6792cea9a808dbd43fc1308006595d21bb93b7642b6a034507a08a3446e26/NVIDIA/TensorRT-LLM
- og:image:alt: TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR...
- og:image:width: 1200
- og:image:height: 600
- og:site_name: GitHub
Twitter Meta Tags (5)
- twitter:image: https://opengraph.githubassets.com/1cc6792cea9a808dbd43fc1308006595d21bb93b7642b6a034507a08a3446e26/NVIDIA/TensorRT-LLM
- twitter:site: @github
- twitter:card: summary_large_image
- twitter:title: GitHub - NVIDIA/TensorRT-LLM: TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in performant way.
- twitter:description: TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR...
Link Tags (47)
- alternate icon: https://github.githubassets.com/favicons/favicon.png
- assets: https://github.githubassets.com/
- canonical: https://github.com/NVIDIA/TensorRT-LLM
- dns-prefetch: https://github.githubassets.com
- dns-prefetch: https://avatars.githubusercontent.com
Links (231)
- https://arxiv.org/abs/2211.10438
- https://arxiv.org/abs/2306.00978
- https://aws.amazon.com/blogs/hpc/scaling-your-llm-inference-workloads-multi-node-deployment-with-tensorrt-llm-and-triton-on-amazon-eks
- https://aws.amazon.com/blogs/machine-learning/boost-inference-performance-for-llms-with-new-amazon-sagemaker-containers
- https://blogs.bing.com/search-quality-insights/December-2024/Bing-s-Transition-to-LLM-SLM-Models-Optimizing-Search-with-TensorRT-LLM