triton-inference-server / tensorrtllm_backend Public

Pull requests: triton-inference-server/tensorrtllm_backend

22 Open 144 Closed

docs: update README.md

#716 opened Mar 4, 2025 by eltociear

import PIL on demand

#673 opened Jan 2, 2025 by ShuaiShao93

update tensorrt-llm default version

#670 opened Dec 25, 2024 by BasicCoder

[PoC] Improve TRTLLM deployment UX

#650 opened Nov 22, 2024 by rmccorm4 • Draft

2 of 6 tasks

Update the multinode tutorial link

#644 opened Nov 14, 2024 by harryskim

Fix broken links in README.md

#631 opened Oct 23, 2024 by benchislett

Update launch_triton_server.py

#628 opened Oct 22, 2024 by ankur1-samsung

Update llama.md

#604 opened Sep 25, 2024 by surprisedPikachu007

Add missing kv_cache related metrics

#592 opened Sep 3, 2024 by Pernekhan

[Bugfix]fix the thread lock when user input same id

#585 opened Aug 27, 2024 by GGBond8488

Fix the exiting bug in docker compose when using the scripts/launch_t…

#581 opened Aug 21, 2024 by Aquasar11

fix inference quality caused by temperature parameter in bls

#523 opened Jul 4, 2024 by activezhao

Added documentation of using warmups to initialize lora weights

#515 opened Jun 27, 2024 by TheCodeWrangler

Replace subprocess.Popen with subprocess.run (label: triaged)

#452 opened May 14, 2024 by rlempka

Fixed Whitespace Error in Streaming mode

#423 opened Apr 19, 2024 by enochlev

Update end_to_end_test.py

#409 opened Apr 14, 2024 by r0cketdyne

fix: add foreground argument

#343 opened Feb 21, 2024 by pfldy2850

Expose verbose as pram in launch triton script

#295 opened Jan 12, 2024 by ekagra-ranjan

Add all_models/bert as an example for tensorrt-llm classification models

#269 opened Dec 31, 2023 by erenup

Add example of tensorrt-llm usage

#225 opened Dec 15, 2023 by Pernekhan

Wrap long command-lines in README.md

#134 opened Nov 15, 2023 by wangkuiyi

draft pr about non-streaming output

#95 opened Nov 3, 2023 by BasicCoder
