Java LLM Inference Tools

Browse free open source Java LLM Inference Tools and projects below.

  • 1
    Openfire LLM Chatbot Plugin

    LLM Chatbot Assistant for Openfire server

    This plugin is a wrapper around a hosted AI inference server for LLM chat models. It uses the Botz API to create a chatbot in Openfire that takes part in XMPP chat and group chat conversations.
    Downloads: 0 This Week
  • 2
    TorchServe

    Serve, optimize and scale PyTorch models in production

    TorchServe is a performant, flexible, and easy-to-use tool for serving PyTorch eager-mode and TorchScript models. It provides multi-model management with optimized worker-to-model allocation, and REST and gRPC support for batched inference. Models can be exported for optimized inference: TorchScript out of the box, plus ORT, IPEX, TensorRT, and FasterTransformer. A performance guide covers the built-in support for optimizing, benchmarking, and profiling PyTorch and TorchServe performance. An expressive handler architecture makes it straightforward to support inference for your use case, with many handlers supported out of the box. System-level metrics are also supported out of the box, with Prometheus exports, custom metrics, and PyTorch profiler support.
    Downloads: 0 This Week