Join/Login
Business Software
Open Source Software
For Vendors
Blog
About
More

For Vendors Help Create Join Login

Business Software

Open Source Software

SourceForge Podcast

Resources

Articles
Case Studies
Blog

Menu

Help
Create
Join
Login

Home
Open Source Software
Artificial Intelligence
Reinforcement Learning Frameworks

Open Source Python Reinforcement Learning Frameworks

x

Sort By:

Most Popular

Clear All Filters

OS

Linux 87
Mac 87
Windows 87
More...
BSD 33
ChromeOS 33
Mobile Operating Systems 3

Category

Artificial Intelligence 89
Software Development 10
Scientific/Engineering 4
Education 3
Business 2
Games 2
Database 1
Formats and Protocols 1
System 1

License

OSI-Approved Open Source 85
Creative Commons Attribution License 1

Programming Language

Python 89
C++ 2
Java 1

Status

Alpha 3
Planning 1
Pre-Alpha 1
Beta 1

Python Reinforcement Learning Frameworks

View 28 business solutions

Reinforcement Learning Frameworks Python Clear Filters

Browse free open source Python Reinforcement Learning Frameworks and projects below. Use the toggles on the left to filter open source Python Reinforcement Learning Frameworks by OS, license, language, programming language, and project status.

MongoDB Atlas runs apps anywhere
Deploy in 115+ regions with the modern database for every enterprise.

MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.

Start Free
ContractSafe: Contract Management Software
Take Control Of Your Contracts Without Wrecking The Budget

Ditch those spreadsheets, shared drives & crazy-expensive solutions with too many bells & whistles. ContractSafe offers the simplest way to manage your contracts efficiently without breaking the bank.

Learn More
1

DeepSeek-V3

Powerful AI language model (MoE) optimized for efficiency/performance

DeepSeek-V3 is a robust Mixture-of-Experts (MoE) language model developed by DeepSeek, featuring a total of 671 billion parameters, with 37 billion activated per token. It employs Multi-head Latent Attention (MLA) and the DeepSeekMoE architecture to enhance computational efficiency. The model introduces an auxiliary-loss-free load balancing strategy and a multi-token prediction training objective to boost performance. Trained on 14.8 trillion diverse, high-quality tokens, DeepSeek-V3 underwent supervised fine-tuning and reinforcement learning to fully realize its capabilities. Evaluations indicate that it outperforms other open-source models and rivals leading closed-source models, achieving this with a training duration of 55 days on 2,048 Nvidia H800 GPUs, costing approximately $5.58 million.

1 Review

Downloads: 155 This Week

Last Update: 2025-07-09
See Project
2

DeepSeek R1

Open-source, high-performance AI model with advanced reasoning

DeepSeek-R1 is an open-source large language model developed by DeepSeek, designed to excel in complex reasoning tasks across domains such as mathematics, coding, and language. DeepSeek R1 offers unrestricted access for both commercial and academic use. The model employs a Mixture of Experts (MoE) architecture, comprising 671 billion total parameters with 37 billion active parameters per token, and supports a context length of up to 128,000 tokens. DeepSeek-R1's training regimen uniquely integrates large-scale reinforcement learning (RL) without relying on supervised fine-tuning, enabling the model to develop advanced reasoning capabilities. This approach has resulted in performance comparable to leading models like OpenAI's o1, while maintaining cost-efficiency. To further support the research community, DeepSeek has released distilled versions of the model based on architectures such as LLaMA and Qwen.

1 Review

Downloads: 87 This Week

Last Update: 2025-07-09
See Project
3

TorchRL

A modular, primitive-first, python-first PyTorch library

TorchRL is an open-source Reinforcement Learning (RL) library for PyTorch. TorchRL provides PyTorch and python-first, low and high-level abstractions for RL that are intended to be efficient, modular, documented, and properly tested. The code is aimed at supporting research in RL. Most of it is written in Python in a highly modular way, such that researchers can easily swap components, transform them, or write new ones with little effort.

Downloads: 77 This Week

Last Update: 2026-02-05
See Project
4

LightZero

[NeurIPS 2023 Spotlight] LightZero

LightZero is an efficient, scalable, and open-source framework implementing MuZero, a powerful model-based reinforcement learning algorithm that learns to predict rewards and transitions without explicit environment models. Developed by OpenDILab, LightZero focuses on providing a highly optimized and user-friendly platform for both academic research and industrial applications of MuZero and similar algorithms.

Downloads: 11 This Week

Last Update: 2025-04-09
See Project
Peer to Peer Recognition Brings Teams Together
The modern employee engagement platform for the modern workforce

Create a positive and energetic workplace environment with Motivosity, an innovative employee recognition and engagement platform. With Motivosity, employees can give each other small monetary bonuses for doing great things, promoting trust, collaboration, and appreciation in the workplace. The software solution comes with features such as an open-currency open-reward system, insights and analytics, dynamic organization chart, award programs, milestones, and more.

Learn More
5

MedicalGPT

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training

MedicalGPT training medical GPT model with ChatGPT training pipeline, implementation of Pretraining, Supervised Finetuning, Reward Modeling and Reinforcement Learning. MedicalGPT trains large medical models, including secondary pre-training, supervised fine-tuning, reward modeling, and reinforcement learning training.

Downloads: 9 This Week

Last Update: 5 days ago
See Project
6

Weights and Biases

Tool for visualizing and tracking your machine learning experiments

Use W&B to build better models faster. Track and visualize all the pieces of your machine learning pipeline, from datasets to production models. Quickly identify model regressions. Use W&B to visualize results in real time, all in a central dashboard. Focus on the interesting ML. Spend less time manually tracking results in spreadsheets and text files. Capture dataset versions with W&B Artifacts to identify how changing data affects your resulting models. Reproduce any model, with saved code, hyperparameters, launch commands, input data, and resulting model weights. Set wandb.config once at the beginning of your script to save your hyperparameters, input settings (like dataset name or model type), and any other independent variables for your experiments. This is useful for analyzing your experiments and reproducing your work in the future. Setting configs also allows you to visualize the relationships between features of your model architecture or data pipeline and model performance.

Downloads: 9 This Week

Last Update: 5 days ago
See Project
7

AgentUniverse

agentUniverse is a LLM multi-agent framework

AgentUniverse is a multi-agent AI framework that enables coordination between multiple intelligent agents for complex task execution and automation.

Downloads: 8 This Week

Last Update: 2025-11-17
See Project
8

Brax

Massively parallel rigidbody physics simulation

Brax is a fast and fully differentiable physics engine for large-scale rigid body simulations, built on JAX. It is designed for research in reinforcement learning and robotics, enabling efficient simulations and gradient-based optimization.

Downloads: 8 This Week

Last Update: 2026-03-15
See Project
9

dm_control

DeepMind's software stack for physics-based simulation

DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo. DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo physics. The MuJoCo Python bindings support three different OpenGL rendering backends: EGL (headless, hardware-accelerated), GLFW (windowed, hardware-accelerated), and OSMesa (purely software-based). At least one of these three backends must be available in order render through dm_control. Hardware rendering with a windowing system is supported via GLFW and GLEW. On Linux these can be installed using your distribution's package manager. "Headless" hardware rendering (i.e. without a windowing system such as X11) requires EXT_platform_device support in the EGL driver. While dm_control has been largely updated to use the pybind11-based bindings provided via the mujoco package, at this time it still relies on some legacy components that are automatically generated.

Downloads: 8 This Week

Last Update: 4 days ago
See Project
Virtual data rooms designed to achieve better outcomes
Now you can get ready for and experience success in M&A, divestments, capital raising, restructure, IPOs, tenders and more

Ansarada is a SaaS company that provides world-leading AI-powered Virtual Data Rooms and dealmaking tools. These tools include advanced AI insights and automation, next level Q&A and collaboration, plus pre-built, digitized and customizable workflows and checklists - known as Pathways - for M&A, capital raising, business audits, tenders and other high stakes outcomes.

Free Trial
10

Agent S

Agent S: an open agentic framework that uses computers like a human

Agent S is an open-source agentic framework designed to enable autonomous computer use through an Agent-Computer Interface (ACI). Built to operate graphical user interfaces like a human, it allows AI agents to perceive screens, reason about tasks, and execute actions across macOS, Windows, and Linux systems. The latest version, Agent S3, surpasses human-level performance on the OSWorld benchmark, demonstrating state-of-the-art results in complex multi-step computer tasks. Agent S combines powerful foundation models (such as GPT-5) with grounding models like UI-TARS to translate visual inputs into precise executable actions. It supports flexible deployment via CLI, SDK, or cloud, and integrates with multiple model providers including OpenAI, Anthropic, Gemini, Azure, and Hugging Face endpoints. With optional local code execution, reflection mechanisms, and compositional planning, Agent S provides a scalable and research-driven framework for building advanced computer-use agents.

Downloads: 7 This Week

Last Update: 2025-12-16
See Project
11

Alibi Explain

Algorithms for explaining machine learning models

Alibi is a Python library aimed at machine learning model inspection and interpretation. The focus of the library is to provide high-quality implementations of black-box, white-box, local and global explanation methods for classification and regression models.

Downloads: 7 This Week

Last Update: 2024-08-09
See Project
12

OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework

OpenRLHF is an easy-to-use, scalable, and high-performance framework for Reinforcement Learning with Human Feedback (RLHF). It supports various training techniques and model architectures.

Downloads: 7 This Week

Last Update: 4 days ago
See Project
13

Project Malmo

A platform for Artificial Intelligence experimentation on Minecraft

How can we develop artificial intelligence that learns to make sense of complex environments? That learns from others, including humans, how to interact with the world? That learns transferable skills throughout its existence, and applies them to solve new, challenging problems? Project Malmo sets out to address these core research challenges, addressing them by integrating (deep) reinforcement learning, cognitive science, and many ideas from artificial intelligence. The Malmo platform is a sophisticated AI experimentation platform built on top of Minecraft, and designed to support fundamental research in artificial intelligence. The Project Malmo platform consists of a mod for the Java version, and code that helps artificial intelligence agents sense and act within the Minecraft environment. The two components can run on Windows, Linux, or Mac OS, and researchers can program their agents in any programming language they’re comfortable with.

Downloads: 7 This Week

Last Update: 2023-03-23
See Project
14

H2O LLM Studio

Framework and no-code GUI for fine-tuning LLMs

Welcome to H2O LLM Studio, a framework and no-code GUI designed for fine-tuning state-of-the-art large language models (LLMs). You can also use H2O LLM Studio with the command line interface (CLI) and specify the configuration file that contains all the experiment parameters. To finetune using H2O LLM Studio with CLI, activate the pipenv environment by running make shell. With H2O LLM Studio, training your large language model is easy and intuitive. First, upload your dataset and then start training your model. Start by creating an experiment. You can then monitor and manage your experiment, compare experiments, or push the model to Hugging Face to share it with the community.

Downloads: 6 This Week

Last Update: 2026-04-07
See Project
15

TensorHouse

A collection of reference Jupyter notebooks and demo AI/ML application

TensorHouse is a scalable reinforcement learning (RL) platform that focuses on high-throughput experience generation and distributed training. It is designed to efficiently train agents across multiple environments and compute resources. TensorHouse enables flexible experiment management, making it suitable for large-scale RL experiments in both research and applied settings.

Downloads: 6 This Week

Last Update: 2025-03-13
See Project
16

AnyTrading

The most simple, flexible, and comprehensive OpenAI Gym trading

gym-anytrading is an OpenAI Gym-compatible environment designed for developing and testing reinforcement learning algorithms on trading strategies. It simulates trading environments for financial markets, including stocks and forex.

Downloads: 5 This Week

Last Update: 2025-03-13
See Project
17

Gym

Toolkit for developing and comparing reinforcement learning algorithms

Gym by OpenAI is a toolkit for developing and comparing reinforcement learning algorithms. It supports teaching agents, everything from walking to playing games like Pong or Pinball. Open source interface to reinforce learning tasks. The gym library provides an easy-to-use suite of reinforcement learning tasks. Gym provides the environment, you provide the algorithm. You can write your agent using your existing numerical computation library, such as TensorFlow or Theano. It makes no assumptions about the structure of your agent, and is compatible with any numerical computation library, such as TensorFlow or Theano. The gym library is a collection of test problems — environments — that you can use to work out your reinforcement learning algorithms. These environments have a shared interface, allowing you to write general algorithms.

Downloads: 5 This Week

Last Update: 2025-03-06
See Project
18

Machine Learning PyTorch Scikit-Learn

Code Repository for Machine Learning with PyTorch and Scikit-Learn

Initially, this project started as the 4th edition of Python Machine Learning. However, after putting so much passion and hard work into the changes and new topics, we thought it deserved a new title. So, what’s new? There are many contents and additions, including the switch from TensorFlow to PyTorch, new chapters on graph neural networks and transformers, a new section on gradient boosting, and many more that I will detail in a separate blog post. For those who are interested in knowing what this book covers in general, I’d describe it as a comprehensive resource on the fundamental concepts of machine learning and deep learning. The first half of the book introduces readers to machine learning using scikit-learn, the defacto approach for working with tabular datasets. Then, the second half of this book focuses on deep learning, including applications to natural language processing and computer vision.

Downloads: 5 This Week

Last Update: 2022-08-22
See Project
19

Stable Baselines3

PyTorch version of Stable Baselines

Stable Baselines3 (SB3) is a set of reliable implementations of reinforcement learning algorithms in PyTorch. It is the next major version of Stable Baselines. You can read a detailed presentation of Stable Baselines3 in the v1.0 blog post or our JMLR paper. These algorithms will make it easier for the research community and industry to replicate, refine, and identify new ideas, and will create good baselines to build projects on top of. We expect these tools will be used as a base around which new ideas can be added, and as a tool for comparing a new approach against existing ones. We also hope that the simplicity of these tools will allow beginners to experiment with a more advanced toolset, without being buried in implementation details.

Downloads: 5 This Week

Last Update: 2026-04-01
See Project
20

TextWorld

TextWorld is a sandbox learning environment for the training

TextWorld is a learning environment designed to train reinforcement learning agents to play text-based games, where actions and observations are entirely in natural language. Developed by Microsoft Research, TextWorld focuses on language understanding, planning, and interaction in complex, narrative-driven environments. It generates games procedurally, enabling scalable testing of agents’ natural language processing and decision-making abilities.

Downloads: 5 This Week

Last Update: 2026-01-30
See Project
21

Deep Reinforcement Learning for Keras

Deep Reinforcement Learning for Keras.

keras-rl implements some state-of-the-art deep reinforcement learning algorithms in Python and seamlessly integrates with the deep learning library Keras. Furthermore, keras-rl works with OpenAI Gym out of the box. This means that evaluating and playing around with different algorithms is easy. Of course, you can extend keras-rl according to your own needs. You can use built-in Keras callbacks and metrics or define your own. Even more so, it is easy to implement your own environments and even algorithms by simply extending some simple abstract classes. Documentation is available online.

Downloads: 4 This Week

Last Update: 2024-08-02
See Project
22

ManiSkill

SAPIEN Manipulation Skill Framework

ManiSkill is a benchmark platform for training and evaluating reinforcement learning agents on dexterous manipulation tasks using physics-based simulations. Developed by Hao Su Lab, it focuses on robotic manipulation with diverse, high-quality 3D tasks designed to challenge perception, control, and planning in robotics. ManiSkill provides both low-level control and visual observation spaces for realistic learning scenarios.

Downloads: 4 This Week

Last Update: 2025-12-05
See Project
23

Multi-Agent Orchestrator

Flexible and powerful framework for managing multiple AI agents

Multi-Agent Orchestrator is an AI coordination framework that enables multiple intelligent agents to work together to complete complex, multi-step workflows.

Downloads: 4 This Week

Last Update: 2025-06-24
See Project
24

PaLM + RLHF - Pytorch

Implementation of RLHF (Reinforcement Learning with Human Feedback)

PaLM-rlhf-pytorch is a PyTorch implementation of Pathways Language Model (PaLM) with Reinforcement Learning from Human Feedback (RLHF). It is designed for fine-tuning large-scale language models with human preference alignment, similar to OpenAI’s approach for training models like ChatGPT.

Downloads: 4 This Week

Last Update: 2025-09-19
See Project
25

RL Games

RL implementations

rl_games is a high-performance reinforcement learning framework optimized for GPU-based training, particularly in environments like robotics and continuous control tasks. It supports advanced algorithms and is built with PyTorch.

Downloads: 4 This Week

Last Update: 2026-02-20
See Project

Previous
You're on page 1
2
3
4
Next

Related Searches

deepseek

ai agent mod

ai agent

games

rivals

minecraft ai bot

llm

forex trading robot

gym software

machine learning

SourceForge

Create a Project
Open Source Software
Business Software
Top Downloaded Projects

Company

About
Team
SourceForge Headquarters
1320 Columbia Street Suite 310
San Diego, CA 92101
+1 (858) 422-6466

Resources

Support
Site Documentation
Site Status
SourceForge Reviews

© 2026 Slashdot Media. All Rights Reserved.

Terms Privacy Privacy Choices Advertise