Reinforcement Learning (RL)

SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning

Abstract: In recent years, significant progress has been made in the field of robotic reinforcement learning (RL), enabling methods that handle complex image observations, train in the real world, and ...

Anthropic AI Safety Fellowship 2026: How to apply, $15,000 funding, duration & hiring chances

Anthropic, founded by former OpenAI researchers, has positioned itself as one of the leading firms focused on AI alignment ...

DATAQUEST

NVIDIA and Ineffable Intelligence build reinforcement learning infrastructure

NVIDIA and Ineffable Intelligence join forces to advance reinforcement learning infrastructure, creating scalable systems for ...

TMCnet

CoreWeave Sandboxes Launches to Accelerate Reinforcement Learning, Agent Tool Use, and Model Evaluation

The Essential Cloud for AI™, today announced CoreWeave Sandboxes, an execution layer that gives AI researchers and platform teams secure, isolate ...

The Robot Report

RLWRLD releases RLDX-1, a dexterity-first foundation model for robot hands

RLWRLD said with RLDX-1, it aimed to include things like context memorization or force sensing, which existing models often ...

SAP and Cyberwave Deploy Fully Autonomous AI-Powered Robots in Live SAP Logistics Warehouse

The deployment at SAP's warehouse in St. Leon-Rot, Germany, operated on SAP Logistics Management (LGM), SAP's cloud-native logistics execution solution, demonstrates that Physical AI is no longer a ...

i-scoop.eu

Ernie 5.1 from Baidu cuts compute costs while chasing top AI models

Ernie 5.1 from Baidu inherits the pretraining foundation of Ernie 5.0, yet compresses total parameters to roughly one third and active parameters to about one half. The headline claim is even sharper: ...

Can ChatGPT Hide What It’s Really Thinking? OpenAI Says It’s Possible

OpenAI says some ChatGPT models were accidentally trained in a way that could hide parts of their internal reasoning, raising ...

How Sakana trained a 7B model to orchestrate GPT, Claude and Gemini LLMs

Claude Sonnet 4, and Gemini 2.5 Pro dynamically — no hardcoded pipelines, fewer tokens than competing frameworks.
