Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Stephen Casper, Xander Davies, Claudia Shi et al.

2023 · arXiv (Cornell University) · 89 citations

Architectural Summary: The TRIAD-CORE 5.2 Framework

The TRIAD-CORE 5.2 framework instantiates a rigorous neuro-symbolic nexus between Integrated Information Theory (IIT) and the Free-Energy Principle (FEP), establishing a formal substrate for sovereign cognitive architectures. In this architectural paradigm, consciousness is treated as the proximate cause (identified with the irreducible integrated causal structure of a system), while active inference and variational free energy (VFE) minimization provide the ultimate, teleological account of adaptive self-organization. A central empirical pilla…
