- The AGI Landscape
- Our vision
- Papers
- Rationality and intelligence
- AI safety gridworlds
- Modeling Friends and Foes
- Forget-me-not Process
- Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study
- Universal Transformers
- Graph Convolutional Policy Network
- Thermodynamics as a theory of decision-making with information-processing costs
- Concrete Problems in AI Safety
- A course in game theory
- Theory of games and economic behavior
- Reinforcement learning: An introduction (1st ed.)
- Regret analysis of stochastic and nonstochastic multi-armed bandit problems
- The nonstochastic multiarmed bandit problem
- Information theory of decisions and actions
- Clustering with Bregman divergences
- Quantal Response Equilibria for Normal Form Games
- The numerics of GANs
- The Mechanics of n-Player Differentiable Games
- Reactive bandits with attitude
- Data clustering by Markovian relaxation and the information bottleneck method
- Information bottleneck for Gaussian variables
- Bounded Rationality, Abstraction, and Hierarchical Decision-Making: An Information-Theoretic Optimal
- Risk sensitive path integral control
- Information, utility and bounded rationality
- Hysteresis effects of changing the parameters of noncooperative games
- The best of both worlds: stochastic and adversarial bandits
- One practical algorithm for both stochastic and adversarial bandits
- An algorithm with nearly optimal pseudo-regret for both stochastic and adversarial bandits
- Friend-or-Foe Q-Learning in General-Sum Games
- New criteria and a new algorithm for learning in multi-agent systems
- Correlated Q-Learning
- Learning to compete, coordinate, and cooperate in repeated games using reinforcement learning
- Learning against sequential opponents in repeated stochastic games
- On the likelihood that one unknown probability exceeds another in view of the evidence of two sample
- An empirical evaluation of Thompson Sampling
- What game are we playing? End-to-end learning in normal and extensive form games
- Intriguing properties of neural networks
- Explaining and harnessing adversarial examples
- Go-Explore
- The Landscape of Deep Reinforcement Learning
- Modeling AGI Safety Frameworks with Causal Influence Diagrams