Chapter 2.2: AI, Learning, and Decision Making

Learning Objectives

By the end of this chapter, you will be able to:

Explain the different machine learning approaches used in humanoid robotics
Analyze path planning and navigation techniques for humanoid robots
Evaluate cognitive architectures that enable intelligent behavior in humanoids
Compare the advantages and challenges of different AI decision-making approaches

2.2.1 Machine Learning Applications in Humanoids

Machine learning (ML) has become an indispensable tool in humanoid robotics, enabling robots to adapt, learn from experience, and perform complex tasks that are difficult to program explicitly.

Reinforcement Learning (RL):
- Concept: RL allows a robot to learn optimal behaviors through trial and error, by interacting with an environment and receiving rewards or penalties. The robot learns a policy that maps states to actions.
- Applications: Learning dynamic locomotion gaits (e.g., walking, running, jumping), developing manipulation strategies for novel objects, human-robot interaction through learned responses, and adaptive control in uncertain environments. Deep Reinforcement Learning (DRL), which combines RL with deep neural networks, has been particularly successful in enabling humanoids to learn highly complex, high-dimensional behaviors.
- Examples: Learning to walk on uneven terrain, performing intricate manipulation tasks with varying objects, or adapting interaction strategies based on human feedback.
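
The trial-and-error loop described above can be sketched with tabular Q-learning. This is a minimal illustration, not a humanoid controller: the one-dimensional corridor environment, the reward values, and the hyperparameters (`alpha`, `gamma`, `epsilon`) are all assumptions chosen to keep the example small.

```python
import random

N_STATES = 5          # hypothetical corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # step left or step right


def step(state, action):
    """Toy environment (assumed): returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.01   # small per-step penalty, reward at goal
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Learn a Q-table with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = max(range(2), key=lambda i: q[state][i])
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning target: reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Policy: map each non-goal state to its highest-valued action."""
    return [ACTIONS[max(range(2), key=lambda i: q[s][i])]
            for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The learned policy is simply a lookup from state to action. DRL replaces the table `q` with a neural network so that the same update idea scales to high-dimensional states such as joint angles and camera images.
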
Supervised Learning:
- Concept: Robots learn from labeled datasets, where input-output pairs are provided.
- Applications: Object recognition and classification (using cameras to identify objects in the environment), speech recognition (for human-robot voice commands), gesture recognition (interpreting human body language), and state estimation (predicting internal robot states based on sensor data). Convolutional Neural Networks (CNNs) are widely used for visual tasks, while Recurrent Neural Networks (RNNs) or Transformers can be applied to sequential data like speech.
Unsupervised Learning:
- Concept: Robots discover patterns and structures in unlabeled data.
- Applications: Clustering similar sensor readings to identify distinct environmental features, dimensionality reduction for high-dimensional sensory data, and anomaly detection (identifying unusual events or robot behaviors).
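
As one simple illustration of anomaly detection without labels, the sketch below flags sensor readings that lie far from the mean of the data. The z-score rule, the threshold of 3, and the sample readings are assumptions for illustration; practical systems often use richer models.

```python
import statistics


def find_anomalies(readings, threshold=3.0):
    """Return indices of readings more than `threshold` standard deviations
    from the mean. No labels are used: the 'normal' range is estimated
    from the data itself."""
    mean = statistics.fmean(readings)
    std = statistics.pstdev(readings)
    if std == 0:
        return []  # all readings identical, nothing stands out
    return [i for i, x in enumerate(readings)
            if abs(x - mean) / std > threshold]


# Hypothetical joint-torque readings with one spike at the end.
readings = [1.0, 1.1, 0.9, 1.0] * 5 + [5.0]
anomalies = find_anomalies(readings)
```
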
Imitation Learning/Learning from Demonstration (LfD):
- Concept: Robots learn by observing human demonstrations of tasks, inferring the underlying skills and policies.
- Applications: Learning complex manipulation sequences (e.g., pouring water, opening doors), teaching precise trajectories for assembly tasks, and acquiring social behaviors. LfD can significantly reduce the need for extensive manual programming or time-consuming RL exploration.
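
The simplest form of LfD is behavior cloning, which treats demonstrations as supervised (state, action) pairs. The sketch below fits a linear policy by least squares; the one-dimensional gripper task and the demonstration data are hypothetical, and real systems typically fit neural networks to much richer observations.

```python
def fit_linear_policy(demos):
    """Least-squares fit of action = w * state + b to (state, action) pairs."""
    n = len(demos)
    mean_s = sum(s for s, _ in demos) / n
    mean_a = sum(a for _, a in demos) / n
    cov = sum((s - mean_s) * (a - mean_a) for s, a in demos)
    var = sum((s - mean_s) ** 2 for s, _ in demos)
    w = cov / var
    b = mean_a - w * mean_s
    return w, b


# Hypothetical demonstrations: a human moves a gripper toward position 1.0,
# commanding a velocity proportional to the remaining distance (gain 0.5).
demos = [(s / 10, 0.5 * (1.0 - s / 10)) for s in range(11)]
w, b = fit_linear_policy(demos)


def policy(state):
    """Cloned policy: reproduces the demonstrator's state-to-action mapping."""
    return w * state + b
```

Because the policy only imitates what it has seen, it can drift in states the demonstrations never covered, which is one reason LfD is often combined with RL fine-tuning.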

Note

Reinforcement learning has shown remarkable success in humanoid robotics, particularly for learning complex motor skills. However, policies are usually trained in simulation, and transferring them to real robots (sim-to-real transfer) remains a significant challenge because of the reality gap: simulated dynamics, contacts, and sensor noise never match the real environment exactly.

2.2.2 Path Planning and Navigation

Humanoid robots operate in dynamic, often cluttered environments, necessitating robust path planning and navigation capabilities to move safely and efficiently.

Localization:
- Concept: Determining the robot's precise position and orientation within a known map.
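- Techniques: Kalman filters and particle filters are commonly used to fuse sensor data (LiDAR, cameras, encoders, and Inertial Measurement Units (IMUs)) into an accurate pose estimate. Visual odometry, which estimates motion from successive camera images, is often combined with IMU data as an input to these filters.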
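
To make the fusion idea concrete, here is a minimal sketch of a scalar Kalman filter that combines an odometry displacement with a noisy position measurement. The one-dimensional state, the noise variances `q` and `r`, and the sample data are assumptions; a real humanoid would track a full 3D pose with matrix-valued covariances.

```python
def kalman_step(x, p, u, z, q=0.01, r=0.25):
    """One predict/update cycle of a scalar Kalman filter.

    x, p : previous position estimate and its variance
    u    : displacement reported by odometry since the last step
    z    : noisy absolute position measurement (e.g., from a range sensor)
    q, r : assumed process and measurement noise variances
    """
    # Predict: apply the motion and grow the uncertainty.
    x_pred = x + u
    p_pred = p + q
    # Update: blend prediction and measurement using the Kalman gain.
    k = p_pred / (p_pred + r)
    x_new = x_pred + k * (z - x_pred)
    p_new = (1.0 - k) * p_pred
    return x_new, p_new


# Hypothetical data: the robot steps about 1 m each cycle.
x, p = 0.0, 1.0
for u, z in [(1.0, 1.2), (1.0, 1.9), (1.0, 3.1)]:
    x, p = kalman_step(x, p, u, z)
```

The gain `k` weights whichever source is currently more certain, so the variance `p` after each update is smaller than both the predicted variance and the measurement variance.
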
Mapping:
- Concept: Building a representation of the environment.
- Techniques: Occupancy grid maps (2D or 3D) are commonly used to represent free and occupied spaces. SLAM (Simultaneous Localization and Mapping) algorithms enable robots to build maps of unknown environments while simultaneously localizing themselves within these maps. This often involves fusing data from LiDAR, depth cameras, and visual features.
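
An occupancy grid is commonly maintained in log-odds form so that each new sensor observation becomes a simple addition. The sketch below assumes a hypothetical inverse sensor model (0.7 probability of occupancy on a hit, 0.3 on a miss) and a tiny 3x4 grid; the cell indices are illustrative only.

```python
import math

# Assumed inverse sensor model, expressed as log-odds increments.
L_OCC = math.log(0.7 / 0.3)    # beam endpoint landed in the cell
L_FREE = math.log(0.3 / 0.7)   # beam passed through the cell


def update_cell(log_odds, hit):
    """Add the evidence from one observation to a cell's log-odds."""
    return log_odds + (L_OCC if hit else L_FREE)


def probability(log_odds):
    """Convert log-odds back to an occupancy probability."""
    return 1.0 - 1.0 / (1.0 + math.exp(log_odds))


# Log-odds 0 corresponds to probability 0.5, i.e., unknown.
grid = [[0.0] * 4 for _ in range(3)]

# Three consecutive hits on one cell, one miss on another.
for _ in range(3):
    grid[1][2] = update_cell(grid[1][2], hit=True)
grid[0][0] = update_cell(grid[0][0], hit=False)
```

Repeated consistent observations push a cell toward 0 or 1, while unobserved cells stay at 0.5.
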
Path Planning:
- Global Path Planning: Generates a path from a start to a goal location, considering the entire known map and avoiding static obstacles. Graph-search algorithms such as Dijkstra's algorithm and A* return a shortest path on a discretized map, while sampling-based planners such as RRT (Rapidly-exploring Random Tree) find feasible paths quickly in high-dimensional spaces but do not guarantee optimality.
- Local Path Planning (Obstacle Avoidance): Deals with dynamic obstacles and unexpected changes in the environment, modifying the global path in real time. Techniques include the Dynamic Window Approach (DWA), Artificial Potential Fields, and Model Predictive Control (MPC).
- Whole-Body Motion Planning: For humanoids, path planning extends beyond just the base. It involves coordinating the movement of all joints to avoid self-collisions, maintain balance, and perform tasks while respecting joint limits and torque constraints.
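
As a sketch of global planning on an occupancy grid, the following A* implementation searches a 4-connected grid with a Manhattan-distance heuristic. The grid layout, unit step cost, and connectivity are assumptions for illustration; it plans only for a point-like base and ignores the balance and whole-body constraints discussed above.

```python
import heapq


def astar(grid, start, goal):
    """Shortest 4-connected path on a grid of 0 (free) / 1 (occupied) cells.

    Returns a list of (row, col) cells from start to goal, or None if the
    goal is unreachable.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]   # entries are (f, g, cell)
    came_from = {start: None}
    cost = {start: 0}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            # Walk the parent links back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if g > cost[cell]:
            continue  # stale heap entry; a cheaper route was found later
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_g = g + 1
                if new_g < cost.get((nr, nc), float("inf")):
                    cost[(nr, nc)] = new_g
                    came_from[(nr, nc)] = cell
                    heapq.heappush(open_heap, (new_g + h((nr, nc)), new_g, (nr, nc)))
    return None


# Hypothetical map: a wall with a single gap at column 2.
grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
path = astar(grid, (0, 0), (2, 0))
```

Setting the heuristic to zero turns this into Dijkstra's algorithm; the heuristic only changes how many cells are expanded, not the length of the returned path.
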
Navigation Strategies:
- Waypoint Navigation: The robot follows a series of predefined waypoints to reach its destination.
- Goal-Oriented Navigation: The robot autonomously plans its path to a specified goal, adapting to environmental changes.
- Human-Aware Navigation: Humanoids must navigate respectfully and safely around humans, predicting human movements and maintaining appropriate personal space.

2.2.3 Humanoid Cognitive Architectures

Cognitive architectures provide a framework for integrating various AI components—perception, learning, reasoning, planning, and action—into a coherent system that enables humanoids to exhibit intelligent behavior.

Hierarchical Architectures:
- Concept: Organize cognitive functions into layers, with higher layers performing abstract reasoning and task planning, and lower layers handling reactive control and sensor processing.
- Structure: Typically involves a deliberative layer (for long-term planning and symbolic reasoning), a reactive layer (for immediate responses to stimuli), and an intermediate layer that bridges the two (often called an executive or sequencing layer). This hybrid arrangement allows for both goal-directed behavior and rapid adaptation to unforeseen events.
Deliberative vs. Reactive Control:
- Deliberative: Involves complex reasoning, planning, and world modeling. It's often slower but can handle novel situations and complex goals.
- Reactive: Fast, reflexive responses to immediate sensory input. It's crucial for safety and quick adaptation but lacks foresight.
- Integration: Modern architectures often strive for seamless integration, allowing the robot to switch between deliberative planning and reactive execution based on the situation's demands.
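
One simple way to integrate the two modes is priority-based arbitration, where a reactive reflex can override the deliberative plan. The sketch below is an assumption-laden illustration: the action names, the 0.3 m stop distance, and the function names are all hypothetical.

```python
def reactive_layer(obstacle_distance_m, stop_distance_m=0.3):
    """Reflex: return a stop command if an obstacle is too close, else None."""
    if obstacle_distance_m < stop_distance_m:
        return "stop"
    return None


def deliberative_layer(plan, step_index):
    """Return the next planned action, or 'idle' once the plan is finished."""
    return plan[step_index] if step_index < len(plan) else "idle"


def select_action(plan, step_index, obstacle_distance_m):
    """Arbitration: a reactive command, when present, overrides the plan."""
    reflex = reactive_layer(obstacle_distance_m)
    if reflex is not None:
        return reflex
    return deliberative_layer(plan, step_index)


# Hypothetical plan produced by a slower task planner.
plan = ["walk_forward", "turn_left", "grasp_cup"]
```

The reactive check runs every control cycle and is cheap, while the plan can be recomputed at a much lower rate, which mirrors the speed difference between the two layers.
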
Knowledge Representation and Reasoning:
- Concept: How humanoids store, organize, and process information about the world, tasks, and themselves.
- Techniques: Symbolic knowledge representation (e.g., ontologies, logical rules), probabilistic models (e.g., Bayesian networks for uncertainty), and neural network-based embeddings for semantic understanding. Reasoning engines allow the robot to infer new facts, make decisions, and resolve conflicts.
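
Probabilistic reasoning under uncertainty ultimately rests on Bayes' rule. The sketch below updates a robot's belief that a door is open after each noisy sensor reading; the door scenario and the sensor likelihoods (0.8 and 0.2) are assumptions, and a full Bayesian network would chain many such variables together.

```python
def bayes_update(prior, p_obs_given_true, p_obs_given_false):
    """Posterior probability of a hypothesis after one observation.

    prior             : P(hypothesis) before the observation
    p_obs_given_true  : P(observation | hypothesis is true)
    p_obs_given_false : P(observation | hypothesis is false)
    """
    numerator = p_obs_given_true * prior
    evidence = numerator + p_obs_given_false * (1.0 - prior)
    return numerator / evidence


# Hypothetical sensor: reports "open" 80% of the time when the door is open
# and 20% of the time when it is closed.
belief_open = 0.5
belief_after_one = bayes_update(belief_open, 0.8, 0.2)
belief_after_two = bayes_update(belief_after_one, 0.8, 0.2)
```

Each consistent reading increases the belief, but it never reaches certainty, which lets later contradictory evidence pull it back.
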
Learning and Adaptation:
- Concept: Cognitive architectures must support continuous learning and adaptation to new tasks, environments, and human interactions.
- Mechanisms: Integration of machine learning modules (as discussed in Section 2.2.1), mechanisms for updating internal world models, and meta-learning capabilities that allow the robot to learn how to learn more efficiently.
Human-Robot Interaction (HRI) Integration:
- Concept: Cognitive architectures must facilitate natural and effective communication and collaboration with humans.
- Components: Modules for understanding human intent (through speech, gesture, gaze), generating appropriate robot responses (speech synthesis, facial expressions, body language), and maintaining shared mental models of tasks and goals.

The development of sophisticated cognitive architectures is key to moving humanoids beyond pre-programmed behaviors towards truly intelligent and autonomous agents capable of complex human-like cognition.

Try it yourself

Research and compare the cognitive architectures of two different humanoid robots (e.g., ASIMO and Pepper). Analyze how their different design goals and target applications influence their cognitive architecture choices, and identify the key components that enable intelligent behavior in each robot.
