Communications in computer and information science, Journal Year: 2023, Volume and Issue: unknown, P. 14 - 28
Published: Nov. 15, 2023
Language: English
IEEE Transactions on Pattern Analysis and Machine Intelligence, Journal Year: 2024, Volume and Issue: 46(12), P. 9677 - 9696
Published: July 1, 2024
Disentangled Representation Learning (DRL) aims to learn a model capable of identifying and disentangling the underlying factors hidden in the observable data in representation form. The process of separating underlying factors of variation into variables with semantic meaning benefits learning explainable representations of data, which imitates the meaningful understanding process of humans when observing an object or relation. As a general learning strategy, DRL has demonstrated its power in improving model explainability, controllability, robustness, as well as generalization capacity in a wide range of scenarios such as computer vision, natural language processing, and data mining. In this article, we comprehensively investigate DRL from various aspects including motivations, definitions, methodologies, evaluations, applications, and model designs. We first present two well-recognized definitions, i.e., the Intuitive Definition and the Group Theory Definition for disentangled representation learning. We further categorize the methodologies for DRL into four groups from the following perspectives: model type, representation structure, supervision signal, and independence assumption. We also analyze principles to design different DRL models that may benefit different tasks in practical applications. Finally, we point out challenges in DRL as well as potential research directions deserving future investigations. We believe this work may provide insights for promoting DRL research in the community.
Language: English
Citations
18
Neural Computation, Journal Year: 2024, Volume and Issue: 36(4), P. 677 - 704
Published: March 8, 2024
Representing a scene and its constituent objects from raw sensory data is a core ability for enabling robots to interact with their environment. In this letter, we propose a novel approach for scene understanding, leveraging an object-centric generative model that enables an agent to infer object category and pose in an allocentric reference frame using active inference, a neuro-inspired framework for action and perception. For evaluating the behavior of an active vision agent, we also propose a new benchmark where, given a target viewpoint of a particular object, the agent needs to find the best matching viewpoint in a workspace with randomly positioned objects in 3D. We demonstrate that our active inference agent is able to balance epistemic foraging and goal-driven behavior, and quantitatively outperforms both supervised and reinforcement learning baselines by more than a factor of two in terms of success rate.
Language: English
Citations
4
Neural Networks, Journal Year: 2025, Volume and Issue: 185, P. 107075 - 107075
Published: Jan. 8, 2025
By dynamic planning, we refer to the ability of the human brain to infer and impose motor trajectories related to cognitive decisions. A recent paradigm, active inference, brings fundamental insights into the adaptation of biological organisms, constantly striving to minimize prediction errors to restrict themselves to life-compatible states. Over the past years, many studies have shown how human and animal behaviors could be explained in terms of active inference (either as discrete decision-making or as continuous motor control), inspiring innovative solutions in robotics and artificial intelligence. Still, the literature lacks a comprehensive outlook on effectively planning realistic actions in changing environments. Setting ourselves the goal of modeling complex tasks such as tool use, we delve into the topic of dynamic planning in active inference, keeping in mind two crucial aspects of biological behavior: the capacity to understand and exploit affordances for object manipulation, and to learn the hierarchical interactions between the self and the environment, including other agents. We start from a simple unit and gradually describe more advanced structures, comparing recently proposed design choices and providing basic examples. This study distances itself from traditional views centered on neural networks and reinforcement learning, and points toward a yet unexplored direction in active inference: hybrid representations in hierarchical models.
Language: English
Citations
0
Frontiers in Neurorobotics, Journal Year: 2025, Volume and Issue: 19
Published: April 30, 2025
Understanding the world in terms of objects and the possible interactions with them is an important cognitive ability. However, current world models adopted in reinforcement learning typically lack this structure and represent the world state in a global latent vector. To address this, we propose FOCUS, a model-based agent that learns an object-centric world model. This novel representation also enables the design of an object-centric exploration mechanism, which encourages the agent to interact with objects and discover useful interactions. We benchmark FOCUS in several robotic manipulation settings, where we found that our method can be used to improve manipulation skills. The object-centric world model leads to more accurate predictions of the objects in the scene and enables more efficient learning. The object-centric exploration strategy fosters interactions with the objects in the environment, such as reaching, moving, and rotating them, and allows fast adaptation of the agent to sparse reward tasks. Using a Franka Emika robot arm, we also showcase how FOCUS proves useful in real-world applications. Website: focus-manipulation.github.io
Language: English
Citations
0
Heliyon, Journal Year: 2024, Volume and Issue: 10(20), P. e39129 - e39129
Published: Oct. 1, 2024
Language: English
Citations
2
Interface Focus, Journal Year: 2023, Volume and Issue: 13(3)
Published: April 14, 2023
Humans perceive and interact with hundreds of objects every day. In doing so, they need to employ mental models of these objects and often exploit symmetries in the object's shape and appearance in order to learn generalizable and transferable skills. Active inference is a first principles approach to understanding and modeling sentient agents. It states that agents entertain a generative model of their environment, and learn and act by minimizing an upper bound on their surprisal, i.e. their Free Energy. The Free Energy decomposes into an accuracy and a complexity term, meaning that agents favor the least complex model that can accurately explain their sensory observations. In this paper, we investigate how inherent symmetries of particular objects also emerge as symmetries in the latent state space of the generative model learnt under deep active inference. In particular, we focus on object-centric representations, which are trained from pixels to predict novel object views as the agent moves its viewpoint. First, we investigate the relation between model complexity and symmetry exploitation in the state space. Second, we do a principal component analysis to demonstrate how the model encodes the principal axis of symmetry of the object in the latent space. Finally, we also demonstrate how more symmetrical representations can be exploited for better generalization in the context of manipulation.
Language: English
Citations
5
Neural Computation, Journal Year: 2024, Volume and Issue: 36(5), P. 963 - 1021
Published: March 8, 2024
The free energy principle (FEP) and its corollary, the active inference framework, serve as theoretical foundations in the domain of neuroscience, explaining the genesis of intelligent behavior. This principle states that the processes of perception, learning, and decision making within an agent are all driven by the objective of “minimizing free energy,” evincing the following behaviors: learning and employing a generative model of the environment to interpret observations, thereby achieving perception, and selecting actions to maintain a stable preferred state and minimize the uncertainty about the environment, thereby achieving decision making. This fundamental principle can be used to explain how the brain processes perceptual information, learns about the environment, and selects actions. Two pivotal tenets are that the agent employs a generative model for perception and planning and that interaction with the world (and other agents) enhances the performance of the generative model and augments perception. With the evolution of control theory and deep learning tools, agents based on the FEP have been instantiated in various ways across different domains, guiding the design of a multitude of generative models and decision-making algorithms. This letter first introduces the basic concepts of the FEP, followed by its historical development and connections with other theories of intelligence, and then delves into the specific application of the FEP to perception and decision making, encompassing both low-dimensional simple situations and high-dimensional complex situations. It compares the FEP and model-based reinforcement learning to show that the FEP provides a better objective function. We illustrate this using numerical studies of Dreamer3 by adding expected information gain into the standard objective function. In a complementary fashion, existing reinforcement learning and deep learning algorithms can also help implement FEP-based agents. Finally, we discuss the various capabilities that agents need to possess in complex environments and state that the FEP can aid agents in acquiring these capabilities.
Language: English
Citations
1
bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2023, Volume and Issue: unknown
Published: Aug. 21, 2023
A tradeoff exists when dealing with complex tasks composed of multiple steps. High-level cognitive processes can find the best sequence of actions to achieve a goal in uncertain environments, but they are slow and require significant computational demand. In contrast, lower-level processing allows reacting to environmental stimuli rapidly, but with limited capacity to determine optimal actions or to replan when expectations are not met. Through reiteration of the same task, biological organisms find the optimal tradeoff: from action primitives, composite trajectories gradually emerge by creating task-specific neural structures. The two frameworks of active inference, a recent brain paradigm that views action and perception as subject to the same free energy minimization imperative, well capture high-level and low-level processes of human behavior, but how task specialization occurs in these terms is still unclear. In this study, we compare two strategies on a dynamic pick-and-place task: a hybrid (discrete-continuous) model with planning capabilities and a continuous-only model with fixed transitions. Both models rely on a hierarchical (intrinsic and extrinsic) structure, well suited for defining reaching and grasping movements, respectively. Our results show that continuous-only models perform better and with minimal resource expenditure, but at the cost of less flexibility. Finally, we propose how discrete actions might lead to continuous attractors and compare the two frameworks with different motor learning phases, laying the foundations for further studies on bio-inspired task adaptation.
Language: English
Citations
2
Communications in computer and information science, Journal Year: 2023, Volume and Issue: unknown, P. 14 - 28
Published: Nov. 15, 2023
Language: English
Citations
2