What comparing deep neural networks can teach us about human vision DOI Open Access
Katja Seeliger, Martin N. Hebart

Published: Jan. 24, 2024

Recent work has demonstrated impressive parallels between human visual representations and those found in deep neural networks. A new study by Wang et al. (2023) highlights which factors may determine this similarity. (Commentary)

Language: English

A large-scale ultra-high-resolution segmentation dataset augmentation framework for photovoltaic panels in photovoltaic power plants based on priori knowledge DOI
Ruiqing Yang, Guojin He, Ranyu Yin

et al.

Applied Energy, Journal Year: 2025, Volume and Issue: 390, P. 125879 - 125879

Published: April 10, 2025

Language: English

Citations

0

A high-throughput approach for the efficient prediction of perceived similarity of natural objects DOI Open Access
Philipp Kaniuth, Florian P. Mahner, Jonas Perkuhn

et al.

Published: April 22, 2025

Perceived similarity offers a window into the mental representations underlying our ability to make sense of the visual world; yet the collection of similarity judgments quickly becomes infeasible for larger datasets, limiting their generality. To address this challenge, here we introduce a computational approach that predicts perceived similarity from neural network activations through a set of 49 interpretable dimensions learned on 1.46 million triplet odd-one-out judgments. The approach allowed us to predict separate, independently sampled similarity scores with an accuracy of up to 0.898. Combining the model with human ratings of the same images led to only small improvements, indicating that the model used similar information as humans in this task. Predicting the similarity of highly homogeneous image classes revealed that performance critically depends on the granularity of the training data. The approach also improved brain-behavior correspondence in a large-scale neuroimaging dataset and allowed us to visualize candidate image features humans use for making similarity judgments, thus highlighting which parts of an image may carry behaviorally relevant information. Together, these results demonstrate that current neural networks are sufficient for capturing broadly sampled similarity scores, offering a pathway towards the automated collection of similarity judgments for natural images.
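The prediction scheme the abstract describes can be sketched as follows. This is a minimal illustration only: the 49-dimensional embedding here is random, a hypothetical stand-in for the interpretable dimensions the authors learn from 1.46 million triplet odd-one-out judgments, not their released weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the learned 49-dimensional interpretable
# embedding: one non-negative vector per image.
n_images, n_dims = 5, 49
embedding = rng.random((n_images, n_dims))

def similarity(i, j, emb):
    """Predicted perceived similarity as the dot product of embeddings."""
    return float(emb[i] @ emb[j])

def odd_one_out(i, j, k, emb):
    """In a triplet task, the odd one out is the image NOT in the most
    similar pair: find the pair with the highest predicted similarity
    and return the remaining index."""
    pairs = {(i, j): k, (i, k): j, (j, k): i}
    most_similar = max(pairs, key=lambda p: similarity(p[0], p[1], emb))
    return pairs[most_similar]

# Full predicted similarity matrix for the image set.
sim_matrix = embedding @ embedding.T
print(odd_one_out(0, 1, 2, embedding))
```

With embeddings trained on behavior, the same dot-product rule yields similarity scores for arbitrarily many image pairs without collecting new human judgments, which is what makes the approach high-throughput.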

Language: English

Citations

0

Comprehensive Neural Representations of Naturalistic Stimuli through Multimodal Deep Learning DOI Creative Commons

Mingxue Fu, Guoqiu Chen, Yijie Zhang

et al.

Published: April 19, 2025

Abstract A central challenge in cognitive neuroscience is understanding how the brain represents and predicts complex, multimodal experiences in naturalistic settings. Traditional neural encoding models, often based on unimodal or static features, fall short of capturing the rich, dynamic structure of real-world cognition. Here, we address this by introducing a video-text alignment framework that predicts whole-brain responses by integrating visual and linguistic features across time. Using a state-of-the-art multimodal deep learning model (VALOR), we achieve more accurate and generalizable predictions than unimodal (AlexNet, WordNet) and image-text (CLIP) baselines. Beyond improving prediction, our framework automatically maps cortical semantic spaces, aligning with human-annotated dimensions without requiring manual labeling. We further uncover a hierarchical predictive coding gradient, in which different regions anticipate future events over distinct timescales, an organization that correlates with individual abilities. These findings provide new evidence that temporal integration is a core mechanism of brain function. Our results demonstrate that multimodal models aligned with naturalistic stimuli can reveal ecologically valid mechanisms, offering a powerful, scalable approach for investigating perception, semantics, and prediction in the human brain. This work advances naturalistic neuroimaging by bridging computational modeling and cognitive neuroscience.

Language: English

Citations

0

Brain2GAN: Feature-disentangled neural encoding and decoding of visual perception in the primate brain DOI Creative Commons
Thirza Dado, Paolo Papale, Antonio Lozano

et al.

PLoS Computational Biology, Journal Year: 2024, Volume and Issue: 20(5), P. e1012058 - e1012058

Published: May 6, 2024

A challenging goal of neural coding is to characterize the representations underlying visual perception. To this end, multi-unit activity (MUA) in macaque cortex was recorded in a passive fixation task upon presentation of faces and natural images. We analyzed the relationship between MUA and the latents of state-of-the-art deep generative models, including conventional and feature-disentangled generative adversarial networks (GANs) (i.e., the z- and w-latents of StyleGAN, respectively) and language-contrastive diffusion models (i.e., the CLIP-latents of Stable Diffusion). A mass univariate encoding analysis showed that w-latents outperform both z- and CLIP-latents in explaining neural responses. Further, w-latent features were found to be positioned at the higher end of the complexity gradient, which indicates that they capture information relevant to high-level neural activity. Subsequently, multivariate decoding of the w-latents resulted in spatiotemporal reconstructions of visual perception. Taken together, our results not only highlight the important role of feature disentanglement in shaping the neural representations underlying perception but also serve as an important benchmark for future neural coding work.
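A mass univariate encoding analysis of the kind mentioned above fits and scores a separate linear model for each recording site. The sketch below uses synthetic data as a stand-in for the study's StyleGAN w-latents and macaque MUA; the dimensions, ridge penalty, and noise level are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-ins: latent feature vectors for 200 stimuli and
# responses of 50 recording sites with a known linear relationship.
n_stim, n_feat, n_units = 200, 16, 50
latents = rng.standard_normal((n_stim, n_feat))
true_w = rng.standard_normal((n_feat, n_units))
mua = latents @ true_w + 0.5 * rng.standard_normal((n_stim, n_units))

# Fit one ridge regression per unit (all units solved jointly via the
# closed form), then score each unit SEPARATELY: that per-unit scoring
# is what makes the analysis "mass univariate".
train, test = slice(0, 150), slice(150, None)
lam = 1.0
X, Y = latents[train], mua[train]
w_hat = np.linalg.solve(X.T @ X + lam * np.eye(n_feat), X.T @ Y)
pred = latents[test] @ w_hat

def columnwise_r(a, b):
    """Pearson r between matching columns (predicted vs. observed)."""
    a = a - a.mean(0)
    b = b - b.mean(0)
    return (a * b).sum(0) / np.sqrt((a**2).sum(0) * (b**2).sum(0))

r_per_unit = columnwise_r(pred, mua[test])
print(r_per_unit.mean())
```

Comparing the per-unit prediction accuracy across different feature spaces (here, different choices of `latents`) is how one latent space is judged to "outperform" another in explaining neural responses.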

Language: English

Citations

3

A high-throughput approach for the efficient prediction of perceived similarity of natural objects DOI Creative Commons
Philipp Kaniuth, Florian P. Mahner, Jonas Perkuhn

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: July 2, 2024

ABSTRACT Perceived similarity offers a window into the mental representations underlying our ability to make sense of the visual world; yet the collection of similarity judgments quickly becomes infeasible for larger datasets, limiting their generality. To address this challenge, here we introduce a computational approach that predicts perceived similarity from neural network activations through a set of 49 interpretable dimensions learned on 1.46 million triplet odd-one-out judgments. The approach allowed us to predict separate, independently sampled similarity scores with an accuracy of up to 0.898. Combining the model with human ratings of the same images led to only small improvements, indicating that the model used similar information as humans in this task. Predicting the similarity of highly homogeneous image classes revealed that performance critically depends on the granularity of the training data. The approach also improved brain-behavior correspondence in a large-scale neuroimaging dataset and allowed us to visualize candidate image features humans use for making similarity judgments, thus highlighting which parts of an image may carry behaviorally relevant information. Together, these results demonstrate that current neural networks are sufficient for capturing broadly sampled similarity scores, offering a pathway towards the automated collection of similarity judgments for natural images.

Language: English

Citations

3

Text-related functionality and dynamics of visual human pre-frontal activations revealed through neural network convergence DOI Creative Commons
Adva Shoham, Rotem Broday-Dvir, Itay Yaron

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: April 2, 2024

Summary The functional role of visual activations in the human pre-frontal cortex remains a deeply debated question. Its significance extends to fundamental issues of functional localization and to global theories of consciousness. Here we addressed this question by comparing, dynamically, the potential parallels between the relational structure of prefrontal visual activations and that of visually- and textual-trained deep neural networks (DNNs). The frontal relational structures were revealed in intra-cranial recordings of human patients, conducted for clinical purposes, while the patients viewed familiar images of faces and places. Our results reveal that these structures were, surprisingly, predicted by text- and not visually-trained DNNs. Importantly, the temporal dynamics of these correlations showed striking differences, with a rapid decline over time for the visual component, but persistent dynamics, including a significant image-offset response, for the textual component. These results point to a dynamic, text-related function of visual responses in the pre-frontal brain.
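Comparing the relational structure of a brain region with that of a candidate model is commonly done via representational similarity analysis (RSA). The sketch below uses synthetic response patterns, not the study's intra-cranial recordings or its DNNs: "model_a" shares half of the brain stand-in's structure and "model_b" is unrelated, so its RDM should correlate less.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-ins: response patterns to 20 stimuli from a "brain
# region" and two candidate models. model_a shares structure with the
# brain data (its first 50 dimensions, plus noise); model_b does not.
n_stim = 20
brain = rng.standard_normal((n_stim, 100))
model_a = brain[:, :50] + 0.3 * rng.standard_normal((n_stim, 50))
model_b = rng.standard_normal((n_stim, 50))

def rdm(x):
    """Representational dissimilarity matrix: 1 - Pearson correlation
    between stimulus response patterns, upper triangle as a vector."""
    xc = x - x.mean(axis=1, keepdims=True)
    xn = xc / np.linalg.norm(xc, axis=1, keepdims=True)
    corr = xn @ xn.T
    iu = np.triu_indices(len(x), k=1)
    return 1.0 - corr[iu]

def spearman(a, b):
    """Spearman correlation via Pearson correlation of ranks."""
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    ra -= ra.mean(); rb -= rb.mean()
    return float((ra @ rb) / np.sqrt((ra @ ra) * (rb @ rb)))

r_a = spearman(rdm(brain), rdm(model_a))
r_b = spearman(rdm(brain), rdm(model_b))
print(r_a > r_b)
```

Tracking such RDM correlations in sliding time windows, rather than on time-averaged responses, is what allows the temporal dynamics of a model-brain correspondence to be characterized.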

Language: English

Citations

2

Fine-grained knowledge about manipulable objects is well-predicted by contrastive language image pre-training DOI Creative Commons
Jon Walbrin, Nikita Sossounov, Morteza Mahdiani

et al.

iScience, Journal Year: 2024, Volume and Issue: 27(7), P. 110297 - 110297

Published: June 18, 2024

Language: English

Citations

2

The Brain Tells a Story: Unveiling Distinct Representations of Semantic Content in Speech, Objects, and Stories in the Human Brain with Large Language Models DOI Creative Commons

Yuko Nakagi, Takuya Matsuyama, Naoko Koide–Majima

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: Feb. 6, 2024

Abstract In recent studies, researchers have used large language models (LLMs) to explore semantic representations in the brain; however, they have typically assessed different levels of semantic content, such as speech, objects, and stories, separately. In this study, we recorded brain activity using functional magnetic resonance imaging (fMRI) while participants viewed 8.3 hours of dramas and movies. We annotated these stimuli at multiple semantic levels, which enabled us to extract latent representations of LLMs for this content. Our findings demonstrate that LLMs predict human brain activity more accurately than traditional language models, particularly for complex background stories. Furthermore, we identify distinct brain regions associated with different semantic representations, including multi-modal vision-semantic representations, which highlights the importance of modeling multi-level semantic content simultaneously. We will make our fMRI dataset publicly available to facilitate further research on aligning LLMs and brain function. Please check out our webpage at https://sites.google.com/view/llm-and-brain/ .

Language: English

Citations

1

Individual differences in prefrontal coding of visual features DOI Creative Commons
Qi Lin, Hakwan Lau

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: May 10, 2024

Abstract Each of us perceives the world differently. What may underlie such individual differences in perception? Here, we characterize the lateral prefrontal cortex's (LPFC's) role in vision using computational models, with a specific focus on individual differences. Using a 7T fMRI dataset, we found that encoding models relating visual features extracted from a deep neural network to brain responses to natural images robustly predict responses in patches of LPFC. We then explored the representational structures and screened for images with high predicted responses, and we observed more substantial individual differences in the coding schemes of LPFC compared to visual regions. Computational modeling suggests that such amplified individual differences could result from a random projection between sensory and high-level regions, a mechanism thought to underlie flexible working memory. Our study demonstrates an under-appreciated role of LPFC in visual processing and suggests that idiosyncrasies in LPFC coding may shape how different individuals experience the world.

Language: English

Citations

1