Cited by PSG-Adapter: Controllable Planning Scene Graph for Improving Text-to-Image Diffusion

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

Jun Song Chen, Chongjian Ge, Enze Xie, et al.

Lecture Notes in Computer Science, Year: 2024, Issue: unknown, pp. 74-91

Published: Nov. 22, 2024

Language: English

LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models

Yaohui Wang, Xinyuan Chen, Xin Ma, et al.

International Journal of Computer Vision, Year: 2024, Issue: unknown

Published: Dec. 23, 2024

Language: English

Distilling Diffusion Models Into Conditional GANs

Minguk Kang, Richard Zhang, Connelly Barnes, et al.

Lecture Notes in Computer Science, Year: 2024, Issue: unknown, pp. 428-447

Published: Oct. 30, 2024

Language: English

Creatively Upscaling Images with Global-Regional Priors

Yurui Qian, Qi Cai, Yingwei Pan, et al.

International Journal of Computer Vision, Year: 2025, Issue: unknown

Published: March 31, 2025

Language: English

A Survey of Reasoning with Foundation Models: Concepts, Methodologies, and Outlook

Junkui Sun, Chuanyang Zheng, Enze Xie, et al.

ACM Computing Surveys, Year: 2025, Issue: unknown

Published: April 11, 2025

Reasoning, a crucial ability for complex problem-solving, plays a pivotal role in various real-world settings such as negotiation, medical diagnosis, and criminal investigation. It serves as a fundamental methodology in the field of Artificial General Intelligence (AGI). With the ongoing development of foundation models, there is growing interest in exploring their abilities in reasoning tasks. In this paper, we introduce seminal foundation models proposed or adaptable for reasoning, highlighting the latest advancements in reasoning tasks, methods, and benchmarks. We then delve into the potential future directions behind the emergence of reasoning abilities within foundation models. We also discuss the relevance of multimodal learning, autonomous agents, and super alignment in the context of reasoning. By discussing these future research directions, we hope to inspire researchers in their exploration of this field, stimulate further advancements in reasoning with foundation models, e.g. Large Language Models (LLMs), and contribute to the development of AGI.

Language: English

Paragraph-to-Image Generation with Information-Enriched Diffusion Model

Weijia Wu, Zhuang Li, Yefei He, et al.

International Journal of Computer Vision, Year: 2025, Issue: unknown

Published: May 5, 2025

Language: English

Alfie: Democratising RGBA Image Generation with No $$$

Fabio Quattrini, Vittorio Pippi, Silvia Cascianelli, et al.

Lecture Notes in Computer Science, Year: 2025, Issue: unknown, pp. 38-55

Published: Jan. 1, 2025

Language: English

Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities

Lorenzo Baraldi, Federico Cocchi, Marcella Cornia, et al.

Lecture Notes in Computer Science, Year: 2024, Issue: unknown, pp. 199-216

Published: Nov. 20, 2024

Language: English

Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation

Shihao Zhao, Shaozhe Hao, Bojia Zi, et al.

Lecture Notes in Computer Science, Year: 2024, Issue: unknown, pp. 70-86

Published: Oct. 31, 2024

Language: English

Merging and Splitting Diffusion Paths for Semantically Coherent Panoramas

Fabio Quattrini, Vittorio Pippi, Silvia Cascianelli, et al.

Lecture Notes in Computer Science, Year: 2024, Issue: unknown, pp. 234-251

Published: Nov. 1, 2024

Language: English
