PSG-Adapter: Controllable Planning Scene Graph for Improving Text-to-Image Diffusion

Yi Gao

Lecture notes in computer science, Journal Year: 2024, Volume and Issue: unknown, P. 205 - 221

Published: Dec. 7, 2024

Language: English

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

Junsong Chen, Chongjian Ge, Enze Xie, et al.

Lecture notes in computer science, Journal Year: 2024, Volume and Issue: unknown, P. 74 - 91

Published: Nov. 22, 2024

Language: English

Citations

13

LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models

Yaohui Wang, Xinyuan Chen, Xin Ma, et al.

International Journal of Computer Vision, Journal Year: 2024, Volume and Issue: unknown

Published: Dec. 23, 2024

Language: English

Citations

11

Distilling Diffusion Models Into Conditional GANs

Minguk Kang, Richard Zhang, Connelly Barnes, et al.

Lecture notes in computer science, Journal Year: 2024, Volume and Issue: unknown, P. 428 - 447

Published: Oct. 30, 2024

Language: English

Citations

4

Creatively Upscaling Images with Global-Regional Priors

Yurui Qian, Qi Cai, Yingwei Pan, et al.

International Journal of Computer Vision, Journal Year: 2025, Volume and Issue: unknown

Published: March 31, 2025

Language: English

Citations

0

A Survey of Reasoning with Foundation Models: Concepts, Methodologies, and Outlook

Jiankai Sun, Chuanyang Zheng, Enze Xie, et al.

ACM Computing Surveys, Journal Year: 2025, Volume and Issue: unknown

Published: April 11, 2025

Reasoning, a crucial ability for complex problem-solving, plays a pivotal role in various real-world settings such as negotiation, medical diagnosis, and criminal investigation. It serves as a fundamental methodology in the field of Artificial General Intelligence (AGI). With the ongoing development of foundation models, there is growing interest in exploring their abilities in reasoning tasks. In this paper, we introduce seminal foundation models proposed for or adaptable to reasoning, highlighting the latest advancements in reasoning tasks, methods, and benchmarks. We then delve into the potential future directions behind the emergence of reasoning abilities within foundation models. We also discuss the relevance of multimodal learning, autonomous agents, and super alignment in the context of reasoning. By discussing these future research directions, we hope to inspire researchers in their exploration of this field, stimulate further advancements in reasoning with foundation models, e.g. Large Language Models (LLMs), and contribute to the development of AGI.

Language: English

Citations

0

Paragraph-to-Image Generation with Information-Enriched Diffusion Model

Weijia Wu, Zhuang Li, Yefei He, et al.

International Journal of Computer Vision, Journal Year: 2025, Volume and Issue: unknown

Published: May 5, 2025

Language: English

Citations

0

Alfie: Democratising RGBA Image Generation with No $$$

Fabio Quattrini, Vittorio Pippi, Silvia Cascianelli, et al.

Lecture notes in computer science, Journal Year: 2025, Volume and Issue: unknown, P. 38 - 55

Published: Jan. 1, 2025

Language: English

Citations

0

Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities

Lorenzo Baraldi, Federico Cocchi, Marcella Cornia, et al.

Lecture notes in computer science, Journal Year: 2024, Volume and Issue: unknown, P. 199 - 216

Published: Nov. 20, 2024

Language: English

Citations

2

Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation

Shihao Zhao, Shaozhe Hao, Bojia Zi, et al.

Lecture notes in computer science, Journal Year: 2024, Volume and Issue: unknown, P. 70 - 86

Published: Oct. 31, 2024

Language: English

Citations

1

Merging and Splitting Diffusion Paths for Semantically Coherent Panoramas

Fabio Quattrini, Vittorio Pippi, Silvia Cascianelli, et al.

Lecture notes in computer science, Journal Year: 2024, Volume and Issue: unknown, P. 234 - 251

Published: Nov. 1, 2024

Language: English

Citations

1