Knowledge Graph Quality Management: a Comprehensive Survey DOI Creative Commons
Bingcong Xue, Lei Zou

IEEE Transactions on Knowledge and Data Engineering, Year: 2022, Number: unknown, pp. 1-1

Published: Jan. 1, 2022

As a powerful expression of human knowledge in a structural form, the knowledge graph (KG) has drawn great attention from both academia and industry, and a large number of construction and application technologies have been proposed. Large-scale knowledge graphs such as DBpedia, YAGO and Wikidata are published and widely used in various tasks. However, most of them are far from perfect and have many quality issues. For example, they may contain inaccurate or outdated entries and do not cover enough facts, which limits their credibility and further utility. Data quality has a long research history in the field of traditional relational data and recently attracts more knowledge graph experts. In this paper, we provide a systematic and comprehensive review of quality management on knowledge graphs, covering overall research topics about not only quality issues, dimensions and metrics, but also quality management processes from quality assessment and error detection to error correction and KG completion. We categorize existing works in terms of target goals and used methods for better understanding. In the end, we discuss some key issues and possible directions of knowledge graph quality management for further research.

Language: English

Graph neural networks: A review of methods and applications DOI Creative Commons
Jie Zhou, Ganqu Cui, Shengding Hu, et al.

AI Open, Year: 2020, Number: 1, pp. 57-81

Published: Jan. 1, 2020

Lots of learning tasks require dealing with graph data which contains rich relation information among elements. Modeling physics systems, learning molecular fingerprints, predicting protein interface, and classifying diseases demand a model to learn from graph inputs. In other domains such as learning from non-structural data like texts and images, reasoning on extracted structures (like the dependency trees of sentences and the scene graphs of images) is an important research topic which also needs graph reasoning models. Graph neural networks (GNNs) are neural models that capture the dependence of graphs via message passing between the nodes of graphs. In recent years, variants of GNNs such as graph convolutional network (GCN), graph attention network (GAT), and graph recurrent network (GRN) have demonstrated ground-breaking performances on many deep learning tasks. In this survey, we propose a general design pipeline for GNN models and discuss the variants of each component, systematically categorize the applications, and propose four open problems for future research.

Language: English

Cited by: 3825
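
The abstract above describes GNNs as models that capture graph dependence via message passing between nodes. A minimal sketch of one message-passing round is given here for orientation only. It assumes mean aggregation over each node's neighbours plus the node itself, with no learned weights; the function name `message_passing_step` and the toy graph are hypothetical and are not taken from the survey.

```python
def message_passing_step(adjacency, features):
    """Run one round of mean-aggregation message passing.

    adjacency: dict mapping each node to a list of neighbour nodes.
    features:  dict mapping each node to a list of floats.
    Returns a new dict of updated node features.
    """
    updated = {}
    for node, own in features.items():
        # Collect messages from neighbours, plus the node's own features.
        messages = [features[n] for n in adjacency.get(node, [])] + [own]
        # Aggregate by averaging each feature dimension.
        updated[node] = [
            sum(m[i] for m in messages) / len(messages)
            for i in range(len(own))
        ]
    return updated


# Toy undirected star graph: "a" is connected to "b" and "c".
graph = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
feats = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.0, 0.0]}
one_hop = message_passing_step(graph, feats)
```

Real GNN layers such as GCN or GAT additionally apply learned transformations and nonlinearities; this sketch only shows the neighbourhood aggregation step.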

Self-supervised Graph Learning for Recommendation DOI
Jiancan Wu, Xiang Wang, Fuli Feng, et al.

Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Year: 2021, Number: unknown, pp. 726-735

Published: July 11, 2021

Representation learning on the user-item graph for recommendation has evolved from using single ID or interaction history to exploiting higher-order neighbors. This leads to the success of graph convolution networks (GCNs) for recommendation such as PinSage and LightGCN. Despite effectiveness, we argue that they suffer from two limitations: (1) high-degree nodes exert larger impact on the representation learning, deteriorating the recommendations of low-degree (long-tail) items; and (2) representations are vulnerable to noisy interactions, as the neighborhood aggregation scheme further enlarges the impact of observed edges.

Language: English

Cited by: 941

Graph Contrastive Learning with Augmentations DOI Creative Commons

Yuning You, Tianlong Chen, Yongduo Sui, et al.

arXiv (Cornell University), Year: 2020, Number: unknown

Published: Jan. 1, 2020

Generalizable, transferrable, and robust representation learning on graph-structured data remains a challenge for current graph neural networks (GNNs). Unlike what has been developed for convolutional neural networks (CNNs) for image data, self-supervised learning and pre-training are less explored for GNNs. In this paper, we propose a graph contrastive learning (GraphCL) framework for learning unsupervised representations of graph data. We first design four types of graph augmentations to incorporate various priors. We then systematically study the impact of various combinations of graph augmentations on multiple datasets, in four different settings: semi-supervised, unsupervised, and transfer learning as well as adversarial attacks. The results show that, even without tuning augmentation extents nor using sophisticated GNN architectures, our GraphCL framework can produce graph representations of similar or better generalizability, transferrability, and robustness compared to state-of-the-art methods. We also investigate the impact of parameterized graph augmentation extents and patterns, and observe further performance gains in preliminary experiments. Our codes are available at https://github.com/Shen-Lab/GraphCL.

Language: English

Cited by: 756
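
The GraphCL abstract above describes building augmented views of a graph and contrasting their representations. The sketch below illustrates that general idea only, under stated assumptions: random edge dropping is used as one illustrative augmentation (the abstract does not name its four augmentation types), and the loss is a generic InfoNCE-style contrastive objective over cosine similarities. The names `drop_edges`, `cosine`, and `contrastive_loss` are hypothetical and are not the API of the linked repository.

```python
import math
import random


def drop_edges(edges, drop_prob, rng):
    """Return a copy of the edge list with each edge kept with prob. 1 - drop_prob."""
    return [edge for edge in edges if rng.random() >= drop_prob]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def contrastive_loss(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss: pull the positive view close, push negatives away."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Two augmented views of the same toy graph (seeded for reproducibility).
rng = random.Random(0)
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
view_a = drop_edges(edges, 0.2, rng)
view_b = drop_edges(edges, 0.2, rng)

# Hand-written embeddings stand in for a GNN encoder's output (assumption).
aligned = contrastive_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
misaligned = contrastive_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

In a full pipeline, each view would be encoded by a shared GNN and the loss would be computed over the resulting graph embeddings; here the embeddings are fixed vectors so that the loss behaviour is easy to inspect.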

Self-supervised Learning: Generative or Contrastive DOI
Xiao Liu, Fanjin Zhang, Zhenyu Hou, et al.

IEEE Transactions on Knowledge and Data Engineering, Year: 2021, Number: unknown, pp. 1-1

Published: Jan. 1, 2021

Deep supervised learning has achieved great success in the last decade. However, its deficiencies of dependence on manual labels and vulnerability to attacks have driven people to explore a better solution. As an alternative, self-supervised learning attracts many researchers for its soaring performance on representation learning in the last several years. Self-supervised representation learning leverages input data itself as supervision and benefits almost all types of downstream tasks. In this survey, we take a look into new self-supervised learning methods for representation in computer vision, natural language processing, and graph learning. We comprehensively review the existing empirical methods and summarize them into three main categories according to their objectives: generative, contrastive, and generative-contrastive (adversarial). We further investigate related theoretical analysis work to provide deeper thoughts on how self-supervised learning works. Finally, we briefly discuss open problems and future directions for self-supervised learning. An outline slide for the survey is provided.

Language: English

Cited by: 676

Molecular contrastive learning of representations via graph neural networks DOI
Yuyang Wang, Jianren Wang, Zhonglin Cao, et al.

Nature Machine Intelligence, Year: 2022, Number: 4(3), pp. 279-287

Published: Mar. 3, 2022

Language: English

Cited by: 459

Graph Self-Supervised Learning: A Survey DOI
Yixin Liu, Ming Jin, Shirui Pan, et al.

IEEE Transactions on Knowledge and Data Engineering, Year: 2022, Number: unknown, pp. 1-1

Published: Jan. 1, 2022

Deep learning on graphs has attracted significant interest recently. However, most of the works have focused on (semi-)supervised learning, resulting in shortcomings including heavy label reliance, poor generalization, and weak robustness. To address these issues, self-supervised learning (SSL), which extracts informative knowledge through well-designed pretext tasks without relying on manual labels, has become a promising and trending learning paradigm for graph data. Different from SSL on other domains like computer vision and natural language processing, SSL on graphs has an exclusive background, design ideas, and taxonomies. Under the umbrella of graph self-supervised learning, we present a timely and comprehensive review of the existing approaches which employ SSL techniques for graph data. We construct a unified framework that mathematically formalizes the paradigm of graph SSL. According to the objectives of pretext tasks, we divide these approaches into four categories: generation-based, auxiliary property-based, contrast-based, and hybrid approaches. We further describe the applications of graph SSL across various research fields and summarize the commonly used datasets, evaluation benchmark, performance comparison and open-source codes of graph SSL. Finally, we discuss the remaining challenges and potential future directions in this research field.

Language: English

Cited by: 364

Self-Supervised Graph Transformer on Large-Scale Molecular Data DOI Creative Commons
Yu Rong, Yatao Bian, Tingyang Xu, et al.

arXiv (Cornell University), Year: 2020, Number: unknown

Published: Jan. 1, 2020

How to obtain informative representations of molecules is a crucial prerequisite in AI-driven drug design and discovery. Recent researches abstract molecules as graphs and employ Graph Neural Networks (GNNs) for molecular representation learning. Nevertheless, two issues impede the usage of GNNs in real scenarios: (1) insufficient labeled molecules for supervised training; (2) poor generalization capability to new-synthesized molecules. To address them both, we propose a novel framework, GROVER, which stands for Graph Representation frOm self-superVised mEssage passing tRansformer. With carefully designed self-supervised tasks in node-, edge- and graph-level, GROVER can learn rich structural and semantic information of molecules from enormous unlabelled molecular data. Rather, to encode such complex information, GROVER integrates Message Passing Networks into the Transformer-style architecture to deliver a class of more expressive encoders of molecules. The flexibility of GROVER allows it to be trained efficiently on large-scale molecular dataset without requiring any supervision, thus being immunized to the two issues mentioned above. We pre-train GROVER with 100 million parameters on 10 million unlabelled molecules -- the biggest GNN and the largest training dataset in molecular representation learning. We then leverage the pre-trained GROVER for molecular property prediction followed by task-specific fine-tuning, where we observe a huge improvement (more than 6% on average) from current state-of-the-art methods on 11 challenging benchmarks. The insights we gained are that well-designed self-supervision losses and largely-expressive pre-trained models enjoy the significant potential on performance boosting.

Language: English

Cited by: 270

Self-Supervised Representation Learning: Introduction, advances, and challenges DOI
Linus Ericsson, Henry Gouk, Chen Change Loy, et al.

IEEE Signal Processing Magazine, Year: 2022, Number: 39(3), pp. 42-62

Published: May 1, 2022

Self-supervised representation learning methods aim to provide powerful deep feature learning without the requirement of large annotated datasets, thus alleviating the annotation bottleneck that is one of the main barriers to practical deployment of deep learning today. These methods have advanced rapidly in recent years, with their efficacy approaching and sometimes surpassing fully supervised pre-training alternatives across a variety of data modalities including image, video, sound, text and graphs. This article introduces this vibrant area including key concepts, the four main families of approach and associated state of the art, and how self-supervised methods are applied to diverse modalities of data. We further discuss practical considerations including workflows, representation transferability, and compute cost. Finally, we survey the major open challenges in the field that provide fertile ground for future work.

Language: English

Cited by: 258

ChemBERTa: Large-Scale Self-Supervised Pretraining for Molecular Property Prediction DOI Creative Commons
Seyone Chithrananda, Gabriel Grand, Bharath Ramsundar, et al.

arXiv (Cornell University), Year: 2020, Number: unknown

Published: Jan. 1, 2020

GNNs and chemical fingerprints are the predominant approaches to representing molecules for property prediction. However, in NLP, transformers have become the de-facto standard for representation learning thanks to their strong downstream task transfer. In parallel, the software ecosystem around transformers is maturing rapidly, with libraries like HuggingFace and BertViz enabling streamlined training and introspection. In this work, we make one of the first attempts to systematically evaluate transformers on molecular property prediction tasks via our ChemBERTa model. ChemBERTa scales well with pretraining dataset size, offering competitive downstream performance on MoleculeNet and useful attention-based visualization modalities. Our results suggest that transformers offer a promising avenue of future work for molecular representation learning and property prediction. To facilitate these efforts, we release a curated dataset of 77M SMILES from PubChem suitable for large-scale self-supervised pretraining.

Language: English

Cited by: 241

Autonomous Discovery in the Chemical Sciences Part II: Outlook DOI

Connor W. Coley, Natalie S. Eyke, Klavs F. Jensen, et al.

Angewandte Chemie International Edition, Year: 2019, Number: 59(52), pp. 23414-23436

Published: Sep. 25, 2019

This two-part review examines how automation has contributed to different aspects of discovery in the chemical sciences. In this second part, we reflect on a selection of exemplary studies. It is increasingly important to articulate what the role of automation and computation has been in the scientific process and how that has or has not accelerated discovery. One can argue that even the best automated systems have yet to "discover" despite being incredibly useful as laboratory assistants. We must carefully consider how they have been and can be applied to future problems of chemical discovery in order to effectively design and interact with future autonomous platforms. The majority of this article defines a large set of open research directions, including improving our ability to work with complex data, build empirical models, automate both physical and computational experiments for validation, select experiments, and evaluate whether we are making progress toward the ultimate goal of autonomous discovery. Addressing these practical and methodological challenges will greatly advance the extent to which autonomous systems can make meaningful discoveries.

Language: English

Cited by: 238