no code implementations • 26 Oct 2023 • Zi Lin, Zihan Wang, Yongqi Tong, Yangkun Wang, Yuxin Guo, Yujia Wang, Jingbo Shang
This benchmark contains the rich, nuanced phenomena that can be tricky for current toxicity detection models to identify, revealing a significant domain difference compared to social media content.
no code implementations • 24 May 2023 • Prashant Krishnan, Zilong Wang, Yangkun Wang, Jingbo Shang
Recent advances of incorporating layout information, typically bounding box coordinates, into pre-trained language models have achieved significant performance in entity recognition from document images.
no code implementations • 25 Dec 2022 • Jiarui Jin, Yangkun Wang, Weinan Zhang, Quan Gan, Xiang Song, Yong Yu, Zheng Zhang, David Wipf
However, existing methods lack elaborate design regarding the distinctions between two tasks that have been frequently overlooked: (i) edges only constitute the topology in the node classification task but can be used as both the topology and the supervisions (i. e., labels) in the edge prediction task; (ii) the node classification makes prediction over each individual node, while the edge prediction is determinated by each pair of nodes.
1 code implementation • 14 Jun 2022 • Kounianhua Du, Weinan Zhang, Ruiwen Zhou, Yangkun Wang, Xilong Zhao, Jiarui Jin, Quan Gan, Zheng Zhang, David Wipf
Prediction over tabular data is an essential and fundamental problem in many important downstream tasks.
no code implementations • 12 Nov 2021 • Yongyi Yang, Tang Liu, Yangkun Wang, Zengfeng Huang, David Wipf
It has been observed that graph neural networks (GNN) sometimes struggle to maintain a healthy balance between the efficient modeling long-range dependencies across nodes while avoiding unintended consequences such oversmoothed node representations or sensitivity to spurious edges.
1 code implementation • 26 Oct 2021 • Jiuhai Chen, Jonas Mueller, Vassilis N. Ioannidis, Soji Adeshina, Yangkun Wang, Tom Goldstein, David Wipf
For supervised learning with tabular data, decision tree ensembles produced via boosting techniques generally dominate real-world applications involving iid training/test sets.
no code implementations • ICLR 2022 • Yangkun Wang, Jiarui Jin, Weinan Zhang, Yongyi Yang, Jiuhai Chen, Quan Gan, Yong Yu, Zheng Zhang, Zengfeng Huang, David Wipf
In this regard, it has recently been proposed to use a randomly-selected portion of the training labels as GNN inputs, concatenated with the original node features for making predictions on the remaining labels.
no code implementations • ICLR 2022 • Jiuhai Chen, Jonas Mueller, Vassilis N. Ioannidis, Soji Adeshina, Yangkun Wang, Tom Goldstein, David Wipf
Many practical modeling tasks require making predictions using tabular data composed of heterogeneous feature types (e. g., text-based, categorical, continuous, etc.).
no code implementations • ICLR 2022 • Jiarui Jin, Yangkun Wang, Kounianhua Du, Weinan Zhang, Zheng Zhang, David Wipf, Yong Yu, Quan Gan
Prevailing methods for relation prediction in heterogeneous graphs aim at learning latent representations (i. e., embeddings) of observed nodes and relations, and thus are limited to the transductive setting where the relation types must be known during training.
2 code implementations • 24 Mar 2021 • Yangkun Wang, Jiarui Jin, Weinan Zhang, Yong Yu, Zheng Zhang, David Wipf
Over the past few years, graph neural networks (GNN) and label propagation-based methods have made significant progress in addressing node classification tasks on graphs.
Ranked #1 on Node Property Prediction on ogbn-proteins
1 code implementation • 10 Mar 2021 • Yongyi Yang, Tang Liu, Yangkun Wang, Jinjing Zhou, Quan Gan, Zhewei Wei, Zheng Zhang, Zengfeng Huang, David Wipf
Despite the recent success of graph neural networks (GNN), common architectures often exhibit significant limitations, including sensitivity to oversmoothing, long-range dependencies, and spurious edges, e. g., as can occur as a result of graph heterophily or adversarial attacks.