Mathematics, Vol. 11, Pages 560: Column-Type Prediction for Web Tables Powered by Knowledge Base and Text

Mathematics doi: 10.3390/math11030560

Authors: Junyi Wu, Chen Ye, Haoshi Zhi, Shihao Jiang

Web tables are essential for applications such as data analysis. However, they are often incomplete and lack critical information, which makes their content hard to interpret. Automatically predicting column types for tables without metadata is therefore important for handling the wide variety of tables found on the Internet. This paper proposes CNN-Text, a method for this task that fuses CNN prediction with voting processes. We present data augmentation and synthetic column generation approaches to improve the CNN’s performance, and we use extracted text to improve the predictions. Experimental results show that CNN-Text outperforms the baseline methods, demonstrating that it is well suited to table column-type prediction.