A Collaborative AI-enabled Pretrained Language Model for AIoT Domain Question Answering
Published in IEEE Transactions on Industrial Informatics, 2021
Recommended citation: Zhu, H., Tiwari, P., Ghoneim, A., & Hossain, M. S. (2021). A Collaborative AI-enabled Pretrained Language Model for AIoT Domain Question Answering. IEEE Transactions on Industrial Informatics. https://ieeexplore.ieee.org/abstract/document/9484781
Large-scale knowledge in the Artificial Intelligence of Things (AIoT) field urgently needs effective models that understand human language and automatically answer questions. Pre-trained language models (PLMs) achieve state-of-the-art performance on some question answering (QA) datasets, but few models can answer questions on AIoT domain knowledge, since the AIoT domain currently lacks both sufficient QA datasets and large-scale pre-training corpora. To address the lack of high-quality, large-scale labeled AIoT QA data, we propose BERT-AIoT and RoBERTa-AIoT: we construct an AIoT corpus of AIoT-oriented Wikipedia webpages and use it to further pre-train BERT and RoBERTa. This unsupervised pre-training lets the models learn more domain-specific context and improves performance on AIoT QA tasks. To fine-tune and evaluate the models, we construct three AIoT QA datasets from community QA websites. Experimental results on these datasets demonstrate the significant improvements achieved by our approach.
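The continued pre-training the abstract describes relies on BERT's masked-language-modeling (MLM) objective over the domain corpus. As a minimal sketch (not the paper's code), the standard BERT/RoBERTa masking scheme selects roughly 15% of positions; of those, 80% become `[MASK]`, 10% become a random vocabulary token, and 10% stay unchanged. All function and variable names here are illustrative:

```python
import random

def mask_tokens(tokens, vocab, mask_token="[MASK]", mask_prob=0.15, seed=0):
    """BERT-style MLM masking (illustrative sketch, not the paper's code).

    Selects ~mask_prob of positions; of those, 80% are replaced with
    mask_token, 10% with a random vocabulary token, 10% left unchanged.
    Returns (masked_tokens, labels): labels holds the original token at
    selected positions and None elsewhere (ignored by the MLM loss).
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)           # model must predict the original token
            r = rng.random()
            if r < 0.8:
                masked.append(mask_token)        # 80%: replace with [MASK]
            elif r < 0.9:
                masked.append(rng.choice(vocab)) # 10%: random token
            else:
                masked.append(tok)               # 10%: keep unchanged
        else:
            labels.append(None)          # position not selected: no loss
            masked.append(tok)
    return masked, labels

# Example on a toy AIoT-flavored token sequence (vocabulary is illustrative)
tokens = ["aiot", "devices", "stream", "sensor", "data"] * 20
vocab = ["aiot", "devices", "stream", "sensor", "data", "edge"]
masked, labels = mask_tokens(tokens, vocab, seed=0)
```

In practice this masking is applied on the fly to batches of the AIoT Wikipedia corpus, and the model is trained to recover the original tokens at the selected positions, which is how the domain-specific context is absorbed before QA fine-tuning.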