Ontology-based semi-supervised conditional random fields for automated information extraction from bridge inspection reports

Kaijian Liu, Nora El-Gohary

Research output: Contribution to journal › Article › peer-review

121 Scopus citations

Abstract

A large amount of detailed data about bridge conditions and maintenance actions is buried in bridge inspection reports and goes unused. Information extraction and data analytics open opportunities to leverage this wealth of data for improved bridge deterioration prediction and enhanced maintenance decision making. This paper proposes a novel ontology-based information extraction methodology, built on semi-supervised conditional random fields (CRF), for extracting information entities describing existing deficiencies and performed maintenance actions from bridge inspection reports. The ontology facilitates the analysis of the text based on content and domain-specific meaning. The proposed semi-supervised CRF simultaneously captures the dependency structures as well as the distributions of labeled and unlabeled data in a concave machine-learning function. It learns from a small, fixed set of labeled data and, at the same time, dynamically adapts itself to unseen instances by further learning from a large set of unlabeled data, which reduces human effort while maintaining high performance. The proposed algorithm achieved an average precision, recall, and F-1 measure of 94.1%, 87.7%, and 90.7%, respectively.
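
As a reading aid for the reported figures, the F-1 measure is the harmonic mean of precision and recall. The minimal sketch below applies that standard formula to the averages quoted above; the function and variable names are illustrative and do not come from the paper. Computed directly from the averaged precision and recall, the result is about 90.8%, slightly above the reported 90.7%. One plausible explanation, which is an assumption and not something the abstract states, is that the reported F-1 is itself an average of per-entity F-1 scores, and such an average need not equal the F-1 of the averaged precision and recall.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # convention: F-1 is 0 when both inputs are 0
    return 2 * precision * recall / (precision + recall)


# Averages quoted in the abstract, converted from percentages to fractions.
avg_precision = 0.941
avg_recall = 0.877

# F-1 computed from the averaged values (not necessarily the paper's averaging scheme).
f1_from_averages = f1_score(avg_precision, avg_recall)
print(f"F-1 from averaged precision and recall: {f1_from_averages:.3f}")
```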

Original language: English
Pages (from-to): 313-327
Number of pages: 15
Journal: Automation in Construction
Volume: 81
State: Published - Sep 2017

Keywords

  • Bridges
  • Conditional random fields
  • Deterioration prediction
  • Information extraction
  • Maintenance decision making
  • Ontology
  • Semi-supervised machine learning
