Work Orders - Value from Structureless Text in the Era of Digitisation
- Erik Salo (University of Strathclyde) | David McMillan (University of Strathclyde) | Richard Connor (University of Stirling)
- Society of Petroleum Engineers
- SPE Offshore Europe Conference and Exhibition, 3-6 September, Aberdeen, UK
- Document Type
- Conference Paper
- 2019. Society of Petroleum Engineers
- text mining, active learning, digitisation, machine learning, work order
Free-text and handwritten reports are rapidly losing ground to digitization, yet many hours of effort are still lost across the industry to the manual creation and analysis of these data types. Work orders in particular contain valuable information, from failure rates to asset health, but their lack of structure makes them so difficult to analyse that many operators miss out on this value completely. This research challenges the current mainstream practice of manual work order analysis by presenting a methodology fit for today’s context of efficiency and digitization.
A prototype text mining tool for work order analysis was developed and tested in a user-oriented approach in cooperation with industrial partners. The final prototype combines classical machine learning methods, such as hierarchical clustering, with the operator’s expert knowledge, obtained via an active learning approach. A distance metric that is novel in this context was adapted from information-theoretic research to improve clustering performance.
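As a rough illustration of how these ingredients might fit together, the sketch below clusters a few short work order texts with average-linkage hierarchical clustering. It is not the authors' implementation: the Jensen-Shannon distance stands in for the information-theoretic metric, whose exact form is not given here, and the sample texts, the 0.6 threshold, and the helper names (`token_distribution`, `js_distance`, `average_linkage`, `most_uncertain_pair`) are all assumptions made for the example. The last helper mimics one possible active learning query by picking the pair of records whose distance lies closest to the merge threshold, which would then be shown to the operator for a same-or-different judgement.

```python
import math
from collections import Counter
from itertools import combinations


def token_distribution(text):
    """Normalised token frequencies of a lower-cased, whitespace-split text."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tok: n / total for tok, n in counts.items()}


def js_distance(p, q):
    """Jensen-Shannon distance (log base 2), bounded in [0, 1]."""
    div = 0.0
    for tok in set(p) | set(q):
        pi, qi = p.get(tok, 0.0), q.get(tok, 0.0)
        mi = 0.5 * (pi + qi)
        if pi > 0:
            div += 0.5 * pi * math.log2(pi / mi)
        if qi > 0:
            div += 0.5 * qi * math.log2(qi / mi)
    return math.sqrt(max(div, 0.0))


def average_linkage(texts, threshold):
    """Merge the closest clusters until the smallest average
    inter-cluster distance exceeds `threshold`."""
    dists = [token_distribution(t) for t in texts]
    d = {(i, j): js_distance(dists[i], dists[j])
         for i, j in combinations(range(len(texts)), 2)}
    clusters = [[i] for i in range(len(texts))]

    def cluster_distance(a, b):
        total = sum(d[min(i, j), max(i, j)] for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        (x, y), best = min(
            (((x, y), cluster_distance(clusters[x], clusters[y]))
             for x, y in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best > threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters, d


def most_uncertain_pair(d, threshold):
    """Hypothetical query rule: the pair whose distance is nearest the threshold."""
    return min(d, key=lambda pair: abs(d[pair] - threshold))


# Invented sample work orders, for illustration only.
orders = [
    "replace gearbox oil filter",
    "gearbox oil filter replaced",
    "pitch motor fault reset",
    "reset pitch motor fault",
]
clusters, d = average_linkage(orders, threshold=0.6)
query = most_uncertain_pair(d, threshold=0.6)
print(clusters)  # groups of record indices
print(query)     # pair of record indices to show to the expert
```

In a fuller loop, the expert's answer to each query would be fed back as a must-link or cannot-link constraint before re-clustering; that step is omitted here for brevity.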
When the prototype tool was used in a case study with real work order data, analytical effort for certain datasets was reduced by 90%, from two working weeks to a day. In addition, the active learning framework resulted in an approach that end users described as "practical" and "intuitive" during testing. An in-depth review of the uncertainty of the results, a key factor for implementation in a decision-making context, was also conducted.
The outcomes of this work showcase the potential of machine learning to drive the digitization not only of new installations but also of older assets, for which the large amount of unstructured historical data becomes an advantage rather than a hindrance. User testing results encourage a wider uptake of machine learning solutions in the industry, and particularly a shift towards more accessible in-house analytical capabilities.
|File Size||787 KB||Number of Pages||11|