Model Identification with Fuzzy-Optimisation Techniques in Geological Data Mining
- Jose Finol (PDVSA Intevep) | Saul Buitrago (PDVSA Intevep)
- Society of Petroleum Engineers
- European Petroleum Conference, 29-31 October, Aberdeen, United Kingdom
- Document Type: Conference Paper
- 2002. Society of Petroleum Engineers
- 4.1.2 Separation and Treating, 7.6.6 Artificial Intelligence, 6.1.5 Human Resources, Competence and Training, 4.1.5 Processing Equipment, 5.1.5 Geologic Modeling, 1.6.9 Coring, Fishing, 5.6.1 Open hole/cased hole log analysis, 7.6.4 Data Mining
Zadeh's proposal to model the mechanism of human thinking with linguistic values rather than ordinary (crisp) numbers led to the introduction of fuzziness into statistical and dynamical modelling and to the development of a new class of systems called fuzzy models. In fuzzy modelling, one of the most important problems is the identification of a predictive model from a set of measurements. Similarly, data mining aims to sift through large databases, data repositories or data warehouses in order to discover interesting knowledge such as pattern associations, trends, and significant structures. Many similarities can therefore be found between data mining and fuzzy model identification, especially when one has to deal with imprecision and noise in very large data sets. Two of these similarities correspond to the inverse problems of variable selection and parameter identification, which must be solved in order to select the most appropriate predictive model for a set of observed (training) data.
In this paper we introduce a data mining procedure that uses global optimisation methods to search for useful features, pattern relations, and functional representations, in order to derive fuzzy models from large data sets that characterise the unknown functions as precisely as possible. The effectiveness of the proposed data mining method is demonstrated using petrophysical data (core plug measurements) from two oil wells in the Maracaibo Basin.
Data mining has been popularly treated as a synonym of knowledge discovery in databases (KDD), although some researchers view data mining as an essential step of knowledge discovery. In general, a knowledge discovery process consists of a loop over the following four steps:
Data cleaning: handles noisy, erroneous, missing or irrelevant data.
Data selection: retrieves the data relevant to the modelling task from the database.
Data transformation: transforms or consolidates the data into forms appropriate for mining by performing re-scaling or aggregation operations.
Data mining: the essential process, in which intelligent algorithms are applied to extract patterns or models from the data.
The data cleaning, data selection and data transformation processes are normally integrated into a single (possibly iterative) process called data preprocessing, which aims to select adequate variables for the construction or extraction of the model. Data preprocessing is similar to the variable selection step in the fuzzy modelling process.
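The three preprocessing steps can be sketched as a small pipeline. This is only an illustrative sketch, not the procedure used in the paper: the field names (`depth`, `porosity`, `permeability`), the synthetic records, and the choice of min-max re-scaling are all assumptions made for the example.

```python
from typing import Optional

# A record maps a (hypothetical) field name to a measurement, or None if missing.
Record = dict[str, Optional[float]]


def clean(records: list[Record], fields: list[str]) -> list[Record]:
    """Data cleaning: drop records with a missing value in any required field."""
    return [r for r in records if all(r.get(f) is not None for f in fields)]


def select(records: list[Record], fields: list[str]) -> list[Record]:
    """Data selection: keep only the fields relevant to the modelling task."""
    return [{f: r[f] for f in fields} for r in records]


def rescale(records: list[Record], fields: list[str]) -> list[Record]:
    """Data transformation: min-max re-scaling of each field to [0, 1]."""
    if not records:
        return []
    out = [dict(r) for r in records]
    for f in fields:
        values = [r[f] for r in records]
        lo, hi = min(values), max(values)
        span = hi - lo
        for r in out:
            # A constant field carries no information; map it to 0.0.
            r[f] = (r[f] - lo) / span if span else 0.0
    return out


def preprocess(records: list[Record], fields: list[str]) -> list[Record]:
    """Single preprocessing pass: cleaning, then selection, then transformation."""
    return rescale(select(clean(records, fields), fields), fields)


# Synthetic core-plug-like records, invented purely for illustration.
raw = [
    {"depth": 1000.0, "porosity": 0.10, "permeability": 5.0},
    {"depth": 1001.0, "porosity": None, "permeability": 12.0},  # missing value
    {"depth": 1002.0, "porosity": 0.20, "permeability": 50.0},
    {"depth": 1003.0, "porosity": 0.15, "permeability": 20.0},
]
prepared = preprocess(raw, ["porosity", "permeability"])
```

In an iterative setting, `preprocess` would be called again with a different list of fields whenever the mining step suggests that another set of variables should be tried.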
On the other hand, the data mining process may accomplish one or more of the following tasks:
Clustering: Clustering essentially deals with the task of splitting a finite set of data points into a number of classes (clusters) with respect to a similarity measure. The similarity measure used in the clustering task has an important effect on the size, shape and orientation of the clusters. A good clustering method produces high quality clusters, ensuring that data points within the same cluster are similar and data points in different clusters are as dissimilar as possible. For example, one may cluster a set of well log responses to distinguish groups with different sedimentary characteristics within a rock unit [2, 3].
Classification: Classification analyses a set of training data (a set of objects whose class labels are known) and constructs a model for each class based on the features of the data. A decision tree or a set of classification rules is generated by such a classification process, which can be used for better understanding of each class in the database and for the classification of future data. For example, one may characterize sedimentary facies with the help of gamma ray, bulk density and neutron porosity logs, and predict these electrofacies based on the same log characteristics in uncored wells [4, 5].
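As a concrete illustration of clustering with respect to a similarity measure, the sketch below uses plain k-means with Euclidean distance. This is an assumption chosen for simplicity, not the clustering method of the paper or of the cited references, and the two-dimensional points stand in for re-scaled well log responses that were invented for the example.

```python
import math

Point = tuple[float, ...]


def kmeans(points: list[Point], centres: list[Point], iterations: int = 20):
    """Plain k-means: Euclidean distance is the (dis)similarity measure.

    `centres` holds the initial cluster centres; passing them explicitly
    keeps the example deterministic.
    """
    centres = [tuple(c) for c in centres]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centre.
        labels = [
            min(range(len(centres)), key=lambda j: math.dist(p, centres[j]))
            for p in points
        ]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for j, centre in enumerate(centres):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centres.append(
                    tuple(sum(xs) / len(members) for xs in zip(*members))
                )
            else:
                new_centres.append(centre)  # keep an empty cluster's centre
        if new_centres == centres:
            break  # converged: assignments can no longer change
        centres = new_centres
    return labels, centres


# Synthetic, re-scaled responses of two hypothetical logs (illustrative only).
responses = [
    (0.10, 0.20), (0.15, 0.25), (0.20, 0.10),
    (0.80, 0.90), (0.85, 0.80), (0.90, 0.95),
]
labels, centres = kmeans(responses, centres=[responses[0], responses[3]])
```

Replacing `math.dist` with another distance function changes the size, shape and orientation of the resulting clusters, which is the sensitivity to the similarity measure described above.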
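The classification task can be illustrated with a very small decision tree learner. This is a minimal sketch that assumes Gini impurity as the split criterion, which the text does not specify; the log values and the facies labels `"sand"` and `"shale"` are synthetic and hypothetical, standing in for cored-well training data.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return the (feature index, threshold) that most reduces the weighted
    Gini impurity, or None if no split improves on the unsplit node."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between adjacent values
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels):
    """Recursively build a tree; a leaf is a class label, an internal node is
    a tuple (feature index, threshold, left subtree, right subtree)."""
    split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # majority-class leaf
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (
        f,
        t,
        build_tree([r for r, _ in left], [y for _, y in left]),
        build_tree([r for r, _ in right], [y for _, y in right]),
    )


def predict(tree, row):
    """Follow the tree from the root down to a leaf label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


# Synthetic training data: (gamma ray, bulk density, neutron porosity).
train_rows = [
    (30, 2.30, 0.25), (35, 2.35, 0.22), (40, 2.32, 0.24),
    (110, 2.55, 0.35), (120, 2.60, 0.38), (105, 2.50, 0.33),
]
train_labels = ["sand", "sand", "sand", "shale", "shale", "shale"]

tree = build_tree(train_rows, train_labels)
facies = predict(tree, (45, 2.33, 0.23))  # a reading from an "uncored" well
```

Each internal node of the tree can be read as a classification rule of the form "if log `f` is at most `t`, go left", so the learned model describes each class as well as predicting it.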