  • MODELLING THE RELATIONSHIP BETWEEN SATURATED OXYGEN AND DIATOMS’ ABUNDANCE USING WEIGHTED PATTERN TREES WITH ALGEBRAIC OPERATORS

    Mathematical Modeling, Vol. 2 (2018), Issue 4, pg(s) 170-172

    Machine learning has been used in many disciplines to reveal important patterns in data. One research discipline that benefits from these methods is eco-informatics. This branch of applied computer science uses computer algorithms to discover the impact of environmental stress factors on organisms’ abundance. Decision-tree methods are particularly interesting to computer scientists and ecologists alike, because they produce an easily interpretable structure that can be read without practical knowledge of mathematics or of the inner workings of the algorithm. These methods do not rely only on classical sets; many of them use fuzzy set theory to address problems such as overfitting and sensitivity to changes in the data, and to improve prediction accuracy. In this direction, this paper aims to discover the influence of one particular environmental stress factor (Saturated Oxygen) on real measured data containing information about the diatoms’ abundance in Lake Prespa, Macedonia, using the weighted pattern tree (WPT) algorithm. WPT is a decision-tree variant that combines fuzzy set theory concepts, such as similarity metrics, fuzzy membership functions and aggregation operators, to achieve better prediction accuracy, improve interpretability and increase resistance to overfitting compared to classical decision trees. In this study, we use Algebraic operators for aggregation. One WPT model is presented in this paper to relate the saturated oxygen parameter to the diatoms’ abundance and to reveal which diatoms can be used to indicate a certain water quality class (WQC). The obtained results are verified against existing knowledge found in the literature.
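The fuzzy building blocks named in the abstract above (membership functions, algebraic aggregation operators, a similarity metric) can be illustrated with a minimal sketch. It assumes triangular membership functions, the algebraic product and algebraic (probabilistic) sum as the AND/OR pair, and an RMSE-based similarity; the function names and these specific choices are illustrative assumptions, not the paper's implementation.

```python
import math

def triangular(x, a, b, c):
    """Triangular fuzzy membership with feet a, c and peak b (assumed shape)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def algebraic_and(u, v):
    """Algebraic product t-norm: fuzzy AND of two membership degrees."""
    return u * v

def algebraic_or(u, v):
    """Algebraic (probabilistic) sum t-conorm: fuzzy OR of two degrees."""
    return u + v - u * v

def similarity(pred, target):
    """Assumed similarity: 1 - RMSE between two membership vectors in [0, 1]."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    return 1.0 - math.sqrt(mse)

# Hypothetical usage: combine two fuzzified attributes and score the result
# against a target class membership vector.
low_oxygen = [triangular(x, 0.0, 5.0, 10.0) for x in (2.5, 5.0, 9.0)]
high_abundance = [0.4, 0.9, 0.1]
candidate = [algebraic_and(u, v) for u, v in zip(low_oxygen, high_abundance)]
score = similarity(candidate, [0.2, 0.9, 0.0])
```

A tree-induction procedure would typically compare such similarity scores across candidate aggregations and keep the best one; that search loop is omitted here.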

  • CLASSIFICATION OF PROTEIN STRUCTURES BY USING FUZZY KNN CLASSIFIER AND PROTEIN VOXEL-BASED DESCRIPTOR

    Mathematical Modeling, Vol. 2 (2018), Issue 3, pg(s) 116-118

    Protein classification is among the main themes in bioinformatics because it helps in understanding protein molecules. By classifying protein structures, the evolutionary relations between them can be discovered. Knowledge of protein structures and the functions they might have could be used to regulate processes in organisms, which is done by developing medications for different diseases. The literature offers a plethora of methods for protein classification, including manual, automatic and semiautomatic methods. The manual methods are considered precise, but their main problem is that they are time-consuming, so a large number of protein structures remain unclassified. Therefore, researchers work intensively on developing methods that classify protein structures automatically with acceptable precision. In this paper, we propose an approach for classifying protein structures. Our protein voxel-based descriptor is used to describe the features of protein structures. To classify unclassified protein structures, we use a k nearest neighbors classifier based on fuzzy logic. For evaluation, we use the known classification of protein structures in the SCOP database. We provide some results from the evaluation of our approach. The results show that the proposed approach provides accurate classification of protein structures with reasonable speed.
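As an illustration of a fuzzy k nearest neighbors classifier, the sketch below follows a common formulation in which each of the k nearest neighbors votes for its class with weight 1/d^(2/(m-1)), and the normalized votes are read as class membership degrees. The abstract does not specify the variant used, so the weighting scheme, the fuzzifier `m`, the Euclidean distance, and the toy feature vectors standing in for descriptors are all assumptions.

```python
import math
from collections import defaultdict

def fuzzy_knn(query, samples, labels, k=3, m=2.0, eps=1e-9):
    """Return {class label: membership degree} for the query vector.

    Assumes crisp training labels and inverse-distance weights with
    fuzzifier m > 1; eps guards against division by zero.
    """
    # Distances from the query to every labeled sample, nearest first.
    neighbors = sorted(
        ((math.dist(query, s), lab) for s, lab in zip(samples, labels)),
        key=lambda pair: pair[0],
    )[:k]
    votes = defaultdict(float)
    total = 0.0
    for d, lab in neighbors:
        w = 1.0 / (max(d, eps) ** (2.0 / (m - 1.0)))
        votes[lab] += w
        total += w
    # Normalize so the memberships over all classes sum to 1.
    return {lab: w / total for lab, w in votes.items()}

# Hypothetical usage with 2-D stand-ins for descriptor vectors.
samples = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels = ["fold_a", "fold_a", "fold_b", "fold_b"]
memberships = fuzzy_knn((0.0, 0.5), samples, labels, k=3)
predicted = max(memberships, key=memberships.get)
```

Unlike a crisp k-NN vote, the returned membership degrees indicate how confidently the query belongs to each class, which can be used to flag borderline structures for manual review.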