## Identifying N$$^6$$-methyladenosine sites using extreme gradient boosting system optimized by particle swarm optimizer. (English) Zbl 1409.92184

Summary: N$$^6$$-methyladenosine (m$$^6$$A) is one of the most important RNA modifications, playing roles in processes ranging from splicing, mRNA export, and mRNA stability to cell differentiation. Because m$$^6$$A is widely distributed in genes, identifying m$$^6$$A sites in RNA sequences is of significant importance for basic biomedical research and drug development. High-throughput laboratory methods are time-consuming and costly, so effective computational methods are highly desirable for their convenience and speed. In this article, we propose a new method that improves m$$^6$$A site prediction by combining deep features and original features with extreme gradient boosting optimized by particle swarm optimization (PXGB). The proposed PXGB algorithm uses three kinds of features: position-specific nucleotide propensity (PSNP), position-specific dinucleotide propensity (PSDP), and the traditional nucleotide composition (NC). Under 10-fold cross-validation, PXGB achieved an AUC of 0.8390 and an MCC of 0.5234. PXGB was also compared with existing methods, and its higher MCC and AUC demonstrated that it is effective at predicting m$$^6$$A sites. The predictor proposed in this study might help to predict more m$$^6$$A sites and guide related experimental validation.
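The summary names the feature types and the MCC metric but does not define them. As a rough illustration only, the sketch below computes nucleotide composition (NC), a position-specific nucleotide propensity (PSNP) encoding, and the Matthews correlation coefficient. It assumes a commonly used PSNP definition (per-position nucleotide frequency in positive samples minus that in negative samples), which may differ from the one used in the paper. All function names and the toy sequences are hypothetical, and the XGBoost and particle swarm steps are not shown.

```python
from collections import Counter
from math import sqrt

NUCLEOTIDES = "ACGU"


def nucleotide_composition(seq):
    """Return the fraction of A, C, G and U in seq (NC features)."""
    counts = Counter(seq)
    return [counts[n] / len(seq) for n in NUCLEOTIDES]


def position_frequencies(seqs):
    """Per-position nucleotide frequencies for equal-length sequences."""
    length = len(seqs[0])
    freqs = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        freqs.append({n: counts[n] / len(seqs) for n in NUCLEOTIDES})
    return freqs


def psnp_table(positives, negatives):
    """Assumed PSNP definition: positive-set frequency minus
    negative-set frequency, for every position and nucleotide."""
    pos = position_frequencies(positives)
    neg = position_frequencies(negatives)
    return [{n: p[n] - q[n] for n in NUCLEOTIDES} for p, q in zip(pos, neg)]


def psnp_encode(seq, table):
    """Encode a sequence as one propensity value per position."""
    return [table[i][n] for i, n in enumerate(seq)]


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy data (hypothetical, not from the paper): 5-nt windows centred on A.
positives = ["GGACU", "AGACA"]
negatives = ["CUACG", "UCAGU"]

table = psnp_table(positives, negatives)
features = nucleotide_composition("GGACU") + psnp_encode("GGACU", table)
```

In a pipeline like the one the summary describes, a feature vector of this kind would be passed to the classifier, and MCC would be computed from the confusion matrix of each cross-validation fold.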

### MSC:

- 92D20 Protein sequences, DNA sequences
- 90C59 Approximation methods and heuristics in mathematical programming