Submitted: 08 September 2023
Posted: 12 September 2023
Abstract
Keywords:
1. Introduction
2. Results and Discussion
2.1. Combination strategy using multiple index codes
2.2. Iterative connection strategy
2.3. Prediction of new improved AT-ATA mutants
2.4. Validation of MD simulation for predicting AT-ATA mutants
3. Materials and Methods
3.1. Aspergillus terreus dataset
3.2. Digital Signal Processing
3.3. Evaluation of modeling performance
3.4. Molecular dynamics simulation
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Sample Availability

| Index code | Index number | cvRMSE | R² |
|---|---|---|---|
| AURR980108 | 396 | 4.56 | 0.81 |
| OOBM770102 | 201 | 5.33 | 0.76 |
| MUNV940102 | 416 | 6.01 | 0.67 |
| CORJ870102 | 507 | 6.10 | 0.67 |
| GEOR030108 | 484 | 6.58 | 0.59 |

| Index numbers | cvRMSE | R² |
|---|---|---|
| 396 201 507 484 | 3.64 | 0.86 |
| 396 507 484 | 3.71 | 0.86 |
| 396 201 507 | 3.91 | 0.85 |
| 201 507 | 4.36 | 0.84 |
| 396 416 507 | 4.22 | 0.84 |
| 396 201 416 507 | 4.17 | 0.83 |
| 201 507 484 | 4.20 | 0.83 |
| 396 201 | 4.25 | 0.82 |
| 396 507 | 4.33 | 0.82 |
| 396 201 416 | 4.37 | 0.81 |
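
The cvRMSE and R2 columns in the two tables above summarize cross-validated model quality. The cross-validation scheme and the regression model are not stated here, so the following is only a minimal sketch, assuming leave-one-out cross-validation and a one-feature least-squares line as a stand-in model. All function names and the toy data are hypothetical and are not taken from the paper.

```python
import math


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (stand-in model, an assumption)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_predictions(xs, ys):
    """Predict each sample from a model trained on all other samples."""
    preds = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(slope * xs[i] + intercept)
    return preds


def cv_rmse(y_true, y_pred):
    """Root-mean-square error of the held-out predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy data (hypothetical, not taken from the paper).
features = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
half_lives = [7.1, 9.8, 15.2, 17.0, 22.4, 25.9]
held_out = loo_predictions(features, half_lives)
print(f"cvRMSE = {cv_rmse(half_lives, held_out):.2f}")
print(f"R2     = {r_squared(half_lives, held_out):.2f}")
```

Some studies report R2 as the squared Pearson correlation between measured and predicted values rather than as 1 - SS_res / SS_tot. The two definitions can give different numbers, so the sketch may not reproduce the values tabulated above.
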

| Variant | Mutations | Predicted T1/2 |
|---|---|---|
| P1 | Q97E_F115L_L118T_E133A_H210N_N245D_E253A_G292D | 107.59 |
| P2 | I77L_F115L_L118T_E133A_H210N_N245D_E253A | 101.25 |

| Mutations | T1/2 | Note |
|---|---|---|
| WT | 6.9 | Dataset 1/Dataset 2 |
| I77L | 20.1 | Dataset 1/Dataset 2 |
| Q97E | 16.5 | Dataset 1/Dataset 2 |
| F115L | 17.2 | Dataset 2 |
| L118T | 26.1 | Dataset 2 |
| E133A | 9.8 | Dataset 2 |
| H210N | 23.1 | Dataset 1/Dataset 2 |
| N245D | 14.8 | Dataset 1/Dataset 2 |
| E253A | 11.8 | Dataset 2 |
| G292D | 14.8 | Dataset 1/Dataset 2 |
| I295V | 9.3 | Dataset 1/Dataset 2 |
| F115L_L118T | 65.9 | Dataset 2 |
| I77L_H210N | 42.2 | Dataset 1/Dataset 2 |
| Q97E_H210N | 30.6 | Dataset 1/Dataset 2 |
| H210N_N245D | 18.4 | Dataset 1/Dataset 2 |
| H210N_G292D | 33.6 | Dataset 1/Dataset 2 |
| I77L_Q97E_H210N | 31 | Dataset 1/Dataset 2 |
| I77L_H210N_G292D | 16.7 | Dataset 1/Dataset 2 |
| I77L_Q97E_H210N_N245D | 14.4 | Dataset 2 |
| I77L_H210N_N245D_G292D | 16.3 | Dataset 2 |
| I77L_Q97E_H210N_N245D_G292D | 8.7 | Dataset 2 |
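
The Mutations columns above use the common notation of wild-type residue, position, and mutant residue (for example, I77L), with underscores joining multiple substitutions and WT marking the unmutated enzyme. As a minimal sketch, assuming 1-based positions, such labels could be parsed and applied to a sequence as follows. The helper names `parse_variant` and `apply_mutations` are hypothetical, and the toy sequence is not the AT-ATA sequence.

```python
import re

# One substitution: wild-type letter, 1-based position, mutant letter.
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_variant(label):
    """Split a label such as 'I77L_H210N' into (wild_type, position, mutant) tuples."""
    if label == "WT":
        return []
    mutations = []
    for token in label.split("_"):
        match = MUTATION.match(token)
        if match is None:
            raise ValueError(f"unrecognized mutation token: {token!r}")
        wild_type, position, mutant = match.groups()
        mutations.append((wild_type, int(position), mutant))
    return mutations


def apply_mutations(sequence, mutations):
    """Return the mutated sequence, checking each wild-type residue first."""
    residues = list(sequence)
    for wild_type, position, mutant in mutations:
        if residues[position - 1] != wild_type:
            raise ValueError(f"expected {wild_type} at position {position}")
        residues[position - 1] = mutant
    return "".join(residues)


print(parse_variant("I77L_Q97E_H210N"))
# Toy sequence (hypothetical), used only to show the substitution step.
print(apply_mutations("MAIKL", parse_variant("I3V_L5T")))
```

Checking the wild-type residue before substituting catches numbering offsets between a label and the sequence it is applied to.
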

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).