Browsing by Author "Norambuena, T."
Now showing 1 - 8 of 8
Results Per Page
Sort Options
- ItemAll-atom knowledge-based potential for RNA structure prediction and assessment(2011) Capriotti, E.; Norambuena, T.; Marti Renom, M. A.; Melo Ledermann, Francisco JavierMotivation: Over the recent years, the vision that RNA simply serves as information transfer molecule has dramatically changed. The study of the sequence/structure/function relationships in RNA is becoming more important. As a direct consequence, the total number of experimentally solved RNA structures has dramatically increased and new computer tools for predicting RNA structure from sequence are rapidly emerging. Therefore, new and accurate methods for assessing the accuracy of RNA structure models are clearly needed. Results: Here, we introduce an all-atom knowledge-based potential for the assessment of RNA three-dimensional (3D) structures. We have benchmarked our new potential, called Ribonucleic Acids Statistical Potential (RASP), with two different decoy datasets composed of near-native RNA structures. In one of the benchmark sets, RASP was able to rank the closest model to the X-ray structure as the best and within the top 10 models for ∼93 and ∼95% of decoys, respectively. The average correlation coefficient between model accuracy, calculated as the root mean square deviation and global distance test-total score (GDT-TS) measures of C3′ atoms, and the RASP score was 0.85 and 0.89, respectively. Based on a recently released benchmark dataset that contains hundreds of 3D models for 32 RNA motifs with non-canonical base pairs, RASP scoring function compared favorably to ROSETTA FARFAR force field in the selection of accurate models. Finally, using the self-splicing group I intron and the stem-loop IIIc from hepatitis C virus internal ribosome entry site as test cases, we show that RASP is able to discriminate between known structure-destabilizing mutations and compensatory mutations. Availability: RASP can be readily applied to assess all-atom or coarse-grained RNA structures and thus should be of interest to both developers and end-users of RNA structure prediction methods. The computer software and knowledge-based potentials are freely available at http://melolab.org/supmat.html.
- ItemComputer-based annotation of putative AraC/XylS-family transcription factors of known structure but unknown function(2012) Schüller, Andreas; Slater, A. W.; Norambuena, T.; Cifuentes, J. J.; Almonacid, L. I.; Melo Ledermann, Francisco JavierCurrently, about 20 crystal structures per day are released and deposited in the Protein Data Bank. A significant fraction of these structures is produced by research groups associated with the structural genomics consortium. The biological function of many of these proteins is generally unknown or not validated by experiment. Therefore, a growing need for functional prediction of protein structures has emerged. Here we present an integrated bioinformatics method that combines sequence-based relationships and three-dimensional (3D) structural similarity of transcriptional regulators with computer prediction of their cognate DNA binding sequences. We applied this method to the AraC/XylS family of transcription factors, which is a large family of transcriptional regulators found in many bacteria controlling the expression of genes involved in diverse biological functions. Three putative new members of this family with known 3D structure but unknown function were identified for which a probable functional classification is provided. Our bioinformatics analyses suggest that they could be involved in plant cell wall degradation (Lin2118 protein from Listeria innocua, PDB code 3oou), symbiotic nitrogen fixation (protein from Chromobacterium violaceum, PDB code 3oio), and either metabolism of plant-derived biomass or nitrogen fixation (protein from Rhodopseudomonas palustris, PDB code 3mn2).
- ItemdnaMATE: a consensus melting temperature prediction server for short DNA sequences(2005) Panjkovich, A.; Norambuena, T.; Melo Ledermann, Francisco JavierAbstract An accurate and robust large-scale melting temperature prediction server for short DNA sequences is dispatched. The server calculates a consensus melting temperature value using the nearest-neighbor model based on three independent thermodynamic data tables. The consensus method gives an accurate prediction of melting temperature, as it has been recently demonstrated in a benchmark performed using all available experimental data for DNA sequences within the length range of 16–30 nt. This constitutes the first web server that has been implemented to perform a large-scale calculation of melting temperatures in real time (up to 5000 DNA sequences can be submitted in a single run). The expected accuracy of calculations carried out by this server in the range of 50–600 mM monovalent salt concentration is that 89% of the melting temperature predictions will have an error or deviation of <5°C from experimental data. The server can be freely accessed at http://dna.bio.puc.cl/tm.html . The standalone executable versions of this software for LINUX, Macintosh and Windows platforms are also freely available at the same web site. Detailed further information supporting this server is available at the same web site referenced above.
- ItemProNA2020 predicts protein-DNA, protein-RNA, and protein-protein binding proteins and residues from sequence(2020) Qiu, J.; Bernhofer, M.; Heinzinger, M.; Kemper, S.; Norambuena, T.; Melo Ledermann, Francisco Javier; Rost, B.The intricate details of how proteins bind to proteins, DNA, and RNA are crucial for the understanding of almost all biological processes. Disease-causing sequence variants often affect binding residues. Here, we described a new, comprehensive system of in silico methods that take only protein sequence as input to predict binding of protein to DNA, RNA, and other proteins. Firstly, we needed to develop several new methods to predict whether or not proteins bind (per-protein prediction). Secondly, we developed independent methods that predict which residues bind (per-residue). Not requiring three-dimensional information, the system can predict the actual binding residue. The system combined homology-based inference with machine learning and motif-based profile-kernel approaches with word-based (ProtVec) solutions to machine learning protein level predictions. This achieved an overall non-exclusive three-state accuracy of 77% ± 1% (±one standard error) corresponding to a 1.8 fold improvement over random (best classification for protein–protein with F1 = 91 ± 0.8%). Standard neural networks for per-residue binding residue predictions appeared best for DNA-binding (Q2 = 81 ± 0.9%) followed by RNA-binding (Q2 = 80 ± 1%) and worst for protein–protein binding (Q2 = 69 ± 0.8%). The new method, dubbed ProNA2020, is available as code through github (https://github.com/Rostlab/ProNA2020.git) and through PredictProtein (www.predictprotein.org
- ItemSAGExplore: a web server for unambiguous tag mapping in serial analysis of gene expression oriented to gene discovery and annotation(2007) Norambuena, T.; Malig, R.; Melo Ledermann, Francisco JavierWe describe a web server for the accurate mapping of experimental tags in serial analysis of gene expression (SAGE). The core of the server relies on a database of genomic virtual tags built by a recently described method that attempts to reduce the amount of ambiguous assignments for those tags that are not unique in the genome. The method provides a complete annotation of potential virtual SAGE tags within a genome, along with an estimation of their confidence for experimental observation that ranks tags that present multiple matches in the genome. The output of the server consists of a table in HTML format that contains links to a graphic representation of the results and to some external servers and databases, facilitating the tasks of analysis of gene expression and gene discovery. Also, a table in tab delimited text format is produced, allowing the user to export the results into custom databases and software for further analysis. The current server version provides the most accurate and complete SAGE tag mapping source that is available for the yeast organism. In the near future, this server will also allow the accurate mapping of experimental SAGE-tags from other model organisms such as human, mouse, frog and fly. The server is freely available on the web at: http://dna.bio.puc.cl/SAGExplore.html.
- ItemStAR: a simple tool for the statistical comparison of ROC curves(2008) Vergara, I. A.; Norambuena, T.; Ferrada, E.; Slater, A. W.; Melo Ledermann, Francisco JavierBackground As in many different areas of science and technology, most important problems in bioinformatics rely on the proper development and assessment of binary classifiers. A generalized assessment of the performance of binary classifiers is typically carried out through the analysis of their receiver operating characteristic (ROC) curves. The area under the ROC curve (AUC) constitutes a popular indicator of the performance of a binary classifier. However, the assessment of the statistical significance of the difference between any two classifiers based on this measure is not a straightforward task, since not many freely available tools exist. Most existing software is either not free, difficult to use or not easy to automate when a comparative assessment of the performance of many binary classifiers is intended. This constitutes the typical scenario for the optimization of parameters when developing new classifiers and also for their performance validation through the comparison to previous art. Results In this work we describe and release new software to assess the statistical significance of the observed difference between the AUCs of any two classifiers for a common task estimated from paired data or unpaired balanced data. The software is able to perform a pairwise comparison of many classifiers in a single run, without requiring any expert or advanced knowledge to use it. The software relies on a non-parametric test for the difference of the AUCs that accounts for the correlation of the ROC curves. The results are displayed graphically and can be easily customized by the user. A human-readable report is generated and the complete data resulting from the analysis are also available for download, which can be used for further analysis with other software. The software is released as a web server that can be used in any client platform and also as a standalone application for the Linux operating system. Conclusion A new software for the statistical comparison of ROC curves is released here as a web server and also as standalone software for the LINUX operating system.
- ItemThe Protein-DNA Interface database(2010) Norambuena, T.; Melo Ledermann, Francisco JavierThe Protein-DNA Interface database (PDIdb) is a repository containing relevant structural information of Protein-DNA complexes solved by X-ray crystallography and available at the Protein Data Bank. The database includes a simple functional classification of the protein-DNA complexes that consists of three hierarchical levels: Class, Type and Subtype. This classification has been defined and manually curated by humans based on the information gathered from several sources that include PDB, PubMed, CATH, SCOP and COPS. The current version of the database contains only structures with resolution of 2.5 Å or higher, accounting for a total of 922 entries. The major aim of this database is to contribute to the understanding of the main rules that underlie the molecular recognition process between DNA and proteins. To this end, the database is focused on each specific atomic interface rather than on the separated binding partners. Therefore, each entry in this database consists of a single and independent protein-DNA interface. We hope that PDIdb will be useful to many researchers working in fields such as the prediction of transcription factor binding sites in DNA, the study of specificity determinants that mediate enzyme recognition events, engineering and design of new DNA binding proteins with distinct binding specificity and affinity, among others. Finally, due to its friendly and easy-to-use web interface, we hope that PDIdb will also serve educational and teaching purposes.
- ItemWebRASP: a server for computing energy scores to assess the accuracy and stability of RNA 3D structures(2013) Norambuena, T.; Cares, J. F.; Capriotti, E.; Melo Ledermann, Francisco JavierSummary: The understanding of the biological role of RNA molecules has changed. Although it is widely accepted that RNAs play important regulatory roles without necessarily coding for proteins, the functions of many of these non-coding RNAs are unknown. Thus, determining or modeling the 3D structure of RNA molecules as well as assessing their accuracy and stability has become of great importance for characterizing their functional activity. Here, we introduce a new web application, WebRASP, that uses knowledge-based potentials for scoring RNA structures based on distance-dependent pairwise atomic interactions. This web server allows the users to upload a structure in PDB format, select several options to visualize the structure and calculate the energy profile. The server contains online help, tutorials and links to other related resources. We believe this server will be a useful tool for predicting and assessing the quality of RNA 3D structures. Availability and implementation: The web server is available at http://melolab.org/webrasp. It has been tested on the most popular web browsers and requires Java plugin for Jmol visualization.