<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
  xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Bioinformatics</title>
    <link>http://barf.jcowboy.org</link>
    <description>Bioinformatics recent publications</description>
    <language>en-us</language>
    <image>
      <url>http://barf.jcowboy.org/pubmed.gif</url>
      <title>the data for this feed is provided by PubMed</title>
      <link>http://barf.jcowboy.org</link>
    </image>
    <item>
      <title>An extended IUPAC nomenclature code for polymorphic nucleic acids.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20202974</link>
      <description>Publication Date: 2010 Mar 3 PMID: 20202974&lt;br/&gt;Authors: Johnson, A. D.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;The International Union of Pure and Applied Chemistry (IUPAC) code specified nearly 25 years ago provides a nomenclature for incompletely specified nucleic acids (Cornish-Bowden 1984). The IUPAC code has been applied in a wide-ranging manner, contributing to many biologically and chemically meaningful representations, including: 1) recognition sequences (e.g., restriction enzymes, protein and RNA binding sites, consensus signals), 2) codon degeneracy, 3) sequence base calling ambiguity, 4) representation of ancestral states in phylogenetics, and 5) to a vast extent in the fields of genetics and genomics in representing polymorphic nucleic acids, e.g., single nucleotide polymorphisms (SNPs). However, no system currently exists that allows for the informatics representation of the relative abundance at polymorphic nucleic acids (e.g., SNPs) in a single specified character, or a string of characters. Here I propose such an information code as a natural extension to the IUPAC nomenclature code, and present some potential uses and limitations to such a code. The original IUPAC code remains useful in all its previous applications and is also compatible as a subset of the extended code proposed here. The extended IUPAC code allows for new nucleic acid representations, in single characters or character strings, with potential applications in genetics, cross-species or cross-strain comparison, sequence alignment, bioinformatics, genome assembly, database design and querying, and chemical sequencing and synthesis. 
The primary anticipated use of this extended nomenclature code is to assist in the representation of the rapidly growing space of information on human genetic variation.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20202974&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Statistical expression deconvolution from mixed tissue samples.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20202973</link>
      <description>Publication Date: 2010 Mar 4 PMID: 20202973&lt;br/&gt;Authors: Clarke, J. - Seo, P. - Clarke, B.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Global expression patterns within cells are used for purposes ranging from the identification of disease biomarkers to basic understanding of cellular processes. Unfortunately tissue samples used in cancer studies are usually composed of multiple cell types and the non-cancerous portions can significantly affect expression profiles. This severely limits the conclusions that can be made about the specificity of gene expression in the cell-type of interest. However, statistical analysis can be used to identify differentially expressed genes that are related to the biological question being studied. RESULTS: We propose a statistical approach to expression deconvolution from mixed tissue samples in which the proportion of each component cell type is unknown. Our method estimates the proportion of each component in a mixed tissue sample; this estimate can be used to provide estimates of gene expression from each component. We demonstrate our technique on xenograft samples from breast cancer research and publicly available experimental data sets found in the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) repository. AVAILABILITY: R code (http://www.r-project.org/) for estimating sample proportions is freely available to non-commercial users and available at http://www.med.miami.edu/medicine/x2691.xml CONTACT: jclarke@med.miami.edu; pseo@med.miami.edu; bclarke2@med.miami.edu.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20202973&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Cross-species common regulatory network inference without requirement for prior gene affiliation.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20200011</link>
      <description>Publication Date: 2010 Mar 2 PMID: 20200011&lt;br/&gt;Authors: Moghaddas Gholami, A. - Fellenberg, K.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Cross-species meta-analyses of microarray data usually require prior affiliation of genes based on orthology information that often relies on sequence similarity. RESULTS: We present an algorithm merging microarray datasets on the basis of co-expression alone, without any requirement for orthology information to affiliate genes. Combining existing methods such as co-inertia analysis, back-transformation, Hungarian matching, and majority voting in an iterative non-greedy hill-climbing approach, it affiliates arrays and genes at the same time, maximizing the co-structure between the datasets. To introduce the method, we demonstrate its performance on two closely and two distantly related datasets of different experimental context and produced on different platforms. Each pair stems from two different species. The resulting cross-species dynamic Bayesian gene networks improve on the networks inferred from each dataset alone by yielding more significant network motifs, as well as more of the interactions already recorded in KEGG and other databases. Also, it is shown that our algorithm converges on the optimal number of nodes for network inference. Being readily extendable to more than two datasets, it provides the opportunity to infer extensive gene regulatory networks. 
Availability and Implementation: Source code (MATLAB and R) freely available for download at http://www.mchips.org/supplements/moghaddasi_source.tgz CONTACT: kurt@tum.de SUPPLEMENTARY INFORMATION: Supplementary data are available at http://www.mchips.org/supplements/moghaddasi_supp.pdf.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20200011&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>PaperMaker: Validation of biomedical scientific publications.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20200010</link>
      <description>Publication Date: 2010 Mar 3 PMID: 20200010&lt;br/&gt;Authors: Rebholz-Schuhmann, D. - Kavaliauskas, S. - Pezik, P.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: The automatic analysis of scientific literature can support authors in writing their manuscripts. Implementation: PaperMaker is a novel IT solution that receives a scientific manuscript via a Web interface, automatically analyses the publication, evaluates consistency parameters and interactively delivers feedback to the author. It analyses the proper use of acronyms and their definitions, and the use of specialized terminology. It provides GO and MeSH categorization of text passages, the retrieval of relevant publications from public scientific literature repositories, and the identification of missing or unused references. RESULT: The author receives a summary of findings, the manuscript in its corrected form and a digital abstract containing the GO and MeSH annotations in the NLM/PubMed format. AVAILABILITY: http://www.ebi.ac.uk/Rebholz-srv/PaperMaker.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20200010&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Modeling Sample Variables with an Experimental Factor Ontology.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20200009</link>
      <description>Publication Date: 2010 Mar 3 PMID: 20200009&lt;br/&gt;Authors: Malone, J. - Holloway, E. - Adamusiak, T. - Kapushesky, M. - Zheng, J. - Kolesnikov, N. - Zhukova, A. - Brazma, A. - Parkinson, H.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Describing biological sample variables with ontologies is complex due to the cross-domain nature of experiments. Ontologies provide annotation solutions, however, for cross-domain investigations, multiple ontologies are needed to represent the data. These are subject to rapid change, are often not interoperable and present complexities that are a barrier to biological resource users. RESULTS: We present the Experimental Factor Ontology (EFO), designed to meet cross-domain, application focused use cases for gene expression data. We describe our methodology and open source tools used to create the ontology. These include tools for creating ontology mappings, ontology views, detecting ontology changes and using ontologies in interfaces to enhance querying. The application of reference ontologies to data is a key problem and this work presents guidelines on how community ontologies can be presented in an application ontology in a data driven way. AVAILABILITY: http://www.ebi.ac.uk/efo CONTACT: malone@ebi.ac.uk.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20200009&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Inferring dynamic gene networks under varying conditions for transcriptomic network comparison.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20197286</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20197286&lt;br/&gt;Authors: Shimamura, T. - Imoto, S. - Yamaguchi, R. - Nagasaki, M. - Miyano, S.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Elucidating the differences between cellular responses to various biological conditions or external stimuli is an important challenge in systems biology. Many approaches have been developed to reverse-engineer a cellular system, called gene network, from time-series microarray data in order to understand a transcriptomic response under a condition of interest. Comparative topological analysis has also been applied based on the gene networks inferred independently from each of the multiple time-series datasets under varying conditions to find critical differences between these networks. However, these comparisons often lead to misleading results, because each network contains considerable noise due to the limited length of the time-series. RESULTS: We propose an integrated approach for inferring multiple gene networks from time-series expression data under varying conditions. To the best of our knowledge, our approach is the first reverse-engineering method that is intended for transcriptomic network comparison between varying conditions. Furthermore, we propose a state-of-the-art parameter estimation method, relevance-weighted recursive elastic net, for providing higher precision and recall than existing reverse-engineering methods. We analyze experimental data of MCF-7 human breast cancer cells stimulated by EGF or HRG with several doses and provide novel biological hypotheses through network comparison. AVAILABILITY: The software NETCOMP is available at http://bonsai.ims.u-tokyo.ac.jp/~shima/NETCOMP/. 
CONTACT: shima@ims.u-tokyo.ac.jp SUPPLEMENTARY INFORMATION: All supplementary information can be accessed online.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20197286&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>ScripTree: scripting phylogenetic graphics.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20194627</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20194627&lt;br/&gt;Authors: Chevenet, F. - Croce, O. - Hebrard, M. - Christen, R. - Berry, V.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: There is a large number of tools for interactive display of phylogenetic trees. However, there is a shortage of tools for the automation of tree rendering. Scripting phylogenetic graphics would enable the saving of graphical analyses involving numerous and complex tree handling operations and would allow the automation of repetitive tasks. ScripTree is a tool intended to fill this gap. It is an interpreter to be used in batch mode. Phylogenetic graphics instructions, related to tree rendering as well as tree annotation, are stored in a text file and processed in a sequential way. AVAILABILITY: ScripTree can be used online or downloaded at www.scriptree.org, under the GPL license. Implementation: ScripTree is written in Tcl/Tk and is a cross-platform application available for Windows and Unix-like systems including OS X. It can be used either as a standalone package or included in a bioinformatic pipeline and linked to an HTTP server. CONTACT: chevenet@ird.fr.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20194627&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>An Ergatis-based Prokaryotic Genome Annotation Web Server.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20194626</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20194626&lt;br/&gt;Authors: Hemmerich, C. - Buechlein, A. - Podicheti, R. - Revanna, K. V. - Dong, Q.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: Ergatis is a flexible workflow management system for designing and executing complex bioinformatics pipelines. However, its complexity restricts its usage to only highly skilled bioinformaticians. We have developed a web-based prokaryotic genome annotation server, Integrative Services for Genomics Analysis (ISGA), which builds upon the Ergatis workflow system, integrates other dynamic analysis tools, and provides intuitive web interfaces for biologists to customize and execute their own annotation pipelines. ISGA is designed to be installed at genomics core facilities and be used directly by biologists. AVAILABILITY: ISGA is accessible at http://isga.cgb.indiana.edu/ and the system is also freely available for local installation. CONTACT: Qunfeng.Dong@unt.edu SUPPLEMENTARY INFORMATION: Supplementary figures are available at Bioinformatics online.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20194626&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>SDOP-DB: A comparative standardised-protocol database for mouse phenotypic analyses.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20194625</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20194625&lt;br/&gt;Authors: Tanaka, N. - Waki, K. - Kaneda, H. - Suzuki, T. - Yamada, I. - Furuse, T. - Kobayashi, K. - Motegi, H. - Toki, H. - Inoue, M. - Minowa, O. - Noda, T. - Takao, K. - Miyakawa, T. - Takahashi, A. - Koide, T. - Wakana, S. - Masuya, H.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: This paper reports the development of SDOP-DB, which can provide definite, detailed, and easy comparison of experimental protocols used in mouse phenotypic analyses among institutes or laboratories. Because SDOP-DB is fully compliant with international standards, it can act as a practical foundation for international sharing and integration of mouse phenotypic information. AVAILABILITY: SDOP-DB (http://www.brc.riken.jp/lab/bpmp/SDOP/) CONTACT: knowledge-base@brc.riken.jp SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20194625&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Feature-incorporated alignment based ligand-binding residue prediction for carbohydrate binding modules.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20190251</link>
      <description>Publication Date: 2010 Feb 26 PMID: 20190251&lt;br/&gt;Authors: Chou, W. Y. - Chou, W. I. - Pai, T. W. - Lin, S. C. - Jiang, T. Y. - Tang, C. Y. - Chang, M. D.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Carbohydrate binding modules (CBMs) share similar secondary and tertiary topology, but their primary sequence identity is low. Computational identification of ligand-binding residues allows biologists to better understand the protein-carbohydrate binding mechanism. In general functional characterization can be alternatively solved by alignment-based manners. As alignment accuracy based on conventional methods is often sensitive to sequence identity, low sequence identity among query sequences makes it difficult to precisely locate small portions of relevant features. Therefore, we propose a feature-incorporated alignment (FIA) to flexibly align conserved signatures in CBMs. Then, an FIA-based target-template prediction model was further implemented to identify functional ligand-binding residues. RESULTS: Arabidopsis thaliana CBM45 and CBM53 were used to validate the FIA-based prediction model. The predicted ligand-binding residues residing on the surface in the hypothetical structures were verified to be ligand-binding residues. In the absence of three dimensional structural information, FIA demonstrated significant improvement in the estimation of sequence similarity and identity for a total of 808 sequences from 11 different CBM families as compared with six leading tools by Friedman rank test. CONTACT: dtchang@life.nthu.edu.tw.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20190251&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>High Quality SNP Calling Using Illumina Data at Shallow Coverage.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20190250</link>
      <description>Publication Date: 2010 Feb 26 PMID: 20190250&lt;br/&gt;Authors: Malhis, N. - Jones, S. J.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Detection of single nucleotide polymorphisms (SNPs) has been a major application in processing second generation sequencing (SGS) data. In principle, SNPs are called on single base differences between a reference genome and a sequence generated from SGS short reads of a sample genome. However, this exercise is far from trivial; several parameters related to sequencing quality, and/or reference genome properties, have an essential effect on the accuracy of called SNPs, especially with shallow-coverage data. In this work, we present Slider II, an alignment and SNP calling approach that demonstrates improved algorithmic approaches enabling a larger number of called SNPs with a lower false positive rate. In addition to the regular alignment and SNP calling, as an optional feature, Slider II is capable of utilizing information about known SNPs of a target genome, as priors, in the alignment and SNP calling to enhance its capability of detecting these known SNPs and novel SNPs and mutations in their vicinity. CONTACT: nmalhis@bcgsc.ca Supplementary information and availability: http://www.bcgsc.ca/platform/bioinfo/software/SliderII.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20190250&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>CPFP: A central proteomics facilities pipeline.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20189941</link>
      <description>Publication Date: 2010 Feb 25 PMID: 20189941&lt;br/&gt;Authors: Trudgian, D. C. - Thomas, B. - McGowan, S. J. - Kessler, B. M. - Salek, M. - Acuto, O.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: The Central Proteomics Facilities Pipeline (CPFP) provides identification, validation, and quantitation of peptides and proteins from LC-MS/MS datasets through an easy to use web interface. It is the first analysis pipeline targeted specifically at the needs of proteomics core facilities, reducing the data-analysis load on staff, and allowing facility clients to easily access and work with their data. Identification of peptides is performed using multiple search engines, their output combined and validated using state-of-the-art techniques for improved results. Cluster execution of jobs allows analysis capacity to be increased easily as demand grows. AVAILABILITY: Released under the Common Development and Distribution License (CDDL) at http://cpfp.sourceforge.net/. Demonstration available at https://cpfp-master.molbiol.ox.ac.uk/cpfp_demo CONTACT: dctrud@ccmp.ox.ac.uk.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20189941&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>A min-cut Algorithm for the Consistency Problem in Multiple Sequence Alignment.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20189940</link>
      <description>Publication Date: 2010 Feb 25 PMID: 20189940&lt;br/&gt;Authors: Corel, E. - Pitschi, F. - Morgenstern, B.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Multiple sequence alignments can be constructed on the basis of pairwise local sequence similarities. This approach is rather flexible and can combine the advantages of global and local alignment methods. The restriction to pairwise alignments as building blocks, however, can lead to misalignments since weak homologies may be missed if only pairs of sequences are compared. RESULTS: Herein, we propose a graph-theoretical approach to find local multiple sequence similarities. Starting with pairwise alignments produced by DIALIGN, we use a min-cut algorithm to find potential (partial) alignment columns that we use to construct a final multiple alignment. On real and simulated benchmark data, our approach consistently outperforms the standard version of DIALIGN where local pairwise alignments are greedily incorporated into a multiple alignment. AVAILABILITY: The prototype is freely available under GNU Public Licence from the first author. CONTACT: ecorel@gwdg.de.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20189940&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>BioNet: an R-Package for the Functional Analysis of Biological Networks.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20189939</link>
      <description>Publication Date: 2010 Feb 25 PMID: 20189939&lt;br/&gt;Authors: Beisser, D. - Klau, G. W. - Dandekar, T. - Mueller, T. - Dittrich, M.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Increasing quantity and quality of data in transcriptomics and interactomics create the need for integrative approaches to network analysis. Here we present a comprehensive R-package for the analysis of biological networks including an exact and a heuristic approach to identify functional modules. RESULTS: The BioNet package provides an extensive framework for integrated network analysis in R. This includes the statistics for the integration of transcriptomic and functional data with biological networks, the scoring of nodes as well as methods for network search and visualization. AVAILABILITY: The BioNet package and a tutorial are available from http://bionet.bioapps.biozentrum.uni-wuerzburg.de. CONTACT: marcus.dittrich@biozentrum.uni-wuerzburg.de, tobias.mueller@biozentrum.uni-wuerzburg.de.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20189939&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Post-hoc power estimation in large-scale multiple testing problems.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20189938</link>
      <description>Publication Date: 2010 Feb 25 PMID: 20189938&lt;br/&gt;Authors: Zehetmayer, S. - Posch, M.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;BACKGROUND: The statistical power or multiple Type II error rate in large scale multiple testing problems as, for example, in gene expression microarray experiments, depends on typically unknown parameters and is therefore difficult to assess a priori. However, it has been suggested to estimate the multiple Type II error rate post-hoc, based on the observed data. METHODS: We consider a class of post-hoc estimators that are functions of the estimated proportion of true null hypotheses among all hypotheses. Numerous estimators for this proportion have been proposed and we investigate the statistical properties of the derived multiple Type II error rate estimators in an extensive simulation study. RESULTS: The performance of the estimators in terms of the mean squared error depends sensitively on the distributional scenario. Estimators based on empirical distributions of the null hypotheses are superior in the presence of strongly correlated test statistics. AVAILABILITY: R-code (R Development Core Team, 2008) to compute all considered estimators based on p-values and supplementary material are available from the authors' web page http://statistics.msi.meduniwien.ac.at/index.php?page=pageszfnr CONTACT: martin.posch@meduniwien.ac.at.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20189938&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Exploiting sequence similarity to validate the sensitivity of SNP arrays in detecting fine-scaled copy number variations.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20189937</link>
      <description>Publication Date: 2010 Feb 25 PMID: 20189937&lt;br/&gt;Authors: Wong, G. - Leckie, C. - Gorringe, K. L. - Haviv, I. - Campbell, I. G. - Kowalczyk, A.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: High-density single nucleotide polymorphism (SNP) genotyping arrays are efficient and cost effective platforms for the detection of copy number variation. To ensure accuracy in probe synthesis and to minimise production costs, short oligonucleotide probe sequences are used. The use of short probe sequences limits the specificity of binding targets in the human genome. The specificity of these short probeset sequences has yet to be fully analysed against a normal reference human genome. Sequence similarity can artificially elevate or suppress copy number measurements and hence reduce the reliability of affected probe readings. For the purpose of detecting narrow copy number variations reliably down to the width of a single probeset, sequence similarity is an important issue that needs to be addressed. RESULTS: We surveyed the Affymetrix Human Mapping SNP arrays for probeset sequence similarity against the reference human genome. Utilising sequence similarity results, we identified a collection of fine-scaled putative copy number variations between gender from autosomal probesets whose sequence matches various loci on the sex chromosomes. To detect these variations, we utilised our statistical approach, DRECS, and showed that its performance was superior and more stable than the t-test in detecting copy number variations. Through the application of DRECS on the HapMap population datasets with multi-matching probesets filtered, we identified biologically relevant SNPs in aberrant regions across populations with known association to physical traits, such as height, covered by the span of a single probe. 
This provided empirical confirmation of the existence of naturally occurring narrow copy number variations as well as the sensitivity of the Affymetrix SNP array technology in detecting them. AVAILABILITY: The MATLAB implementation of DRECS is available at http://ww2.cs.mu.oz.au/~gwong/DRECS/index.html CONTACT: gwong@csse.unimelb.edu.au.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20189937&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>ESBTL: efficient PDB parser and data structure for the structural and geometric analysis of biological macromolecules.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20185407</link>
      <description>Publication Date: 2010 Feb 24 PMID: 20185407&lt;br/&gt;Authors: Loriot, S. - Cazals, F. - Bernauer, J.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: The ever increasing number of structural biological data calls for robust and efficient software for analysis. ESBTL (Easy Structural Biology Template Library) is a lightweight C++ library that allows the handling of PDB data and provides a data structure suitable for geometric constructions and analyses. The parser and data model provided by this ready-to-use include-only library allows adequate treatment of usually discarded information (insertion code, atom occupancy...) while still being able to detect badly formatted files. The template-based structure allows rapid design of new computational structural biology applications and is fully compatible with the new remediated PDB archive format. It also allows the code to be easy-to-use while being versatile enough to allow advanced user developments. AVAILABILITY: ESBTL is freely available under the GNU General Public License from http://esbtl.sf.net. The website provides the source code, examples, code snippets, and documentation. CONTACT: julie.bernauer@inria.fr.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20185407&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Biased hosting of intronic microRNA genes.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20185406</link>
      <description>Publication Date: 2010 Feb 24 PMID: 20185406&lt;br/&gt;Authors: Golan, D. - Levy, C. - Friedman, B. - Shomron, N.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: MicroRNAs (miRNAs) are involved in an abundant class of post-transcriptional regulation activated through binding to the 3' UTR of mRNAs. The current wealth of mammalian miRNA genes results mostly from genomic duplication events. Many of these events are located within introns of transcriptional units. In order to better understand the genomic expansion of miRNA genes, we investigated the distribution of intronic miRNAs. RESULTS: We observe that miRNA genes are hosted within introns of short genes much larger than expected by chance. Implementation: We explore several explanations for this phenomenon and conclude that miRNA integration into short genes might be evolutionarily favorable due to interaction with the pre-mRNA splicing mechanism. CONTACT: Noam Shomron, nshomron@post.tau.ac.il SUPPLEMENTARY INFORMATION: online.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20185406&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>SPICi: a fast clustering algorithm for large biological networks.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20185405</link>
      <description>Publication Date: 2010 Feb 24 PMID: 20185405&lt;br/&gt;Authors: Jiang, P. - Singh, M.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Clustering algorithms play an important role in the analysis of biological networks, and can be used to uncover functional modules and obtain hints about cellular organization. While most available clustering algorithms work well on biological networks of moderate size, such as the yeast protein physical interaction network, they either fail or are too slow in practice for larger networks, such as functional networks for higher eukaryotes. Since an increasing number of larger biological networks are being determined, the limitations of current clustering approaches curtail the types of biological network analyses that can be performed. RESULTS: We present a fast local network clustering algorithm, SPICi. SPICi runs in time O(V log V + E) and space O(E), where V and E are the number of vertices and edges in the network, respectively. We evaluate SPICi's performance on several existing protein interaction networks of varying size, and compare SPICi to nine previous approaches for clustering biological networks. We show that SPICi is typically several orders of magnitude faster than previous approaches and is the only one that can successfully cluster all test networks within a very short time. We demonstrate that SPICi has state-of-the-art performance with respect to the quality of the clusters it uncovers, as judged by its ability to recapitulate protein complexes and functional modules. Finally, we demonstrate the power of our fast network clustering algorithm by applying SPICi across hundreds of large context-specific human networks, and identifying modules specific for single conditions. 
AVAILABILITY: Source code is available under the GNU Public License at http://compbio.cs.princeton.edu/spici CONTACT: mona@cs.princeton.edu.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20185405&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Genomicus: a database and a browser to study gene synteny in modern and ancestral genomes.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20185404</link>
      <description>Publication Date: 2010 Feb 24 PMID: 20185404&lt;br/&gt;Authors: Muffato, M. - Louis, A. - Poisnel, C. E. - Roest Crollius, H.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: Comparative genomics remains a pivotal strategy to study the evolution of gene organization, and this primacy is reinforced by the growing number of full genome sequences available in public repositories. Despite this growth, bioinformatic tools available to visualize and compare genomes, and to infer evolutionary events, remain restricted to two or three genomes at a time, thus limiting the breadth and the nature of the questions that can be investigated. Here we present Genomicus, a new synteny browser that can represent and compare unlimited numbers of genomes in a broad phylogenetic view. In addition, Genomicus includes reconstructed ancestral gene organization, thus greatly facilitating the interpretation of the data. AVAILABILITY: Genomicus is freely available for online use at http://www.dyogen.ens.fr/genomicus while data can be downloaded at ftp://ftp.biologie.ens.fr/pub/dyogen/genomicus.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20185404&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>The Power of Protein Interaction Networks for Associating Genes with Diseases.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20185403</link>
      <description>Publication Date: 2010 Feb 24 PMID: 20185403&lt;br/&gt;Authors: Navlakha, S. - Kingsford, C.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Understanding the association between genetic diseases and their causal genes is an important problem concerning human health. With the recent influx of high-throughput data describing interactions between gene products, scientists have been provided a new avenue through which these associations can be inferred. Despite the recent interest in this problem, however, there is little understanding of the relative benefits and drawbacks underlying the proposed techniques. RESULTS: We assessed the utility of physical protein interactions for determining gene-disease associations by examining the performance of seven recently developed computational methods (plus several of their variants). We found that random-walk approaches individually outperform clustering and neighborhood approaches, although most methods make predictions not made by any other method. We show how combining these methods into a consensus method yields Pareto optimal performance. We also quantified how a diffuse topological distribution of disease-related proteins negatively affects prediction quality and are thus able to identify diseases especially amenable to network-based predictions and others for which additional information sources are absolutely required. AVAILABILITY: The predictions made by each algorithm considered are available online at http://www.cbcb.umd.edu/DiseaseNet. CONTACT: carlk@cs.umd.edu.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20185403&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>GOSemSim: an R package for measuring semantic similarity among GO terms and gene products.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20179076</link>
      <description>Publication Date: 2010 Feb 23 PMID: 20179076&lt;br/&gt;Authors: Yu, G. - Li, F. - Qin, Y. - Bo, X. - Wu, Y. - Wang, S.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: The semantic comparisons of GO annotations provide quantitative ways to compute similarities between genes and gene groups, and have become an important basis for many bioinformatics analysis approaches. GOSemSim is an R package for semantic similarity computation among GO terms, sets of GO terms, gene products, and gene clusters. Four information-content-based methods and one graph-based method are implemented in the GOSemSim package; multiple species, including human, rat, mouse, fly and yeast, are also supported. The functions provided by GOSemSim offer flexibility for applications, and can be easily integrated into high-throughput analysis pipelines. AVAILABILITY: GOSemSim is released under the GPL within the Bioconductor project, and is freely available at http://bioconductor.org/packages/2.6/bioc/html/GOSemSim.html. CONTACT: boxc@bmi.ac.cn; sqwang@bmi.ac.cn.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20179076&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Accelerated Similarity Searching and Clustering of Large Compound Sets by Geometric Embedding and Locality Sensitive Hashing.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20179075</link>
      <description>Publication Date: 2010 Feb 23 PMID: 20179075&lt;br/&gt;Authors: Cao, Y. - Jiang, T. - Girke, T.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Similarity searching and clustering of chemical compounds by structural similarities are important computational approaches for identifying drug-like small molecules. Most algorithms available for these tasks are limited by their speed and scalability, and cannot handle today's large compound databases with several million entries. RESULTS: In this paper, we introduce a new algorithm for accelerated similarity searching and clustering of very large compound sets using embedding and indexing techniques. First, we present EI-Search as a general purpose similarity search method for finding objects with similar features in large databases and apply it here to searching and clustering of large compound sets. The method embeds the compounds in a high-dimensional Euclidean space and searches this space using an efficient index-aware nearest neighbor search method based on Locality Sensitive Hashing. Second, to cluster large compound sets, we introduce the EI-Clustering algorithm which combines the EI-Search method with Jarvis-Patrick clustering. Both methods were tested on three large data sets with sizes ranging from about 260,000 to over 19 million compounds. In comparison to sequential search methods, the EI-Search method was 40-200 times faster, while maintaining comparable recall rates. The EI-Clustering method allowed us to significantly reduce the CPU time required to cluster these large compound libraries from several months to only a few days. AVAILABILITY: Software implementations and online services have been developed based on the methods introduced in this study. The online services provide access to the generated clustering results and ultra-fast similarity searching of the PubChem Compound database with sub-second response time. 
CONTACT: thomas.girke@ucr.edu.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20179075&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Domain Adaptation for Semantic Role Labeling in the Biomedical Domain.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20179074</link>
      <description>Publication Date: 2010 Feb 23 PMID: 20179074&lt;br/&gt;Authors: Dahlmeier, D. - Ng, H. T.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Semantic role labeling (SRL) is a natural language processing (NLP) task that extracts a shallow meaning representation from free text sentences. Several efforts to create SRL systems for the biomedical domain have been made during the last few years. However, state-of-the-art SRL relies on manually annotated training instances, which are rare and expensive to prepare. In this paper, we address SRL for the biomedical domain as a domain adaptation problem to leverage existing SRL resources from the newswire domain. RESULTS: We evaluate the performance of three recently proposed domain adaptation algorithms for SRL. Our results show that by using domain adaptation, the cost for developing an SRL system for the biomedical domain can be reduced significantly. Using domain adaptation, our system can achieve 97% of the performance with as little as 60 annotated target domain abstracts. AVAILABILITY: Our BioKIT system that performs SRL in the biomedical domain as described in this paper is implemented in Python and C and operates under the Linux operating system. BioKIT can be downloaded at http://nlp.comp.nus.edu.sg/software. The domain adaptation software is available for download at http://www.mysmu.edu/faculty/jingjiang/software/DALR.html. The BioProp corpus is available from the Linguistic Data Consortium http://www.ldc.upenn.edu. CONTACT: nght@comp.nus.edu.sg.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20179074&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>SBRML: a markup language for associating systems biology data with models.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20176582</link>
      <description>Publication Date: 2010 Feb 21 PMID: 20176582&lt;br/&gt;Authors: Dada, J. O. - Spasic, I. - Paton, N. W. - Mendes, P.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Research in systems biology is carried out through a combination of experiments and models. Several data standards have been adopted for representing models (SBML) and various types of relevant experimental data (such as FuGE and those of the Proteomics Standards Initiative). However, until now, there has been no standard way to associate a model and its entities to the corresponding data sets, or vice versa. Such a standard would provide a means to represent computational simulation results as well as to frame experimental data in the context of a particular model. Target applications include model-driven data analysis, parameter estimation, and sharing and archiving model simulations. RESULTS: We propose the Systems Biology Results Markup Language (SBRML), an XML-based language which associates a model with several data sets. Each data set is represented as a series of values associated with model variables, and their corresponding parameter values. SBRML provides a flexible way of indexing the results to model parameter values, which supports both spreadsheet-like data and multidimensional data cubes. We present and discuss several examples of SBRML usage in applications such as enzyme kinetics, microarray gene expression, and various types of simulation results. Availability and Implementation: The XML Schema file for SBRML is available at http://www.comp-sys-bio.org/SBRML under the Academic Free License (AFL) v3.0. CONTACT: pedro.mendes@manchester.ac.uk.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20176582&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>GO-Bayes: Gene Ontology-based over-representation analysis using a Bayesian approach.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20176581</link>
      <description>Publication Date: 2010 Feb 21 PMID: 20176581&lt;br/&gt;Authors: Zhang, S. - Cao, J. - Kong, Y. M. - Scheuermann, R. H.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: A typical approach for the interpretation of high-throughput experiments, such as gene expression microarrays, is to produce groups of genes based on certain criteria (e.g., genes that are differentially expressed). To gain more mechanistic insights into the underlying biology, over-representation analysis (ORA) is often conducted to investigate whether gene sets associated with particular biological functions, for example as represented by Gene Ontology (GO) annotations, are statistically over-represented in the identified gene groups. However, the standard ORA, which is based on the hypergeometric test, analyzes each GO term in isolation and does not take into account the dependence structure of the GO term hierarchy. RESULTS: We have developed a Bayesian approach (GO-Bayes) to measure over-representation of GO terms that incorporates the GO dependence structure by taking into account evidence not only from individual GO terms, but also from their related terms (e.g., parents, children, and siblings). The Bayesian framework borrows information across related GO terms to strengthen the detection of over-representation signals. As a result, this method tends to identify sets of closely related GO terms rather than individual isolated GO terms. The advantage of the GO-Bayes approach is demonstrated with a simulation study and an application example. CONTACT: song.zhang@utsouthwestern.edu and richard.scheuermann@utsouthwestern.edu.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20176581&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>An optimal experimental design approach to model discrimination in dynamic biochemical systems.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20176580</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20176580&lt;br/&gt;Authors: Skanda, D. - Lebiedz, D.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Finding suitable models of dynamic biochemical systems is an important task in systems biology approaches to the biosciences. On the one hand, a correct model helps to understand the underlying mechanisms; on the other hand, one can use the model to predict the behavior of a biological system under various circumstances. Typically, before the correct model of a biochemical system is found, different hypothetical models might be reasonable and consistent with previous knowledge and available data. The main goal now is to find the best-suited model out of different hypotheses. The process of falsifying inappropriate candidate models is called model discrimination. RESULTS: We have developed a new computational tool to compute optimal experiments for biochemical kinetic systems with underlying ordinary differential equation (ODE) models for the purpose of model discrimination. We were inspired by the demands of biological experimentalists who perform one-run measurements where perturbations to the system are possible. We provide a criterion which calculates the number and location of time points of optimal measurements as well as optimal initial conditions and optimal perturbations to the system. AVAILABILITY: The model discrimination algorithm described here is implemented in C++ in the package ModelDiscriminationToolkit. The source code can be downloaded from http://omnibus.uni-freiburg.de/~ds500/e_software.html CONTACT: dirk.lebiedz@biologie.uni-freiburg.de.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20176580&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Viewing cancer genes from co-evolving gene modules.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20176579</link>
      <description>Publication Date: 2010 Feb 21 PMID: 20176579&lt;br/&gt;Authors: Zhu, J. - Xiao, H. - Shen, X. - Wang, J. - Zou, J. - Zhang, L. - Yang, D. - Ma, W. - Yao, C. - Gong, X. - Zhang, M. - Zhang, Y. - Guo, Z.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Studying the evolutionary conservation of cancer genes can improve our understanding of the genetic basis of human cancers. Functionally related proteins encoded by genes tend to interact with each other in a modular fashion, which may affect both the mode and tempo of their evolution. RESULTS: In the human PPI network, we searched for subnetworks within each of which all proteins have evolved at similar rates since the human and mouse split. Identified at a given co-evolving level, the subnetworks with non-randomly large sizes were defined as co-evolving modules. We showed that proteins within modules tend to be conserved, evolutionarily old and enriched with housekeeping genes, while proteins outside modules tend to be less conserved, evolutionarily younger and enriched with genes expressed in specific tissues. Viewing cancer genes from co-evolving modules showed that the overall conservation of cancer genes should be mainly attributed to the cancer proteins enriched in the conserved modules. Functional analysis further suggested that cancer proteins within and outside modules might play different roles in carcinogenesis, providing a new hint for studying the mechanism of cancer. CONTACT: guoz@ems.hrbmu.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20176579&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>DASS-GUI: a user interface for identification and analysis of significant patterns in non-sequential data.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20172945</link>
      <description>Publication Date: 2010 Feb 19 PMID: 20172945&lt;br/&gt;Authors: Hollunder, J. - Friedel, M. - Kuiper, M. - Wilhelm, T.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: Many large &quot;omics&quot; data sets have been published and many more are expected in the near future. New analysis methods are needed for best exploitation. We have developed a graphical user interface (GUI) for easy data analysis. Our DASS approach (Discovery of All Significant Substructures) elucidates the underlying modularity, a typical feature of complex biological data. It is related to biclustering and other data mining approaches. Importantly, DASS-GUI also allows handling of multi-sets and calculation of statistical significances. DASS-GUI contains tools for further analysis of the identified patterns: analysis of the pattern hierarchy, enrichment analysis, module validation, analysis of additional numerical data, easy handling of synonymous names, clustering, filtering and merging. Different export options allow easy usage of additional tools such as Cytoscape. AVAILABILITY: Source code, precompiled binaries for different systems, a comprehensive tutorial, case studies and many additional datasets are freely available at http://www.ifr.ac.uk/dass/gui/. DASS-GUI is implemented in Qt. CONTACT: jehol@psb.vib-ugent.be; thomas.wilhelm@bbsrc.ac.uk SUPPLEMENTARY INFORMATION: http://www.ifr.ac.uk/dass/gui/.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20172945&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>A Principal Skeleton Algorithm for Standardizing Confocal Images of Fruit Fly Nervous Systems.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20172944</link>
      <description>Publication Date: 2010 Feb 19 PMID: 20172944&lt;br/&gt;Authors: Qu, L. - Peng, H.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: The fruit fly (Drosophila melanogaster) is a commonly used model organism in biology. We are currently building a 3D digital atlas of the fruit fly larval nervous system based on a large collection of fly larva GAL4 lines, each of which targets a subset of neurons. To achieve such a goal, we need to automatically align a number of high-resolution confocal image stacks of these GAL4 lines. One commonly employed strategy in image pattern registration is to first globally align images using an affine transform, followed by local nonlinear warping. Unfortunately, the spatially articulated and often twisted larval nervous system makes it difficult to globally align the images directly using the affine method. In a parallel project to build a 3D digital map of the adult fly ventral nerve cord, we are confronted with a similar problem. RESULTS: We proposed to standardize a larval image by best aligning its principal skeleton (PS), and thus used this method as an alternative to the usually considered affine alignment. The principal skeleton of a shape was defined as a series of connected polylines that spans the entire shape as broadly as possible, but with the shortest overall length. We developed an automatic principal skeleton detection algorithm to robustly detect the principal skeleton from an image. Then for a pair of larval images, we designed an automatic image registration method to align their principal skeletons and the entire images simultaneously. Our experimental results on both simulated images and real datasets showed that our method not only produces satisfactory results for real confocal larval images, but also performs robustly and consistently when there is a lot of noise in the data. 
We also applied this method successfully to confocal images of some other patterns, like the adult fruit fly ventral nerve cord and center brain, which have more complicated principal skeletons. This demonstrates the flexibility and extensibility of our method. AVAILABILITY: The supplementary movies, full size figures, test data, software, and tutorial on the software can be downloaded freely from our website http://penglab.janelia.org/proj/principal_skeleton.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20172944&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>A meta-analysis of two-dimensional electrophoresis pattern of the Parkinson's disease related protein DJ-1.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20172943</link>
      <description>Publication Date: 2010 Feb 19 PMID: 20172943&lt;br/&gt;Authors: Natale, M. - Bonino, D. - Consoli, P. - Alberio, T. - Ravid, R. G. - Fasano, M. - Bucci, E. M.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: The two-dimensional electrophoresis pattern of proteins is thought to be specifically related to the physiological or pathological condition at the moment of sample preparation. On this ground, most proteomic studies move to identify specific hallmarks for a number of different conditions. However, the information arising from these investigations is often incomplete due to inherent limitations of the technique, to extensive protein post-translational modifications and sometimes to the paucity of available samples. The meta-analysis of proteomic data can provide valuable information pertinent to various biological processes that otherwise remains hidden. RESULTS: Here, we show a meta-analysis of the Parkinson's disease protein DJ-1 in heterogeneous two-dimensional electrophoresis experiments. The protein was shown to segregate into specific clusters associated with defined conditions. Interestingly, the DJ-1 pool from neural tissues displayed a specific and characteristic molecular weight (MW) and isoelectric point (pI) pattern. Moreover, changes in this pattern have been related to neurodegenerative processes and aging. These results were experimentally validated on human brain specimens from control subjects and Parkinson's disease (PD) patients. AVAILABILITY: ImageJ is a public domain image processing program developed by the National Institutes of Health and is freely available at http://rsbweb.nih.gov/ij. All the ImageJ macros used in this study are available as supplementary material and upon request at info@biodigitalvalley.com. XLSTAT can be purchased online at http://www.xlstat.com/en/home/ at a current cost of approximately 300 EUR. 
CONTACT: enrico.bucci@biodigitalvalley.com.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20172943&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>CandiSNPer: a web-tool for the identification of candidate SNPs for causal variants.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20172942</link>
      <description>Publication Date: 2010 Feb 19 PMID: 20172942&lt;br/&gt;Authors: Schmitt, A. O. - Assmus, J. - Bortfeldt, R. H. - Brockmann, G. A.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: Human SNP chips that are used in genome-wide association studies (GWAS) permit the genotyping of up to four million single nucleotide polymorphisms (SNPs) simultaneously. To date, about one thousand human SNPs have been identified as statistically significantly associated with a disease or another trait of interest. The identified SNP is not necessarily the causal variant, but it is rather in linkage disequilibrium (LD) with it. CandiSNPer is a software tool that determines the LD region around a significant SNP from a GWAS. It provides a list with functional annotation and LD values for the SNPs found in the LD region. This list contains not only the SNPs for which genotyping data is available, but all SNPs with rs-IDs, thus increasing the likelihood of including the causal variant. Furthermore, plots showing the LD values are generated. CandiSNPer facilitates the preselection of candidate SNPs for causal variants. Availability and Implementation: The CandiSNPer server is freely available at http://www2.hu-berlin.de/wikizbnutztier/software/CandiSNPer. The source code is available to academic users &quot;as is&quot; upon request. The website is implemented in Perl and R and runs on an Apache server. The Ensembl database is queried for SNP data via Perl APIs. CONTACT: armin.schmitt@agrar.hu-berlin.de.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20172942&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Treephyler: fast taxonomic profiling of metagenomes.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20172941</link>
      <description>Publication Date: 2010 Feb 19 PMID: 20172941&lt;br/&gt;Authors: Schreiber, F. - Gumrich, P. - Daniel, R. - Meinicke, P.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: Assessment of phylogenetic diversity is a key element to the analysis of microbial communities. Tools are needed to handle next-generation sequencing data and to cope with the computational complexity of large-scale studies. Here, we present Treephyler, a tool for fast taxonomic profiling of metagenomes. Treephyler was evaluated on a real metagenome to assess its performance in comparison to previous approaches for taxonomic profiling. Results indicate that Treephyler is in terms of speed and accuracy prepared for next-generation sequencing techniques and large-scale analysis. AVAILABILITY: Treephyler is implemented in Perl, it is portable to all platforms and applicable to both nucleotide and protein input data. Treephyler is freely available for download at http://www.gobics.de/fabian/treephyler.php. CONTACT: fschrei@gwdg.de SUPPLEMENTARY INFORMATION: http://www.gobics.de/fabian/treephyler.php.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20172941&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>SLiM on DIet: Finding Short Linear Motifs on Domain Interaction Interfaces in PDB.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20167627</link>
      <description>Publication Date: 2010 Feb 18 PMID: 20167627&lt;br/&gt;Authors: Willy, H. - Song, F. - Aung, Z. - Ng, S. K. - Sung, W. K.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: An important class of protein interactions involves the binding of a protein's domain to a short linear motif (SLiM) on its interacting partner. Extracting such motifs, either experimentally or computationally, is challenging because of their weak binding and high degree of degeneracy. Recent rapid increase of available protein structures provides an excellent opportunity to study SLiMs directly from their 3D structures. RESULTS: Using Domain Interface extraction (DIet), we characterized 452 distinct SLiMs from the Protein Data Bank (PDB), of which 155 are validated in varying degrees: 40 have literature validation, 54 are supported by at least one domain-peptide structural instance, and another 61 have over-representation in high throughput PPI data. We further observed that the lacklustre coverage of existing computational SLiM detection methods could be due to the common assumption that most SLiMs occur outside globular domain regions. 198 of 452 SLiMs that we reported are actually found on domain-domain interfaces; some of them are implicated in autoimmune and neurodegenerative diseases. We suggest that these SLiMs would be useful for designing inhibitors against the pathogenic protein complexes underlying these diseases. Our findings show that 3D structure-based SLiM detection algorithms can provide a more complete coverage of SLiM-mediated protein interactions than current sequence-based approaches. CONTACT: ksung@comp.nus.edu.sg.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20167627&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>How significant is a protein structure similarity with TM-score=0.5?</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20164152</link>
      <description>Publication Date: 2010 Feb 17 PMID: 20164152&lt;br/&gt;Authors: Xu, J. R. - Zhang, Y.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Protein structure similarity is often measured by RMSD, GDT-score, and TM-score. However, the scores themselves cannot provide information on how significant the structural similarity is. Also, it lacks a quantitative relation between the scores and conventional fold classifications. This paper aims to answer two questions: (1) what is the statistical significance of TM-score? (2) What is the probability of two proteins having the same fold given a specific TM-score? RESULTS: We first made an all-to-all gapless structural match on 6,684 non-homologous single-domain proteins in the PDB and found that the TM-scores follow an extreme value distribution. The data allow us to assign each TM-score a P-value that measures the chance of two randomly selected proteins obtaining an equal or higher TM-score. With a TM-score at 0.5, for instance, its P-value is 5.5x10(-7), which means we need to consider at least 1.8 million random protein pairs to acquire a TM-score of no less than 0.5. Second, we examine the posterior probability of the same fold proteins from three datasets SCOP, CATH and the consensus of SCOP and CATH. It is found that the posterior probability from different datasets has a similar rapid phase transition around TM-score=0.5. This finding indicates that TM-score can be used as an approximate but quantitative criterion for protein topology classification, i.e. protein pairs with a TM-score&gt;0.5 are mostly in the same fold while those with a TM-score&lt;0.5 are mainly not in the same fold. 
CONTACT: zhng@umich.edu.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20164152&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Standard Virtual Biological Parts: A Repository of Modular Modeling Components for Synthetic Biology.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20160009</link>
      <description>Publication Date: 2010 Feb 16 PMID: 20160009&lt;br/&gt;Authors: Cooling, M. T. - Rouilly, V. - Misirli, G. - Lawson, J. - Yu, T. - Hallinan, J. - Wipat, A.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Fabrication of synthetic biological systems is greatly enhanced by incorporating engineering design principles and techniques such as computer-aided design. To this end, the ongoing standardization of biological parts presents an opportunity to develop libraries of standard virtual parts in the form of mathematical models that can be combined to inform system design. RESULTS: We present an online Repository, populated with a collection of standardized models that can readily be recombined to model different biological systems using the inherent modularity support of the CellML 1.1 model exchange format. The applicability of this approach is demonstrated by modeling gold-medal winning iGEM machines. Availability and Implementation: The Repository is available online as part of http://models.cellml.org. We hope to stimulate the worldwide community to reuse and extend the models therein, and contribute to the Repository of Standard Virtual Parts thus founded. CONTACT: m.cooling@auckland.ac.nz SUPPLEMENTARY INFORMATION: Systems Model architecture information for the Systems Model described here, along with an additional example and a tutorial, is also available as Supplementary Information. The example Systems Model from this manuscript can be found at http://models.cellml.org/workspace/bugbuster. The Template models used in the example can be found at http://models.cellml.org/workspace/SVP_Templates200906.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20160009&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>iPHACE: integrative navigation in pharmacological space.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20156991</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20156991&lt;br/&gt;Authors: Garcia-Serna, R. - Ursu, O. - Oprea, T. I. - Mestres, J.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: The increasing availability of experimentally determined binding affinities for drugs on multiple protein targets requires the design of specific mining and visualisation tools that graphically integrate chemical and biological data in an efficient environment. With this aim, we developed iPHACE, an integrative web-based tool to navigate in the pharmacological space defined by small molecule drugs contained in the IUPHAR-DB, with additional interactions present in PDSP. Extending beyond traditional querying and filtering tools, iPHACE offers a means to extract knowledge from the target profile of drugs as well as from the drug profile of protein targets. AVAILABILITY: iPHACE is available at http://cgl.imim.es/iphace/ (EU site) and http://agave.health.unm.edu/iphace/ (US mirror). CONTACT: jmestres@imim.es.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20156991&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Phybase: an R package for species tree analysis.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20156990</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20156990&lt;br/&gt;Authors: Liu, L. - Yu, L.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Phybase is an R package for phylogenetic analysis using species trees. It provides functions to read, write, manipulate, simulate, estimate, summarize and plot species trees which contain not only the topology and branch lengths but also population sizes. AVAILABILITY: The Phybase package is available at the R repository. The manual and supporting materials including source code, sample R code, and sample data files for the species tree analysis are available at http://stat.osu.edu/~liuliang/research/phybase.html CONTACT: lliu@desu.edu.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20156990&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Gene function prediction from synthetic lethality networks via ranking on demand.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20154010</link>
      <description>Publication Date: 2010 Feb 12 PMID: 20154010&lt;br/&gt;Authors: Lippert, C. - Ghahramani, Z. - Borgwardt, K. M.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Synthetic lethal interactions represent pairs of genes whose individual mutations are not lethal, while the double mutation of both genes does incur lethality. Several studies have shown a correlation between functional similarity of genes and their distances in networks based on synthetic lethal interactions. However, there is a lack of algorithms for predicting gene function from synthetic lethality interaction networks. RESULTS: In this paper, we present a novel technique called kernelROD for gene function prediction from synthetic lethal interaction networks based on kernel machines. We apply our novel algorithm to GO functional annotation prediction in yeast. Our experiments show that our method leads to improved gene function prediction compared to state-of-the-art competitors and that combining genetic and congruence networks leads to a further improvement in prediction accuracy. CONTACT: christoph.lippert@tuebingen.mpg.de.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20154010&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>MULTICOM: A Multi-Level Combination Approach to Protein Structure Prediction and its Assessments in CASP8.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20150411</link>
      <description>Publication Date: 2010 Feb 11 PMID: 20150411&lt;br/&gt;Authors: Wang, Z. - Eickholt, J. - Cheng, J.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Protein structure prediction is one of the most important problems in structural bioinformatics. Here we describe MULTICOM, a multi-level combination approach to improve the various steps in protein structure prediction. In contrast to those methods which look for the best templates, alignments and models, our approach tries to combine complementary and alternative templates, alignments and models to achieve on average better accuracy. RESULTS: The multi-level combination approach was implemented via five automated protein structure prediction servers and one human predictor which participated in the eighth Critical Assessment of Techniques for Protein Structure Prediction (CASP8), 2008. The MULTICOM servers and human predictor were consistently ranked among the top predictors on the CASP8 benchmark. The methods can predict moderate- to high-resolution models for most template-based targets and low-resolution models for some template-free targets. The results show that the multi-level combination of complementary templates, alternative alignments, and similar models aided by model quality assessment can systematically improve both template-based and template-free protein modeling. AVAILABILITY: The MULTICOM server is freely available at http://casp.rnet.missouri.edu/multicom_3d.html.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20150411&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Assigning roles to DNA regulatory motifs using comparative genomics.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20147307</link>
      <description>Publication Date: 2010 Feb 10 PMID: 20147307&lt;br/&gt;Authors: Buske, F. A. - Boden, M. - Bauer, D. C. - Bailey, T. L.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Transcription factors (TFs) are crucial during the lifetime of the cell. Their functional roles are defined by the genes they regulate. Uncovering these roles not only sheds light on the TF at hand but puts it into the context of the complete regulatory network. RESULTS: Here, we present an alignment- and threshold-free comparative genomics approach for assigning functional roles to DNA regulatory motifs. We incorporate our approach into the GOMO algorithm, a computational tool for detecting associations between a user-specified DNA regulatory motif (expressed as a position weight matrix) and Gene Ontology (GO) terms. Incorporating multiple species into the analysis significantly improves GOMO's ability to identify GO terms associated with the regulatory targets of TFs. Including three comparative species in the process of predicting TF roles in S. cerevisiae and H. sapiens increases the number of significant predictions by 75% and 200%, respectively. The predicted GO terms are also more specific, yielding deeper biological insight into the role of the TF. Adjusting motif (binding) affinity scores for individual sequence composition proves to be essential for avoiding false-positive associations. We describe a novel DNA sequence-scoring algorithm that compensates a thermodynamic measure of DNA-binding affinity for individual sequence base-composition. GOMO's prediction accuracy proves to be relatively insensitive to how promoters are defined. Because GOMO uses a threshold-free form of gene set analysis, there are no free parameters to tune. Biologists can investigate the potential roles of DNA regulatory motifs of interest using GOMO via the web (http://meme.nbcr.net). 
CONTACT: f.buske@uq.edu.au t.bailey@uq.edu.au.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20147307&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Skyline: An Open Source Document Editor for Creating and Analyzing Targeted Proteomics Experiments.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20147306</link>
      <description>Publication Date: 2010 Feb 9 PMID: 20147306&lt;br/&gt;Authors: Maclean, B. - Tomazela, D. M. - Shulman, N. - Chambers, M. - Finney, G. L. - Frewen, B. - Kern, R. - Tabb, D. L. - Liebler, D. C. - Maccoss, M. J.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: Skyline is a Windows client application for targeted proteomics method creation and quantitative data analysis. It is open source and freely available for academic and commercial use. The Skyline user interface simplifies the development of mass spectrometer methods and the analysis of data from targeted proteomics experiments performed using Selected Reaction Monitoring (SRM). Skyline supports using and creating MS/MS spectral libraries from a wide variety of sources to choose SRM filters and verify results based on previously observed ion trap data. Skyline exports transition lists to and imports the native output files from Agilent, Applied Biosystems, Thermo Fisher Scientific and Waters triple quadrupole instruments, seamlessly connecting mass spectrometer output back to the experimental design document. The fast and compact Skyline file format is easily shared, even for experiments requiring many sample injections. A rich array of graphs displays results and provides powerful tools for inspecting data integrity as data is acquired, helping instrument operators to identify problems early. The Skyline dynamic report designer exports tabular data from the Skyline document model for in-depth analysis with common statistical tools. AVAILABILITY: Single-click, self-updating web installation is available at http://proteome.gs.washington.edu/software/skyline. 
This web site also provides access to instructional videos, a support board, an issues list and a link to the source code project.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20147306&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Estimating replicate time shifts using Gaussian process regression.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20147305</link>
      <description>Publication Date: 2010 Mar 15 PMID: 20147305&lt;br/&gt;Authors: Liu, Q. - Lin, K. K. - Andersen, B. - Smyth, P. - Ihler, A.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Time-course gene expression datasets provide important insights into dynamic aspects of biological processes, such as circadian rhythms, cell cycle and organ development. In a typical microarray time-course experiment, measurements are obtained at each time point from multiple replicate samples. Accurately recovering the gene expression patterns from experimental observations is made challenging by both measurement noise and variation among replicates' rates of development. Prior work on this topic has focused on inference of expression patterns assuming that the replicate times are synchronized. We develop a statistical approach that simultaneously infers both (i) the underlying (hidden) expression profile for each gene, as well as (ii) the biological time for each individual replicate. Our approach is based on Gaussian process regression (GPR) combined with a probabilistic model that accounts for uncertainty about the biological development time of each replicate. RESULTS: We apply GPR with uncertain measurement times to a microarray dataset of mRNA expression for the hair-growth cycle in mouse back skin, predicting both profile shapes and biological times for each replicate. The predicted time shifts show high consistency with independently obtained morphological estimates of relative development. We also show that the method systematically reduces prediction error on out-of-sample data, significantly reducing the mean squared error in a cross-validation study. 
AVAILABILITY: Matlab code for GPR with uncertain time shifts is available at http://sli.ics.uci.edu/Code/GPRTimeshift/ CONTACT: ihler@ics.uci.edu.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20147305&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>A link between H3K27me3 mark and exon length in the gene promoters of pluripotent and differentiated cells.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20147304</link>
      <description>Publication Date: 2010 Feb 9 PMID: 20147304&lt;br/&gt;Authors: Chen, L.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;We conducted a re-analysis of genome-wide histone H3 tail methylation data in mammalian pluripotent and differentiated cells. We show that the promoters marked with histone H3 lysine 27 trimethylation (H3K27me3) tend to have more exonic positions in the promoter regions. This is however not due to any preferential marking on exons over introns by H3K27me3. The relationship is also independent of the status of histone H3 lysine 4 trimethylation (H3K4me3) mark, CpG content, and the platforms used in the high-throughput profiling of histone modifications. It provides evidence for the link between histone modifications and transcribed exons in promoter regions. CONTACT: liang.chen@usc.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20147304&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>SoDA2: A Hidden Markov Model Approach for Inference of Immunoglobulin Rearrangements.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20147303</link>
      <description>Publication Date: 2010 Feb 9 PMID: 20147303&lt;br/&gt;Authors: Munshaw, S. - Kepler, T. B.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: The inference of pre-mutation immunoglobulin (Ig) rearrangements is essential in the study of the antibody repertoires produced in response to infection, in B-cell neoplasms and in autoimmune disease. Often, there are several rearrangements that are nearly equivalent as candidates for a given Ig gene, but have different consequences in an analysis. Our aim in this paper is to develop a probabilistic model of the rearrangement process and a Bayesian method for estimating posterior probabilities for the comparison of multiple plausible rearrangements. RESULTS: We have developed SoDA2, which is based on a Hidden Markov Model and used to compute the posterior probabilities of candidate rearrangements and to find those with the highest values among them. We validated the software on a set of simulated data, a set of clonally related sequences, and a group of randomly selected Ig heavy chains from Genbank. In all tests, SoDA2 performed better than other available software for the task. Furthermore, the output format has been redesigned, in part, to facilitate comparison of multiple solutions. AVAILABILITY: SoDA2 is available online at https://hippocrates.duhs.duke.edu/soda. Simulated sequences available upon request. CONTACT: kepler@duke.edu.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20147303&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Fast and SNP-tolerant detection of complex variants and splicing in short reads.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20147302</link>
      <description>Publication Date: 2010 Feb 10 PMID: 20147302&lt;br/&gt;Authors: Wu, T. D. - Nacu, S.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Next-generation sequencing captures sequence differences in reads relative to a reference genome or transcriptome, including splicing events and complex variants involving multiple mismatches and long indels. We present computational methods for fast detection of complex variants and splicing in short reads, based on a successively constrained search process of merging and filtering position lists from a genomic index. Our implementation GSNAP can align both single-end and paired-end reads as short as 14 nt and of arbitrarily long length. It can detect short- and long-distance splicing, including interchromosomal splicing, in individual reads using probabilistic models or a database of known splice sites. Our program also permits SNP-tolerant alignment to a reference space of all possible combinations of major and minor alleles, and can align reads from bisulfite treated DNA for the study of methylation state. RESULTS: In comparison testing, GSNAP has speeds comparable to existing programs, especially in reads of 70 nucleotides or more, and is fastest in detecting complex variants with 4 or more mismatches or insertions of 1-9 nucleotides and deletions of 1-30 nucleotides. Although SNP tolerance does not increase alignment yield substantially, it affects alignment results in 7-8% of transcriptional reads, typically by revealing alternate genomic mappings for a read. Simulations of bisulfite-converted DNA show a decrease in identifying genomic positions uniquely in 6% of 36-nt reads and 3% of 70-nt reads. AVAILABILITY: Source code in C and utility programs in Perl are freely available for download as part of the GMAP package at http://share.gene.com/gmap. 
CONTACT: twu@gene.com.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20147302&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Microindel detection in short-read sequence data.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20144947</link>
      <description>Publication Date: 2010 Mar 15 PMID: 20144947&lt;br/&gt;Authors: Krawitz, P. - Rodelsperger, C. - Jager, M. - Jostins, L. - Bauer, S. - Robinson, P. N.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Several recent studies have demonstrated the effectiveness of resequencing and single nucleotide variant (SNV) detection by deep short-read sequencing platforms. While several reliable algorithms are available for automated SNV detection, the automated detection of microindels in deep short-read data presents a new bioinformatics challenge. RESULTS: We systematically analyzed how the short-read mapping tools MAQ, Bowtie, Burrows-Wheeler alignment tool (BWA), Novoalign and RazerS perform on simulated datasets that contain indels and evaluated how indels affect error rates in SNV detection. We implemented a simple algorithm to compute the equivalent indel region eir, which can be used to process the alignments produced by the mapping tools in order to perform indel calling. Using simulated data that contains indels, we demonstrate that indel detection works well on short-read data: the detection rate for microindels (&lt;4 bp) is &gt;90%. Our study provides insights into systematic errors in SNV detection that is based on ungapped short sequence read alignments. Gapped alignments of short sequence reads can be used to reduce this error and to detect microindels in simulated short-read data. A comparison with microindels automatically identified on the ABI Sanger and Roche 454 platform indicates that microindel detection from short sequence reads identifies both overlapping and distinct indels. 
CONTACT: peter.krawitz@googlemail.com; peter.robinson@charite.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20144947&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Regulatory Impact Factors: Unraveling the transcriptional regulation of complex traits from expression data.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20144946</link>
      <description>Publication Date: 2010 Feb 9 PMID: 20144946&lt;br/&gt;Authors: Reverter-Gomez, A. - Hudson, N. J. - Nagaraj, S. H. - Perez-Enciso, M. - Dalrymple, B. P.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Although transcription factors (TF) play a central regulatory role, their detection from expression data is limited due to their low, and often sparse, expression. In order to fill this gap, we propose a regulatory impact factor (RIF) metric to identify critical TF from gene expression data. RESULTS: To substantiate the generality of RIF, we explore a set of experiments spanning a wide range of scenarios including breast cancer survival, fat, gonads and sex differentiation. We show that the strength of RIF lies in its ability to simultaneously integrate three sources of information into a single measure: i) the change in correlation existing between the TF and the differentially expressed (DE) genes; ii) the amount of differential expression of DE genes; and iii) the abundance of DE genes. As a result, RIF analysis assigns an extreme score to those TF that are consistently most differentially co-expressed with the highly abundant and highly DE genes (RIF1), and to those TF with the most altered ability to predict the abundance of DE genes (RIF2). We show that RIF analysis alone recovers well-known experimentally validated TF for the processes studied. The TF identified confirm the importance of PPAR signaling in adipose development and the importance of transduction of estrogen signals in breast cancer survival and sexual differentiation. We argue that RIF has universal applicability, and advocate its use as a promising hypotheses generating tool for the systematic identification of novel TF not yet documented as critical. 
CONTACT: Tony.Reverter-Gomez@csiro.au.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20144946&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Improving protein secondary structure prediction using a simple k-mer model.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20130034</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20130034&lt;br/&gt;Authors: Madera, M. - Calmus, R. - Thiltgen, G. - Karplus, K. - Gough, J.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Some first order methods for protein sequence analysis inherently treat each position as independent. We develop a general framework for introducing longer range interactions. We then demonstrate the power of our approach by applying it to secondary structure prediction; under the independence assumption, sequences produced by existing methods can produce features that are not protein like, an extreme example being a helix of length 1. Our goal was to make the predictions from state of the art methods more realistic, without loss of performance by other measures. RESULTS: Our framework for longer range interactions is described as a k-mer order model. We succeeded in applying our model to the specific problem of secondary structure prediction, to be used as an additional layer on top of existing methods. We achieved our goal of making the predictions more realistic and protein like, and remarkably this also improved the overall performance. We improve the Segment OVerlap (SOV) score by 1.8%, but more importantly we radically improve the probability of the real sequence given a prediction from an average of 0.271 per residue to 0.385. Crucially, this improvement is obtained using no additional information. AVAILABILITY: http://supfam.cs.bris.ac.uk/kmer&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20130034&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Visualizing SNP statistics in the context of linkage disequilibrium using LD-Plus.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20130027</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20130027&lt;br/&gt;Authors: Bush, W. S. - Dudek, S. M. - Ritchie, M. D.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: Often in human genetic analysis, multiple tables of single nucleotide polymorphism (SNP) statistics are shown alongside a Haploview style correlation plot. Readers are then asked to make inferences that incorporate knowledge across these multiple sets of results. To better facilitate a collective understanding of all available data, we developed a Ruby-based web application, LD-Plus, to generate figures that simultaneously display physical location of SNPs, binary SNP attributes (such as coding/non-coding or presence on genotyping platforms), common haplotypes and their frequencies and continuously scaled values (such as F(st), minor allele frequency, genotyping efficiency or P-values), all in the context of the D' and r(2) linkage disequilibrium structures. Combining these results into one comprehensive figure reduces dereferencing between figures and tables, and can provide unique insights into genetic features that are not clearly seen when results are partitioned across multiple figures and tables.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20130027&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Exploring classification strategies with the CoEPrA 2006 contest.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20097914</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20097914&lt;br/&gt;Authors: Demir-Kavuk, O. - Riedesel, H. - Knapp, E. W.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: In silico methods to classify compounds as potential drugs that bind to a specific target are becoming increasingly important for drug design. To build classification devices, training sets of drugs with known activities are needed. For many such classification problems, not only qualitative but also quantitative information on a specific property (e.g. binding affinity) is available. The latter can be used to build a regression scheme to predict this property for new compounds. Predicting a compound property explicitly is generally more difficult than classifying whether the property lies below or above a given threshold value. Hence, an indirect classification that is based on regression may lead to poorer results than a direct classification scheme. In fact, initially researchers are only interested in classifying compounds as potential drugs. The activities of these compounds are subsequently measured in the wet lab. RESULTS: We propose a novel approach that uses available quantitative information directly for classification rather than first using a regression scheme. It uses a new type of loss function called weighted biased regression. Application of this method to four widely studied datasets of the CoEPrA contest (Comparative Evaluation of Prediction Algorithms, http://coepra.org) shows that it can outperform simple classification methods that do not make use of this additional quantitative information.
AVAILABILITY: A stand-alone application is available at the webpage http://agknapp.chemie.fu-berlin.de/agknapp/index.php?menu=software&amp;page=PeptideClassifier that can be used to build a model for a peptide training set to be submitted.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20097914&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Computational quantification of metabolic fluxes from a single isotope snapshot: application to an animal biopsy.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20097912</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20097912&lt;br/&gt;Authors: Binsl, T. W. - Alders, D. J. - Heringa, J. - Groeneveld, A. B. - van Beek, J. H.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Quantitative determination of metabolic fluxes in single tissue biopsies is difficult. We report a novel analysis approach and software package for in vivo flux quantification using stable isotope labeling. RESULTS: We developed a protocol based on brief, timed infusion of (13)C isotope-enriched substrates for the tricarboxylic acid (TCA) cycle followed by quick freezing of tissue biopsies. NMR measurements of tissue extracts were used for flux estimation based on a computational model of carbon transitions between TCA cycle metabolites and related amino acids. To this end, we developed a computational framework in which metabolic systems can be flexibly assembled, simulated and analyzed. Flux parameters were quantified from NMR multiplets by a partial grid search followed by repeated Nelder-Mead optimizations implemented on a computer grid. We implemented a model of the TCA cycle and showed by extensive simulations that the timed infusion protocol reliably quantitates multiple fluxes. Experimental validation of the method was done in vivo on hearts of anesthetized pigs under two different conditions: basal state (n = 7) and cardiac stress caused by infusion of dobutamine (n = 7). About nine tissue samples (40-200 mg dry-weight) were taken per heart. TCA cycle flux was 6.11 +/- 0.28 (SEM) micromol/min x gdw at baseline versus 9.29 +/- 1.03 micromol/min x gdw for dobutamine stress. Oxygen consumption calculated from the TCA cycle flux and from 'gold standard' blood gas-based measurements were close, correlating with r=0.88 (P &lt; 10(-4)). Spatial heterogeneity in metabolic fluxes is detectable amongst the small samples. We propose that our novel isotope snapshot methodology is suitable for flux measurements in biopsies in vivo. 
AVAILABILITY: Non-profit organizations will, upon request, be granted a non-exclusive license to use the software for internal research and teaching purposes at no charge. A web interface for using the software on our computer grid is available under http://www.ibi.vu.nl/programs/&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20097912&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>ProteinWorldDB: querying radical pairwise alignments among protein sets from complete genomes.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20089515</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20089515&lt;br/&gt;Authors: Otto, T. D. - Catanho, M. - Tristao, C. - Bezerra, M. - Fernandes, R. M. - Elias, G. S. - Scaglia, A. C. - Bovermann, B. - Berstis, V. - Lifschitz, S. - de Miranda, A. B. - Degrave, W.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Many analyses in modern biological research are based on comparisons between biological sequences, resulting in functional, evolutionary and structural inferences. When large numbers of sequences are compared, heuristics are often used resulting in a certain lack of accuracy. In order to improve and validate results of such comparisons, we have performed radical all-against-all comparisons of 4 million protein sequences belonging to the RefSeq database, using an implementation of the Smith-Waterman algorithm. This extremely intensive computational approach was made possible with the help of World Community Grid, through the Genome Comparison Project. The resulting database, ProteinWorldDB, which contains coordinates of pairwise protein alignments and their respective scores, is now made available. Users can download, compare and analyze the results, filtered by genomes, protein functions or clusters. ProteinWorldDB is integrated with annotations derived from Swiss-Prot, Pfam, KEGG, NCBI Taxonomy database and gene ontology. The database is a unique and valuable asset, representing a major effort to create a reliable and consistent dataset of cross-comparisons of the whole protein content encoded in hundreds of completely sequenced genomes using a rigorous dynamic programming approach. AVAILABILITY: The database can be accessed through http://proteinworlddb.org&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20089515&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>iDBPs: a web server for the identification of DNA binding proteins.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20089514</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20089514&lt;br/&gt;Authors: Nimrod, G. - Schushan, M. - Szilagyi, A. - Leslie, C. - Ben-Tal, N.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: The iDBPs server uses the three-dimensional (3D) structure of a query protein to predict whether it binds DNA. First, the algorithm predicts the functional region of the protein based on its evolutionary profile; the assumption is that large clusters of conserved residues are good markers of functional regions. Next, various characteristics of the predicted functional region as well as global features of the protein are calculated, such as the average surface electrostatic potential, the dipole moment and cluster-based amino acid conservation patterns. Finally, a random forests classifier is used to predict whether the query protein is likely to bind DNA and to estimate the prediction confidence. We have trained and tested the classifier on various datasets and shown that it outperformed related methods. On a dataset that reflects the fraction of DNA binding proteins (DBPs) in a proteome, the area under the ROC curve was 0.90. The application of the server to an updated version of the N-Func database, which contains proteins of unknown function with solved 3D-structure, suggested new putative DBPs for experimental studies. AVAILABILITY: http://idbps.tau.ac.il/&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20089514&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>rMAT--an R/Bioconductor package for analyzing ChIP-chip experiments.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20089513</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20089513&lt;br/&gt;Authors: Droit, A. - Cheung, C. - Gottardo, R.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: Chromatin immunoprecipitation combined with DNA microarrays (ChIP-chip) has evolved as a popular technique to study DNA-protein binding or post-translational chromatin/histone modifications at the genomic level. However, the raw microarray intensities generate a massive amount of data, creating a need for efficient analysis algorithms and statistical methods to identify enriched regions. RESULTS: We present a fast, free and powerful open source R package, rMAT, that allows the identification of regions enriched for transcription factor binding sites in ChIP-chip experiments on Affymetrix tiling arrays. AVAILABILITY: The R package rMAT is available from the Bioconductor web site at http://bioconductor.org and runs on Linux, Mac OS and MS-Windows. rMAT is distributed under the terms of the Artistic Licence 2.0.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20089513&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Protein secondary structure appears to be robust under in silico evolution while protein disorder appears not to be.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20081223</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20081223&lt;br/&gt;Authors: Schaefer, C. - Schlessinger, A. - Rost, B.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: The mutation of amino acids often impacts protein function and structure. Mutations without negative effect sustain evolutionary pressure. We study a particular aspect of structural robustness with respect to mutations: regular protein secondary structure and natively unstructured (intrinsically disordered) regions. Is the formation of regular secondary structure an intrinsic feature of amino acid sequences, or is it a feature that is lost upon mutation and is maintained by evolution against the odds? Similarly, is disorder an intrinsic sequence feature or is it difficult to maintain? To tackle these questions, we in silico mutated native protein sequences into random sequence-like ensembles and monitored the change in predicted secondary structure and disorder. RESULTS: We established that by our coarse-grained measures for change, predictions and observations were similar, suggesting that our results were not biased by prediction mistakes. Changes in secondary structure and disorder predictions were linearly proportional to the change in sequence. Surprisingly, neither the content nor the length distribution for the predicted secondary structure changed substantially. Regions with long disorder behaved differently in that significantly fewer such regions were predicted after a few mutation steps. Our findings suggest that the formation of regular secondary structure is an intrinsic feature of random amino acid sequences, while the formation of long-disordered regions is not an intrinsic feature of proteins with disordered regions. Put differently, helices and strands appear to be maintained easily by evolution, whereas maintaining disordered regions appears difficult. 
Neutral mutations with respect to disorder are therefore very unlikely.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20081223&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Multifactor dimensionality reduction for graphics processing units enables genome-wide testing of epistasis in sporadic ALS.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20081222</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20081222&lt;br/&gt;Authors: Greene, C. S. - Sinnott-Armstrong, N. A. - Himmelstein, D. S. - Park, P. J. - Moore, J. H. - Harris, B. T.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Epistasis, the presence of gene-gene interactions, has been hypothesized to be at the root of many common human diseases, but current genome-wide association studies largely ignore its role. Multifactor dimensionality reduction (MDR) is a powerful model-free method for detecting epistatic relationships between genes, but computational costs have made its application to genome-wide data difficult. Graphics processing units (GPUs), the hardware responsible for rendering computer games, are powerful parallel processors. Using GPUs to run MDR on a genome-wide dataset allows for statistically rigorous testing of epistasis. RESULTS: The implementation of MDR for GPUs (MDRGPU) includes core features of the widely used Java software package, MDR. This GPU implementation allows for large-scale analysis of epistasis at a dramatically lower cost than the standard CPU-based implementations. As a proof-of-concept, we applied this software to a genome-wide study of sporadic amyotrophic lateral sclerosis (ALS). We discovered a statistically significant two-SNP classifier and subsequently replicated the significance of these two SNPs in an independent study of ALS. MDRGPU makes the large-scale analysis of epistasis tractable and opens the door to statistically rigorous testing of interactions in genome-wide datasets. AVAILABILITY: MDRGPU is open source and available free of charge from http://www.sourceforge.net/projects/mdr.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20081222&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Next-generation bioinformatics: using many-core processor architecture to develop a web service for sequence alignment.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20081221</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20081221&lt;br/&gt;Authors: Galvez, S. - Diaz, D. - Hernandez, P. - Esteban, F. J. - Caballero, J. A. - Dorado, G.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Bioinformatics algorithms and computing power are the main bottlenecks for analyzing the huge amounts of data generated by current technologies, such as the 'next-generation' sequencing methodologies. At the same time, the most powerful microprocessors are based on many-core chips, yet most applications cannot exploit such power, which requires parallelized algorithms. As an example of next-generation bioinformatics, we have developed from scratch a new parallelization of the Needleman-Wunsch (NW) sequence alignment algorithm for the 64-core Tile64 microprocessor. The unprecedented performance it offers for a standalone personal computer (PC) is discussed, optimally aligning sequences up to 20 times faster than the non-parallelized version, thus saving valuable time. AVAILABILITY: This algorithm is available as a free web service for the scientific community at http://www.sicuma.uma.es/multicore. The open source code is also available on that site.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20081221&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Random distance dependent attachment as a model for neural network generation in the Caenorhabditis elegans.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20081220</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20081220&lt;br/&gt;Authors: Itzhack, R. - Louzoun, Y.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: The topology of the network induced by neuron connectivity in Caenorhabditis elegans differs from most common random networks. The neuron positions in C.elegans have previously been explained as being optimal to induce the required network wiring. We here propose a complementary explanation: that the network wiring is the direct result of a local stochastic synapse formation process. RESULTS: We show that a model based on the physical distance between neurons can explain the C.elegans neural network structure. Specifically, we demonstrate that a simple model based on a geometrical synapse formation probability and the inhibition of short coherent cycles can explain the properties of the C.elegans neural network. We suggest this model as an initial framework to discuss neural network generation and as a first step toward the development of models for more advanced creatures. In order to measure the circle frequency in the network, a novel graph-theory circle length measurement algorithm is proposed.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20081220&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>PSiFR: an integrated resource for prediction of protein structure and function.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20080513</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20080513&lt;br/&gt;Authors: Pandit, S. B. - Brylinski, M. - Zhou, H. - Gao, M. - Arakaki, A. K. - Skolnick, J.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;In the post-genomic era, the annotation of protein function facilitates the understanding of various biological processes. To extend the range of function annotation methods to the twilight zone of sequence identity, we have developed approaches that exploit both protein tertiary structure and/or protein sequence evolutionary relationships. To serve the scientific community, we have integrated the structure prediction tools, TASSER, TASSER-Lite and METATASSER, and the functional inference tools, FINDSITE, a structure-based algorithm for binding site prediction, Gene Ontology molecular function inference and ligand screening, EFICAz(2), a sequence-based approach to enzyme function inference and DBD-hunter, an algorithm for predicting DNA-binding proteins and associated DNA-binding residues, into a unified web resource, Protein Structure and Function prediction Resource (PSiFR). Availability and implementation: PSiFR is freely available for use on the web at http://psifr.cssb.biology.gatech.edu/&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20080513&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Bayesian rule learning for biomedical data mining.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20080512</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20080512&lt;br/&gt;Authors: Gopalakrishnan, V. - Lustgarten, J. L. - Visweswaran, S. - Cooper, G. F.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Disease state prediction from biomarker profiling studies is an important problem because more accurate classification models will potentially lead to the discovery of better, more discriminative markers. Data mining methods are routinely applied to such analyses of biomedical datasets generated from high-throughput 'omic' technologies applied to clinical samples from tissues or bodily fluids. Past work has demonstrated that rule models can be successfully applied to this problem, since they can produce understandable models that facilitate review of discriminative biomarkers by biomedical scientists. While many rule-based methods produce rules that make predictions under uncertainty, they typically do not quantify the uncertainty in the validity of the rule itself. This article describes an approach that uses a Bayesian score to evaluate rule models. RESULTS: We have combined the expressiveness of rules with the mathematical rigor of Bayesian networks (BNs) to develop and evaluate a Bayesian rule learning (BRL) system. This system utilizes a novel variant of the K2 algorithm for building BNs from the training data to provide probabilistic scores for IF-antecedent-THEN-consequent rules using heuristic best-first search. We then apply rule-based inference to evaluate the learned models during 10-fold cross-validation performed two times. The BRL system is evaluated on 24 published 'omic' datasets, and on average it performs on par with or better than other readily available rule learning methods.
Moreover, BRL produces models that contain on average 70% fewer variables, which means that the biomarker panels for disease prediction contain fewer markers for further verification and validation by bench scientists.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20080512&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Faster computation of exact RNA shape probabilities.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20080511</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20080511&lt;br/&gt;Authors: Janssen, S. - Giegerich, R.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Abstract shape analysis allows efficient computation of a representative sample of low-energy foldings of an RNA molecule. More comprehensive information is obtained by computing shape probabilities, accumulating the Boltzmann probabilities of all structures within each abstract shape. Such information is superior to free energies because it is independent of sequence length and base composition. However, up to this point, computation of shape probabilities evaluates all shapes simultaneously and comes with a computation cost which is exponential in the length of the sequence. RESULTS: We devise an approach called RapidShapes that computes the shapes above a specified probability threshold T by generating a list of promising shapes and constructing specialized folding programs for each shape to compute its share of Boltzmann probability. This aims at a heuristic improvement of runtime, while still computing exact probability values. Conclusion: Evaluating this approach and several substrategies, we find that only a small proportion of shapes have to be actually computed. For an RNA sequence of length 400, this leads, depending on the threshold, to a 10-138 fold speed-up compared with the previous complete method. Thus, probabilistic shape analysis has become feasible in medium-scale applications, such as the screening of RNA transcripts in a bacterial genome. AVAILABILITY: RapidShapes is available via http://bibiserv.cebitec.uni-bielefeld.de/rnashapes&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20080511&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>NEMO: a tool for analyzing gene and chromosome territory distributions from 3D-FISH experiments.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20080510</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20080510&lt;br/&gt;Authors: Iannuccelli, E. - Mompart, F. - Gellin, J. - Lahbib-Mansais, Y. - Yerle, M. - Boudier, T.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;Three-dimensional fluorescence in situ hybridization (3D-FISH) is used to study the organization and the positioning of chromosomes or specific sequences such as genes or RNA in cell nuclei. Many different programs (commercial or free) allow image analysis for 3D-FISH experiments. One of the more efficient open-source programs for automatically processing 3D-FISH microscopy images is Smart 3D-FISH, an ImageJ plug-in designed to automatically analyze distances between genes. One of the drawbacks of Smart 3D-FISH is that it has a rather basic user interface and produces its results in various text and image files, thus making the data post-processing step time-consuming. We developed a new Smart 3D-FISH graphical user interface, NEMO, which provides all information in the same place so that results can be checked and validated efficiently. NEMO gives users the ability to drive the analysis of their experiments in automatic, semi-automatic or manual detection mode. We also tuned Smart 3D-FISH to better analyze chromosome territories. AVAILABILITY: NEMO is a stand-alone Java application available for Windows and Linux platforms. The program is distributed under the Creative Commons licence and can be freely downloaded from https://www-lgc.toulouse.inra.fr/nemo&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20080510&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>MSMSpdbb: providing protein databases of closely related organisms to improve proteomic characterization of prokaryotic microbes.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20080508</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20080508&lt;br/&gt;Authors: de Souza, G. A. - Arntzen, M. O. - Wiker, H. G.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;The Microbial Proteomic Resource (MPR) is a repository service that contains non-redundant protein databases of related bacterial strains, which were generated with in-house software called Multi-Strain Mass Spectrometry Prokaryotic DataBase Builder (MSMSpdbb). MSMSpdbb merges and clusters protein sequences inferred from genomic sequences, and provides a protein list in FASTA format that accounts for divergence in gene annotation, translational start site choice and presence of single nucleotide polymorphisms and other mutations. AVAILABILITY: MSMSpdbb was developed in C++ using the Qt libraries (Nokia) and licensed under the GNU General Public License version 2. MSMSpdbb is freely available, and its installation files, instructions for use and additional documentation can be found at the MPR web site http://org.uib.no/prokaryotedb/ and can also be found at Proteomecommons.org (see Supplementary Methods for Hash number).&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20080508&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Active site prediction using evolutionary and structural information.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20080507</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20080507&lt;br/&gt;Authors: Sankararaman, S. - Sha, F. - Kirsch, J. F. - Jordan, M. I. - Sjolander, K.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: The identification of catalytic residues is a key step in understanding the function of enzymes. While a variety of computational methods have been developed for this task, accuracies have remained fairly low. The best existing method exploits information from sequence and structure to achieve a precision (the fraction of predicted catalytic residues that are catalytic) of 18.5% at a corresponding recall (the fraction of catalytic residues identified) of 57% on a standard benchmark. Here we present a new method, Discern, which provides a significant improvement over the state-of-the-art through the use of statistical techniques to derive a model with a small set of features that are jointly predictive of enzyme active sites. RESULTS: In cross-validation experiments on two benchmark datasets from the Catalytic Site Atlas and CATRES resources containing a total of 437 manually curated enzymes spanning 487 SCOP families, Discern increases catalytic site recall by between 12% and 20% over methods that combine information from both sequence and structure, and by ≥50% over methods that make use of sequence conservation signal only.
Controlled experiments show that Discern's improvement in catalytic residue prediction is derived from the combination of three ingredients: the use of the INTREPID phylogenomic method to extract conservation information; the use of 3D structure data, including features computed for residues that are proximal in the structure; and a statistical regularization procedure to prevent overfitting.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20080507&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>A censored beta mixture model for the estimation of the proportion of non-differentially expressed genes.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20080506</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20080506&lt;br/&gt;Authors: Markitsis, A. - Lai, Y.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: The proportion of non-differentially expressed genes (pi(0)) is an important quantity in microarray data analysis. Although many statistical methods have been proposed for its estimation, it is still necessary to develop more efficient methods. METHODS: Our approach for improving pi(0) estimation is to modify an existing simple method by introducing artificial censoring to P-values. In a comprehensive simulation study and the applications to experimental datasets, we compare our method with eight existing estimation methods. RESULTS: The simulation study confirms that our method can clearly improve the estimation performance. Compared with the existing methods, our method can generally provide a relatively accurate estimate with relatively small variance. Using experimental microarray datasets, we also demonstrate that our method can generally provide satisfactory estimates in practice. AVAILABILITY: The R code is freely available at http://home.gwu.edu/~ylai/research/CBpi0/.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20080506&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Fast and accurate long-read alignment with Burrows-Wheeler transform.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20080505</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20080505&lt;br/&gt;Authors: Li, H. - Durbin, R.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Many programs for aligning short sequencing reads to a reference genome have been developed in the last 2 years. Most of them are very efficient for short reads but inefficient or not applicable for reads &gt;200 bp because the algorithms are heavily and specifically tuned for short queries with low sequencing error rate. However, some sequencing platforms already produce longer reads and others are expected to become available soon. For longer reads, hashing-based software such as BLAT and SSAHA2 remain the only choices. Nonetheless, these methods are substantially slower than short-read aligners in terms of aligned bases per unit time. RESULTS: We designed and implemented a new algorithm, Burrows-Wheeler Aligner's Smith-Waterman Alignment (BWA-SW), to align long sequences up to 1 Mb against a large sequence database (e.g. the human genome) with a few gigabytes of memory. The algorithm is as accurate as SSAHA2, more accurate than BLAT, and is several to tens of times faster than both. AVAILABILITY: http://bio-bwa.sourceforge.net&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20080505&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>COPS benchmark: interactive analysis of database search methods.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20080504</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20080504&lt;br/&gt;Authors: Frank, K. - Gruber, M. - Sippl, M. J.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: The performance of sequence database search methods is usually judged by receiver operating characteristic (ROC) analysis. The proper interpretation of the results obtained and a fair comparison across different methods critically depends on the properties of the data set used for such an analysis; in particular, each query must have the same number of true positives and true negatives. Here, we present a novel web service based on a dataset specifically designed for ROC analysis and the investigation of alignment quality. The data set is derived from a quantitative classification of protein structures (COPS), while analysis and results are presented through an intuitive web interface. The analysis provides details such as false positives per query, and visualization of the structural similarity between query and targets. Most importantly, results obtained for a specific alignment method are immediately related to those obtained for several popular standard sequence alignment methods.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20080504&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>BamView: viewing mapped read alignment data in the context of the reference sequence.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20071372</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20071372&lt;br/&gt;Authors: Carver, T. - Bohme, U. - Otto, T. D. - Parkhill, J. - Berriman, M.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: BamView is an interactive Java application for visualizing the large amounts of data stored for sequence reads which are aligned against a reference genome sequence. It supports the BAM (Binary Alignment/Map) format. It can be used in a number of contexts including SNP calling and structural annotation. BamView has also been integrated into Artemis so that the reads can be viewed in the context of the nucleotide sequence and genomic features. AVAILABILITY: BamView and Artemis are freely available (under a GPL licence) for download (for MacOSX, UNIX and Windows) at: http://bamview.sourceforge.net/&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20071372&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>PyRosetta: a script-based interface for implementing molecular modeling algorithms using Rosetta.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20061306</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20061306&lt;br/&gt;Authors: Chaudhury, S. - Lyskov, S. - Gray, J. J.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: PyRosetta is a stand-alone Python-based implementation of the Rosetta molecular modeling package that allows users to write custom structure prediction and design algorithms using the major Rosetta sampling and scoring functions. PyRosetta contains Python bindings to libraries that define Rosetta functions including those for accessing and manipulating protein structure, calculating energies and running Monte Carlo-based simulations. PyRosetta can be used in two ways: (i) interactively, using iPython and (ii) script-based, using Python scripting. Interactive mode contains a number of help features and is ideal for beginners while script-mode is best suited for algorithm development. PyRosetta has similar computational performance to Rosetta, can be easily scaled up for cluster applications and has been implemented for algorithms demonstrating protein docking, protein folding, loop modeling and design. AVAILABILITY: PyRosetta is a stand-alone package available at http://www.pyrosetta.org under the Rosetta license which is free for academic and non-profit users. A tutorial, user's manual and sample scripts demonstrating usage are also available on the web site.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20061306&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>CD-HIT Suite: a web server for clustering and comparing biological sequences.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20053844</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20053844&lt;br/&gt;Authors: Huang, Y. - Niu, B. - Gao, Y. - Fu, L. - Li, W.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;CD-HIT is a widely used program for clustering and comparing large biological sequence datasets. In order to further assist the CD-HIT users, we significantly improved this program with more functions and better accuracy, scalability and flexibility. Most importantly, we developed a new web server, CD-HIT Suite, for clustering a user-uploaded sequence dataset or comparing it to another dataset at different identity levels. Users can now interactively explore the clusters within web browsers. We also provide downloadable clusters for several public databases (NCBI NR, Swissprot and PDB) at different identity levels. AVAILABILITY: Free access at http://cd-hit.org&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20053844&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>dbTEU: a protein database of trace element utilization.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20053843</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20053843&lt;br/&gt;Authors: Zhang, Y. - Gladyshev, V. N.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;Biological trace elements are required for numerous biological processes and by all organisms. We describe a database, dbTEU (DataBase of Trace Element Utilization), that features known transporters and user proteins for five trace elements (copper, molybdenum, nickel, cobalt and selenium) and represents sequenced organisms from the three domains of life. The manually curated dbTEU currently includes approximately 16,500 proteins from &gt;700 organisms, and offers interactive trace element, protein, organism and sequence search and browse tools. Availability and Implementation: dbTEU is freely available at http://gladyshevlab.bwh.harvard.edu/trace_element/&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20053843&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Xper2: introducing e-taxonomy.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20053842</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20053842&lt;br/&gt;Authors: Ung, V. - Dubus, G. - Zaragueta-Bagils, R. - Vignes-Lebbe, R.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Computer Aided Identification systems provide users with the resources to relate morpho-anatomic observations with taxa names and to subsequently access other knowledge about the organisms. They have the ability to manage descriptive data and make identifications through interactive keys. They are essential for both authors and users of biodiversity information. Xper(2) version 2.0 is one of the most user-friendly tools in its category and provides a complete environment dedicated to taxonomic management. AVAILABILITY: Xper(2) software can be freely downloaded at http://lis-upmc.snv.jussieu.fr/lis/?q=en/resources/softwares/xper2&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20053842&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Bioinformatics challenges for genome-wide association studies.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20053841</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20053841&lt;br/&gt;Authors: Moore, J. H. - Asselbergs, F. W. - Williams, S. M.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: The sequencing of the human genome has made it possible to identify an informative set of &gt;1 million single nucleotide polymorphisms (SNPs) across the genome that can be used to carry out genome-wide association studies (GWASs). The availability of massive amounts of GWAS data has necessitated the development of new biostatistical methods for quality control, imputation and analysis issues including multiple testing. This work has been successful and has enabled the discovery of new associations that have been replicated in multiple studies. However, it is now recognized that most SNPs discovered via GWAS have small effects on disease susceptibility and thus may not be suitable for improving health care through genetic testing. One likely explanation for the mixed results of GWAS is that the current biostatistical analysis paradigm is by design agnostic or unbiased in that it ignores all prior knowledge about disease pathobiology. Further, the linear modeling framework that is employed in GWAS often considers only one SNP at a time thus ignoring their genomic and environmental context. There is now a shift away from the biostatistical approach toward a more holistic approach that recognizes the complexity of the genotype-phenotype relationship that is characterized by significant heterogeneity and gene-gene and gene-environment interaction. We argue here that bioinformatics has an important role to play in addressing the complexity of the underlying genetic basis of common human diseases. 
The goal of this review is to identify and discuss those GWAS challenges that will require computational methods.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20053841&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Disambiguating the species of biomedical named entities using natural language parsers.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20053840</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20053840&lt;br/&gt;Authors: Wang, X. - Tsujii, J. - Ananiadou, S.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Text mining technologies have been shown to reduce the laborious work involved in organizing the vast amount of information hidden in the literature. One challenge in text mining is linking ambiguous word forms to unambiguous biological concepts. This article reports on a comprehensive study on resolving the ambiguity in mentions of biomedical named entities with respect to model organisms and presents an array of approaches, with focus on methods utilizing natural language parsers. RESULTS: We build a corpus for organism disambiguation where every occurrence of protein/gene entity is manually tagged with a species ID, and evaluate a number of methods on it. Promising results are obtained by training a machine learning model on syntactic parse trees, which is then used to decide whether an entity belongs to the model organism denoted by a neighbouring species-indicating word (e.g. yeast). The parser-based approaches are also compared with a supervised classification method and results indicate that the former are a more favorable choice when domain portability is of concern. The best overall performance is obtained by combining the strengths of syntactic features and supervised classification. AVAILABILITY: The corpus and demo are available at http://www.nactem.ac.uk/deca_details/start.cgi, and the software is freely available as U-Compare components (Kano et al., 2009): NaCTeM Species Word Detector and NaCTeM Species Disambiguator. U-Compare is available at http://-compare.org/&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20053840&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>GWAS analyzer: integrating genotype, phenotype and public annotation data for genome-wide association study analysis.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20053839</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20053839&lt;br/&gt;Authors: Fong, C. - Ko, D. C. - Wasnick, M. - Radey, M. - Miller, S. I. - Brittnacher, M.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Genome-wide association studies are beginning to elucidate how our genetic differences contribute to susceptibility and severity of disease. While computational tools have previously been developed to support various aspects of genome-wide association studies, there is currently a need for informatics solutions that facilitate the integration of data from multiple sources. RESULTS: Here we present GWAS Analyzer, a database driven web-based tool that integrates genotype and phenotype data, association analysis results and genomic annotations from multiple public resources. GWAS Analyzer contains features for browsing these interrelated data, exploring phenotypic values by family or genotype, and filtering association results based on multiple criteria. The utility of the tool has been demonstrated by a genome-wide association study of human in vitro susceptibility to bacterial infection. GWAS Analyzer facilitated management of large sets of phenotype and genotype data, analysis of phenotypic variation and heritability, and most importantly, generation of a refined set of candidate single nucleotide polymorphisms (SNPs). The tool revealed a SNP that was experimentally validated to be associated with increased cell death among Salmonella infected HapMap cell lines.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20053839&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Functional embedding for the classification of gene expression profiles.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20053838</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20053838&lt;br/&gt;Authors: Wu, P. S. - Muller, H. G.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Data with low sample size n and high dimension p (n&lt;&lt;p) are commonly encountered in genomics and statistical genetics. Ill-conditioning of the variance-covariance matrix for such data renders the traditional multivariate data analytical approaches unattractive. On the other hand, functional data analysis (FDA) approaches are designed for infinite-dimensional data and therefore may have potential for the analysis of large p data. We herein propose a functional embedding (FEM) technique, which exploits the interface between multivariate and functional data, aiming at borrowing strength across the sample through FDA techniques in order to resolve the difficulties caused by the high dimension p. RESULTS: Using pairwise dissimilarities among predictor variables, one obtains a univariate configuration of these covariates. This is interpreted as variable ordination that defines the domain of a suitable function space, thus leading to the FEM of the high-dimensional data. The embedding may then be followed by functional logistic regression for the classification of high-dimensional multivariate data as an example for downstream analysis. The resulting functional classification is evaluated on several published gene expression array datasets and one mass spectrometric dataset, and is shown to compare favorably with various methods that have been employed previously for the classification of these high-dimensional gene expression profiles.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20053838&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>jORCA: easily integrating bioinformatics Web Services.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20047879</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20047879&lt;br/&gt;Authors: Martin-Requena, V. - Rios, J. - Garcia, M. - Ramirez, S. - Trelles, O.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Web services technology is becoming the option of choice to deploy bioinformatics tools that are universally available. One of the major strengths of this approach is that it supports machine-to-machine interoperability over a network. However, a weakness of this approach is that various Web Services differ in their definition and invocation protocols, as well as their communication and data formats, and this presents a barrier to service interoperability. RESULTS: jORCA is a desktop client aimed at facilitating seamless integration of Web Services. It does so by making a uniform representation of the different web resources, supporting scalable service discovery, and automatic composition of workflows. Usability is at the top of the jORCA agenda; thus it is a highly customizable and extensible application that accommodates a broad range of user skills featuring double-click invocation of services in conjunction with advanced execution-control, on the fly data standardization, extensibility of viewer plug-ins, drag-and-drop editing capabilities, plus a file-based browsing style and organization of favourite tools. The integration of bioinformatics Web Services is made easier to support a wider range of users.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20047879&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>GWAF: an R package for genome-wide association analyses with family data.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20040588</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20040588&lt;br/&gt;Authors: Chen, M. H. - Yang, Q.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: GWAF, Genome-Wide Association analyses with Family, is an R package designed for GWAF. It implements association tests between a batch of genotyped or imputed single nucleotide polymorphisms (SNPs) and a binary or continuous trait with user specified genetic model, and generates informative results from the analyses. In addition, GWAF provides functions to visualize results. We evaluated GWAF using a simulated continuous trait and a binary trait dichotomized from the simulated continuous trait with real genotype data from the Framingham Heart Study's SNP Health Association Resource project.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20040588&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>emPAI Calc--for the estimation of protein abundance from large-scale identification data by liquid chromatography-tandem mass spectrometry.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20031975</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20031975&lt;br/&gt;Authors: Shinoda, K. - Tomita, M. - Ishihama, Y.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: emPAI Calc is an open-source web application for the estimation of protein abundance. It uses the correlation between the number of identified peptides and protein abundance in mass spectrometry-based proteomic experiments. The program is the first implementation of our previously reported emPAI algorithm; it calculates the emPAI from the protein identification results obtained by database search engines such as Mascot.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20031975&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>BRAT: bisulfite-treated reads analysis tool.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20031974</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20031974&lt;br/&gt;Authors: Harris, E. Y. - Ponts, N. - Levchuk, A. - Roch, K. L. - Lonardi, S.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: We present a new, accurate and efficient tool for mapping short reads obtained from the Illumina Genome Analyzer following sodium bisulfite conversion. Our tool, BRAT, supports single and paired-end reads and handles input files containing reads and mates of different lengths. BRAT is faster, maps more unique paired-end reads and has higher accuracy than existing programs. The software package includes tools to end-trim low-quality bases of the reads and to report nucleotide counts for mapped reads on the reference genome.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20031974&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Limited contribution of stem-loop potential to symmetry of single-stranded genomic DNA.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20031973</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20031973&lt;br/&gt;Authors: Zhang, S. H. - Huang, Y. Z.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: The phenomenon of strand symmetry, which may provide clues to genome evolution, exists in all prokaryotic and eukaryotic genomes studied. Several possible mechanisms for its origins have been proposed, including: no strand biases for mutation and selection, strand inversion and selection of stem-loop structures. However, the relative contributions of these mechanisms to strand symmetry are not clear. In this article, we studied specifically the role of stem-loop potential of single-stranded DNA in strand symmetry. RESULTS: We analyzed the complete genomes of 90 prokaryotes. We found that most oligonucleotides (pentanucleotides and higher) do not have a reverse complement in close proximity in the genomic sequences. Combined with further analysis, we conclude that the contribution of the widespread stem-loop potential of single-stranded genomic DNA to the formation and maintenance of strand symmetry would be very limited, at least for higher-order oligonucleotides. Therefore, other possible causes for strand symmetry must be taken into account to a deeper degree.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20031973&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Bisque: a platform for bioimage analysis and management.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20031971</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20031971&lt;br/&gt;Authors: Kvilekval, K. - Fedorov, D. - Obara, B. - Singh, A. - Manjunath, B. S.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Advances in the field of microscopy have brought about the need for better image management and analysis solutions. Novel imaging techniques have created vast stores of images and metadata that are difficult to organize, search, process and analyze. These tasks are further complicated by conflicting and proprietary image and metadata formats, that impede analyzing and sharing of images and any associated data. These obstacles have resulted in research resources being locked away in digital media and file cabinets. Current image management systems do not address the pressing needs of researchers who must quantify image data on a regular basis. RESULTS: We present Bisque, a web-based platform specifically designed to provide researchers with organizational and quantitative analysis tools for 5D image data. Users can extend Bisque with both data model and analysis extensions in order to adapt the system to local needs. Bisque's extensibility stems from two core concepts: flexible metadata facility and an open web-based architecture. Together these empower researchers to create, develop and share novel bioimage analyses. Several case studies using Bisque with specific applications are presented as an indication of how users can expect to extend Bisque for their own purposes.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20031971&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Pandora, a pathway and network discovery approach based on common biological evidence.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20031970</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20031970&lt;br/&gt;Authors: Zhang, K. X. - Ouellette, B. F.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Many biological phenomena involve extensive interactions between many of the biological pathways present in cells. However, extraction of all the inherent biological pathways remains a major challenge in systems biology. With the advent of high-throughput functional genomic techniques, it is now possible to infer biological pathways and pathway organization in a systematic way by integrating disparate biological information. RESULTS: Here, we propose a novel integrated approach that uses network topology to predict biological pathways. We integrated four types of biological evidence (protein-protein interaction, genetic interaction, domain-domain interaction and semantic similarity of Gene Ontology terms) to generate a functionally associated network. This network was then used to develop a new pathway finding algorithm to predict biological pathways in yeast. Our approach discovered 195 biological pathways and 31 functionally redundant pathway pairs in yeast. By comparing our identified pathways to three public pathway databases (KEGG, BioCyc and Reactome), we observed that our approach achieves a maximum positive predictive value of 12.8% and improves on other predictive approaches. This study allows us to reconstruct biological pathways and delineates cellular machinery in a systematic view.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20031970&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Predicting metabolic engineering knockout strategies for chemical production: accounting for competing pathways.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20031969</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20031969&lt;br/&gt;Authors: Tepper, N. - Shlomi, T.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Computational modeling in metabolic engineering involves the prediction of genetic manipulations that would lead to optimized microbial strains, maximizing the production rate of chemicals of interest. Various computational methods are based on constraint-based modeling, which makes it possible to anticipate the effect of genetic manipulations on cellular metabolism considering a genome-scale metabolic network. However, current methods do not account for the presence of competing pathways in a metabolic network that may divert metabolic flux away from producing a required chemical, resulting in lower (or even zero) chemical production rates in reality, making these methods somewhat over-optimistic. RESULTS: In this article, we describe a novel constraint-based method called RobustKnock that predicts gene deletion strategies that lead to the over-production of chemicals of interest, by accounting for the presence of competing pathways in the network. We describe results of applying RobustKnock to Escherichia coli's metabolic network towards the production of various chemicals, demonstrating its ability to provide more robust predictions than those obtained via current state-of-the-art methods.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20031969&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>CMDS: a population-based method for identifying recurrent DNA copy number aberrations in cancer from high-resolution data.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20031968</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20031968&lt;br/&gt;Authors: Zhang, Q. - Ding, L. - Larson, D. E. - Koboldt, D. C. - McLellan, M. D. - Chen, K. - Shi, X. - Kraja, A. - Mardis, E. R. - Wilson, R. K. - Borecki, I. B. - Province, M. A.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: DNA copy number aberration (CNA) is a hallmark of genomic abnormality in tumor cells. Recurrent CNA (RCNA) occurs in multiple cancer samples across the same chromosomal region and has greater implication in tumorigenesis. Current commonly used methods for RCNA identification require CNA calling for individual samples before cross-sample analysis. This two-step strategy may result in a heavy computational burden, as well as a loss of the overall statistical power due to segmentation and discretization of individual sample's data. We propose a population-based approach for RCNA detection with no need of single-sample analysis, which is statistically powerful, computationally efficient and particularly suitable for high-resolution and large-population studies. RESULTS: Our approach, correlation matrix diagonal segmentation (CMDS), identifies RCNAs based on a between-chromosomal-site correlation analysis. Directly using the raw intensity ratio data from all samples and adopting a diagonal transformation strategy, CMDS substantially reduces computational burden and can obtain results very quickly from large datasets. Our simulation indicates that the statistical power of CMDS is higher than that of single-sample CNA calling based two-step approaches. We applied CMDS to two real datasets of lung cancer and brain cancer from Affymetrix and Illumina array platforms, respectively, and successfully identified known regions of CNA associated with EGFR, KRAS and other important oncogenes. CMDS provides a fast, powerful and easily implemented tool for the RCNA analysis of large-scale data from cancer genomes.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20031968&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Penalized mixtures of factor analyzers with application to clustering high-dimensional microarray data.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20031967</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20031967&lt;br/&gt;Authors: Xie, B. - Pan, W. - Shen, X.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Model-based clustering has been widely used, e.g. in microarray data analysis. Since for high-dimensional data variable selection is necessary, several penalized model-based clustering methods have been proposed to realize simultaneous variable selection and clustering. However, the existing methods all assume that the variables are independent with the use of diagonal covariance matrices. RESULTS: To model non-independence of variables (e.g. correlated gene expressions) while alleviating the problem with the large number of unknown parameters associated with a general non-diagonal covariance matrix, we generalize the mixture of factor analyzers to that with penalization, which, among others, can effectively realize variable selection. We use simulated data and real microarray data to illustrate the utility and advantages of the proposed method over several existing ones.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20031967&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>DCDB: drug combination database.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20031966</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20031966&lt;br/&gt;Authors: Liu, Y. - Hu, B. - Fu, C. - Chen, X.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: Rapid advances in pharmaceutical sciences have brought ever-increasing interests in combined therapies for better clinical efficacy and safety, especially in cases of complicated and refractory diseases. Innovative experimental technologies and theoretical frameworks are being actively developed for multicomponent drug research. In this work, we present the Drug Combination Database, with aims to facilitate analyses of known drug combinations, to summarize patterns of beneficial drug interactions, and to provide a basis for theoretical modeling and simulation of such drug interactions. Its current version (1.0) collected 499 approved or investigational drug combinations, including 40 unsuccessful drug combinations, involving 485 individual drugs, from &gt;6000 references.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20031966&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Power to detect selective allelic amplification in genome-wide scans of tumor data.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20031965</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20031965&lt;br/&gt;Authors: Dewal, N. - Freedman, M. L. - LaFramboise, T. - Pe'er, I.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Somatic amplification of particular genomic regions and selection of cellular lineages with such amplifications drives tumor development. However, pinpointing genes under such selection has been difficult due to the large span of these regions. Our recently-developed method, the amplification distortion test (ADT), identifies specific nucleotide alleles and haplotypes that confer better survival for tumor cells when somatically amplified. In this work, we focus on evaluating ADT's power to detect such causal variants across a variety of tumor dataset scenarios. RESULTS: Towards this end, we generated multiple parameter-based, synthetic datasets, derived from real data, that contain somatic copy number aberrations (CNAs) of various lengths and frequencies over germline single nucleotide polymorphisms (SNPs) genome-wide. Gold-standard causal sub-regions were assigned within these CNAs, followed by an assessment of ADT's ability to detect these sub-regions. Results indicate that ADT possesses high sensitivity and specificity in large sample sizes across most parameter cases, including those that more closely reflect existing SNP and CNA cancer data.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20031965&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>GonadSAGE: a comprehensive SAGE database for transcript discovery on male embryonic gonad development.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20028690</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20028690&lt;br/&gt;Authors: Lee, T. L. - Li, Y. - Cheung, H. H. - Claus, J. - Singh, S. - Sastry, C. - Rennert, O. M. - Lau, Y. F. - Chan, W. Y.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: Serial analysis of gene expression (SAGE) provides an alternative, with additional advantages, to microarray gene expression studies. GonadSAGE is the first publicly available web-based SAGE database on male gonad development that covers six male mouse embryonic gonad stages, including E10.5, E11.5, E12.5, E13.5, E15.5 and E17.5. The sequence coverage of each SAGE library is beyond 150K, 'which is the most extensive sequence-based male gonadal transcriptome to date'. An interactive web interface with customizable parameters is provided for analyzing male gonad transcriptome information. Furthermore, the data can be visualized and analyzed with the other genomic features in the UCSC genome browser. It represents an integrated platform that leads to a better understanding of male gonad development, and allows discovery of related novel targets and regulatory pathways.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20028690&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>RNA-Seq gene expression estimation with read mapping uncertainty.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20022975</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20022975&lt;br/&gt;Authors: Li, B. - Ruotti, V. - Stewart, R. M. - Thomson, J. A. - Dewey, C. N.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: RNA-Seq is a promising new technology for accurately measuring gene expression levels. Expression estimation with RNA-Seq requires the mapping of relatively short sequencing reads to a reference genome or transcript set. Because reads are generally shorter than transcripts from which they are derived, a single read may map to multiple genes and isoforms, complicating expression analyses. Previous computational methods either discard reads that map to multiple locations or allocate them to genes heuristically. RESULTS: We present a generative statistical model and associated inference methods that handle read mapping uncertainty in a principled manner. Through simulations parameterized by real RNA-Seq data, we show that our method is more accurate than previous methods. Our improved accuracy is the result of handling read mapping uncertainty with a statistical model and the estimation of gene expression levels as the sum of isoform expression levels. Unlike previous methods, our method is capable of modeling non-uniform read distributions. Simulations with our method indicate that a read length of 20-25 bases is optimal for gene-level expression estimation from mouse and maize RNA-Seq data when sequencing throughput is fixed.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20022975&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Copy number variant detection in inbred strains from short read sequence data.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20022973</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20022973&lt;br/&gt;Authors: Simpson, J. T. - McIntyre, R. E. - Adams, D. J. - Durbin, R.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: We have developed an algorithm to detect copy number variants (CNVs) in homozygous organisms, such as inbred laboratory strains of mice, from short read sequence data. Our novel approach exploits the fact that inbred mice are homozygous at virtually every position in the genome to detect CNVs using a hidden Markov model (HMM). This HMM uses both the density of sequence reads mapped to the genome, and the rate of apparent heterozygous single nucleotide polymorphisms, to determine genomic copy number. We tested our algorithm on short read sequence data generated from re-sequencing chromosome 17 of the mouse strains A/J and CAST/EiJ with the Illumina platform. In total, we identified 118 copy number variants (43 for A/J and 75 for CAST/EiJ). We investigated the performance of our algorithm through comparison to CNVs previously identified by array-comparative genomic hybridization (array CGH). We performed quantitative-PCR validation on a subset of the calls that differed from the array CGH data sets.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20022973&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>RNAsnoop: efficient target prediction for H/ACA snoRNAs.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20015949</link>
      <description>Publication Date: 2010 Mar 1 PMID: 20015949&lt;br/&gt;Authors: Tafer, H. - Kehr, S. - Hertel, J. - Hofacker, I. L. - Stadler, P. F.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Small nucleolar RNAs are an abundant class of non-coding RNAs that guide chemical modifications of rRNAs, snRNAs and some mRNAs. In the case of many 'orphan' snoRNAs, the targeted nucleotides remain unknown, however. The box H/ACA subclass determines uridine residues that are to be converted into pseudouridines via specific complementary binding in a well-defined secondary structure configuration that is outside the scope of common RNA (co-)folding algorithms. RESULTS: RNAsnoop implements a dynamic programming algorithm that computes thermodynamically optimal H/ACA-RNA interactions in an efficient scanning variant. Complemented by a support vector machine (SVM)-based machine learning approach to distinguish true binding sites from spurious solutions and a system to evaluate comparative information, it presents an efficient and reliable tool for the prediction of H/ACA snoRNA target sites. We apply RNAsnoop to identify the snoRNAs that are responsible for several of the remaining 'orphan' pseudouridine modifications in human rRNAs, and we assign a target to one of the five orphan H/ACA snoRNAs in Drosophila. AVAILABILITY: The C source code of RNAsnoop is freely available at http://www.tbi.univie.ac.at/~htafer/RNAsnoop&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20015949&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>r2cat: synteny plots and comparative assembly.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20015948</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20015948&lt;br/&gt;Authors: Husemann, P. - Stoye, J.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: Recent parallel pyrosequencing methods and the increasing number of finished genomes encourage the sequencing and investigation of closely related strains. Although the sequencing itself becomes easier and cheaper with each machine generation, the finishing of the genomes remains difficult. Instead of the desired whole genomic sequence, a set of contigs is the result of the assembly. In this applications note, we present the tool r2cat (related reference contig arrangement tool) that helps in the task of comparative assembly and also provides an interactive visualization for synteny inspection.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20015948&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Mixture-model based estimation of gene expression variance from public database improves identification of differentially expressed genes in small sized microarray data.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20015947</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20015947&lt;br/&gt;Authors: Kim, M. - Cho, S. B. - Kim, J. H.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: The small number of samples in many microarray experiments is a challenge for the correct identification of differentially expressed genes (DEGs) by conventional statistical means. Information from public microarray databases can help identify DEGs more efficiently. To model various experimental conditions of a public microarray database, we applied a Gaussian mixture model and extracted bi- or tri-modal distributions of gene expression. The prior variance of Baldi's Bayesian framework was estimated for the analysis of the small sample-sized datasets. RESULTS: First, we estimated the prior variance of a gene expression by pooling variances obtained from mixture modeling of large samples in the public microarray database. Then, using the prior variance, we identified DEGs in small sample-sized test datasets using Baldi's framework. For benchmark study, we generated test datasets having several samples from relatively large datasets. Our proposed method outperformed other benchmark methods in terms of detecting gold-standard DEGs from the test datasets. The results may be a challenging evidence for usage of public microarray databases in microarray data analysis.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20015947&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>WebPARE: web-computing for inferring genetic or transcriptional interactions.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20007742</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20007742&lt;br/&gt;Authors: Chuang, C. L. - Wu, J. H. - Cheng, C. S. - Shieh, G. S.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: Inferring genetic or transcriptional interactions, when done successfully, may provide insights into biological processes or biochemical pathways of interest. Unfortunately, most computational algorithms require a certain level of programming expertise. To provide a simple web interface for users to infer interactions from time course gene expression data, we present WebPARE, which is based on the pattern recognition algorithm (PARE). For expression data, in which each type of interaction (e.g. activator target) and the corresponding paired gene expression pattern are significantly associated, PARE uses a non-linear score to classify gene pairs of interest into a few subclasses of various time lags. In each subclass, PARE learns the parameters in the decision score using known interactions from biological experiments or published literature. Subsequently, the trained algorithm predicts interactions of a similar nature. Previously, PARE was shown to infer two sets of interactions in yeast successfully. Moreover, several predicted genetic interactions coincided with existing pathways; this indicates the potential of PARE in predicting partial pathway components. Given a list of gene pairs or genes of interest and expression data, WebPARE invokes PARE and outputs predicted interactions and their networks in directed graphs.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20007742&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>MARTA: a suite of Java-based tools for assigning taxonomic status to DNA sequences.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20007739</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20007739&lt;br/&gt;Authors: Horton, M. - Bodenhausen, N. - Bergelson, J.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: We have created a suite of Java-based software to better provide taxonomic assignments to DNA sequences. We anticipate that the program will be useful for protistologists, virologists, mycologists and other microbial ecologists. The program relies on NCBI utilities including the BLAST software and Taxonomy database and is easily manipulated at the command-line to specify a BLAST candidate's query-coverage or percent identity requirements; other options include the ability to set minimal consensus requirements (%) for each of the eight major taxonomic ranks (Domain, Kingdom, Phylum, ...) and whether to consider lower scoring candidates when the top-hit lacks taxonomic classification.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20007739&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>ConceptGen: a gene set enrichment and gene set relation mapping tool.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20007254</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20007254&lt;br/&gt;Authors: Sartor, M. A. - Mahavisno, V. - Keshamouni, V. G. - Cavalcoli, J. - Wright, Z. - Karnovsky, A. - Kuick, R. - Jagadish, H. V. - Mirel, B. - Weymouth, T. - Athey, B. - Omenn, G. S.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: The elucidation of biological concepts enriched with differentially expressed genes has become an integral part of the analysis and interpretation of genomic data. Of additional importance is the ability to explore networks of relationships among previously defined biological concepts from diverse information sources, and to explore results visually from multiple perspectives. Accomplishing these tasks requires a unified framework for agglomeration of data from various genomic resources, novel visualizations, and user functionality. RESULTS: We have developed ConceptGen, a web-based gene set enrichment and gene set relation mapping tool that is streamlined and simple to use. ConceptGen offers over 20,000 concepts comprising 14 different types of biological knowledge, including data not currently available in any other gene set enrichment or gene set relation mapping tool. We demonstrate the functionalities of ConceptGen using gene expression data modeling TGF-beta-induced epithelial-mesenchymal transition and metabolomics data comparing metastatic versus localized prostate cancers.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20007254&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>A novel method for accurate one-dimensional protein structure prediction based on fragment matching.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=20007252</link>
      <description>Publication Date: 2010 Feb 15 PMID: 20007252&lt;br/&gt;Authors: Zhou, T. - Shu, N. - Hovmoller, S.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: The precise prediction of one-dimensional (1D) protein structure as represented by the protein secondary structure and 1D string of discrete state of dihedral angles (i.e. Shape Strings) is a prerequisite for the successful prediction of three-dimensional (3D) structure as well as protein-protein interaction. We have developed a novel 1D structure prediction method, called Frag1D, based on a straightforward fragment matching algorithm and demonstrated its success in the prediction of three sets of 1D structural alphabets, i.e. the classical three-state secondary structure, three- and eight-state Shape Strings. RESULTS: By exploiting the vast protein sequence and protein structure data available, we have brought secondary-structure prediction closer to the expected theoretical limit. When tested by a leave-one-out cross validation on a non-redundant set of PDB cutting at 30% sequence identity containing 5860 protein chains, the overall per-residue accuracy for secondary-structure prediction, i.e. Q3 is 82.9%. The overall per-residue accuracy for three- and eight-state Shape Strings are 85.1 and 71.5%, respectively. We have also benchmarked our program with the latest version of PSIPRED for secondary structure prediction and our program predicted 0.3% better in Q3 when tested on 2241 chains with the same training set. For Shape Strings, we compared our method with a recently published method with the same dataset and definition as used by that method. Our program predicted 2.2% better in accuracy for three-state Shape Strings. By quantitatively investigating the effect of database size on 1D structure prediction we show that the accuracy increases by approximately 1% with every doubling of the database size.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D20007252&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
  </channel>
</rss>
