<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
  xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Bioinformatics</title>
    <link>http://barf.jcowboy.org</link>
    <description>Bioinformatics recent publications</description>
    <language>en-us</language>
    <image>
      <url>http://barf.jcowboy.org/pubmed.gif</url>
      <title>the data for this feed is provided by PubMed</title>
      <link>http://barf.jcowboy.org</link>
    </image>
    <item>
      <title>A Distance Metric for a Class of Tree-Sibling Phylogenetic Networks.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18477576</link>
      <description>Publication Date: 2008 May 12 PMID: 18477576&lt;br/&gt;Authors: Cardona, G. - Llabres, M. - Rossello, F. - Valiente, G.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: The presence of reticulate evolutionary events in phylogenies turns phylogenetic trees into phylogenetic networks. These events imply in particular that there may exist multiple evolutionary paths from a non-extant species to an extant one, and this multiplicity makes the comparison of phylogenetic networks much more difficult than the comparison of phylogenetic trees. In fact, all attempts to define a sound distance measure on the class of all phylogenetic networks have failed so far. Thus, the only practical solutions have been either the use of rough estimates of similarity (based on comparison of the trees embedded in the networks), or narrowing the class of phylogenetic networks to a certain class where such a distance is known and can be efficiently computed. The first approach has the problem that one may identify two networks as equivalent, when they are not; the second one has the drawback that there may not exist algorithms to reconstruct such networks from biological sequences. RESULTS: We present in this paper a distance measure on the class of semi-binary tree-sibling time consistent phylogenetic networks, which generalize tree-child time consistent phylogenetic networks, and thus also galled-trees. The practical interest of this distance measure is twofold: it can be computed in polynomial time by means of simple algorithms, and there also exist polynomial-time algorithms for reconstructing networks of this class from DNA sequence data. AVAILABILITY: The Perl package Bio::PhyloNetwork, included in the BioPerl bundle, implements many algorithms on phylogenetic networks, including the computation of the distance presented in this paper. 
CONTACT: gabriel.cardona@uib.es SUPPLEMENTARY INFORMATION: Some counterexamples, proofs of the results not included in this paper, and some computational experiments are available at Bioinformatics online.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18477576&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>DupTree: A program for large-scale phylogenetic analyses using gene tree parsimony.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18474508</link>
      <description>Publication Date: 2008 May 12 PMID: 18474508&lt;br/&gt;Authors: Wehe, A. - Bansal, M. S. - Burleigh, J. G. - Eulenstein, O.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: DupTree is a new software program for inferring rooted species trees from collections of gene trees using the gene tree parsimony approach. The program implements a novel algorithm that significantly improves upon the run time of standard search heuristics for gene tree parsimony, and enables the first truly genome-scale phylogenetic analyses. In addition, DupTree allows users to examine alternate rootings and to weight the reconciliation costs for gene trees. DupTree is an open source project written in C++. AVAILABILITY: DupTree for Mac OS X, Windows, and Linux along with a sample dataset and an on-line manual are available at http://genome.cs.iastate.edu/CBL/DupTree CONTACT: oeulenst@cs.iastate.edu.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18474508&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>OCTOPUS: Improving topology prediction by two-track ANN-based preference scores and an extended topological grammar.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18474507</link>
      <description>Publication Date: 2008 May 12 PMID: 18474507&lt;br/&gt;Authors: Viklund, H. - Elofsson, A.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: As alpha-helical transmembrane proteins constitute roughly 25% of a typical genome and are vital parts of many essential biological processes, structural knowledge of these proteins is necessary for increasing our understanding of such processes. Because structural knowledge of transmembrane proteins is difficult to attain experimentally, improved methods for prediction of structural features of these proteins are important. RESULTS: OCTOPUS, a new method for predicting transmembrane protein topology is presented and benchmarked using a data set of 124 sequences with known structures. Using a novel combination of hidden Markov models and artificial neural networks, OCTOPUS predicts the correct topology for 94% of the sequences. In particular, OCTOPUS is the first topology predictor to fully integrate modeling of reentrant/membrane-dipping regions and transmembrane hairpins in the topological grammar. AVAILABILITY: OCTOPUS is available as a web-server at http://octopus.cbr.su.se. CONTACT: hakanv@sbc.su.se.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18474507&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Malin: maximum likelihood analysis of intron evolution in eukaryotes.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18474506</link>
      <description>Publication Date: 2008 May 12 PMID: 18474506&lt;br/&gt;Authors: Csuros, M.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: Malin is a software package for the analysis of eukaryotic gene structure evolution. It provides a graphical user interface for various tasks commonly used to infer the evolution of exon-intron structure in protein-coding orthologs. Implemented tasks include the identification of conserved homologous intron sites in protein alignments, as well as the estimation of ancestral intron content, lineage-specific intron losses and gains. Estimates are computed either with parsimony, or with a probabilistic model that incorporates rate variation across lineages and intron sites. AVAILABILITY: Malin is available as a stand-alone Java application, as well as an application bundle for MacOS X, at the website http://www.iro.umontreal.ca/~csuros/introns/malin/. The software is distributed under a BSD-style license. CONTACT: csuros@iro.umontreal.ca.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18474506&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>FunSiP: A Modular and Extensible Classifier for the Prediction of Functional Sites in DNA.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18474505</link>
      <description>Publication Date: 2008 May 12 PMID: 18474505&lt;br/&gt;Authors: Bel, M. V. - Saeys, Y. - de Peer, Y. V.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Many problems in genome annotation are tackled by using a classification model to predict functional sites such as splice sites, translation start sites or stop codons. Locating the correct position of these sites remains one of the most important but also one of the most difficult issues in the structural annotation of genomes. Most of the software currently in use is written for a very specific problem, thereby limiting the possibilities for reuse. SUMMARY: We developed a software platform that uses a very general approach towards the classification of functional sites in DNA sequences. The program uses an &quot;ab initio&quot; approach towards the identification of these sites, and extends SpliceMachine, a previously developed splice site predictor that shows state-of-the-art performance for both donor and acceptor splice site recognition in the human and Arabidopsis thaliana genome. AVAILABILITY: The program is developed as a stand-alone Java application, and is available as GPLv3 open-source software. The program, source and documentation can be obtained from the &quot;Software&quot; section at http://bioinformatics.psb.ugent.be/ CONTACT: Yves.VandePeer@psb.ugent.be.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18474505&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>HSEpred: predict Half-Sphere Exposure from protein sequences.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18467349</link>
      <description>Publication Date: 2008 May 8 PMID: 18467349&lt;br/&gt;Authors: Song, J. - Tan, H. - Takemoto, K. - Akutsu, T.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Half-Sphere Exposure (HSE) is a newly developed two-dimensional solvent exposure measure. By conceptually separating an amino acid's sphere in a protein structure into two half spheres which represent its distinct spatial neighborhoods in the upward and downward directions, the HSE-up and HSE-down measures show superior performance compared with other measures such as accessible surface area, residue depth and contact number. However, currently there is no existing method for the prediction of HSE measures from sequence data. RESULTS: In this article, we propose a novel approach to predict the HSE measures and infer residue contact numbers using the predicted HSE values, based on a well-prepared non-homologous protein structure dataset. In particular, we employ support vector regression to quantify the relationship between HSE measures and protein sequences and evaluate its prediction performance. We extensively explore five sequence encoding schemes to examine their effects on the prediction performance. Our method could achieve the correlation coefficients of 0.72 and 0.68 between the predicted and observed HSE-up and HSE-down measures, respectively. Moreover, contact number can be accurately predicted by the summation of the predicted HSE-up and HSE-down values, which has further enlarged the application of this method. The successful application of support vector regression approach in this study suggests that it should be more useful in quantifying the protein sequence-structure relationship and predicting the structural property profiles from protein sequences. AVAILABILITY: The prediction webserver and supplementary materials are accessible at http://sunflower.kuicr.kyoto-u.ac.jp/~sjn/hse/. 
CONTACT: sjn@kuicr.kyoto-u.ac.jp; takutsu@kuicr.kyoto-u.ac.jp.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18467349&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>lumi: a pipeline for processing Illumina microarray.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18467348</link>
      <description>Publication Date: 2008 May 8 PMID: 18467348&lt;br/&gt;Authors: Du, P. - Kibbe, W. A. - Lin, S. M.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: Illumina microarray is becoming a popular microarray platform. The BeadArray technology from Illumina makes its preprocessing and quality control different from other microarray technologies. Unfortunately, most other analyses have not taken advantage of the unique properties of the BeadArray system, and have just incorporated preprocessing methods originally designed for Affymetrix microarrays. lumi is a Bioconductor package especially designed to process the Illumina microarray data. It includes data input, quality control, variance stabilization, normalization and gene annotation portions. Specifically, the lumi package includes a variance-stabilizing transformation (VST) algorithm that takes advantage of the technical replicates available on every Illumina microarray. Different normalization method options and multiple quality control plots are provided in the package. To better annotate the Illumina data, a vendor independent nucleotide universal identifier (nuID) was devised to identify the probes of Illumina microarray. The nuID annotation packages and output of lumi processed results can be easily integrated with other Bioconductor packages to construct a statistical data analysis pipeline for Illumina data. AVAILABILITY: The lumi Bioconductor package, www.bioconductor.org CONTACT: Pan Du (dupan@northwestern.edu), Warren Kibbe (wakibbe@northwestern.edu), Simon Lin (s-lin2@northwestern.edu).&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18467348&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Microbial Genotype-Phenotype Mapping by Class Association Rule Mining.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18467347</link>
      <description>Publication Date: 2008 May 8 PMID: 18467347&lt;br/&gt;Authors: Tamura, M. - D'haeseleer, P.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Microbial phenotypes are typically due to the concerted action of multiple gene functions, yet the presence of each gene may have only a weak correlation with the observed phenotype. Hence, it may be more appropriate to examine co-occurrence between sets of genes and a phenotype (multiple-to-one) instead of pairwise relations between a single gene and the phenotype. Here, we propose an efficient Class Association Rule mining algorithm, NETCAR, in order to extract sets of COGs (Clusters of Orthologous Groups of proteins) associated with a phenotype from COG phylogenetic profiles and a phenotype profile. NETCAR takes into account the phylogenetic co-occurrence graph between COGs to restrict hypothesis space, and uses mutual information to evaluate the biconditional relation. RESULTS: We examined the mining capability of pairwise and multiple-to-one association by using NETCAR to extract COGs relevant to six microbial phenotypes (aerobic, anaerobic, facultative, endospore, motility, and Gram negative) from 11,969 unique COG profiles across 155 prokaryotic organisms. With the same level of False Discovery Rate (FDR), multiple-to-one association can extract about 10 times more relevant COGs than one-to-one association. We also reveal various topologies of association networks among COGs (modules) from extracted multiple-to-one correlation rules relevant to the six phenotypes, including a well-connected network for motility, a star-shaped network for aerobic, and intermediate topologies for the other phenotypes. NETCAR outperforms a standard Class Association Rule mining algorithm, CARAPRIORI, while requiring several orders of magnitude less computational time for extracting 3-COG sets. 
AVAILABILITY: Source code of the Java implementation is available as Supplementary material at the Bioinformatics online website, or upon request to the author. CONTACT: makio323@gmail.com.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18467347&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Discerning static and causal interactions in genome-wide reverse engineering problems.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18467346</link>
      <description>Publication Date: 2008 May 8 PMID: 18467346&lt;br/&gt;Authors: Zampieri, M. - Soranzo, N. - Altafini, C.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: In the past years devising methods for discovering gene regulatory mechanisms at a genome-wide level has become a fundamental topic in the field of systems biology. The aim is to infer gene-gene interactions in an increasingly sophisticated and reliable way through the continuous improvement of reverse engineering algorithms exploiting microarray data. RESULTS: This work is inspired by the several studies suggesting that co-expression is mostly related to &quot;static&quot; stable binding relationships, like belonging to the same protein complex, rather than other types of interactions more of a &quot;causal&quot; and transient nature (e.g. transcription factor-binding site interactions). The aim of this work is to verify if direct or conditional network inference algorithms (e.g. Pearson correlation for the former, partial Pearson correlation for the latter) are indeed useful in discerning static from causal dependencies in artificial and real gene networks (derived from E.coli and S.cerevisiae). CONTACT: altafini@sissa.it.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18467346&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Evolutionary design principles of modules that control cellular differentiation: Consequences for hysteresis and multistationarity.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18467345</link>
      <description>Publication Date: 2008 May 8 PMID: 18467345&lt;br/&gt;Authors: Kim, J. - Kim, T. G. - Jung, S. H. - Kim, J. R. - Park, T. - Heslop-Harrison, P. - Cho, K. H.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Gene regulatory networks govern cellular differentiation processes and enable construction of multi-cellular organisms from single cells. Although such networks are complex, there must be evolutionary design principles that shape the network to its present form, gaining complexity from simple modules. RESULTS: To isolate particular design principles, we have computationally evolved random regulatory networks with a preference to result either in hysteresis (switching threshold depending on current state), or in multistationarity (having multiple steady states), two commonly observed dynamical features of gene regulatory networks related to differentiation processes. We have analyzed the resulting evolved networks and compared their structures and characteristics with real gene regulatory networks reported from experiments. Conclusion: We found that the artificially evolved networks have particular topologies and it was notable that these topologies share important features and similarities with the real gene regulatory networks, particularly in contrasting properties of positive and negative feedback loops. We conclude that the structures of real gene regulatory networks are consistent with selection to favor one or other of the dynamical features of multistationarity or hysteresis. CONTACT: ckh@kaist.ac.kr Supplementary Material: Supplementary Material is available at Bioinformatics online.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18467345&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>PatMaN: rapid alignment of short sequences to large databases.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18467344</link>
      <description>Publication Date: 2008 May 8 PMID: 18467344&lt;br/&gt;Authors: Prufer, K. - Stenzel, U. - Dannemann, M. - Green, R. E. - Lachmann, M. - Kelso, J.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: We present a tool suited for searching for many short nucleotide sequences in large databases, allowing for a pre-defined number of gaps and mismatches. The commandline-driven program implements a nondeterministic automata matching-algorithm on a keyword tree of the search strings. Both queries with and without ambiguity codes can be searched. Search time is short for perfect matches, and retrieval time rises exponentially with the number of edits allowed. AVAILABILITY: The C++ source code for PatMaN is distributed under the GNU General Public License and has been tested on the GNU/Linux operating system. It is available from http://bioinf.eva.mpg.de/patman. CONTACT: pruefer@eva.mpg.de.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18467344&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Identification of OBO Nonalignments and Its Implications for OBO Enrichment.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18463117</link>
      <description>Publication Date: 2008 May 7 PMID: 18463117&lt;br/&gt;Authors: Bada, M. - Hunter, L.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Existing projects that focus on the semiautomatic addition of links between existing terms in the Open Biomedical Ontologies can take advantage of reasoners that can make new inferences between terms that are based on the added formal definitions and that reflect nonalignments between the linked terms. However, these projects require that these definitions be necessary and sufficient, a strong requirement that often does not hold. If such definitions cannot be added, the reasoners cannot point to the nonalignments through the suggestion of new inferences. RESULTS: We describe a methodology by which we have identified over 1,9800 instances of nonredundant nonalignments between terms from the GO biological-process (BP), cellular-component (CC), and molecular-function (MF) ontologies, ChEBI, and the Cell Type Ontology (CL). Many of the 39.838.1% of these nonalignments whose object terms are more atomic than the subject terms are not currently examined in other ontology-enrichment projects due to the fact that the necessary and sufficient conditions required for the inferences are not currently examined. Analysis of the ratios of nonalignments to assertions from which the nonalignments were identified suggests that BP-MF, BP-BP, BP-CL, and CC-CC, BP-BP, and BP-CL terms are relatively well-aligned, while BP-ChEBI-MF, BP-ChEBI, and CC-MF and ChEBI-MF terms are relatively not aligned well. We propose four ways to resolve an identified nonalignment and recommend an analogous implementation of our methodology in ontology-enrichment tools to identify types of nonalignments that are currently not detected. AVAILABILITY: The nonalignments discussed in this article may be viewed at http://compbio.uchsc.edu/Hunter_lab/Bada/ nonalignments_20087_03_0614.html. 
Code for the generation of these nonalignments is available upon request. CONTACT: mike.bada@uchsc.edu.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18463117&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>SYCAMORE - A SYstems biology Computational Analysis and MOdeling Research Environment.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18463116</link>
      <description>Publication Date: 2008 May 7 PMID: 18463116&lt;br/&gt;Authors: Weidemann, A. - Richter, S. - Stein, M. - Sahle, S. - Gauges, R. - Gabdoulline, R. - Surovtsova, I. - Semmelrock, N. - Besson, B. - Rojas, I. - Wade, R. - Kummer, U.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: SYCAMORE is a browser-based application that facilitates construction, simulation and analysis of kinetic models in systems biology. Thus, it allows e.g. database supported modelling, basic model checking and the estimation of unknown kinetic parameters based on protein structures. In addition, it offers some guidance in order to allow non-expert users to perform basic computational modelling tasks. AVAILABILITY: SYCAMORE is freely available for academic use at http://sycamore.eml.org. Commercial users may acquire a license. CONTACT: ursula.kummer@bioquant.uni-heidelberg.de.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18463116&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Mireval: a web tool for simple microRNA prediction in genome sequences.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18453555</link>
      <description>Publication Date: 2008 May 3 PMID: 18453555&lt;br/&gt;Authors: Ritchie, W. - Theodule, F. X. - Gautheret, D.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: We have developed an online tool called mirEval which can search sequences of up to 10,000 nt for novel microRNAs in multiple organisms. It is a comprehensive tool, easy to use and very informative. It will allow users with no prior knowledge of in-silico detection of microRNAs to take advantage of the most successful approaches to investigate sequences of interest. AVAILABILITY: The mirEval web server is available at http://tagc.univ-mrs.fr/mireval CONTACT: W.Ritchie@centenary.org.au.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18453555&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>An efficient method to identify differentially expressed genes in microarray experiments.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18453554</link>
      <description>Publication Date: 2008 May 3 PMID: 18453554&lt;br/&gt;Authors: Qin, H. - Feng, T. - Harding, S. A. - Tsai, C. J. - Zhang, S.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Microarray experiments typically analyze thousands to tens of thousands of genes from small numbers of biological replicates. The fact that genes are normally expressed in functionally relevant patterns suggests that gene expression data can be stratified and clustered into relatively homogenous groups. Cluster-wise dimensionality reduction should make it feasible to improve screening power while minimizing information loss. RESULTS: We propose a powerful and computationally simple method for finding differentially expressed genes in small microarray experiments. The method incorporates a novel stratification-based tight clustering algorithm, principal component analysis and information pooling. Comprehensive simulations show that our method is substantially more powerful than the popular SAM and eBayes approaches. We applied the method to three real microarray datasets: one from a Populus nitrogen stress experiment with 3 biological replicates; and two from public microarray datasets of human cancers with 10 to 40 biological replicates. In all three analyses, our method proved more robust than the popular alternatives for identification of differentially expressed genes. AVAILABILITY: The C++ code to implement the proposed method is available upon request for academic use. CONTACT: shuzhang@mtu.edu.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18453554&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>High-performance hardware implementation of a parallel database search engine for real-time peptide mass fingerprinting.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18453553</link>
      <description>Publication Date: 2008 May 3 PMID: 18453553&lt;br/&gt;Authors: Bogdan, I. - Rivers, J. - Beynon, R. J. - Coca, D.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Peptide Mass Fingerprinting (PMF) is a method for protein identification in which a protein is fragmented by a defined cleavage protocol (usually proteolysis with trypsin), and the masses of these products constitute a 'fingerprint' that can be searched against theoretical fingerprints of all known proteins. In the first stage of PMF, the raw mass spectrometric data are processed to generate a peptide mass list. In the second stage this protein fingerprint is used to search a database of known proteins for the best protein match. Although current software solutions can typically deliver a match in a relatively short time, a system that can find a match in real-time could change the way in which PMF is deployed and presented. In a paper published earlier (Bogdan et al., 2007) we presented a hardware design of a raw mass spectra processor that, when implemented in FPGA hardware, achieves almost 170-fold speed gain relative to a conventional software implementation running on a dual processor server. In this paper we present a complementary hardware realisation of a parallel database search engine that, when running on a Xilinx Virtex 2 FPGA at 100MHz, delivers 1800-fold speed-up compared with an equivalent C software routine, running on a 3.06GHz Xeon workstation. The inherent scalability of the design means that processing speed can be multiplied by deploying the design on multiple FPGAs. The database search processor and the mass spectra processor, running on a reconfigurable computing platform, provide a complete real-time PMF protein identification solution. 
CONTACT: d.coca@sheffield.ac.uk.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18453553&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>DAnTE: a statistical tool for quantitative analysis of -omics data.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18453552</link>
      <description>Publication Date: 2008 May 3 PMID: 18453552&lt;br/&gt;Authors: Polpitiya, A. D. - Qian, W. J. - Jaitly, N. - Petyuk, V. A. - Adkins, J. N. - Camp, D. G. 2nd - Anderson, G. A. - Smith, R. D.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: DAnTE (Data Analysis Tool Extension) is a statistical tool designed to address challenges associated with quantitative bottom-up, shotgun proteomics data. This tool has also been demonstrated for microarray data and can easily be extended to other high-throughput data types. DAnTE features selected normalization methods, missing value imputation algorithms, peptide to protein rollup methods, an extensive array of plotting functions, and a comprehensive hypothesis testing scheme that can handle unbalanced data and random effects. The Graphical User Interface (GUI) is designed to be very intuitive and user friendly. AVAILABILITY: DAnTE may be downloaded free of charge at http://ncrr.pnl.gov/software/ CONTACT: rds@pnl.gov or proteomics@pnl.gov SUPPLEMENTARY INFORMATION: An example dataset with instructions on how to perform a series of analysis steps is available at http://ncrr.pnl.gov/software/&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18453552&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>A Support Vector Machine model for the prediction of proteotypic peptides for accurate mass and time proteomics.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18453551</link>
      <description>Publication Date: 2008 May 3 PMID: 18453551&lt;br/&gt;Authors: Webb-Robertson, B. J. - Cannon, W. R. - Oehmen, C. S. - Shah, A. R. - Gurumoorthi, V. - Lipton, M. S. - Waters, K. M.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: The standard approach to identifying peptides based on accurate mass and elution time (AMT) compares these profiles obtained from a high resolution mass spectrometer to a database of peptides previously identified from tandem mass spectrometry (MS/MS) studies. It would be advantageous, with respect to both accuracy and cost, to only search for those peptides that are detectable by MS (proteotypic). RESULTS: We present a Support Vector Machine (SVM) model that uses a simple descriptor space based on 35 properties of amino acid content, charge, hydrophilicity, and polarity for the quantitative prediction of proteotypic peptides. Using three independently derived AMT databases (Shewanella oneidensis, Salmonella typhimurium, Yersinia pestis) for training and validation within and across species, the SVM resulted in an average accuracy measure of approximately 0.8 with a standard deviation of less than 0.025. Furthermore, we demonstrate that these results are achievable with a small set of 12 variables and can achieve high proteome coverage. AVAILABILITY: http://omics.pnl.gov/software/STEPP.php.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18453551&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>A correction for estimating error when using the Local Pooled Error Statistical Test.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18450812</link>
      <description>Publication Date: 2008 May 1 PMID: 18450812&lt;br/&gt;Authors: Murie, C. - Nadon, R.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;Jain et al. (2003) introduced the Local Pooled Error statistical test designed for use with small sample size microarray gene expression data. Based on an asymptotic proof, the test multiplicatively adjusts the standard error for a test of differences between two classes of observations by pi/2 due to the use of medians rather than means as measures of central tendency. The adjustment is upwardly biased at small sample sizes, however, producing fewer than expected small p-values with a consequent loss of statistical power. We present an empirical correction to the adjustment factor which removes the bias and produces theoretically expected p-values when distributional assumptions are met. Our adjusted LPE measure should prove useful to ongoing methodological studies designed to improve the LPE's performance for microarray and proteomics applications and for future work for other high-throughput biotechnologies. AVAILABILITY: The software is implemented in the R language and can be downloaded from the Bioconductor project website (http://www.bioconductor.org). CONTACT: robert.nadon@mcgill.ca.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18450812&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Characterization and Prediction of Residues Determining Protein Functional Specificity.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18450811</link>
      <description>Publication Date: 2008 May 1 PMID: 18450811&lt;br/&gt;Authors: Capra, J. A. - Singh, M.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Within a homologous protein family, proteins may be grouped into subtypes that share specific functions that are not common to the entire family. Often, the amino acids present in a small number of sequence positions determine each protein's particular functional specificity. Knowledge of these specificity determining positions (SDPs) aids in protein function prediction, drug design, and experimental analysis. A number of sequence-based computational methods have been introduced for identifying SDPs; however, their further development and evaluation have been hindered by the limited number of known experimentally-determined SDPs. RESULTS: We combine several bioinformatics resources to automate a process, typically undertaken manually, to build a data set of SDPs. The resulting large data set, which consists of SDPs in enzymes, enables us to characterize SDPs in terms of their physicochemical and evolutionary properties. It also facilitates the large-scale evaluation of sequence-based SDP prediction methods. We present a simple sequence-based SDP prediction method, GroupSim, and show that, surprisingly, it is competitive with a representative set of current methods. We also describe ConsWin, a heuristic that considers sequence conservation of neighboring amino acids, and demonstrate that it improves the performance of all methods tested on our large data set of enzyme SDPs. AVAILABILITY: Data sets and GroupSim code are available online at http://compbio.cs.princeton.edu/specificity/. CONTACT: msingh@cs.princeton.edu.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18450811&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>quantiNEMO: an individual-based program to simulate quantitative traits with explicit genetic architecture in a dynamic metapopulation.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18450810</link>
      <description>Publication Date: 2008 May 1 PMID: 18450810&lt;br/&gt;Authors: Neuenschwander, S. - Hospital, F. - Guillaume, F. - Goudet, J.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: quantiNEMO is an individual-based, genetically explicit stochastic simulation program. It was developed to investigate the effects of selection, mutation, recombination, and drift on quantitative traits with varying architectures in structured populations connected by migration and located in a heterogeneous habitat. QuantiNEMO is highly flexible at various levels: population, selection, trait(s) architecture, genetic map for QTL and/or markers, environment, demography, mating system, etc. QuantiNEMO is coded in C++ using an object oriented approach and runs on any computer platform. AVAILABILITY: Executables for several platforms, user's manual, and source code are freely available under the GNU General Public License at http://www2.unil.ch/popgen/softwares/quantinemo CONTACT: samuel.neuenschwander@unil.ch.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18450810&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>nuScore: a web-interface for nucleosome positioning predictions.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18445607</link>
      <description>Publication Date: 2008 Apr 29 PMID: 18445607&lt;br/&gt;Authors: Tolstorukov, M. Y. - Choudhary, V. - Olson, W. K. - Zhurkin, V. B. - Park, P. J.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: Sequence-directed mapping of nucleosome positions is of major biological interest. Here, we present a web-interface for estimation of the affinity of the histone core to DNA and prediction of nucleosome arrangement on a given sequence. Our approach is based on assessment of the energy cost of imposing the deformations required to wrap DNA around the histone surface. The interface allows the user to specify a number of options such as selecting from several structural templates for threading calculations and adding random sequences to the analysis. AVAILABILITY: The nuScore interface is freely available for use at http://compbio.med.harvard.edu/nuScore. SUPPLEMENTARY INFORMATION: The site contains user manual, description of the methodology, and examples. CONTACT: peter_park@harvard.edu; tolstorukov@gmail.com.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18445607&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>jSquid: a Java applet for graphical on-line network exploration.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18445606</link>
      <description>Publication Date: 2008 Apr 29 PMID: 18445606&lt;br/&gt;Authors: Klammer, M. - Roopra, S. - Sonnhammer, E. L.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: jSquid is a graph visualization tool for exploring graphs from protein-protein interaction or functional coupling networks. The tool was designed for the FunCoup web site, but can be used for any similar network exploring purpose. The program offers various visualization and graph manipulation techniques to increase the utility for the user. AVAILABILITY: jSquid is available for direct usage and download at http://jSquid.sbc.su.se including source code under the GPLv3 license, and input examples. It requires Java version 5 or higher to run properly. CONTACT: erik.sonnhammer@sbc.su.se SUPPLEMENTARY INFORMATION: available at Bioinformatics online.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18445606&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Cytoscape ESP: simple search of complex biological networks.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18445605</link>
      <description>Publication Date: 2008 Apr 28 PMID: 18445605&lt;br/&gt;Authors: Ashkenazi, M. - Bader, G. D. - Kuchinsky, A. - Moshelion, M. - States, D. J.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: Cytoscape ESP enables searching complex biological networks on multiple attribute fields using logical operators and wildcards. Queries use an intuitive syntax and simple search line interface. ESP is implemented as a Cytoscape plugin and complements existing search functions in the Cytoscape network visualization and analysis software, allowing users to easily identify nodes, edges and subgraphs of interest, even for very large networks. AVAILABILITY: http://conklinwolf.ucsf.edu/genmappwiki/Google_Summer_of_Code_2007/Maital CONTACT: ashkenaz@agri.huji.ac.il.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18445605&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>PEPITO: Improved Discontinuous B-Cell Epitope Prediction Using Multiple Distance Thresholds and Half Sphere Exposure.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18443018</link>
      <description>Publication Date: 2008 Apr 28 PMID: 18443018&lt;br/&gt;Authors: Sweredoski, M. J. - Baldi, P.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Accurate prediction of B-cell epitopes is an important goal of computational immunology. Up to 90% of B-cell epitopes are discontinuous in nature, yet most predictors focus on linear epitopes. Even when the tertiary structure of the antigen is available, the accurate prediction of B-cell epitopes remains challenging. RESULTS: Our predictor, PEPITO, uses a combination of amino acid propensity scores and half sphere exposure values at multiple distances to achieve state-of-the-art performance. PEPITO achieves an Area Under the Curve (AUC) of 75.4 on the Discotope dataset. Additionally, we benchmark PEPITO as well as the Discotope predictor on the more recent Epitome dataset, achieving AUCs of 68.3 and 66.0 respectively. AVAILABILITY: PEPITO is available as part of the SCRATCH suite of protein structure predictors via www.igb.uci.edu.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18443018&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Positive selection drives a correlation between nonsynonymous/ synonymous divergence and functional divergence.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18443017</link>
      <description>Publication Date: 2008 Apr 28 PMID: 18443017&lt;br/&gt;Authors: Tennessen, J. A.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Functional divergence among proteins is often assumed to be strongly influenced by natural selection, as inferred from the ratio of nonsynonymous nucleotide divergence (d(N)) to synonymous nucleotide divergence (d(S)). That is, the more a mutation changes protein function, the more likely it is to be either selected against or selectively favored, and because the d(N)/d(S) ratio is a measure of natural selection, this ratio can be used to predict the degree of functional divergence (d(F)). However, these hypotheses have rarely been experimentally tested. RESULTS: I present a novel method to address this issue, and demonstrate that divergence in bacteria-killing activity among animal antimicrobial peptides is positively correlated with the log of the d(N)/d(S) ratio. The primary cause of this pattern appears to be that positively selected substitutions change protein function more than neutral substitutions do. Thus, the d(N)/d(S) ratio is an accurate estimator of adaptive functional divergence. CONTACT: tennessj@science.oregonstate.edu SUPPLEMENTARY INFORMATION: Supplementary data, including GenBank Accession numbers, are available at Bioinformatics Online.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18443017&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>gpDB: A database of GPCRs, G-proteins, Effectors and their interactions.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18441001</link>
      <description>Publication Date: 2008 Apr 25 PMID: 18441001&lt;br/&gt;Authors: Theodoropoulou, M. C. - Bagos, P. G. - Spyropoulos, I. C. - Hamodrakas, S. J.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: gpDB is a publicly accessible, relational database, containing information about G-proteins, GPCRs and effectors, as well as information concerning known interactions between these molecules. The sequences are classified according to a hierarchy of different classes, families and subfamilies based on literature search. The main innovation besides the classification of G-proteins, GPCRs and effectors is the relational model of the database, describing the known coupling specificity of GPCRs to their respective alpha subunits of G-proteins, and also the specific interaction between G-proteins and their effectors, a unique feature not available in any other database. AVAILABILITY: http://bioinformatics.biol.uoa.gr/gpDB CONTACT: shamodr@biol.uoa.gr SUPPLEMENTARY INFORMATION: Supplementary data are available on Bioinformatics online.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18441001&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>fdrtool: a versatile R package for estimating local and tail area-based false discovery rates.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18441000</link>
      <description>Publication Date: 2008 Apr 25 PMID: 18441000&lt;br/&gt;Authors: Strimmer, K.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: False discovery rate (FDR) methodologies are essential in the study of high-dimensional genomic and proteomic data. The R package &quot;fdrtool&quot; facilitates such analyses by offering a comprehensive set of procedures for FDR estimation. Its distinctive features include: i) many different types of test statistics are allowed as input data, such as p-values, z-scores, correlations, and t-scores; ii) simultaneously, both local FDR and tail area-based FDR values are estimated for all test statistics; iii) empirical null models are fit where possible, thereby taking account of potential over- or underdispersion of the theoretical null. In addition, &quot;fdrtool&quot; provides readily interpretable graphical output, and can be applied to very large scale (in the order of millions of hypotheses) multiple testing problems. Consequently, &quot;fdrtool&quot; implements a flexible FDR estimation scheme that is unified across different test statistics and variants of FDR. AVAILABILITY: The program is freely available from the Comprehensive R Archive Network (http://cran.r-project.org/) under the terms of the GNU General Public License (version 3 or later). CONTACT: strimmer@uni-leipzig.de.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18441000&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>MAMOT: Hidden MArkov MOdeling Tool.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18440999</link>
      <description>Publication Date: 2008 Apr 25 PMID: 18440999&lt;br/&gt;Authors: Schutz, F. - Delorenzi, M.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: Hidden Markov Models are probabilistic models that are well adapted to many tasks in bioinformatics, for example for predicting the occurrence of specific motifs in biological sequences. MAMOT is a command-line program for Unix-like operating systems, including MacOS X, that we developed to allow scientists to apply Hidden Markov Models more easily in their research. One can define the architecture and initial parameters of the model in a text file and then use MAMOT for parameter optimization on example data, decoding (like predicting motif occurrence in sequences) and the production of stochastic sequences generated according to the probabilistic model. Two examples for which models are provided are coiled-coil domains in protein sequences and protein binding sites in DNA. A wealth of useful features include the use of pseudocounts, state tying and fixing of selected parameters in learning, and the inclusion of prior probabilities in decoding. AVAILABILITY: MAMOT is implemented in C++, and is distributed under the GNU General Public Licence (GPL). The software, documentation, and example model files can be found at http://bcf.isb-sib.ch/mamot. CONTACT: Mauro.Delorenzi@isb-sib.ch.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18440999&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>swissPIT: A novel approach for pipelined analysis of mass spectrometry data.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18436540</link>
      <description>Publication Date: 2008 Apr 23 PMID: 18436540&lt;br/&gt;Authors: Quandt, A. - Hernandez, P. - Masselot, A. - Hernandez, C. - Maffioletti, S. - Pautasso, C. - Appel, R. D. - Lisacek, F.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;The identification and characterisation of peptides from tandem mass spectrometry (MS/MS) data represents a critical aspect of proteomics. Today, tandem mass spectrometry analysis is often performed using only a single identification program, achieving identification rates between 10% and 50% (Elias and Gygi, 2007). Besides the development of new analysis tools, recent publications also describe the pipelining of different search programs to increase the identification rate (Keller et al., 2005, Hartler et al., 2007). The swissPIT (Swiss Protein Identification Toolbox) follows this approach but goes a step further by providing the user an expandable multi-tool platform capable of executing workflows to analyze Tandem MS-based data. One of the major problems in proteomics is the absence of standardized workflows to analyze the produced data. This includes the pre-processing part as well as the final identification of peptides and proteins. The main idea of swissPIT is not only the usage of different identification tools in parallel but also the meaningful concatenation of different identification strategies at the same time. The swissPIT is open source software but we also provide a user-friendly web platform, which demonstrates the capabilities of our software and which is available at http://swisspit.cscs.ch upon request for account.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18436540&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Annotation-Modules: A tool for finding significant combinations of multisource annotations for gene lists.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18434345</link>
      <description>Publication Date: 2008 May 8 PMID: 18434345&lt;br/&gt;Authors: Hackenberg, M. - Matthiesen, R.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: The ontological analysis of the gene lists obtained from DNA microarray experiments constitutes an important step in understanding the underlying biology of the analyzed system. Over the last years, many other high-throughput techniques emerged, covering now basically all &quot;omics&quot; fields. However, for some of these techniques the generally used functional ontologies might not be sufficient to describe the biological system represented by the derived gene lists. For a more complete and correct interpretation of these experiments, it is important to extend substantially the number of annotations, adapting the ontological analysis to the new emerging techniques. RESULTS: We developed Annotation-Modules, which improves the current tools in two critical aspects. Firstly, the underlying annotation database implements features from many different fields like gene regulation and expression, sequence properties, evolution and conservation, genomic localization and functional categories - resulting in about 60 different annotation features. Secondly, it examines not only single annotations but also all the combinations, which is important to gain insight into the interplay of different mechanisms in the analyzed biological system. AVAILABILITY: http://web.bioinformatics.cicbiogune.es/AM/AnnotationModules.php.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18434345&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>RNAplex: a fast tool for RNA-RNA interaction search.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18434344</link>
      <description>Publication Date: 2008 Apr 23 PMID: 18434344&lt;br/&gt;Authors: Tafer, H. - Hofacker, I. L.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Regulatory RNAs often unfold their action via RNA-RNA interaction. Transcriptional gene silencing by means of siRNAs and miRNA as well as snoRNA directed RNA editing rely on this mechanism. Additionally ncRNA regulation in bacteria is mainly based upon RNA duplex formation. Finding putative target sites for newly discovered ncRNAs is a lengthy task as tools for cofolding RNA molecules like RNAcofold and RNAup are too slow for genome-wide search. Tools like RNAhybrid that neglect intramolecular interactions have runtimes proportional to O(m n), albeit with a large prefactor. Still in many cases the need for even faster methods exists. RESULTS: We present a new program, RNAplex, especially designed to quickly find possible hybridization sites for a query RNA in large RNA databases. RNAplex uses a slightly different energy model which reduces the computational time by a factor of 10-27 compared to RNAhybrid. In addition a length penalty allows the target search to focus on short highly stable interactions. AVAILABILITY: RNAplex can be downloaded at http://www.tbi.univie.ac.at/~htafer/ CONTACT: ivo@tbi.univie.ac.at.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18434344&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>A Global Pathway Crosstalk Network.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18434343</link>
      <description>Publication Date: 2008 Apr 23 PMID: 18434343&lt;br/&gt;Authors: Li, Y. - Agarwal, P. - Rajagopalan, D.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Given the complex nature of biological systems, pathways often need to function in a coordinated fashion in order to produce appropriate physiological responses to both internal and external stimuli (Hartwell et al., 1999). Therefore, understanding the interaction and crosstalk between pathways is important for understanding the function of both cells and more complex systems. RESULTS: We have developed a computational approach to detect crosstalk among pathways based on protein interactions between the pathway components. We built a global mammalian pathway crosstalk network that includes 580 pathways (covering 4,753 genes) with 1,815 edges between pathways. This crosstalk network follows a power-law distribution: P(k) ~ k^(-gamma), gamma = 1.45, where P(k) is the number of pathways with k neighbors, thus pathway interactions may exhibit the same scale-free phenomenon that has been documented for protein interaction networks. We further used this network to understand colorectal cancer progression to metastasis based on transcriptomic data. CONTACT: yong.2.li@gsk.com.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18434343&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>TOPDOM: database of domains and motifs with conservative location in transmembrane proteins.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18434342</link>
      <description>Publication Date: 2008 Apr 23 PMID: 18434342&lt;br/&gt;Authors: Tusnady, G. E. - Kalmar, L. - Hegyi, H. - Tompa, P. - Simon, I.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: The TOPDOM database is a collection of domains and sequence motifs located consistently on the same side of the membrane in alpha-helical transmembrane proteins. The database was created by scanning well annotated transmembrane protein sequences in the UniProt database by specific domain or motif detecting algorithms. The identified domains or motifs were added to the database if they were uniformly annotated on the same side of the membrane of the various proteins in the UniProt database. The information about the location of the collected domains and motifs can be incorporated into constrained topology prediction algorithms, like HMMTOP, increasing the prediction accuracy. AVAILABILITY: The TOPDOM database and the constrained HMMTOP prediction server are available on the page http://topdom.enzim.hu. CONTACT: tusi@enzim.hu, lkamar@enzim.hu.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18434342&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>OnD-CRF: predicting order and disorder in proteins using conditional random fields.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18430742</link>
      <description>Publication Date: 2008 Apr 21 PMID: 18430742&lt;br/&gt;Authors: Wang, L. - Sauer, U. H.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Order and Disorder prediction using Conditional Random Fields, OnD-CRF, is a new method for accurately predicting the transition between structured and mobile or disordered regions in proteins. OnD-CRF applies CRFs relying on features which are generated from the amino acid sequence and from secondary structure prediction. Benchmarking results based on CASP7 targets, and evaluation with respect to several CASP criteria, rank the OnD-CRF model highest among the fully automatic server group. AVAILABILITY: http://babel.ucmp.umu.se/ond-crf/ CONTACT: Uwe.Sauer@ucmp.umu.se.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18430742&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Eukaryotic transcription factor binding sites - modeling and integrative search methods.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18426806</link>
      <description>Publication Date: 2008 Apr 21 PMID: 18426806&lt;br/&gt;Authors: Hannenhalli, S.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;A comprehensive knowledge of transcription factor binding sites is important for a mechanistic understanding of transcriptional regulation as well as for inferring gene regulatory networks. Because the DNA motif recognized by a transcription factor is typically short and degenerate, computational approaches for identifying binding sites based only on the sequence motif inevitably suffer from high error rates. Current state-of-the-art techniques for improving computational identification of binding sites can be broadly categorized into two classes: (1) Approaches that aim to improve binding motif models by extracting maximal sequence information from experimentally determined binding sites and (2) Approaches that supplement binding motif models with additional genomic or other attributes (such as evolutionary conservation). In this review we will discuss recent attempts to improve computational identification of transcription factor binding sites through these two types of approaches and conclude with thoughts on future development.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18426806&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Prediction of disordered regions in proteins based on the meta approach.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18426805</link>
      <description>Publication Date: 2008 Apr 20 PMID: 18426805&lt;br/&gt;Authors: Ishida, T. - Kinoshita, K.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules, thus these regions sometimes prevent high quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important to obtain high-resolution structural information and to understand the functional aspects of these proteins. RESULTS: We developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web-server for this prediction method named &quot;metaPrDOS&quot;. The method predicts the disorder tendency of each residue using support vector machines from the prediction results of the seven independent predictors. Evaluation of the meta approach was performed using the CASP7 prediction targets to avoid an overestimation due to the inclusion of proteins used in the training set of some component predictors. As a result, the meta approach achieved higher prediction accuracy than all methods participating in CASP7. AVAILABILITY: http://prdos.hgc.jp/meta/ CONTACT: t-ishida@hgc.jp.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18426805&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Comparative conservation analysis of the human mitotic phosphoproteome.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18426804</link>
      <description>Publication Date: 2008 Apr 20 PMID: 18426804&lt;br/&gt;Authors: Malik, R. - Nigg, E. A. - Korner, R.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: A key challenge in phosphoproteomic studies is to distinguish functionally relevant phosphorylation sites from potentially &quot;silent&quot; phosphorylation. Considering that relevant phosphorylation sites are expected to be better conserved during evolution than overall Serine, Threonine, and Tyrosine (S/T/Y) residues, we asked whether this can be directly demonstrated through statistical analysis, using a large experimental dataset. RESULTS: Analyzing phosphoproteomic data derived from the human mitotic spindle apparatus, we found that 95.2% of 1744 phosphorylation sites are conserved in at least one of six other vertebrate species. Using a new score, termed CZ-Score, we demonstrate that phosphorylation sites are significantly better conserved than other S/T/Y sites, a conclusion validated from several kinase consensus motifs. Most importantly, phosphorylation sites with experimentally verified biological functions were significantly better conserved than other phosphorylation sites, indicating that analysis utilizing evolutionary conservation may constitute a powerful basis for the development of improved phosphorylation site predictors. CONTACT: malik@biochem.mpg.de.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18426804&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>A note on the false discovery rate and inconsistent comparisons between experiments.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18424815</link>
      <description>Publication Date: 2008 May 15 PMID: 18424815&lt;br/&gt;Authors: Higdon, R. - van Belle, G. - Kolker, E.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: The false discovery rate (FDR) has been widely adopted to address the multiple comparisons issue in high-throughput experiments such as microarray gene-expression studies. However, while the FDR is quite useful as an approach to limit false discoveries within a single experiment, like other multiple comparison corrections it may be an inappropriate way to compare results across experiments. This article uses several examples based on gene-expression data to demonstrate the potential misinterpretations that can arise from using FDR to compare across experiments. Researchers should be aware of these pitfalls and wary of using FDR to compare experimental results. FDR should be augmented with other measures such as p-values and expression ratios. It is worth including standard error and variance information for meta-analyses and, if possible, the raw data for re-analyses. This is especially important for high-throughput studies because data are often re-used for different objectives, including comparing common elements across many experiments. No single error rate or data summary may be appropriate for all of the different objectives.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18424815&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Fast grid layout algorithm for biological networks with sweep calculation.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18424458</link>
      <description>Publication Date: 2008 Apr 18 PMID: 18424458&lt;br/&gt;Authors: Kojima, K. - Nagasaki, M. - Miyano, S.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Properly drawn biological networks are of great help in the comprehension of their characteristics. The quality of the layouts for retrieved biological networks is critical for pathway databases. However, since it is unrealistic to manually draw biological networks for every retrieval, automatic drawing algorithms are essential. Grid layout algorithms handle various biological properties such as aligning vertices having the same attributes and complicated positional constraints according to their subcellular localizations; thus, they succeed in providing biologically comprehensible layouts. However, existing grid layout algorithms are not suitable for real-time drawing, which is one of requisites for applications to pathway databases, due to their high computational cost. In addition, they do not consider edge directions and their resulting layouts lack traceability for biochemical reactions and gene regulations, which are the most important features in biological networks. RESULTS: We devise a new calculation method termed sweep calculation and reduce the time complexity of the current grid layout algorithms through its encoding and decoding processes. We conduct practical experiments by using 95 pathway models of various sizes from TRANSPATH and show that our new grid layout algorithm is much faster than existing grid layout algorithms. For the cost function, we introduce a new component that penalizes undesirable edge directions to avoid the lack of traceability in pathways due to the differences in direction between in-edges and out-edges of each vertex. AVAILABILITY: Java implementations of our layout algorithms are available in Cell Illustrator. 
CONTACT: masao@ims.u-tokyo.ac.jp.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18424458&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>A system for generating transcription regulatory networks with combinatorial control of transcription.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18400774</link>
      <description>Publication Date: 2008 May 15 PMID: 18400774&lt;br/&gt;Authors: Roy, S. - Werner-Washburne, M. - Lane, T.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;We have developed a new software system, REgulatory Network generator with COmbinatorial control (RENCO), for automatic generation of differential equations describing pre-transcriptional combinatorics in artificial regulatory networks. RENCO has the following benefits: (a) it explicitly models protein-protein interactions among transcription factors, (b) it captures combinatorial control of transcription factors on target genes and (c) it produces output in Systems Biology Markup Language (SBML) format, which allows these equations to be directly imported into existing simulators. Explicit modeling of the protein interactions allows RENCO to incorporate greater mechanistic detail of the transcription machinery compared to existing models and can provide a better assessment of algorithms for regulatory network inference. AVAILABILITY: RENCO is a C++ command line program, available at http://sourceforge.net/projects/renco/&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18400774&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Domain annotation of trimeric autotransporter adhesins--daTAA.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18397894</link>
      <description>Publication Date: 2008 May 15 PMID: 18397894&lt;br/&gt;Authors: Szczesny, P. - Lupas, A.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Trimeric autotransporter adhesins (TAAs), such as Yersinia YadA, Neisseria NadA, Moraxella UspAs, Haemophilus Hia and Bartonella BadA, are important pathogenicity factors of proteobacteria. Their high sequence diversity and distinct mosaic-like structure lead to difficulties in the annotation of their sequences. These stem from the large number of short repeats, the presence of compositionally unusual coiled-coils, fuzzy domain boundaries and regions of seemingly low sequence complexity. RESULTS: We have developed a workflow, named daTAA, for the accurate domain annotation of TAAs. Its core consists of manually curated alignments and of knowledge-based rules that enhance assignments made by sequence similarity. Compared to general domain annotation servers such as PFAM, daTAA captures more domains and provides more sensitive domain detection, as well as integrated and detailed coiled-coil assignments. AVAILABILITY: The daTAA server is freely accessible at http://toolkit.tuebingen.mpg.de/dataa&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18397894&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>siRNA specificity searching incorporating mismatch tolerance data.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18397893</link>
      <description>Publication Date: 2008 May 15 PMID: 18397893&lt;br/&gt;Authors: Chalk, A. M. - Sonnhammer, E. L.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;Artificially synthesized short interfering RNAs (siRNAs) are widely used in functional genomics to knock down specific target genes. One ongoing challenge is to guarantee that the siRNA does not elicit off-target effects. Initial reports suggested that siRNAs were highly sequence-specific; however, subsequent data indicates that this is not necessarily the case. It is still uncertain what level of similarity and other rules are required for an off-target effect to be observed, and scoring schemes have not been developed to look beyond simple measures such as the number of mismatches or the number of consecutive matching bases present. We created design rules for predicting the likelihood of a non-specific effect and present a web server that allows the user to check the specificity of a given siRNA in a flexible manner using a combination of methods. The server finds potential off-target matches in the corresponding RefSeq database and ranks them according to a scoring system based on experimental studies of specificity. AVAILABILITY: The server is available at http://informatics-eskitis.griffith.edu.au/SpecificityServer.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18397893&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Peak bagging for peptide mass fingerprinting.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18397892</link>
      <description>Publication Date: 2008 May 15 PMID: 18397892&lt;br/&gt;Authors: He, Z. - Yang, C. - Yu, W.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Mass Spectrometry (MS)-based protein identification via peptide mass fingerprinting (PMF) is a key component in high-throughput proteome research. While PMF was the first commonly used protein identification method and provides higher throughput than the tandem MS-based method, its accuracy is lower than that of the tandem MS method. Thus, it is desirable to develop a PMF-based algorithm with higher protein identification accuracy to facilitate proteome research. RESULTS: We propose a peak bagging method for single MS-based protein identification. It combines results from multiple PMF algorithms, where each PMF algorithm takes a random peak subset as input. Evaluation with a set of real MALDI-TOF MS spectra shows that the new peak bagging method provides consistent improvements over the single PMF algorithm.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18397892&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>UniProtJAPI: a remote API for accessing UniProt data.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18390879</link>
      <description>Publication Date: 2008 May 15 PMID: 18390879&lt;br/&gt;Authors: Patient, S. - Wieser, D. - Kleen, M. - Kretschmann, E. - Jesus Martin, M. - Apweiler, R.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;Programmatic access to the UniProt Knowledgebase (UniProtKB) is essential for many bioinformatics applications dealing with protein data. We have created a Java library named UniProtJAPI, which facilitates the integration of UniProt data into Java-based software applications. The library supports queries and similarity searches that return UniProtKB entries in the form of Java objects. These objects contain functional annotations or sequence information associated with a UniProt entry. Here, we briefly describe the UniProtJAPI and demonstrate its usage.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18390879&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>ASPicDB: a database resource for alternative splicing analysis.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18388144</link>
      <description>Publication Date: 2008 May 15 PMID: 18388144&lt;br/&gt;Authors: Castrignano, T. - D'Antonio, M. - Anselmo, A. - Carrabino, D. - D'Onorio De Meo, A. - D'Erchia, A. M. - Licciulli, F. - Mangiulli, M. - Mignone, F. - Pavesi, G. - Picardi, E. - Riva, A. - Rizzi, R. - Bonizzoni, P. - Pesole, G.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Alternative splicing has recently emerged as a key mechanism responsible for the expansion of transcriptome and proteome complexity in human and other organisms. Although several online resources devoted to alternative splicing analysis are available they may suffer from limitations related both to the computational methodologies adopted and to the extent of the annotations they provide that prevent the full exploitation of the available data. Furthermore, current resources provide limited query and download facilities. RESULTS: ASPicDB is a database designed to provide access to reliable annotations of the alternative splicing pattern of human genes and to the functional annotation of predicted splicing isoforms. Splice-site detection and full-length transcript modeling have been carried out by a genome-wide application of the ASPic algorithm, based on the multiple alignments of gene-related transcripts (typically a Unigene cluster) to the genomic sequence, a strategy that greatly improves prediction accuracy compared to methods based on independent and progressive alignments. Enhanced query and download facilities for annotations and sequences allow users to select and extract specific sets of data related to genes, transcripts and introns fulfilling a combination of user-defined criteria. Several tabular and graphical views of the results are presented, providing a comprehensive assessment of the functional implication of alternative splicing in the gene set under investigation. 
ASPicDB, which is updated monthly, also includes information on tissue-specific splicing patterns of normal and cancer cells, based on available EST sequences and their library source annotation. AVAILABILITY: www.caspur.it/ASPicDB&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18388144&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Prediction of the translocon-mediated membrane insertion free energies of protein sequences.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18388143</link>
      <description>Publication Date: 2008 May 15 PMID: 18388143&lt;br/&gt;Authors: Park, Y. - Helms, V.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Helical membrane proteins (HMPs) play crucial roles in a variety of cellular processes. Unlike water-soluble proteins, HMPs need not only to fold but also get inserted into the membrane to be fully functional. This process of membrane insertion is mediated by the translocon complex. Thus, it is of great interest to develop computational methods for predicting the translocon-mediated membrane insertion free energies of protein sequences. RESULT: We have developed Membrane Insertion (MINS), a novel sequence-based computational method for predicting the membrane insertion free energies of protein sequences. A benchmark test gives a correlation coefficient of 0.74 between predicted and observed free energies for 357 known cases, which corresponds to a mean unsigned error of 0.41 kcal/mol. These results are significantly better than those obtained by traditional hydropathy analysis. Moreover, the ability of MINS to reasonably predict membrane insertion free energies of protein sequences allows for effective identification of transmembrane (TM) segments. Subsequently, MINS was applied to predict the membrane insertion free energies of 316 TM segments found in known structures. An in-depth analysis of the predicted free energies reveals a number of interesting findings about the biogenesis and structural stability of HMPs. AVAILABILITY: A web server for MINS is available at http://service.bioinformatik.uni-saarland.de/mins&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18388143&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>MTMDAT: Automated analysis and visualization of mass spectrometry data for tertiary and quaternary structure probing of proteins.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18388142</link>
      <description>Publication Date: 2008 May 15 PMID: 18388142&lt;br/&gt;Authors: Hennig, J. - Hennig, K. D. - Sunnerhagen, M.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;In structural biology and structural genomics, nuclear magnetic resonance (NMR) spectroscopy and crystallography are the methods of choice, but sample requirements can be hard to fulfil. Valuable structural information can also be obtained by using a combination of limited proteolysis and mass spectrometry, providing not only knowledge of how to improve sample conditions for crystallization trials or NMR spectroscopy by gaining insight into subdomain identities but also probing tertiary and quaternary structure, folding and stability, ligand binding, protein interactions and the location of post-translational modifications. For high-throughput studies and larger proteins, however, this experimentally fast and easy approach produces considerable amounts of data, which until now has made the evaluation exceedingly laborious if at all manually possible. MTMDAT, equipped with a browser-like graphical user interface, accelerates this evaluation manifold by automated peak picking, assignment, data processing and visualization. AVAILABILITY: MTMDAT can be downloaded from the following page: http://www.cms.liu.se/chemistry/molbiotech/maria_sunnerhagens_group/mtmdat by clicking on the corresponding links (windows- or unix-based) together with the manual and example files. The program is free for academic/non-commercial purposes only.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18388142&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>GeneTrack--a genomic data processing and visualization framework.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18388141</link>
      <description>Publication Date: 2008 May 15 PMID: 18388141&lt;br/&gt;Authors: Albert, I. - Wachi, S. - Jiang, C. - Pugh, B. F.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: High-throughput 'ChIP-chip' and 'ChIP-seq' methodologies generate sufficiently large data sets that analysis poses significant informatics challenges, particularly for research groups with modest computational support. To address this challenge, we devised a software platform for storing, analyzing and visualizing high resolution genome-wide binding data. GeneTrack automates several steps of a typical data processing pipeline, including smoothing and peak detection, and facilitates dissemination of the results via the web. Our software is freely available via the Google Project Hosting environment at http://genetrack.googlecode.com&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18388141&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Assigning functional linkages to proteins using phylogenetic profiles and continuous phenotypes.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18381403</link>
      <description>Publication Date: 2008 May 15 PMID: 18381403&lt;br/&gt;Authors: Gonzalez, O. - Zimmer, R.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: A class of non-homology-based methods for protein function prediction relies on the assumption that genes linked to a phenotypic trait are preferentially conserved among organisms that share the trait. These methods typically compare pairs of binary strings, where one string encodes the phylogenetic distribution of a trait and the other of a protein. In this work, we extended the approach to automatically deal with continuous phenotypes. RESULTS: Rather than use a priori rules, which can be very subjective, to construct binary profiles from continuous phenotypes, we propose to systematically explore thresholds which can meaningfully separate the phenotype values. We illustrate our method by analyzing optimal growth temperatures, and demonstrate its usefulness by automatically retrieving genes which have been associated with thermophilic growth. We also apply the general approach, for the first time, to optimal growth pH, and make novel predictions. Finally, we show that our method can also be applied to other properties which may not be classically considered as phenotypes. Specifically, we studied correlations between genome size and the distribution of genes.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18381403&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>FT-COMAR: fault tolerant three-dimensional structure reconstruction from protein contact maps.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18381401</link>
      <description>Publication Date: 2008 May 15 PMID: 18381401&lt;br/&gt;Authors: Vassura, M. - Margara, L. - Di Lena, P. - Medri, F. - Fariselli, P. - Casadio, R.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;Fault Tolerant Contact Map Reconstruction (FT-COMAR) is a heuristic algorithm for the reconstruction of the protein three-dimensional structure from (possibly) incomplete (i.e. containing unknown entries) and noisy contact maps. FT-COMAR runs within minutes, allowing its application to a large-scale number of predictions. AVAILABILITY: http://bioinformatics.cs.unibo.it/FT-COMAR&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18381401&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>An improved physico-chemical model of hybridization on high-density oligonucleotide microarrays.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18378525</link>
      <description>Publication Date: 2008 May 15 PMID: 18378525&lt;br/&gt;Authors: Ono, N. - Suzuki, S. - Furusawa, C. - Agata, T. - Kashiwagi, A. - Shimizu, H. - Yomo, T.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: High-density DNA microarrays provide useful tools to analyze gene expression comprehensively. However, it is still difficult to obtain accurate expression levels from the observed microarray data because the signal intensity is affected by complicated factors involving probe-target hybridization, such as non-linear behavior of hybridization, non-specific hybridization, and folding of probe and target oligonucleotides. Various methods for microarray data analysis have been proposed to address this problem. In our previous report, we presented a benchmark analysis of probe-target hybridization using artificially synthesized oligonucleotides as targets, in which the effect of non-specific hybridization was negligible. The results showed that the preceding models explained the behavior of probe-target hybridization only within a narrow range of target concentrations. More accurate models are required for quantitative expression analysis. RESULTS: The experiments showed that finiteness of both probe and target molecules should be considered to explain the hybridization behavior. In this article, we present an extension of the Langmuir model that reproduces the experimental results consistently. In this model, we introduced the effects of secondary structure formation, and dissociation of the probe-target duplex during washing after hybridization. The results will provide useful methods for the understanding and analysis of microarray experiments. 
AVAILABILITY: The method was implemented for the R software and can be downloaded from our website (http://www-shimizu.ist.osaka-u.ac.jp/shimizu_lab/FHarray/).&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18378525&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Probabilistic multi-class multi-kernel learning: on protein fold recognition and remote homology detection.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18378524</link>
      <description>Publication Date: 2008 May 15 PMID: 18378524&lt;br/&gt;Authors: Damoulas, T. - Girolami, M. A.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: The problems of protein fold recognition and remote homology detection have recently attracted a great deal of interest as they represent challenging multi-feature multi-class problems for which modern pattern recognition methods achieve only modest levels of performance. As with many pattern recognition problems, there are multiple feature spaces or groups of attributes available, such as global characteristics like the amino-acid composition (C), predicted secondary structure (S), hydrophobicity (H), van der Waals volume (V), polarity (P), polarizability (Z), as well as attributes derived from local sequence alignment such as the Smith-Waterman scores. This raises the need for a classification method that is able to assess the contribution of these potentially heterogeneous object descriptors while utilizing such information to improve predictive performance. To that end, we offer a single multi-class kernel machine that informatively combines the available feature groups and, as is demonstrated in this article, is able to provide the state-of-the-art in performance accuracy on the fold recognition problem. Furthermore, the proposed approach provides some insight by assessing the significance of recently introduced protein features and string kernels. The proposed method is well-founded within a Bayesian hierarchical framework and a variational Bayes approximation is derived which allows for efficient CPU processing times. RESULTS: The best performance which we report on the SCOP PDB-40D benchmark data-set is a 70% accuracy by combining all the available feature groups from global protein characteristics but also including sequence-alignment features. 
We offer an 8% improvement on the best reported performance that combines multi-class k-nn classifiers while at the same time reducing computational costs and assessing the predictive power of the various available features. Furthermore, we examine the performance of our methodology on the SCOP 1.53 benchmark data-set that simulates remote homology detection and examine the combination of various state-of-the-art string kernels that have recently been proposed.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18378524&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>CompariMotif: quick and easy comparisons of sequence motifs.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18375965</link>
      <description>Publication Date: 2008 May 15 PMID: 18375965&lt;br/&gt;Authors: Edwards, R. J. - Davey, N. E. - Shields, D. C.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;CompariMotif is a novel tool for making motif-motif comparisons, identifying and describing similarities between regular expression motifs. CompariMotif can identify a number of different relationships between motifs, including exact matches, variants of degenerate motifs and complex overlapping motifs. Motif relationships are scored using shared information content, allowing the best matches to be easily identified in large comparisons. Many input and search options are available, enabling a list of motifs to be compared to itself (to identify recurring motifs) or to datasets of known motifs. AVAILABILITY: CompariMotif can be run online at http://bioware.ucd.ie/ and is freely available for academic use as a set of open source Python modules under a GNU General Public License from http://bioinformatics.ucd.ie/shields/software/comparimotif/&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18375965&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Analysis of correlated mutations in HIV-1 protease using spectral clustering.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18375964</link>
      <description>Publication Date: 2008 May 15 PMID: 18375964&lt;br/&gt;Authors: Liu, Y. - Eyal, E. - Bahar, I.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: The ability of human immunodeficiency virus-1 (HIV-1) protease to develop mutations that confer multi-drug resistance (MDR) has been a major obstacle in designing rational therapies against HIV. Resistance is usually imparted by a cooperative mechanism that can be elucidated by a covariance analysis of sequence data. Identification of such correlated substitutions of amino acids may be obscured by evolutionary noise. RESULTS: HIV-1 protease sequences from patients subjected to different specific treatments (set 1), and from untreated patients (set 2) were subjected to sequence covariance analysis by evaluating the mutual information (MI) between all residue pairs. Spectral clustering of the resulting covariance matrices disclosed two distinctive clusters of correlated residues: the first, observed in set 1 but absent in set 2, contained residues involved in MDR acquisition; the second, referred to for short as the phylogenetic cluster, included those residues differentiated in the various HIV-1 protease subtypes. The MDR cluster occupies sites close to the central symmetry axis of the enzyme, which overlap with the global hinge region identified from coarse-grained normal-mode analysis of the enzyme structure. The phylogenetic cluster, on the other hand, occupies solvent-exposed and highly mobile regions. 
This study demonstrates (i) the possibility of distinguishing between the correlated substitutions resulting from neutral mutations and those induced by MDR upon appropriate clustering analysis of sequence covariance data and (ii) a connection between global dynamics and functional substitution of amino acids.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18375964&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Visualization of unfavorable interactions in protein folds.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18375963</link>
      <description>Publication Date: 2008 May 1 PMID: 18375963&lt;br/&gt;Authors: Weichenberger, C. X. - Byzia, P. - Sippl, M. J.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;Three-dimensional structures of proteins contain errors which often originate from limitations of the experimental techniques employed. Such errors frequently result in unfavorable atomic interactions. Here we present a new web service, called Interaction Viewer, for the visualization and correction of such errors. We show how the Interaction Viewer is used in combination with the NQ-Flipper service to spot strained asparagine and glutamine rotamers and we emphasize the convenience of this service in correcting such errors. AVAILABILITY: The web service is integrated with the NQ-Flipper service and accessible at http://flipper.services.came.sbg.ac.at&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18375963&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Linear time-varying models can reveal non-linear interactions of biomolecular regulatory networks using multiple time-series data.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18367478</link>
      <description>Publication Date: 2008 May 15 PMID: 18367478&lt;br/&gt;Authors: Kim, J. - Bates, D. G. - Postlethwaite, I. - Heslop-Harrison, P. - Cho, K. H.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Inherent non-linearities in biomolecular interactions make the identification of network interactions difficult. One of the principal problems is that all methods based on the use of linear time-invariant models will have fundamental limitations in their capability to infer certain non-linear network interactions. Another difficulty is the multiplicity of possible solutions, since, for a given dataset, there may be many different possible networks which generate the same time-series expression profiles. RESULTS: A novel algorithm for the inference of biomolecular interaction networks from temporal expression data is presented. Linear time-varying models, which can represent a much wider class of time-series data than linear time-invariant models, are employed in the algorithm. From time-series expression profiles, the model parameters are identified by solving a non-linear optimization problem. In order to systematically reduce the set of possible solutions for the optimization problem, a filtering process is performed using a phase-portrait analysis with random numerical perturbations. The proposed approach has the advantages of not requiring the system to be in a stable steady state, of using time-series profiles which have been generated by a single experiment, and of allowing non-linear network interactions to be identified. The ability of the proposed algorithm to correctly infer network interactions is illustrated by its application to three examples: a non-linear model for cAMP oscillations in Dictyostelium discoideum, the cell-cycle data for Saccharomyces cerevisiae and a large-scale non-linear model of a group of synchronized Dictyostelium cells. 
AVAILABILITY: The software used in this article is available from http://sbie.kaist.ac.kr/software&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18367478&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Bayesian inference of the sites of perturbations in metabolic pathways via Markov chain Monte Carlo.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18356193</link>
      <description>Publication Date: 2008 May 1 PMID: 18356193&lt;br/&gt;Authors: Jayawardhana, B. - Kell, D. B. - Rattray, M.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Genetic modifications or pharmaceutical interventions can influence multiple sites in metabolic pathways, and often these are 'distant' from the primary effect. In this regard, the ability to identify target and off-target effects of a specific compound or gene therapy is both a major challenge and critical in drug discovery. RESULTS: We applied Markov Chain Monte Carlo (MCMC) for parameter estimation and perturbation identification in the kinetic modeling of metabolic pathways. Variability in the steady-state measurements in cells taken from a population can be caused by differences in initial conditions within the population, by variation of parameters among individuals and by possible measurement noise. MCMC-based parameter estimation is proposed as a method to help in inferring parameter distributions, taking into account uncertainties in the initial conditions and in the measurement data. The inferred parameter distributions are then used to predict changes in the network via a simple classification method. The proposed technique is applied to analyze changes in the pathways of pyruvate metabolism of mutants of Lactococcus lactis, based on previously published experimental data. AVAILABILITY: MATLAB code used in the simulations is available from ftp://anonymous@dbkweb.mib.man.ac.uk/pub/Bioinformatics_BJ.zip&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18356193&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Scaffolding and validation of bacterial genome assemblies using optical restriction maps.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18356192</link>
      <description>Publication Date: 2008 May 15 PMID: 18356192&lt;br/&gt;Authors: Nagarajan, N. - Read, T. D. - Pop, M.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: New, high-throughput sequencing technologies have made it feasible to cheaply generate vast amounts of sequence information from a genome of interest. The computational reconstruction of the complete sequence of a genome is complicated by specific features of these new sequencing technologies, such as the short length of the sequencing reads and the absence of mate-pair information. In this article we propose methods to overcome such limitations by incorporating information from optical restriction maps. RESULTS: We demonstrate the robustness of our methods to sequencing and assembly errors using extensive experiments on simulated datasets. We then present the results obtained by applying our algorithms to data generated from two bacterial genomes, Yersinia aldovae and Yersinia kristensenii. The resulting assemblies contain a single scaffold covering a large fraction of the respective genomes, suggesting that the careful use of optical maps can provide a cost-effective framework for the assembly of genomes. AVAILABILITY: The tools described here are available as an open-source package at ftp://ftp.cbcb.umd.edu/pub/software/soma&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18356192&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>An analytical pipeline for genomic representations used for cytosine methylation studies.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18353789</link>
      <description>Publication Date: 2008 May 1 PMID: 18353789&lt;br/&gt;Authors: Thompson, R. F. - Reimers, M. - Khulan, B. - Gissot, M. - Richmond, T. A. - Chen, Q. - Zheng, X. - Kim, K. - Greally, J. M.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Representations of the genome can be generated by the selection of a subpopulation of restriction fragments using ligation-mediated PCR. Such representations form the basis for a number of high-throughput assays, including the HELP assay to study cytosine methylation. We find that HELP data analysis is complicated not only by PCR amplification heterogeneity but also by a complex and variable distribution of cytosine methylation. To address this, we created an analytical pipeline and novel normalization approach that improves concordance between microarray-derived data and single locus validation results, demonstrating the value of the analytical approach. A major influence on the PCR amplification is the size of the restriction fragment, requiring a quantile normalization approach that reduces the influence of fragment length on signal intensity. Here we describe all of the components of the pipeline, which can also be applied to data derived from other assays based on genomic representations.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18353789&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Combining statistical alignment and phylogenetic footprinting to detect regulatory elements.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18353788</link>
      <description>Publication Date: 2008 May 15 PMID: 18353788&lt;br/&gt;Authors: Satija, R. - Pachter, L. - Hein, J.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Traditional alignment-based phylogenetic footprinting approaches make predictions on the basis of a single assumed alignment. The predictions are therefore highly sensitive to alignment errors or regions of alignment uncertainty. Alternatively, statistical alignment methods provide a framework for performing phylogenetic analyses by examining a distribution of alignments. RESULTS: We developed a novel algorithm for predicting functional elements by combining statistical alignment and phylogenetic footprinting (SAPF). SAPF simultaneously performs both alignment and annotation by combining phylogenetic footprinting techniques with a hidden Markov model (HMM) transducer-based multiple alignment model, and can analyze sequence data from multiple sequences. We assessed SAPF's predictive performance on two simulated datasets and three well-annotated cis-regulatory modules from newly sequenced Drosophila genomes. The results demonstrate that removing the traditional dependence on a single alignment can significantly augment the predictive performance, especially when there is uncertainty in the alignment of functional regions. AVAILABILITY: SAPF is freely available to download online at http://www.stats.ox.ac.uk/~satija/SAPF/&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18353788&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Integrating ARC grid middleware with Taverna workflows.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18353787</link>
      <description>Publication Date: 2008 May 1 PMID: 18353787&lt;br/&gt;Authors: Krabbenhoft, H. N. - Moller, S. - Bayer, D.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: This work presents two independent approaches for a seamless integration of computational grids with the bioinformatics workflow suite Taverna. These are supported by a unique relational database that links applications with grid resources and presents them as workflow elements. A web portal facilitates its collaborative maintenance. The first approach implements a gateway service to handle authentication certificates and all communication with the grid. It reads the database to spawn web services for workflow elements which are in turn used by Taverna. The second approach lets Taverna communicate with the grid on its own, by means of a newly developed plug-in. It reads the database and executes the needed tasks directly on the grid. While the gateway service is non-intrusive, the plug-in has technical advantages, e.g. by allowing data to remain on the grid while being passed between workflow elements. AVAILABILITY: http://grid.inb.uni-luebeck.de/&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18353787&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Celestial3D: a novel method for 3D visualization of familial data.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18346980</link>
      <description>Publication Date: 2008 May 1 PMID: 18346980&lt;br/&gt;Authors: Loh, A. M. - Wiltshire, S. - Emery, J. - Carter, K. W. - Palmer, L. J.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: Traditional two-dimensional (2D) software programs for drawing pedigrees are limited when dealing with extended pedigrees. In successive generations, the number of individuals grows exponentially, leading to an unworkable amount of space required in the horizontal direction for 2D displays. In addition, it is not always possible to place closely related individuals near each other due to the lack of space in 2D. To address these issues we have developed three-dimensional (3D) pedigree drawing techniques to enable clearer visualization of extended pedigrees. Currently no other methods are available for displaying extended pedigrees in 3D. We have made freely available a software tool--'Celestial3D'--that implements these novel techniques. AVAILABILITY: Freely available to non-commercial users.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18346980&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Unequal group variances in microarray data analyses.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18344518</link>
      <description>Publication Date: 2008 May 1 PMID: 18344518&lt;br/&gt;Authors: Demissie, M. - Mascialino, B. - Calza, S. - Pawitan, Y.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: In searching for differentially expressed (DE) genes in microarray data, we often observe a fraction of the genes to have unequal variability between groups. This is not an issue in large samples, where a valid test exists that uses individual variances separately. The problem arises in the small-sample setting, where the approximately valid Welch test lacks sensitivity, while the more sensitive moderated t-test assumes equal variance. METHODS: We introduce a moderated Welch test (MWT) that allows unequal variance between groups. It is based on (i) weighting of pooled and unpooled standard errors and (ii) improved estimation of the gene-level variance that exploits the information from across the genes. RESULTS: When a non-trivial proportion of genes has unequal variability, false discovery rate (FDR) estimates based on the standard t and moderated t-tests are often too optimistic, while the standard Welch test has low sensitivity. The MWT is shown to (i) perform better than the standard t, the standard Welch and the moderated t-tests when the variances are unequal between groups and (ii) perform similarly to the moderated t, and better than the standard t and Welch tests when the group variances are equal. These results mean that MWT is more reliable than other existing tests over a wider range of data conditions. AVAILABILITY: An R package to perform MWT is available at http://www.meb.ki.se/~yudpaw&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18344518&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>GlycoBase and autoGU: tools for HPLC-based glycan analysis.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18344517</link>
      <description>Publication Date: 2008 May 1 PMID: 18344517&lt;br/&gt;Authors: Campbell, M. P. - Royle, L. - Radcliffe, C. M. - Dwek, R. A. - Rudd, P. M.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;SUMMARY: The development of robust high-performance liquid chromatography (HPLC) technologies continues to improve the detailed analysis and sequencing of glycan structures released from glycoproteins. Here, we present a database (GlycoBase) and analytical tool (autoGU) to assist the interpretation and assignment of HPLC-glycan profiles. GlycoBase is a relational database which contains the HPLC elution positions for over 350 2-AB labelled N-glycan structures together with predicted products of exoglycosidase digestions. AutoGU assigns provisional structures to each integrated HPLC peak and, when used in combination with exoglycosidase digestions, progressively assigns each structure automatically based on the footprint data. These tools are potentially very promising and facilitate basic research as well as the quantitative high-throughput analysis of low concentrations of glycans released from glycoproteins. AVAILABILITY: http://glycobase.ucd.ie&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18344517&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Biological sequence classification utilizing positive and unlabeled data.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18344247</link>
      <description>Publication Date: 2008 May 1 PMID: 18344247&lt;br/&gt;Authors: Xiao, Y. - Segal, M. R.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: In the genomics setting, an increasingly common data configuration consists of a small set of sequences possessing a targeted property (positive instances) amongst a large set of sequences for which class membership is unknown (unlabeled instances). Traditional two-class classification methods do not effectively handle such data. RESULTS: Here, we develop a novel method, likely positive-iterative classification (LP-IC) for this problem, and contrast its performance with the few existing methods, most of which were devised and utilized in the text classification context. LP-IC employs an iterative classification scheme and introduces a class dispersion measure, adopted from unsupervised clustering approaches, to monitor the model selection process. Using two case studies--prediction of HLA binding, and alternative splicing conservation between human and mouse--we show that LP-IC provides superior performance to existing methodologies in terms of: (i) combined accuracy and precision in positive identification from the unlabeled set; and (ii) predictive performance of the resultant classifiers on independent test data.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18344247&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>ChIPCodis: mining complex regulatory systems in yeast by concurrent enrichment analysis of chip-on-chip data.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18339638</link>
      <description>Publication Date: 2008 May 1 PMID: 18339638&lt;br/&gt;Authors: Abascal, F. - Carmona-Saez, P. - Carazo, J. M. - Pascual-Montano, A.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Eukaryotic genes are often regulated by multiple transcription factors (TFs). Depending on the interactions among different TFs the expression of a gene can be tuned to respond to diverse environmental conditions. Chip-on-chip experiments provide a snapshot of which TFs are bound in vivo to which genes in a particular condition, and have been applied to characterize the regulatory code of yeast under several experimental settings. ChIPCodis mines this data to provide new insights about how the expression of a particular group of genes is regulated. For a given list of yeast genes ChIPCodis determines which combinations of TFs are significantly over-represented in a series of environmental conditions. AVAILABILITY: http://chipcodis.dacya.ucm.es&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18339638&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>PlasmoGF: an integrated system for comparative genomics and phylogenetic analysis of Plasmodium gene families.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18337260</link>
      <description>Publication Date: 2008 May 1 PMID: 18337260&lt;br/&gt;Authors: Xu, X. - Wu, J. - Xiao, J. - Tan, Y. - Bao, Q. - Zhao, F. - Li, X.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;Malaria, one of the world's most common diseases, is caused by the intracellular protozoan parasite known as Plasmodium. Recently, with the arrival of several malaria parasite genomes, we established an integrated system named PlasmoGF for comparative genomics and phylogenetic analysis of Plasmodium gene families. Gene families were clustered using the Markov Cluster algorithm implemented in the TribeMCL program and could be searched using keywords, gene-family information, domain composition, Gene Ontology and BLAST. Moreover, a number of useful bioinformatics tools were implemented to facilitate the analysis of these putative Plasmodium gene families, including gene retrieval, annotation, sequence alignment, phylogeny construction and visualization. In the current version, PlasmoGF contained 8980 sets of gene families derived from six malaria parasite genomes: Plasmodium falciparum, P. berghei, P. knowlesi, P. chabaudi, P. vivax and P. yoelii. The availability of such a highly integrated system would be of great interest for the community of researchers working on malaria parasite phylogenomics. AVAILABILITY: PlasmoGF is freely available at http://bioinformatics.zj.cn/pgf/&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18337260&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Optimizing the size of the sequence profiles to increase the accuracy of protein sequence alignments generated by profile-profile algorithms.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18337259</link>
      <description>Publication Date: 2008 May 1 PMID: 18337259&lt;br/&gt;Authors: Poleksic, A. - Fienup, M.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: Profile-based protein homology detection algorithms are valuable tools in genome annotation and protein classification. By utilizing information present in the sequences of homologous proteins, profile-based methods are often able to detect extremely weak relationships between protein sequences, as evidenced by the large-scale benchmarking experiments such as CASP and LiveBench. RESULTS: We study the relationship between the sensitivity of a profile-profile method and the size of the sequence profile, which is defined as the average number of different residue types observed at the profile's positions. We also demonstrate that improvements in the sensitivity of a profile-profile method can be made by incorporating a profile-dependent scoring scheme, such as position-specific background frequencies. The techniques presented in this article are implemented in the alignment algorithm UNI-FOLD. When tested against other well-established methods for fold recognition, UNI-FOLD shows increased sensitivity and specificity in detecting remote relationships between protein sequences. AVAILABILITY: The UNI-FOLD web server can be accessed at http://blackhawk.cs.uni.edu&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18337259&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>A pattern recognition approach to infer time-lagged genetic interactions.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18337258</link>
      <description>Publication Date: 2008 May 1 PMID: 18337258&lt;br/&gt;Authors: Chuang, C. L. - Jen, C. H. - Chen, C. M. - Shieh, G. S.&lt;br/&gt;Journal: Bioinformatics&lt;br/&gt;&lt;br/&gt;MOTIVATION: For any time-course microarray data in which the gene interactions and the associated paired patterns are depend