<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
  xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Journal of Computational Biology</title>
    <link>http://barf.jcowboy.org</link>
    <description>Journal of Computational Biology recent publications</description>
    <language>en-us</language>
    <image>
      <url>http://barf.jcowboy.org/pubmed.gif</url>
      <title>the data for this feed is provided by PubMed</title>
      <link>http://barf.jcowboy.org</link>
    </image>
    <item>
      <title>A simple model of the modular structure of transcriptional regulation in yeast.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18466069</link>
      <description>Publication Date: 2008 May PMID: 18466069&lt;br/&gt;Authors: Filkov, V. - Shah, N.&lt;br/&gt;Journal: J Comput Biol&lt;br/&gt;&lt;br/&gt;Resolving the general organizational principles that govern the interactions during transcriptional gene regulation has great relevance for understanding disease progression, biofabrication, and biological systems in general. The available genome-level monitoring technologies and the best understood biological work on gene regulation are together providing us with unprecedented amounts of data and universal modeling frameworks in which to reason about regulatory systems on a computational level. Gene regulatory systems exhibit modularity in their regulatory sequences as well as in the corresponding gene expression. This modularity has a nontrivial, general combinatorial structure that can be studied and generalized to model classes of regulatory systems. Here, we study computationally the combinatorial nature of transcriptional regulation by assuming a one-to-one relationship between shared patterns in genome-wide gene-expression and cis-region modules. In our combinatorial framework, the DNA binding events are complementary to their expression counterparts, and together let us approximate the underlying regulation structure. Our model maps regulatory systems onto hierarchical structures which can be approximated by conflating existing large scale gene expression and ChIP-chip data. We have developed methods for building regulatory hierarchies and identifying the basic functional units, or modules, of transcriptional regulation. 
We validate our model using yeast data by showing agreement of our predictions with experimental data, and using the hierarchies to resolve a finer structure of co-regulation.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18466069&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Reconstruction of genuine pair-wise sequence alignment.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18435572</link>
      <description>Publication Date: 2008 May PMID: 18435572&lt;br/&gt;Authors: Polyanovsky, V. - Roytberg, M. A. - Tumanyan, V. G.&lt;br/&gt;Journal: J Comput Biol&lt;br/&gt;&lt;br/&gt;In many applications, the algorithmically obtained alignment ideally should restore the &quot;golden standard&quot; (GS) alignment, which superimposes positions originating from the same position of the common ancestor of the compared sequences. The average similarity between the algorithmically obtained and GS alignments (&quot;the quality&quot;) is an important characteristic of an alignment algorithm. We proposed to determine the quality of an algorithm, using sequences that were artificially generated in accordance with an appropriate evolution model; the approach was applied to the global version of the Smith-Waterman algorithm (SWA). The quality of SWA is between 97% (for a PAM distance of 60) and 70% (for a PAM distance of 300). The percentage of identical aligned residues is the same for algorithmic and GS alignments. The total length of indels in algorithmic alignments is less than in the GS, mainly due to a substantial decrease in the number of indels in algorithmic alignments.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18435572&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>UPSEC: An Algorithm for Classifying Unaligned Protein Sequences into Functional Families.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18435571</link>
      <description>Publication Date: 2008 May PMID: 18435571&lt;br/&gt;Authors: Ma, P. C. - Chan, K. C.&lt;br/&gt;Journal: J Comput Biol&lt;br/&gt;&lt;br/&gt;To classify proteins into functional families based on their primary sequences, popular algorithms such as the k-NN-, HMM-, and SVM-based algorithms are often used. For many of these algorithms to perform their tasks, protein sequences need to be properly aligned first. Since the alignment process can be error-prone, protein classification may not be performed very accurately. To improve classification accuracy, we propose an algorithm, called the Unaligned Protein SEquence Classifier (UPSEC), which can perform its tasks without sequence alignment. UPSEC makes use of a probabilistic measure to identify residues that are useful for classification in both positive and negative training samples, and can handle multi-class classification with a single classifier and a single pass through the training data. UPSEC has been tested with real protein data sets. Experimental results show that UPSEC can effectively classify unaligned protein sequences into their corresponding functional families, and the patterns it discovers during the training process can be biologically meaningful.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18435571&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Delineating slowly and rapidly evolving fractions of the Drosophila genome.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18435570</link>
      <description>Publication Date: 2008 May PMID: 18435570&lt;br/&gt;Authors: Keith, J. M. - Adams, P. - Stephen, S. - Mattick, J. S.&lt;br/&gt;Journal: J Comput Biol&lt;br/&gt;&lt;br/&gt;Evolutionary conservation is an important indicator of function and a major component of bioinformatic methods to identify non-protein-coding genes. We present a new Bayesian method for segmenting pairwise alignments of eukaryotic genomes while simultaneously classifying segments into slowly and rapidly evolving fractions. We also describe an information criterion similar to the Akaike Information Criterion (AIC) for determining the number of classes. Working with pairwise alignments enables detection of differences in conservation patterns among closely related species. We analyzed three whole-genome and three partial-genome pairwise alignments among eight Drosophila species. Three distinct classes of conservation level were detected. Sequences comprising the most slowly evolving component were consistent across a range of species pairs, and constituted approximately 62-66% of the D. melanogaster genome. Almost all (&gt;90%) of the aligned protein-coding sequence is in this fraction, suggesting much of it (comprising the majority of the Drosophila genome, including approximately 56% of non-protein-coding sequences) is functional. The size and content of the most rapidly evolving component was species dependent, and varied from 1.6% to 4.8%. This fraction is also enriched for protein-coding sequence (while containing significant amounts of non-protein-coding sequence), suggesting it is under positive selection. We also classified segments according to conservation and GC content simultaneously. This analysis identified numerous sub-classes of those identified on the basis of conservation alone, but was nevertheless consistent with that classification. Software, data, and results available at www.maths.qut.edu.au/~keithj/. 
Genomic segments comprising the conservation classes available in BED format.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18435570&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Combined dynamic arrays for storing and searching semi-ordered tandem mass spectrometry data.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18435569</link>
      <description>Publication Date: 2008 May PMID: 18435569&lt;br/&gt;Authors: Feng, J. - Naiman, D. Q. - Cooper, B.&lt;br/&gt;Journal: J Comput Biol&lt;br/&gt;&lt;br/&gt;When performing bioinformatics analysis on tandem mass spectrometry data, there is a computational need to efficiently store and sort these semi-ordered datasets. To solve this problem, a new data structure based on dynamic arrays was designed and implemented in an algorithm that parses semi-ordered data made by Mascot, a separate software program that matches peptide tandem mass spectra to protein sequences in a database. By accommodating the special features of these large datasets, the combined dynamic array (CDA) provides efficient searching and insertion operations. The operations on real datasets using this new data structure are hundreds of times faster than operations using binary tree and red-black tree structures. The difference becomes more significant when the dataset size grows. This data structure may be useful for improving the speed of other related types of protein assembling software or other types of software that operate on datasets with similar semi-ordered features.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18435569&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>PROFALIGN Algorithm Identifies the Regions Containing Folding Determinants by Scoring Pairs of Hydrophobic Profiles of Remotely Related Proteins.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18386966</link>
      <description>Publication Date: 2008 May PMID: 18386966&lt;br/&gt;Authors: Tcheremenskaia, O. - Giuliani, A. - Tomasi, M.&lt;br/&gt;Journal: J Comput Biol&lt;br/&gt;&lt;br/&gt;Profile comparison methods have been shown to be very powerful in creating accurate alignments of protein sequences, especially in the case of remotely related proteins (RRP). These methods take advantage of the observation that hydrophobic profiles are more conserved than the corresponding amino acid sequences. Here, we present the PROFALIGN algorithm, which allows one to perform a detailed comparative analysis, at both local and global levels, of two protein sequence profiles. The user can either choose among four different hydrophobic scales (Miyazawa-Jernigan, Eisenberg, Engelman-Steitz, and Kyte-Doolittle) or can add a personal scale. The interface is designed for a wide range of users, including those who are not involved in protein research. It allows one to vary the alignment parameters (such as gap penalties, embedding, and profile smoothness). Secondary structure propensity is added as an optional alignment filter. Similar segments of two proteins are singled out on the basis of score. We have tested the algorithm with different Src homology 3 (SH3) domain fragments sharing low sequence homology but very similar three-dimensional (3D) structures. By using the Miyazawa-Jernigan hydrophobic scale, PROFALIGN was able to detect the strong correlation between the regions that are known to be crucial for SH3 transition state topology. PROFALIGN seems able to identify most of the mutual alignment of structures on the basis of their hydrophobic profiles, delimiting the regions containing the key determinants of folding. 
Therefore, the present methodology may be useful for the detection of the most structurally relevant positions inside remotely related proteins.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18386966&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
    <item>
      <title>Space Efficient Computation of Rare Maximal Exact Matches between Multiple Sequences.</title>
      <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;dopt=Abstract&amp;list_uids=18361760</link>
      <description>Publication Date: 2008 May PMID: 18361760&lt;br/&gt;Authors: Ohlebusch, E. - Kurtz, S.&lt;br/&gt;Journal: J Comput Biol&lt;br/&gt;&lt;br/&gt;In this article, we propose a new method for computing rare maximal exact matches between multiple sequences. A rare match between k sequences S(1), ... , S(k) is a string that occurs at most t(i) times in the sequence S(i), where the t(i) &gt; 0 are user-defined thresholds. First, the suffix tree of one of the sequences (the reference sequence) is built, and then the other sequences are matched separately against this suffix tree. Second, the resulting pairwise exact matches are combined into multiple exact matches. A clever implementation of this method yields a very fast and space efficient program. This program can be applied in several comparative genomics tasks, such as the identification of synteny blocks between whole genomes.&lt;br/&gt;&lt;br/&gt;post to: &lt;a href = &quot;http://www.citeulike.org/posturl?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Fcmd%3DRetrieve%26db%3DPubMed%26dopt%3DAbstract%26list_uids%3D18361760&amp;title=Entrez+Pubmed&quot;&gt;CiteULike&lt;/a&gt;</description>
    </item>
  </channel>
</rss>
