
a simple gene finder

2017-08-06 Y叔 biobabble



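IBW2017 is being held right now, which reminded me that the only one I ever attended was IBW2011. That year I had won a small grant at Jinan University and used it to fund a trip to Xi'an. The only thing I recorded at the time was the assignment for this course, so here it is.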






Course Projects:

Project 1: Implementation of a simple gene finder


GOAL

Build a simple codon-usage based gene finder for finding genes in E. coli.


Procedure

1. Collect 100 gene sequences from the bacterium E. coli in GenBank (http://www.ncbi.nlm.nih.gov).
2. Compute the codon usage table based on these genes (and the protein sequences translated from them).
3. Build a probabilistic model based on the codon usage.
4. Implement a random sequence model in which the nucleotide frequencies are computed from the 100 E. coli genes.
5. For a given DNA sequence (and one selected reading frame), compare your model with the random sequence model.

Results that you should submit:

1. Two FASTA files for the collected 100 genes and the 100 translated protein sequences.
2. The printed codon usage table.
3. A program named ECgnfinder, run with the syntax: ECgnfinder -i inputfile
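The modelling part of the procedure (codon usage table, random sequence model, and the comparison between the two) could be sketched roughly as below. This is only one possible reading of the assignment, not the course's reference solution: the function names, the add-one pseudocount, and the use of a log-likelihood ratio as the comparison score are all assumptions made for illustration.

```python
import math
from collections import Counter

BASES = "ACGT"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]


def codon_frequencies(genes, pseudocount=1.0):
    """Relative frequency of each of the 64 codons in in-frame gene sequences.

    The pseudocount (an assumption, not part of the assignment) keeps
    unseen codons from getting probability zero.
    """
    counts = Counter()
    valid = set(CODONS)
    for gene in genes:
        gene = gene.upper()
        for i in range(0, len(gene) - 2, 3):
            codon = gene[i:i + 3]
            if codon in valid:  # skip codons with ambiguous bases
                counts[codon] += 1
    total = sum(counts.values()) + pseudocount * len(CODONS)
    return {c: (counts[c] + pseudocount) / total for c in CODONS}


def nucleotide_frequencies(genes, pseudocount=1.0):
    """Single-nucleotide frequencies for the random sequence model."""
    counts = Counter(b for gene in genes for b in gene.upper() if b in BASES)
    total = sum(counts.values()) + pseudocount * len(BASES)
    return {b: (counts[b] + pseudocount) / total for b in BASES}


def log_likelihood_ratio(seq, codon_freq, nuc_freq, frame=0):
    """Sum over codons of log(P(codon | gene model) / P(codon | random model)).

    Under the random model a codon's probability is the product of its
    three nucleotide frequencies. Positive scores favour the gene model.
    """
    seq = seq.upper()
    score = 0.0
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon not in codon_freq:
            continue  # ambiguous codon, ignored in this sketch
        p_random = nuc_freq[codon[0]] * nuc_freq[codon[1]] * nuc_freq[codon[2]]
        score += math.log(codon_freq[codon] / p_random)
    return score
```

Printing `codon_frequencies(genes)` for the 100 collected genes would give the codon usage table; the `frame` argument (0, 1 or 2) selects the reading frame to score.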


Inputfile stands for the name of input file, which should contain one DNA sequence in FASTA file format; the program should be able to report an error message if the input file is in the wrong format.
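One way to meet the input requirement (a single DNA sequence in FASTA format, with an error on malformed input) is sketched here. The names `FastaFormatError` and `read_single_fasta` are hypothetical, and accepting `N` as a valid character is an assumption; the assignment does not say which symbols count as valid DNA.

```python
class FastaFormatError(ValueError):
    """Raised when the input is not a single-record DNA FASTA file."""


def read_single_fasta(lines):
    """Parse exactly one DNA record from an iterable of text lines.

    Returns (header, sequence) or raises FastaFormatError.
    """
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines
        if line.startswith(">"):
            if header is not None:
                raise FastaFormatError("more than one sequence in input")
            header = line[1:].strip()
        else:
            if header is None:
                raise FastaFormatError("input does not start with a '>' header")
            chunks.append(line.upper())
    if header is None:
        raise FastaFormatError("input is empty")
    sequence = "".join(chunks)
    if not sequence:
        raise FastaFormatError("header found but no sequence")
    bad = set(sequence) - set("ACGTN")  # assumption: N is allowed
    if bad:
        raise FastaFormatError("non-DNA characters: " + "".join(sorted(bad)))
    return header, sequence
```

Because the function takes any iterable of lines, it can be called on an open file, e.g. `with open(path) as fh: header, seq = read_single_fasta(fh)`, and the caller can print the exception message as the required error report.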


The output should be printed to the standard output as (xxx stands for the likelihood)


ORF1: xxx
ORF2: xxx
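The assignment does not spell out what counts as an ORF. A minimal sketch, assuming an ORF is a stretch from an ATG to the first in-frame stop codon on the forward strand of the selected reading frame, could produce the report like this. `find_orfs` and `format_report` are hypothetical names, and `score` stands for whatever likelihood function the gene finder uses.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, frame=0, min_codons=1):
    """Return (start, end) index pairs of ATG...stop stretches in one frame.

    `end` is exclusive and includes the stop codon. min_codons counts the
    codons from ATG up to, but not including, the stop codon.
    """
    seq = seq.upper()
    orfs = []
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i
        elif start is not None and codon in STOP_CODONS:
            if (i - start) // 3 >= min_codons:
                orfs.append((start, i + 3))
            start = None
    return orfs


def format_report(seq, orfs, score):
    """Build the 'ORFn: value' lines, one per ORF, using a scoring callable."""
    lines = []
    for n, (start, end) in enumerate(orfs, 1):
        lines.append("ORF%d: %.3f" % (n, score(seq[start:end])))
    return "\n".join(lines)
```

Calling `print(format_report(seq, find_orfs(seq, frame), score))` writes the result to standard output in the requested layout.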



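For the code, click "Read the original". It is fairly long, so I did not bother formatting it here; no matter how I adjust it, it never comes out right, because WeChat's editor is just too poor.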
