REGION TO ANALYZE
A region of the alignment can be selected for the analysis. Enter the range, in nucleotides, in the "From site" and "To site" boxes provided under "Region to analyze". For example, to analyze the region from nucleotide 24 to nucleotide 200 of a query alignment, you would enter "From site = 24" and "To site = 200".
From site: the first position in the alignment that will be analyzed (1 by default).
To site: the last position in the alignment that will be analyzed (by default, the end of the alignment, so that the whole alignment is analyzed).
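As a minimal sketch of how such a range maps onto the aligned sequences, the snippet below assumes that both positions are 1-based and inclusive, as the example above suggests. The function name `select_region` and the dictionary layout are hypothetical and are not part of SNPs-Graphic.

```python
def select_region(alignment, from_site=1, to_site=None):
    """Return columns from_site..to_site (1-based, inclusive) of every sequence.

    `alignment` maps sequence names to aligned strings of equal length.
    Defaults mirror the form: start at site 1, end at the last site.
    """
    if not alignment:
        raise ValueError("the alignment is empty")
    length = len(next(iter(alignment.values())))
    if to_site is None:
        to_site = length
    if not 1 <= from_site <= to_site <= length:
        raise ValueError("region must satisfy 1 <= from_site <= to_site <= alignment length")
    # Python slices are 0-based and end-exclusive, hence the shift by one.
    return {name: seq[from_site - 1:to_site] for name, seq in alignment.items()}


if __name__ == "__main__":
    aln = {"seq1": "ACGTACGT", "seq2": "ACGTTCGT"}
    print(select_region(aln, from_site=2, to_site=4))
```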
SLIDING WINDOWS PARAMETERS
This parameter refers to the length in bases of the windows that will be analyzed along the alignment (see figure below).
This is the separation between each window, in bases (see figure below).
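The two parameters above can be illustrated with a short sketch that lists the window coordinates for a given region. It assumes that the separation is measured between the start positions of consecutive windows and that a trailing window shorter than the window length is dropped; the source does not state how SNPs-Graphic handles either point. The name `sliding_windows` is hypothetical.

```python
def sliding_windows(from_site, to_site, window_length, step):
    """Yield (start, end) coordinates, 1-based and inclusive, of each window.

    Assumption: only full-length windows are produced.
    """
    if window_length < 1 or step < 1:
        raise ValueError("window_length and step must be positive")
    start = from_site
    while start + window_length - 1 <= to_site:
        yield (start, start + window_length - 1)
        start += step


if __name__ == "__main__":
    # Windows of 4 bases, separated by 3 bases, over sites 1..10.
    for window in sliding_windows(1, 10, window_length=4, step=3):
        print(window)
```

With a step smaller than the window length, as in this usage example, consecutive windows overlap; with a step equal to the window length they tile the region without overlap.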
If selected, basic sequence statistics will be displayed, such as the number of sequences in the alignment, the length of the alignment, the number of analyzed bases, the number of excluded sites, etc.
If selected, the alignment
will be available in Jalview to be viewed or
Here you can select the variables to include in the graphs (S, number of segregating sites; Pi, nucleotide diversity; Theta per site, estimated from the number of segregating sites), and whether the selected variables are combined in a single graph or displayed separately.
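For readers who want to see how these three variables relate, the sketch below computes them for one window using the standard textbook definitions: Pi as the mean number of pairwise differences per site, and Theta per site as S divided by the harmonic number a_n and by the number of analyzed sites (Watterson's estimator). The source does not say which estimators SNPs-Graphic implements, so this is an assumption, as is the decision to exclude any column containing a gap or an ambiguity code. The name `diversity_stats` is hypothetical.

```python
from itertools import combinations


def diversity_stats(seqs):
    """Return (S, pi, theta_per_site) for a list of aligned sequences."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    seqs = [s.upper() for s in seqs]
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned (equal length)")
    # Assumption: analyze only columns where every sequence has A, C, G or T.
    columns = [col for col in zip(*seqs) if set(col) <= set("ACGT")]
    analyzed = len(columns)
    if analyzed == 0:
        raise ValueError("no analyzable sites")
    # S: columns showing more than one nucleotide.
    s = sum(1 for col in columns if len(set(col)) > 1)
    # Pi: pairwise differences averaged over all pairs and analyzed sites.
    n_pairs = n * (n - 1) / 2
    diffs = sum(sum(a != b for a, b in combinations(col, 2)) for col in columns)
    pi = diffs / n_pairs / analyzed
    # Theta per site: S / a_n / sites, with a_n = 1 + 1/2 + ... + 1/(n-1).
    a_n = sum(1 / i for i in range(1, n))
    theta = s / a_n / analyzed
    return s, pi, theta


if __name__ == "__main__":
    print(diversity_stats(["ACGT", "ACGA", "ACTA"]))
```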
ENTER OR PASTE AN ALIGNMENT:
You can either paste an alignment in the box provided, or upload a file from your computer containing the aligned sequences. SNPs-Graphic accepts alignments in FASTA FORMAT ONLY! Please read the description of this format carefully if you are not familiar with it:
FASTA (PEARSON and LIPMAN, 1988): A sequence in FASTA format begins with a single-line description, followed by lines of sequence data. The description line is distinguished from the sequence data by a greater-than (">") symbol in the first column. It is recommended that all lines of text be shorter than 80 characters in length. An example sequence in FASTA format is:
GENE X PROTEIN (OVALBUMIN-RELATED)
Blank lines are not allowed in the middle of FASTA input.
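The format rules above can be summarized in a small reader. This is only an illustrative sketch, not the parser used by SNPs-Graphic: the name `parse_fasta` is hypothetical, and raising an error on an interior blank line is one possible way to enforce the rule.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (description, sequence) tuples."""
    records = []
    # Leading and trailing whitespace is ignored; interior blank lines are not.
    for line in text.strip().splitlines():
        line = line.rstrip()
        if not line:
            raise ValueError("blank lines are not allowed in the middle of FASTA input")
        if line.startswith(">"):
            # A ">" in the first column starts a new description line.
            records.append([line[1:].strip(), []])
        elif not records:
            raise ValueError("sequence data found before the first '>' description line")
        else:
            # Lower-case letters are accepted and mapped to upper case.
            records[-1][1].append(line.upper())
    return [(description, "".join(chunks)) for description, chunks in records]


if __name__ == "__main__":
    example = ">seq1 example\nacgt\nAC-T\n>seq2\nACGTACGT\n"
    for description, sequence in parse_fasta(example):
        print(description, sequence)
```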
Sequences are expected to be represented in the standard IUB/IUPAC amino acid and nucleic acid codes, with these exceptions: lower-case letters are accepted and are mapped into upper-case; a single hyphen or dash can be used to represent a gap of indeterminate length; and in amino acid sequences, U and * are acceptable letters (see below). Before submitting a request, any numerical digits in the query sequence should either be removed or replaced by appropriate letter codes (e.g., N for unknown nucleic acid residue or X for unknown amino acid residue). The nucleic acid codes supported are:

A --> adenosine             M --> A C (amino)
C --> cytidine              S --> G C (strong)
G --> guanine               W --> A T (weak)
T --> thymidine             B --> G T C
U --> uridine               D --> G A T
R --> G A (purine)          H --> A C T
Y --> T C (pyrimidine)      V --> G C A
K --> G T (keto)            N --> A G C T (any)
-     gap of indeterminate length
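A quick way to check a nucleotide sequence against the codes listed above before submitting it is sketched below. The name `invalid_positions` is hypothetical; the accepted character set is taken directly from the list of supported nucleic acid codes plus the gap symbol.

```python
# Nucleic acid codes listed above, plus "-" for a gap.
NUCLEIC_CODES = set("ACGTURYKMSWBDHVN-")


def invalid_positions(sequence):
    """Return (position, character) pairs for unsupported characters.

    Positions are 1-based. Lower-case input is mapped to upper case first,
    since lower-case letters are accepted.
    """
    return [
        (position, char)
        for position, char in enumerate(sequence.upper(), start=1)
        if char not in NUCLEIC_CODES
    ]


if __name__ == "__main__":
    # Digits and the amino acid code X are reported as unsupported.
    print(invalid_positions("AC1TX"))
```

An empty result means every character is one of the supported codes; any reported digit should be removed or replaced by N, as described above.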