

Hi: I have the following table and I am interested in calculating the mismatch ratio. I am not completely clear how to do this and any help is deeply appreciated.
Length Matches
77 24A0T9T36
71 25^T9^T37
60 25^T9^T26
62 42A19
In the Length column I have the length of the character string. In the second column I have the matches against my reference string.
In the first case, where 77 is the length, reading the matches from left to right: the first 24 matched my reference string, followed by an extra character A, a null (does not count toward the problem), an extra T, 9 matches, an extra T and 36 matches. In total there are 3 mismatches.
In case 2, I lost 2 characters (^ = loss of character compared to reference sentence)
TOMISAGOODBOY
T^MISAGOOD^OY (here I lost 2 characters) = I have 2 mismatches
TOMISAGOOODBOOY (here I have 2 extra characters O and O) = I have two mismatches
In case 4: I have 42 matches, an extra A and 19 matches = so I have 1 mismatch
How can I get that mismatch number from the matches string?
1. I have to count how many A or T or G or C (believe me, only these 4 letters will appear in this; I will not see Z or B or K etc.)
2. ^T or ^A or ^G or ^C will also be a mismatch
desired output:
Length Matches mismatches
77 24A0T9T36 3
71 25^T9^T37 2
60 25^T9^T26 2
62 42A19 1
10 6^TTT1 3
thanks Hs.
_______________________________________________
Tutor maillist  [hidden email]
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Hi,
I do not completely follow you, but perhaps you could check out this page:
http://code.activestate.com/recipes/576869longestcommonsubsequenceproblemsolver/
Another source of inspiration could be the Levenshtein distance.
Regards,
AlbertJan
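For readers unfamiliar with the Levenshtein distance mentioned above, here is a minimal sketch of the usual dynamic-programming version, applied to the TOMISAGOODBOY example from the question. The function name `levenshtein` is made up for illustration, and this assumes both full strings are available, which is not the case for the Matches column in the original table.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character insertions,
    deletions and substitutions needed to turn a into b."""
    # prev[j] holds the distance between the first i-1 chars of a
    # and the first j chars of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or keep if equal
            ))
        prev = cur
    return prev[-1]


reference = "TOMISAGOODBOY"
print(levenshtein(reference, "TMISAGOODOY"))      # two characters lost
print(levenshtein(reference, "TOMISAGOOODBOOY"))  # two extra characters
```

Both calls should report a distance of 2, matching the "2 mismatches" counted by hand in the question.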
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


On 3/2/2012 2:11 PM, Hs Hs wrote:
> Hi:
> I have the following table and I am interested in calculating mismatch
> ratio. I am not completely clear how to do this and any help is deeply
> appreciated.
I am sorry but I do not understand, and do not have the patience
to wade through all the above in the hopes of gaining insight.
Perhaps you could restate the problem in a way that makes it
crystal clear.

Bob Gailer
9196364239
Chapel Hill NC


On Fri, Mar 2, 2012 at 2:11 PM, Hs Hs < [hidden email]> wrote:
> desired output:
>
> Length Matches mismatches
> 77 24A0T9T36 3
> 71 25^T9^T37 2
> 60 25^T9^T26 2
> 62 42A19 1
> 10 6^TTT1 3
>
It looks like all you need to do is count the number of A, T, C, and G
characters in your Matches column. Maybe something like this:
# Each row is [length, matches string].
differences = [
    [77, '24A0T9T36'],
    [71, '25^T9^T37'],
    [60, '25^T9^T26'],
    [62, '42A19']
]
for length, matches in differences:
    mismatches = 0
    # Every A, T, G or C counts as one mismatch; digits and '^' are skipped.
    for char in matches:
        if char in ('A', 'T', 'G', 'C'):
            mismatches += 1
    print(length, matches, mismatches)
which produces the following output:
77 24A0T9T36 3
71 25^T9^T37 2
60 25^T9^T26 2
62 42A19 1
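If you also want the ratio from the original question, or want to tell lost characters (after ^) apart from extra ones, a regular expression can split the string into its pieces. This is only a sketch: the names `TOKEN`, `summarize` and `mismatch_ratio` are made up here, and defining the ratio as mismatches divided by the Length column is my assumption, since the thread never says what the denominator should be.

```python
import re

# A token is a run of digits (matched characters), a '^' followed by
# one or more letters (lost characters), or a single letter (extra character).
TOKEN = re.compile(r"(\d+)|\^([ACGT]+)|([ACGT])")


def summarize(matches):
    """Return (matched, lost, extra) counts for one Matches string."""
    matched = lost = extra = 0
    for number, deleted, single in TOKEN.findall(matches):
        if number:
            matched += int(number)
        elif deleted:
            lost += len(deleted)
        else:
            extra += 1
    return matched, lost, extra


def mismatch_ratio(length, matches):
    """Assumed definition: (lost + extra) / length."""
    _, lost, extra = summarize(matches)
    return (lost + extra) / length


rows = [(77, "24A0T9T36"), (71, "25^T9^T37"), (60, "25^T9^T26"),
        (62, "42A19"), (10, "6^TTT1")]
for length, matches in rows:
    _, lost, extra = summarize(matches)
    print(length, matches, lost + extra, round(mismatch_ratio(length, matches), 4))
```

The mismatch column comes out the same as with the simple character count, including 3 for the 6^TTT1 row from the desired output.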

Jerry


On 02/03/12 19:11, Hs Hs wrote:
> 1. I have to count how many A or T or G or C (believe me only these 4
> letters will appear in this, i will not see Z or B or K etc)
This suggests to me that it's related to chromosome analysis or somesuch?
There are some python libraries for biochemistry work.
Maybe you should Google for that and see if there is something
already out there that can do what you want?
Your explanation doesn't really make sense to me outside that context
and, since I'm not a biologist, it doesn't mean that much in that
context either!

Alan G
Author of the Learn to Program web site
http://www.alang.me.uk/


On 3/2/2012 11:11 AM Hs Hs said...
> Hi:
> I have the following table and I am interested in calculating mismatch
> ratio. I am not completely clear how to do this and any help is deeply
> appreciated.
>
...and then there's always the standard library:
Help on class SequenceMatcher in module difflib:
class SequenceMatcher
 SequenceMatcher is a flexible class for comparing pairs of sequences of
 any type, so long as the sequence elements are hashable.
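As a rough illustration of how SequenceMatcher could be used here, the sketch below counts the characters that fall outside the "equal" blocks when comparing the TOMISAGOODBOY reference from the question with an altered copy. The helper name `edited_characters` is made up for this example, it assumes both full strings are at hand, and SequenceMatcher does not promise a minimal edit script, so treat the count as an estimate.

```python
from difflib import SequenceMatcher


def edited_characters(reference, observed):
    """Count characters covered by non-'equal' opcodes."""
    # autojunk=False switches off the heuristic that ignores very
    # frequent elements in long sequences.
    matcher = SequenceMatcher(None, reference, observed, autojunk=False)
    total = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # 'delete' spans only the reference, 'insert' only the observed
            # string, and 'replace' spans both; take the longer side.
            total += max(i2 - i1, j2 - j1)
    return total


reference = "TOMISAGOODBOY"
print(edited_characters(reference, "TMISAGOODOY"))      # two characters lost
print(edited_characters(reference, "TOMISAGOOODBOOY"))  # two extra characters
```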
Emile

