[Tutor] number of mismatches in a string

[Tutor] number of mismatches in a string

Hs Hs
Hi:
I have the following table and I am interested in calculating the mismatch ratio. I am not completely clear on how to do this, and any help is deeply appreciated.

Length  Matches
77      24A0T9T36
71      25^T9^T37
60      25^T9^T26
62      42A19


The Length column holds the length of the character string.
The Matches column describes how the string matches my reference string.


In the first case, where the length is 77, reading the Matches string from left to right: the first 24 characters matched my reference string, followed by an extra character A, a null (the 0, which does not affect the problem), an extra T, 9 matches, an extra T, and 36 matches. In total there are 3 mismatches.

In case 2, I lost 2 characters (^ = loss of a character compared to the reference sentence). For example:

TOMISAGOODBOY
T^MISAGOOD^OY   (here I lost 2 characters)  = I have 2 mismatches
TOMISAGOOODBOOY (here I have 2 extra characters O and O) = I have two mismatches


In case 4, I have 42 matches, an extra A, and 19 matches, so I have 1 mismatch.


How can I get that mismatch number from the Matches string?
1. I have to count how many A, T, G, or C characters there are (believe me, only these 4 letters will appear; I will not see Z, B, K, etc.).
2. ^T, ^A, ^G, or ^C will also be a mismatch.


Desired output:

Length  Matches    mismatches
77      24A0T9T36  3
71      25^T9^T37  2
60      25^T9^T26  2
62      42A19      1
10      6^TTT1     3


thanks 
Hs.


_______________________________________________
Tutor maillist  -  [hidden email]
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor

Re: [Tutor] number of mismatches in a string

Albert-jan Roskam
Hi,

I do not completely follow you, but perhaps you could check out this page: http://code.activestate.com/recipes/576869-longest-common-subsequence-problem-solver/
Another source of inspiration could be the Levenshtein distance.
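
For anyone following up on the Levenshtein suggestion, here is a minimal sketch in Python 3. It is only an illustration and may not be what the original poster needs: the function name `levenshtein` is my own, and it assumes you have both the reference string and the compared string at hand, which the table in the question does not provide.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character edits turning a into b."""
    # previous[j] is the distance between the already-processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning i characters of a into an empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


reference = "TOMISAGOODBOY"
print(levenshtein(reference, "TMISAGOODOY"))      # two characters lost
print(levenshtein(reference, "TOMISAGOOODBOOY"))  # two extra characters
```

Both calls use the TOMISAGOODBOY example from the question (with the ^ placeholders removed from the second string), and each one differs from the reference by two edits.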
 
Regards,
Albert-Jan

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a
fresh water system, and public health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 


Re: [Tutor] number of mismatches in a string

Bob Gailer
In reply to this post by Hs Hs
I am sorry but I do not understand, and do not have the patience to wade through all the above in the hopes of gaining insight.

Perhaps you could restate the problem in a way that makes it crystal clear.
-- 
Bob Gailer
919-636-4239
Chapel Hill NC


Re: [Tutor] number of mismatches in a string

Jerry Hill
In reply to this post by Hs Hs
On Fri, Mar 2, 2012 at 2:11 PM, Hs Hs <[hidden email]> wrote:

> desired output:
>
> Length  Matches    mismatches
> 77      24A0T9T36  3
> 71      25^T9^T37  2
> 60      25^T9^T26  2
> 62      42A19      1
> 10      6^TTT1     3
>

It looks like all you need to do is count the number of A, T, C, and G
characters in your Matches column.  Maybe something like this:

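# Each row pairs a string length with its Matches description.
differences = [
    [77, '24A0T9T36'],
    [71, '25^T9^T37'],
    [60, '25^T9^T26'],
    [62, '42A19'],
]

for length, matches in differences:
    mismatches = 0
    for char in matches:
        # Every base letter marks one mismatch; digits and '^' are skipped.
        if char in ('A', 'T', 'G', 'C'):
            mismatches += 1
    # print() is a function in Python 3 (the original used the Python 2 statement).
    print(length, matches, mismatches)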


which produces the following output:
77 24A0T9T36 3
71 25^T9^T37 2
60 25^T9^T26 2
62 42A19 1
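
Since the question also asks for a mismatch ratio, here is a possible variation in Python 3. It is a sketch under assumptions: the helper name `count_mismatches` is made up for this example, and "ratio" is taken to mean mismatches divided by the Length column, which the original post does not actually define. It also includes the extra `6^TTT1` row from the desired output.

```python
import re


def count_mismatches(matches):
    """Count every A, C, G or T in a Matches string (letters after ^ included)."""
    return len(re.findall(r"[ACGT]", matches))


rows = [
    (77, "24A0T9T36"),
    (71, "25^T9^T37"),
    (60, "25^T9^T26"),
    (62, "42A19"),
    (10, "6^TTT1"),
]

for length, matches in rows:
    mismatches = count_mismatches(matches)
    # Assumption: ratio = mismatches / length; adjust if a different
    # denominator is wanted.
    ratio = mismatches / length
    print(f"{length}\t{matches}\t{mismatches}\t{ratio:.3f}")
```

The counts come out as 3, 2, 2, 1 and 3, which agrees with the mismatches column in the desired output.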

--
Jerry

Re: [Tutor] number of mismatches in a string

Alan Gauld
In reply to this post by Hs Hs
On 02/03/12 19:11, Hs Hs wrote:

> 1. I have to count how many A or T or G or C (believe me only these 4
> letters will appear in this, i will not see Z or B or K etc)

This suggests to me that it's related to chromosome analysis or somesuch?

There are some Python libraries for biochemistry work.
Maybe you should Google for that and see if there is something
already out there that can do what you want?

Your explanation doesn't really make sense to me outside that context
and, since I'm not a biologist, it doesn't mean that much in that
context either!


--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/


Re: [Tutor] number of mismatches in a string

Emile van Sebille
In reply to this post by Hs Hs
On 3/2/2012 11:11 AM Hs Hs said...
> Hi:
> I have the following table and I am interested in calculating mismatch
> ratio. I am not completely clear how to do this and any help is deeply
> appreciated.
>
...and then there's always the standard library:

Help on class SequenceMatcher in module difflib:

class SequenceMatcher
  |  SequenceMatcher is a flexible class for comparing pairs of sequences of
  |  any type, so long as the sequence elements are hashable.
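
As a rough illustration of how SequenceMatcher could be applied here (Python 3), the sketch below compares a string against the reference and counts the characters that fall outside matching blocks. The helper name `count_edits` is hypothetical, and it assumes the full reference and compared strings are available. Note that SequenceMatcher aims for human-friendly diffs and does not promise a minimal edit count.

```python
from difflib import SequenceMatcher


def count_edits(reference, candidate):
    """Count characters inserted, deleted or replaced relative to reference."""
    # autojunk=False turns off the "popular element" heuristic, which can
    # hide matches in long sequences built from a four-letter alphabet.
    matcher = SequenceMatcher(None, reference, candidate, autojunk=False)
    edits = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # A 'replace' may span different lengths on each side;
            # count the longer side.
            edits += max(i2 - i1, j2 - j1)
    return edits


reference = "TOMISAGOODBOY"
print(count_edits(reference, "TMISAGOODOY"))      # two characters lost
print(count_edits(reference, "TOMISAGOOODBOOY"))  # two extra characters
```

Both strings are taken from the TOMISAGOODBOY example in the original question.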


Emile
