[Tutor] Biopython

[Tutor] Biopython

Laura Scearce
Hello,
I am using Linux version 2.6 and python version 2.6.6 (gcc version 4.4.5).
I have a list of protein names and am trying to get their sequences. I thought I would download Biopython, and am having troubles.
I first downloaded and installed easyinstall from http://pypi.python.org/pypi/setuptools
I followed the instructions and it seemed to work (see below error messages).

This is the error I get when I try to install biopython:
root@debbie-VirtualBox:/usr/local/bin# easy_install biopython
Searching for biopython
Reading http://pypi.python.org/simple/biopython/
Reading http://www.biopython.org/
Reading http://biopython.org/DIST/
Best match: biopython 1.59
Downloading http://biopython.org/DIST/biopython-1.59.zip
Processing biopython-1.59.zip
Running biopython-1.59/setup.py -q bdist_egg --dist-dir /tmp/easy_install-qMfJsF/biopython-1.59/egg-dist-tmp-W4L1YX
warning: no previously-included files found matching 'Tests/Graphics/*'
warning: no previously-included files matching '.cvsignore' found under directory '*'
warning: no previously-included files matching '*.pyc' found under directory '*'
Bio/cpairwise2module.c:12: fatal error: Python.h: No such file or directory
compilation terminated.
error: Setup script exited with error: command 'gcc' failed with exit status 1

*I also tried it in the directory Downloads, because this is where the easyinstall was downloaded to. Same message.

Then I tried downloading Biopython from http://pypi.python.org/pypi/biopython
I downloaded biopython-1.59.tar.gz and extracted it to the folder BLAST-SW in my Documents folder.

Here is what I tried:

debbie@debbie-VirtualBox:~/Documents/BLAST-SW$ sh biopython-1.59.tar.gz
sh: Can't open biopython-1.59.tar.gz
debbie@debbie-VirtualBox:~$ sh biopython-1.59.tar.gz
sh: Can't open biopython-1.59.tar.gz
debbie@debbie-VirtualBox:~/Documents/BLAST-SW$ sh biopython-1.59
*This had no error message, but I think that Biopython is not installed because:
>>> import Bio
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named Bio

The end goal is to use ncbi Eutils to get protein sequences, so if anyone has experience with this please let me know.
Thanks!
Laura Scearce


_______________________________________________
Tutor maillist  -  [hidden email]
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor

Re: [Tutor] Biopython

Emile van Sebille
On 4/10/2012 1:25 PM Laura Scearce said...
> Hello,
> I am using Linux version 2.6 and python version 2.6.6 (gcc version 4.4.5).
> I have a list of protein names and am trying to get their sequences. I
> thought I would download Biopython, and am having troubles.

Your best source for answers pertaining to third party packages is the
third party itself.

Try http://biopython.org/wiki/Mailing_lists -- they'll get you going
faster than we will...  unless of course someone here has sufficient
experience with that package.

Emile


Re: [Tutor] Biopython

wprins
In reply to this post by Laura Scearce
Hi Laura,

On 10 April 2012 22:25, Laura Scearce <[hidden email]> wrote:
I am using Linux version 2.6 and python version 2.6.6 (gcc version 4.4.5).

What distribution of Linux?  (Ubuntu, Debian, Centos or something else?)


I followed the instructions and it seemed to work (see below error messages).

[snip...]
 
compilation terminated.
error: Setup script exited with error: command 'gcc' failed with exit status 1

That means the Python development headers are not installed.  gcc itself is present (it is what printed the error), but it cannot find Python.h, the header that Python modules written in C need in order to compile.  Biopython is mostly Python but includes a few C extensions, such as Bio/cpairwise2module.c, which is why it invokes gcc during installation.

On Ubuntu/Debian, you can install the headers, along with GCC and the usual build tools, by using the following command:

sudo apt-get install build-essential python-dev

This will install the "build-essential" package, which includes GCC and a bunch of other development tools, and the "python-dev" package, which provides Python.h.  (If you're not using Debian or Ubuntu or a distribution based on Debian the above command won't work.)
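To confirm whether the header named in the error is actually present, a quick check from Python itself can help. This is only a minimal sketch, assuming a reasonably recent Python 3 (the sysconfig module used here is not available on Python 2.6); the helper names python_header_path and have_python_headers are made up for illustration.

```python
import os
import sysconfig


def python_header_path():
    """Return the path where this interpreter expects to find Python.h."""
    include_dir = sysconfig.get_paths()["include"]
    return os.path.join(include_dir, "Python.h")


def have_python_headers():
    """Return True if Python.h exists, so C extensions have a chance to compile."""
    return os.path.isfile(python_header_path())


print("Looking for:", python_header_path())
if have_python_headers():
    print("Python.h found")
else:
    print("Python.h missing: install your distribution's Python development package")
```

If the header is reported missing, the compile step for any C extension will fail in the same way as in the log above, regardless of whether gcc is installed.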

 
*I also tried it in the directory Downloads, because this is where the easyinstall was downloaded to. Same message.

After installing "easy_install" into your Python environment, the originally downloaded file is redundant/irrelevant to the functioning of easy_install, so the folder you're in doesn't actually matter.  Which is why you get the same message. :)

 

Then I tried downloading Biopython from http://pypi.python.org/pypi/biopython
I downloaded biopython-1.59.tar.gz. and extracted it to the folder BLAST-SW in my Documents folder.

Here is what I tried:

debbie@debbie-VirtualBox:~/Documents/BLAST-SW$ sh biopython-1.59.tar.gz
sh: Can't open biopython-1.59.tar.gz
debbie@debbie-VirtualBox:~$ sh biopython-1.59.tar.gz
sh: Can't open biopython-1.59.tar.gz
debbie@debbie-VirtualBox:~/Documents/BLAST-SW$ sh biopython-1.59
*This had no error message, but I think that Biopython is not installed because:
>>> import Bio
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named Bio

A .tar.gz file is not a shell script or bundle.  It is a type of archive file, and to extract it you need to use the "tar" command or a GUI tool that can read tar.gz files.  (I'm a tad puzzled, as you seem to understand this file needs to be extracted, yet the commands you issue imply that you think you can run the file as-is?)  Normally you'd use a command like this:

tar -zxvf  biopython-1.59.tar.gz

... which will extract the biopython-1.59.tar.gz file into the current folder.  In case the files in the archive are not in a subfolder, it's therefore a bit safer to create your own target folder first, then extract from within this new folder, e.g. something like:

mkdir biopython
cd biopython
tar -zxvf  ../biopython-1.59.tar.gz
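The same precaution can be sketched in Python with the standard tarfile module: look inside the archive first, and only create a separate target folder if the contents are not already under a single top-level entry. This is an illustrative sketch for Python 3, not part of any Biopython tooling; the function names top_level_entries and extract_safely are hypothetical.

```python
import os
import tarfile


def top_level_entries(archive_path):
    """Return the set of top-level names inside a .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return {name.split("/")[0] for name in archive.getnames() if name}


def extract_safely(archive_path, dest="."):
    """Extract the archive, using a fresh subfolder if it has several top-level entries."""
    if len(top_level_entries(archive_path)) != 1:
        # The archive would scatter files into dest, so make a folder
        # named after the archive and extract into that instead.
        base = os.path.basename(archive_path)
        for suffix in (".tar.gz", ".tgz"):
            if base.endswith(suffix):
                base = base[: -len(suffix)]
                break
        dest = os.path.join(dest, base)
    os.makedirs(dest, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions offer a filter that rejects unsafe members.
            archive.extractall(dest, filter="data")
        else:
            archive.extractall(dest)
    return dest
```

Calling extract_safely("biopython-1.59.tar.gz") from the folder holding the download would unpack it there. Extracting only unpacks the source; it does not install the package.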

Finally, having briefly scanned the Biopython download page, I want to point out that there are Biopython packages available for Ubuntu/Debian systems, so if you're using one of these Linux distributions you should preferentially use Synaptic (or another Debian package tool like apt-get or aptitude) to install Biopython, as this will be a lot less grief.
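Whichever installation route is used, it is worth checking afterwards that the interpreter can actually find the package, rather than waiting for an ImportError in the middle of a script. A small sketch for Python 3 follows; is_installed is a hypothetical helper name, and "Bio" is the module name taken from the import attempt quoted above.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on this interpreter's import path."""
    return importlib.util.find_spec(module_name) is not None


print("Bio importable:", is_installed("Bio"))
```

If this prints False after an apparently successful install, the package was probably installed for a different Python interpreter than the one running the check.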

Walter


Re: [Tutor] Biopython

Alan Gauld
In reply to this post by Laura Scearce
On 10/04/12 21:25, Laura Scearce wrote:
> Hello,
> I am using Linux version 2.6 and python version 2.6.6 (gcc version 4.4.5).
> I have a list of protein names and am trying to get their sequences.

OK, That's not unusual and there is a forum somewhere for Python users
doing bio type work. However, it helps if you tell us what distribution
of Linux since their installers etc all work slightly differently (or
more specifically there are three or four different installer packages
that work differently across the dozens of Linux distros.)

> thought I would download Biopython, and am having troubles.

This is a list for people learning python. There may be people on the
list that also know biopython but I wouldn't count on it. If you can
find a biopython forum you are more likely to get useful help.

> Bio/cpairwise2module.c:12: fatal error: Python.h: No such file or directory
> compilation terminated.
> error: Setup script exited with error: command 'gcc' failed with exit
> status 1

This means it's trying to build from source. Does your PC have a
working development environment set up? Specifically, can you
build C programs using make? If you don't, or don't know, you are
probably out of your depth and should try to find a pre-built
package for your system.

> Here is what I tried:
> debbie@debbie-VirtualBox:~/Documents/BLAST-SW$ sh biopython-1.59.tar.gz
> sh: Can't open biopython-1.59.tar.gz

OK, the fact that you are trying to run sh on a tar.gz file tells me
you are a newbie to Linux as well as biopython.
You need to find the tool that your distro uses to open/install
"gzip-compressed tape archives" - which is what a tar.gz file is...

You can do it manually but that's likely to lead to even more problems if
you don't really understand what you are doing.

You might be lucky and the distro has already set up the associations.
What happens if you just open biopython-1.59.tar.gz from your file
manager? That might start the right tool automatically, although what
the tool will do depends on what is inside the archive.

> The end goal is to use ncbi Eutils to get protein sequences, so if
> anyone has experience with this please let me know.

You really are asking on the wrong forum.
First, have you read this? I found it at the top of my first google
search on biopython...

http://biopython.org/wiki/Biopython

There is a download and installation guide.

Also a page full of bio help sources:

http://biopython.org/wiki/Mailing_lists

In programming it always helps to read the manual first, it is
invariably faster than guessing.
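For the stated end goal, here is a rough sketch of fetching protein sequences from NCBI E-utilities using only the Python 3 standard library, so it does not depend on the Biopython install succeeding. It assumes you already have protein accessions or UIDs; turning protein names into identifiers would need a separate search step that is not shown. The function names are made up for illustration, and NCBI's usage guidelines should be checked before sending many requests.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Public NCBI E-utilities endpoint for retrieving records.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(ids, db="protein"):
    """Build an efetch URL asking for the given IDs as plain-text FASTA."""
    params = {
        "db": db,
        "id": ",".join(ids),
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


def parse_fasta(text):
    """Parse FASTA text into a dict mapping header line to sequence string."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def fetch_proteins(ids):
    """Download and parse the sequences for the given IDs (needs network access)."""
    with urlopen(build_efetch_url(ids)) as response:
        return parse_fasta(response.read().decode("utf-8"))
```

fetch_proteins takes a list of identifier strings and returns a dict of sequences keyed by FASTA header. Once Biopython is installed, its own documentation describes a wrapper for the same service, which is likely the more convenient route.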

HTH,
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
