
[IPython-User] Question about schedulers

[IPython-User] Question about schedulers

Darren Govoni
Hi,
  Let's say I have 10 engines and a list of 20 objects I want to map
using a load balanced view. How will the scheduler distribute the 20
messages? Assuming all engines are equal, will the first 10 objects be
distributed to 1 engine each and the second 10 objects will wait for an
engine to be free then go there? Or will all 20 messages be spread to
the engines at the same time?

thanks,
Darren

_______________________________________________
IPython-User mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/ipython-user

Re: Question about schedulers

Fernando Perez
On Wed, Jun 6, 2012 at 2:01 AM, Darren Govoni <[hidden email]> wrote:
>  Let's say I have 10 engines and a list of 20 objects I want to map
> using a load balanced view. How will the scheduler distribute the 20
> messages? Assuming all engines are equal, will the first 10 objects be
> distributed to 1 engine each and the second 10 objects will wait for an
> engine to be free then go there? Or will all 20 messages be spread to
> the engines at the same time?

The load balancing is dynamic, so tasks go to the engines as they
become available.  If you have an estimate of your task durations, it's
still a good idea to submit the longest ones first so they start
earliest, rather than having one long task start last.  But if you
have no prior knowledge of this kind, dynamic scheduling still
gives you the best results you can expect.
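
To make the "longest first" advice concrete, here is a small plain-Python simulation. It is only a sketch: `makespan` is a hypothetical helper that models "each task goes to whichever engine frees up first", and the durations are made-up numbers, not measurements from a real cluster.

```python
import heapq


def makespan(durations, n_engines):
    """Simulate dynamic load balancing.

    Each task, in submission order, goes to whichever engine becomes
    free first. Returns the time at which the last task finishes.
    """
    # Min-heap holding the time at which each engine next becomes free.
    free_at = [0.0] * n_engines
    heapq.heapify(free_at)
    finish = 0.0
    for d in durations:
        start = heapq.heappop(free_at)  # earliest-available engine
        end = start + d
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish


# Hypothetical task durations (arbitrary units); the long task comes last.
durations = [1, 1, 1, 1, 1, 1, 6]
as_submitted = makespan(durations, n_engines=2)
longest_first = makespan(sorted(durations, reverse=True), n_engines=2)
print(as_submitted, longest_first)  # 9.0 6.0
```

With the long task submitted last, both engines finish the short tasks and then one of them works alone on the long one; submitted first, it overlaps with all the short tasks.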

Cheers,

f

Re: Question about schedulers

Jon Olav Vik-3
In reply to this post by Darren Govoni
Darren Govoni <darren <at> ontrenet.com> writes:

> Assuming all engines are equal, will the first 10 objects be
> distributed to 1 engine each and the second 10 objects will wait for an
> engine to be free then go there? Or will all 20 messages be spread to
> the engines at the same time?

I think two relevant options are:


The `chunksize` argument to IPython.parallel.ParallelFunction determines how
many list items are passed in each "task".

from IPython.parallel import Client
c = Client()
lv = c.load_balanced_view()

@lv.parallel(block=True)
def chunk1(x):
    return str(x)

@lv.parallel(chunksize=2, block=True)
def chunk2(x):
    return str(x)

L = list(range(5))
print(chunk1(L))
print(chunk2(L))
# Output:
['[0]', '[1]', '[2]', '[3]', '[4]']
['[0, 1]', '[2, 3]', '[4]']
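
The output above suggests that each task receives one consecutive slice of the input list. A plain-Python sketch of that partitioning follows; `split_into_chunks` is a hypothetical helper written for illustration, not IPython's own implementation.

```python
def split_into_chunks(items, chunksize=1):
    """Partition items into consecutive lists of at most chunksize elements."""
    items = list(items)
    return [items[i:i + chunksize] for i in range(0, len(items), chunksize)]


# Mirror the two decorated functions above: str() applied to each chunk.
one_per_task = [str(chunk) for chunk in split_into_chunks(range(5))]
two_per_task = [str(chunk) for chunk in split_into_chunks(range(5), chunksize=2)]
print(one_per_task)  # ['[0]', '[1]', '[2]', '[3]', '[4]']
print(two_per_task)  # ['[0, 1]', '[2, 3]', '[4]']
```

Larger chunks mean fewer messages through the scheduler, at the cost of coarser units for the load balancer to distribute.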


The `hwm` (high water mark) configurable determines the maximum number of tasks
that can be outstanding on an engine. On my system, it is set in the file
ipcontroller_config.py, inside the directory profile_default inside the
directory returned by IPython.utils.path.get_ipython_dir().

Quoting
http://ipython.org/ipython-doc/dev/parallel/parallel_task.html#greedy-assignment

"""
Tasks are assigned greedily as they are submitted. If their dependencies are
met, they will be assigned to an engine right away, and multiple tasks can be
assigned to an engine at a given time. This limit is set with the
TaskScheduler.hwm (high water mark) configurable:
# the most common choices are:
c.TaskScheduler.hwm = 0 # (minimal latency, default in IPython ≤ 0.12)
# or
c.TaskScheduler.hwm = 1 # (most-informed balancing, default in > 0.12)

In IPython ≤ 0.12, the default is 0, or no-limit. That is, there is no limit to
the number of tasks that can be outstanding on a given engine. This greatly
benefits the latency of execution, because network traffic can be hidden behind
computation. However, this means that workload is assigned without knowledge of
how long each task might take, and can result in poor load-balancing,
particularly for submitting a collection of heterogeneous tasks all at once.
You can limit this effect by setting hwm to a positive integer, 1 being maximum
load-balancing (a task will never be waiting if there is an idle engine), and
any larger number being a compromise between load-balance and latency-hiding.

In practice, some users have been confused by having this optimization on by
default, and the default value has been changed to 1. This can be slower, but
has more obvious behavior and won’t result in assigning too many tasks to some
engines in heterogeneous cases.
"""


Re: Question about schedulers

Darren Govoni
Jon,
   Thanks for those details. Very informative.

So it says multiple tasks can be assigned to an engine at a time, but
how many execute at the same time? Just one right? Or is there a setting
for that too?

thanks!
Darren


Re: Question about schedulers

Min RK


On Wed, Jun 6, 2012 at 4:52 PM, Darren Govoni <[hidden email]> wrote:
> Jon,
>   Thanks for those details. Very informative.
>
> So it says multiple tasks can be assigned to an engine at a time, but
> how many execute at the same time? Just one right? Or is there a setting
> for that too?

Correct. The engines themselves are not multithreaded, so each engine runs only one task at a time.  This is not configurable.  The normal mode is to start one engine per core on each machine.

Assigning multiple tasks to the engines helps hide the network latency behind computation, because the next task will be waiting in-memory on the Engine when it finishes the previous one, rather than having to fetch it from the scheduler.
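
The cost of queueing tasks on the engines ahead of time is the load-balancing problem that the documentation quoted earlier in the thread describes for `hwm = 0`. Below is a toy model of that trade-off. It is a rough sketch under loud assumptions: all tasks are submitted at once, network latency is ignored entirely (so the benefit of `hwm = 0` does not show up), and both functions are hypothetical simplifications rather than the real scheduler's logic.

```python
import heapq


def makespan_unlimited(durations, n_engines):
    """Toy model of hwm = 0.

    Every task is handed out at submission time to the engine with the
    fewest assigned tasks, before any duration is known.
    """
    counts = [0] * n_engines   # tasks assigned to each engine
    busy = [0.0] * n_engines   # total work assigned to each engine
    for d in durations:
        e = counts.index(min(counts))  # fewest tasks; ties go to lowest index
        counts[e] += 1
        busy[e] += d
    return max(busy)


def makespan_hwm1(durations, n_engines):
    """Toy model of hwm = 1.

    A task leaves the scheduler only when an engine is idle, so it
    always goes to the first engine to free up.
    """
    free_at = [0.0] * n_engines
    heapq.heapify(free_at)
    for d in durations:
        heapq.heappush(free_at, heapq.heappop(free_at) + d)
    return max(free_at)


durations = [10, 1, 1, 1, 1, 1]  # hypothetical heterogeneous workload
unlimited = makespan_unlimited(durations, 2)
limited = makespan_hwm1(durations, 2)
print(unlimited, limited)  # 12.0 10.0
```

In the unlimited case, short tasks end up queued behind the long one on the same engine while the other engine sits idle, which is the "poor load-balancing" case the documentation warns about.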

-MinRK
 


Re: Question about schedulers

Darren Govoni
Gotcha. Makes sense.

Incidentally, I discovered that I can run the ipengine code directly
in my Python IDE and set breakpoints in my user code modules. When I
execute functions from remote clients/views, the engine hits the
breakpoints and lets me debug my code visually (in the running engine).
Pretty sweet. Thought I'd share.


Re: Question about schedulers

Fernando Perez
On Wed, Jun 6, 2012 at 6:38 PM, Darren Govoni <[hidden email]> wrote:
> Incidentally, I discovered that I can execute the ipengine code directly
> in my python IDE and set break points in my user code modules and when I
> execute functions from remote clients/views, it will hit the break
> points and let me debug my code visually (in the running engine). Pretty
> sweet. Thought I'd share.

Cool!  Are you starting that one engine manually so it runs in the IDE
instead of letting ipcluster start it?  And if so, are you running a
'cluster of one' just for debugging, or a larger cluster with one
engine added that's tied to the debugger?

f

Re: Question about schedulers

Darren Govoni
I run an ipcontroller separately, then open the ipengine source file in
my Wing IDE and run/debug it. I wasn't sure if the debugger would trap
my code, but it does.

To test that, I ran an ipython shell session in another terminal,
created a Client() and a load-balanced view, and then ran a map over one
of my module functions. It reached the breakpoints in the debugger. I
was quite happy, since now I can debug inside the engine context and step
through my code easily without putting traps, hooks or prints directly
in my code.
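
The client side of that test might look roughly like the sketch below. It assumes a controller is already running and that an engine has been started under the IDE's debugger. `mymodule.work` is a hypothetical stand-in for the user function that carries the breakpoint, and `run_on_engines` is a helper invented here for illustration.

```python
def run_on_engines(view, func, items):
    """Map func over items through the given view and wait for the results.

    Blocking means the call returns only after the engines (including the
    one paused in the debugger) have finished every task.
    """
    return view.map_sync(func, items)


def main():
    # Imported lazily so the helper above can be tried without a cluster.
    # This is the 2012-era import path; later releases ship the same
    # Client class in the separate ipyparallel package.
    from IPython.parallel import Client

    import mymodule  # hypothetical user module; must be importable on the engine

    client = Client()                   # connect to the default profile
    view = client.load_balanced_view()
    print(run_on_engines(view, mymodule.work, range(4)))
```

Call `main()` once the controller and the debugged engine are up; execution should stop at any breakpoint set inside `mymodule.work`.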


Re: Question about schedulers

Fernando Perez
On Wed, Jun 6, 2012 at 7:36 PM, Darren Govoni <[hidden email]> wrote:

> I run an ipcontroller separately, then open the ipengine source file in
> my Wing IDE and run/debug it. I wasn't sure if the debugger would trap
> my code, but it does.
>
> To test that, I ran an ipython shell session in another terminal,
> created a Client() and a load-balanced view, and then ran a map over one
> of my module functions. It reached the breakpoints in the debugger. I
> was quite happy, since now I can debug inside the engine context and step
> through my code easily without putting traps, hooks or prints directly
> in my code.

OK, just wanted to clarify.

Note that as of yesterday, you can load the parallelmagics extension
and simply type

%px %qtconsole

and automatically one Qtconsole will open per engine, pointed to the
engine namespace.  These consoles treat the engine just like a
'regular ipython' (which they are, since we merged Min's massive
mergekernel PR), and you can thus use everything you'd normally do
interactively (including plotting) in any of your engines.

That's another useful arrow to have in your quiver.

Cheers,

f

[IPython-User] Debug code on ipengine in IDE (Re: Question about schedulers)

Jon Olav Vik-3
In reply to this post by Darren Govoni
Darren Govoni <darren <at> ontrenet.com> writes:

> Incidentally, I discovered that I can execute the ipengine code directly
> in my python IDE and set break points in my user code modules and when I
> execute functions from remote clients/views, it will hit the break
> points and let me debug my code visually (in the running engine). Pretty
> sweet. Thought I'd share.

Brilliant! This works in Eclipse PyDev on Windows too. What I did:

1. Create an Eclipse project for the c:\python27\scripts folder.
2. Run "ipcluster start --n=0", e.g. in a command window.
3. In Eclipse, open the code you want to debug and set a breakpoint there.
4. In your "scripts" Eclipse project, open ipengine-script.py and "debug as
python run".
5. Run some code that uses IPython.parallel.Client().

In the example below,
cell.py is my main script.
It creates a Client()
which parallelizes cell.py:do(),
which calls task.py:pace().

The "debug" tab in Eclipse shows an impressive call stack:

scripts ipengine-script.py [Python Run]
        ipengine-script.py
                MainThread - pid6940_seq2
                        pace [task.py:26]
                        do [cell.py:54]
                        <module> [<string>:1]
                        apply_request [streamkernel.py:337]
                        dispatch_queue [streamkernel.py:399]
                        dispatcher [streamkernel.py:408]
                        _run_callback [zmqstream.py:365]
                        _handle_recv [zmqstream.py:424]
                        _handle_events [zmqstream.py:391]
                        start [ioloop.py:330]
                        start [ipengineapp.py:316]
                        launch_new_instance [ipengineapp.py:325]
                        <module> [ipengine-script.py:9]
                        run [pydevd.py:1060]
                        <module> [pydevd.py:1346]
                Thread-4 - pid6940_seq4
        ipengine-script.py
metamod cell.py (1) [Python Run]
        C:\git\metamod\cell.py



Re: Debug code on ipengine in IDE

Darren Govoni
Nice! And I'm sure the debugger will do its own context switching if or
when any user threads come into play as well, complete with call
stacks. In fact, I can jump back in the call stack in my debugger and
inspect a prior context, then resume normal execution. Definitely cool
within the scope of IPython parallel task execution. It's very easy to
debug the client side AND the engine side at once.


Re: Debug code on ipengine in IDE (Re: Question about schedulers)

Fernando Perez
In reply to this post by Jon Olav Vik-3
On Fri, Jun 8, 2012 at 2:54 PM, Jon Olav Vik <[hidden email]> wrote:
>> Incidentally, I discovered that I can execute the ipengine code directly
>> in my python IDE and set break points in my user code modules and when I
>> execute functions from remote clients/views, it will hit the break
>> points and let me debug my code visually (in the running engine). Pretty
>> sweet. Thought I'd share.
>
> Brilliant! This works in Eclipse PyDev on Windows too. What I did:

We really need a tips and tricks section in the wiki...

Min, do you want to kick off a thread on the conversation we had about
the wiki?  I'm happy to revisit that, I'm just not sure we have the
bandwidth for one more conversation right now ;)

f

Re: Debug code on ipengine in IDE

Fernando Perez
In reply to this post by Darren Govoni
On Fri, Jun 8, 2012 at 3:07 PM, Darren Govoni <[hidden email]> wrote:
> Nice! And I'm sure the debugger will do its own context switching if or
> when any user threads come into play as well - complete with call
> stacks. In fact, I can jump back in the call stack in my debugger and
> inspect a prior context, then resume normal execution. Definitely cool
> within the scope of ipython parallel tasks execution. Very easy to debug
> client side AND engine side at once.

Would one of you be willing to make a little youtube screencast about this?

I have the [hidden email] address unused as well as an also unused
ipython google+ organization.  I just locked those things in so we
wouldn't get squatted on.

If someone knows the logistics of managing such a thing in a shared
way, it would be a great way to start putting up screencasts and other
project-based contributions made by anyone who wants to pitch in.

Anyone on the list(s) who's good at this kind of thing and would like to help?

f

Re: Debug code on ipengine in IDE (Re: Question about schedulers)

Min RK
In reply to this post by Fernando Perez


On Fri, Jun 8, 2012 at 5:53 PM, Fernando Perez <[hidden email]> wrote:
> We really need a tips and tricks section in the wiki...
>
> Min, do you want to kick off a thread on the conversation we had about
> the wiki?  I'm happy to revisit that, I'm just not sure we have the
> bandwidth for one more conversation right now ;)

I think we can discuss that at another time, when we aren't gearing up for SciPy tutorials and trying to draw the lines around a 0.13 release.

But the gist: Every day I find my dislike of mediawiki and rst growing, and I think for high visibility / low barrier for edits, a GitHub wiki might be an improvement, at least for FAQ / tips&tricks / links type stuff.

-MinRK
 


Re: Debug code on ipengine in IDE (Re: Question about schedulers)

Aaron Meurer
On Jun 8, 2012, at 6:37 PM, MinRK <[hidden email]> wrote:



> But the gist: Every day I find my dislike of mediawiki and rst growing, and I think for high visibility / low barrier for edits, a GitHub wiki might be an improvement, at least for FAQ / tips&tricks / links type stuff.

We moved our SymPy wiki from MediaWiki/Google Code to GitHub a while ago, and I would recommend it. Markdown is the best markup language IMHO (and it also supports rst for more complex documents). Also, the ability to edit the files with your text editor and manage them with git is a huge plus. 

Some heads up: merging the wikis is a task. Gollum's MediaWiki parser isn't really good. You'll likely want to convert them to Markdown or rst. We've only gotten our pages converted through a lot of volunteer work, and we still have some very old pages that still have formatting problems. If you are regular expression savvy this is where the ability to edit the files locally will come in handy, but it can be tricky, especially with images. There's also the question of getting the files out of MediaWiki, but I'm sure there are scripts to do that. Getting old pages to redirect is probably possible but also probably difficult (unless someone has also already made a nice solution for that too). We never made the effort to do it. 

Second, Gollum is a tad buggy, especially when it comes to non-Markdown formats as I mentioned. But it is open source and their development team seems friendly enough. 

Finally, you'll note that it's a step down from MediaWiki in terms of features. It doesn't even attempt to do merges, for example, and doesn't have a live preview functionality (yet). It's best to edit files locally and let git do the merging in my experience. 

But the pluses are that it's all local and with git, meaning you have the whole wiki including history backed up to every clone, as well as the other powerful features of git like merging and diffs. And it's very easy for anyone with a GitHub account to edit it with the web interface (caveat: if they want to add an image, they have to do it through git, so they either have to have repo push access or fork the wiki and get someone with repo access to merge it, or just get someone with repo access to upload the file for them. You cannot grant push access to the wiki without also granting it to the repo that it sits on).

If you are considering it, I might recommend turning the wiki on (if it isn't already; I don't have Internet now as I'm writing this, so I can't check) and testing it, and you'll soon discover whether you think it's worth the switch.  The downside of this is the confusion that comes from having two wikis (but hey, at one point SymPy had three wikis).

But yeah, turning on the GitHub wiki is something you can do now, but migrating the old one is definitely something that you'll want to wait until you have the bandwidth and/or volunteers for. 

Aaron Meurer


 

f

Re: Debug code on ipengine in IDE (Re: Question about schedulers)

Fernando Perez
On Sun, Jun 10, 2012 at 10:20 PM, Aaron Meurer <[hidden email]> wrote:
> We moved our SymPy wiki from MediaWiki/Google Code to GitHub a while ago,
> and I would recommend it. Markdown is the best markup language IMHO (and it
> also supports rst for more complex documents). Also, the ability to edit the
> files with your text editor and manage them with git is a huge plus.

[...]

Aaron, many thanks for your detailed feedback, it sounds like a vote
for moving to GH, despite the reservations we had that led to deciding
on mediawiki (I'm always happy to revisit these decisions as we see
how well they work in real life).

And in our case, I suspect the pain would be pretty minimal for two reasons:

- our wiki is pretty small to begin with
- we already use rst in many pages (all?)

The two above mean that it should be very, very straightforward to run
pandoc manually over the pages we have, by simply copying the rst
source from an edit box, without having to worry about writing code to
pull data out of mediawiki.
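
For anyone who would rather script that step than paste pages into pandoc one at a time, here is a minimal sketch. It assumes the rst sources have already been copied by hand out of the edit boxes into a local directory of `.rst` files and that pandoc is installed. The helper names `pandoc_command` and `convert_pages` and the directory layout are made up for illustration; the only real interface used is pandoc's `-f rst -t markdown ... -o` command line.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_command(src, dest_dir):
    """Build the pandoc call that converts one rst page to Markdown.

    The output file keeps the page name and gets a .md extension.
    """
    dest = Path(dest_dir) / (Path(src).stem + ".md")
    return ["pandoc", "-f", "rst", "-t", "markdown", str(src), "-o", str(dest)]


def convert_pages(src_dir, dest_dir):
    """Convert every *.rst file in src_dir and return the output paths."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    converted = []
    for src in sorted(Path(src_dir).glob("*.rst")):
        cmd = pandoc_command(src, dest_dir)
        # check=True makes a failed conversion raise instead of passing silently
        subprocess.run(cmd, check=True)
        converted.append(Path(cmd[-1]))
    return converted
```

Calling something like `convert_pages("mediawiki_pages", "wiki")` would then leave one Markdown file per page in `wiki/`, ready to be reviewed and fixed up by hand before being committed.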

Right now our resources are 100% focused on the 0.13 release and then
Min and I have to worry about talks/tutorials for Scipy 2012, but this
is good to know.

If we really want to go that route, it could even be a sprint topic
for one or two new volunteers to help with at Scipy, that would not
require them to dig straight into the main code.

As always, your detailed feedback from your sympy experience is both
very useful and greatly appreciated.

Best,

f

Re: Debug code on ipengine in IDE (Re: Question about schedulers)

Aaron Meurer
On Mon, Jun 11, 2012 at 3:02 PM, Fernando Perez <[hidden email]> wrote:

> On Sun, Jun 10, 2012 at 10:20 PM, Aaron Meurer <[hidden email]> wrote:
>> We moved our SymPy wiki from MediaWiki/Google Code to GitHub a while ago,
>> and I would recommend it. Markdown is the best markup language IMHO (and it
>> also supports rst for more complex documents). Also, the ability to edit the
>> files with your text editor and manage them with git is a huge plus.
>
> [...]
>
> Aaron, many thanks for your detailed feedback, it sounds like a vote
> for moving to GH, despite the reservations we had that led to deciding
> on mediawiki (I'm always happy to revisit these decisions as we see
> how well they work in real life).

By the way, another reason that we moved was that Ondrej was tired of
hosting and maintaining the wiki on his own server. I don't know how
you guys feel about that.

>
> And in our case, I suspect the pain would be pretty minimal for two reasons:
>
> - our wiki is pretty small to begin with

Ah, it is.  You have ~60 pages, compared to ~250 pages in the SymPy
wiki (partly because we tend to use a wiki-based development format
and partly because we ask all our GSoC applicants to put the
applications on the wiki).  Your wiki also seems to be more well
maintained, in the sense that there are no really old pages (we still
have pages on our wiki mentioning hg, even though SymPy stopped using
it in 2008 or 2009, before I even joined the project).

> - we already use rst in many pages (all?)
>
> The two above mean that it should be very, very straightforward to run
> pandoc manually over the pages we have, by simply copying the rst
> source from an edit box, without having to worry about writing code to
> pull data out of mediawiki.

Oh, so it shouldn't be too difficult.  But do note again that Gollum
tends to have subtle rendering bugs for non-markdown, so it's probably
more along the lines of "copy code into the edit box, save, find minor
formatting errors, and fix".  And like I said, it's even easier if you
can export them as files because then you can just add them all at
once using git (but in your case I wouldn't go through the hassle
unless someone else has already written a tool to do it).  Images are
a little more tricky because the links have to be redone, but I just
checked and I guess you guys only have one uploaded image on your
wiki.  I'm not sure how MediaWiki vs. Gollum do internal links in rst.
 They may or may not have to be redone.

>
> Right now our resources are 100% focused on the 0.13 release and then
> Min and I have to worry about talks/tutorials for Scipy 2012, but this
> is good to know.

Good luck with all that. I wish we could roll out a SymPy release as
fast as you guys seem to be going with 0.13.

>
> If we really want to go that route, it could even be a sprint topic
> for one or two new volunteers to help with at Scipy, that would not
> require them to dig straight into the main code.

Once again, I recommend just enabling the wiki for now (if you don't
mind having two wikis), and it will start to give you an idea if you
want to move or not.

>
> As always, your detailed feedback from your sympy experience is both
> very useful and greatly appreciated.
>
> Best,
>
> f

Aaron Meurer

Re: Debug code on ipengine in IDE (Re: Question about schedulers)

Thomas Kluyver-2
On 14 June 2012 05:44, Aaron Meurer <[hidden email]> wrote:
>  Your wiki also seems to be more well
> maintained, in the sense that there are no really old pages

That's partly because the current wiki is relatively new - until some
time last year, we had a Moinmoin wiki on scipy.org. When we moved
pages across, there was some weeding and reorganisation of material.

We did consider at the time moving to a github wiki. I think we felt
that, because the wiki was more aimed at users than developers, it
made more sense to have it outside our Github project. Can someone
remind me of the other reasons?

Personally I'd be a little disappointed if the time I spent tweaking
mediawiki was wasted, but of course if there is a better solution, we
shouldn't worry about sunk costs.

Thanks,
Thomas

Re: Debug code on ipengine in IDE (Re: Question about schedulers)

Fernando Perez
On Thu, Jun 14, 2012 at 3:06 AM, Thomas Kluyver <[hidden email]> wrote:
>
> Personally I'd be a little disappointed if the time I spent tweaking
> mediawiki was wasted, but of course if there is a better solution, we
> shouldn't worry about sunk costs.

I certainly don't want to push for something that wastes anyone's
work, much less yours.  So we'll come to this decision *only* if we
all agree that the benefits outweigh the costs, and there's no rush
whatsoever to do it right away (much less with all we have on our
plate at the moment).

Cheers,

f