Re: [sage-devel] Re: Jupyter notebook by default?

(cross-posting to ipython-dev)

Jon,

At the recent San Francisco meetings, we talked about this.  What do you think about:

1. Keep track of the total size of the io messages sent during any single kernel execution.
2. When the total size of io reaches a user-configurable limit, transmit a special "throwing away output, but here's how to save the output to a file if you want in the future, or how to increase the limit" message.
3. Keep a running buffer of the last bit of output that was attempted to be sent, and send it when the execution finishes (basically a ring buffer that overwrites the oldest message).

This:
* allows small output through
* provides an explanatory message
* provides the last bit of output as well

One thing to figure out: a size limit that suits text output may not be appropriate for output that is images, etc.
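A minimal sketch of the three steps above, not an actual Jupyter implementation. The class name `OutputLimiter`, its parameters, and the `send` callback are hypothetical, and messages are treated as plain text strings (so the open question about images is not addressed here).

```python
from collections import deque


class OutputLimiter:
    """Forward output until a byte budget is spent, then keep only the tail."""

    def __init__(self, send, max_bytes=10_000, tail_messages=5):
        self.send = send                  # hypothetical callable that delivers one message
        self.max_bytes = max_bytes        # the user-configurable limit
        self.sent_bytes = 0               # step 1: running total of output size
        self.overflowed = False
        self.tail = deque(maxlen=tail_messages)  # step 3: ring buffer, oldest dropped

    def write(self, text):
        size = len(text.encode("utf-8"))
        if not self.overflowed and self.sent_bytes + size <= self.max_bytes:
            self.sent_bytes += size
            self.send(text)
            return
        if not self.overflowed:
            # Step 2: send the explanatory message exactly once.
            self.overflowed = True
            self.send("[output limit of %d bytes reached; further output is "
                      "discarded. Raise max_bytes or write to a file to keep it.]\n"
                      % self.max_bytes)
        self.tail.append(text)

    def finish(self):
        """Call when the execution ends: deliver the retained tail."""
        if self.overflowed:
            self.send("[last %d messages of discarded output:]\n" % len(self.tail))
            for text in self.tail:
                self.send(text)
        self.tail.clear()


# Usage: a runaway loop, with a list standing in for the message channel.
received = []
limiter = OutputLimiter(received.append, max_bytes=12, tail_messages=2)
for i in range(100):
    limiter.write("hi %d\n" % i)
limiter.finish()
```

In this run the first two lines get through, one notice is sent, and only the final two lines are delivered at the end; everything in between is dropped.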

Thanks,

Jason

On Tue, Jan 5, 2016 at 12:11 PM, Jason Grout wrote:

---------- Forwarded message ----------
From: Jonathan Frederic
Date: Tue, Jan 5, 2016 at 11:42 AM
Subject: Re: [sage-devel] Re: Jupyter notebook by default?
To: Jason Grout
Cc: sage-devel

Jason,

Thanks for pulling me in on this.  

William,

I agree, getting a bunch of people to agree on stuff can seem impossible.  However, you mention that Sage offers a couple of options to mitigate output overflows. Can you point me to those options?  The Jupyter Notebook should provide multiple options too; this will also make it easier for everyone to agree.

Also, in your experience, which of these options works best?  I was thinking initially of doing something simple, like hard limiting data/time, then printing an error if that's exceeded.  In the Jupyter Notebook, we have to worry about:
- Too many messages sent on the websocket
- The notebook JSON file growing too large and consequently becoming unopenable
- Too much data being appended to the DOM, crashing the browser
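The "hard limit on data/time, then print an error" idea could look roughly like the sketch below. `HardLimit`, `OutputLimitExceeded`, and the default values are all hypothetical names and numbers chosen for illustration; nothing here is taken from the notebook code base.

```python
import time


class OutputLimitExceeded(Exception):
    """Raised once an execution has produced too much output or run too long."""


class HardLimit:
    def __init__(self, max_bytes=1_000_000, max_seconds=60.0, clock=time.monotonic):
        self.max_bytes = max_bytes
        self.max_seconds = max_seconds
        self.clock = clock          # injectable so the time limit can be tested
        self.start = clock()
        self.total = 0

    def check(self, data):
        """Account for one chunk of output; raise once either limit is exceeded."""
        self.total += len(data)
        if self.total > self.max_bytes:
            raise OutputLimitExceeded("output exceeded %d bytes" % self.max_bytes)
        if self.clock() - self.start > self.max_seconds:
            raise OutputLimitExceeded(
                "output ran longer than %g seconds" % self.max_seconds)


# Usage: the runaway loop stops as soon as the byte budget is spent.
limit = HardLimit(max_bytes=10)
error = None
try:
    while True:
        limit.check(b"hi!\n")
except OutputLimitExceeded as exc:
    error = str(exc)
```

A single check like this bounds the websocket traffic, the saved notebook size, and the DOM growth at once, at the cost of discarding everything after the cutoff.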

Thanks!
-Jon

On Tue, Jan 5, 2016 at 10:19 AM, Jason Grout wrote:

On Tuesday, January 5, 2016 at 8:17:45 AM UTC-7, William wrote:

One example of a subtle feature in Sage (notebook and worksheets) not in Jupyter, which I was just reminded of, is output limiting.  In Sage there are numerous rules/options to deal with people doing stuff like:

while True:
    print "hi!"

... which is exactly what students will tend to do by accident...

Jupyter doesn't deal with this, but it might not be too hard to implement in theory.  One of the main problems is figuring out what the arbitrary rate limiting defaults "should" be; it's arbitrary, and depends a lot on whether everything is local, over the web, etc. so getting a bunch of people to agree is hard, which might mean they will never implement anything.

William,

Jon Frederic in the Jupyter dev meeting happening right now said that he will be working on output limiting as one of his next things.

Jason

On Wed, Jan 6, 2016 at 1:02 PM, Volker Braun wrote:

On Wednesday, January 6, 2016 at 11:55:36 AM UTC+1, Min RK wrote:

If we truncate instead of virtual-scroll, then we have a choice for whether truncated output is included in the document or not, which alleviates the problem of opening notebooks that have a problematic amount of output.

There is no fundamental problem with large amounts of output (really, any content), and there is essentially only a single way to do it right:

The view (dom) needs only a fixed number of dom nodes for a virtual scroll.

The in-browser view model can lazily load the current scroll position, with a suitable cache. Fixed amount of browser JS memory.

The server can just mmap the output file, or alternatively seek around in the file. With a suitable index. Fixed amount of server-side memory.
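As an illustration of the "seek around in the file with a suitable index" idea, here is a small sketch. The helper names `build_line_index` and `read_window` are hypothetical, and the index here stores one offset per line, so it still grows with the output; a sparse index (say, one entry every N lines) would be needed to keep server memory truly fixed.

```python
import os
import tempfile


def build_line_index(path):
    """Return the byte offset at which each line of the file starts."""
    offsets = []
    position = 0
    with open(path, "rb") as f:
        for line in f:
            offsets.append(position)
            position += len(line)
    return offsets


def read_window(path, offsets, first_line, count):
    """Read `count` lines starting at `first_line` without loading the rest."""
    if first_line >= len(offsets):
        return []
    wanted = min(count, len(offsets) - first_line)
    with open(path, "rb") as f:
        f.seek(offsets[first_line])
        return [f.readline().decode("utf-8") for _ in range(wanted)]


# Usage: fetch only the three lines a scrolled view would currently show.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    for i in range(1000):
        tmp.write("line %d\n" % i)
index = build_line_index(tmp.name)
window = read_window(tmp.name, index, first_line=500, count=3)
os.remove(tmp.name)
```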

Files aren't used for output. The filesystem should only be involved, if at all, in the exceptional case of output overflow. The kernel has to block if the notebook server can't append output fast enough; that's normal flow control, just like in a pipe. Fixed memory usage in the kernel.

Thanks Jason for cross-posting. Since the issue of funding was brought up, I think supporting projects like this is exactly the sort of thing we should be doing with the funding we have, whether the work sits on the Jupyter or Sage side (I assume there will be both).

It's a bit tricky to keep track of all the points in an email thread, but if we could aggregate the things that are blockers and the things that would be nice, especially changes you need from Jupyter, we should be able to start ticking boxes. A summary of what I've seen so far:

* sage interacts
* language cells
* document conversion from sagenb to ipynb
* low-level output capturing
* gracefully handling large output

Some comments:

Re: language cells, I assume this refers to things like %%bash, %%R, and %%cython. While these look similar, there is a significant difference in how they are implemented. For instance, the R magic (provided by rpy2) runs an R interpreter in-memory and talks to it, capturing output, etc. Many of the others, such as bash, ruby, and perl, come from some "script magic" machinery in IPython, which populates the default magics with shortcuts for running a script in a given interpreter. They are essentially shortcuts to cat | . It's not a fundamental limitation, or anything dire like that. If sage has an implementation of running code in a persistent alternate interpreter, then it should not be much work to represent that in magics, since a cell magic is any Python function called with two string arguments (the rest of the line and the cell), and can be defined at any time, for instance:

def mymagic(line, cell):
    # line: the rest of the %%mymagic line; cell: the cell body (both strings)
    do_stuff_with(cell)

get_ipython().register_magic_function(mymagic, 'cell')

Re: output capturing, Thomas Kluyver and I were at CERN last month working on the Cling kernel, and one of the things we did was C-level capturing of output. Now that we have that working, integrating it into the IPython kernel should not be much work, and if it's really important, libraries can use the same technique themselves without waiting for IPython to catch up.
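For readers unfamiliar with capturing output below the Python layer, the sketch below shows one common way to do it on POSIX systems: temporarily pointing file descriptor 1 at a temporary file with `os.dup2`. This is an assumption about the general technique, not a description of what was done for the Cling kernel, and the `capture_fd` helper is a hypothetical name.

```python
import contextlib
import os
import sys
import tempfile


@contextlib.contextmanager
def capture_fd(fd=1):
    """Redirect a file descriptor to a temporary file for the duration of the block."""
    result = {"data": b""}
    sys.stdout.flush()              # do not let earlier buffered text leak in
    saved = os.dup(fd)              # remember where fd pointed
    with tempfile.TemporaryFile() as tmp:
        os.dup2(tmp.fileno(), fd)   # anything written to fd now lands in tmp
        try:
            yield result
        finally:
            sys.stdout.flush()      # push Python-level buffers through fd
            os.dup2(saved, fd)      # restore the original destination
            os.close(saved)
            tmp.seek(0)
            result["data"] = tmp.read()


# Usage: os.write bypasses sys.stdout, much like printf in a C extension would.
with capture_fd(1) as captured:
    os.write(1, b"written below the Python layer\n")
```

Because the redirection happens at the descriptor level, it also catches output from compiled libraries that never touch `sys.stdout`, which replacing `sys.stdout` alone would miss.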

Interacts are perhaps the hardest piece. I think it should be doable to get sage's own interacts working in the notebook, rather than forcing people to adopt the much more basic interact provided by the IPython widgets.

I can't speak to the UI transition part of the problem whenever you change defaults, which is a big challenge, but I think we can at least mitigate most of the things on the Jupyter side that are getting in your way.

-MinRK