nbexplode - experiment in version controlling notebooks

nbexplode - experiment in version controlling notebooks

Thomas Kluyver-2
I've just completed a rough prototype of a concept we've been calling 'nbexplode'. If you've wrestled with merge conflicts in notebooks kept in text-based VCSs, you might like to investigate it.

https://github.com/takluyver/nbexplode/

Since current VCSs don't understand any structure within a file (beyond lines), the idea is to let the VCS know about the structure of a notebook by breaking it up into many files for separate cells and outputs. The VCS should then be smarter about merging separate changes. When you check out an exploded notebook, you can recombine it into a single .ipynb file to work with.

This is still sub-optimal, because filesystems (and therefore VCSs) have no notion of an ordered sequence. And it's probably more unwieldy for viewing diffs, because cells appear out of order.
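
The explode/recombine round trip described above could be sketched roughly as follows. This is an illustrative sketch only, not the actual nbexplode code: the function names `explode` and `recombine`, the random per-cell file names, the `notebook.json` manifest, and its `cell_order` key are all assumptions made for the example. It also keeps each cell's outputs inside the cell file rather than splitting them out, and it records cell order in the manifest because the directory itself has no ordering.

```python
import json
import uuid
from pathlib import Path


def explode(nb: dict, target: Path) -> None:
    """Write each cell of a notebook dict to its own JSON file under `target`."""
    target.mkdir(parents=True, exist_ok=True)
    order = []
    for cell in nb["cells"]:
        # Hypothetical naming scheme: a random ID per cell, so that inserting
        # a new cell never renames the files of its neighbours.
        cell_id = uuid.uuid4().hex
        order.append(cell_id)
        cell_path = target / f"{cell_id}.json"
        cell_path.write_text(json.dumps(cell, indent=1, sort_keys=True), encoding="utf-8")

    # Manifest (assumed layout): all top-level fields except the cells,
    # plus the explicit cell order that the filesystem cannot express.
    manifest = {key: value for key, value in nb.items() if key != "cells"}
    manifest["cell_order"] = order
    manifest_path = target / "notebook.json"
    manifest_path.write_text(json.dumps(manifest, indent=1, sort_keys=True), encoding="utf-8")


def recombine(target: Path) -> dict:
    """Rebuild a single notebook dict from a directory written by `explode`."""
    manifest = json.loads((target / "notebook.json").read_text(encoding="utf-8"))
    order = manifest.pop("cell_order")
    manifest["cells"] = [
        json.loads((target / f"{cell_id}.json").read_text(encoding="utf-8"))
        for cell_id in order
    ]
    return manifest
```

With a layout like this, edits to two different cells touch two different files, so a line-based VCS can merge them independently. The manifest remains a single shared file, so two branches that both insert or reorder cells can still conflict there.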

Thomas

_______________________________________________
IPython-dev mailing list
[hidden email]
http://mail.scipy.org/mailman/listinfo/ipython-dev

Re: nbexplode - experiment in version controlling notebooks

Brian Granger-3
Ooohhh, very cool! Thanks for working on this!

On Thu, Mar 5, 2015 at 11:36 AM, Thomas Kluyver <[hidden email]> wrote:
> I've just completed a rough prototype of a concept we've been calling 'nbexplode'. [...]

--
Brian E. Granger
Cal Poly State University, San Luis Obispo
@ellisonbg on Twitter and GitHub
[hidden email] and [hidden email]


Re: nbexplode - experiment in version controlling notebooks

Maximilian Albert
In reply to this post by Thomas Kluyver-2
Very interesting idea! Thanks for working on this, Thomas! (And on everything else. ;) )

Would you favour this approach over something like nbdiff [1,2]? Disclaimer: I haven't tried nbdiff yet, just stumbled upon it and thought it looked promising. But I'm wondering what sort of future direction you core devs have in mind for dealing with "semantic" diffing/merging of notebooks.

Cheers,
Max

[1] http://nbdiff.org/
[2] https://github.com/tarmstrong/nbdiff



Re: nbexplode - experiment in version controlling notebooks

rossant
For a different (orthogonal) approach to the problem of VCS with
ipynb, you can have a look at https://github.com/rossant/ipymd/. It
lets you replace the JSON ipynb format with regular Markdown or Python
scripts. It's convenient if you don't need to save
metadata/output/images/etc. These formats are more VCS-friendly than
JSON.
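
As a rough illustration of why such formats are friendlier to a line-based VCS, here is a minimal sketch that flattens a notebook dict into Markdown. It does not use ipymd's API; the function name `notebook_to_markdown` is hypothetical, and the choice to drop outputs, metadata, and non-code/non-Markdown cells is an assumption that mirrors the trade-off mentioned above.

```python
FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable


def notebook_to_markdown(nb: dict) -> str:
    """Flatten a notebook dict to Markdown, keeping only cell sources."""
    chunks = []
    for cell in nb["cells"]:
        source = cell["source"]
        # The notebook format allows `source` to be a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        if cell["cell_type"] == "code":
            # Outputs and execution counts are deliberately discarded.
            chunks.append(f"{FENCE}python\n{source}\n{FENCE}")
        elif cell["cell_type"] == "markdown":
            chunks.append(source)
        # Other cell types (for example raw cells) are skipped in this sketch.
    return "\n\n".join(chunks) + "\n"
```

Because the result is plain text with one line per source line, a change to one cell shows up as a small, readable diff, at the cost of losing everything that is not cell source.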

Cyrille


Re: nbexplode - experiment in version controlling notebooks

Thomas Kluyver-2
In reply to this post by Maximilian Albert

On 6 Mar 2015 01:47, "Maximilian Albert" <[hidden email]> wrote:
> Would you favour this approach over something like nbdiff [1,2]? Disclaimer: I haven't tried nbdiff yet, just stumbled upon it and thought it looked promising.

I don't think this is better than nbdiff: it's much worse for a human looking at the diff, but it may allow some more merges to be handled automatically, without manually fixing conflicts. It may even be possible to combine the two, so nbexplode helps the VCS, and nbdiff helps the user when the VCS can't merge automatically.

> But I'm wondering what sort of future direction you core devs have in mind for dealing with "semantic" diffing/merging of notebooks.

Write a new VCS? I'm mostly joking, but so long as VCSs are designed for nothing but flat text files, I think storing structured data in them will be kind of hackish.

Thomas



Re: nbexplode - experiment in version controlling notebooks

Michael Borysow
Is there anything like Semantic Merge (https://www.semanticmerge.com/) that understands something simpler like JSON?

Still eagerly awaiting them adding support for Python.

Oh, and this is my first email to the list, hello everyone!

Regards,
Michael


On 03/06/2015 10:32 AM, Thomas Kluyver wrote:
> > But I'm wondering what sort of future direction you core devs have in mind for dealing with "semantic" diffing/merging of notebooks.
>
> Write a new VCS? I'm mostly joking, but so long as VCSs are designed for nothing but flat text files, I think storing structured data in them will be kind of hackish.

-- 
Michael Borysow, Ph.D.
Engineering Scientist
The University of Texas at Austin
Applied Research Laboratories
[hidden email]
(512) 835-3396


Re: nbexplode - experiment in version controlling notebooks

Thomas Kluyver-2


On 6 Mar 2015 08:37, "Michael Borysow" <[hidden email]> wrote:
> Semantic Merge (https://www.semanticmerge.com/)

Interesting, I hadn't seen that before. Pity it's not open source.

Thomas

