Brilliant or insane code?

Mark Lawrence
I've just come across this
http://www.stavros.io/posts/brilliant-or-insane-code/ as a result of
this http://bugs.python.org/issue23695

Any and all opinions welcomed, I'm chickening out and sitting firmly on
the fence.

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Chris Angelico
On Wed, Mar 18, 2015 at 11:35 AM, Mark Lawrence <breamoreboy at yahoo.co.uk> wrote:
> I've just come across this
> http://www.stavros.io/posts/brilliant-or-insane-code/ as a result of this
> http://bugs.python.org/issue23695
>
> Any and all opinions welcomed, I'm chickening out and sitting firmly on the
> fence.

"""I don't really like the fact that it relies in an implementation
detail (the order the zip function iterates over the arrays) to
work"""

I don't think that's an implementation detail. Since zip is designed
to work with arbitrary iterables, it needs to be able to cope with
those which interact with each other. The most obvious order for it to
grab values is the order it was given them, and that's exactly what it
does. If that's not specifically documented, it should be, because
this won't be the only code that depends on it.
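A minimal sketch of the ordering point, for Python 3. The helper name `tagged` and the sample values are made up for illustration and are not from the blog post. As far as I recall, the current documentation for zip does state that left-to-right evaluation of the iterables is guaranteed, but please check the docs rather than take this as authoritative.

```python
def tagged(name, values, log):
    """Yield values one by one, recording which iterator was advanced."""
    for value in values:
        log.append(name)
        yield value

log = []
a = tagged("a", [1, 2], log)
b = tagged("b", [10, 20], log)
c = tagged("c", [100, 200], log)

rows = list(zip(a, b, c))
print(rows)  # [(1, 10, 100), (2, 20, 200)]
print(log)   # ['a', 'b', 'c', 'a', 'b', 'c']

# With one shared iterator, that same order is what produces the grouping.
shared = iter(range(1, 7))
groups = list(zip(shared, shared, shared))
print(groups)  # [(1, 2, 3), (4, 5, 6)]
```

The log shows zip taking one value from each argument in the order given before it starts the next tuple, which is the behaviour the grouping idiom depends on.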

ChrisA


Oscar Benjamin-2
In reply to this post by Mark Lawrence
On 18 March 2015 at 00:35, Mark Lawrence <breamoreboy at yahoo.co.uk> wrote:
> I've just come across this
> http://www.stavros.io/posts/brilliant-or-insane-code/ as a result of this
> http://bugs.python.org/issue23695
>
> Any and all opinions welcomed, I'm chickening out and sitting firmly on the
> fence.

It seems fine to me. The code in question was rightly put into a
sensibly named function to avoid confusion (although the doc-string is
a little confusing).

The timing comparison with numpy.reshape is unfair though:
numpy.reshape is for working with numpy arrays and is vastly more
efficient than the zip method when used as intended.


Oscar


Jason Swails
In reply to this post by Mark Lawrence
On Wed, 2015-03-18 at 00:35 +0000, Mark Lawrence wrote:
> I've just come across this
> http://www.stavros.io/posts/brilliant-or-insane-code/ as a result of
> this http://bugs.python.org/issue23695
>
> Any and all opinions welcomed, I'm chickening out and sitting firmly on
> the fence.

I'll go with clever, as most others have said here; I don't have much to
add to what Steven, Dan, Oscar, and Chris have said.  FTR, I do prefer
the explicit form:

def func(arr):
    i = iter(arr)
    return list(zip(i, i, i)) # py3's zip is py2's izip
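A rough Python 3 sketch showing that the explicit form above and the starred one-liner build the same groups, and what happens to leftover elements. The names `func_star` and `func_padded` are hypothetical, added only for this comparison.

```python
from itertools import zip_longest

def func(arr):
    i = iter(arr)
    return list(zip(i, i, i))

def func_star(arr, n=3):
    # [iter(arr)] * n is a list holding the SAME iterator object n times,
    # so this is zip(i, i, i) written for an arbitrary group size.
    return list(zip(*[iter(arr)] * n))

def func_padded(arr, n=3, fill=None):
    # zip_longest keeps the final short group and pads it with fill.
    return list(zip_longest(*[iter(arr)] * n, fillvalue=fill))

data = list(range(9))
assert func(data) == func_star(data) == [(0, 1, 2), (3, 4, 5), (6, 7, 8)]

# zip stops at the first exhausted argument, so a trailing partial
# group is dropped without any error.
print(func(range(8)))         # [(0, 1, 2), (3, 4, 5)]
print(func_padded(range(8)))  # [(0, 1, 2), (3, 4, 5), (6, 7, None)]
```

Whether silently dropping the tail is acceptable depends on the caller; the padded variant is one option when it is not.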

I will poke at his benchmarking some, though, since it's really
unfairly biased toward the "magical" zip(*([iter(arr)]*3)) syntax.

He compares 3 different options:

In [5]: %timeit [(arr[3*x], arr[3*x+1], arr[3*x+2]) for x in range(len(arr)/3)]
10 loops, best of 3: 41.3 ms per loop

In [6]: %timeit numpy.reshape(arr, (-1, 3))
10 loops, best of 3: 25.3 ms per loop

In [7]: timeit zip(*([iter(arr)]*3))
100 loops, best of 3: 13.4 ms per loop

The first one spends a substantial amount of time doing the same
calculation -- 3*x.  You can trivially shave about a third of the time off
that with no algorithmic changes just by avoiding tripling x so much:

In [8]: %timeit [(arr[x], arr[x+1], arr[x+2]) for x in xrange(0, len(arr), 3)]
10 loops, best of 3: 26.6 ms per loop

Any compiler would optimize that out, but the Python interpreter
doesn't.  As for numpy -- the vast majority of the time spent there is
in data copy.  As Oscar pointed out, numpy is almost always faster and
better at what numpy does well. If you run this test on a numpy array
instead of a list:

In [10]: %timeit numpy.reshape(arr2, (-1, 3))
100000 loops, best of 3: 1.91 µs per loop

So here, option 2 is really ~4 orders of magnitude faster; but that's a
little cheating since no data is ever copied (reshape returns a view
whenever it can, as it does here).  Doing an actual data copy, but always
living in numpy, is a little closer to pure Python (but still ~1 order of
magnitude faster):

In [14]: %timeit np.array(arr2.reshape((-1, 3)))
1000 loops, best of 3: 307 µs per loop

So the iter-magic is really about 10-100x slower than an equivalent
numpy (and 10000x slower than an optimized numpy) variant, and only ~2x
faster than the more explicit option.
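The timings quoted above are from Python 2 (xrange, integer len(arr)/3). For anyone wanting to repeat the pure-Python part on Python 3, a sketch with the standard timeit module might look like the following. The array size is an assumption, since the thread does not say how large arr was, and the function names are invented here. No timings are claimed; they will vary by machine and interpreter.

```python
import timeit

arr = list(range(90_000))  # assumed size, not taken from the thread

def by_index_multiply(a):
    # Recomputes 3*x three times per group.
    return [(a[3*x], a[3*x+1], a[3*x+2]) for x in range(len(a) // 3)]

def by_index_step(a):
    # Steps by 3 instead; the "- 2" skips a trailing partial group.
    return [(a[x], a[x+1], a[x+2]) for x in range(0, len(a) - 2, 3)]

def by_shared_iterator(a):
    i = iter(a)
    return list(zip(i, i, i))

# Check that all three agree before comparing their speed.
assert by_index_multiply(arr) == by_index_step(arr) == by_shared_iterator(arr)

for fn in (by_index_multiply, by_index_step, by_shared_iterator):
    # Best of 3 repeats, 5 calls each, reported per call.
    best = min(timeit.repeat(lambda: fn(arr), number=5, repeat=3)) / 5
    print(f"{fn.__name__}: {best * 1e3:.2f} ms per call")
```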

But it's still a nice trick (and one I may make use of in the
future :)).

Thanks for the post,
Jason




Robin Becker
In reply to this post by Mark Lawrence
On 18/03/2015 00:35, Mark Lawrence wrote:
> I've just come across this http://www.stavros.io/posts/brilliant-or-insane-code/
> as a result of this http://bugs.python.org/issue23695
>
> Any and all opinions welcomed, I'm chickening out and sitting firmly on the fence.
>

There was a long thread on an inverse problem (2D only: interleaving separate
lists of x & y coordinates) in comp.lang.python some while ago

https://groups.google.com/forum/#!topic/comp.lang.python/ODqrLDRn61k

the winner at that time was Peter Otten with

> def flatten7(x,y):
>     '''Peter Otten special case equal lengths'''
>     n = len(x)
>     assert len(y) == n
>     result = [None] * (2*n)
>     result[::2] = x
>     result[1::2] = y
>     return result

interestingly whilst many of the other solutions can be improved/modernized in
later pythons this one has stayed the same.
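To show how the two problems mirror each other, here is a small Python 3 sketch that interleaves with slice assignment and then regroups with the shared-iterator zip. The name `interleave` and the sample data are illustrative only; the body follows the quoted flatten7 but raises an exception instead of asserting.

```python
def interleave(x, y):
    """Interleave two equal-length sequences: x0, y0, x1, y1, ..."""
    n = len(x)
    if len(y) != n:
        raise ValueError("x and y must have the same length")
    result = [None] * (2 * n)
    result[::2] = x    # even positions take the x values
    result[1::2] = y   # odd positions take the y values
    return result

xs = [1, 2, 3]
ys = [10, 20, 30]

flat = interleave(xs, ys)
print(flat)   # [1, 10, 2, 20, 3, 30]

# Going back: one iterator passed twice pairs up consecutive items.
it = iter(flat)
pairs = list(zip(it, it))
print(pairs)  # [(1, 10), (2, 20), (3, 30)]

assert pairs == list(zip(xs, ys))
```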
--
Robin Becker