On Wed, 2015-03-18 at 00:35 +0000, Mark Lawrence wrote:

> I've just come across this
> http://www.stavros.io/posts/brilliant-or-insane-code/ as a result of
> this http://bugs.python.org/issue23695
> Any and all opinions welcomed, I'm chickening out and sitting firmly on
> the fence.

I'll go with clever, as most others have said here; I don't have much to
add to what Steven, Dan, Oscar, and Chris have said. FTR, I do prefer
the:

def func(arr):
    i = iter(arr)
    return list(zip(i, i, i))  # py3's zip is py2's izip
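For reference, the explicit-iterator spelling and the one-liner from the post do the same thing: zip pulls from the *same* iterator three times per output tuple. A quick py3 sanity check (the small `arr` here is just illustrative):

```python
arr = list(range(9))

# Explicit: one named iterator, handed to zip three times.
i = iter(arr)
grouped = list(zip(i, i, i))

# The "magical" spelling: list-multiplication makes three references
# to the same iterator, which * unpacks into zip's arguments.
grouped_magic = list(zip(*([iter(arr)] * 3)))

assert grouped == grouped_magic == [(0, 1, 2), (3, 4, 5), (6, 7, 8)]
```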

I will poke at his benchmarks some, though, since they're really
unfairly biased toward the "magical" zip(*([iter(arr)]*3)) syntax.
He compares 3 different options:

In [5]: %timeit [(arr[3*x], arr[3*x+1], arr[3*x+2]) for x in range(len(arr)/3)]
10 loops, best of 3: 41.3 ms per loop

In [6]: %timeit numpy.reshape(arr, (-1, 3))
10 loops, best of 3: 25.3 ms per loop

In [7]: %timeit zip(*([iter(arr)]*3))
100 loops, best of 3: 13.4 ms per loop

The first one spends a substantial amount of time redoing the same
calculation -- 3*x. You can trivially shave about a third of the time
off that with no algorithmic changes, just by avoiding tripling x so
much:

In [8]: %timeit [(arr[x], arr[x+1], arr[x+2]) for x in xrange(0, len(arr), 3)]
10 loops, best of 3: 26.6 ms per loop
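In py3 terms (where xrange is just range), the stepped-index version groups the same way; again a small illustrative list:

```python
arr = list(range(9))

# Step the loop index by 3 rather than computing 3*x, 3*x+1, 3*x+2
# on every pass. (py2's xrange is py3's range.)
triples = [(arr[x], arr[x + 1], arr[x + 2]) for x in range(0, len(arr), 3)]

assert triples == [(0, 1, 2), (3, 4, 5), (6, 7, 8)]
```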

Any compiler would optimize that out, but the Python interpreter
doesn't. As for numpy -- the vast majority of the time spent there is
in the data copy. As Oscar pointed out, numpy is almost always faster
and better at what numpy does well. If you run this test on a numpy
array instead of a list:

In [10]: %timeit numpy.reshape(arr2, (-1, 3))
100000 loops, best of 3: 1.91 µs per loop

So here, option 2 is really ~4 orders of magnitude faster; but that's a
little cheating, since no data is ever copied (reshape always returns a
view). Doing an actual data copy, but staying entirely in numpy, is a
little closer to pure Python (but still ~1 order of magnitude faster):

In [14]: %timeit np.array(arr2.reshape((-1, 3)))
1000 loops, best of 3: 307 µs per loop
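To make the view-vs-copy distinction concrete, here's a small sketch (a stand-in `arr2`; an array's .base attribute tells you whether it shares another array's buffer):

```python
import numpy as np

arr2 = np.arange(9)

# reshape returns a view: no element data is copied,
# which is why the In [10] timing is so fast.
view = arr2.reshape((-1, 3))
assert view.base is arr2  # shares arr2's buffer

# np.array(...) forces a real copy -- that's what In [14] pays for.
copy = np.array(arr2.reshape((-1, 3)))
copy[0, 0] = 99
assert arr2[0] == 0  # the original is untouched
```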

So the iter-magic is really about 10-100x slower than an equivalent
numpy variant (and ~10000x slower than an optimized numpy one), and
only ~2x faster than the more explicit option.

But it's still a nice trick (and one I may make use of in the
future :)).

Thanks for the post,
Jason