# Slices time complexity

82 messages

## Slices time complexity

 I'd like to understand what I'm being told about slices in
https://wiki.python.org/moin/TimeComplexity

Particularly, what's a 'del slice' and a 'set slice', and whether this
information pertains to both CPython 2.7 and 3.4.

From the above link it seems slices work in linear time in all cases. And
this really has a big impact on certain operations. For instance, the code
below may surprise some people when they realize it doesn't run in linear
time on 3.4:

    def minimum(values):
        if len(values) == 1:
            return values[0]
        else:
            m = minimum(values[1:])
            return m if m < values[0] else values[0]

Other languages implement slices. I'm currently being faced with a Go
snippet that mirrors the exact code above, and it does run in linear time.

Is there any reason why the Python 3.4 implementation of slices cannot be
a near-constant operation?
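[Editorial aside, not part of the original post: the quadratic behaviour is easy to see empirically. This sketch times the recursive version on doubling input sizes; the exact numbers depend on the machine, but the time should roughly quadruple as n doubles.]

```python
import sys
import time

def minimum(values):
    # Each call copies the tail with values[1:], an O(n) slice, so the
    # whole recursion does O(n**2) work and is one stack frame per element.
    if len(values) == 1:
        return values[0]
    m = minimum(values[1:])
    return m if m < values[0] else values[0]

sys.setrecursionlimit(5000)  # recursion depth equals the list length

for n in (1000, 2000):
    data = list(range(n, 0, -1))
    t0 = time.perf_counter()
    assert minimum(data) == 1
    elapsed = time.perf_counter() - t0
    print(n, f"{elapsed:.4f}s")  # doubling n roughly quadruples the time
```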

## Slices time complexity

 On Tue, May 19, 2015 at 5:23 AM, Mario Figueiredo wrote:
> From the above link it seems slices work in linear time on all cases.
> And this really has a big impact on certain operations. For instance,
> the code below may surprise some people when they realize it doesn't
> run in linear time on 3.4:
>
>     def minimum(values):
>         if len(values) == 1:
>             return values[0]
>         else:
>             m = minimum(values[1:])
>             return m if m < values[0] else values[0]

https://xkcd.com/1270/

Is there really a reason to code this in such a bizarrely inefficient way?
Linear or not, it's bound to be less efficient than the more obvious form:

    def minimum(values):
        values = iter(values)
        min = next(values)
        for value in values:
            if value < min: min = value
        return min

And if you know your value domain (maybe you're working with floats, or
positive integers, or something) and can put a hard-coded base value in,
it becomes even simpler:

    def minimum(values):
        min = float("inf")  # or any upper bound on the domain, e.g. 256 for bytes
        for value in values:
            if value < min: min = value
        return min

It's obvious that this code will complete in linear time. It's also pretty
obvious how it's working: it steps through the collection, comparing each
value against the current smallest. With your recursive version, it
effectively steps backward through the list, comparing each value against
the current smallest, all while unwinding the stack. It can't even be
tail-call-optimized in its current state.

What's the point of optimizing slicing to allow you to use a poor
algorithm, instead of fixing your algorithm?

ChrisA

## Slices time complexity

 In reply to this post by Mario Figueiredo

On May 18, 2015 9:26 PM, "Mario Figueiredo" wrote:
>
> I'd like to understand what I'm being told about slices in
> https://wiki.python.org/moin/TimeComplexity
>
> Particularly, what's a 'del slice' and a 'set slice' and whether this
> information pertains to both CPython 2.7 and 3.4.
>
> [snip]
>
> Is there any reason why Python 3.4 implementation of slices cannot be
> a near constant operation?

In this case you are copying the items (or rather pointers to the items)
from the list to a new list. This is inherently an O(k) operation.

There are other ways to implement it. I recall there was talk of a way to
get views of sequences, although this would involve keeping the original
list in memory, and any changes to the new list would be passed to the
original (this is how numpy works). It is also possible to do
copy-on-write, which avoids altering the original list but has the same
memory problems, while still involving an O(k) copy operation if you want
to change the list, and it is more complex to implement (this is how
MATLAB works). But to have a new list that can be edited independently
requires copying something, and that will be an O(k) operation, unless
you use a radically different data structure with its own limitations.
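[Editorial aside, not from the post: the "pointers to the items" point can be shown directly. A slice copies the k pointers into a brand-new list, so the two lists are independent at the top level while the items themselves are shared.]

```python
a = [[1], [2], [3]]
b = a[1:]            # O(k): k pointers are copied into a brand-new list

b[0] = [9]           # rebinding a slot of b leaves a untouched...
assert a == [[1], [2], [3]]

b = a[1:]
b[0].append(7)       # ...but the pointed-to items themselves are shared
assert a == [[1], [2, 7], [3]]
```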

## Slices time complexity

 In reply to this post by Mario Figueiredo

On Tue, 19 May 2015 05:36:44 +1000, Chris Angelico wrote:
> What's the point of optimizing slicing to allow you to use a poor
> algorithm, instead of fixing your algorithm?

Chris, thank you for your input. But the code isn't really the question,
is it? It's just an example. It was being used earlier to demonstrate
time complexity calculations in another forum. It's not real live code.
Never will be. Besides, we already have a min() function in Python.

## Slices time complexity

 In reply to this post by Mario Figueiredo

On Mon, May 18, 2015 at 1:23 PM, Mario Figueiredo wrote:
> I'd like to understand what I'm being told about slices in
> https://wiki.python.org/moin/TimeComplexity
>
> Particularly, what's a 'del slice' and a 'set slice' and whether this
> information pertains to both CPython 2.7 and 3.4.

"Del Slice" is the operation where a slice of a list is deleted, and "Set
Slice" is the operation where a slice is replaced. E.g.:

    >>> x = list(range(100))
    >>> del x[2:98]
    >>> x
    [0, 1, 98, 99]
    >>> x[1:3] = [7, 6, 5, 4, 3]
    >>> x
    [0, 7, 6, 5, 4, 3, 99]

> Other languages implement slices. I'm currently being faced with a Go
> snippet that mirrors the exact code above and it does run in linear
> time.
>
> Is there any reason why Python 3.4 implementation of slices cannot be
> a near constant operation?

The semantics are different. IIRC, a slice in Go is just a view of some
underlying array; if you change the array (or some other slice of it),
the change will be reflected in the slice. A slice of a list in Python,
OTOH, constructs a completely independent list.

It may be possible that lists in CPython could be made to share their
internal arrays with other lists on a copy-on-write basis, which could
allow slicing to be O(1) as long as neither list is modified while the
array is being shared. I expect this would be a substantial piece of
work, and I don't know if it's something that anybody has looked into.
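[Editorial aside, not from the post: when a slice is only going to be read once, `itertools.islice` exposes the same region lazily, without building the O(k) copy that the eager slice does.]

```python
from itertools import islice

x = list(range(100))

# Eager slice: a new 3-element list is built immediately, in O(k) time/space.
eager = x[2:5]
assert eager == [2, 3, 4]

# Lazy slice: an iterator over the same region. No list is built, though
# islice still has to step past the first 2 items to reach the start.
lazy = islice(x, 2, 5)
assert list(lazy) == [2, 3, 4]
```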

## Slices time complexity

 In reply to this post by Mario Figueiredo

On 05/18/2015 09:49 PM, Ian Kelly wrote:
> It may be possible that lists in CPython could be made to share their
> internal arrays with other lists on a copy-on-write basis, which could
> allow slicing to be O(1) as long as neither list is modified while the
> array is being shared. I expect this would be a substantial piece of
> work, and I don't know if it's something that anybody has looked into.

Isn't Numpy doing this (not sure, not a Python nor a numpy expert)?

    >>> import numpy as np
    >>> a = np.array([1,2,3,4])
    >>> b = a[1:]
    >>> b[0] = 9
    >>> a
    array([1, 9, 3, 4])

Fabien

## Slices time complexity

 In reply to this post by Fabien

On May 18, 2015 9:56 PM, "Fabien" wrote:
> Isn't Numpy doing this (not sure, not a python nor a numpy expert):
>
> >>> import numpy as np
> >>> a = np.array([1,2,3,4])
> >>> b = a[1:]
> >>> b[0] = 9
> >>> a
> array([1, 9, 3, 4])

Numpy arrays use views. Matlab arrays use copy-on-write. But as discussed
in the recent thread about string views, these approaches have a memory
penalty since they require keeping the original array in memory. And the
copy-on-write approach still has O(k) complexity if you try to make any
changes.
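[Editorial aside, not from the post: the memory-penalty point can be seen with `memoryview`, a view type Python does provide for bytes-like objects. A tiny view keeps the entire underlying buffer alive until the view is released.]

```python
big = bytes(1_000_000)        # a 1 MB buffer
view = memoryview(big)[:10]   # O(1): the view stores an offset and length, no copy
del big                       # the full 1 MB stays alive: the view still references it
assert view.obj is not None   # the underlying buffer is reachable through the view
assert len(view) == 10

copy = bytes(view)            # materializing costs O(k), but drops the dependency
view.release()                # now the 1 MB buffer can be reclaimed
assert copy == b"\x00" * 10
```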

## Slices time complexity

 In reply to this post by Mario Figueiredo

On Mon, 18 May 2015 13:49:45 -0600, Ian Kelly wrote:
> The semantics are different. IIRC, a slice in Go is just a view of
> some underlying array; if you change the array (or some other slice of
> it), the change will be reflected in the slice. A slice of a list in
> Python, OTOH, constructs a completely independent list.
>
> It may be possible that lists in CPython could be made to share their
> internal arrays with other lists on a copy-on-write basis, which could
> allow slicing to be O(1) as long as neither list is modified while the
> array is being shared. I expect this would be a substantial piece of
> work, and I don't know if it's something that anybody has looked into.

This is what I was after. Thank you, Ian.

So we basically don't have a view of a list. Makes sense now, since a
slice does create a new list. I should have seen that.

It would be a good addition, though, even if only on specialized
implementations like PyPy. Inspecting and not changing lists is a good
chunk of our code. But that's beside the point. I was more interested in
understanding the behavior. Thank you once again.
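[Editorial aside: a read-only list view along these lines can be sketched in pure Python. The `ListView` name and interface here are hypothetical, not an existing API; the sketch just stores bounds instead of copying, so slicing it is O(1).]

```python
from collections.abc import Sequence

class ListView(Sequence):
    """Read-only O(1) 'slice' of a list: stores (base, start, stop), no copy."""

    def __init__(self, base, start=0, stop=None):
        self._base = base
        self._start = start
        self._stop = len(base) if stop is None else stop

    def __len__(self):
        return self._stop - self._start

    def __getitem__(self, i):
        if isinstance(i, slice):
            start, stop, step = i.indices(len(self))
            if step != 1:
                raise ValueError("only contiguous views supported in this sketch")
            return ListView(self._base, self._start + start, self._start + stop)
        if i < 0:
            i += len(self)
        if not 0 <= i < len(self):
            raise IndexError(i)
        return self._base[self._start + i]

data = list(range(10))
v = ListView(data)[2:8]     # no copying, just new bounds
assert list(v) == [2, 3, 4, 5, 6, 7]
data[3] = 99                # changes show through the view, like a Go slice
assert v[1] == 99
```

Note the trade-off the thread describes: the view keeps `data` alive, and mutations to `data` are visible through it.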

## Slices time complexity

 On 18/05/2015 22:04, Mario Figueiredo wrote:
> This is what I was after. Thank you Ian.
>
> So we basically don't have a view of a list. Makes sense now, since
> slice does create a new list. I should have seen that. Thank you once
> again.
>
> It would a good addition though. Even if only on specialized
> implementations like pypy. Inspecting and not changing lists is a good
> chunk of our code. But that's besides the point. I was more interested
> in understanding the behavior. Thank you once again.

Not directly affecting slices, but the idea of views has come into Python
3. Please see:

https://www.python.org/dev/peps/pep-3118/
https://docs.python.org/3/whatsnew/3.3.html#pep-3118-new-memoryview-implementation-and-buffer-protocol-documentation
https://docs.python.org/3/library/stdtypes.html#typememoryview

--
My fellow Pythonistas, ask not what our language can do for you, ask what
you can do for our language.

Mark Lawrence
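[Editorial aside, not from the post: a short example of the PEP 3118 machinery. Slicing a `memoryview` is O(1) and, over a writable buffer like `bytearray`, writes through the view reach the original.]

```python
buf = bytearray(b"hello world")
mv = memoryview(buf)[6:]   # O(1) view over the tail; no bytes are copied
mv[0:5] = b"there"         # writes go straight through to the original buffer
assert buf == bytearray(b"hello there")
mv.release()
```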

## Slices time complexity

 In reply to this post by Mario Figueiredo

On 5/18/2015 5:04 PM, Mario Figueiredo wrote:
> So we basically don't have a view of a list.

Actually we do, if you think about things the right way. An index can be
viewed as representing the slice of a list from the indexed item to the
end. In this view, "for i in range(len(seq)):" works with progressively
shrinking slices, the same as with the recursive version of the
algorithm.

The analogy is better with iterators. iter(seq) returns a seq_iterator
that initially represents a tail slice consisting of the entire sequence.
Each next(seq_iter) call returns the head of the sequence and mutates
seq_iter to represent a reduced tail slice. The effect is the same as
repeatedly stripping the head from a linked list. For statements automate
the next calls and StopIteration checks.

--
Terry Jan Reedy
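[Editorial aside, not from the post: this point can be made concrete by rewriting the recursive algorithm over a shared iterator. Advancing the iterator is the O(1) "shrink the tail slice" step, so the recursion does O(1) work per level and the total is linear, though it is still O(n) deep.]

```python
def minimum(values):
    it = iter(values)

    def rec(head):
        # next(it) advances the shared iterator: the 'tail slice' shrinks
        # in O(1), with no copy. The None sentinel assumes None never
        # appears in values.
        rest = next(it, None)
        if rest is None:
            return head
        m = rec(rest)
        return m if m < head else head

    return rec(next(it))

assert minimum([5, 3, 8, 1, 4]) == 1
```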

## Slices time complexity

 In reply to this post by Mario Figueiredo

On Tuesday, May 19, 2015 at 1:20:55 AM UTC+5:30, Ian wrote:
> It may be possible that lists in CPython could be made to share their
> internal arrays with other lists on a copy-on-write basis, which could
> allow slicing to be O(1) as long as neither list is modified while the
> array is being shared. I expect this would be a substantial piece of
> work, and I don't know if it's something that anybody has looked into.

One fundamental difference/choice in language semantics is copy vs
reference. C#/.Net call it value-type and reference-type. [The names may
be a bit odd...]

Functional languages OTOH tend to the philosophy:
- maximise sharing internally
- mutability taboo

which has its own costs -- e.g. 'obvious' data structures like arrays
become hard/impossible.

It's quite likely that your proposal -- slices as COW-views -- will have
significant penalties in other usage scenarios. Imagine plastering more
and more complex slice-views on an array inter-mixed with updates.

I must say I am impressed by C#/.Net for making the value/object
distinction first-class all the way from language to VM.

## Slices time complexity

 On Tuesday 19 May 2015 12:20, Rustom Mody wrote:
> I must say I am impressed by C#/.Net for making the value/object
> distinction first-class all the way from language to VM.

I'm not sure what you mean by that. Are you referring to something
similar to Java's distinction between native/unboxed types versus objects
and boxed values?

Apart from the possible efficiency gains, what benefit do you see from
distinguishing between "values which are objects" versus "values which
are not objects"?

--
Steve

## Slices time complexity

 On Tuesday, May 19, 2015 at 10:44:10 AM UTC+5:30, Steven D'Aprano wrote:
> On Tuesday 19 May 2015 12:20, Rustom Mody wrote:
>
> > I must say I am impressed by C#/.Net for making the value/object
> > distinction first-class all the way from language to VM.
>
> I'm not sure what you mean by that.

Neither am I. [Don't know too much about that environment.]

> Are you referring to something similar to Java's distinction between
> native/unboxed types versus objects and boxed values?

Yes, except that it seems to be more core to C# than to Java (or Python):
https://msdn.microsoft.com/en-us/library/s1ax56ch.aspx

And then this distinction goes all the way down to the CLR [in ways that
I am not very clear about], e.g.
http://www.informit.com/articles/article.aspx?p=30608&seqNum=3

> Apart from the possible efficiency gains, what benefit do you see from
> distinguishing between "values which are objects" versus "values which
> are not objects"?

As I said, in the context of a low-level language it's probably a bit of
a misnomer. However, conceptually/pedagogically, making a fundamental
distinction of

    timeless   | time
    value      | object
    immutable  | mutable
    expression | statement
    function   | procedure

is key to getting programming [and is something that Pascal got better
than most of its successors].

The FPers want to squeeze the whole world into column 1.
The OOPers want to do the opposite and are embarrassed by the existence
of column 1 [status of int in Java etc].

Unless one is committed to some philosophical extreme position -- Only
One True Way -- I believe accepting two fundamentals is the most sane
choice.

## Slices time complexity

 Rustom Mody:

> However conceptually/pedagogically making a fundamental distinction of
>
>     timeless   | time
>     value      | object
>     immutable  | mutable
>     expression | statement
>     function   | procedure
>
> is key to getting programming [and is something that Pascal got better
> than most of its successors].
>
> Unless one is committed to some philosophical extreme position --
> Only One True Way -- I believe accepting two fundamentals is the most
> sane choice

I sympathize.

Can you get Python without getting a language like C first? Can a baby be
born without an umbilical cord? Can you skip Newton and go straight to
quantum mechanics and relativity? I have noticed some experienced Java
programmers are a bit lost in the woods because they don't have an idea
of what is going on under the hood.

Scheme almost gets away with the dilemma but must face it with pairs:

    A pair (sometimes called a dotted pair) is a record structure with
    two fields called the car and cdr fields (for historical reasons).
    Pairs are created by the procedure cons. The car and cdr fields are
    accessed by the procedures car and cdr. The car and cdr fields are
    assigned by the procedures set-car! and set-cdr!.

    Pairs are used primarily to represent lists.

What is a "record structure?" What is a "field?" What is "creation?"

Still, I don't like the dichotomy of boxed/unboxed values. Is a Python
integer an "object" or a "value?" The Faithful have a surefire answer,
but I think the key is: who cares? What matters is the effect of a
program. If two metaphysical formulations of language semantics produce
the same outcome, neither is objectively better than the other.

Pedagogically, I think you could introduce topics with half-truths and
simplifications, and revisit them later with corrections when the
students are better equipped to handle the abstractions.

Marko

## Slices time complexity

 On Tuesday 19 May 2015 17:12, Marko Rauhamaa wrote:
> Can you get Python without getting a language like C first?

Yes.

There are a few senses in which C is very important to Python. Python was
designed to be a glue language for interoperability with C libraries, so
that's a very important sense, but in another way it is a trivial sense.
You could replace C by any other language, or no language at all, and
Python would still be more or less the same.

Another sense is that Python has borrowed some syntax and concepts from
C, e.g. zero-based indexing, "continue" and "break" keywords, etc. But
these are incidentals; the language would be mostly still the same if
"continue" was spelled "next" in the Pascal tradition. Python has been
influenced by, and borrowed concepts from, a whole slew of languages,
including Pascal, Algol, Modula 3, Icon, CLU, Lisp, Haskell, TCL,
Smalltalk, ML, and especially ABC.

[...]

> Still, I don't like the dichotomy of boxed/unboxed values.

Distinguishing the two is important for mere reasons of implementation
efficiency, but it makes the language semantics messy.

> Is a Python integer an "object" or a "value?"

Why should that be a dichotomy? Ints are values. Floats are values. Lists
are values. Strings are values. Structs and records and union types and
even functions are values. Why should objects not also be values?

> The Faithful have a surefire answer, but I think the key is: who cares?
> What matters is the effect of a program. If two metaphysical
> formulations of language semantics produce the same outcome, neither is
> objectively better than the other.

By outcome, do you mean the program's output? But the metaphysical
formulation of the language is irrelevant to the program output. You can
get the same output in Forth, Haskell, Prolog, C, Applescript or a Turing
Machine, despite having radically different paradigms.

The outcome which matters is *usability*. Some paradigms are better
suited to human understanding than others. Complexity and inconsistency
are hard. A language where the rules for dealing with lists are different
from those for floats will be harder to use than a language where they
are both treated in the same way. I'm not talking about the functionality
available to each type -- obviously lists and floats use different syntax
and different functionality. But if you wrote `$spam = alist` to assign a
list, and `spam := $afloat` to assign a float, that would be bad. A
consistent language paradigm is better than an inconsistent one. Think of
Perl with its complicated rules that you have to memorize before you know
whether to prefix a variable with a $ or a @ or whatever the sigils are.

--
Steve

## Slices time complexity

 In reply to this post by Rustom Mody

On Tuesday 19 May 2015 15:33, Rustom Mody wrote:
> However conceptually/pedagogically making a fundamental distinction of
>
>     timeless   | time
>     value      | object
>     immutable  | mutable
>     expression | statement
>     function   | procedure
>
> is key to getting programming [and is something that Pascal got better
> than most of its successors].

Hmmm. Well, I don't quite know what distinction you are making between
timeless and time, that's ambiguous, and I strongly disagree with the
value/object distinction, but the rest seems reasonable.

--
Steve

## Slices time complexity

 In reply to this post by Steven D'Aprano-11

Steven D'Aprano:

>> What matters is the effect of a program. If two metaphysical
>> formulations of language semantics produce the same outcome, neither
>> is objectively better than the other.
>
> By outcome, do you mean the program's output?

Yes, the program's output, but also the interactions between program
parts.

> But the metaphysical formulation of the language is irrelevant to the
> program output. You can get the same output in Forth, Haskell, Prolog,
> C, Applescript or a Turing Machine, despite having radically different
> paradigms.

You misunderstood my point. A valid Python program would not produce the
same output when given to a Prolog interpreter (it would likely produce a
syntax error). What I'm saying is that it doesn't matter what semantic
description you give Python constructs as long as the observed behavior
is correct.

For example, you could explain Python's object references as pointers
(memory addresses) if you considered that helpful. (That's how Lisp
textbooks often explain cons cells.)

Marko

## Slices time complexity

 On Tue, May 19, 2015 at 6:39 PM, Marko Rauhamaa wrote:
> For example, you could explain Python's object references as pointers
> (memory addresses) if you considered that helpful. (That's how Lisp
> textbooks often explain cons cells.)

Sorta-kinda-maybe, but if a C programmer's idea of pointers is invoked to
explain Python's object references, the differences will start to be
problematic:

1) Pointer arithmetic simply doesn't exist in Python. Arrays/lists are
not just pointers to their first elements, and subscripting is most
definitely NOT "add to pointer and dereference".

2) In fact, dereferencing as a whole isn't really a 'thing' either. At
best, it happens automatically.

3) References actually mean something. Copying a pointer doesn't. Whether
the Python you're using is refcounted (CPython) or mark-and-sweep (uPy, I
think) or some other model, an additional reference to the same object
will prevent it from being disposed of, which isn't the case in C.

4) A pointer is itself a value. You can pass a pointer-to-local-variable
to another function and have that function change a local variable.

5) Furthermore, since a pointer is a value, you can have pointers to
pointers, etc. Doesn't make any sense in Python.

A closer comparison is the C++ "reference", which works kinda like
pass-by-reference, kinda like a pointer, and kinda not like anything at
all, really. But that's still not the same thing as HLL object semantics
(the same semantics used by Python, Pike, ECMAScript, and a bunch of
other languages).

As long as you are aware that analogies are always limited, you can
certainly use them as part of an explanation; but you do have to take
care.

ChrisA
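[Editorial aside, not from the post: points 3 and 4 above can be shown in a few lines. Assigning a name copies a reference, not the object, and a function cannot be handed a pointer-to-variable to rebind the caller's name.]

```python
a = [1, 2, 3]
b = a                      # copies the reference, not the list (point 3)
b.append(4)
assert a == [1, 2, 3, 4]   # one object, two names
assert a is b

def rebind(x):
    x = [9]                # rebinds the local name only; there is no
                           # pointer-to-local-variable to pass (point 4)

rebind(a)
assert a == [1, 2, 3, 4]   # the caller's binding is unchanged
```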