functions, optional parameters


functions, optional parameters

Steven D'Aprano-11
On Fri, 8 May 2015 09:59 pm, Michael Welle wrote:

> Hello,
>
> assume the following function definition:
>
> def bar(foo = []):
>     print("foo: %s" % foo)
>     foo.append("foo")
>
> It doesn't work like one would expect (or as I would expect ;-)). As I
> understand it, the assignment of the empty list to the optional parameter
> foo takes place when the function object is created, not when it is
> called. I think from the perspective of a user this is very strange.

I think it is perfectly expected.

Do you think that Python will re-compile the body of the function every time
you call it? Setting the default is part of the process of compiling the
function.


If we have this function definition:

    def spam(eggs=long_complex_calculation()):
        pass


do you expect the long complex calculation to be repeated every time you
call the function?

How about this definition:

    default = 23
    def spam(eggs=default):
        pass

    del default

    print spam()


Do you expect the function call to fail because `default` doesn't exist?


My answers to those questions are all No. To me, it is not only expected,
but desirable that function defaults are set once, not every time the
function is called. This behaviour is called "early binding" of defaults.

The opposite behaviour is called "late binding".

If your language uses late binding, it is very inconvenient to get early
binding when you want it. But if your language uses early binding, it is
very simple to get late binding when you want it: just put the code you
want to run inside the body of the function:

    # simulate late binding
    def spam(eggs=None):
        if eggs is None:
            # perform the calculation every time you call the function
            eggs = long_complex_calculation()


    default = 23
    def spam(eggs=None):
        if eggs is None:
            # look up the global variable every time you call the function
            eggs = default


On the rare times that you want to allow None as an ordinary value, you can
create your own private sentinel value:


    _SENTINEL = object()

    def spam(eggs=_SENTINEL):
        if eggs is _SENTINEL:
            eggs = something_else
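
A runnable version of that sentinel idiom might look like this (the
`"default-eggs"` value is just a made-up stand-in for whatever the real
default computation would be):

```python
# Module-private sentinel: a unique object no caller can accidentally pass.
_SENTINEL = object()

def spam(eggs=_SENTINEL):
    # Substitute the default only when the argument was truly omitted,
    # so None remains a perfectly ordinary value for callers to pass.
    if eggs is _SENTINEL:
        eggs = "default-eggs"  # stand-in for the real default
    return eggs
```

Here `spam()` falls back to the default, while `spam(None)` passes None
through untouched, which is exactly what the None-as-sentinel idiom cannot do.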



--
Steven




functions, optional parameters

Ian Kelly-2
On May 8, 2015 9:26 AM, "Steven D'Aprano" <
steve+comp.lang.python at pearwood.info> wrote:
>
> Do you think that Python will re-compile the body of the function every
> time you call it? Setting the default is part of the process of compiling
> the function.

To be a bit pedantic, that's not accurate. The default is evaluated when
the function object is created, i.e. when the def statement is executed at
runtime, not when the underlying code object is compiled.


functions, optional parameters

Chris Angelico
On Sat, May 9, 2015 at 1:48 AM, Ian Kelly <ian.g.kelly at gmail.com> wrote:

> On May 8, 2015 9:26 AM, "Steven D'Aprano"
> <steve+comp.lang.python at pearwood.info> wrote:
>>
>> Do you think that Python will re-compile the body of the function every
>> time
>> you call it? Setting the default is part of the process of compiling the
>> function.
>
> To be a bit pedantic, that's not accurate. The default is evaluated when the
> function object is created, i.e. when the def statement is executed at
> runtime, not when the underlying code object is compiled.

Aside from constructing two closures in the same context and proving
that their __code__ attributes point to the same object, is there any
way to distinguish between "code object compilation time" and "def
execution time"? I just played around with it, and as far as I can
tell, code objects are completely read-only.

ChrisA



functions, optional parameters

Steven D'Aprano-11
In reply to this post by Steven D'Aprano-11
On Sat, 9 May 2015 01:48 am, Ian Kelly wrote:

> On May 8, 2015 9:26 AM, "Steven D'Aprano" <
> steve+comp.lang.python at pearwood.info> wrote:
>>
>> Do you think that Python will re-compile the body of the function every
>> time you call it? Setting the default is part of the process of compiling
>> the function.
>
> To be a bit pedantic, that's not accurate. The default is evaluated when
> the function object is created, i.e. when the def statement is executed at
> runtime, not when the underlying code object is compiled.

Yes, that is the pedantically correct version.

"Technically correct -- the best kind of correct."



--
Steven




functions, optional parameters

Steven D'Aprano-11
In reply to this post by Ian Kelly-2
On Sat, 9 May 2015 02:02 am, Chris Angelico wrote:

> On Sat, May 9, 2015 at 1:48 AM, Ian Kelly <ian.g.kelly at gmail.com> wrote:
>> On May 8, 2015 9:26 AM, "Steven D'Aprano"
>> <steve+comp.lang.python at pearwood.info> wrote:
>>>
>>> Do you think that Python will re-compile the body of the function every
>>> time
>>> you call it? Setting the default is part of the process of compiling the
>>> function.
>>
>> To be a bit pedantic, that's not accurate. The default is evaluated when
>> the function object is created, i.e. when the def statement is executed
>> at runtime, not when the underlying code object is compiled.
>
> Aside from constructing two closures in the same context and proving
> that their __code__ attributes point to the same object, is there any
> way to distinguish between "code object compilation time" and "def
> execution time"? I just played around with it, and as far as I can
> tell, code objects are completely read-only.

Sure there is. Write this Python code:


# test.py
print("Function definition time.")
def func():
    pass


Now from your shell, run this:


echo "Compile time"
python -m compileall test.py
rm test.py
sleep 5
python test.pyc



(The sleep is just to make it clear that the compilation and definition time
can be very far apart. They could literally be years apart.)



Actually, we don't need external tools, we can do it all in Python!

py> source = """\
... print "Function definition time."
... def func():
...     pass
... """
py> print "Compile time."; code = compile(source, '', 'exec')
Compile time.
py> exec(code)
Function definition time.
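
For readers on Python 3, the same demonstration with print as a function
(the `"<string>"` filename is an arbitrary label for the compiled source):

```python
source = """\
print("Function definition time.")
def func():
    pass
"""

print("Compile time.")
code = compile(source, "<string>", "exec")
# At this point nothing from `source` has printed: compiling the code
# does not execute the def statement or evaluate any defaults.
exec(code)  # only now does "Function definition time." appear
```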




--
Steven




functions, optional parameters

Chris Angelico
On Sat, May 9, 2015 at 3:36 AM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:

> On Sat, 9 May 2015 02:02 am, Chris Angelico wrote:
>> Aside from constructing two closures in the same context and proving
>> that their __code__ attributes point to the same object, is there any
>> way to distinguish between "code object compilation time" and "def
>> execution time"? I just played around with it, and as far as I can
>> tell, code objects are completely read-only.
>
> Sure there is. Write this Python code:
>
> py> source = """\
> ... print "Function definition time."
> ... def func():
> ...     pass
> ... """
> py> print "Compile time."; code = compile(source, '', 'exec')
> Compile time.
> py> exec(code)
> Function definition time.

Yes, but can you *distinguish* them in terms of default argument
versus code object creation? How do you know that the function's code
object was created when compile() happened, rather than being created
when the function was defined? Is there anything that lets you in any
way show different behaviour based on that timing difference?

ChrisA



functions, optional parameters

mwilson
In reply to this post by Steven D'Aprano-11
On Sat, 09 May 2015 03:49:36 +1000, Chris Angelico wrote:

> Yes, but can you *distinguish* them in terms of default argument versus
> code object creation? How do you know that the function's code object
> was created when compile() happened, rather than being created when the
> function was defined? Is there anything that lets you in any way show
> different behaviour based on that timing difference?

This might show that default objects are fixed at run time:

Python 2.7.3 (default, Mar 14 2014, 11:57:14)
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a = []
>>> def b (arr=a):
...   arr.append ('c')
...
>>> print repr(a)
[]
>>> b()
>>> print repr(a)
['c']
>>> b()
>>> print repr(a)
['c', 'c']
>>>




functions, optional parameters

Gregory Ewing
In reply to this post by Steven D'Aprano-11
Chris Angelico wrote:
> How do you know that the function's code
> object was created when compile() happened, rather than being created
> when the function was defined?

Python 3.4.2 (default, Feb  4 2015, 20:08:25)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
 >>> source = "def f(x = 42): pass"
 >>> code = compile(source, "", "exec")
 >>> c1 = code.co_consts[1]
 >>> c1
<code object f at 0x53d430, file "", line 1>
 >>> e = {}
 >>> exec(code, e)
 >>> c2 = e['f'].__code__
 >>> c2
<code object f at 0x53d430, file "", line 1>
 >>> c1 is c2
True

Is that proof enough for you?

--
Greg



functions, optional parameters

Chris Angelico
On Sat, May 9, 2015 at 11:41 AM, Gregory Ewing
<greg.ewing at canterbury.ac.nz> wrote:

> Chris Angelico wrote:
>>
>> How do you know that the function's code
>> object was created when compile() happened, rather than being created
>> when the function was defined?
>
>
> Python 3.4.2 (default, Feb  4 2015, 20:08:25)
> [GCC 4.2.1 (Apple Inc. build 5664)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
>>>> source = "def f(x = 42): pass"
>>>> code = compile(source, "", "exec")
>>>> c1 = code.co_consts[1]
>>>> c1
> <code object f at 0x53d430, file "", line 1>
>>>> e = {}
>>>> exec(code, e)
>>>> c2 = e['f'].__code__
>>>> c2
> <code object f at 0x53d430, file "", line 1>
>>>> c1 is c2
> True
>
> Is that proof enough for you?

That's what I reached for as my first try, but it's no different from this:

>>> def f(x=42): return x + 1
...
>>> n1 = f()
>>> n2 = f(n1-1)
>>> n1 is n2
True

Clearly in this case, the "x + 1" is getting evaluated at run-time,
and yet the interpreter is welcome to intern the constants. So no, it
isn't proof - it's equally well explained by the code object being
constant.

ChrisA



functions, optional parameters

Steven D'Aprano-11
In reply to this post by Steven D'Aprano-11
On Sat, 9 May 2015 03:49 am, Chris Angelico wrote:

> On Sat, May 9, 2015 at 3:36 AM, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
>> On Sat, 9 May 2015 02:02 am, Chris Angelico wrote:
>>> Aside from constructing two closures in the same context and proving
>>> that their __code__ attributes point to the same object, is there any
>>> way to distinguish between "code object compilation time" and "def
>>> execution time"? I just played around with it, and as far as I can
>>> tell, code objects are completely read-only.
>>
>> Sure there is. Write this Python code:
>>
>> py> source = """\
>> ... print "Function definition time."
>> ... def func():
>> ...     pass
>> ... """
>> py> print "Compile time."; code = compile(source, '', 'exec')
>> Compile time.
>> py> exec(code)
>> Function definition time.
>
> Yes, but can you *distinguish* them in terms of default argument
> versus code object creation? How do you know that the function's code
> object was created when compile() happened, rather than being created
> when the function was defined? Is there anything that lets you in any
> way show different behaviour based on that timing difference?

I think the answer is, "yes, but it's only by peering into the
implementation".

You can read the source code of the Python compiler.

You can compile the code, and then disassemble the byte-code to see that the
code object exists but the function is assembled when the byte-code runs:


py> from dis import dis
py> code = compile("def spam(x): return x + name", "", "exec")
py> dis(code)
  1           0 LOAD_CONST               0 (<code object spam at 0xb7b88160, file "", line 1>)
              3 LOAD_CONST               1 ('spam')
              6 MAKE_FUNCTION            0
              9 STORE_NAME               0 (spam)
             12 LOAD_CONST               2 (None)
             15 RETURN_VALUE



Contrast that to what happens with a default argument:


py> code = compile("def spam(x=name+1): return x + name", "", "exec")
py> dis(code)
  1           0 LOAD_NAME                0 (name)
              3 LOAD_CONST               0 (1)
              6 BINARY_ADD
              7 LOAD_CONST               1 (<code object spam at 0xb7bce890, file "", line 1>)
             10 LOAD_CONST               2 ('spam')
             13 MAKE_FUNCTION            1
             16 STORE_NAME               1 (spam)
             19 LOAD_CONST               3 (None)
             22 RETURN_VALUE




--
Steven




functions, optional parameters

Gregory Ewing
In reply to this post by Gregory Ewing
Chris Angelico wrote:

> So no, it
> isn't proof - it's equally well explained by the code object being
> constant.

I suppose, strictly speaking, that's true -- but
then the code object *might as well* be created
at compile time, since the semantics are identical.

In any case, it's easy to see from the data structure
that the default values are kept in the function object:

Python 3.4.2 (default, Feb  4 2015, 20:08:25)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
 >>> y = "spam"
 >>> def f(x = y): pass
...
 >>> f.__defaults__
('spam',)

I suppose you could argue that f.__defaults__
could be a computed property that's looking inside
f.__code__ somewhere, but that would be stretching
credibility.
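
The same observation as a runnable sketch. In CPython `__defaults__` is not
only inspectable but a plain writable attribute on the function object, which
makes it hard to argue the values live anywhere else:

```python
def f(x="spam"):
    return x

# The defaults live on the function object, in an ordinary tuple.
assert f.__defaults__ == ("spam",)

# The attribute is writable: rebinding it changes what an omitted
# argument receives, without touching f.__code__ at all.
f.__defaults__ = ("ham",)
assert f() == "ham"
assert f("eggs") == "eggs"  # explicit arguments are unaffected
```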

--
Greg



functions, optional parameters

Ian Kelly-2
In reply to this post by Steven D'Aprano-11
On Fri, May 8, 2015 at 9:50 AM, Michael Welle <mwe012008 at gmx.net> wrote:
>
> Steven D'Aprano <steve+comp.lang.python at pearwood.info> writes:
>>
>> If your language uses late binding, it is very inconvenient to get early
>> binding when you want it. But if your language uses early binding, it is
>> very simple to get late binding when you want it: just put the code you
>> want to run inside the body of the function:
> And you have to do it all the time again and again. I can't provide hard
> numbers, but I think usually I want late binding.

You could perhaps write a decorator to evaluate your defaults at call
time. This one relies on inspect.signature, so it requires Python 3.3
or newer:

import inspect
from functools import wraps

def late_defaults(**defaults):
    def decorator(f):
        sig = inspect.signature(f)
        @wraps(f)
        def wrapped(*args, **kwargs):
            bound_args = sig.bind_partial(*args, **kwargs)
            for name, get_value in defaults.items():
                if name not in bound_args.arguments:
                    bound_args.arguments[name] = get_value()
            return f(*bound_args.args, **bound_args.kwargs)
        return wrapped
    return decorator

@late_defaults(b=lambda: x+1, c=lambda: y*2)
def f(a, b, c=None):
    print(a, b, c)

x = 14
y = 37
f(10)
x = 30
y = 19
f(10)
f(10, 11)
f(10, 11, c=12)

Output:

10 15 74
10 31 38
10 11 38
10 11 12

For documentation purposes I suggest using default values of None in
the function spec to indicate that the arguments are optional, and
elaborating on the actual defaults in the docstring. Alternatively you
could put the lambdas in the actual function spec and then just
tell the decorator which ones to apply if not supplied, but that would
result in less readable pydoc.
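
That alternative might look like the following sketch. The decorator name
`call_defaults` is made up; the idea is simply that the defaults in the
signature are zero-argument callables which the wrapper invokes at call time:

```python
import inspect
from functools import wraps

def call_defaults(*names):
    """For each named parameter that was omitted, call its (callable)
    default from the signature at call time (hypothetical helper)."""
    def decorator(f):
        sig = inspect.signature(f)
        @wraps(f)
        def wrapped(*args, **kwargs):
            bound = sig.bind_partial(*args, **kwargs)
            for name in names:
                if name not in bound.arguments:
                    # The "default" in the spec is a zero-argument
                    # callable; evaluate it now, not at def time.
                    bound.arguments[name] = sig.parameters[name].default()
            return f(*bound.args, **bound.kwargs)
        return wrapped
    return decorator

@call_defaults("b", "c")
def f(a, b=lambda: x + 1, c=lambda: y * 2):
    return (a, b, c)

x, y = 14, 37
```

With this version `f(10)` yields `(10, 15, 74)`, but as noted, `help(f)`
now shows `b=<lambda>` rather than a meaningful default, which is the
readability cost.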



functions, optional parameters

Steven D'Aprano-11
In reply to this post by Steven D'Aprano-11
On Sat, 9 May 2015 01:50 am, Michael Welle wrote:

[...]

>> How about this definition:
>>
>>     default = 23
>>     def spam(eggs=default):
>>         pass
>>
>>     del default
>>
>>     print spam()
>>
>>
>> Do you expect the function call to fail because `default` doesn't exist?
>
> If I reference an object, that isn't available in the current context, I
> want to see it fail, yes.

Well, that's an interesting response. Of course I agree with you if the
reference to default is in the code being executed:

def spam():
    value = default


that's quite normal rules for Python functions.

Aside: note that *closures* behave differently, by design: a closure will
keep non-local values alive even if the parent function is deleted.

py> def outer():
...     default = 23
...     def closure():
...             return default
...     return closure
...
py> f = outer()
py> del outer
py> f()
23


But I don't agree with you about default parameters. Suppose we do this:

default = 23
eggs = default
# some time later
del default
print(eggs)

I trust that you agree that eggs shouldn't raise a NameError here just
because default no longer exists!

Why should that be any different just because the assignment is inside a
parameter list?

def spam(eggs=default):
    ...


One of the nice things about Python's current behaviour is that function
defaults don't behave any differently from any other name binding. Python
uses the same semantics for binding names wherever the name is without the
need for users to memorise a bunch of special rules.

Things which look similar should behave similarly.


>> My answers to those questions are all No.
>
> Different answers are possible as it seems ;).

Obviously :-)

And if Python used late binding, as some other languages do (Lisp, I think),
we would have a FAQ:

"Q: Why does my function run slowly/raise an exception when I use a default
value?"

"A: Because the default is re-evaluated every time you call the function,
not just once when you define it."


>> To me, it is not only expected,
>> but desirable that function defaults are set once, not every time the
>> function is called. This behaviour is called "early binding" of defaults.
>>
>> The opposite behaviour is called "late binding".
>>
>> If your language uses late binding, it is very inconvenient to get early
>> binding when you want it. But if your language uses early binding, it is
>> very simple to get late binding when you want it: just put the code you
>> want to run inside the body of the function:
>
> And you have to do it all the time again and again. I can't provide hard
> numbers, but I think usually I want late binding.

I'm pretty sure that you don't. You just think you do because you're
thinking of the subset of cases where you want to use a mutable default
like [], or perhaps delay looking up a global default until runtime, and
not thinking of all the times you use a default.

I predict that the majority of the time, late binding would just be a
pointless waste of time:

def process_string(thestr, start=0, end=None, slice=1, reverse=True):
    pass

Why would you want 0, None, 1 and True to be re-evaluated every time?
Admittedly it will be fast, but not as fast as evaluating them once, then
grabbing a static default value when needed. (See below for timings.)

Whether you use early or late binding, Python still has to store the
default, then retrieve it at call-time. What happens next depends on the
binding model.

With early binding, Python has the value, and can just use it directly. With
late binding, it needs to store a delayed computation object, an executable
expression if you prefer. There are two obvious ways to implement such a
thunk in Python: a code object, or a function.

thunk = compile('0', '', 'eval')  # when the function is defined
value = eval(thunk)  # when the function is called

# or

thunk = lambda: 0
value = thunk()


Both of those are considerably slower than the current behaviour:

py> from timeit import Timer
py> static = Timer("x = 0")
py> thunk = Timer("x = eval(t)", setup="t = compile('0', '', 'eval')")
py> func = Timer("x = f()", setup="f = lambda: 0")
py> min(static.repeat(repeat=7))  # Best of seven trials.
0.04563648998737335
py> min(thunk.repeat(repeat=7))
1.2324241530150175
py> min(func.repeat(repeat=7))
0.20116623677313328


It would be nice to have syntax for late binding, but given that we don't,
and only have one or the other, using early binding is much more sensible.


This is the point where some people try to suggest some sort of complicated,
fragile, DWIM heuristic where the compiler tries to guess whether the user
actually wants the default to use early or late binding, based on what the
expression looks like. "0 is an immutable int, use early binding; [] is a
mutable list, use late binding." sort of thing. Such a thing might work
well for the obvious cases, but it would be a bugger to debug and
work-around for the non-obvious cases when it guesses wrong -- and it will.


--
Steven




functions, optional parameters

Chris Angelico
On Sun, May 10, 2015 at 12:45 PM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:
> This is the point where some people try to suggest some sort of complicated,
> fragile, DWIM heuristic where the compiler tries to guess whether the user
> actually wants the default to use early or late binding, based on what the
> expression looks like. "0 is an immutable int, use early binding; [] is a
> mutable list, use late binding." sort of thing. Such a thing might work
> well for the obvious cases, but it would be a bugger to debug and
> work-around for the non-obvious cases when it guesses wrong -- and it will.

What you could have is "late-binding semantics, optional early binding
as an optimization but only in cases where the result is
indistinguishable". That would allow common cases (int/bool/str/None
literals) to be optimized, since there's absolutely no way for them to
evaluate differently.

I personally don't think it'd be that good an idea, but it's a simple
enough rule that it wouldn't break anything. As far as anyone's code
is concerned, the rule is "late binding, always". In fact, that would
be the language definition; the rest is an optimization. (It's like
how "x.y()" technically first looks up attribute "y" on object x, then
calls the result; but it's perfectly reasonable for a Python
implementation to notice this extremely common case and do an
"optimized method call" that doesn't actually create a function
object.) The simpler the rule, the easier to grok, and therefore the
less chance of introducing bugs.

ChrisA



functions, optional parameters

Rustom Mody
In reply to this post by Steven D'Aprano-11
On Sunday, May 10, 2015 at 8:16:07 AM UTC+5:30, Steven D'Aprano wrote:

> I predict that the majority of the time, late binding would just be a
> pointless waste of time:
>
> def process_string(thestr, start=0, end=None, slice=1, reverse=True):
>     pass
>
> Why would you want 0, None, 1 and True to be re-evaluated every time?
> Admittedly it will be fast, but not as fast as evaluating them once, then
> grabbing a static default value when needed. (See below for timings.)
>

And what is the work involved in (re)computing 0, None, 1, True??

If I write (... arg=square_root_of_grahams_number())
I would expect to pay for it.
If I write trivial defaults, then I expect trivial payment.



functions, optional parameters

Steven D'Aprano-11
In reply to this post by Steven D'Aprano-11
On Sun, 10 May 2015 01:33 pm, Chris Angelico wrote:

> On Sun, May 10, 2015 at 12:45 PM, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
>> This is the point where some people try to suggest some sort of
>> complicated, fragile, DWIM heuristic where the compiler tries to guess
>> whether the user actually wants the default to use early or late binding,
>> based on what the expression looks like. "0 is an immutable int, use
>> early binding; [] is a mutable list, use late binding." sort of thing.
>> Such a thing might work well for the obvious cases, but it would be a
>> bugger to debug and work-around for the non-obvious cases when it guesses
>> wrong -- and it will.
>
> What you could have is "late-binding semantics, optional early binding
> as an optimization but only in cases where the result is
> indistinguishable". That would allow common cases (int/bool/str/None
> literals) to be optimized, since there's absolutely no way for them to
> evaluate differently.
>
> I personally don't think it'd be that good an idea, but it's a simple
> enough rule that it wouldn't break anything.

It's a change in semantics, and it would break code that expects early
binding.


> As far as anyone's code
> is concerned, the rule is "late binding, always".

Sure, other languages have made that choice. I think it is the wrong choice,
but if we went back to 1991 Guido could have made that same choice.


> In fact, that would
> be the language definition; the rest is an optimization. (It's like
> how "x.y()" technically first looks up attribute "y" on object x, then
> calls the result; but it's perfectly reasonable for a Python
> implementation to notice this extremely common case and do an
> "optimized method call" that doesn't actually create a function
> object.)

class X:
   def y(self): pass

y is already a function object.

I think maybe you've got it backwards, and you mean the *method* object
doesn't have to be created. Well, sure, that's possible, and maybe PyPy
does something like that, and maybe it doesn't. Or maybe the function
descriptor __get__ method could cache the result:

# inside FunctionType class
    def __get__(self, instance, type):
        if type is not None:
            if self._method is None:
                self._method = MethodType(self, instance)
            return self._method
        else:
            return self

(I think that's more or less how function __get__ currently works, apart
from the caching. But don't quote me.)

But that's much simpler than the early/late binding example. You talk
about "the obvious cases" like int, bool, str and None. What about floats
and frozensets, are they obvious? How about tuples? How about
MyExpensiveImmutableObject?


> The simpler the rule, the easier to grok, and therefore the
> less chance of introducing bugs.

You're still going to surprise people who expect early binding:

FLAG = True

def spam(eggs=FLAG):
    ...


What do you mean, the default value gets recalculated every time I call
spam? It's an obvious immutable type! And why does Python crash when I
delete FLAG?

Worse:


def factory():
    funcs = []
    for i in range(1, 5):
        def adder(x, y=i):
            return x + y
        adder.__name__ = "adder%d" % i
        funcs.append(adder)
    return funcs


The current behaviour with early binding:


py> funcs = factory()
py> [f(100) for f in funcs]
[101, 102, 103, 104]


What would it do with late binding? That's a tricky one. I can see two
likely results:

[f(100) for f in funcs]
=> returns [104, 104, 104, 104]

or

NameError: name 'i' is not defined


both of which are significantly less useful.

As I've said, it is trivial to get late binding semantics if you start with
early binding: just move setting the default value into the body of the
function. 99% of the time you can use None as a sentinel, so the common
case is easy:

def func(x=None):
    if x is None:
        x = some_complex_calculation(i, want, to, repeat, each, time)


and the rest of the time, you just need *one* persistent variable to hold a
sentinel value to use instead of None:

_sentinel = object()
def func(x=_sentinel, y=_sentinel, z=_sentinel):
    if x is _sentinel: ...


But if you start with late binding, it's hard to *cleanly* get early binding
semantics. You need a separate global for each parameter of every function
in the module:

_default_x = some_complex_calculation(do, it, once)
_default_y = another_complex_calculation(do, it, once)
_default_z = a_third_complex_calculation(do, it, once)
_default_x_for_some_other_function = something_else()


def func(x=_default_x, y=_default_x, z=_default_z):  # oops, see the bug
    ...


which is just hideous. And even then, you don't really have early binding,
you have a lousy simulacra of it. If you modify or delete any of the
default globals, you're screwed.

No, early binding by default is the only sensible solution, and Guido got it
right. Having syntax for late binding would be a bonus, but it isn't really
needed. We already have a foolproof and simple way to evaluate an
expression at function call-time: put it in the body of the function.


--
Steven




functions, optional parameters

Steven D'Aprano-11
In reply to this post by Rustom Mody
On Sun, 10 May 2015 01:35 pm, Rustom Mody wrote:

> On Sunday, May 10, 2015 at 8:16:07 AM UTC+5:30, Steven D'Aprano wrote:
>> I predict that the majority of the time, late binding would just be a
>> pointless waste of time:
>>
>> def process_string(thestr, start=0, end=None, slice=1, reverse=True):
>>     pass
>>
>> Why would you want 0, None, 1 and True to be re-evaluated every time?
>> Admittedly it will be fast, but not as fast as evaluating them once, then
>> grabbing a static default value when needed. (See below for timings.)
>>
>
> And what is the work involved in (re)computing 0, None, 1, True??

Re-computing a constant is about 5 times more expensive than re-using it,
according to my earlier timing tests. So if you have four of them, there
will be about 20 times more overhead due to the defaults, each and every
time you call the function.

Setting the defaults isn't the only source of overhead, but my guestimate is
that switching to late binding would probably double the overall overhead
of calling a function with one or two defaults. If your function is
expensive, that's trivial, but for small fast functions, that will be
painful. Python's slow enough without making it slower for dubious gains.


> If I write (... arg=square_root_of_grahams_number())
> I would expect to pay for it.

Sure, but only once. If you think that Graham's Number is likely to change
*wink* then you can put it into the body of the function, like any other
code you want run every time you call the function.



--
Steven




functions, optional parameters

Chris Angelico
In reply to this post by Steven D'Aprano-11
(To clarify, I am *not* talking about this as a change to Python, so
all questions of backward compatibility are immaterial. This is "what
happens if we go back in time and have Python use late binding
semantics". This is the "alternate 1985" of Back to the Future.)

On Sun, May 10, 2015 at 3:20 PM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:

> On Sun, 10 May 2015 01:33 pm, Chris Angelico wrote:
>> In fact, that would
>> be the language definition; the rest is an optimization. (It's like
>> how "x.y()" technically first looks up attribute "y" on object x, then
>> calls the result; but it's perfectly reasonable for a Python
>> implementation to notice this extremely common case and do an
>> "optimized method call" that doesn't actually create a function
>> object.)
>
> class X:
>    def y(self): pass
>
> y is already a function object.
>
> I think maybe you've got it backwards, and you mean the *method* object
> doesn't have to be created. Well, sure, that's possible, and maybe PyPy
> does something like that, and maybe it doesn't. Or maybe the function
> descriptor __get__ method could cache the result:

Apologies, that was indeed an error of terminology. I did indeed mean
the method object that doesn't have to be created. There is already a
function object (which can be identified as X.y - Py2 differences
needn't concern us here), and AFAIK, a peephole optimizer can
transform this safely:

x = X()
x.y()
# into
x = X()
X.y(x)

That's an optimization that can't possibly change the result (at
least, I'm not aware of a way that it can; I may be wrong), and so
it's a viable change for something like PyPy to do. But semantically,
a bound method object is still created, which means it's fully legal
to split that into two parts:

x = X()
f = x.y
f()

The only result should be that this defeats the optimization, so you
end up paying a greater cost in object (de)allocations.
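[Editor's note: the equivalence Chris relies on, and the bound-method object that the split form materializes, can both be checked in a few lines. A minimal sketch:]

```python
class X:
    def y(self):
        return "called"

x = X()

# The transformed call gives the same result as the original,
# which is why the peephole rewrite is safe:
assert x.y() == X.y(x)

# Splitting attribute lookup from the call forces the bound method
# object into existence -- the very allocation the optimization avoids:
f = x.y
assert f() == "called"
assert f.__self__ is x  # the bound method remembers its instance
```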

> But that's much simpler than the early/late binding example. You talk
> about "the obvious cases" like int, bool, str and None. What about floats
> and frozensets, are they obvious? How about tuples? How about
> MyExpensiveImmutableObject?

Simple: if the optimizer doesn't know about them, they go by the
regular rule. As there's no semantic difference, there cannot be any
true effect beyond performance. Floats can easily be added to the list
I gave; tuples could be, as long as their members are also immutable;
frozenset doesn't have a literal form, nor would
MyExpensiveImmutableObject, so they would miss out on this benefit.

>> The simpler the rule, the easier to grok, and therefore the
>> less chance of introducing bugs.
>
> You're still going to surprise people who expect early binding:
>
> FLAG = True
>
> def spam(eggs=FLAG):
>     ...
>
>
> What do you mean, the default value gets recalculated every time I call
> spam? It's an obvious immutable type! And why does Python crash when I
> delete FLAG?

Still simple: Since late binding is the semantically-mandated
behaviour, this will always reevaluate FLAG - the optimizer has been
bypassed here. It's not an obvious immutable type - the example I
actually gave was "int/bool/str/None *literals*", not *values*. Here's
a non-toy example that would use this kind of flag-lookup semantics
usefully:

default_timeout = 60 # seconds

def url_get(url, timeout=default_timeout):
    """Perform a GET request and return the data"""

def url_post(url, body, timeout=default_timeout):
    """Perform a POST request and return the data"""

def dns_lookup(server, name, type="A", class_="IN", timeout=default_timeout):
    """Send a DNS request and await a response"""


By changing modulename.default_timeout, you instantly change all of
the functions' defaults. In current Python, this would have to be done
as:

def url_get(url, timeout=None):
    if timeout is None: timeout = default_timeout

which duplicates that code down all of them, and it means that
introspection of the function can't show you what it's actually doing.
With late binding, an introspection could yield both the expression
used ("default_timeout") and, with evaluation, the effective default.

Now, this is a rarity. This is far FAR less common than the situations
where early binding is better. But there are places where it would
make sense.
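[Editor's note: the early/late contrast described above is easy to demonstrate in current Python; the names below are invented for illustration:]

```python
default_timeout = 60  # seconds

def url_get(url, timeout=default_timeout):
    # Early binding: the value 60 was captured when the def ran.
    return timeout

def url_get_late(url, timeout=None):
    # Simulated late binding: the global is looked up on every call.
    if timeout is None:
        timeout = default_timeout
    return timeout

default_timeout = 30  # change the module-level default afterwards

print(url_get("http://example.com"))       # still 60
print(url_get_late("http://example.com"))  # 30 -- tracks the global
```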

> Worse:
>
>
> def factory():
>     funcs = []
>     for i in range(1, 5):
>         def adder(x, y=i):
>             return x + y
>         adder.__name__ = "adder%d" % i
>         funcs.append(adder)
>     return funcs
>
>
> The current behaviour with early binding:
>
>
> py> funcs = factory()
> py> [f(100) for f in funcs]
> [101, 102, 103, 104]
>
>
> What would it do with late binding? That's a tricky one. I can see two
> likely results:
>
> [f(100) for f in funcs]
> => returns [104, 104, 104, 104]
>
> or
>
> NameError: name 'i' is not defined
>
>
> both of which are significantly less useful.

I'd say the former makes more sense - it's what would happen if you
evaluated the expression "i" in the context of that factory function.
But yes, significantly less useful than early binding; I'm not sure
how to cleanly implement that kind of metaprogramming otherwise.
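[Editor's note: the `[104, 104, 104, 104]` outcome can be previewed in today's Python simply by dropping the default and letting the closure look `i` up at call time:]

```python
def factory():
    funcs = []
    for i in range(1, 5):
        def adder(x):
            # No default: i is resolved when adder is *called*,
            # by which time the loop has finished and i == 4.
            return x + i
        funcs.append(adder)
    return funcs

print([f(100) for f in factory()])  # [104, 104, 104, 104]
```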

> As I've said, it is trivial to get late binding semantics if you start with
> early binding: just move setting the default value into the body of the
> function. 99% of the time you can use None as a sentinel, so the common
> case is easy:
>
> def func(x=None):
>     if x is None:
>         x = some_complex_calculation(i, want, to, repeat, each, time)
>
>
> and the rest of the time, you just need *one* persistent variable to hold a
> sentinel value to use instead of None:
>
> _sentinel = object
> def func(x=_sentinel, y=_sentinel, z=_sentinel):
>     if x is _sentinel: ...

Presumably that would instantiate an object() rather than using the
object type itself, but yes. Sometimes it'd be nice to be able to get
something with a more useful repr, but that's not a big deal.

> But if you start with late binding, it's hard to *cleanly* get early binding
> semantics. You need a separate global for each parameter of every function
> in the module:
>
> _default_x = some_complex_calculation(do, it, once)
> _default_y = another_complex_calculation(do, it, once)
> _default_z = a_third_complex_calculation(do, it, once)
> _default_x_for_some_other_function = something_else()
>
>
> def func(x=_default_x, y=_default_x, z=_default_z):  # oops, see the bug
>     ...

Yup, I see it... but I quite probably wouldn't if your variable names
were less toyish. Even as it is, the important info is getting lost in
this sea of "=_default_" that keeps having to be repeated.

> No, early binding by default is the only sensible solution, and Guido got it
> right. Having syntax for late binding would be a bonus, but it isn't really
> needed. We already have a foolproof and simple way to evaluate an
> expression at function call-time: put it in the body of the function.

I agree. As I said at the top, this is all just what happens if Biff
is in charge instead of Guido. It's not instantly internally
inconsistent, but it is a lot less useful than the early binding we
currently have.

The advantage of a late-binding syntax is that it could be visible in
the function signature, instead of being buried inside. If we had
something like this:

def print_list(lst, start=0, end==len(lst)):
    """Print out some or all elements of a given list"""

then it'd be obvious that the one-arg behaviour is to print out the
whole list; otherwise, you'd see None up there, and have to presume
that it means "to end of list". But syntax has to justify itself with
a lot more than uber-rare cases like these.

ChrisA



functions, optional parameters

Dave Angel-4
In reply to this post by Chris Angelico
On 05/09/2015 11:33 PM, Chris Angelico wrote:

> On Sun, May 10, 2015 at 12:45 PM, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
>> This is the point where some people try to suggest some sort of complicated,
>> fragile, DWIM heuristic where the compiler tries to guess whether the user
>> actually wants the default to use early or late binding, based on what the
>> expression looks like. "0 is an immutable int, use early binding; [] is a
>> mutable list, use late binding." sort of thing. Such a thing might work
>> well for the obvious cases, but it would be a bugger to debug and
>> work-around for the non-obvious cases when it guesses wrong -- and it will.
>
> What you could have is "late-binding semantics, optional early binding
> as an optimization but only in cases where the result is
> indistinguishable". That would allow common cases (int/bool/str/None
> literals) to be optimized, since there's absolutely no way for them to
> evaluate differently.
>

Except for literals, True, False and None, I can't see any way to
optimize such a thing.  Just because the name on the right side
references an immutable object at compile time, it doesn't follow that
it'll still be the same object later.

Unless late binding means something very different than I understood.

--
DaveA



functions, optional parameters

Chris Angelico
On Sun, May 10, 2015 at 9:25 PM, Dave Angel <davea at davea.name> wrote:

> On 05/09/2015 11:33 PM, Chris Angelico wrote:
>> What you could have is "late-binding semantics, optional early binding
>> as an optimization but only in cases where the result is
>> indistinguishable". That would allow common cases (int/bool/str/None
>> literals) to be optimized, since there's absolutely no way for them to
>> evaluate differently.
>>
>
> Except for literals, True, False and None, I can't see any way to optimize
> such a thing.

I did specifically say "literals" :)

ChrisA