PEP: New timestamp formats

Victor STINNER
Even if I am not really convinced that a PEP helps to design an API,
here is a draft of a PEP to add new timestamp formats to Python 3.3.
Don't see the draft as a final proposition, it is just a document
supposed to help the discussion :-)

---

PEP: xxx
Title: New timestamp formats
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <[hidden email]>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 01-February-2012
Python-Version: 3.3


Abstract
========

Python 3.3 introduced functions supporting nanosecond resolutions. Python 3.3
only supports int or float to store timestamps, but these types cannot be used
to store a timestamp with nanosecond resolution.


Motivation
==========

Python 2.3 introduced float timestamps to support subsecond resolutions, and
os.stat() has used float timestamps by default since Python 2.5. Python 3.3
introduced functions supporting nanosecond resolutions:

 * os.stat()
 * os.utimensat()
 * os.futimens()
 * time.clock_gettime()
 * time.clock_getres()
 * time.wallclock() (reuse time.clock_gettime(time.CLOCK_MONOTONIC))

The problem is that 64-bit floats are unable to store nanoseconds (10^-9)
for timestamps bigger than 2^24 seconds (194 days 4 hours: 1970-07-14 for an
Epoch timestamp) without losing precision.

.. note::
   A 64-bit float starts to lose precision at microsecond (10^-6) resolution
   for timestamps bigger than 2^33 seconds (272 years: 2242-03-16 for an Epoch
   timestamp).
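
As a rough illustration of the limits above (a sketch, not part of the draft:
``float_resolution`` is a hypothetical helper, and ``math.ulp`` requires
Python 3.9 or later), the gap between adjacent 64-bit floats can be measured
directly:

```python
import math

def float_resolution(seconds):
    """Return the gap between `seconds` and the next larger 64-bit float."""
    return math.ulp(float(seconds))

# Near 2**24 seconds the gap is 2**-28 s (about 3.7 ns), so adding one
# nanosecond is rounded away.
assert float_resolution(2**24) == 2.0**-28
assert 2.0**24 + 1e-9 == 2.0**24

# A timestamp from early 2012 (about 1.3e9 s) lies between 2**30 and 2**31,
# where the gap is 2**-22 s (about 238 ns).
assert float_resolution(1328140800) == 2.0**-22
```

The gap doubles each time the integer part needs one more bit, which is why
the loss grows with the distance from the epoch.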


Timestamp formats
=================

Choose a new format for nanosecond resolution
---------------------------------------------

To support nanosecond resolution, four formats were considered:

 * 128 bits float
 * decimal.Decimal
 * datetime.datetime
 * tuple of integers

Criteria
--------

It should be possible to do arithmetic, for example::

    t1 = time.time()
    # ...
    t2 = time.time()
    dt = t2 - t1

Two timestamps should be comparable (t2 > t1).

The format should have a resolution of at least 1 nanosecond (without losing
precision). It is better if the format can have an arbitrary resolution.

128 bits float
--------------

Add a new IEEE 754-2008 quad-precision float type. The IEEE 754-2008 quad
precision float has 1 sign bit, 15 bits of exponent and 112 bits of mantissa.

128 bits float is supported by GCC (4.3), Clang and ICC. The problem is that
Visual C++ 2008 doesn't support it. Python must be portable and so cannot rely
on a type only available on some platforms. Another example: GCC 4.3 does not
support __float128 in 32-bit mode on x86 (but gcc 4.4 does).

Intel CPUs have an FPU supporting 80-bit floats, but not using SSE
instructions. Other CPU vendors don't support this float size.

There is also a license issue: GCC uses the MPFR library which is distributed
under the GNU LGPL license. This license is incompatible with the Python
Software License.

datetime.datetime
-----------------

datetime.datetime only supports microsecond resolution, but can be enhanced
to support nanosecond.

datetime.datetime has issues:

- there is no easy way to convert it into "seconds since the epoch"
- any broken-down time has issues of time stamp ordering in the
  duplicate hour of switching from DST to normal time
- time zone support is flaky-to-nonexistent in the datetime module

decimal.Decimal
---------------

The decimal module is implemented in Python and is not really fast.

Using Decimal by default would cause a bootstrap issue because the module is
implemented in Python.

Decimal can store a timestamp with any resolution, not only nanosecond: the
resolution is configurable at runtime.

Decimal objects support all arithmetic operations and can be mixed with int.
They can be compared with float, but arithmetic mixing Decimal and float
raises TypeError, so a float must be converted explicitly first.

The decimal module is slow, but there is a C reimplementation of the decimal
module which is almost ready for inclusion.
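
A minimal sketch of how Decimal meets the criteria listed earlier (the
timestamp values are made up for illustration; only the standard ``decimal``
module is used):

```python
from decimal import Decimal

# Two hypothetical timestamps one nanosecond apart.
t1 = Decimal("1328140800.123456789")
t2 = Decimal("1328140800.123456790")

dt = t2 - t1                    # exact difference: one nanosecond
later = t1 + 60                 # arithmetic with int works
is_after = t2 > 1328140800.0    # comparison with float works

try:
    t1 + 0.5                    # arithmetic with float is rejected
except TypeError:
    mixed_float_ok = False
else:
    mixed_float_ok = True
```

The subtraction keeps every digit, whereas the same two values stored as
64-bit floats would compare equal.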

tuple
-----

Various kinds of tuples have been proposed. All proposals use only integers:

 * a) (sec, nsec): C timespec structure, useful for os.futimens() for example
 * b) (sec, floatpart, exponent): value = sec + floatpart * 10**exponent
 * c) (sec, floatpart, divisor): value = sec + floatpart / divisor

The format (a) only supports nanosecond resolution.

Formats (a) and (b) may lose precision if the clock divisor is not a
power of 10.

Format (c) should be enough for most cases.

Creating a tuple of integers is fast.

Arithmetic operations cannot be done directly on tuples: t2-t1 doesn't work,
for example.
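
To make the three layouts concrete, here is a sketch that converts each one to
an exact ``fractions.Fraction``. The helper names are hypothetical and the
sample values are invented; the formulas are the ones given in the list above:

```python
from fractions import Fraction

def from_timespec(sec, nsec):
    """Format (a): (sec, nsec), as in the C timespec structure."""
    return Fraction(sec) + Fraction(nsec, 10**9)

def from_exponent(sec, floatpart, exponent):
    """Format (b): value = sec + floatpart * 10**exponent."""
    return Fraction(sec) + floatpart * Fraction(10) ** exponent

def from_divisor(sec, floatpart, divisor):
    """Format (c): value = sec + floatpart / divisor."""
    return Fraction(sec) + Fraction(floatpart, divisor)

t1 = (10, 500000000)
t2 = (12, 250000000)

# Tuples cannot be subtracted directly ...
try:
    t2 - t1
    tuple_arithmetic_works = True
except TypeError:
    tuple_arithmetic_works = False

# ... so they have to be converted to a number first.
dt = from_timespec(*t2) - from_timespec(*t1)

# One tick of a 3 Hz clock is exact in format (c), but has no finite
# decimal expansion, so formats (a) and (b) can only approximate it.
one_tick = from_divisor(0, 1, 3)
```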

Final formats
-------------

The PEP proposes to provide 5 different timestamp formats:

 * numbers:

   * int
   * float
   * decimal.Decimal
   * datetime.timedelta

 * broken-down time:

   * datetime.datetime


API design
==========

Change the default result type
------------------------------

Python 2.3 introduced os.stat_float_times(). The problem is that this flag
is global, and so may break libraries if the application changes the type.

Changing the default result type would break backward compatibility.

Callback and creating a new module to convert timestamps
--------------------------------------------------------

Use a callback taking integers to create a timestamp. Example with float::

    def timestamp_to_float(seconds, floatpart, divisor):
        return seconds + floatpart / divisor

The time module can provide some builtin converters, and other modules, like
datetime, can provide their own converters. Users can define their own types.

An alternative is to add a new module for all functions converting timestamps.

The problem is that we have to design the API of the callback and we cannot
change it later. Future needs may require passing more information to the
callback.
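
A sketch of how such callbacks could be plugged in, assuming the
(seconds, floatpart, divisor) signature shown above. ``convert_raw_clock`` and
the Decimal and Fraction converters are hypothetical names, not an existing or
proposed API:

```python
from decimal import Decimal
from fractions import Fraction

def timestamp_to_float(seconds, floatpart, divisor):
    return seconds + floatpart / divisor

def timestamp_to_decimal(seconds, floatpart, divisor):
    return Decimal(seconds) + Decimal(floatpart) / Decimal(divisor)

def timestamp_to_fraction(seconds, floatpart, divisor):
    return Fraction(seconds) + Fraction(floatpart, divisor)

def convert_raw_clock(ticks, ticks_per_second, callback=timestamp_to_float):
    """Split a raw tick count into integers and hand them to the callback."""
    seconds, floatpart = divmod(ticks, ticks_per_second)
    return callback(seconds, floatpart, ticks_per_second)

# A made-up nanosecond counter value.
raw = 1328140800123456789
as_float = convert_raw_clock(raw, 10**9)
as_decimal = convert_raw_clock(raw, 10**9, timestamp_to_decimal)
as_fraction = convert_raw_clock(raw, 10**9, timestamp_to_fraction)
```

The clock code only ever produces integers; whether precision is kept is
decided entirely by the converter the caller passes in.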

os.stat: add new fields
-----------------------

It was proposed to add 3 fields to os.stat() structure to get nanoseconds of
timestamps.

Add an argument to change the result type
-----------------------------------------

Add an argument to all functions creating timestamps, like time.time(), to
change their result type. It was first proposed to use a string argument,
e.g. time.time(format="decimal"). The problem is that the function then has
to import a module internally. It was then decided to pass the type
directly, e.g. time.time(format=decimal.Decimal). With a type, the user has
to import the module first. There is no direct link between a type and the
function used to create the timestamp.

By default, the float type is used to keep backward compatibility. For stat
functions like os.stat(), the default type depends on os.stat_float_times().
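
A sketch of the type-based dispatch this implies, limited to int, float and
Decimal. ``make_timestamp`` is a hypothetical stand-in that takes the raw
nanosecond count as input so the example stays deterministic; it is not the
patch under discussion:

```python
from decimal import Decimal

def make_timestamp(nanoseconds, format=float):
    """Convert a nanosecond count to the requested timestamp type."""
    if format is float:
        return nanoseconds / 10**9          # default, may round
    if format is int:
        return nanoseconds // 10**9         # whole seconds only
    if format is Decimal:
        return Decimal(nanoseconds) / 10**9  # keeps every digit
    raise ValueError("unsupported timestamp format: %r" % (format,))

raw = 1328140800123456789  # made-up clock reading
```

Because the default stays float, existing callers that pass no argument see no
change in behaviour.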

Add new functions
-----------------

Add new functions for each type, examples:

 * time.time_decimal()
 * os.stat_decimal()
 * os.stat_datetime()
 * etc.


Changes
=======

 * Add *format* optional argument to time.clock(), time.clock_gettime(),
   time.clock_getres(), time.time() and time.wallclock().
 * Add *timestamp* optional argument to os.fstat(), os.fstatat(), os.lstat()
   and os.stat().

Functions accepting a timestamp as input should support decimal.Decimal
objects without an internal conversion to float, which may lose precision:

 * datetime.datetime.fromtimestamp()
 * time.localtime()
 * time.gmtime()

TODO:

 * Change os.utimensat() and os.futimens() to accept Decimal
 * Change os.utimensat() and os.futimens() to not accept tuple anymore
 * Drop os.utimensat() and os.futimens() and patch os.utimeat() instead?
 * datetime should maybe support nanosecond?


Backwards Compatibility
=======================

The changes only add a new optional argument. The default type is unchanged
and there is no impact on performance.


Links
=====

 * `Issue #11457: os.stat(): add new fields to get timestamps as
   Decimal objects with nanosecond resolution
   <http://bugs.python.org/issue11457>`_
 * `Issue #13882: Add format argument for time.time(), time.clock(),
   ... to get a timestamp as a Decimal object
   <http://bugs.python.org/issue13882>`_
 * `[Python-Dev] Store timestamps as decimal.Decimal objects
   <http://mail.python.org/pipermail/python-dev/2012-January/116025.html>`_


Copyright
=========

This document has been placed in the public domain.
_______________________________________________
Python-Dev mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%2B1324100855712-1801473%40n6.nabble.com

Re: PEP: New timestamp formats

Nick Coghlan
On Thu, Feb 2, 2012 at 11:03 AM, Victor Stinner
<[hidden email]> wrote:
> Even if I am not really convinced that a PEP helps to design an API,
> here is a draft of a PEP to add new timestamp formats to Python 3.3.
> Don't see the draft as a final proposition, it is just a document
> supposed to help the discussion :-)

Helping keep a discussion on track (and avoiding rehashing old ground)
is precisely why the PEP process exists. Thanks for writing this up :)

> ---
>
> PEP: xxx
> Title: New timestamp formats
> Version: $Revision$
> Last-Modified: $Date$
> Author: Victor Stinner <[hidden email]>
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 01-February-2012
> Python-Version: 3.3
>
>
> Abstract
> ========
>
> Python 3.3 introduced functions supporting nanosecond resolutions. Python 3.3
> only supports int or float to store timestamps, but these types cannot be used
> to store a timestamp with nanosecond resolution.
>
>
> Motivation
> ==========
>
> Python 2.3 introduced float timestamps to support subsecond resolutions,
> os.stat() uses float timestamps by default since Python 2.5. Python 3.3
> introduced functions supporting nanosecond resolutions:
>
>  * os.stat()
>  * os.utimensat()
>  * os.futimens()
>  * time.clock_gettime()
>  * time.clock_getres()
>  * time.wallclock() (reuse time.clock_gettime(time.CLOCK_MONOTONIC))
>
> The problem is that floats of 64 bits are unable to store nanoseconds (10^-9)
> for timestamps bigger than 2^24 seconds (194 days 4 hours: 1970-07-14 for an
> Epoch timestamp) without losing precision.
>
> .. note::
>   A 64-bit float starts to lose precision at microsecond (10^-6) resolution
>   for timestamp bigger than 2^33 seconds (272 years: 2242-03-16 for an Epoch
>   timestamp).
>
>
> Timestamp formats
> =================
>
> Choose a new format for nanosecond resolution
> ---------------------------------------------
>
> To support nanosecond resolution, four formats were considered:
>
>  * 128 bits float
>  * decimal.Decimal
>  * datetime.datetime
>  * tuple of integers

I'd add datetime.timedelta to this list. It's exactly what timestamps
are, after all - the difference between the current time and the
relevant epoch value.

> Various kinds of tuples have been proposed. All proposals use only integers:
>
>  * a) (sec, nsec): C timespec structure, useful for os.futimens() for example
>  * b) (sec, floatpart, exponent): value = sec + floatpart * 10**exponent
>  * c) (sec, floatpart, divisor): value = sec + floatpart / divisor
>
> The format (a) only supports nanosecond resolution.
>
> Formats (a) and (b) may lose precision if the clock divisor is not a
> power of 10.
>
> Format (c) should be enough for most cases.

Format (b) only loses precision if the exponent chosen for a given
value is too small relative to the precision of the underlying timer
(it's the same as using decimal.Decimal in that respect). The problem
with (a) is that it simply cannot represent times with greater than
nanosecond precision. Since we have the opportunity, we may as well
deal with the precision question once and for all.

Alternatively, you could return a 4-tuple that specifies the base in
addition to the exponent.

> Callback and creating a new module to convert timestamps
> --------------------------------------------------------
>
> Use a callback taking integers to create a timestamp. Example with float:
>
>    def timestamp_to_float(seconds, floatpart, divisor):
>        return seconds + floatpart / divisor
>
> The time module can provide some builtin converters, and other modules, like
> datetime, can provide their own converters. Users can define their own types.
>
> An alternative is to add a new module for all functions converting timestamps.
>
> The problem is that we have to design the API of the callback and we cannot
> change it later. We may need more information for future needs later.

I'd be more specific here - either of the 3-tuple options already
presented in the PEP, or the 4-tuple option I mentioned above, would
be suitable as the signature of an arbitrary precision callback API
that assumes timestamps are always expressed as "seconds since a
particular epoch value". Such an API could only become limiting if
timestamps ever become something other than "the difference in time
between right now and the relevant epoch value", and that's a
sufficiently esoteric possibility that it really doesn't seem
worthwhile to take it into account. The past problems with timestamp
APIs have all related to increases in precision, not timestamps being
redefined as something radically different.

The PEP should also mention PJE's suggestion of creating a new named
protocol specifically for the purpose (with a signature based on one
of the proposed tuple formats), such that you could simply write:

    time.time()  # output=float by default
    time.time(output=float)
    time.time(output=int)
    time.time(output=fractions.Fraction)
    time.time(output=decimal.Decimal)
    time.time(output=datetime.timedelta)
    time.time(output=datetime.datetime)
    # (and similarly for os.stat with a timestamp=type parameter)

Rather than being timestamp specific, such a protocol would be a
general numeric protocol. If (integer, numerator, denominator) is used
(i.e. a "mixed number" in mathematical terms), then "__from_mixed__"
would be an appropriate name. If (integer, fractional, exponent) is
used (i.e. a fixed point notation), then "__from_fixed__" would work.

    # Algorithm for a "from mixed numbers" protocol, assuming division
    # doesn't lose precision...
    def __from_mixed__(cls, integer, numerator, denominator):
        return cls(integer) + cls(numerator) / cls(denominator)

    # Algorithm for a "from fixed point" protocol, assuming negative
    # exponents don't lose precision...
    def __from_fixed__(cls, integer, mantissa, base, exponent):
        return cls(integer) + cls(mantissa) * cls(base) ** cls(exponent)

From a *usage* point of view, this idea is actually the same as the
proposal currently in the PEP. The difference is that instead of
adding custom support for a few particular types directly to time and
os, it instead defines a more general purpose protocol that covers not
only this use case, but also any other situation where high precision
fractions are relevant.

One interesting question with a named protocol approach is whether
such a protocol should *require* explicit support, or if it should
fall back to the underlying mathematical operations. Since the
conversions to float and int in the timestamp case are already known
to be lossy, permitting lossy conversion via the mathematical
equivalents seems reasonable, suggesting possible protocol definitions
as follows:

    # Algorithm for a potentially precision-losing "from mixed numbers" protocol
    def from_mixed(cls, integer, numerator, denominator):
        try:
            factory = cls.__from_mixed__
        except AttributeError:
            return cls(integer) + cls(numerator) / cls(denominator)
        return factory(integer, numerator, denominator)

    # Algorithm for a potentially lossy "from fixed point" protocol
    def from_fixed(cls, integer, mantissa, base, exponent):
        try:
            factory = cls.__from_fixed__
        except AttributeError:
            return cls(integer) + cls(mantissa) * cls(base) ** cls(exponent)
        return factory(integer, mantissa, base, exponent)
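
A runnable sketch of the fallback variant above, to show both paths being
taken. ``Nanoseconds`` is a hypothetical type invented for the example; float
and ``fractions.Fraction`` do not define ``__from_mixed__``, so they go
through the arithmetic fallback:

```python
from fractions import Fraction

def from_mixed(cls, integer, numerator, denominator):
    """Use cls.__from_mixed__ when present, else plain arithmetic."""
    try:
        factory = cls.__from_mixed__
    except AttributeError:
        return cls(integer) + cls(numerator) / cls(denominator)
    return factory(integer, numerator, denominator)

class Nanoseconds(int):
    """Hypothetical type storing a timestamp as whole nanoseconds."""

    @classmethod
    def __from_mixed__(cls, integer, numerator, denominator):
        return cls(integer * 10**9 + numerator * 10**9 // denominator)

as_float = from_mixed(float, 1, 1, 2)                # fallback path
as_fraction = from_mixed(Fraction, 0, 10**8, 2**32)  # fallback path, exact
as_ns = from_mixed(Nanoseconds, 1, 1, 2)             # explicit protocol
```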

> os.stat: add new fields
> -----------------------
>
> It was proposed to add 3 fields to os.stat() structure to get nanoseconds of
> timestamps.

It's worth noting that the challenge with this is that it's
potentially time consuming to populate the extra fields, and that
this approach doesn't help with the time APIs that return timestamps
directly.

> Add an argument to change the result type
> -----------------------------------------
>
> Add an argument to all functions creating timestamps, like time.time(), to
> change their result type. It was first proposed to use a string argument,
> e.g. time.time(format="decimal"). The problem is that the function has
> to import internally a module. Then it was decided to pass directly the
> type, e.g. time.time(format=decimal.Decimal). Using a type, the user has
> first to import the module. There is no direct link between a type and the
> function used to create the timestamp.
>
> By default, the float type is used to keep backward compatibility. For stat
> functions like os.stat(), the default type depends on os.stat_float_times().

There should also be a description of the "set a boolean flag to
request high precision output" approach.

Cheers,
Nick.

--
Nick Coghlan   |   [hidden email]   |   Brisbane, Australia

Re: PEP: New timestamp formats

Paul Moore
On 2 February 2012 03:47, Nick Coghlan <[hidden email]> wrote:

> Rather than being timestamp specific, such a protocol would be a
> general numeric protocol. If (integer, numerator, denominator) is used
> (i.e. a "mixed number" in mathematical terms), then "__from_mixed__"
> would be an appropriate name. If (integer, fractional, exponent) is
> used (i.e. a fixed point notation), then "__from_fixed__" would work.
>
>    # Algorithm for a "from mixed numbers" protocol, assuming division
>    # doesn't lose precision...
>    def __from_mixed__(cls, integer, numerator, denominator):
>        return cls(integer) + cls(numerator) / cls(denominator)
>
>    # Algorithm for a "from fixed point" protocol, assuming negative
>    # exponents don't lose precision...
>    def __from_fixed__(cls, integer, mantissa, base, exponent):
>        return cls(integer) + cls(mantissa) * cls(base) ** cls(exponent)
>
> From a *usage* point of view, this idea is actually the same as the
> proposal currently in the PEP. The difference is that instead of
> adding custom support for a few particular types directly to time and
> os, it instead defines a more general purpose protocol that covers not
> only this use case, but also any other situation where high precision
> fractions are relevant.
>
> One interesting question with a named protocol approach is whether
> such a protocol should *require* explicit support, or if it should
> fall back to the underlying mathematical operations. Since the
> conversions to float and int in the timestamp case are already known
> to be lossy, permitting lossy conversion via the mathematical
> equivalents seems reasonable, suggesting possible protocol definitions
> as follows:
>
>    # Algorithm for a potentially precision-losing "from mixed numbers" protocol
>    def from_mixed(cls, integer, numerator, denominator):
>        try:
>            factory = cls.__from_mixed__
>        except AttributeError:
>            return cls(integer) + cls(numerator) / cls(denominator)
>        return factory(integer, numerator, denominator)
>
>    # Algorithm for a potentially lossy "from fixed point" protocol
>    def from_fixed(cls, integer, mantissa, base, exponent):
>        try:
>            factory = cls.__from_fixed__
>        except AttributeError:
>            return cls(integer) + cls(mantissa) * cls(base) ** cls(exponent)
>        return factory(integer, mantissa, base, exponent)

The key problem with a protocol is that the implementer has to make
these decisions. The callback approach defers that decision to the end
user. After all, the end user is the one who knows for his app whether
precision loss is acceptable.

You could probably also have a standard named protocol which can be
used as a callback in straightforward cases:

    time.time(callback=timedelta.__from_mixed__)

That's wordy, and a bit ugly, though. The callback code could
special-case types and look for __from_mixed__, I guess. Or use an
ABC, and have the code that uses the callback do

    if issubclass(cb, MixedNumberABC):
        return cb.__from_mixed__(whole, num, den)
    else:
        return cb(whole, num, den)

(The second branch is the one that allows the user to override the
predefined types that work - if you omit that, you're back to a named
protocol and ABCs don't gain you much beyond documentation).
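
A sketch of that dispatch using the standard ``abc`` module. ``MixedNumberABC``
is the name used above; ``ExactTimestamp`` and ``apply_callback`` are
hypothetical. One detail the snippet above glosses over: ``issubclass()``
raises TypeError when given a plain function, so the sketch checks for a class
first:

```python
from abc import ABC, abstractmethod
from fractions import Fraction

class MixedNumberABC(ABC):
    @classmethod
    @abstractmethod
    def __from_mixed__(cls, whole, num, den):
        raise NotImplementedError

class ExactTimestamp(MixedNumberABC):
    """Hypothetical type that keeps the value as an exact fraction."""

    def __init__(self, value):
        self.value = value

    @classmethod
    def __from_mixed__(cls, whole, num, den):
        return cls(Fraction(whole) + Fraction(num, den))

def apply_callback(cb, whole, num, den):
    # Types that opted in to the protocol are built through it ...
    if isinstance(cb, type) and issubclass(cb, MixedNumberABC):
        return cb.__from_mixed__(whole, num, den)
    # ... anything else is treated as a plain user callback.
    return cb(whole, num, den)

via_protocol = apply_callback(ExactTimestamp, 1, 1, 4)
via_function = apply_callback(lambda w, n, d: w + n / d, 1, 1, 4)
```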

Part of me feels that there's a use case for generic functions in
here, but maybe not (as it's overloading on the return type). Let's
not open that discussion again, though.

Paul.

Re: PEP: New timestamp formats

Victor STINNER
In reply to this post by Nick Coghlan
> I'd add datetime.timedelta to this list. It's exactly what timestamps
> are, after all - the difference between the current time and the
> relevant epoch value.

Ah yes, I forgot to mention it, even though it is listed in the "final
timestamp formats list" :-)

>>  * a) (sec, nsec): C timespec structure, useful for os.futimens() for example
>>  * b) (sec, floatpart, exponent): value = sec + floatpart * 10**exponent
>>  * c) (sec, floatpart, divisor): value = sec + floatpart / divisor
>>
>> Formats (a) and (b) may lose precision if the clock divisor is not a
>> power of 10.
>
> Format (b) only loses precision if the exponent chosen for a given
> value is too small relative to the precision of the underlying timer
> (it's the same as using decimal.Decimal in that respect).

Let's take an NTP timestamp in format (c): (sec=0,
floatpart=100000000, divisor=2**32):

    >>> from decimal import Decimal
    >>> Decimal(100000000) * Decimal(10)**-10
    Decimal('0.0100000000')
    >>> Decimal(100000000) / Decimal(2)**32
    Decimal('0.023283064365386962890625')

You have an error of 57%. Or do you mean that not only 2**32 should be
modified, but also 100000000? How do you adapt 100000000 (floatpart)
when changing the divisor (2**32 => 10**-10)? The format (c) avoids an
operation (base^exponent) and avoids losing precision.

There is the same issue with QueryPerformanceFrequency and
QueryPerformanceCounter, used by time.clock(): the frequency is not a
power of any base.

I forgot to mention another advantage of (c), used by my patch for the
Decimal format: you can get the exact resolution of the clock
directly: 1/divisor. It works for any divisor (not only
base^exponent).

By the way, the format (c) can be simplified as a fraction:
(numerator, denominator) using (seconds * divisor + floatpart,
divisor). But this format is less practical to implement a function
creating a timestamp.
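
A small sketch of the two points above, using ``fractions.Fraction`` for exact
arithmetic (``to_fraction`` and ``resolution`` are hypothetical helper names;
the NTP values are the ones from the example):

```python
from fractions import Fraction

def to_fraction(seconds, floatpart, divisor):
    """Collapse format (c) into a single (numerator, denominator) pair."""
    return Fraction(seconds * divisor + floatpart, divisor)

def resolution(divisor):
    """The exact clock resolution is simply 1/divisor."""
    return Fraction(1, divisor)

ntp = to_fraction(0, 100000000, 2**32)

# Reading the same floatpart with a base-10 exponent of -10 instead of
# the real divisor gives a different value altogether.
misread = Fraction(100000000, 10**10)
relative_error = (ntp - misread) / ntp
```

``float(relative_error)`` comes out at roughly 0.57, which is the 57% error
mentioned above.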

>> Callback and creating a new module to convert timestamps
> (...)
> Such an API could only become limiting if
> timestamps ever become something other than "the difference in time
> between right now and the relevant epoch value", and that's a
> sufficiently esoteric possibility that it really doesn't seem
> worthwhile to take it into account.

It may be interesting to support a different start date (other than
1970.1.1), if we choose to support broken-down timestamps (e.g.
datetime.datetime).

> The PEP should also mention PJE's suggestion of creating a new named
> protocol specifically for the purpose (with a signature based on one
> of the proposed tuple formats) (...)

Ok, I will add it.

> Rather than being timestamp specific, such a protocol would be a
> general numeric protocol. If (integer, numerator, denominator) is used
> (i.e. a "mixed number" in mathematical terms), then "__from_mixed__"
> would be an appropriate name. If (integer, fractional, exponent) is
> used (i.e. a fixed point notation), then "__from_fixed__" would work.
>
>    # Algorithm for a "from mixed numbers" protocol, assuming division
>    # doesn't lose precision...
>    def __from_mixed__(cls, integer, numerator, denominator):
>        return cls(integer) + cls(numerator) / cls(denominator)

Even if I like the idea, I don't think that we need all this machinery
to support nanosecond resolution. I should maybe forget my idea of
using datetime.datetime or datetime.timedelta, and only support
int, float and decimal.Decimal.

datetime.datetime and datetime.timedelta are already compatible with
Decimal (except that they may lose precision because of an internal
conversion to float): datetime.datetime.fromtimestamp(t) and
datetime.timedelta(seconds=t).

If we only support int, float and Decimal, we don't need to add a new
protocol, hardcoded functions are enough :-)

>> os.stat: add new fields
>> -----------------------
>>
>> It was proposed to add 3 fields to os.stat() structure to get nanoseconds of
>> timestamps.
>
> It's worth noting that the challenge with this is that it's
> potentially time consuming to populate the extra fields, and that
> this approach doesn't help with the time APIs that return timestamps
> directly.

New fields can be optional (add a flag to get them), but I don't like
the idea of a structure with a variable number of fields, especially
because os.stat() structure can be used as a tuple (get a field by its
index).

Patching os.stat() doesn't solve the problem for the time module anyway.

>> Add an argument to change the result type
>> -----------------------------------------
>
> There should also be a description of the "set a boolean flag to
> request high precision output" approach.

You mean something like: time.time(hires=True)? Or time.time(decimal=True)?

Victor

Re: PEP: New timestamp formats

Paul Moore
On 2 February 2012 12:16, Victor Stinner <[hidden email]> wrote:

> Let's take an NTP timestamp in format (c): (sec=0,
> floatpart=100000000, divisor=2**32):
>
>>>> Decimal(100000000) * Decimal(10)**-10
> Decimal('0.0100000000')
>>>> Decimal(100000000) / Decimal(2)**32
> Decimal('0.023283064365386962890625')
>
> You have an error of 57%. Or do you mean that not only 2**32 should be
> modified, but also 100000000? How do you adapt 100000000 (floatpart)
> when changing the divisor (2**32 => 10**-10)? The format (c) avoids an
> operation (base^exponent) and avoids losing precision.

Am I missing something? If you're using the fixed point form
(fraction, exponent) then 0.023283064365386962890625 would be written
as (23283064365386962890625, -24). Same precision as the (100000000,
base=2, exponent=32) format.
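
A quick exact check of that equivalence, as a sketch with
``fractions.Fraction``. Counting the 24 decimal places of
0.023283064365386962890625 gives a base-10 exponent of -24, and the mantissa
turns out to be 5**32:

```python
from fractions import Fraction

# The binary form: 100000000 / 2**32.
binary_form = Fraction(100000000, 2**32)

# The same value as a base-10 (mantissa, exponent) pair.
mantissa, exponent = 5**32, -24
decimal_form = mantissa * Fraction(10) ** exponent

same_value = binary_form == decimal_form
```

The conversion is exact here because the divisor is a power of 2, and every
power of 2 divides a power of 10. A divisor with other prime factors (3, for
instance) has no finite base-10 form.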

Confused,
Paul

Re: PEP: New timestamp formats

Nick Coghlan
In reply to this post by Victor STINNER
On Thu, Feb 2, 2012 at 10:16 PM, Victor Stinner
<[hidden email]> wrote:
> If we only support int, float and Decimal, we don't need to add a new
> protocol, hardcoded functions are enough :-)

Yup, that's why your middle-ground approach didn't make any sense to
me. Returning Decimal when a flag is set to request high precision
values actually handles everything (since any epoch related questions
only arise later when converting the decimal timestamp to an absolute
time value).

I think a protocol based approach would be *feasible*, but also
overkill for the specific problem we're trying to handle (i.e.
arbitrary precision timestamps). If a dependency from time and os on
the decimal module means we decide to finally incorporate Stefan's
cdecimal branch, I consider that a win in its own right (there are
some speed hacks in decimal that didn't fare well in the Py3k
transition because they went from being 8-bit str based to Unicode str
based. They didn't *break* from a correctness point of view, but my
money would be on their being pessimisations now instead of
optimisations).

>>> os.stat: add new fields
>>> -----------------------
> New fields can be optional (add a flag to get them), but I don't like
> the idea of a structure with a variable number of fields, especially
> because os.stat() structure can be used as a tuple (get a field by its
> index).
>
> Patching os.stat() doesn't solve the problem for the time module anyway.

We can't add new fields to the stat tuple anyway - it breaks tuple
unpacking. Any new fields would have been accessible by name only
(which poses its own problems, but is a solution we've used before -
in the codecs module, for example).

As you say though, this was never going to be adequate since it
doesn't help with the time APIs.
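
For illustration only (this goes beyond what the thread itself states): in
Python 3.3 and later, ``os.stat()`` results expose nanosecond fields such as
``st_mtime_ns`` by attribute name, while the tuple view keeps its ten items.
A minimal sketch:

```python
import os
import tempfile

# Stat a throwaway file so the example does not depend on any path.
with tempfile.NamedTemporaryFile() as tmp:
    st = os.stat(tmp.name)

fields = tuple(st)         # the classic tuple view
mtime_ns = st.st_mtime_ns  # reachable by name only, as an int
```

Code that unpacks the ten-item tuple keeps working, because the extra field
never appears in it.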

>>> Add an argument to change the result type
>>> -----------------------------------------
>>
>> There should also be a description of the "set a boolean flag to
>> request high precision output" approach.
>
> You mean something like: time.time(hires=True)? Or time.time(decimal=True)?

Yeah, I was thinking "hires" as the short form of "high resolution",
but it's a little confusing since it also parses as the word "hires"
(i.e. "hire"+"s"). "hi_res", "hi_prec" (for "high precision") or
"full_prec" (for "full precision") might be better.

I don't really like "decimal" as the flag name, since it confuses an
implementation detail (using decimal.Decimal) with the design intent
(preserving the full precision of the underlying timestamp).
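
A sketch of what the flag-based variant could look like, using the "hi_prec"
spelling from above. ``read_time`` is a hypothetical stand-in that takes the
raw nanosecond count as an argument so the result is deterministic:

```python
from decimal import Decimal

def read_time(nanoseconds, hi_prec=False):
    """Return float seconds by default, Decimal seconds on request."""
    if hi_prec:
        return Decimal(nanoseconds) / 10**9  # full precision
    return nanoseconds / 10**9               # backward-compatible float

raw = 1328140800123456789  # made-up clock reading
default_value = read_time(raw)
precise_value = read_time(raw, hi_prec=True)
```

The flag name says what the caller wants (full precision) and leaves the
returned type as an implementation choice.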

Cheers,
Nick.

--
Nick Coghlan   |   [hidden email]   |   Brisbane, Australia

Re: PEP: New timestamp formats

Victor STINNER
In reply to this post by Victor STINNER
> Even if I like the idea, I don't think that we need all this machinery
> to support nanosecond resolution. I should maybe forget my idea of
> using datetime.datetime or datetime.timedelta, and only support
> int, float and decimal.Decimal.

I updated my patch (issue #13882) to only support int, float and
decimal.Decimal types. I suppose that it is just enough.

Only adding decimal.Decimal type avoids many questions:

 - which API / protocol should be used to support other types
 - what is the start of a timestamp?
 - etc.

As we have seen, with the time.time(timestamp=type) API it will be easy to
support new types later (using a new protocol, a registry like Unicode
codecs, or anything else).

Let's start with decimal.Decimal and support it correctly (e.g. patch
datetime.datetime.fromtimestamp() and os.*utime*() functions).

Victor

Re: PEP: New timestamp formats

Nick Coghlan
In reply to this post by Paul Moore
On Thu, Feb 2, 2012 at 10:45 PM, Paul Moore <[hidden email]> wrote:

> On 2 February 2012 12:16, Victor Stinner <[hidden email]> wrote:
>> Let's take an NTP timestamp in format (c): (sec=0,
>> floatpart=100000000, divisor=2**32):
>>
>> >>> Decimal(100000000) * Decimal(10)**-10
>> Decimal('0.0100000000')
>> >>> Decimal(100000000) / Decimal(2)**32
>> Decimal('0.023283064365386962890625')
>>
>> You have an error of 57%. Or do you mean that not only 2**32 should be
>> modified, but also 100000000? How do you adapt 100000000 (floatpart)
>> when changing the divisor (2**32 => 10**-10)? The format (c) avoids an
>> operation (base^exponent) and avoids losing precision.
>
> Am I missing something? If you're using the fixed point form
> (fraction, exponent) then 0.023283064365386962890625 would be written
> as (23283064365386962890625, -24). Same precision as the (100000000,
> base=2, exponent=32) format.

Yeah, Victor's persuaded me that the only two integer based formats
that would be sufficiently flexible are (integer, numerator, divisor)
and (integer, mantissa, base, exponent). The latter allows for a few
more optimised conversions in particular cases. Assuming a base of 10
would just make things unnecessarily awkward when the underlying base
is 2, though.

However, I think it's even better to not have a protocol at all
and just use decimal.Decimal for arbitrary precision timestamps
(explicitly requested via a flag to preserve backwards compatibility).
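
As a rough illustration of why Decimal can stand in for the (integer,
numerator, divisor) format, the sketch below converts the NTP example from
the quoted message. The to_decimal() helper and its prec parameter are
assumptions made for this example. A larger context precision is used
because a 32-bit binary fraction can need up to 32 significant decimal
digits, more than the default 28.

```python
from decimal import Decimal, localcontext
from fractions import Fraction


def to_decimal(seconds, numerator, divisor, prec=40):
    """Convert an (integer, numerator, divisor) timestamp to Decimal.

    Hypothetical helper; prec must be large enough for the division to
    be exact, otherwise the result is rounded to prec digits.
    """
    with localcontext() as ctx:
        ctx.prec = prec
        return seconds + Decimal(numerator) / Decimal(divisor)


value = to_decimal(0, 100000000, 2**32)
print(value)

# Cross-check against exact rational arithmetic.
print(Fraction(value) == Fraction(100000000, 2**32))

# The same number seen as a (mantissa, exponent) pair in base 10.
sign, digits, exponent = value.as_tuple()
print(int("".join(map(str, digits))), exponent)
```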

Cheers,
Nick.

--
Nick Coghlan   |   [hidden email]   |   Brisbane, Australia

Re: PEP: New timestamp formats

Nick Coghlan
In reply to this post by Victor STINNER
On Thu, Feb 2, 2012 at 11:10 PM, Victor Stinner
<[hidden email]> wrote:

>> Even if I like the idea, I don't think that we need all this machinery
>> to support nanosecond resolution. I should maybe forget my idea of
>> using datetime.datetime or datetime.timedelta, or only support
>> int, float and decimal.Decimal.
>
> I updated my patch (issue #13882) to only support int, float and
> decimal.Decimal types. I suppose that is enough.
>
> Adding only the decimal.Decimal type avoids many questions:
>
>  - which API / protocol should be used to support other types?
>  - what is the start of a timestamp?
>  - etc.
>
> As we have seen, with the time.time(timestamp=type) API it will be easy
> to support new types later (using a new protocol, a registry like the
> Unicode codecs, or anything else).

Yeah, I can definitely live with the type-based API if we restrict it
to those 3 types.

Cheers,
Nick.

--
Nick Coghlan   |   [hidden email]   |   Brisbane, Australia

Re: PEP: New timestamp formats

Antoine Pitrou
In reply to this post by Nick Coghlan
On Thu, 2 Feb 2012 23:07:28 +1000
Nick Coghlan <[hidden email]> wrote:
>
> We can't add new fields to the stat tuple anyway - it breaks tuple
> unpacking.

I don't think that's true. The stat tuple already has a varying number
of fields: http://docs.python.org/dev/library/os.html#os.stat

“For backward compatibility, the return value of stat() is also
accessible as a tuple of *at least* 10 integers [...] More items may be
added at the end by some implementations.” (emphasis mine)

So at most you could tuple-unpack os.stat(...)[:10].

(I've never seen code tuple-unpacking a stat tuple, myself. It sounds
quite cumbersome to do so.)
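
For what it's worth, a defensive version of such unpacking could look like
the sketch below. It only uses os.stat() and the st_mtime_ns attribute
(Python 3.3 and later); the variable names are illustrative.

```python
import os

st = os.stat(os.curdir)

# Slice first, so that extra platform-specific items appended to the
# tuple can never break the unpacking.
mode, ino, dev, nlink, uid, gid, size, atime, mtime, ctime = st[:10]

# The tuple view holds whole seconds; sub-second data is only available
# through the named attributes.
print(mtime, st.st_mtime, st.st_mtime_ns)
```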

> >>> Add an argument to change the result type
> >>> -----------------------------------------
> >>
> >> There should also be a description of the "set a boolean flag to
> >> request high precision output" approach.
> >
> > You mean something like: time.time(hires=True)? Or time.time(decimal=True)?
>
> Yeah, I was thinking "hires" as the short form of "high resolution",
> but it's a little confusing since it also parses as the word "hires"
> (i.e. "hire"+"s"). "hi_res", "hi_prec" (for "high precision") or
> "full_prec" (for "full precision") might be better.
>
> I don't really like "decimal" as the flag name, since it confuses an
> implementation detail (using decimal.Decimal) with the design intent
> (preserving the full precision of the underlying timestamp).

But that implementation detail will be visible to the user, including
when combining the result with other numbers (as Decimal "wins" over
float and int). IMHO it wouldn't be silly to make it explicit.

I think "hires" may confuse people into thinking the time source
has a higher resolution, whereas it's only the return type.
Perhaps it's just a documentation issue, though.

Regards

Antoine.



Re: PEP: New timestamp formats

Antoine Pitrou
In reply to this post by Victor STINNER
On Thu, 2 Feb 2012 14:10:14 +0100
Victor Stinner <[hidden email]> wrote:
> > Even if I like the idea, I don't think that we need all this machinery
> > to support nanosecond resolution. I should maybe forget my idea of
> > using datetime.datetime or datetime.timedelta, or only support
> > int, float and decimal.Decimal.
>
> I updated my patch (issue #13882) to only support int, float and
> decimal.Decimal types. I suppose that is enough.

Why int? That doesn't seem to bring anything.

Regards

Antoine.



Re: PEP: New timestamp formats

M.-A. Lemburg
In reply to this post by Nick Coghlan
Nick Coghlan wrote:

> On Thu, Feb 2, 2012 at 10:16 PM, Victor Stinner
>>>> Add an argument to change the result type
>>>> -----------------------------------------
>>>
>>> There should also be a description of the "set a boolean flag to
>>> request high precision output" approach.
>>
>> You mean something like: time.time(hires=True)? Or time.time(decimal=True)?
>
> Yeah, I was thinking "hires" as the short form of "high resolution",
> but it's a little confusing since it also parses as the word "hires"
> (i.e. "hire"+"s"). "hi_res", "hi_prec" (for "high precision") or
> "full_prec" (for "full precision") might be better.

Isn't the above (having the return type depend on an argument
setting) something we generally try to avoid?

I think it's better to settle on one type for high-res timers and
add a new API(s) for it.

--
Marc-Andre Lemburg
eGenix.com


Re: PEP: New timestamp formats

Nick Coghlan
On Thu, Feb 2, 2012 at 11:31 PM, M.-A. Lemburg <[hidden email]> wrote:
> Isn't the above (having the return type depend on an argument
> setting) something we generally try to avoid?

In Victor's actual patch, the returned object is an instance of the
type you pass in, so it actually avoids that issue.

> I think it's better to settle on one type for high-res timers and
> add a new API(s) for it.

We've basically settled on decimal.Decimal now, so yeah, the decision
becomes one of spelling - either new APIs that always return Decimal
instances, or a way to ask the existing APIs to return Decimal instead
of floats.

The way I see it, the latter should be significantly less hassle to
maintain (since the code remains almost entirely shared), and it
becomes trivial for someone to layer a convenience wrapper over the
top that *always* requests the high precision output.
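
A sketch of that layering, under stated assumptions: file_mtime() is a
hypothetical stand-in for an existing API that has grown a timestamp
argument, and hires_mtime() is the convenience wrapper. Only os.stat(),
st_mtime_ns and functools.partial() are real library features here.

```python
import functools
import os
from decimal import Decimal


def file_mtime(path, timestamp=float):
    """Return the modification time of path as the requested type.

    Hypothetical function; the default keeps the historical float result.
    """
    st = os.stat(path)
    if timestamp is Decimal:
        # Build the Decimal from integer nanoseconds to avoid going
        # through a binary float.
        return Decimal(st.st_mtime_ns).scaleb(-9)
    return timestamp(st.st_mtime)


# The wrapper that always requests the high precision output.
hires_mtime = functools.partial(file_mtime, timestamp=Decimal)

print(file_mtime(os.curdir))
print(hires_mtime(os.curdir))
```

The shared implementation stays in one place, and the wrapper is a single
line that callers can define for themselves.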

Cheers,
Nick.

--
Nick Coghlan   |   [hidden email]   |   Brisbane, Australia

Re: PEP: New timestamp formats

Victor STINNER
In reply to this post by Antoine Pitrou
> Why int? That doesn't seem to bring anything.

It helps to deprecate/replace os.stat_float_times(), which may be used
for backward compatibility (with Python 2.2 ? :-)).

Re: PEP: New timestamp formats

Antoine Pitrou
On Thu, 2 Feb 2012 15:09:41 +0100
Victor Stinner <[hidden email]> wrote:

> > Why int? That doesn't seem to bring anything.
>
> It helps to deprecate/replace os.stat_float_times(), which may be used
> for backward compatibility (with Python 2.2 ? :-)).

I must admit I don't understand the stat_float_times documentation:

“For compatibility with older Python versions, accessing stat_result as
a tuple always returns integers.

Python now returns float values by default. Applications which do not
work correctly with floating point time stamps can use this function to
restore the old behaviour.”

These two paragraphs seem to contradict each other.


That said, I don't understand why we couldn't simply deprecate
stat_float_times() right now. Having an option for integer timestamps
is pointless, you can just call int() on the result if you want.

Regards

Antoine.



Re: PEP: New timestamp formats

Victor STINNER
> That said, I don't understand why we couldn't simply deprecate
> stat_float_times() right now. Having an option for integer timestamps
> is pointless, you can just call int() on the result if you want.

So which API do you propose for time.time() to get a Decimal object?

time.time(timestamp=decimal.Decimal)
time.time(decimal=True) or time.time(hires=True)

or something else?

Victor

Re: PEP: New timestamp formats

Barry Warsaw
In reply to this post by Nick Coghlan
On Feb 02, 2012, at 11:07 PM, Nick Coghlan wrote:

>Yup, that's why your middle-ground approach didn't make any sense to
>me. Returning Decimal when a flag is set to request high precision
>values actually handles everything (since any epoch related questions
>only arise later when converting the decimal timestamp to an absolute
>time value).

Guido really dislikes APIs where a flag changes the return type, and I agree
with him.  It's because this is highly unreadable:

    results = blah.whatever(True)

What the heck does that `True` do?  It can be marginally better with a
keyword-only argument, but not much.
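
To illustrate the keyword-only variant, here is a small sketch. The
current_time() function and its hires flag are hypothetical, and
time.time_ns() requires Python 3.7 or later.

```python
import time
from decimal import Decimal


def current_time(*, hires=False):
    """Hypothetical API where a keyword-only flag selects the result type."""
    ns = time.time_ns()
    if hires:
        return Decimal(ns).scaleb(-9)  # exact Decimal seconds
    return ns / 10**9  # ordinary float seconds


print(current_time())
print(current_time(hires=True))

# A bare positional True is rejected, so the call site must name the flag.
try:
    current_time(True)
except TypeError as exc:
    print("rejected:", exc)
```

The call site now documents itself, but the return type still depends on
the value of an argument, which is the part being objected to.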

I haven't read the whole thread so maybe this is a stupid question, but why
can't we add a datetime-compatible higher precision type that hides all the
implementation details?

-Barry

Re: PEP: New timestamp formats

Nick Coghlan


On Feb 3, 2012 2:59 AM, "Barry Warsaw" <[hidden email]> wrote:
>
> On Feb 02, 2012, at 11:07 PM, Nick Coghlan wrote:
>
> >Yup, that's why your middle-ground approach didn't make any sense to
> >me. Returning Decimal when a flag is set to request high precision
> >values actually handles everything (since any epoch related questions
> >only arise later when converting the decimal timestamp to an absolute
> >time value).
>
> Guido really dislikes APIs where a flag changes the return type, and I agree
> with him.  It's because this is highly unreadable:
>
>    results = blah.whatever(True)
>
> What the heck does that `True` do?  It can be marginally better with a
> keyword-only argument, but not much.

Victor's patch passes in the return type rather than a binary flag, thus avoiding this particular problem.

> I haven't read the whole thread so maybe this is a stupid question, but why
> can't we add a datetime-compatible higher precision type that hides all the
> implementation details?
>
> -Barry

It's not a stupid question, but for backwards compatibility, what we would actually need is a version of Decimal that implicitly interoperates with binary floats. That's... not trivial.
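
The interoperability gap can be seen with the decimal module as it stands.
The sketch below uses an arbitrary example timestamp: arithmetic with int
works, arithmetic with float raises TypeError, and only comparisons
between Decimal and float are allowed.

```python
from decimal import Decimal

ts = Decimal("1328219742.385039123")  # a nanosecond-resolution timestamp

# Mixing with int is exact and yields a Decimal.
later = ts + 60

# Mixing with float in arithmetic is refused rather than silently rounded.
try:
    ts + 0.5
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Comparisons between Decimal and float are permitted and exact.
is_newer = ts > 1328219742.0

print(later, mixed_ok, is_newer)
```

Existing code that does arithmetic such as "timestamp + 0.5" would
therefore break if a function started returning Decimal by default, which
is why the high precision result has to be requested explicitly.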

Cheers,
Nick
--
Sent from my phone, thus the relative brevity :)



Re: PEP: New timestamp formats

Jeffrey Yasskin-2
In reply to this post by Victor STINNER
On Wed, Feb 1, 2012 at 5:03 PM, Victor Stinner
<[hidden email]> wrote:
> datetime.datetime
> -----------------
>
> datetime.datetime only supports microsecond resolution, but can be enhanced
> to support nanosecond.
>
> datetime.datetime has issues:
>
> - there is no easy way to convert it into "seconds since the epoch"

Not true:

>>> import datetime, time
>>> epoch = datetime.datetime(1970, 1, 1, 0, 0, 0)
>>> (datetime.datetime.utcnow() - epoch).total_seconds()
1328219742.385039
>>> time.time()
1328219747.640937
>>>

> - any broken-down time has issues of time stamp ordering in the
>  duplicate hour of switching from DST to normal time

Only if you insist on putting it in a timezone. Use UTC, and you should be fine.
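
A variant of the conversion above using timezone-aware UTC datetimes is
sketched below. The helper names are illustrative. It also shows the
limit mentioned in the quoted PEP text: timedelta resolves microseconds,
so nanoseconds cannot be carried this way without changing datetime.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def seconds_since_epoch(dt):
    """Return seconds since the Unix epoch for an aware datetime."""
    return (dt - EPOCH).total_seconds()


def microseconds_since_epoch(dt):
    """Return the same offset as an exact integer of microseconds."""
    return (dt - EPOCH) // timedelta(microseconds=1)


stamp = datetime(2012, 2, 2, 21, 55, 42, 385039, tzinfo=timezone.utc)
print(seconds_since_epoch(stamp))
print(microseconds_since_epoch(stamp))

# The smallest difference a timedelta can represent.
print(timedelta.resolution)
```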

> - time zone support is flaky-to-nonexistent in the datetime module

Why do you need time zone support for system interfaces that return
times in UTC?


I think I saw another objection that datetime represented points in
time, while functions like time.time() and os.stat() return an offset
from the epoch. This objection seems silly to me: the return value of
the system interfaces intends to represent points in time, even though
it has to be implemented as an offset since an epoch because of
limitations in C, and datetime is also implemented as an offset from
an epoch (year 0).

On the other hand, the return value of functions like time.clock() is
_not_ intended to represent an exact point in time, and so should be
either a timedelta or Decimal.

Jeffrey

Re: PEP: New timestamp formats

Glenn Linderman-3
In reply to this post by Antoine Pitrou
On 2/2/2012 6:28 AM, Antoine Pitrou wrote:
> On Thu, 2 Feb 2012 15:09:41 +0100
> Victor Stinner <[hidden email]> wrote:
>
>>> Why int? That doesn't seem to bring anything.
>>
>> It helps to deprecate/replace os.stat_float_times(), which may be used
>> for backward compatibility (with Python 2.2 ? :-)).
>
> I must admit I don't understand the stat_float_times documentation:
>
> “For compatibility with older Python versions, accessing stat_result as
> a tuple always returns integers.
>
> Python now returns float values by default. Applications which do not
> work correctly with floating point time stamps can use this function to
> restore the old behaviour.”
>
> These two paragraphs seem to contradict each other.
>
>
> That said, I don't understand why we couldn't simply deprecate
> stat_float_times() right now. Having an option for integer timestamps
> is pointless, you can just call int() on the result if you want.
>
> Regards
>
> Antoine.

Sorry to bring this up, but the PEP should probably consider another
option: introducing os.stat_decimal_times(), following the precedent of
os.stat_float_times(). Like os.stat_float_times(), it would decide the
return type of timestamps from os.stat(), or something along those lines.
Having it affect the results of time.time() would be weird, though. And
the whole design of os.stat_float_times() smells of something having been
designed wrong in the first place, if it needs such an API to retain
backward compatibility. I'm not sure that it is designed for such
flexibility even now.
