|
Even if I am not really convinced that a PEP helps to design an API,
here is a draft of a PEP to add new timestamp formats to Python 3.3. Don't
see the draft as a final proposition, it is just a document supposed to help
the discussion :-)

---

PEP: xxx
Title: New timestamp formats
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <[hidden email]>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 01-February-2012
Python-Version: 3.3


Abstract
========

Python 3.3 introduced functions supporting nanosecond resolutions. Python 3.3
only supports int or float to store timestamps, but these types cannot be used
to store a timestamp with a nanosecond resolution.


Motivation
==========

Python 2.3 introduced float timestamps to support subsecond resolutions,
os.stat() uses float timestamps by default since Python 2.5. Python 3.3
introduced functions supporting nanosecond resolutions:

* os.stat()
* os.utimensat()
* os.futimens()
* time.clock_gettime()
* time.clock_getres()
* time.wallclock() (reuse time.clock_gettime(time.CLOCK_MONOTONIC))

The problem is that 64-bit floats are unable to store nanoseconds (10^-9)
for timestamps bigger than 2^24 seconds (194 days 4 hours: 1970-07-14 for an
Epoch timestamp) without losing precision.

.. note::
   64-bit floats start to lose precision with microsecond (10^-6) resolution
   for timestamps bigger than 2^33 seconds (272 years: 2242-03-16 for an Epoch
   timestamp).


Timestamp formats
=================

Choose a new format for nanosecond resolution
---------------------------------------------

To support nanosecond resolution, four formats were considered:

* 128 bits float
* decimal.Decimal
* datetime.datetime
* tuple of integers

Criteria
--------

It should be possible to do arithmetic, for example::

    t1 = time.time()
    # ...
    t2 = time.time()
    dt = t2 - t1

Two timestamps should be comparable (t2 > t1).

The format should have a resolution of at least 1 nanosecond (without losing
precision). It is better if the format can have an arbitrary resolution.

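The precision problem described in the Motivation section, and the arithmetic and comparison criteria, can be checked with a short sketch. This is only an illustration: the timestamp value and the variable names are made up for the example, and decimal.Decimal is used here simply as one exact reference type.

```python
from decimal import Decimal

# A made-up Epoch timestamp (early 2012) with nanosecond digits,
# kept as two exact integers.
seconds = 1328180600
nanoseconds = 123456789

# Exact value as a Decimal.
exact = Decimal(seconds) + Decimal(nanoseconds).scaleb(-9)

# The same value pushed through a 64-bit float.
as_float = seconds + nanoseconds / 1e9
round_trip = Decimal(as_float)  # exact decimal expansion of the float

# Error introduced by the float, expressed in nanoseconds.
error_ns = abs(round_trip - exact) * 10**9

# With an exact type, timestamps remain comparable and subtractable
# at nanosecond resolution.
t1 = exact
t2 = exact + Decimal("0.000000001")
dt = t2 - t1
```

At this magnitude the spacing between adjacent floats is 2^-22 seconds (about 238 ns), so the float cannot hold the nanosecond digits, while the Decimal difference `dt` is exactly one nanosecond.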
128 bits float
--------------

Add a new IEEE 754-2008 quad-precision float type. The IEEE 754-2008 quad
precision float has 1 sign bit, 15 bits of exponent and 112 bits of mantissa.

128 bits float is supported by GCC (4.3), Clang and ICC. The problem is that
Visual C++ 2008 doesn't support it. Python must be portable and so cannot rely
on a type only available on some platforms. Another example: GCC 4.3 does not
support __float128 in 32-bit mode on x86 (but gcc 4.4 does).

Intel CPUs have an FPU supporting 80-bit floats, but not with SSE
instructions. Other CPU vendors don't support this float size.

There is also a license issue: GCC uses the MPFR library which is distributed
under the GNU LGPL license. This license is incompatible with the Python
Software License.

datetime.datetime
-----------------

datetime.datetime only supports microsecond resolution, but can be enhanced
to support nanosecond.

datetime.datetime has issues:

- there is no easy way to convert it into "seconds since the epoch"
- any broken-down time has issues of time stamp ordering in the
  duplicate hour of switching from DST to normal time
- time zone support is flaky-to-nonexistent in the datetime module

decimal.Decimal
---------------

The decimal module is implemented in Python and is not really fast.

Using Decimal by default would cause a bootstrap issue because the module is
implemented in Python.

Decimal can store a timestamp with any resolution, not only nanosecond, the
resolution is configurable at runtime.

Decimal objects support all arithmetic operations and are compatible with
int and float.

The decimal module is slow, but there is a C reimplementation of the decimal
module which is almost ready for inclusion.

tuple
-----

Various kinds of tuples have been proposed.
All propositions only use integers:

* a) (sec, nsec): C timespec structure, useful for os.futimens() for example
* b) (sec, floatpart, exponent): value = sec + floatpart * 10**exponent
* c) (sec, floatpart, divisor): value = sec + floatpart / divisor

The format (a) only supports nanosecond resolution.

The formats (a) and (b) may lose precision if the clock divisor is not a
power of 10.

The format (c) should be enough for most cases.

Creating a tuple of integers is fast.

Arithmetic operations cannot be done directly on a tuple: t2-t1 doesn't work
for example.

Final formats
-------------

The PEP proposes to provide 5 different timestamp formats:

* numbers:

  * int
  * float
  * decimal.Decimal
  * datetime.timedelta

* broken-down time:

  * datetime.datetime


API design
==========

Change the default result type
------------------------------

Python 2.3 introduced os.stat_float_times(). The problem is that this flag is
global, and so may break libraries if the application changes the type.

Changing the default result type would break backward compatibility.

Callback and creating a new module to convert timestamps
--------------------------------------------------------

Use a callback taking integers to create a timestamp. Example with float::

    def timestamp_to_float(seconds, floatpart, divisor):
        return seconds + floatpart / divisor

The time module can provide some builtin converters, and other modules, like
datetime, can provide their own converters. Users can define their own types.

An alternative is to add a new module for all functions converting timestamps.

The problem is that we have to design the API of the callback and we cannot
change it later. We may need more information for future needs later.

os.stat: add new fields
-----------------------

It was proposed to add 3 fields to os.stat() structure to get nanoseconds of
timestamps.

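As a rough illustration of the callback approach described above, using the format (c) tuple as the callback signature, the converters could look like the following sketch. Only timestamp_to_float comes from the draft; timestamp_to_decimal, timestamp_to_fraction and read_clock are hypothetical names invented for this example, and read_clock returns a fixed made-up reading rather than calling a real clock.

```python
from decimal import Decimal
from fractions import Fraction


def timestamp_to_float(seconds, floatpart, divisor):
    # Converter from the draft: fast, but lossy for large timestamps.
    return seconds + floatpart / divisor


def timestamp_to_decimal(seconds, floatpart, divisor):
    # Hypothetical converter: exact when the quotient fits the decimal context.
    return Decimal(seconds) + Decimal(floatpart) / Decimal(divisor)


def timestamp_to_fraction(seconds, floatpart, divisor):
    # Hypothetical converter: always exact, whatever the divisor.
    return Fraction(seconds * divisor + floatpart, divisor)


def read_clock(converter=timestamp_to_float):
    # Hypothetical stand-in for a function such as time.time(): the raw
    # reading is a fixed (seconds, floatpart, divisor) tuple here.
    raw = (1328180600, 123456789, 10**9)
    return converter(*raw)


default_value = read_clock()
decimal_value = read_clock(timestamp_to_decimal)
fraction_value = read_clock(timestamp_to_fraction)
```

The point of the sketch is that the clock function never needs to know about the result type: anything that accepts three integers can be plugged in.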
Add an argument to change the result type
-----------------------------------------

Add an argument to all functions creating timestamps, like time.time(), to
change their result type. It was first proposed to use a string argument,
e.g. time.time(format="decimal"). The problem is that the function has
to import a module internally. Then it was decided to pass the type directly,
e.g. time.time(format=decimal.Decimal). Using a type, the user has
first to import the module. There is no direct link between a type and the
function used to create the timestamp.

By default, the float type is used to keep backward compatibility. For stat
functions like os.stat(), the default type depends on os.stat_float_times().

Add new functions
-----------------

Add new functions for each type, examples:

* time.time_decimal()
* os.stat_decimal()
* os.stat_datetime()
* etc.


Changes
=======

* Add *format* optional argument to time.clock(), time.clock_gettime(),
  time.clock_getres(), time.time() and time.wallclock().
* Add *timestamp* optional argument to os.fstat(), os.fstatat(), os.lstat()
  and os.stat().

Functions accepting a timestamp as input should support decimal.Decimal
objects without an internal conversion to float which may lose precision:

* datetime.datetime.fromtimestamp()
* time.localtime()
* time.gmtime()

TODO:

* Change os.utimensat() and os.futimens() to accept Decimal
* Change os.utimensat() and os.futimens() to not accept tuple anymore
* Drop os.utimensat() and os.futimens() and patch os.utimeat() instead?
* datetime should maybe support nanosecond?


Backwards Compatibility
=======================

Changes only add a new optional argument. The default type is unchanged and
there is no impact on performance.


Links
=====

* `Issue #11457: os.stat(): add new fields to get timestamps as Decimal
  objects with nanosecond resolution <http://bugs.python.org/issue11457>`_
* `Issue #13882: Add format argument for time.time(), time.clock(), ... to
  get a timestamp as a Decimal object <http://bugs.python.org/issue13882>`_
* `[Python-Dev] Store timestamps as decimal.Decimal objects
  <http://mail.python.org/pipermail/python-dev/2012-January/116025.html>`_


Copyright
=========

This document has been placed in the public domain.

_______________________________________________
Python-Dev mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%2B1324100855712-1801473%40n6.nabble.com
|
|
On Thu, Feb 2, 2012 at 11:03 AM, Victor Stinner
<[hidden email]> wrote:
> Even if I am not really convinced that a PEP helps to design an API,
> here is a draft of a PEP to add new timestamp formats to Python 3.3.
> Don't see the draft as a final proposition, it is just a document
> supposed to help the discussion :-)

Helping keep a discussion on track (and avoiding rehashing old ground)
is precisely why the PEP process exists. Thanks for writing this up :)

> ---
>
> PEP: xxx
> Title: New timestamp formats
> Version: $Revision$
> Last-Modified: $Date$
> Author: Victor Stinner <[hidden email]>
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 01-February-2012
> Python-Version: 3.3
>
>
> Abstract
> ========
>
> Python 3.3 introduced functions supporting nanosecond resolutions. Python 3.3
> only supports int or float to store timestamps, but these types cannot be used
> to store a timestamp with a nanosecond resolution.
>
>
> Motivation
> ==========
>
> Python 2.3 introduced float timestamps to support subsecond resolutions,
> os.stat() uses float timestamps by default since Python 2.5. Python 3.3
> introduced functions supporting nanosecond resolutions:
>
>  * os.stat()
>  * os.utimensat()
>  * os.futimens()
>  * time.clock_gettime()
>  * time.clock_getres()
>  * time.wallclock() (reuse time.clock_gettime(time.CLOCK_MONOTONIC))
>
> The problem is that 64-bit floats are unable to store nanoseconds (10^-9)
> for timestamps bigger than 2^24 seconds (194 days 4 hours: 1970-07-14 for an
> Epoch timestamp) without losing precision.
>
> .. note::
>    64-bit floats start to lose precision with microsecond (10^-6) resolution
>    for timestamps bigger than 2^33 seconds (272 years: 2242-03-16 for an Epoch
>    timestamp).
>
>
> Timestamp formats
> =================
>
> Choose a new format for nanosecond resolution
> ---------------------------------------------
>
> To support nanosecond resolution, four formats were considered:
>
>  * 128 bits float
>  * decimal.Decimal
>  * datetime.datetime
>  * tuple of integers

I'd add datetime.timedelta to this list. It's exactly what timestamps
are, after all - the difference between the current time and the
relevant epoch value.

> Various kinds of tuples have been proposed. All propositions only use integers:
>
>  * a) (sec, nsec): C timespec structure, useful for os.futimens() for example
>  * b) (sec, floatpart, exponent): value = sec + floatpart * 10**exponent
>  * c) (sec, floatpart, divisor): value = sec + floatpart / divisor
>
> The format (a) only supports nanosecond resolution.
>
> The formats (a) and (b) may lose precision if the clock divisor is not a
> power of 10.
>
> The format (c) should be enough for most cases.

Format (b) only loses precision if the exponent chosen for a given
value is too small relative to the precision of the underlying timer
(it's the same as using decimal.Decimal in that respect).

The problem with (a) is that it simply cannot represent times with
greater than nanosecond precision. Since we have the opportunity, we
may as well deal with the precision question once and for all.

Alternatively, you could return a 4-tuple that specifies the base in
addition to the exponent.

> Callback and creating a new module to convert timestamps
> --------------------------------------------------------
>
> Use a callback taking integers to create a timestamp. Example with float:
>
>    def timestamp_to_float(seconds, floatpart, divisor):
>        return seconds + floatpart / divisor
>
> The time module can provide some builtin converters, and other modules, like
> datetime, can provide their own converters. Users can define their own types.
>
> An alternative is to add a new module for all functions converting timestamps.
>
> The problem is that we have to design the API of the callback and we cannot
> change it later. We may need more information for future needs later.

I'd be more specific here - either of the 3-tuple options already
presented in the PEP, or the 4-tuple option I mentioned above, would
be suitable as the signature of an arbitrary precision callback API
that assumes timestamps are always expressed as "seconds since a
particular epoch value". Such an API could only become limiting if
timestamps ever become something other than "the difference in time
between right now and the relevant epoch value", and that's a
sufficiently esoteric possibility that it really doesn't seem
worthwhile to take it into account. The past problems with timestamp
APIs have all related to increases in precision, not timestamps being
redefined as something radically different.

The PEP should also mention PJE's suggestion of creating a new named
protocol specifically for the purpose (with a signature based on one
of the proposed tuple formats), such that you could simply write:

    time.time()  # output=float by default
    time.time(output=float)
    time.time(output=int)
    time.time(output=fractions.Fraction)
    time.time(output=decimal.Decimal)
    time.time(output=datetime.timedelta)
    time.time(output=datetime.datetime)
    # (and similarly for os.stat with a timestamp=type parameter)

Rather than being timestamp specific, such a protocol would be a
general numeric protocol. If (integer, numerator, denominator) is used
(i.e. a "mixed number" in mathematical terms), then "__from_mixed__"
would be an appropriate name. If (integer, fractional, exponent) is
used (i.e. a fixed point notation), then "__from_fixed__" would work.

    # Algorithm for a "from mixed numbers" protocol, assuming division doesn't lose precision...
    def __from_mixed__(cls, integer, numerator, denominator):
        return cls(integer) + cls(numerator) / cls(denominator)

    # Algorithm for a "from fixed point" protocol, assuming negative exponents don't lose precision...
    def __from_fixed__(cls, integer, mantissa, base, exponent):
        return cls(integer) + cls(mantissa) * cls(base) ** cls(exponent)

From a *usage* point of view, this idea is actually the same as the
proposal currently in the PEP. The difference is that instead of
adding custom support for a few particular types directly to time and
os, it instead defines a more general purpose protocol that covers not
only this use case, but also any other situation where high precision
fractions are relevant.

One interesting question with a named protocol approach is whether
such a protocol should *require* explicit support, or if it should
fall back to the underlying mathematical operations. Since the
conversions to float and int in the timestamp case are already known
to be lossy, permitting lossy conversion via the mathematical
equivalents seems reasonable, suggesting possible protocol definitions
as follows:

    # Algorithm for a potentially precision-losing "from mixed numbers" protocol
    def from_mixed(cls, integer, numerator, denominator):
        try:
            factory = cls.__from_mixed__
        except AttributeError:
            return cls(integer) + cls(numerator) / cls(denominator)
        return factory(integer, numerator, denominator)

    # Algorithm for a potentially lossy "from fixed point" protocol
    def from_fixed(cls, integer, mantissa, base, exponent):
        try:
            factory = cls.__from_fixed__
        except AttributeError:
            return cls(integer) + cls(mantissa) * cls(base) ** cls(exponent)
        return factory(integer, mantissa, base, exponent)

> os.stat: add new fields
> -----------------------
>
> It was proposed to add 3 fields to os.stat() structure to get nanoseconds of
> timestamps.

It's worth noting that the challenge with this is that it's
potentially time consuming to populate the extra fields, and that
this approach doesn't help with the time APIs that return timestamps
directly.

> Add an argument to change the result type
> -----------------------------------------
>
> Add an argument to all functions creating timestamps, like time.time(), to
> change their result type. It was first proposed to use a string argument,
> e.g. time.time(format="decimal"). The problem is that the function has
> to import a module internally. Then it was decided to pass the type directly,
> e.g. time.time(format=decimal.Decimal). Using a type, the user has
> first to import the module. There is no direct link between a type and the
> function used to create the timestamp.
>
> By default, the float type is used to keep backward compatibility. For stat
> functions like os.stat(), the default type depends on os.stat_float_times().

There should also be a description of the "set a boolean flag to
request high precision output" approach.

Cheers,
Nick.

--
Nick Coghlan | [hidden email] | Brisbane, Australia
|
|
On 2 February 2012 03:47, Nick Coghlan <[hidden email]> wrote:
> Rather than being timestamp specific, such a protocol would be a
> general numeric protocol. If (integer, numerator, denominator) is used
> (i.e. a "mixed number" in mathematical terms), then "__from_mixed__"
> would be an appropriate name. If (integer, fractional, exponent) is
> used (i.e. a fixed point notation), then "__from_fixed__" would work.
>
>    # Algorithm for a "from mixed numbers" protocol, assuming division
> doesn't lose precision...
>    def __from_mixed__(cls, integer, numerator, denominator):
>        return cls(integer) + cls(numerator) / cls(denominator)
>
>    # Algorithm for a "from fixed point" protocol, assuming negative
> exponents don't lose precision...
>    def __from_fixed__(cls, integer, mantissa, base, exponent):
>        return cls(integer) + cls(mantissa) * cls(base) ** cls(exponent)
>
> From a *usage* point of view, this idea is actually the same as the
> proposal currently in the PEP. The difference is that instead of
> adding custom support for a few particular types directly to time and
> os, it instead defines a more general purpose protocol that covers not
> only this use case, but also any other situation where high precision
> fractions are relevant.
>
> One interesting question with a named protocol approach is whether
> such a protocol should *require* explicit support, or if it should
> fall back to the underlying mathematical operations. Since the
> conversions to float and int in the timestamp case are already known
> to be lossy, permitting lossy conversion via the mathematical
> equivalents seems reasonable, suggesting possible protocol definitions
> as follows:
>
>    # Algorithm for a potentially precision-losing "from mixed numbers" protocol
>    def from_mixed(cls, integer, numerator, denominator):
>        try:
>            factory = cls.__from_mixed__
>        except AttributeError:
>            return cls(integer) + cls(numerator) / cls(denominator)
>        return factory(integer, numerator, denominator)
>
>    # Algorithm for a potentially lossy "from fixed point" protocol
>    def from_fixed(cls, integer, mantissa, base, exponent):
>        try:
>            factory = cls.__from_fixed__
>        except AttributeError:
>            return cls(integer) + cls(mantissa) * cls(base) ** cls(exponent)
>        return factory(integer, mantissa, base, exponent)

The key problem with a protocol is that the implementer has to make
these decisions. The callback approach defers that decision to the end
user. After all, the end user is the one who knows for his app whether
precision loss is acceptable.

You could probably also have a standard named protocol which can be
used as a callback in straightforward cases

    time.time(callback=timedelta.__from_mixed__)

That's wordy, and a bit ugly, though. The callback code could
special-case types and look for __from_mixed__, I guess. Or use an
ABC, and have the code that uses the callback do

    if issubclass(cb, MixedNumberABC):
        return cb.__from_mixed__(whole, num, den)
    else:
        return cb(whole, num, den)

(The second branch is the one that allows the user to override the
predefined types that work - if you omit that, you're back to a named
protocol and ABCs don't gain you much beyond documentation).

Part of me feels that there's a use case for generic functions in
here, but maybe not (as it's overloading on the return type). Let's
not open that discussion again, though.

Paul.
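
For readers who want to try the "protocol with fallback" idea quoted above, here is a minimal runnable sketch. The from_mixed function follows the quoted pseudocode; the Nanoseconds class is hypothetical, invented only to show a type that opts in to the protocol, and nothing like it exists in the standard library.

```python
from fractions import Fraction


def from_mixed(cls, integer, numerator, denominator):
    """Build cls from integer + numerator/denominator.

    Uses cls.__from_mixed__ when the type defines it, and otherwise falls
    back to plain arithmetic, which may lose precision (e.g. for float).
    """
    try:
        factory = cls.__from_mixed__
    except AttributeError:
        return cls(integer) + cls(numerator) / cls(denominator)
    return factory(integer, numerator, denominator)


class Nanoseconds(int):
    """Hypothetical type: a timestamp stored as an integer count of ns."""

    @classmethod
    def __from_mixed__(cls, integer, numerator, denominator):
        # Explicit support: the type decides how to truncate to nanoseconds.
        return cls(integer * 10**9 + numerator * 10**9 // denominator)


raw = (1328180600, 123456789, 10**9)   # made-up clock reading

as_fraction = from_mixed(Fraction, *raw)   # fallback path, exact
as_float = from_mixed(float, *raw)         # fallback path, lossy
as_ns = from_mixed(Nanoseconds, *raw)      # explicit protocol path
```

Fraction and float take the fallback branch because neither defines `__from_mixed__`, which is exactly the case where the implementer of `from_mixed`, rather than the end user, has decided that a lossy conversion is acceptable.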
|
|
In reply to this post by Nick Coghlan
> I'd add datetime.timedelta to this list. It's exactly what timestamps
> are, after all - the difference between the current time and the
> relevant epoch value.

Ah yes, I forgot to mention it, even though it is listed in the "final
timestamp formats list" :-)

>>  * a) (sec, nsec): C timespec structure, useful for os.futimens() for example
>>  * b) (sec, floatpart, exponent): value = sec + floatpart * 10**exponent
>>  * c) (sec, floatpart, divisor): value = sec + floatpart / divisor
>>
>> The formats (a) and (b) may lose precision if the clock divisor is not a
>> power of 10.
>
> Format (b) only loses precision if the exponent chosen for a given
> value is too small relative to the precision of the underlying timer
> (it's the same as using decimal.Decimal in that respect).

Let's take an NTP timestamp in format (c): (sec=0,
floatpart=100000000, divisor=2**32):

>>> Decimal(100000000) * Decimal(10)**-10
Decimal('0.0100000000')
>>> Decimal(100000000) / Decimal(2)**32
Decimal('0.023283064365386962890625')

You have an error of 57%. Or do you mean that not only 2**32 should be
modified, but also 100000000? How do you adapt 100000000 (floatpart)
when changing the divisor (2**32 => 10**-10)? The format (c) avoids an
operation (base^exponent) and avoids losing precision.

There is the same issue with QueryPerformanceFrequency and
QueryPerformanceCounter used by time.clock(): the frequency is not a
power of any base.

I forgot to mention another advantage of (c), used by my patch for the
Decimal format: you can get the exact resolution of the clock
directly: 1/divisor. It works for any divisor (not only
base^exponent).

By the way, the format (c) can be simplified as a fraction:
(numerator, denominator) using (seconds * divisor + floatpart,
divisor). But this format is less practical to implement a function
creating a timestamp.

>> Callback and creating a new module to convert timestamps
> (...)
> Such an API could only become limiting if
> timestamps ever become something other than "the difference in time
> between right now and the relevant epoch value", and that's a
> sufficiently esoteric possibility that it really doesn't seem
> worthwhile to take it into account.

It may be interesting to support a different start date (other than
1970.1.1), if we choose to support broken-down timestamps (e.g.
datetime.datetime).

> The PEP should also mention PJE's suggestion of creating a new named
> protocol specifically for the purpose (with a signature based on one
> of the proposed tuple formats) (...)

Ok, I will add it.

> Rather than being timestamp specific, such a protocol would be a
> general numeric protocol. If (integer, numerator, denominator) is used
> (i.e. a "mixed number" in mathematical terms), then "__from_mixed__"
> would be an appropriate name. If (integer, fractional, exponent) is
> used (i.e. a fixed point notation), then "__from_fixed__" would work.
>
>    # Algorithm for a "from mixed numbers" protocol, assuming division
> doesn't lose precision...
>    def __from_mixed__(cls, integer, numerator, denominator):
>        return cls(integer) + cls(numerator) / cls(denominator)

Even if I like the idea, I don't think that we need all this machinery
to support nanosecond resolution. I should maybe forget my idea of
using datetime.datetime or datetime.timedelta, or only support
int, float and decimal.Decimal.

datetime.datetime and datetime.timedelta are already compatible with
Decimal (except that they may lose precision because of an internal
conversion to float): datetime.datetime.fromtimestamp(t) and
datetime.timedelta(seconds=t).

If we only support int, float and Decimal, we don't need to add a new
protocol, hardcoded functions are enough :-)

>> os.stat: add new fields
>> -----------------------
>>
>> It was proposed to add 3 fields to os.stat() structure to get nanoseconds of
>> timestamps.
>
> It's worth noting that the challenge with this is that it's
> potentially time consuming to populate the extra fields, and that
> this approach doesn't help with the time APIs that return timestamps
> directly.

New fields can be optional (add a flag to get them), but I don't like
the idea of a structure with a variable number of fields, especially
because os.stat() structure can be used as a tuple (get a field by its
index).

Patching os.stat() doesn't solve the problem for the time module anyway.

>> Add an argument to change the result type
>> -----------------------------------------
>
> There should also be a description of the "set a boolean flag to
> request high precision output" approach.

You mean something like: time.time(hires=True)? Or time.time(decimal=True)?

Victor
|
|
On 2 February 2012 12:16, Victor Stinner <[hidden email]> wrote:
> Let's take an NTP timestamp in format (c): (sec=0,
> floatpart=100000000, divisor=2**32):
>
> >>> Decimal(100000000) * Decimal(10)**-10
> Decimal('0.0100000000')
> >>> Decimal(100000000) / Decimal(2)**32
> Decimal('0.023283064365386962890625')
>
> You have an error of 57%. Or do you mean that not only 2**32 should be
> modified, but also 100000000? How do you adapt 100000000 (floatpart)
> when changing the divisor (2**32 => 10**-10)? The format (c) avoids an
> operation (base^exponent) and avoids losing precision.

Am I missing something? If you're using the fixed point form
(fraction, exponent) then 0.023283064365386962890625 would be written
as (23283064365386962890625, -24). Same precision as the (100000000,
base=2, exponent=32) format.

Confused,
Paul
|
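
The equivalence Paul describes can be checked mechanically. The sketch below is only an illustration with made-up variable names; it reads the base-10 mantissa and exponent back from the Decimal result instead of typing them by hand, and counting the digits this way gives an exponent of -24 for the mantissa 23283064365386962890625.

```python
from decimal import Decimal
from fractions import Fraction

# Format (c): floatpart / divisor, kept exact as a rational number.
ntp_fraction = Fraction(100000000, 2**32)

# Decimal division is exact here because the quotient fits within the
# default context precision of 28 significant digits.
as_decimal = Decimal(100000000) / Decimal(2**32)
sign, digits, exponent = as_decimal.as_tuple()
mantissa = int("".join(map(str, digits)))

# Base-10 fixed point: mantissa * 10**exponent, rebuilt as an exact rational.
fixed_point = Fraction(mantissa, 10**-exponent)

# True when both notations describe exactly the same value.
same_value = (fixed_point == ntp_fraction)
```

The two forms agree for this value because 2**32 only has 2 as a prime factor, so the quotient has a finite decimal expansion. A divisor with another prime factor, such as 3, would not have one, and that is the case where the choice between a base-10 exponent and an arbitrary divisor starts to matter.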
|
In reply to this post by Victor STINNER
On Thu, Feb 2, 2012 at 10:16 PM, Victor Stinner
<[hidden email]> wrote:
> If we only support int, float and Decimal, we don't need to add a new
> protocol, hardcoded functions are enough :-)

Yup, that's why your middle-ground approach didn't make any sense to
me. Returning Decimal when a flag is set to request high precision
values actually handles everything (since any epoch related questions
only arise later when converting the decimal timestamp to an absolute
time value).

I think a protocol based approach would be *feasible*, but also
overkill for the specific problem we're trying to handle (i.e.
arbitrary precision timestamps).

If a dependency from time and os on the decimal module means we decide
to finally incorporate Stefan's cdecimal branch, I consider that a win
in its own right (there are some speed hacks in decimal that didn't
fare well in the Py3k transition because they went from being 8-bit
str based to Unicode str based. They didn't *break* from a correctness
point of view, but my money would be on their being pessimisations now
instead of optimisations).

>>> os.stat: add new fields
>>> -----------------------
> New fields can be optional (add a flag to get them), but I don't like
> the idea of a structure with a variable number of fields, especially
> because os.stat() structure can be used as a tuple (get a field by its
> index).
>
> Patching os.stat() doesn't solve the problem for the time module anyway.

We can't add new fields to the stat tuple anyway - it breaks tuple
unpacking. Any new fields would have been accessible by name only
(which poses its own problems, but is a solution we've used before -
in the codecs module, for example).

As you say though, this was never going to be adequate since it
doesn't help with the time APIs.

>>> Add an argument to change the result type
>>> -----------------------------------------
>>
>> There should also be a description of the "set a boolean flag to
>> request high precision output" approach.
>
> You mean something like: time.time(hires=True)? Or time.time(decimal=True)?

Yeah, I was thinking "hires" as the short form of "high resolution",
but it's a little confusing since it also parses as the word "hires"
(i.e. "hire"+"s"). "hi_res", "hi_prec" (for "high precision") or
"full_prec" (for "full precision") might be better.

I don't really like "decimal" as the flag name, since it confuses an
implementation detail (using decimal.Decimal) with the design intent
(preserving the full precision of the underlying timestamp).

Cheers,
Nick.

--
Nick Coghlan | [hidden email] | Brisbane, Australia
|
|
In reply to this post by Victor STINNER
> Even if I like the idea, I don't think that we need all this machinery
> to support nanosecond resolution. I should maybe forget my idea of
> using datetime.datetime or datetime.timedelta, or only support
> int, float and decimal.Decimal.

I updated my patch (issue #13882) to only support int, float and
decimal.Decimal types. I suppose that it is enough.

Only adding the decimal.Decimal type avoids many questions:

 - which API / protocol should be used to support other types
 - what is the start of a timestamp?
 - etc.

As we have seen: using the time.time(timestamp=type) API, it will be
easy to support new types later (using a new protocol, a registry like
Unicode codecs, or anything else).

Let's start with decimal.Decimal and support it correctly (e.g. patch
datetime.datetime.fromtimestamp() and os.*utime*() functions).

Victor
|
|
In reply to this post by Paul Moore
On Thu, Feb 2, 2012 at 10:45 PM, Paul Moore <[hidden email]> wrote:
> On 2 February 2012 12:16, Victor Stinner <[hidden email]> wrote:
>> Let's take an NTP timestamp in format (c): (sec=0,
>> floatpart=100000000, divisor=2**32):
>>
>> >>> Decimal(100000000) * Decimal(10)**-10
>> Decimal('0.0100000000')
>> >>> Decimal(100000000) / Decimal(2)**32
>> Decimal('0.023283064365386962890625')
>>
>> You have an error of 57%. Or do you mean that not only 2**32 should be
>> modified, but also 100000000? How do you adapt 100000000 (floatpart)
>> when changing the divisor (2**32 => 10**-10)? The format (c) avoids an
>> operation (base^exponent) and avoids losing precision.
>
> Am I missing something? If you're using the fixed point form
> (fraction, exponent) then 0.023283064365386962890625 would be written
> as (23283064365386962890625, -24). Same precision as the (100000000,
> base=2, exponent=32) format.

Yeah, Victor's persuaded me that the only two integer based formats
that would be sufficiently flexible are (integer, numerator, divisor)
and (integer, mantissa, base, exponent). The latter allows for a few
more optimised conversions in particular cases. Assuming a base of 10
would just make things unnecessarily awkward when the underlying base
is 2, though.

However, I think it's even more right to not have a protocol at all
and just use decimal.Decimal for arbitrary precision timestamps
(explicitly requested via a flag to preserve backwards compatibility).

Cheers,
Nick.

--
Nick Coghlan | [hidden email] | Brisbane, Australia
|
|
In reply to this post by Victor STINNER
On Thu, Feb 2, 2012 at 11:10 PM, Victor Stinner
<[hidden email]> wrote:
>> Even if I like the idea, I don't think that we need all this machinery
>> to support nanosecond resolution. I should maybe forget my idea of
>> using datetime.datetime or datetime.timedelta, or only support
>> int, float and decimal.Decimal.
>
> I updated my patch (issue #13882) to only support int, float and
> decimal.Decimal types. I suppose that it is just enough.
>
> Only adding decimal.Decimal type avoids many questions:
>
> - which API / protocol should be used to support other types
> - what is the start of a timestamp?
> - etc.
>
> As we have seen: using the time.time(timestamp=type) API, it will be easy to
> support new types later (using a new protocol, a registry like Unicode
> codecs, or anything else).

Yeah, I can definitely live with the type-based API if we restrict it to those 3 types.

Cheers,
Nick.

--
Nick Coghlan | [hidden email] | Brisbane, Australia
|
In reply to this post by Nick Coghlan
On Thu, 2 Feb 2012 23:07:28 +1000
Nick Coghlan <[hidden email]> wrote:
>
> We can't add new fields to the stat tuple anyway - it breaks tuple
> unpacking.

I don't think that's true. The stat tuple already has a varying number of fields:
http://docs.python.org/dev/library/os.html#os.stat

“For backward compatibility, the return value of stat() is also accessible as a tuple of *at least* 10 integers [...] More items may be added at the end by some implementations.” (emphasis mine)

So at most you could tuple-unpack os.stat(...)[:10].

(I've never seen code tuple-unpacking a stat tuple, myself. It sounds quite cumbersome to do so.)

> >>> Add an argument to change the result type
> >>> -----------------------------------------
> >>
> >> There should also be a description of the "set a boolean flag to
> >> request high precision output" approach.
> >
> > You mean something like: time.time(hires=True)? Or time.time(decimal=True)?
>
> Yeah, I was thinking "hires" as the short form of "high resolution",
> but it's a little confusing since it also parses as the word "hires"
> (i.e. "hire"+"s"). "hi_res", "hi_prec" (for "high precision") or
> "full_prec" (for "full precision") might be better.
>
> I don't really like "decimal" as the flag name, since it confuses an
> implementation detail (using decimal.Decimal) with the design intent
> (preserving the full precision of the underlying timestamp).

But that implementation detail will be visible to the user, including when combining the result with other numbers (as Decimal "wins" over float and int). IMHO it wouldn't be silly to make it explicit.

I think "hires" may confuse people into thinking the time source has a higher resolution, whereas it's only the return type. Perhaps it's just a documentation issue, though.

Regards

Antoine.
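The slicing suggestion above can be sketched as follows. This is only an illustration of the documented "at least 10 integers" contract; the variable names are arbitrary, and os.curdir is used simply because it always exists.

```python
import os

# Stat the current directory; any existing path would do.
st = os.stat(os.curdir)

# Slice before unpacking so the code keeps working even if an
# implementation appends more items to the tuple form.
mode, ino, dev, nlink, uid, gid, size, atime, mtime, ctime = st[:10]

# Tuple access yields whole seconds as integers, while the named
# attributes carry the sub-second part.
print(type(mtime).__name__, mtime)
print(type(st.st_mtime).__name__, st.st_mtime)
print(st.st_mtime_ns)  # integer nanoseconds, no rounding through float
```

Comparing `mtime` with `st.st_mtime_ns // 10**9` is a safe way to see that the tuple field is just the truncated timestamp.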
|
In reply to this post by Victor STINNER
On Thu, 2 Feb 2012 14:10:14 +0100
Victor Stinner <[hidden email]> wrote:
> > Even if I like the idea, I don't think that we need all this machinery
> > to support nanosecond resolution. I should maybe forget my idea of
> > using datetime.datetime or datetime.timedelta, or only support
> > int, float and decimal.Decimal.
>
> I updated my patch (issue #13882) to only support int, float and
> decimal.Decimal types. I suppose that it is just enough.

Why int? That doesn't seem to bring anything.

Regards

Antoine.
|
In reply to this post by Nick Coghlan
Nick Coghlan wrote:
> On Thu, Feb 2, 2012 at 10:16 PM, Victor Stinner
>>>> Add an argument to change the result type
>>>> -----------------------------------------
>>>
>>> There should also be a description of the "set a boolean flag to
>>> request high precision output" approach.
>>
>> You mean something like: time.time(hires=True)? Or time.time(decimal=True)?
>
> Yeah, I was thinking "hires" as the short form of "high resolution",
> but it's a little confusing since it also parses as the word "hires"
> (i.e. "hire"+"s"). "hi_res", "hi_prec" (for "high precision") or
> "full_prec" (for "full precision") might be better.

Isn't the above (having the return type depend on an argument setting) something we generally try to avoid?

I think it's better to settle on one type for high-res timers and add a new API(s) for it.

--
Marc-Andre Lemburg
eGenix.com
|
On Thu, Feb 2, 2012 at 11:31 PM, M.-A. Lemburg <[hidden email]> wrote:
> Isn't the above (having the return type depend on an argument
> setting) something we generally try to avoid ?

In Victor's actual patch, the returned object is an instance of the type you pass in, so it actually avoids that issue.

> I think it's better to settle on one type for high-res timers and
> add a new API(s) for it.

We've basically settled on decimal.Decimal now, so yeah, the decision becomes one of spelling - either new APIs that always return Decimal instances, or a way to ask the existing APIs to return Decimal instead of floats.

The way I see it, the latter should be significantly less hassle to maintain (since the code remains almost entirely shared), and it becomes trivial for someone to layer a convenience wrapper over the top that *always* requests the high precision output.

Cheers,
Nick.

--
Nick Coghlan | [hidden email] | Brisbane, Australia
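To make the "pass in the type, then layer a wrapper on top" idea concrete, here is a rough sketch. It does not use the time.time(timestamp=type) signature from the patch under discussion, which should not be assumed to exist in any released Python. Instead, the function `now` and its `timestamp` parameter are hypothetical stand-ins, built on time.time_ns() (available since Python 3.7) so that the Decimal result keeps every nanosecond digit.

```python
import functools
import time
from decimal import Decimal

# The three result types discussed in the thread.
_SUPPORTED_TYPES = (int, float, Decimal)


def now(timestamp=float):
    """Return seconds since the epoch as an instance of `timestamp`.

    Hypothetical stand-in for the proposed time.time(timestamp=type).
    """
    if timestamp not in _SUPPORTED_TYPES:
        raise TypeError("unsupported timestamp type: %r" % (timestamp,))
    ns = time.time_ns()  # integer nanoseconds since the epoch
    if timestamp is int:
        return ns // 10**9
    if timestamp is float:
        return ns / 10**9
    # Shift the decimal point by nine places; exact at default precision.
    return Decimal(ns).scaleb(-9)


# Convenience wrapper that always requests the high precision output.
now_decimal = functools.partial(now, timestamp=Decimal)

print(type(now()).__name__)          # float
print(type(now_decimal()).__name__)  # Decimal
```

Because the caller names the result type explicitly, the call site stays readable, and the wrapper is a one-liner because the underlying code path is shared.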
|
In reply to this post by Antoine Pitrou
> Why int? That doesn't seem to bring anything.
It helps to deprecate/replace os.stat_float_times(), which may be used for backward compatibility (with Python 2.2 ? :-)).
|
On Thu, 2 Feb 2012 15:09:41 +0100
Victor Stinner <[hidden email]> wrote:
> > Why int? That doesn't seem to bring anything.
>
> It helps to deprecate/replace os.stat_float_times(), which may be used
> for backward compatibility (with Python 2.2 ? :-)).

I must admit I don't understand the stat_float_times documentation:

“For compatibility with older Python versions, accessing stat_result as a tuple always returns integers.

Python now returns float values by default. Applications which do not work correctly with floating point time stamps can use this function to restore the old behaviour.”

These two paragraphs seem to contradict themselves.

That said, I don't understand why we couldn't simply deprecate stat_float_times() right now. Having an option for integer timestamps is pointless, you can just call int() on the result if you want.

Regards

Antoine.
|
> That said, I don't understand why we couldn't simply deprecate
> stat_float_times() right now. Having an option for integer timestamps
> is pointless, you can just call int() on the result if you want.

So which API do you propose for time.time() to get a Decimal object?

time.time(timestamp=decimal.Decimal)
time.time(decimal=True) or time.time(hires=True)

or something else?

Victor
|
In reply to this post by Nick Coghlan
On Feb 02, 2012, at 11:07 PM, Nick Coghlan wrote:
>Yup, that's why your middle-ground approach didn't make any sense to
>me. Returning Decimal when a flag is set to request high precision
>values actually handles everything (since any epoch related questions
>only arise later when converting the decimal timestamp to an absolute
>time value).

Guido really dislikes APIs where a flag changes the return type, and I agree with him. It's because this is highly unreadable:

    results = blah.whatever(True)

What the heck does that `True` do? It can be marginally better with a keyword-only argument, but not much.

I haven't read the whole thread so maybe this is a stupid question, but why can't we add a datetime-compatible higher precision type that hides all the implementation details?

-Barry
|
Victor's patch passes in the return type rather than a binary flag, thus avoiding this particular problem.

> I haven't read the whole thread so maybe this is a stupid question, but why

It's not a stupid question, but for backwards compatibility, what we would actually need is a version of Decimal that implicitly interoperates with binary floats. That's... not trivial.

Cheers,
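The interoperability concern can be seen directly in Python 3's decimal module: arithmetic between Decimal and int works, arithmetic between Decimal and float raises TypeError, and only comparisons mix freely. The short sketch below demonstrates this. The helper name `can_add` is made up for the demonstration, and the timestamp value is an arbitrary example rather than output from any real call.

```python
from decimal import Decimal


def can_add(a, b):
    """Return True if a + b is a supported operation (hypothetical helper)."""
    try:
        a + b
    except TypeError:
        return False
    return True


# An arbitrary example timestamp with nanosecond digits.
t = Decimal("1328219742.385039123")

print(can_add(t, 1))               # True: Decimal + int is fine
print(can_add(t, 0.5))             # False: Decimal + float raises TypeError
print(can_add(t, Decimal("0.5")))  # True: convert explicitly instead
print(t > 1328219742.0)            # True: comparisons with float are allowed
```

So existing code that adds a float offset to a timestamp would break if the timestamp silently became a Decimal, which is why the result type has to be requested explicitly.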
|
In reply to this post by Victor STINNER
On Wed, Feb 1, 2012 at 5:03 PM, Victor Stinner
<[hidden email]> wrote:
> datetime.datetime
> -----------------
>
> datetime.datetime only supports microsecond resolution, but can be enhanced
> to support nanosecond.
>
> datetime.datetime has issues:
>
> - there is no easy way to convert it into "seconds since the epoch"

Not true:

>>> import datetime, time
>>> epoch = datetime.datetime(1970, 1, 1, 0, 0, 0)
>>> (datetime.datetime.utcnow() - epoch).total_seconds()
1328219742.385039
>>> time.time()
1328219747.640937
>>>

> - any broken-down time has issues of time stamp ordering in the
> duplicate hour of switching from DST to normal time

Only if you insist on putting it in a timezone. Use UTC, and you should be fine.

> - time zone support is flaky-to-nonexistent in the datetime module

Why do you need time zone support for system interfaces that return times in UTC?

I think I saw another objection that datetime represented points in time, while functions like time.time() and os.stat() return an offset from the epoch. This objection seems silly to me: the return value of the system interfaces intends to represent points in time, even though it has to be implemented as an offset since an epoch because of limitations in C, and datetime is also implemented as an offset from an epoch (year 0).

On the other hand, the return value of functions like time.clock() is _not_ intended to represent an exact point in time, and so should be either a timedelta or Decimal.

Jeffrey
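A variant of the interpreter session above that avoids going through float at all may be useful here. This sketch uses a timezone-aware UTC epoch and integer division of timedelta objects, both of which exist in Python 3.2 and later. The names `EPOCH` and `to_epoch_microseconds` are assumptions for illustration, and the sample datetime is simply the instant corresponding to the 1328219742.385039 value shown in the session.

```python
import datetime

UTC = datetime.timezone.utc
EPOCH = datetime.datetime(1970, 1, 1, tzinfo=UTC)


def to_epoch_microseconds(dt):
    """Exact integer microseconds since the epoch for an aware datetime.

    Hypothetical helper; timedelta // timedelta yields an int, so no
    precision is lost to float rounding.
    """
    return (dt - EPOCH) // datetime.timedelta(microseconds=1)


# The instant matching the 1328219742.385039 value in the session above.
dt = datetime.datetime(2012, 2, 2, 21, 55, 42, 385039, tzinfo=UTC)

print(to_epoch_microseconds(dt))    # 1328219742385039
print((dt - EPOCH).total_seconds())  # float seconds, as in the session
```

Staying in integer microseconds keeps everything datetime can represent; the float returned by total_seconds() is convenient but is exactly where sub-microsecond precision would be lost if datetime ever gained it.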
|
In reply to this post by Antoine Pitrou
On 2/2/2012 6:28 AM, Antoine Pitrou wrote:
> On Thu, 2 Feb 2012 15:09:41 +0100 Victor Stinner [hidden email] wrote:
>>> Why int? That doesn't seem to bring anything.
>>
>> It helps to deprecate/replace os.stat_float_times(), which may be used
>> for backward compatibility (with Python 2.2 ? :-)).
>
> I must admit I don't understand the stat_float_times documentation:
>
> “For compatibility with older Python versions, accessing stat_result as
> a tuple always returns integers.
>
> Python now returns float values by default. Applications which do not
> work correctly with floating point time stamps can use this function to
> restore the old behaviour.”
>
> These two paragraphs seem to contradict themselves.
>
> That said, I don't understand why we couldn't simply deprecate
> stat_float_times() right now. Having an option for integer timestamps
> is pointless, you can just call int() on the result if you want.
>
> Regards
>
> Antoine.

Sorry to bring this up, but the PEP should probably consider another option: introducing a precedent-following os.stat_decimal_times(). Like os.stat_float_times, it would decide the return types of timestamps from os.stat. Or something along that line. Having it affect the results of time.time would be weird, though. And the whole design of os.stat_float_times smells of something being designed wrong in the first place, to need such an API to retain backward compatibility. But I'm not sure it is, even yet, designed for such flexibility.
