Quantcast

dBase III and VFP tables

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

dBase III and VFP tables

Ethan Furman-2
Greetings,

I am finally adding Null support to my dbf package
(http://pypi.python.org/pypi/dbf) and, as usual, the VFP side is being a
pain.

The issue is that the Logical, Date, and DateTime field types, even when
the table does not allow Null values, can be empty -- and when Null
values are allowed, those fields can be empty or Null.  (Empty pretty
much meaning not initialized -- neither True nor False, and no Date nor
Time.)

At this point, also, my module does not support the DB API (that is a
goal though), so right now you pretty much get direct access to the
physical table.

My question:  for the direct access side, is there any advantage to
treating an empty L, D, or T field differently from a Null L, D, or T
field?  Or should they be semantically the same, always returning, for
example, None whether it's actually Null in the table or just empty?

~Ethan~
_______________________________________________
DB-SIG maillist  -  [hidden email]
http://mail.python.org/mailman/listinfo/db-sig
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: dBase III and VFP tables

Carl Karsten
On Tue, Jul 19, 2011 at 6:39 PM, Ethan Furman <[hidden email]> wrote:

> Greetings,
>
> I am finally adding Null support to my dbf package
> (http://pypi.python.org/pypi/dbf) and, as usual, the VFP side is being a
> pain.
>
> The issue is that the Logical, Date, and DateTime field types, even when the
> table does not allow Null values, can be empty -- and when Null values are
> allowed, those fields can be empty or Null.  (Empty pretty much meaning not
> initialized -- neither True nor False, and no Date nor Time.)
>
> At this point, also, my module does not support the DB API (that is a goal
> though), so right now you pretty much get direct access to the physical
> table.
>
> My question:  for the direct access side, is there any advantage to treating
> an empty L, D, or T field differently from a Null L, D, or T field?  Or
> should they be semantically the same, always returning, for example, None
> whether it's actually Null in the table or just empty?

for those of you that are wondering about this nuttyness:  VFP is a
dBase derivative.  In there early days, there was no support for Null.
 But a side effect of adding a row but not storing values to it was
empty fields.  the best example is a date.  dates are stored on disk
as a string "YYYYMMDD"  and in the case of an empty date: "        ".
Same with numbers: N(6,2) would store 3.14 as "  3.14" and an empty
number was "      ".  Then they added Null support, but instead of
using this empty state, they added a flag, so now a field could be
Null, Blank or have a value.  which was a mistake, but here we are.

For D/T I am not sure you have a choice.  What value would you return
for an empty date other than Null?

for Logical I would return False.  Mainly because I am pretty sure
that because in VFP an empty logical will be treated as .f. anywhere a
logical is evaluated.



--
Carl K
_______________________________________________
DB-SIG maillist  -  [hidden email]
http://mail.python.org/mailman/listinfo/db-sig
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: dBase III and VFP tables

Ethan Furman-2
Carl Karsten wrote:
> for those of you that are wondering about this nuttyness:  VFP is a
> dBase derivative.  In there early days, there was no support for Null.
>  But a side effect of adding a row but not storing values to it was
> empty fields.  the best example is a date.  dates are stored on disk
> as a string "YYYYMMDD"  and in the case of an empty date: "        ".
> Same with numbers: N(6,2) would store 3.14 as "  3.14" and an empty
> number was "      ".  Then they added Null support, but instead of
> using this empty state, they added a flag, so now a field could be
> Null, Blank or have a value.  which was a mistake, but here we are.

Thanks for the clarification, Carl.

> For D/T I am not sure you have a choice.  What value would you return
> for an empty date other than Null?

As it happens, I created custom Date, Time, and DateTime objects so that
I could deal with the empty values.  I probably just should have
returned None from the beginning, but did I mention that this is the
project I have learned Python with?

Now that I have them, I'm thinking of keeping and continuing to use them
for the empty case... but not sure if that's a great idea, hence my
question here.

> for Logical I would return False.  Mainly because I am pretty sure
> that because in VFP an empty logical will be treated as .f. anywhere a
> logical is evaluated.

The test I tried before I left work seems to confirm that (it was just
printing).

~Ethan~
_______________________________________________
DB-SIG maillist  -  [hidden email]
http://mail.python.org/mailman/listinfo/db-sig
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: dBase III and VFP tables

Carl Karsten
On Tue, Jul 19, 2011 at 8:41 PM, Ethan Furman <[hidden email]> wrote:

> Carl Karsten wrote:
>>
>> for those of you that are wondering about this nuttyness:  VFP is a
>> dBase derivative.  In there early days, there was no support for Null.
>>  But a side effect of adding a row but not storing values to it was
>> empty fields.  the best example is a date.  dates are stored on disk
>> as a string "YYYYMMDD"  and in the case of an empty date: "        ".
>> Same with numbers: N(6,2) would store 3.14 as "  3.14" and an empty
>> number was "      ".  Then they added Null support, but instead of
>> using this empty state, they added a flag, so now a field could be
>> Null, Blank or have a value.  which was a mistake, but here we are.
>
> Thanks for the clarification, Carl.
>
>> For D/T I am not sure you have a choice.  What value would you return
>> for an empty date other than Null?
>
> As it happens, I created custom Date, Time, and DateTime objects so that I
> could deal with the empty values.  I probably just should have returned None
> from the beginning, but did I mention that this is the project I have
> learned Python with?

Impressive!

>
> Now that I have them, I'm thinking of keeping and continuing to use them for
> the empty case... but not sure if that's a great idea, hence my question
> here.

I would say supporting Empty is a backwards compatibility thing: If
there was code that relied on it, then you should continue to support
it.

When I was using VFP I never needed both values.

My guess is the custom objects will cause problems and solve none.


>
>> for Logical I would return False.  Mainly because I am pretty sure
>> that because in VFP an empty logical will be treated as .f. anywhere a
>> logical is evaluated.
>
> The test I tried before I left work seems to confirm that (it was just
> printing).



--
Carl K
_______________________________________________
DB-SIG maillist  -  [hidden email]
http://mail.python.org/mailman/listinfo/db-sig
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: dBase III and VFP tables

Ethan Furman-2
Carl Karsten wrote:
> I would say supporting Empty is a backwards compatibility thing: If
> there was code that relied on it, then you should continue to support
> it.
>
> When I was using VFP I never needed both values.
>
> My guess is the custom objects will cause problems and solve none.


Okay, I think what I'll do is keep it simple by default: None will be
returned both for Empty and Null, and if None is assigned to a field it
will be written as either Empty, or Null if that field is Nullable.
I'll keep the custom objects around and provide a mechanism to specify
which objects to return based on field type/null status, so if anyone
wants, for example, Decimals instead of floats they can have them, or
Empty/Null objects they can have those too.

Seem reasonable?

~Ethan~
_______________________________________________
DB-SIG maillist  -  [hidden email]
http://mail.python.org/mailman/listinfo/db-sig
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: dBase III and VFP tables

Carl Karsten
On Wed, Jul 20, 2011 at 10:58 AM, Ethan Furman <[hidden email]> wrote:

> Carl Karsten wrote:
>>
>> I would say supporting Empty is a backwards compatibility thing: If
>> there was code that relied on it, then you should continue to support
>> it.
>>
>> When I was using VFP I never needed both values.
>>
>> My guess is the custom objects will cause problems and solve none.
>
>
> Okay, I think what I'll do is keep it simple by default: None will be
> returned both for Empty and Null, and if None is assigned to a field it will
> be written as either Empty, or Null if that field is Nullable. I'll keep the
> custom objects around and provide a mechanism to specify which objects to
> return based on field type/null status, so if anyone wants, for example,
> Decimals instead of floats they can have them, or Empty/Null objects they
> can have those too.
>
> Seem reasonable?

Yep.

I was kinda thinking of keeping the empty-able objects around.   I
hate the idea of trying to document them;  I hardly like trying to
keep track of how VFP handles them :)

--
Carl K
_______________________________________________
DB-SIG maillist  -  [hidden email]
http://mail.python.org/mailman/listinfo/db-sig
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: dBase III and VFP tables

Chris Clark-2
In reply to this post by Carl Karsten
Carl Karsten wrote:

> On Tue, Jul 19, 2011 at 6:39 PM, Ethan Furman <[hidden email]> wrote:
>  
>> I am finally adding Null support to my dbf package
>> (http://pypi.python.org/pypi/dbf) and, as usual, the VFP side is being a
>> pain.
>>
>> The issue is that the Logical, Date, and DateTime field types, even when the
>> table does not allow Null values, can be empty -- and when Null values are
>> allowed, those fields can be empty or Null.  (Empty pretty much meaning not
>> initialized -- neither True nor False, and no Date nor Time.)
>>
>> At this point, also, my module does not support the DB API (that is a goal
>> though), so right now you pretty much get direct access to the physical
>> table.
>>    

Take a look at engine (formally SnakeSQL) and squawk for ideas on adding
both SQL and a dbapi interface.

http://pypi.python.org/pypi/engine/
http://pypi.python.org/pypi/SnakeSQL/

http://pypi.python.org/pypi/squawk/

I've not used either so this isn't a recommendation but they look very cool.

>> My question:  for the direct access side, is there any advantage to treating
>> an empty L, D, or T field differently from a Null L, D, or T field?  Or
>> should they be semantically the same, always returning, for example, None
>> whether it's actually Null in the table or just empty?
>>    
>
> for those of you that are wondering about this nuttyness:  VFP is a
> dBase derivative.  In there early days, there was no support for Null.
>  But a side effect of adding a row but not storing values to it was
> empty fields.  the best example is a date.  dates are stored on disk
> as a string "YYYYMMDD"  and in the case of an empty date: "        ".
> Same with numbers: N(6,2) would store 3.14 as "  3.14" and an empty
> number was "      ".  Then they added Null support, but instead of
> using this empty state, they added a flag, so now a field could be
> Null, Blank or have a value.  which was a mistake, but here we are.
>
> For D/T I am not sure you have a choice.  What value would you return
> for an empty date other than Null?
>
> for Logical I would return False.  Mainly because I am pretty sure
> that because in VFP an empty logical will be treated as .f. anywhere a
> logical is evaluated.
>  

My 2 cents, it looks like the "type" number "n(6,2)" is really a string
with some restrictions on formatting (i.e. only contains digits, a
single period, and spaces). You could simple return strings for
everything. An optional (and configurable) type conversion layer could
then be used on top of that.

Ingres has an old (non ANSI-) date type that we still support, this old
date can be empty (empty string '' rather than blank padded as described
in earlier in the thread). IMHO dbapi drivers can't go returning empty
strings for dates (as this is wildly incompatible with other DBMSs), we
return a "magic" date for empty dates (the year 9999). It is reasonable
to add an extension return type that supports empty, e.g. one could then
do checks like:

if my_result is empty:
    ....

But I'd suggest that empty support NOT be enabled by default. So lots of
options :-S


Chris

_______________________________________________
DB-SIG maillist  -  [hidden email]
http://mail.python.org/mailman/listinfo/db-sig
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: dBase III and VFP tables

Ethan Furman-2
Chris Clark wrote:
> Take a look at engine (formally SnakeSQL) and squawk for ideas on adding
> both SQL and a dbapi interface.
>
> http://pypi.python.org/pypi/engine/
> http://pypi.python.org/pypi/SnakeSQL/
>
> http://pypi.python.org/pypi/squawk/

Thanks, I'll check them out.


> My 2 cents, it looks like the "type" number "n(6,2)" is really a string
> with some restrictions on formatting (i.e. only contains digits, a
> single period, and spaces). You could simple return strings for
> everything. An optional (and configurable) type conversion layer could
> then be used on top of that.

I'm going for the configurable route.  VFP and dBase IV and above have
binary fields as well as text fields, so just returning what I find in
the table doesn't feel very friendly.  By default I'll be returning
ints/floats, bools, dates, and Char (another custom type that treats
whitespace as false, as well as ignoring trailing white space for
comparisons) with a mechanism for overriding either on a by type basis
(any Number) or a field basis (only Age).


> Ingres has an old (non ANSI-) date type that we still support, this old
> date can be empty (empty string '' rather than blank padded as described
> in earlier in the thread). IMHO dbapi drivers can't go returning empty
> strings for dates (as this is wildly incompatible with other DBMSs), we
> return a "magic" date for empty dates (the year 9999). It is reasonable
> to add an extension return type that supports empty, e.g. one could then
> do checks like:
>
> if my_result is empty:
>    ....
>
> But I'd suggest that empty support NOT be enabled by default. So lots of
> options :-S

Yes, that's the conclusion I have come to.

Thanks for the links and your comments!

~Ethan~
_______________________________________________
DB-SIG maillist  -  [hidden email]
http://mail.python.org/mailman/listinfo/db-sig
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: dBase III and VFP tables

Carl Karsten
On Wed, Jul 20, 2011 at 11:33 AM, Ethan Furman <[hidden email]> wrote:
> as well as ignoring trailing white space for comparisons)

I wouldn't bother supporting that in the driver.

Another background lesson: dBase started as an interactive
environment. It didn't look like a spreadsheet, but think of a
spreadsheet: the typical use is to start typing stuff in by hand,
select some data with the mouse, hit sort,  select some numbers and
click Summation,  see the results on the screen.  At the time things
were still in the relm of teletype machines were the only thing you
looked at was printed output.  Screens were new, and the API was still
much like a printer.  (not exactly true, there was the EDIT command
that would clear the screen, draw field names and input boxes and let
you enter data and tab/back tab around the screen.)

It wasn't until later versions that they added the concept of
stringing a bunch of commands together.  conditionals came later.  A
design goal was syntax that is easy to type interactively, not make
sense in a formal programming language.    today we have "incremental
search" where as you type query parameters the search is executed and
you see results.  If typing "KAR" is all you need to figure out "CARL
KARSTEN" is the person you are looking for, then 3 keys are all you
need to press.  likewise, in dBase you could do
DISPLAY FOR LNAME="KAR"
and see just the rows where LNAME _begines with_ "KAR"  which makes
sense for an interactive environment, but causes grief when you try to
write programs.  later they added the == operator, but maintained
backwards computability with the _only compare until you run out of
chars_ behaviour.

_backwards computability_ is the only reason to implement this in the
driver, and given there is currenly no code relying on it, nothing
will break if you don't.  So don't :)

--
Carl K
_______________________________________________
DB-SIG maillist  -  [hidden email]
http://mail.python.org/mailman/listinfo/db-sig
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: dBase III and VFP tables

Ethan Furman-2
Carl Karsten wrote:
> On Wed, Jul 20, 2011 at 11:33 AM, Ethan Furman <[hidden email]> wrote:
>> as well as ignoring trailing white space for comparisons)
>
> _backwards computability_ is the only reason to implement this in the
> driver, and given there is currenly no code relying on it, nothing
> will break if you don't.  So don't :)

Okay, have you written a book?  'Cause I'm ready to buy it.  ;)

The reason I have that (and I can make it be the option, not the
default), is that without it I have to write.... oh -- I could use
.startswith(), couldn't I?  Well, I was going to say that code such as:

# look for so-and-so...
for rec in table:
     if rec.full_name == 'John Doe':
         do_something_with(rec)

and of course this will fail because FULL_NAME is a 40 character field,
and I should have said

     if rec.full_name == 'John Doe                                '

and I really don't want to do that!  Also, .startswith() is not the same
as ignoring trailing whitespace.

Given this little example, would you still not worry about trailing
whitespace?

~Ethan~
_______________________________________________
DB-SIG maillist  -  [hidden email]
http://mail.python.org/mailman/listinfo/db-sig
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: dBase III and VFP tables

Carl Karsten
On Wed, Jul 20, 2011 at 12:36 PM, Ethan Furman <[hidden email]> wrote:

> Carl Karsten wrote:
>>
>> On Wed, Jul 20, 2011 at 11:33 AM, Ethan Furman <[hidden email]> wrote:
>>>
>>> as well as ignoring trailing white space for comparisons)
>>
>> _backwards computability_ is the only reason to implement this in the
>> driver, and given there is currenly no code relying on it, nothing
>> will break if you don't.  So don't :)
>
> Okay, have you written a book?  'Cause I'm ready to buy it.  ;)
>
> The reason I have that (and I can make it be the option, not the default),
> is that without it I have to write.... oh -- I could use .startswith(),
> couldn't I?  Well, I was going to say that code such as:
>
> # look for so-and-so...
> for rec in table:
>    if rec.full_name == 'John Doe':
>        do_something_with(rec)
>
> and of course this will fail because FULL_NAME is a 40 character field, and
> I should have said
>
>    if rec.full_name == 'John Doe                                '
>
> and I really don't want to do that!  Also, .startswith() is not the same as
> ignoring trailing whitespace.
>
> Given this little example, would you still not worry about trailing
> whitespace?


Oh yeah, that.

>>> "abc   ".strip()
'abc'

  if rec.full_name.strip() == 'John Doe'

Personally, I would leave it at that.

half baked idea: If you want to add some code to hide that, add some
sort of auto-strip to the driver so it drops the extra spaces.


--
Carl K
_______________________________________________
DB-SIG maillist  -  [hidden email]
http://mail.python.org/mailman/listinfo/db-sig
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: dBase III and VFP tables

Ethan Furman-2
Carl Karsten wrote:

>> Given this little example, would you still not worry about trailing
>> whitespace?
>
>
> Oh yeah, that.
>
>--> "abc   ".strip()
> 'abc'
>
>   if rec.full_name.strip() == 'John Doe'
>
> Personally, I would leave it at that.
>
> half baked idea: If you want to add some code to hide that, add some
> sort of auto-strip to the driver so it drops the extra spaces.

Heh, that's what I've been doing.  It did two things for me: 1) made
equality checking easier; and 2) made boolean checks make sense.  It has
its own problems though -- mainly that single character empty fields
don't work well:

if rtyp in 'FMU': # is this Forwardable, Moved, or Undeliverable?

' ' gets stripped down to '', and '' is in every string, this doesn't work.

And yes, I could use a .strip() every time I have a field comparison,
but I have *lots* of them in my code, and it feels an awful lot like
unnecessary boiler plate.

Okay, I'll leave the default at str, and make Char, like Date, DateTime,
and Logical, be available options.

~Ethan~
_______________________________________________
DB-SIG maillist  -  [hidden email]
http://mail.python.org/mailman/listinfo/db-sig
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: dBase III and VFP tables

Carl Karsten
On Wed, Jul 20, 2011 at 1:46 PM, Ethan Furman <[hidden email]> wrote:

> Carl Karsten wrote:
>>>
>>> Given this little example, would you still not worry about trailing
>>> whitespace?
>>
>>
>> Oh yeah, that.
>>
>> --> "abc   ".strip()
>> 'abc'
>>
>>  if rec.full_name.strip() == 'John Doe'
>>
>> Personally, I would leave it at that.
>>
>> half baked idea: If you want to add some code to hide that, add some
>> sort of auto-strip to the driver so it drops the extra spaces.
>
> Heh, that's what I've been doing.  It did two things for me: 1) made
> equality checking easier; and 2) made boolean checks make sense.  It has its
> own problems though -- mainly that single character empty fields don't work
> well:
>
> if rtyp in 'FMU': # is this Forwardable, Moved, or Undeliverable?
>
> ' ' gets stripped down to '', and '' is in every string, this doesn't work.

huh. yeah, its true.  i'm kinda surprised.

>>> '' in 'abc'
True

But this works:
>>> '' in ['a','b','c']
False
>>> 'b' in ['a','b','c']
True


>
> And yes, I could use a .strip() every time I have a field comparison, but I
> have *lots* of them in my code, and it feels an awful lot like unnecessary
> boiler plate.

I know what you mean.

My advice:

2 wads of code: the dbapi module and some wrapper to make it nicer to work with.

>
> Okay, I'll leave the default at str, and make Char, like Date, DateTime, and
> Logical, be available options.
>
> ~Ethan~
> _______________________________________________
> DB-SIG maillist  -  [hidden email]
> http://mail.python.org/mailman/listinfo/db-sig
>



--
Carl K
_______________________________________________
DB-SIG maillist  -  [hidden email]
http://mail.python.org/mailman/listinfo/db-sig
Loading...