Don't assume that missing fields from POST data are equal to an empty string value.


Tai Lee-2
There is a potential for data loss with optional form fields that are
(for whatever reason) omitted from an HTML template.

For example, if we take an existing model form and template that
works, add an optional character field to the model but fail to add a
corresponding field to the HTML template (e.g. human error, forgot
about a template, didn't tell the template author to make a change,
didn't realise a change needed to be made to a template), when that
form is submitted Django will assume that the user has provided an
empty string value for the missing field and save that to the model,
erasing any existing value. This is not a bug, but it is relatively
easy to trigger silent and unexpected data loss.

I have briefly discussed this with PaulM and dstufft on django-dev,
but we did not reach any consensus.

RFC1866 [1] says:

> The fields are listed in the order they appear in the
> document with the name separated from the value by `=' and
> the pairs separated from each other by `&'. Fields with null
> values may be omitted. In particular, unselected radio
> buttons and checkboxes should not appear in the encoded
> data, but hidden fields with VALUE attributes present
> should.

The HTML4 spec at W3C.org [2] says:

> Users interact with forms through named controls.
>
> A control's "control name" is given by its name attribute. The scope of the
> name attribute for a control within a FORM element is the FORM element.
>
> Each control has both an initial value and a current value, both of which are
> character strings. Please consult the definition of each control for
> information about initial values and possible constraints on values imposed by
> the control. In general, a control's "initial value" may be specified with the
> control element's value attribute. However, the initial value of a TEXTAREA
> element is given by its contents, and the initial value of an OBJECT element
> in a form is determined by the object implementation (i.e., it lies outside
> the scope of this specification).
>
> The control's "current value" is first set to the initial value. Thereafter,
> the control's current value may be modified through user interaction and
> scripts.
>
> A control's initial value does not change. Thus, when a form is reset, each
> control's current value is reset to its initial value. If a control does not
> have an initial value, the effect of a form reset on that control is
> undefined.
>
> When a form is submitted for processing, some controls have their name paired
> with their current value and these pairs are submitted with the form. Those
> controls for which name/value pairs are submitted are called successful
> controls.

as well as [3]:

> A successful control is "valid" for submission. Every successful control has
> its control name paired with its current value as part of the submitted form
> data set. A successful control must be defined within a FORM element and must
> have a control name.
>
> However:
>
> * Controls that are disabled cannot be successful.
> * If a form contains more than one submit button, only the activated submit
>   button is successful.
> * All "on" checkboxes may be successful.
> * For radio buttons that share the same value of the name attribute, only the
>   "on" radio button may be successful.
> * For menus, the control name is provided by a SELECT element and values are
>   provided by OPTION elements. Only selected options may be successful. When
>   no options are selected, the control is not successful and neither the name
>   nor any values are submitted to the server when the form is submitted.
> * The current value of a file select is a list of one or more file names. Upon
>   submission of the form, the contents of each file are submitted with the
>   rest of the form data. The file contents are packaged according to the
>   form's content type.
> * The current value of an object control is determined by the object's
>   implementation.
>
> If a control doesn't have a current value when the form is submitted, user
> agents are not required to treat it as a successful control.
>
> Furthermore, user agents should not consider the following controls
> successful:
>
> * Reset buttons.
> * OBJECT elements whose declare attribute has been set.
>
> Hidden controls and controls that are not rendered because of style sheet
> settings may still be successful.

I interpret the above to mean that any text input with a value
attribute (even `value=""`) is a successful control, and should be
included in the encoded form data. This is what current versions of
Chrome and Firefox do, at least. I have not found any examples of
browsers which are known not to do this.
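
As a rough illustration of the difference on the wire, here is a small
sketch using only Python's standard library (not Django). The field names
`title` and `notes` are made up for the example. It shows that an empty
text input and an input that was never rendered produce different
encoded form data:

```python
from urllib.parse import parse_qs

# Body sent when the "notes" input exists in the form but was left empty.
with_empty_field = parse_qs("title=Hello&notes=", keep_blank_values=True)

# Body sent when the template never rendered a "notes" input at all.
without_field = parse_qs("title=Hello", keep_blank_values=True)

print(with_empty_field)  # {'title': ['Hello'], 'notes': ['']}
print(without_field)     # {'title': ['Hello']}
print("notes" in with_empty_field, "notes" in without_field)  # True False
```

So the information needed to tell the two cases apart is present in the
request, as long as blank values are kept when parsing.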

What I would like to change in Django is for it to stop assuming that
missing POST data for a character field is actually an empty string,
and instead raise a form validation error that would prevent the model
instance from being saved to the database (potentially causing data
loss for that field).
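
To make the proposal concrete, here is a minimal sketch in plain Python.
It is not Django's form code; `MissingFieldError` and `clean_text_field`
are hypothetical names, and `data` stands in for the submitted POST
dictionary. An explicit empty string is accepted, while a key that is
absent altogether is rejected:

```python
class MissingFieldError(ValueError):
    """Raised when the submitted data has no key at all for a text field."""


def clean_text_field(data, name):
    """Return the submitted text, refusing to treat a missing key as ''."""
    if name not in data:
        raise MissingFieldError(f"No value was submitted for {name!r}.")
    return data[name]


# The user cleared the field on purpose: an empty string is a valid value.
print(repr(clean_text_field({"notes": ""}, "notes")))  # ''

# The field was never part of the submitted form: fail instead of saving ''.
try:
    clean_text_field({"title": "Hello"}, "notes")
except MissingFieldError as exc:
    print(exc)  # No value was submitted for 'notes'.
```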

I don't see any benefit in the current behaviour, except to
potentially support undetermined and unspecified UAs which might treat
form fields as unsuccessful controls if they have an empty string as
their value, and if those UAs accordingly do not include those fields
in the encoded form data. Even if such UAs were found, I would argue
that the RFC and HTML4 specs show that text fields with a value (even
an empty string) are successful controls that should be included.

Failing that, I would like for Django to at least raise a loud warning
when a form is bound to data that is missing values for character fields,
so that it will at least be easier to detect this silent and
unexpected data loss in a test environment, before it occurs in a
production environment.
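
The softer fallback could look something like the sketch below, again in
plain Python rather than Django's actual implementation. The helper name
`text_value_or_empty` and the choice of `RuntimeWarning` are assumptions
made for the example. The current behaviour (fall back to an empty
string) is kept, but the missing key is reported:

```python
import warnings


def text_value_or_empty(data, name):
    """Keep the fallback to '', but warn when the key is absent."""
    if name not in data:
        warnings.warn(
            f"Form data has no key {name!r}; falling back to an empty string.",
            RuntimeWarning,
            stacklevel=2,
        )
        return ""
    return data[name]


# Record warnings instead of printing them, as a test might do.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = text_value_or_empty({"title": "Hello"}, "notes")

print(repr(value))  # ''
print(len(caught))  # 1
```

Running a test suite with `python -W error` would then turn such a
warning into an exception, which makes the problem hard to miss before
it reaches production.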

Does anybody else have opposing or supporting arguments, or any
knowledge of actual UAs that do not include fields with empty values
in their encoded form data, or an alternative interpretation of the
RFC and HTML4 specs above?

Cheers.
Tai.

[1] http://www.rfc-editor.org/rfc/rfc1866.txt
[2] http://www.w3.org/TR/html4/interact/forms.html#h-17.2
[3] http://www.w3.org/TR/html4/interact/forms.html#successful-controls

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To post to this group, send email to [hidden email].
To unsubscribe from this group, send email to [hidden email].
For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.


Re: Don't assume that missing fields from POST data are equal to an empty string value.

Babatunde Akinyanmi
-1
I think a programmer should not specify a field that would contain
important data as optional in the first place. If the data loss from
not including it in the form is going to cause problems then it should
not be optional.


Re: Don't assume that missing fields from POST data are equal to an empty string value.

Daniel Sokolowski
+1. Even though I agree with what Babatunde said, I support this change, as anything that minimizes 'gotchas' during development is very good. I would rather get an exception than spend half an hour debugging why my data is not being saved. Sure, once I figure it out I would learn to pay attention in similar situations in the future, but realistically I shouldn't have to, and nor should the next Python/Django programmer rediscovering this caveat.


Re: Don't assume that missing fields from POST data are equal to an empty string value.

Andre Terra
+1 on a loud exception being raised.
"Errors should never pass silently.
Unless explicitly silenced."

Cheers,
AT



Re: Don't assume that missing fields from POST data are equal to an empty string value.

Tom Evans-3
In reply to this post by Daniel Sokolowski
On Wed, Jan 11, 2012 at 3:29 PM, Daniel Sokolowski
<[hidden email]> wrote:
> +1 even though I agree with what Babatunde said I support this change as
> anything that minimizes 'gotchas' during development is very good. I would
> rather get an exception instead of spending half an hour debugging why my
> data is not being saved. Sure once I figure it out I would learn to pay
> attention in similar situations in the future but realistically I shouldn't
> and nor should the next python/django programmer re-discovering this caveat.
>

The consequence of this is not that data would fail to be saved; it is
that previously saved data is erased by Django when other fields on the
model are updated.

E.g., if you had a Profile model:

from django.db import models

class Profile(models.Model):
    logo = models.ImageField()
    salutation = models.CharField(max_length=64)

and there is a model form for updating the logo:

from django import forms

class UpdateProfileLogoForm(forms.ModelForm):
    class Meta:
        model = Profile
        exclude = ('salutation',)

and the view that presents this form renders it like so:

<form method='post'>
  {{ form.logo }}
  <input type='submit' value='submit'/>
</form>

then this would work correctly. However, consider what happens when an
additional field is added to Profile:

class Profile(models.Model):
    logo = models.ImageField()
    salutation = models.CharField(max_length=64)
    pet_name = models.CharField(max_length=128, blank=True)

Now the model form automatically gains an additional field, 'pet_name',
but because that field has blank=True, the form still validates when
the value is empty or not present at all.
If a user sets a pet_name and later updates their logo using this
model form, Django will assume the pet_name has been cleared, and the
pet_name stored in the DB will be overwritten with the empty string.

Are there any other situations where this can happen? The problem is
in fact caused by using 'exclude' to choose the fields presented in a
model form; using 'fields' and explicitly listing the fields seems
much safer.
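
To make the mechanism concrete, here is a minimal pure-Python sketch of the behaviour described above. It is a simplified model and not Django's actual code: `clean_optional_char` and `apply_form` are hypothetical helpers that only mimic how an optional character field treats a missing key the same as an empty value.

```python
# Simplified stand-in for the behaviour discussed in this thread; these
# helpers are hypothetical and are not part of Django.

def clean_optional_char(data, name):
    """Return the cleaned value of an optional text field.

    A key that is absent from the submitted data is treated exactly like
    a key submitted with an empty value: both clean to "".
    """
    raw = data.get(name)  # None when the control was never submitted
    return "" if raw in (None, "") else raw


def apply_form(instance, data, field_names):
    """Copy every cleaned form field onto a copy of the stored record."""
    updated = dict(instance)
    for name in field_names:
        updated[name] = clean_optional_char(data, name)
    return updated


# Existing row: the user has already set a pet_name.
profile = {"logo": "old.png", "pet_name": "Rex"}

# The template only renders the logo control, so the POST data has no
# 'pet_name' key at all.
post_data = {"logo": "new.png"}

# The form, however, covers both fields, so pet_name is overwritten.
saved = apply_form(profile, post_data, ["logo", "pet_name"])
print(saved)
```

In this sketch the stored 'Rex' is replaced by the empty string even though the submitted data never mentioned pet_name, which is the silent data loss under discussion.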

Cheers

Tom

Re: Don't assume that missing fields from POST data are equal to an empty string value.

Tom Evans-3
In reply to this post by Andre Terra
On Wed, Jan 11, 2012 at 3:38 PM, Andre Terra <[hidden email]> wrote:
> +1 on a loud exception being raised.
>
> "Errors should never pass silently.
> Unless explicitly silenced."
>
>
> Cheers,
> AT
>

So, look at the example I (just) posted. The 'error' is that the
field-restricted form and template mistakenly don't output all the
fields they should. How should the framework detect that, given that
it is the same behaviour as if the control were emptied?

Cheers

Tom

Re: Don't assume that missing fields from POST data are equal to an empty string value.

Donald Stufft
In reply to this post by Tai Lee-2
I'm very much -1 on this change.

To "fix" this would require throwing an error any time an incomplete dictionary was passed as the data arg to a form. This would break any existing code that relies on the current behaviour (in particular, it's common practice to accept a subset of the fields via JSON). So this would be a backwards-incompatible change.

Furthermore, I disagree with the interpretation of the RFC as provided. The RFC states that a UA may choose not to send a form field if it contains a null value. So the question becomes: what is a null value as far as the RFC is concerned? As I could not find any mention of what constitutes a null value in the RFC, I went to my browser. Using JavaScript, I executed ``document.getElementById('textfield').value = null`` in the JS console. The result was that the value was set to "". So my browser (Chrome on OS X) treats null and "" as equivalent.

Going by my personal interpretation of the RFC and Chrome's behavior in my JavaScript test, I can only conclude that the proposed change would cause Django forms to violate the RFC. Violating an RFC isn't always the wrong thing to do in and of itself, but I believe it should only be done when the RFC and implementations are at odds in a way that makes them incompatible with each other. In this case they are not: the RFC is simply more permissive. Since Django does not have a list of supported browsers, we must strive to follow the RFCs where we can (and where behaviour is actually defined), and deviate only when the alternative is being broken in major browsers.

Additionally, I believe there are two major error conditions in this case.

A) The proposed change is made, a visitor is using a UA that (as I read it) follows the RFC, and any Django forms with optional, unfilled values stop working for this visitor.
B) The proposed change is not made, and when an optional form field is left off a form (or JSON, or any partially incomplete dictionary of values), the form assumes the default initial value of "".

Neither error condition is optimal; however, A has the additional downside that the error is completely outside the developer's control, whereas B is the result of developer error and is under the developer's control.

Re: Don't assume that missing fields from POST data are equal to an empty string value.

Tai Lee-2
In reply to this post by Babatunde Akinyanmi
Whether a field is optional or required is not the same thing as
whether it is important. It is entirely valid to have important but
optional fields on a model.

For example, a field that contains a string reference to a Python
callback function that modifies the behaviour of some model methods
when set.

We use this on a Broadcast model in a mailing list application. The
callback is used to generate additional context for the email template
for personalisation when sending out email newsletters.

This is important data that should not be lost as it will change the
behaviour of model methods (in this case, potentially sending the same
email newsletter without personalisation to many recipients), but it
may not be required for every instance of the model.

Cheers.
Tai.


On Jan 12, 1:11 am, Babatunde Akinyanmi <[hidden email]> wrote:
> -1
> I think a programmer should not specify a field that would contain
> important data as optional in the first place. If the data loss from
> not including it in the form is going to cause problems then it should
> not be optional.

Re: Don't assume that missing fields from POST data are equal to an empty string value.

Tai Lee-2
In reply to this post by Tom Evans-3
Tom, the problem is not caused by using `exclude`; it is caused by not
using `fields`. If you use `exclude`, or simply don't specify either
`exclude` or `fields`, then every model field that isn't excluded will
be included, so a field added to the model later is picked up by the
form automatically. Thanks for your other example and description of
the problem; I just wanted to clarify that it is not only a problem
when using `exclude`.
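
The difference between the two options can be sketched in plain Python. The helper below, `select_form_fields`, is hypothetical and only models the selection rule described in this message; it is not Django's implementation.

```python
# Hypothetical helper modelling how a model form picks its fields; this is
# a simplified illustration, not Django's implementation.

def select_form_fields(model_fields, fields=None, exclude=None):
    """Return the names of the model fields that end up on the form."""
    selected = []
    for name in model_fields:
        if fields is not None and name not in fields:
            continue  # an explicit list skips everything not named in it
        if exclude is not None and name in exclude:
            continue
        selected.append(name)
    return selected


before = ["logo", "salutation"]
after = ["logo", "salutation", "pet_name"]  # pet_name is added later

# Using exclude (or neither option): the new field appears automatically.
print(select_form_fields(before, exclude=["salutation"]))
print(select_form_fields(after, exclude=["salutation"]))
print(select_form_fields(after))

# Using fields: the form is unchanged until the list itself is edited.
print(select_form_fields(after, fields=["logo"]))
```

Under this model, only the explicit `fields` list keeps the form stable when the model grows.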

Cheers.
Tai.


On Jan 12, 2:40 am, Tom Evans <[hidden email]> wrote:
> Are there any other situations where this can happen? The problem is
> in fact caused by using 'exclude' to choose the fields presented in a
> model form, using 'fields' and explicitly listing the fields seems
> much safer.

Re: Don't assume that missing fields from POST data are equal to an empty string value.

Tai Lee-2
In reply to this post by Donald Stufft
Donald,

Thanks for sharing your feedback with everyone.

I do believe that there should be some kind of validation error, or at
least a loud warning to the console or logs, when a form is bound to
incomplete data. The only time data should legitimately be missing is
for "unsuccessful controls" such as unchecked radio buttons and
checkboxes. These shouldn't be a problem for Django, which already
normalises a missing checkbox value to False and any submitted value
to True.

I'm not sure that the practice of submitting partial forms with JS and
still expecting the entire form to validate and save is widespread, or
even valid according to the RFC and HTML4 specs, which expect every
successful control to be included in the form data.

Sure, I can see that forms could be fully or even partially submitted
via JS to perform AJAX validation in real time, but I don't see how
the form as a whole being invalid, with validation errors on the
missing fields, would impact that.

If we can't find a clear definition of, or distinction between, null
and an empty string in the RFC or HTML4 specs, and we go to our browsers
instead (you mentioned Chrome), I think you will find that they treat
text input fields whose value is an empty string as "successful
controls", as these are included in the form data.
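
The distinction is visible on the wire. As a small illustration using Python's standard `urllib.parse` module (the field names and values are made up for this example), a text input submitted with an empty value still appears as a key in the urlencoded body, whereas a control that is missing from the template does not appear at all:

```python
from urllib.parse import parse_qs

# Body as sent when the template contains an empty pet_name text input.
with_empty_field = "logo=new.png&pet_name="

# Body as sent when the template has no pet_name control at all.
without_field = "logo=new.png"

# keep_blank_values=True preserves keys whose value is the empty string.
present = parse_qs(with_empty_field, keep_blank_values=True)
absent = parse_qs(without_field, keep_blank_values=True)

print("pet_name" in present, present.get("pet_name"))
print("pet_name" in absent, absent.get("pet_name"))
```

So a server can tell "submitted but empty" apart from "never submitted", provided the parser keeps blank values.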

Before knocking back this proposal on the grounds of ambiguous specs
(multiple interpretations), I would like to know if there are actually
any UAs that behave as you fear they might. If there are, then this
change would definitely be a problem for those UAs.

However, even if this is deemed a backwards-incompatible change and
the potential silent data loss is not considered serious enough to
override that, a loud warning could still be added without changing
the current behaviour.

Cheers.
Tai.


Re: Don't assume that missing fields from POST data are equal to an empty string value.

Daniel Sokolowski
Donald,

Backward compatibility can be maintained either by just logging a warning, to give developers time to clean up their code, or by throwing an exception when `settings.DEBUG=True` and only logging a warning when it is `False`. I am not sure how this ought to be implemented, but again I am a big +1 for this proposal.
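
One possible shape for this, as a rough sketch only: the names `MissingFormDataWarning` and `check_bound_data` are hypothetical and do not exist in Django, and the `debug` argument stands in for `settings.DEBUG`.

```python
import warnings


class MissingFormDataWarning(UserWarning):
    """Hypothetical warning category for fields absent from bound data."""


def check_bound_data(data, field_names, debug):
    """Report form fields that have no key at all in the bound data.

    With debug=True a ValueError is raised; otherwise a warning is issued
    and the names of the missing fields are returned.
    """
    missing = [name for name in field_names if name not in data]
    if not missing:
        return missing
    message = "Form bound to data with missing fields: " + ", ".join(missing)
    if debug:
        raise ValueError(message)
    warnings.warn(message, MissingFormDataWarning, stacklevel=2)
    return missing


# An empty value is fine; only a completely absent key is reported.
print(check_bound_data({"logo": "a.png", "pet_name": ""},
                       ["logo", "pet_name"], debug=True))

# With debug off, the missing field produces a warning instead of an error.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    print(check_bound_data({"logo": "a.png"}, ["logo", "pet_name"], debug=False))
    print(caught[0].category.__name__)
```

A real check would also have to skip controls such as checkboxes, which are legitimately absent from the data when unchecked.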

On Thu, Jan 12, 2012 at 4:02 AM, Tai Lee <[hidden email]> wrote:
Donald,

Thanks for sharing your feedback with everyone.

I do believe that there should be some kind of validation error, or at
least a loud warning to the console or logs, when a form is bound to
incomplete data. The only time data should legitimately be missing is
for "unsuccessful controls" such as unchecked radio buttons and
checkboxes. These shouldn't be a problem for Django, which already
normalises a missing checkbox value to False and any submitted value
to True.

I'm not sure that the practice of submitting partial forms with JS and
still expecting the entire form to validate and save is widespread, or
even valid according to the RFC and HTML4 specs, which expect every
successful control to be included in the form data.

Sure, I can see that forms could be fully or even partially submitted
via JS to perform AJAX validation in real time, but I don't see how
the form as a whole being invalid, with validation errors on the
missing fields, would impact that.

If we can't find a clear definition of, or distinction between, null
and an empty string in the RFC or HTML4 specs, and we go to our browsers
instead (you mentioned Chrome), I think you will find that they treat
text input fields whose value is an empty string as "successful
controls", as these are included in the form data.

Before knocking back this proposal on the grounds of ambiguous specs
(multiple interpretations), I would like to know if there are actually
any UAs that behave as you fear they might. If there are, then this
change would definitely be a problem for those UAs.

However, even if this is deemed a backwards-incompatible change and
the potential silent data loss is not considered serious enough to
override that, a loud warning could still be added without changing
the current behaviour.

Cheers.
Tai.


On Jan 12, 3:29 am, Donald Stufft <[hidden email]> wrote:
> I'm very much -1 on this change.
>
> To "fix" this would require throwing an error any time an incomplete dictionary was passed as the data arg to a form. This would break any existing code that relies on the current behaviour (in particular, it's common practice to accept a subset of the fields via JSON). So this would be a backwards-incompatible change.
>
> Furthermore, I disagree with the interpretation of the RFC as provided. The RFC states that a UA may choose not to send a form field if it contains a null value. So the question becomes: what is a null value as far as the RFC is concerned? As I could not find any mention of what constitutes a null value in the RFC, I went to my browser. Using JavaScript, I executed ``document.getElementById('textfield').value = null`` in the JS console. The result was that the value was set to "". So my browser (Chrome on OS X) treats null and "" as equivalent.
>
> Going by my personal interpretation of the RFC and Chrome's behavior in my JavaScript test, I can only conclude that the proposed change would cause Django forms to violate the RFC. Violating an RFC isn't always the wrong thing to do in and of itself, but I believe it should only be done when the RFC and implementations are at odds in a way that makes them incompatible with each other. In this case they are not: the RFC is simply more permissive. Since Django does not have a list of supported browsers, we must strive to follow the RFCs where we can (and where behaviour is actually defined), and deviate only when the alternative is being broken in major browsers.
>
> Additionally, I believe there are two major error conditions in this case.
>
> A) The proposed change is made, a visitor is using a UA that (as I read it) follows the RFC, and any Django forms with optional, unfilled values stop working for this visitor.
> B) The proposed change is not made, and when an optional form field is left off a form (or JSON, or any partially incomplete dictionary of values), the form assumes the default initial value of "".
>
> Neither error condition is optimal; however, A has the additional downside that the error is completely outside the developer's control, whereas B is the result of developer error and is under the developer's control.
>
> On Tuesday, January 10, 2012 at 8:38 PM, Tai Lee wrote:
> > There is a potential for data loss with optional form fields that are
> > (for whatever reason) omitted from an HTML template.
>
> > For example, if we take an existing model form and template that
> > works, add an optional character field to the model but fail to add a
> > corresponding field to the HTML template (e.g. human error, forgot
> > about a template, didn't tell the template author to make a change,
> > didn't realise a change needed to be made to a template), when that
> > form is submitted Django will assume that the user has provided an
> > empty string value for the missing field and save that to the model,
> > erasing any existing value. This is not a bug, but it is relatively
> > easy to trigger silent and unexpected data loss.
>
> > I have briefly discussed this with PaulM and dstufft on django-dev,
> > but did not reach any consensus.
>
> > RFC1866 [1] says:
>
> > > The fields are listed in the order they appear in the
> > > document with the name separated from the value by `=' and
> > > the pairs separated from each other by `&'. Fields with null
> > > values may be omitted. In particular, unselected radio
> > > buttons and checkboxes should not appear in the encoded
> > > data, but hidden fields with VALUE attributes present
> > > should.
>
> > The HTML4 spec at W3C.org [2] says:
>
> > > Users interact with forms through named controls.
>
> > > A control's "control name" is given by its name attribute. The scope of the
> > > name attribute for a control within a FORM element is the FORM element.
>
> > > Each control has both an initial value and a current value, both of which are
> > > character strings. Please consult the definition of each control for
> > > information about initial values and possible constraints on values imposed by
> > > the control. In general, a control's "initial value" may be specified with the
> > > control element's value attribute. However, the initial value of a TEXTAREA
> > > element is given by its contents, and the initial value of an OBJECT element
> > > in a form is determined by the object implementation (i.e., it lies outside
> > > the scope of this specification).
>
> > > The control's "current value" is first set to the initial value. Thereafter,
> > > the control's current value may be modified through user interaction and
> > > scripts.
>
> > > A control's initial value does not change. Thus, when a form is reset, each
> > > control's current value is reset to its initial value. If a control does not
> > > have an initial value, the effect of a form reset on that control is
> > > undefined.
>
> > > When a form is submitted for processing, some controls have their name paired
> > > with their current value and these pairs are submitted with the form. Those
> > > controls for which name/value pairs are submitted are called successful
> > > controls.
>
> > as well as [3]:
>
> > > A successful control is "valid" for submission. Every successful control has
> > > its control name paired with its current value as part of the submitted form
> > > data set. A successful control must be defined within a FORM element and must
> > > have a control name.
>
> > > However:
>
> > > * Controls that are disabled cannot be successful.
> > > * If a form contains more than one submit button, only the activated submit
> > > button is successful.
> > > * All "on" checkboxes may be successful.
> > > * For radio buttons that share the same value of the name attribute, only the
> > > "on" radio button may be successful.
> > > * For menus, the control name is provided by a SELECT element and values are
> > > provided by OPTION elements. Only selected options may be successful. When
> > > no options are selected, the control is not successful and neither the name
> > > nor any values are submitted to the server when the form is submitted.
> > > * The current value of a file select is a list of one or more file names. Upon
> > > submission of the form, the contents of each file are submitted with the
> > > rest of the form data. The file contents are packaged according to the
> > > form's content type.
> > > * The current value of an object control is determined by the object's
> > > implementation.
>
> > > If a control doesn't have a current value when the form is submitted, user
> > > agents are not required to treat it as a successful control.
>
> > > Furthermore, user agents should not consider the following controls
> > > successful:
>
> > > * Reset buttons.
> > > * OBJECT elements whose declare attribute has been set.
>
> > > Hidden controls and controls that are not rendered because of style sheet
> > > settings may still be successful.
>
> > I interpret the above to mean that any text input with a value
> > attribute (even `value=""`) is a successful control, and should be
> > included in the encoded form data. This is what current versions of
> > Chrome and Firefox do, at least. I have not found any examples of
> > browsers which are known not to do this.
>
> > What I would like to change in Django is for it to stop assuming that
> > missing POST data for a character field is actually an empty string,
> > and instead raise a form validation error that would prevent the model
> > instance from being saved to the database (potentially causing data
> > loss for that field).
>
> > I don't see any benefit in the current behaviour, except to
> > potentially support undetermined and unspecified UAs which might treat
> > form fields as unsuccessful controls if they have an empty string as
> > their value, and if those UAs accordingly do not include those fields
> > in the encoded form data. Even if such UAs were found, I would argue
> > that the RFC and HTML4 specs show that text fields with a value (even
> > an empty string) are successful controls that should be included.
>
> > Failing that, I would like for Django to at least raise a loud warning
> > when a form is bound to data that is missing values for character fields,
> > so that it will at least be easier to detect this silent and
> > unexpected data loss in a test environment, before it occurs in a
> > production environment.
>
> > Does anybody else have opposing or supporting arguments, or any
> > knowledge of actual UAs that do not include fields with empty values
> > in their encoded form data, or an alternative interpretation of the
> > RFC and HTML4 specs above?
>
> > Cheers.
> > Tai.
>
> > [1] http://www.rfc-editor.org/rfc/rfc1866.txt
> > [2] http://www.w3.org/TR/html4/interact/forms.html#h-17.2
> > [3] http://www.w3.org/TR/html4/interact/forms.html#successful-controls
>

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To post to this group, send email to [hidden email].
To unsubscribe from this group, send email to [hidden email].
For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.



Re: Don't assume that missing fields from POST data are equal to an empty string value.

Ian Clelland-2
In reply to this post by Tai Lee-2


On Thu, Jan 12, 2012 at 1:02 AM, Tai Lee <[hidden email]> wrote:
> Donald,
>
> Thanks for sharing your feedback with everyone.
>
> I do believe that there should be some kind of validation error or at
> least a loud warning to the console or logs when a form is bound to
> incomplete data. The only time when this should occur is for
> "unsuccessful controls" such as unchecked radio buttons and
> checkboxes. These shouldn't be a problem for Django, which already
> normalises no value for a checkbox to False and any value for it to
> True.
>
> I'm not sure that submitting partial forms with JS and still expecting
> the entire form to validate and save is widespread, or even valid
> according to the RFC and HTML4 specs, which expect every successful
> control to be included in the form data.

You are using the word 'form' in two different contexts here. First, there is the HTML <form> on the web page that a user can interact with. This is simply a (mostly) formalized way for a user to submit data to a web service. (I say 'mostly' because, as we are discovering, there are edge cases where the handling of missing data is not completely specified.)

Secondly, there is the Form object that exists within a Django application. This is the only form that we have complete control over, and it is the only place that we get to say that data is or is not 'valid'.

There is definitely not a one-to-one correspondence between these two forms, and on the web, we can't assume that we have complete control over both of them. On the one hand, the HTML form is not the only way for a GET or POST request to be submitted. We need to consider, at least:

- Modern, mainstream, buggy web browsers, like Chrome or Firefox
- Older, mainstream, buggy web browsers
- Non-mainstream web browsers
- Other HTML-based User-Agents
- Other REST-based applications
- JavaScript submitting AJAX requests with data serialized from an object (not a form.submit() call), from any number of frameworks
- curl / wget command-line interfaces
- Python urllib / httplib requests (and other languages' equivalents)
- URL query parameters


Many of these would never even see or parse an HTML <form> element, but they can all still provide data which will be used to construct a Django Form. We absolutely cannot claim the same level of confidence in the behaviour of these clients that we get from reading the RFC and examining the output of recent versions of Firefox and Chrome.
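
To illustrate how much this can vary between producers and consumers of form data, here is a small sketch using only Python's standard library. The field names `first_name` and `address` are made up for the example. It shows that a script building a request body from a dictionary simply omits keys it was never given, and that a parser can itself discard blank values unless told otherwise:

```python
from urllib.parse import parse_qs, urlencode

# A client that serialises a dict only sends the keys it was given.
# Nothing forces it to include an empty "address" field.
body = urlencode({"first_name": "Tai"})
print(body)  # first_name=Tai

# A browser-style submission of an empty text input includes the name
# with an empty value.
browser_body = "first_name=Tai&address="

# By default parse_qs drops blank values, so "address" disappears ...
lenient = parse_qs(browser_body)
# ... unless keep_blank_values is set.
strict = parse_qs(browser_body, keep_blank_values=True)

print(lenient)  # {'first_name': ['Tai']}
print(strict)   # {'first_name': ['Tai'], 'address': ['']}
```

Whether a blank field survives the trip therefore depends on every layer between the user and the Form, not only on the browser.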

And then, on the other side, data that comes into a view may be handled by multiple Form objects -- it's not uncommon to render fields in an HTML form that are going to be handled (or not) by several Django Forms.

Even in the simplest case -- one HTML form, in a browser window, and one Django Form in a view -- it may even be the case that several fields were left off of the HTML form deliberately, because the same Django view and Form are also used by different pages on the site. In this case, I *want* to declare the fields to be optional, and then choose how to handle it after determining that the presented form fields are valid. With this proposal, I won't be able to, without subclassing the form or providing different views to handle each subset of data that I need to be able to accept.


 
> Sure, I can see that forms could be fully or even partially submitted
> via JS to perform AJAX validation in real time, but I don't see how
> the form as a whole being invalid and validation errors on the missing
> fields would impact that.

The problem is that AJAX (or any number of other methods) could be assembling data that should validate (i.e., not for AJAX validation, but with the intention of actually submitting data), and it may not be easy (or possible) to match the handling of null / blank / undefined values to what a browser would do.

Not having the ability to get past a Django-mandated ValidationError would certainly impact the user experience in this case ;)


--
Regards,
Ian Clelland
<[hidden email]>


Re: Don't assume that missing fields from POST data are equal to an empty string value.

alejandro varas

Hi all,

I've looked carefully at the example Tom posted and, if I understand the
situation correctly, there is no problem if a field is missing in the
HTML form. There are two cases in which you can pass request.POST to a form:

1. You are *creating* an object, so if a field is missing, no data is
going to be lost.
2. You are *updating* an object, so you need to pass 'instance' to the
form class; doing this, only the fields in 'data' are going to be
changed.

In both cases you pass request.POST to the form containing only the
fields specified in the HTML, so if you forget a field it won't be
part of request.POST and there is no way your modelform "knows" that.

I've created this project to test this: https://github.com/alej0varas/learn-django.
It uses Django 1.3.1; the important code is in
"missingdataform/oneapp/tests.py".

Regards


Re: Don't assume that missing fields from POST data are equal to an empty string value.

Tai Lee-2
In reply to this post by Ian Clelland-2
Ian,

I agree that there are a lot of different ways that form data can be
submitted to Django, including near endless combinations of multiple
Django forms, multiple HTML forms, AJAX, etc.

If we reduce the scope of our discussion and consider only a Django
form (forgetting HTML forms and AJAX) and two possible scenarios:

1. Partial form data is bound to a Django form, and it is not expected
to result in a call to the form's `save()` method and a change to the
database. It is only intended to examine the form's validation state
relating to the partial data that is bound to the form. In this case,
the proposed change should have no impact.

2. Full form data is bound to a form, and it IS expected to validate
and result in a call to the form's `save()` method and a change to the
database. In this case, why would we ever want to assume that a
character field which has no bound data, is actually bound to an empty
string?

This is a dangerous assumption, I think. If we intend to obtain
complete data from the user, bind it to a form and save it to the
database, we should be sure (as much as we can be) that the data is
actually complete.

Take the following pure Django example:

{{{
from django.db import models
from django.forms import ModelForm

class Profile(models.Model):
    first_name = models.CharField(blank=True, max_length=50)
    last_name = models.CharField(blank=True, max_length=50)
    address = models.CharField(blank=True, max_length=50)

class ProfileForm(ModelForm):
    class Meta:
        model = Profile

# Create a profile with every field populated.
profile = Profile.objects.create(
    first_name='Tai', last_name='Lee', address='Sydney')

# Bind an empty dict: none of the three fields is present in the data.
form = ProfileForm({}, instance=profile)
if form.is_valid():
    form.save()
}}}

The profile will have no first name, last name or address. The form
will produce no validation errors and will save the updated model to
the database.
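
The mechanism can be modelled in a few lines of plain Python. This is a simplified sketch, not Django's actual code: `clean_char` and `bind_and_save` are hypothetical stand-ins for the way an optional character field treats a missing key (`None`) the same as an empty string.

```python
def clean_char(value):
    """Model an optional character field: None and '' both clean to ''."""
    if value in (None, ""):
        return ""
    return str(value)


def bind_and_save(instance, data, field_names):
    """Copy a cleaned value for every declared field onto the instance.

    Missing keys come back from dict.get() as None, which clean_char()
    turns into an empty string, so existing values get overwritten.
    """
    for name in field_names:
        instance[name] = clean_char(data.get(name))
    return instance


profile = {"first_name": "Tai", "last_name": "Lee", "address": "Sydney"}
fields = ["first_name", "last_name", "address"]

# Binding an empty dict "validates" and blanks every field.
print(bind_and_save(dict(profile), {}, fields))
# {'first_name': '', 'last_name': '', 'address': ''}
```

In this model there is no point at which "key absent" and "key present but empty" can be told apart once the value has been cleaned.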

This is very surprising and counter-intuitive to me.

I think the only time we could safely make the assumption that fields
with empty string values will be omitted by the UA is when we are
confident that the source of the form data also makes a similar
assumption that the back-end will re-normalise those missing fields
back to an empty string.

I'd be surprised if any UAs actually do make that assumption, because
the way Django treats missing character fields does not appear to be
based on any spec, and Django is not the only form processing back-end
(or even the most popular one).

I'm not sure I follow your simplest case example of one HTML form in a
browser window and one Django Form in a view. If optional fields are
left off the HTML form deliberately, without changing the Form class or
the view code, this is exactly when data loss will currently occur. I
think you are confusing optional as in "may not be specified by the
user" with optional as in "may not be processed by the form"?
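
As a rough sketch of the "loud warning" alternative, the check could compare the declared field names against the keys actually present in the bound data. The helper `warn_on_missing_fields` below is hypothetical and written in plain Python; it is not an existing Django API.

```python
import warnings


def warn_on_missing_fields(data, field_names):
    """Return declared field names that are absent from the bound data.

    A key that is present with an empty string counts as supplied; only
    keys that are missing altogether are reported.
    """
    missing = [name for name in field_names if name not in data]
    if missing:
        warnings.warn(
            "Form bound to data with no value for: " + ", ".join(missing)
        )
    return missing


fields = ["first_name", "last_name", "address"]

# An explicitly empty address is fine; a missing one is flagged.
print(warn_on_missing_fields(
    {"first_name": "Tai", "last_name": "Lee", "address": ""}, fields))
# []
print(warn_on_missing_fields({"first_name": "Tai"}, fields))
# ['last_name', 'address']
```

A warning of this kind would surface the problem in a test run without changing whether the form validates.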

Cheers.
Tai.


Re: Don't assume that missing fields from POST data are equal to an empty string value.

Mark Lavin
> If optional fields are
> left off the HTML form deliberately, without changing the Form class or
> the view code, this is exactly when data loss will currently occur.
> I think you are confusing optional as in "may not be specified by the
> user" with optional as in "may not be processed by the form"?

I think it would be more accurate to call this developer error rather
than a data loss bug in Django. If you are defining forms in this way
you are exposing all the model fields to be changed by the form. You
shouldn't be defining forms with fields you don't want changed.
Imagine this form:

{{{
from django import forms
from django.contrib.auth.models import User

class UserForm(forms.ModelForm):
    class Meta(object):
        model = User
}}}

Look innocent? It's not. Even if you don't render the is_staff or
is_superuser fields they can still be changed if they are given in the
POST. That means exposing this form can allow changing a user to a
superuser. Don't trust data from the client and be explicit about the
fields you want to expose for the form to change. This is why the
'fields' and 'exclude' Meta options exist.
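
The effect of restricting a form to an explicit list of fields can be sketched without Django. In this simplified model, `apply_post` is a hypothetical helper and `allowed` plays the role of the 'fields' Meta option: anything in the POST that is not whitelisted is ignored.

```python
def apply_post(user, post, allowed):
    """Update only whitelisted keys; ignore everything else in the POST."""
    updated = dict(user)
    for name in allowed:
        if name in post:
            updated[name] = post[name]
    return updated


user = {"username": "mark", "email": "old@example.com", "is_superuser": False}

# A hostile client adds a field that was never rendered in the template.
post = {"email": "new@example.com", "is_superuser": True}

# Exposing every field lets the extra value through.
print(apply_post(user, post, allowed=user.keys()))
# {'username': 'mark', 'email': 'new@example.com', 'is_superuser': True}

# With an explicit whitelist, is_superuser is left untouched.
print(apply_post(user, post, allowed=["email"]))
# {'username': 'mark', 'email': 'new@example.com', 'is_superuser': False}
```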

Best,

Mark



Re: Don't assume that missing fields from POST data are equal to an empty string value.

Daniel Sokolowski
In reply to this post by Tai Lee-2
1. How does this proposal affect default values specified on model fields (ModelForm processing)?
2. Forgive my ignorance, but who has the final say on whether it goes in or not? And since it should be a yes, what's next?



Re: Don't assume that missing fields from POST data are equal to an empty string value.

Mark Lavin
I think Ian demonstrated exactly why this should not go in: not all forms are ModelForms. Imagine writing a REST API which allows an optional format parameter for GET requests, validated by a Django form. With this proposal, all clients would be forced to specify the format in the GET (even if left blank) or else the form won't validate. That's not much of an optional field, and it's a large backwards incompatibility.
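
A minimal sketch of that scenario, using only the standard library rather than a Django form: `get_format` is a hypothetical helper, and the `json` default and the set of allowed formats are assumptions made for illustration.

```python
from urllib.parse import parse_qs

ALLOWED_FORMATS = {"json", "xml"}  # assumed for the example


def get_format(query_string, default="json"):
    """Read an optional ?format= parameter, falling back to a default.

    A missing or blank parameter is treated as "not specified"; only an
    unrecognised value is rejected.
    """
    params = parse_qs(query_string, keep_blank_values=True)
    value = params.get("format", [""])[0]
    if value == "":
        return default
    if value not in ALLOWED_FORMATS:
        raise ValueError("unsupported format: " + value)
    return value


print(get_format("page=2"))             # json
print(get_format("page=2&format="))     # json
print(get_format("page=2&format=xml"))  # xml
```

Under the proposed change, the first call would correspond to a validation error instead of the default.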

On Fri, Jan 13, 2012 at 10:20 AM, Daniel Sokolowski <[hidden email]> wrote:
1. How does this proposal effect default values specified on model fields? (ModelForm processing) 
2. Forgive my ignorance but who has the final say if it goes in or not? And since it should be a yes what's next?

On Thu, Jan 12, 2012 at 8:40 PM, Tai Lee <[hidden email]> wrote:
Ian,

I agree that there are a lot of different ways that form data can be
submitted to Django, including near endless combinations of multiple
Django forms, multiple HTML forms, AJAX, etc.

If we reduce the scope of our discussion and consider only a Django
form (forgetting HTML forms and AJAX) and two possible scenarios:

1. Partial form data is bound to a Django form, and it is not expected
to result in a call to the form's `save()` method and a change to the
database. It is only intended to examine the form's validation state
relating to the partial data that is bound to the form. In this case,
the proposed change should have no impact.

2. Full form data is bound to a form, and it IS expected to validate
and result in a call to the form's `save()` method and a change to the
database. In this case, why would we ever want to assume that a
character field which has no bound data, is actually bound to an empty
string?

This is a dangerous assumption, I think. If we intend to obtain
complete data from the user, bind it to a form and save it to the
database, we should be sure (as much as we can be) that the data is
actually complete.

Take the following pure Django example:

{{{
class Profile(models.Model):
   first_name = models.CharField(blank=True, max_length=50)
   last_name = models.CharField(blank=True, max_length=50)
   address = models.CharField(blank=True, max_length=50)

class ProfileForm(ModelForm):
   class Meta:
       model = Profile

profile = Profile.objects.create(first_name='Tai', last_name='Lee',
address='Sydney')
form = ProfileForm({}, instance=profile)
if form.is_valid():
   form.save()
}}}

The profile will have no first name, last name or address. The form
will produce no validation errors and will save the updated model to
the database.

This is very surprising and counter-intuitive to me.

I think the only time we could safely make the assumption that fields
with empty string values will be omitted by the UA is when we are
confident that the source of the form data also makes a similar
assumption that the back-end will re-normalise those missing fields
back to an empty string.

I'd be surprised if any UAs actually do make that assumption, because
the way Django treats missing character fields does not appear to be
based on any spec, and Django is not the only form processing back-end
(or even the most popular one).

I'm not sure I follow your simplest-case example of one HTML form in a
browser window and one Django Form in a view. If optional fields are
left off the HTML form deliberately, without changing the Form class or
the view code, this is exactly when data loss will currently occur. I
think you are confusing optional as in "may not be specified by the
user" with optional as in "may not be processed by the form"?

Cheers.
Tai.

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To post to this group, send email to [hidden email].
To unsubscribe from this group, send email to [hidden email].
For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.



Re: Don't assume that missing fields from POST data are equal to an empty string value.

Adrian Holovaty-4
In reply to this post by Tai Lee-2
On Thu, Jan 12, 2012 at 7:40 PM, Tai Lee <[hidden email]> wrote:

> {{{
> class Profile(models.Model):
>    first_name = models.CharField(blank=True, max_length=50)
>    last_name = models.CharField(blank=True, max_length=50)
>    address = models.CharField(blank=True, max_length=50)
>
> class ProfileForm(ModelForm):
>    class Meta:
>        model = Profile
>
> profile = Profile.objects.create(first_name='Tai', last_name='Lee',
> address='Sydney')
> form = ProfileForm({}, instance=profile)
> if form.is_valid():
>    form.save()
> }}}
>
> The profile will have no first name, last name or address. The form
> will produce no validation errors and will save the updated model to
> the database.
>
> This is very surprising and counter-intuitive to me.

The ultimate solution is: don't use model forms. I never use them for
anything, precisely because of things like this, where the framework
is trying to do too much behind the scenes. It's too magical for my
tastes, and while I understand why we added it to the framework, I see
it as a crutch for lazy developers. C'mon, it's not a lot of work to
create a "normal" (non-model) form class and pass its cleaned data to
a model's save() method.

End rant. :-)

Thanks for writing out this example, Tai. I disagree and do not see it
as counterintuitive. The "ProfileForm({}, instance=profile)" is
clearly passing in empty data (the empty dictionary), and it makes
sense that Django would see the empty data, then determine that empty
data is allowed on the fields (blank=True) and set those fields to
empty data. If you want to avoid this, you have two options: don't use
"blank=True," or don't use a model form.

Adrian


Re: Don't assume that missing fields from POST data are equal to an empty string value.

Paul McMillan-3
> The "ProfileForm({}, instance=profile)" is
> clearly passing in empty data (the empty dictionary), and it makes
> sense that Django would see the empty data, then determine that empty
> data is allowed on the fields (blank=True) and set those fields to
> empty data. If you want to avoid this, you have two options: don't use
> "blank=True," or don't use a model form.

I agree with Adrian. Django doesn't have control over all user agents,
and we know that most of them already behave in exactly the way the
RFC specifies (not sending anything for blank fields) in at least some
cases (checkboxes and radioboxes). I don't think writing code to
special-case everything else is the right solution.
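
As a small illustration of the distinction being argued over, the sketch below uses Python's standard `urllib.parse.parse_qs` to decode a form-encoded body. The payload is an assumption made up for this example: an empty text input arrives as `address=`, while a hypothetical unchecked `newsletter` checkbox is simply absent.

```python
from urllib.parse import parse_qs

# Assumed payload: "address" was left blank by the user, and the
# hypothetical "newsletter" checkbox was not ticked, so it is not sent.
body = "first_name=Tai&address="

# keep_blank_values=True preserves fields that were sent with no value.
parsed = parse_qs(body, keep_blank_values=True)
# The default drops blank values, so "sent blank" looks like "not sent".
parsed_default = parse_qs(body)

print("address" in parsed)
print("address" in parsed_default)
print("newsletter" in parsed)
```

With blank values kept, a field that was sent empty can still be told apart from one that was never sent. With the default parsing, the two cases collapse into one.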

If the person writing your form leaves fields off that should be
present and it results in data loss, I'd treat that like any other
code bug - we don't special case to save the data from views that
throw a 500 because you wrote invalid Python, so I don't see why we
should add a special case for when you might write incorrect HTML.

-Paul


Re: Don't assume that missing fields from POST data are equal to an empty string value.

Anupam Jain
In reply to this post by Mark Lavin
Wow - I just realised that we have been losing data for some time on our web platform, since there were some fields in the ModelForm that were hidden and not being sent. They were all being overwritten with blank values. Thank God we use django-reversion to track changes. It will take us some time to recover the data, though.

It's been 6 years since the last post on this thread. Has anyone been able to come up with a smart solution yet? (apart from not using Model Forms)

thanks
Anupam

On Friday, January 13, 2012 at 9:09:02 PM UTC+5:30, Mark Lavin wrote:
I think Ian demonstrated exactly why this should not go in and that is not all forms are ModelForms. Imagine writing a REST API which allows for an optional format parameter for GET requests which is validated by a Django form. With the inclusion of this proposal all clients would be forced to specify the format in the GET (even if left blank) or else the form won't validate. That's not much of an optional field and a large backwards incompatibility.
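
A rough sketch of the concern, in plain Python rather than Django: `validate_format`, the `require_presence` flag and the accepted format names are all hypothetical, chosen only to show how treating an absent field as an error would change what clients have to send.

```python
from urllib.parse import parse_qs

# Hypothetical set of accepted values; "" means "use the default format".
ALLOWED_FORMATS = ("", "json", "xml")


def validate_format(query_string, require_presence):
    """Validate an optional `format` GET parameter (illustrative only)."""
    params = parse_qs(query_string, keep_blank_values=True)
    if "format" not in params:
        # With require_presence=True, an absent field is rejected, which
        # is the outcome this post is worried about.
        return not require_presence
    return params["format"][0] in ALLOWED_FORMATS


print(validate_format("", require_presence=False))
print(validate_format("", require_presence=True))
print(validate_format("format=", require_presence=True))
```

Under the strict variant, a client would have to send `format=` explicitly, even though the field is nominally optional.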

On Fri, Jan 13, 2012 at 10:20 AM, Daniel Sokolowski <daniel.s...@klinsight.com> wrote:
1. How does this proposal affect default values specified on model fields? (ModelForm processing)
2. Forgive my ignorance but who has the final say if it goes in or not? And since it should be a yes what's next?

On Thu, Jan 12, 2012 at 8:40 PM, Tai Lee <real....@...> wrote:
Ian,

I agree that there are a lot of different ways that form data can be
submitted to Django, including near endless combinations of multiple
Django forms, multiple HTML forms, AJAX, etc.

If we reduce the scope of our discussion and consider only a Django
form (forgetting HTML forms and AJAX) and two possible scenarios:

1. Partial form data is bound to a Django form, and it is not expected
to result in a call to the form's `save()` method and a change to the
database. It is only intended to examine the form's validation state
relating to the partial data that is bound to the form. In this case,
the proposed change should have no impact.

2. Full form data is bound to a form, and it IS expected to validate
and result in a call to the form's `save()` method and a change to the
database. In this case, why would we ever want to assume that a
character field with no bound data is actually bound to an empty
string?

This is a dangerous assumption, I think. If we intend to obtain
complete data from the user, bind it to a form and save it to the
database, we should be sure (as much as we can be) that the data is
actually complete.

Take the following pure Django example:

{{{
class Profile(models.Model):
   first_name = models.CharField(blank=True, max_length=50)
   last_name = models.CharField(blank=True, max_length=50)
   address = models.CharField(blank=True, max_length=50)

class ProfileForm(ModelForm):
   class Meta:
       model = Profile

profile = Profile.objects.create(first_name='Tai', last_name='Lee',
address='Sydney')
form = ProfileForm({}, instance=profile)
if form.is_valid():
   form.save()
}}}

The profile will have no first name, last name or address. The form
will produce no validation errors and will save the updated model to
the database.

This is very surprising and counter-intuitive to me.

I think the only time we could safely make the assumption that fields
with empty string values will be omitted by the UA is when we are
confident that the source of the form data also makes a similar
assumption that the back-end will re-normalise those missing fields
back to an empty string.

I'd be surprised if any UAs actually do make that assumption, because
the way Django treats missing character fields does not appear to be
based on any spec, and Django is not the only form processing back-end
(or even the most popular one).

I'm not sure I follow your simplest-case example of one HTML form in a
browser window and one Django Form in a view. If optional fields are
left off the HTML form deliberately, without changing the Form class or
the view code, this is exactly when data loss will currently occur. I
think you are confusing optional as in "may not be specified by the
user" with optional as in "may not be processed by the form"?

Cheers.
Tai.


--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/6efc767e-110e-44c4-8330-291d96c2151f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.