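Django Form Source Analysis: BaseForm Validation Logic

Introduction

In Django, a Form has two main jobs: validating input and rendering itself in templates.
Let's start with how Form is defined in the source: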

class Form(six.with_metaclass(DeclarativeFieldsMetaclass, BaseForm)):
    "A collection of Fields, plus their associated data."
    # This is a separate class from BaseForm in order to abstract the way
    # self.fields is specified. This class (Form) is the one that does the
    # fancy metaclass stuff purely for the semantic sugar -- it allows one
    # to define a form using declarative syntax.
    # BaseForm itself has no way of designating self.fields.

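Because this post focuses on BaseForm's validation logic, I won't cover how the data in fields is obtained from the request; I'll assume it has already been processed and delivered to the fields. (The template output format is also mostly out of scope.)
For easy reference, here is BaseForm's initializer: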

def __init__(self, data=None, files=None, auto_id='id_%s', prefix=None,
                 initial=None, error_class=ErrorList, label_suffix=None,
                 empty_permitted=False, field_order=None):
        self.is_bound = data is not None or files is not None
        self.data = data or {}
        self.files = files or {}
        self.auto_id = auto_id
        if prefix is not None:
            self.prefix = prefix
        self.initial = initial or {}
        self.error_class = error_class
        # Translators: This is the default suffix added to form field labels
        self.label_suffix = label_suffix if label_suffix is not None else _(':')
        self.empty_permitted = empty_permitted
        self._errors = None  # Stores the errors after clean() has been called.

        # The base_fields class attribute is the *class-wide* definition of
        # fields. Because a particular *instance* of the class might want to
        # alter self.fields, we create self.fields here by copying base_fields.
        # Instances should always modify self.fields; they should not modify
        # self.base_fields.
        self.fields = copy.deepcopy(self.base_fields)
        self._bound_fields_cache = {}
        self.order_fields(self.field_order if field_order is None else field_order)

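Binding data to a form: Form.is_bound

The initializer shows how a form decides whether it is bound:

self.is_bound = data is not None or files is not None

The test is an identity check against None, not a truthiness check, so even an empty dict counts as bound data. This explains the following behavior: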

>>> not_bound = exampleForm()
>>> not_bound.is_bound
False
>>> is_bound = exampleForm({})
>>> is_bound.is_bound
True
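
The difference between the identity check and a truthiness check can be seen without Django. The function below is a hypothetical stand-in that copies only the is_bound expression from the initializer; `is_bound` as a free function is an assumption made for illustration and is not part of Django.

```python
def is_bound(data=None, files=None):
    # Same expression as in BaseForm.__init__.
    return data is not None or files is not None


print(is_bound())          # no arguments at all: unbound
print(is_bound({}))        # an empty dict is still "not None": bound
print(is_bound(files={}))  # files alone are enough to bind the form
print(bool({}))            # a truthiness check would treat {} as False
```

This is also why the initializer can safely write `self.data = data or {}` afterwards: the bound state has already been recorded before the empty dict is normalized.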

Typically, data is request.POST or request.GET and files is request.FILES. Note that for file uploads the request method must be POST and the HTML form must be declared with enctype="multipart/form-data".

Validating a form: Form.is_valid()

def is_valid(self):
    """
    Returns True if the form has no errors. Otherwise, False. If errors are being ignored, returns False.
    """
    return self.is_bound and not self.errors

A form is valid when both of the following hold:

The form is bound to data (self.is_bound is True).
Validation produced no errors (self.errors evaluates to False as a boolean).

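Form.errors

The initializer never sets a self.errors attribute; it only sets the private Form._errors.
That is because Form.errors is implemented with Python's @property decorator, so accessing Form.errors runs the following method: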

@property
def errors(self):
    "Returns an ErrorDict for the data provided for the form"
    if self._errors is None:
        self.full_clean()
    return self._errors

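Form._errors is set to None when the form is initialized, so Form.full_clean() runs only on the first call to Form.is_valid() or the first access to Form.errors (Form.full_clean() begins by replacing Form._errors with an empty ErrorDict). Every later access returns the cached Form._errors directly.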
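
The caching behavior can be sketched in plain Python. `LazyErrorsForm` and its `full_clean_calls` counter are hypothetical names introduced for this illustration; a plain dict stands in for ErrorDict, and no real validation is performed.

```python
class LazyErrorsForm:
    """Hypothetical stand-in that mimics only the caching of _errors."""

    def __init__(self, data=None):
        self.is_bound = data is not None
        self._errors = None
        self.full_clean_calls = 0  # counter added purely for the demonstration

    @property
    def errors(self):
        if self._errors is None:
            self.full_clean()
        return self._errors

    def full_clean(self):
        self.full_clean_calls += 1
        self._errors = {}  # a plain dict stands in for ErrorDict

    def is_valid(self):
        return self.is_bound and not self.errors


form = LazyErrorsForm(data={})
form.is_valid()
form.errors
form.is_valid()
print(form.full_clean_calls)  # validation ran once, later accesses hit the cache

unbound = LazyErrorsForm()
print(unbound.is_valid(), unbound.full_clean_calls)
```

In this sketch, because `and` short-circuits, is_valid() on an unbound form returns False without ever touching errors, so full_clean() is not triggered at all.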

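By default, Form.errors is an ErrorDict (a subclass of Python's dict) whose keys are field names and whose values are ErrorList objects (a subclass of Python's list) holding ValidationError instances.

However, indexing into a field's ErrorList in Form.errors does not return a ValidationError instance; it returns only the error's message as a string. This is because ErrorList overrides the __getitem__ magic method: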

class ErrorList(UserList, list):

    def __getitem__(self, i):
        error = self.data[i]
        if isinstance(error, ValidationError):
            return list(error)[0]
        return force_text(error)

To get the ValidationError instances themselves, call Form.errors.as_data().
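
The split between "indexing gives text" and "as_data() gives objects" can be modeled without Django. `MiniValidationError` and `MiniErrorList` below are hypothetical, heavily reduced stand-ins written for this post; they are not Django's classes, and the real ErrorList builds its as_data() result differently.

```python
from collections import UserList


class MiniValidationError(Exception):
    """Hypothetical, heavily reduced stand-in for ValidationError."""

    def __init__(self, message, code=None):
        super().__init__(message)
        self.message = message
        self.code = code


class MiniErrorList(UserList):
    """Hypothetical stand-in: indexing yields text, as_data() yields objects."""

    def __getitem__(self, i):
        error = self.data[i]
        if isinstance(error, MiniValidationError):
            return error.message
        return str(error)

    def as_data(self):
        # Hand back the stored error objects untouched.
        return list(self.data)


errors = MiniErrorList([MiniValidationError("Enter a whole number.", code="invalid")])
print(errors[0])                 # the message string only
print(errors.as_data()[0].code)  # the error object, with its code intact
```

The practical consequence is that metadata such as the error code is lost when indexing and is only available through as_data().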

Form.full_clean()

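Form.full_clean() initializes Form._errors and then calls Form._clean_fields(), Form._clean_form(), and Form._post_clean() to validate the form. These three methods handle field-level validation, form-level validation, and the extra validation used by ModelForm, respectively.

This post does not analyze the field-level source (Form._clean_fields()); it focuses on form-level validation (Form._clean_form()).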

def full_clean(self):
    """
    Cleans all of self.data and populates self._errors and
    self.cleaned_data.
    """
    self._errors = ErrorDict()
    if not self.is_bound:  # Stop further processing.
        return
    self.cleaned_data = {}
    # If the form is permitted to be empty, and none of the form data has
    # changed from the initial data, short circuit any validation.
    if self.empty_permitted and not self.has_changed():
        return

    self._clean_fields()
    self._clean_form()
    self._post_clean()

Form._clean_form()

def _clean_form(self):
    try:
        cleaned_data = self.clean()
    except ValidationError as e:
        self.add_error(None, e)
    else:
        if cleaned_data is not None:
            self.cleaned_data = cleaned_data

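Form.clean() is the hook Django provides for developers to define their own form-level validation logic; the default implementation simply returns self.cleaned_data. By the time form-level validation runs, the bound data has already gone through field-level validation (Form._clean_fields()).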

def clean(self):
    """
    Hook for doing any extra form-wide cleaning after Field.clean() has been called on every field. Any ValidationError raised by this method will not be associated with a particular field; it will have a special-case association with the field named '__all__'.
    """
    return self.cleaned_data

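Every ValidationError raised in Form.clean() is stored in Form.errors, in the ErrorList under the key __all__.

As noted above, Form.clean() performs form-level validation, so a ValidationError raised there should not be tied to any particular field. Form.errors therefore keeps a dedicated ErrorList for form-level errors under the field name __all__. (This makes form-level validation a good fit for checks that involve several fields together, such as business-rule validation.)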
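
A cross-field check of this kind can be sketched as follows. The sketch uses plain Python rather than Django: `MiniForm`, `MiniValidationError` and `DateRangeForm` are hypothetical names, field-level cleaning is assumed to have already produced cleaned_data, and errors are stored as plain strings in a dict.

```python
NON_FIELD_ERRORS = "__all__"


class MiniValidationError(Exception):
    def __init__(self, message):
        super().__init__(message)
        self.message = message


class MiniForm:
    """Hypothetical stand-in; field-level cleaning is assumed already done."""

    def __init__(self, cleaned_data):
        self.cleaned_data = dict(cleaned_data)
        self.errors = {}

    def _clean_form(self):
        try:
            cleaned_data = self.clean()
        except MiniValidationError as e:
            # Form-level errors are not tied to any single field.
            self.errors.setdefault(NON_FIELD_ERRORS, []).append(e.message)
        else:
            if cleaned_data is not None:
                self.cleaned_data = cleaned_data

    def clean(self):
        return self.cleaned_data


class DateRangeForm(MiniForm):
    def clean(self):
        data = self.cleaned_data
        # The rule needs two fields at once, so it belongs at form level.
        if data["start"] > data["end"]:
            raise MiniValidationError("start must not be later than end")
        return data


bad = DateRangeForm({"start": 10, "end": 3})
bad._clean_form()
print(bad.errors)

good = DateRangeForm({"start": 1, "end": 3})
good._clean_form()
print(good.errors)
```

The subclass only overrides clean(); the try/except/else wrapper around it is what routes the exception into the __all__ entry.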

Form.add_error(field, error)

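ValidationError exceptions raised during both levels of validation, Form._clean_fields() and Form._clean_form(), are handled by calling Form.add_error().

The field argument is the name of the field that raised the ValidationError, or None for a form-level error, which is then stored under NON_FIELD_ERRORS (__all__). The error argument can be a plain string, a ValidationError instance, or a list or dict of errors.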

def add_error(self, field, error):
    """
    Update the content of `self._errors`.

    The `field` argument is the name of the field to which the errors
    should be added. If its value is None the errors will be treated as
    NON_FIELD_ERRORS.

    The `error` argument can be a single error, a list of errors, or a
    dictionary that maps field names to lists of errors. What we define as
    an "error" can be either a simple string or an instance of
    ValidationError with its message attribute set and what we define as
    list or dictionary can be an actual `list` or `dict` or an instance
    of ValidationError with its `error_list` or `error_dict` attribute set.

    If `error` is a dictionary, the `field` argument *must* be None and
    errors will be added to the fields that correspond to the keys of the
    dictionary.
    """
    if not isinstance(error, ValidationError):
        # Normalize to ValidationError and let its constructor
        # do the hard work of making sense of the input.
        error = ValidationError(error)

    if hasattr(error, 'error_dict'):
        if field is not None:
            raise TypeError(
                "The argument `field` must be `None` when the `error` "
                "argument contains errors for multiple fields."
            )
        else:
            error = error.error_dict
    else:
        error = {field or NON_FIELD_ERRORS: error.error_list}

    for field, error_list in error.items():
        if field not in self.errors:
            if field != NON_FIELD_ERRORS and field not in self.fields:
                raise ValueError(
                    "'%s' has no field named '%s'." % (self.__class__.__name__, field))
            if field == NON_FIELD_ERRORS:
                self._errors[field] = self.error_class(error_class='nonfield')
            else:
                self._errors[field] = self.error_class()
        # Runs for every field, whether or not its ErrorList already existed.
        self._errors[field].extend(error_list)
        if field in self.cleaned_data:
            del self.cleaned_data[field]

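Form.add_error() works as follows:

First, check whether error is a ValidationError instance; if not, construct one from it.
Check whether the ValidationError has an error_dict attribute. If it does, field must be None and the error_dict is used as is. If it does not, error becomes a dict with field as the key and error_list as the value; when field is None, the error is a form-level error and the key is NON_FIELD_ERRORS, whose value is __all__.
If the field name is not yet in Form.errors, check that it exists in Form.fields (a ValueError is raised otherwise), then create an entry in errors with that key and a new ErrorList as its value.
Whether or not the entry already existed, extend its ErrorList with the new error_list.
Finally, remove the field from Form.cleaned_data.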
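
The steps above can be sketched on plain dicts, lists and strings. The free function `add_error` below and its `errors`, `cleaned_data` and `field_names` parameters are hypothetical; the real method works on ValidationError and ErrorList objects and lives on the form.

```python
NON_FIELD_ERRORS = "__all__"


def add_error(errors, cleaned_data, field_names, field, error):
    """Hypothetical stand-in working on plain dicts, lists and strings."""
    if isinstance(error, dict):
        if field is not None:
            raise TypeError("field must be None when error is a dict")
        normalized = error
    else:
        normalized = {field or NON_FIELD_ERRORS: error}

    for name, messages in normalized.items():
        if not isinstance(messages, (list, tuple)):
            messages = [messages]
        if name not in errors:
            if name != NON_FIELD_ERRORS and name not in field_names:
                raise ValueError("no field named %r" % name)
            errors[name] = []
        # Extend in every case, not only when the entry was just created.
        errors[name].extend(messages)
        # A field with an error must not keep a cleaned value.
        cleaned_data.pop(name, None)


errors = {}
cleaned = {"limit": 500, "offset": 0}
fields = {"limit", "offset"}

add_error(errors, cleaned, fields, "limit", "Too large.")
add_error(errors, cleaned, fields, "limit", ["Must be even."])
add_error(errors, cleaned, fields, None, "Inconsistent query.")
print(errors)
print(cleaned)
```

The second call shows why the extend step has to run even when the key already exists: otherwise a second error on the same field would be silently dropped.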

Initial values for display: Form.initial

The initial argument passed to the form takes precedence over a field's own initial, as the following source shows:
```
def _clean_fields(self):
    for name, field in self.fields.items():
        # value_from_datadict() gets the data from the data dictionaries.
        # Each widget type knows how to retrieve its own data, because some
        # widgets split data over several HTML fields.
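        if field.disabled:
            value = self.initial.get(name, field.initial)
        else:
            value = field.widget.value_from_datadict(self.data, self.files, self.add_prefix(name))
        # ... (the rest of the method, which cleans each value, is omitted)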
```
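At first I assumed Form.initial supplied default values for the data, but that is not the case: the form reads from initial only when Field.disabled is set. So initial is meant for display, not as a fallback for missing data. Note that Form.initial here should be distinguished from ModelForm; see django-forms-default-values-for-bound-forms.
The official Django documentation says the same: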

These values are only displayed for unbound forms, and they’re not used as fallback values if a particular value isn’t provided.
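
The value selection described above can be reduced to a few lines. `raw_value` is a hypothetical helper written for this post; it assumes the submitted data is a plain dict and ignores widgets and prefixes, which the real method goes through.

```python
def raw_value(name, data, form_initial, field_initial=None, disabled=False):
    """Hypothetical stand-in for the value selection in _clean_fields()."""
    if disabled:
        # Form-level initial wins; field-level initial is only the fallback.
        return form_initial.get(name, field_initial)
    # Enabled fields read the submitted data only; initial is never consulted.
    return data.get(name)


print(raw_value("limit", {}, {"limit": 20}))
print(raw_value("limit", {}, {"limit": 20}, field_initial=10, disabled=True))
print(raw_value("limit", {}, {}, field_initial=10, disabled=True))
```

For an enabled field with nothing submitted, the sketch yields None even though an initial value exists, which matches the quoted documentation.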

Finally, one more example:
```
class QueryForm(forms.Form):
    limit = forms.IntegerField(required=False)
    offset = forms.IntegerField(required=False)
```
```
>>> f = QueryForm(data={}, initial={'limit': 20, 'offset': 0})
>>> f.is_bound
True
>>> f.is_valid()
True
>>> f.cleaned_data
{'limit': None, 'offset': None}
```
I will continue with an analysis of the Field source in a separate post.

Corrections and suggestions are welcome.

codeLeaves
