Appendix C: Database API Reference

小牛编辑
2023-12-01

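Django's database API is the other half of the model API discussed in Appendix B. Once you've defined a model, you'll use the database API any time you need to access the database. You've seen examples of this API in use throughout the book; this appendix explains its various options in detail.

Like the model APIs discussed in Appendix B, these APIs are considered very stable, but the Django developers consistently add new shortcuts and conveniences. It's a good idea to check the latest documentation online, available at http://www.djangoproject.com/documentation/0.96/db-api/.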

Throughout this reference, we'll refer to the following models, which might form a simple blog application:

from django.db import models

class Blog(models.Model):
    name = models.CharField(max_length=100)
    tagline = models.TextField()

    def __str__(self):
        return self.name

class Author(models.Model):
    name = models.CharField(max_length=50)
    email = models.EmailField()

    def __str__(self):
        return self.name

class Entry(models.Model):
    blog = models.ForeignKey(Blog)
    headline = models.CharField(max_length=255)
    body_text = models.TextField()
    pub_date = models.DateTimeField()
    authors = models.ManyToManyField(Author)

    def __str__(self):
        return self.headline

Creating Objects

To create an object, instantiate it using keyword arguments to the model class, and then call save() to save it to the database:

>>> from mysite.blog.models import Blog

>>> b = Blog(name='Beatles Blog', tagline='All the latest Beatles news.')

>>> b.save()

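This performs an INSERT SQL statement behind the scenes. Django doesn't hit the database until you explicitly call save().

The save() method has no return value.

To create an object and save it all in one step, see the create manager method, discussed later in this appendix.

What Happens When You Save?

When you save an object, Django performs the following steps:

Emit a pre-save signal. This provides notification that an object is about to be saved. You can register a listener that will be invoked whenever this signal is emitted. As of this book's publication, these signals were still in development and undocumented; check the online documentation for the latest information.

Preprocess the data. Each field on the object is asked to perform any automated data modification that the field may need.

Most fields do no preprocessing; the field data is kept as is. Preprocessing is used only on fields that have special behavior, such as file fields.

Prepare the data for the database. Each field is asked to provide its current value in a data type that can be written to the database.

Most fields require no data preparation. Simple data types, such as integers and strings, are ready to write as Python objects. However, more complex data types often require some modification. For example, a DateField uses a Python datetime object to store data. Databases don't store datetime objects, so the field value must be converted into an ISO-compliant date string before it is written to the database.

Insert the data into the database. The preprocessed, prepared data is then composed into an SQL statement for insertion into the database.

Emit a post-save signal. As with the pre-save signal, this is emitted after the object has been saved successfully. Again, these signals are not yet documented.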

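Autoincrementing Primary Keys

For convenience, each model is given an autoincrementing primary key field named id unless you explicitly specify primary_key=True on a field (see the section titled AutoField in Appendix B).

If your model has an AutoField, that autoincremented value will be calculated and saved as an attribute on your object the first time you call save():

>>> b2 = Blog(name='Cheddar Talk', tagline='Thoughts on cheese.')
>>> b2.id     # Returns None, because b2 doesn't have an ID yet.
None
>>> b2.save()
>>> b2.id     # Returns the ID of your new object.
14

There's no way to tell what the value of an ID will be before you call save(), because that value is calculated by your database, not by Django.

If you want to set the AutoField value yourself when saving a new object, rather than relying on the database's automatic assignment, simply assign it explicitly: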

>>> b3 = Blog(id=3, name='Cheddar Talk', tagline='Thoughts on cheese.')

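>>> b3.id
3
>>> b3.save()
>>> b3.id
3

If you assign autoincrementing primary key values manually, make sure not to use a primary key value that already exists! If you create a new object with an explicit primary key value that already exists in the database, Django will assume you're changing the existing record rather than creating a new one.

Given the preceding 'Cheddar Talk' blog example, this example would override the previous record in the database:

>>> b4 = Blog(id=3, name='Not Cheddar', tagline='Anything but cheese.')
>>> b4.save()  # Overrides the previous blog with ID=3!

Explicitly specifying autoincrementing primary key values is mostly useful for bulk-saving objects, when you're confident you won't have a primary key collision.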

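Saving Changes to Objects

To save changes to an object that's already in the database, use save().

Given a Blog instance b5 that has already been saved to the database, this example changes its name and updates its record in the database: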

>>> b5.name = 'New name'

>>> b5.save()

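This performs an UPDATE SQL statement behind the scenes. Again, Django doesn't hit the database until you explicitly call save().

How Django Knows When to UPDATE vs. INSERT

You may have noticed that Django database objects use the same save() method for creating and changing objects. Django abstracts the need to use INSERT or UPDATE SQL statements. Specifically, when you call save(), Django follows this algorithm:

  • If the object's primary key attribute is set to a value that evaluates to True (i.e., a value other than None or the empty string), Django executes a SELECT query to determine whether a record with the given primary key already exists.

  • If a record with the given primary key does already exist, Django executes an UPDATE query.

  • If the object's primary key attribute is not set, or if it's set but no record with that primary key exists, Django executes an INSERT.

Because of this, you should avoid specifying a primary key value explicitly when saving new objects unless you can guarantee that the primary key value is unused.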
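The three rules above can be sketched outside Django with the standard-library sqlite3 module. This is only an illustration of the decision logic, not Django's implementation: the save_row helper, the blog table, and its column names are assumptions made for the example.

```python
import sqlite3

def save_row(conn, row):
    """Insert or update a blog row following the three rules above.

    `row` is a dict with the keys 'id', 'name', and 'tagline' (hypothetical
    layout). Returns the SQL verb that was used.
    """
    pk = row.get("id")
    if pk:  # Primary key is set to a truthy value: check for an existing record.
        exists = conn.execute("SELECT 1 FROM blog WHERE id = ?", (pk,)).fetchone()
        if exists:
            conn.execute(
                "UPDATE blog SET name = ?, tagline = ? WHERE id = ?",
                (row["name"], row["tagline"], pk),
            )
            return "UPDATE"
        conn.execute(
            "INSERT INTO blog (id, name, tagline) VALUES (?, ?, ?)",
            (pk, row["name"], row["tagline"]),
        )
        return "INSERT"
    # No primary key yet: let the database assign one, then store it on the row.
    cursor = conn.execute(
        "INSERT INTO blog (name, tagline) VALUES (?, ?)",
        (row["name"], row["tagline"]),
    )
    row["id"] = cursor.lastrowid
    return "INSERT"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blog (id INTEGER PRIMARY KEY, name TEXT, tagline TEXT)")

blog = {"id": None, "name": "Cheddar Talk", "tagline": "Thoughts on cheese."}
first = save_row(conn, blog)    # no primary key yet, so an INSERT
blog["name"] = "New name"
second = save_row(conn, blog)   # primary key now exists, so an UPDATE
```

Note how a row that arrives with an already-used id silently takes the UPDATE branch, which is exactly the overwrite hazard described above.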

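Updating ForeignKey fields works exactly the same way; simply assign an object of the right type to the field in question:

>>> cheese_blog = Blog.objects.get(name='Cheddar Talk')
>>> entry.blog = cheese_blog
>>> entry.save()

Django will complain if you try to assign an object of the wrong type.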

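Retrieving Objects

Throughout the book you've seen objects retrieved using code like the following:

>>> blogs = Blog.objects.filter(author__name__contains="Joe")

There are quite a few moving parts behind the scenes here: when you retrieve objects from the database, you're actually constructing a QuerySet using the model's Manager. This QuerySet knows how to execute SQL and return the requested objects.

Appendix B looked at both of these objects from a model-definition point of view; now we'll look at how they operate.

A QuerySet represents a collection of objects from your database. It can have zero, one, or many filters, criteria that narrow down the collection based on given parameters. In SQL terms, a QuerySet equates to a SELECT statement, and a filter is a limiting clause such as WHERE or LIMIT.

You get a QuerySet by using your model's Manager. Each model has at least one Manager, and it's called objects by default. Access it directly via the model class, like so:

>>> Blog.objects
<django.db.models.manager.Manager object at 0x137d00d>

Managers are accessible only via model classes, rather than from model instances, so as to enforce the separation between table-level operations and record-level operations:

>>> b = Blog(name='Foo', tagline='Bar')
>>> b.objects
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: Manager isn't accessible via Blog instances.

The Manager is the main source of QuerySets for a model. It acts as a root QuerySet that describes all objects in the model's database table. For example, Blog.objects is the initial QuerySet that contains all Blog objects in the database.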

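Caching and QuerySets

Each QuerySet contains a cache, to minimize database access. It's important to understand how it works in order to write efficient code.

In a newly created QuerySet, the cache is empty. The first time a QuerySet is evaluated, and hence a database query happens, Django saves the query results in the QuerySet's cache and returns the results that have been explicitly requested (e.g., the next element, if the QuerySet is being iterated over). Subsequent evaluations of the QuerySet reuse the cached results.

Keep this caching behavior in mind, because it may bite you if you don't use your QuerySets correctly. For example, the following will create two QuerySets, evaluate them, and throw them away:

print([e.headline for e in Entry.objects.all()])
print([e.pub_date for e in Entry.objects.all()])

That means the same database query will be executed twice, effectively doubling your database load. Also, there's a possibility the two lists may not include the same database records, because an Entry may have been added or deleted between the two requests.

To avoid this problem, simply save the QuerySet and reuse it:

queryset = Entry.objects.all()
print([e.headline for e in queryset]) # Evaluate the query set.
print([e.pub_date for e in queryset]) # Reuse the cache from the evaluation.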
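The caching behavior can be imitated with a few lines of plain Python. The following is a toy sketch, not Django code: LazyQuery, fake_db_query, and the hits counter are hypothetical names, and unlike a real QuerySet this version loads every row on first use.

```python
class LazyQuery:
    """Toy stand-in for a QuerySet: runs `fetch` at most once and caches the rows."""

    def __init__(self, fetch):
        self._fetch = fetch        # callable that simulates a database query
        self._result_cache = None  # stays empty until the first evaluation

    def __iter__(self):
        if self._result_cache is None:
            # First evaluation: "hit the database" and remember the rows.
            self._result_cache = list(self._fetch())
        return iter(self._result_cache)

hits = []  # one entry is appended per simulated database query

def fake_db_query():
    hits.append(1)
    return [{"headline": "What now?", "pub_date": "2006-01-01"}]

qs = LazyQuery(fake_db_query)
headlines = [row["headline"] for row in qs]  # evaluates the query
dates = [row["pub_date"] for row in qs]      # served from the cache
```

Creating two separate LazyQuery objects instead of reusing qs would call fake_db_query twice, which mirrors the doubled database load described above.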

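Filtering Objects

The simplest way to retrieve objects from a table is to get all of them. To do this, use the all() method on a Manager:

>>> Entry.objects.all()

The all() method returns a QuerySet of all the objects in the database.

Usually, though, you'll need to select only a subset of the complete set of objects. To create such a subset, you refine the initial QuerySet by adding filter conditions. You'll usually do this using the filter() and/or exclude() methods:

>>> y2006 = Entry.objects.filter(pub_date__year=2006)
>>> not2006 = Entry.objects.exclude(pub_date__year=2006)

filter() and exclude() both take field lookup arguments, which are discussed in detail shortly.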

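Chaining Filters

The result of refining a QuerySet is itself a QuerySet, so it's possible to chain refinements together, for example:

>>> qs = Entry.objects.filter(headline__startswith='What')
>>> qs = qs.exclude(pub_date__gte=datetime.datetime.now())
>>> qs = qs.filter(pub_date__gte=datetime.datetime(2005, 1, 1))

This takes the initial QuerySet of all entries in the database, adds a filter, then an exclusion, and then another filter. The final result is a QuerySet containing all entries with a headline that starts with "What" that were published between January 1, 2005, and the current day.

It's important to point out that the act of creating a QuerySet doesn't involve any database activity. In fact, the three preceding lines don't make any database calls; you can chain filters together all day long and Django won't actually run the query until the QuerySet is evaluated.

You can evaluate a QuerySet in any of the following ways:

Iteration: A QuerySet is iterable, and it executes its database query the first time you iterate over it. For example, the following QuerySet isn't evaluated until it's iterated over in the for loop:

qs = Entry.objects.filter(pub_date__year=2006)
qs = qs.filter(headline__icontains="bill")
for e in qs:
    print(e.headline)

This prints all headlines from 2006 that contain "bill" but causes only one database hit.

Printing it: A QuerySet is evaluated when you call repr() on it. This is for convenience in the Python interactive interpreter, so you can immediately see your results when using the API interactively.

Slicing: As explained in the upcoming Limiting QuerySets section, a QuerySet can be sliced using Python's array-slicing syntax. Usually slicing a QuerySet returns another (unevaluated) QuerySet, but Django will execute the database query if you use the step parameter of slice syntax.

Converting to a list: You can force evaluation of a QuerySet by calling list() on it, for example:

>>> entry_list = list(Entry.objects.all())

Be warned, though, that this could have a large memory overhead, because Django will load each element of the list into memory. In contrast, iterating over a QuerySet will take advantage of your database to load data and instantiate objects only as you need them.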

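Filtered QuerySets Are Unique

Each time you refine a QuerySet, you get a brand-new QuerySet that is in no way bound to the previous QuerySet. Each refinement creates a separate and distinct QuerySet that can be stored, used, and reused:

q1 = Entry.objects.filter(headline__startswith="What")
q2 = q1.exclude(pub_date__gte=datetime.now())
q3 = q1.filter(pub_date__gte=datetime.now())

These three QuerySets are separate. The first is a base QuerySet containing all entries whose headline starts with "What". The second is a subset of the first, with an additional criterion that excludes records whose pub_date is greater than now. The third is also a subset of the first, with an additional criterion that selects only the records whose pub_date is greater than now. The initial QuerySet (q1) is unaffected by the refinement process.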

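Limiting QuerySets

Use Python's array-slicing syntax to limit your QuerySet to a certain number of results. This is the equivalent of SQL's LIMIT and OFFSET clauses.

For example, this returns the first five entries (LIMIT 5):

>>> Entry.objects.all()[:5]

This returns the sixth through tenth entries (OFFSET 5 LIMIT 5):

>>> Entry.objects.all()[5:10]

Generally, slicing a QuerySet returns a new QuerySet; it doesn't evaluate the query. An exception is if you use the step parameter of Python slice syntax. For example, this would actually execute the query in order to return a list of every second object of the first ten:

>>> Entry.objects.all()[:10:2]

To retrieve a single object rather than a list (e.g., SELECT foo FROM bar LIMIT 1), use a simple index instead of a slice. For example, this returns the first Entry in the database, after ordering entries alphabetically by headline:

>>> Entry.objects.order_by('headline')[0]

This is roughly equivalent to the following:

>>> Entry.objects.order_by('headline')[0:1].get()

Note, however, that if no objects match the given criteria, the first of these will raise IndexError while the second will raise DoesNotExist.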
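The mapping from slice bounds to LIMIT and OFFSET can be written down as a small helper. This is a sketch of the arithmetic only, under the assumption of non-negative bounds; slice_to_limit_offset is a hypothetical name, not part of Django.

```python
def slice_to_limit_offset(s):
    """Translate a step-free slice with non-negative bounds into (limit, offset).

    The limit is None when the slice has no upper bound.
    """
    if s.step is not None:
        # A step cannot be expressed with LIMIT/OFFSET alone, which is why
        # a stepped slice forces the query to run.
        raise ValueError("a step cannot be expressed with LIMIT/OFFSET alone")
    offset = s.start or 0
    if offset < 0 or (s.stop is not None and s.stop < 0):
        raise ValueError("negative bounds are not handled in this sketch")
    limit = None if s.stop is None else max(s.stop - offset, 0)
    return limit, offset

first_five = slice_to_limit_offset(slice(None, 5))    # corresponds to [:5]
sixth_to_tenth = slice_to_limit_offset(slice(5, 10))  # corresponds to [5:10]
```

Here first_five corresponds to LIMIT 5 and sixth_to_tenth to OFFSET 5 LIMIT 5, matching the two examples above.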

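QuerySet Methods That Return New QuerySets

Django provides a range of QuerySet refinement methods that modify either the types of results returned by the QuerySet or the way its SQL query is executed. These methods are described in the sections that follow. Some of the methods take field lookup arguments, which are discussed in detail a bit later on.

filter(**lookup)

Returns a new QuerySet containing objects that match the given lookup parameters.

exclude(**kwargs)

Returns a new QuerySet containing objects that do not match the given lookup parameters.

order_by(*fields)

By default, results returned by a QuerySet are ordered according to the ordering option in the model's metadata (see Appendix B). You can override this for a particular query using the order_by() method:

>>> Entry.objects.filter(pub_date__year=2005).order_by('-pub_date', 'headline')

This result will be ordered by pub_date descending, then by headline ascending. The negative sign in front of "-pub_date" indicates descending order. Ascending order is assumed if the - is absent. To order randomly, use "?", like so:

>>> Entry.objects.order_by('?')

distinct()

Returns a new QuerySet that uses SELECT DISTINCT in its SQL query. This eliminates duplicate rows from the query results.

By default, a QuerySet will not eliminate duplicate rows. In practice, this is rarely a problem, because simple queries such as Blog.objects.all() don't introduce the possibility of duplicate result rows.

However, if your query spans multiple tables, it's possible to get duplicate results when a QuerySet is evaluated. That's when you'd use distinct().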

values(*fields)

Returns a special QuerySet that evaluates to a list of dictionaries instead of model-instance objects. Each of those dictionaries represents an object, with the keys corresponding to the attribute names of model objects:

# This list contains a Blog object.
>>> Blog.objects.filter(name__startswith='Beatles')
[Beatles Blog]

# This list contains a dictionary.
>>> Blog.objects.filter(name__startswith='Beatles').values()
[{'id': 1, 'name': 'Beatles Blog', 'tagline': 'All the latest Beatles news.'}]

values() takes optional positional arguments, *fields, which specify field names to which the SELECT should be limited. If you specify the fields, each dictionary will contain only the field keys/values for the fields you specify. If you don't specify the fields, each dictionary will contain a key and value for every field in the database table:

>>> Blog.objects.values()
[{'id': 1, 'name': 'Beatles Blog', 'tagline': 'All the latest Beatles news.'}]
>>> Blog.objects.values('id', 'name')
[{'id': 1, 'name': 'Beatles Blog'}]

This method is useful when you know you're only going to need values from a small number of the available fields and you won't need the functionality of a model instance object. It's more efficient to select only the fields you need to use.

dates(field, kind, order)

Returns a special QuerySet that evaluates to a list of datetime.datetime objects representing all available dates of a particular kind within the contents of the QuerySet.

The field argument must be the name of a DateField or DateTimeField of your model. The kind argument must be either "year", "month", or "day". Each datetime.datetime object in the result list is truncated to the given type:

  • "year" returns a list of all distinct year values for the field.

  • "month" returns a list of all distinct year/month values for the field.

  • "day" returns a list of all distinct year/month/day values for the field.

order, which defaults to 'ASC', should be either 'ASC' or 'DESC'. This specifies how to order the results.

Here are a few examples:

>>> Entry.objects.dates('pub_date', 'year')
[datetime.datetime(2005, 1, 1)]

>>> Entry.objects.dates('pub_date', 'month')
[datetime.datetime(2005, 2, 1), datetime.datetime(2005, 3, 1)]

>>> Entry.objects.dates('pub_date', 'day')
[datetime.datetime(2005, 2, 20), datetime.datetime(2005, 3, 20)]

>>> Entry.objects.dates('pub_date', 'day', order='DESC')
[datetime.datetime(2005, 3, 20), datetime.datetime(2005, 2, 20)]

>>> Entry.objects.filter(headline__contains='Lennon').dates('pub_date', 'day')
[datetime.datetime(2005, 3, 20)]

select_related()

Returns a QuerySet that will automatically follow foreign key relationships, selecting that additional related-object data when it executes its query. This is a performance booster that results in (sometimes much) larger queries but means later use of foreign key relationships won't require database queries.

The following examples illustrate the difference between plain lookups and select_related() lookups. Here's a standard lookup:

# Hits the database.

>>> e = Entry.objects.get(id=5)

# Hits the database again to get the related Blog object.

>>> b = e.blog

And here's a select_related() lookup:

# Hits the database.

>>> e = Entry.objects.select_related().get(id=5)

# Doesn't hit the database, because e.blog has been prepopulated

# in the previous query.

>>> b = e.blog

select_related() follows foreign keys as far as possible. If you have the following models:

class City(models.Model):
    # ...

class Person(models.Model):
    # ...
    hometown = models.ForeignKey(City)

class Book(models.Model):
    # ...
    author = models.ForeignKey(Person)

then a call to Book.objects.select_related().get(id=4) will cache the related Person and the related City:

>>> b = Book.objects.select_related().get(id=4)

>>> p = b.author # Doesn't hit the database.

>>> c = p.hometown # Doesn't hit the database.

>>> b = Book.objects.get(id=4) # No select_related() in this example.

>>> p = b.author # Hits the database.

>>> c = p.hometown # Hits the database.

Note that select_related() does not follow foreign keys that have null=True .

Usually, using select_related() can vastly improve performance because your application can avoid many database calls. However, in situations with deeply nested sets of relationships, select_related() can sometimes end up following too many relations and can generate queries so large that they end up being slow.

extra()

Sometimes, the Django query syntax by itself can't easily express a complex WHERE clause. For these edge cases, Django provides the extra() QuerySet modifier, a hook for injecting specific clauses into the SQL generated by a QuerySet.

By definition, these extra lookups may not be portable to different database engines (because you're explicitly writing SQL code) and violate the DRY principle, so you should avoid them if possible.

Specify one or more of params, select, where, or tables. None of the arguments is required, but you should use at least one of them.

The select argument lets you put extra fields in the SELECT clause. It should be a dictionary mapping attribute names to SQL clauses to use to calculate that attribute:

>>> Entry.objects.extra(select={'is_recent': "pub_date > '2006-01-01'"})

As a result, each Entry object will have an extra attribute, is_recent, a Boolean representing whether the entry's pub_date is greater than January 1, 2006.

The next example is more advanced; it does a subquery to give each resulting Blog object an entry_count attribute, an integer count of associated Entry objects:

>>> subq = 'SELECT COUNT(*) FROM blog_entry WHERE blog_entry.blog_id = blog_blog.id'

>>> Blog.objects.extra(select={'entry_count': subq})

(In this particular case, we're exploiting the fact that the query will already contain the blog_blog table in its FROM clause.)

You can define explicit SQL WHERE clauses (perhaps to perform nonexplicit joins) by using where. You can manually add tables to the SQL FROM clause by using tables.

where and tables both take a list of strings. All where parameters are ANDed to any other search criteria:

>>> Entry.objects.extra(where=['id IN (3, 4, 5, 20)'])

The select and where parameters described previously may use standard Python database string placeholders: '%s' to indicate parameters the database engine should automatically quote. The params argument is a list of any extra parameters to be substituted:

>>> Entry.objects.extra(where=['headline=%s'], params=['Lennon'])

Always use params instead of embedding values directly into select or where, because params will ensure values are quoted correctly according to your particular database.

Here's an example of the wrong way:

Entry.objects.extra(where=["headline='%s'" % name])

Here's an example of the correct way:

Entry.objects.extra(where=['headline=%s'], params=[name])

QuerySet Methods That Do Not Return QuerySets

The following QuerySet methods evaluate the QuerySet and return something other than a QuerySet (a single object, a value, and so forth).

get(**lookup)

Returns the object matching the given lookup parameters, which should be in the format described in the Field Lookups section. This raises AssertionError if more than one object was found.

get() raises a DoesNotExist exception if an object wasn't found for the given parameters. The DoesNotExist exception is an attribute of the model class, for example:

>>> Entry.objects.get(id='foo') # raises Entry.DoesNotExist

The DoesNotExist exception inherits from django.core.exceptions.ObjectDoesNotExist , so you can target multiple DoesNotExist exceptions:

>>> from django.core.exceptions import ObjectDoesNotExist

>>> try:
...     e = Entry.objects.get(id=3)
...     b = Blog.objects.get(id=1)
... except ObjectDoesNotExist:
...     print("Either the entry or blog doesn't exist.")

create(**kwargs)

This is a convenience method for creating an object and saving it all in one step. It lets you compress two common steps:

>>> p = Person(first_name="Bruce", last_name="Springsteen")
>>> p.save()

into a single line:

>>> p = Person.objects.create(first_name="Bruce", last_name="Springsteen")

get_or_create(**kwargs)

This is a convenience method for looking up an object and creating one if it doesn't exist. It returns a tuple of (object, created), where object is the retrieved or created object and created is a Boolean specifying whether a new object was created.

This method is meant as a shortcut to boilerplate code and is mostly useful for data-import scripts, for example:

try:
    obj = Person.objects.get(first_name='John', last_name='Lennon')
except Person.DoesNotExist:
    obj = Person(first_name='John', last_name='Lennon', birthday=date(1940, 10, 9))
    obj.save()

This pattern gets quite unwieldy as the number of fields in a model increases. The previous example can be rewritten using get_or_create() like so:

obj, created = Person.objects.get_or_create(
    first_name='John',
    last_name='Lennon',
    defaults={'birthday': date(1940, 10, 9)}
)

Any keyword arguments passed to get_or_create() (except an optional one called defaults) will be used in a get() call. If an object is found, get_or_create() returns a tuple of that object and False. If an object is not found, get_or_create() will instantiate and save a new object, returning a tuple of the new object and True. The new object will be created according to this algorithm:

defaults = kwargs.pop('defaults', {})
params = dict([(k, v) for k, v in kwargs.items() if '__' not in k])
params.update(defaults)
obj = self.model(**params)
obj.save()

In English, that means start with any non-'defaults' keyword argument that doesn't contain a double underscore (which would indicate a nonexact lookup). Then add the contents of defaults, overriding any keys if necessary, and use the result as the keyword arguments to the model class.
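The parameter-building part of that algorithm runs fine on its own, without a model class. The sketch below wraps it in a hypothetical helper, build_create_params, so the effect of the double-underscore rule and of defaults can be checked directly; the birth_year field is made up for the example.

```python
def build_create_params(**kwargs):
    """Return the keyword arguments a new object would be created with."""
    defaults = kwargs.pop("defaults", {})
    # Drop non-exact lookups (names containing a double underscore).
    params = {k: v for k, v in kwargs.items() if "__" not in k}
    # Values from defaults are added last, so they override on a name clash.
    params.update(defaults)
    return params

params = build_create_params(
    first_name="John",
    last_name="Lennon",
    birth_year__gte=1900,           # hypothetical non-exact lookup: dropped
    defaults={"birth_year": 1940},  # hypothetical field supplied via defaults
)
```

After this runs, params holds first_name, last_name, and birth_year=1940; the birth_year__gte lookup is used only for the get() step and never reaches the model constructor.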

If you have a field named defaults and want to use it as an exact lookup in get_or_create(), just use 'defaults__exact' like so:

Foo.objects.get_or_create(
    defaults__exact='bar',
    defaults={'defaults': 'baz'}
)

Note

As mentioned earlier, get_or_create() is mostly useful in scripts that need to parse data and create new records if existing ones aren't available. But if you need to use get_or_create() in a view, please make sure to use it only in POST requests unless you have a good reason not to. GET requests shouldn't have any effect on data; use POST whenever a request to a page has a side effect on your data.

count()

Returns an integer representing the number of objects in the database matching the QuerySet. count() never raises exceptions. Here's an example:

# Returns the total number of entries in the database.
>>> Entry.objects.count()
4

# Returns the number of entries whose headline contains 'Lennon'
>>> Entry.objects.filter(headline__contains='Lennon').count()
1

count() performs a SELECT COUNT(*) behind the scenes, so you should always use count() rather than loading all of the records into Python objects and calling len() on the result.

Depending on which database you're using (e.g., PostgreSQL or MySQL), count() may return a long integer instead of a normal Python integer. This is an underlying implementation quirk that shouldn't pose any real-world problems.

in_bulk(id_list)

Takes a list of primary key values and returns a dictionary mapping each primary key value to an instance of the object with the given ID, for example:

>>> Blog.objects.in_bulk([1])

{1: Beatles Blog}

>>> Blog.objects.in_bulk([1, 2])

{1: Beatles Blog, 2: Cheddar Talk}

>>> Blog.objects.in_bulk([])

{}

IDs of objects that don't exist are silently dropped from the result dictionary. If you pass in_bulk() an empty list, you'll get an empty dictionary.

latest(field_name=None)

Returns the latest object in the table, by date, using the field_name provided as the date field. This example returns the latest Entry in the table, according to the pub_date field:

>>> Entry.objects.latest('pub_date')

If your model's Meta specifies get_latest_by, you can leave off the field_name argument to latest(). Django will use the field specified in get_latest_by by default.

Like get(), latest() raises DoesNotExist if an object doesn't exist with the given parameters.

Field Lookups

Field lookups are how you specify the meat of an SQL WHERE clause. They're specified as keyword arguments to the QuerySet methods filter(), exclude(), and get().

Basic lookup keyword arguments take the form field__lookuptype=value (note the double underscore). For example:

>>> Entry.objects.filter(pub_date__lte='2006-01-01')

translates (roughly) into the following SQL:

SELECT * FROM blog_entry WHERE pub_date <= '2006-01-01';

If you pass an invalid keyword argument, a lookup function will raise TypeError . The supported lookup types follow.

exact

Performs an exact match:

>>> Entry.objects.get(headline__exact="Man bites dog")

This matches any object with the exact headline "Man bites dog".

If you don't provide a lookup type (that is, if your keyword argument doesn't contain a double underscore), the lookup type is assumed to be exact.

For example, the following two statements are equivalent:

>>> Blog.objects.get(id__exact=14) # Explicit form

>>> Blog.objects.get(id=14) # exact is implied

This is for convenience, because exact lookups are the common case.

iexact

Performs a case-insensitive exact match:

>>> Blog.objects.get(name__iexact='beatles blog')

This will match 'Beatles Blog' , 'beatles blog' , 'BeAtLes BLoG' , and so forth.

contains

Performs a case-sensitive containment test:

>>> Entry.objects.get(headline__contains='Lennon')

This will match the headline 'Today Lennon honored' but not 'today lennon honored'.


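SQLite doesn't support case-sensitive LIKE statements, so when using SQLite, contains acts like icontains.

Escaping Percent Signs and Underscores in LIKE Statements

The field lookups that equate to LIKE SQL statements (iexact, contains, icontains, startswith, istartswith, endswith, and iendswith) will automatically escape the two special characters used in LIKE statements: the percent sign and the underscore. (In a LIKE statement, the percent sign signifies a multiple-character wildcard and the underscore signifies a single-character wildcard.)

This means things should work intuitively, so the abstraction doesn't leak. For example, to retrieve all the entries that contain a percent sign, just use the percent sign as you would any other character:

Entry.objects.filter(headline__contains='%')

Django takes care of the quoting for you. The resulting SQL will look something like this:

SELECT ... WHERE headline LIKE '%\%%';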

The same goes for underscores. Both percentage signs and underscores are handled for you transparently.
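To see why the escaping matters, the idea can be reproduced by hand against an in-memory SQLite database. This is a sketch of the general technique, not Django's own code: escape_like, headline_contains, and the entry table are hypothetical, and the backslash escape character is declared explicitly with SQL's ESCAPE clause.

```python
import sqlite3

def escape_like(text, escape_char="\\"):
    """Escape LIKE wildcards so that `text` is matched literally."""
    # Escape the escape character first, then the two wildcards.
    for ch in (escape_char, "%", "_"):
        text = text.replace(ch, escape_char + ch)
    return text

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (headline TEXT)")
conn.executemany(
    "INSERT INTO entry VALUES (?)",
    [("100% true",), ("plain headline",), ("snake_case",)],
)

def headline_contains(fragment):
    """Return headlines containing `fragment` literally, in alphabetical order."""
    pattern = "%" + escape_like(fragment) + "%"
    rows = conn.execute(
        "SELECT headline FROM entry WHERE headline LIKE ? ESCAPE '\\' "
        "ORDER BY headline",
        (pattern,),
    )
    return [row[0] for row in rows]

percent_matches = headline_contains("%")
underscore_matches = headline_contains("_")
```

Without the escaping step, the pattern for "%" would be '%%%' and the pattern for "_" would be '%_%', each of which matches every non-empty headline in the table.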

icontains

Performs a case-insensitive containment test:

>>> Entry.objects.get(headline__icontains='Lennon')

Unlike contains, icontains will match 'today lennon honored'.

gt, gte, lt, and lte

These represent greater than, greater than or equal to, less than, and less than or equal to:

>>> Entry.objects.filter(id__gt=4)
>>> Entry.objects.filter(id__lt=15)
>>> Entry.objects.filter(id__gte=0)

These queries return any object with an ID greater than 4, an ID less than 15, and an ID greater than or equal to 0, respectively.

You'll usually use these on numeric fields. Be careful with character fields since character order isn't always what you'd expect (i.e., the string "4" sorts after the string "10").

in

Filters where a value is on a given list:

>>> Entry.objects.filter(id__in=[1, 3, 4])

This returns all objects with the ID 1, 3, or 4.

startswith

Performs a case-sensitive starts-with:

>>> Entry.objects.filter(headline__startswith='Will')

This will return the headlines "Will he run?" and "Willbur named judge", but not "Who is Will?" or "will found in crypt".

istartswith

Performs a case-insensitive starts-with:

>>> Entry.objects.filter(headline__istartswith='will')

This will return the headlines Will he run?, Willbur named judge, and will found in crypt, but not Who is Will?

endswith and iendswith

Perform case-sensitive and case-insensitive ends-with:

>>> Entry.objects.filter(headline__endswith='cats')
>>> Entry.objects.filter(headline__iendswith='cats')

range

Performs an inclusive range check:

>>> start_date = datetime.date(2005, 1, 1)

>>> end_date = datetime.date(2005, 3, 31)

>>> Entry.objects.filter(pub_date__range=(start_date, end_date))

You can use range anywhere you can use BETWEEN in SQL for dates, numbers, and even characters.

year, month, and day

For date/datetime fields, perform exact year, month, or day matches:

# Year lookup
>>> Entry.objects.filter(pub_date__year=2005)

# Month lookup -- takes integers
>>> Entry.objects.filter(pub_date__month=12)

# Day lookup
>>> Entry.objects.filter(pub_date__day=3)

# Combination: return all entries on Christmas of any year
>>> Entry.objects.filter(pub_date__month=12, pub_date__day=25)

isnull

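Takes either True or False, which correspond to SQL queries of IS NULL and IS NOT NULL, respectively:

>>> Entry.objects.filter(pub_date__isnull=True)

__isnull=True vs. __exact=None

There is an important difference between __isnull=True and __exact=None. __exact=None will always return an empty result set, because SQL requires that no value is equal to NULL. __isnull determines whether the field is currently holding the value NULL, without performing a comparison.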
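The SQL rule behind this difference is easy to observe directly. The following sketch uses the standard-library sqlite3 module with a made-up entry table; it shows the raw SQL behavior, not what any particular Django version generates for these lookups.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (headline TEXT, pub_date TEXT)")
conn.executemany(
    "INSERT INTO entry VALUES (?, ?)",
    [("dated", "2005-01-01"), ("undated", None)],  # None is stored as NULL
)

# Comparing with "= NULL" is never true in SQL, so this matches no rows.
equals_null = conn.execute(
    "SELECT headline FROM entry WHERE pub_date = NULL"
).fetchall()

# "IS NULL" tests for the absence of a value and finds the undated row.
is_null = conn.execute(
    "SELECT headline FROM entry WHERE pub_date IS NULL"
).fetchall()
```

After this runs, equals_null is an empty list, while is_null contains the single "undated" row.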


search

A Boolean full-text search that takes advantage of full-text indexing. This is like contains but is significantly faster due to full-text indexing.

Note this is available only in MySQL and requires direct manipulation of the database to add the full-text index.

The pk Lookup Shortcut

For convenience, Django provides a pk lookup type, which stands for primary_key.

In the example Blog model, the primary key is the id field, so these three statements are equivalent:

>>> Blog.objects.get(id__exact=14) # Explicit form

>>> Blog.objects.get(id=14) # exact is implied

>>> Blog.objects.get(pk=14) # pk implies id__exact

The use of pk isn't limited to __exact queries; any query term can be combined with pk to perform a query on the primary key of a model:

# Get blogs with id 1, 4, and 7

>>> Blog.objects.filter(pk__in=[1,4,7])

# Get all blogs with id > 14

>>> Blog.objects.filter(pk__gt=14)

pk lookups also work across joins. For example, these three statements are equivalent:

>>> Entry.objects.filter(blog__id__exact=3) # Explicit form

>>> Entry.objects.filter(blog__id=3) # exact is implied

>>> Entry.objects.filter(blog__pk=3) # pk implies __id__exact

Complex Lookups with Q Objects

Keyword argument queries (in filter() and so on) are ANDed together. If you need to execute more complex queries (e.g., queries with OR statements), you can use Q objects.

A Q object (django.db.models.Q) is an object used to encapsulate a collection of keyword arguments. These keyword arguments are specified as in the Field Lookups section.

For example, this Q object encapsulates a single LIKE query:

Q(question__startswith='What')

Q objects can be combined using the & and | operators. When an operator is used on two Q objects, it yields a new Q object. For example, this statement yields a single Q object that represents the OR of two "question__startswith" queries:

Q(question__startswith='Who') | Q(question__startswith='What')

This is equivalent to the following SQL WHERE clause:

WHERE question LIKE 'Who%' OR question LIKE 'What%'

You can compose statements of arbitrary complexity by combining Q objects with the & and | operators. You can also use parenthetical grouping.

Each lookup function that takes keyword arguments (e.g., filter(), exclude(), get()) can also be passed one or more Q objects as positional arguments. If you provide multiple Q object arguments to a lookup function, the arguments will be ANDed together, for example:

Poll.objects.get(
    Q(question__startswith='Who'),
    Q(pub_date=date(2005, 5, 2)) | Q(pub_date=date(2005, 5, 6))
)

translates roughly into the following SQL:

SELECT * from polls WHERE question LIKE 'Who%'
    AND (pub_date = '2005-05-02' OR pub_date = '2005-05-06')
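The way & and | build up a larger condition can be imitated with operator overloading. The class below is a toy, not django.db.models.Q: ToyQ and where_clause are hypothetical names, and the conditions are kept as ready-made SQL strings purely to make the grouping visible.

```python
class ToyQ:
    """Hypothetical stand-in for a Q object; it just holds an SQL condition string."""

    def __init__(self, sql):
        self.sql = sql

    def __and__(self, other):
        # Combining two conditions always yields a new object.
        return ToyQ("(%s AND %s)" % (self.sql, other.sql))

    def __or__(self, other):
        return ToyQ("(%s OR %s)" % (self.sql, other.sql))

def where_clause(*conditions):
    """AND together positional conditions, as a lookup function does with Q objects."""
    return "WHERE " + " AND ".join(c.sql for c in conditions)

who = ToyQ("question LIKE 'Who%'")
may_2 = ToyQ("pub_date = '2005-05-02'")
may_6 = ToyQ("pub_date = '2005-05-06'")

clause = where_clause(who, may_2 | may_6)
```

The resulting clause has the same shape as the SQL shown above: the OR pair is parenthesized as a unit and then ANDed with the first condition, and the original who, may_2, and may_6 objects are left unchanged.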

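Lookup functions can mix the use of Q objects and keyword arguments. All arguments provided to a lookup function (be they keyword arguments or Q objects) are ANDed together. However, if a Q object is provided, it must precede the definition of any keyword arguments. For example, the following:

Poll.objects.get(
    Q(pub_date=date(2005, 5, 2)) | Q(pub_date=date(2005, 5, 6)),
    question__startswith='Who')

would be a valid query, equivalent to the previous example, but this:

# INVALID QUERY
Poll.objects.get(
    question__startswith='Who',
    Q(pub_date=date(2005, 5, 2)) | Q(pub_date=date(2005, 5, 6)))

would not be valid.

You can find some examples online at http://www.djangoproject.com/documentation/0.96/models/or_lookups/.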

Related Objects

When you define a relationship in a model (i.e., a ForeignKey, OneToOneField, or ManyToManyField), instances of that model will have a convenient API to access the related object(s).

For example, an Entry object e can get its associated Blog object by accessing the blog attribute e.blog.

Django also creates API accessors for the other side of the relationship (the link from the related model to the model that defines the relationship). For example, a Blog object b has access to a list of all related Entry objects via the entry_set attribute: b.entry_set.all().

All examples in this section use the sample Blog, Author, and Entry models defined at the top of this page.

Lookups That Span Relationships

Django offers a powerful and intuitive way to follow relationships in lookups, taking care of the SQL JOINs for you automatically behind the scenes. To span a relationship, just use the field name of related fields across models, separated by double underscores, until you get to the field you want.

This example retrieves all Entry objects with a Blog whose name is 'Beatles Blog':

>>> Entry.objects.filter(blog__name__exact='Beatles Blog')

This spanning can be as deep as you'd like.

It works backward, too. To refer to a reverse relationship, just use the lowercase name of the model.

This example retrieves all Blog objects that have at least one Entry whose headline contains 'Lennon':

>>> Blog.objects.filter(entry__headline__contains='Lennon')

Foreign Key Relationships

If a model has a ForeignKey, instances of that model will have access to the related (foreign) object via a simple attribute of the model, for example:

e = Entry.objects.get(id=2)
e.blog # Returns the related Blog object.

You can get and set via a foreign key attribute. As you may expect, changes to the foreign key aren't saved to the database until you call save(), for example:

e = Entry.objects.get(id=2)
e.blog = some_blog
e.save()

If a ForeignKey field has null=True set (i.e., it allows NULL values), you can assign None to it (translator's note: in my experience, setting only null=True is not enough and raises an exception; blank=True has to be set as well. I don't know why, and I have long suspected this is a bug):

e = Entry.objects.get(id=2)
e.blog = None
e.save() # "UPDATE blog_entry SET blog_id = NULL ...;"

Forward access to one-to-many relationships is cached the first time the related object is accessed. Subsequent accesses to the foreign key on the same object instance are cached, for example:

e = Entry.objects.get(id=2)
print(e.blog) # Hits the database to retrieve the associated Blog.
print(e.blog) # Doesn't hit the database; uses cached version.

Note that the select_related() QuerySet method recursively prepopulates the cache of all one-to-many relationships ahead of time:

e = Entry.objects.select_related().get(id=2)
print(e.blog) # Doesn't hit the database; uses cached version.
print(e.blog) # Doesn't hit the database; uses cached version.

select_related() is documented in the QuerySet Methods That Return New QuerySets section.

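Foreign Key Reverse Relationships

Foreign key relationships are automatically symmetrical; a reverse relationship is inferred from the presence of a ForeignKey pointing to another model.

If a model has a ForeignKey, instances of the foreign key model will have access to a Manager that returns all instances of the first model that relate to that object. By default, this Manager is named FOO_set, where FOO is the source model name, lowercased. This Manager returns QuerySets, which can be filtered and manipulated as described in the Retrieving Objects section.

Here's an example:

b = Blog.objects.get(id=1)
b.entry_set.all() # Returns all Entry objects related to Blog.

# b.entry_set is a Manager that returns QuerySets.
b.entry_set.filter(headline__contains='Lennon')
b.entry_set.count()

You can override the FOO_set name by setting the related_name parameter in the ForeignKey() definition. For example, if the Entry model was altered to blog = ForeignKey(Blog, related_name='entries'), the preceding example code would look like this:

b = Blog.objects.get(id=1)
b.entries.all() # Returns all Entry objects related to Blog.

# b.entries is a Manager that returns QuerySets.
b.entries.filter(headline__contains='Lennon')
b.entries.count()

You cannot access a reverse ForeignKey Manager from the class; it must be accessed from an instance:

Blog.entry_set # Raises AttributeError: "Manager must be accessed via instance".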

In addition to the QuerySet methods defined in the Retrieving Objects section, the ForeignKey Manager has these additional methods:

add(obj1, obj2, ...) : Adds the specified model objects to the related object set, for example:

b = Blog.objects.get(id=1)

e = Entry.objects.get(id=234)

b.entry_set.add(e) # Associates Entry e with Blog b.

create(**kwargs) : Creates a new object, saves it, and puts it in the related object set. It returns the newly created object:

import datetime

b = Blog.objects.get(id=1)

e = b.entry_set.create(headline='Hello', body_text='Hi', pub_date=datetime.date(2005, 1, 1))

# No need to call e.save() at this point -- it's already been saved.

This is equivalent to (but much simpler than) the following:

b = Blog.objects.get(id=1)

e = Entry(blog=b, headline='Hello', body_text='Hi', pub_date=datetime.date(2005, 1, 1))

e.save()

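Note that there is no need to specify the keyword argument for the field that defines the relationship. In the preceding example, we don't pass the blog parameter to create(). Django figures out that the new Entry object's blog field should be set to b.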

remove(obj1, obj2, ...) : Removes the specified model objects from the related object set:

b = Blog.objects.get(id=1)

e = Entry.objects.get(id=234)

b.entry_set.remove(e) # Disassociates Entry e from Blog b.

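To prevent database inconsistency, this method exists only on ForeignKey objects where null=True. If the related field can't be set to None (NULL), then an object can't be removed from a relation without being added to another. In the preceding example, removing e from b.entry_set is equivalent to doing e.blog = None, and because the blog ForeignKey doesn't have null=True, this is invalid.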

clear() : Removes all objects from the related object set:

b = Blog.objects.get(id=1)

b.entry_set.clear()

Note that this doesn't delete the related objects; it just disassociates them.

Just like remove(), clear() is only available on ForeignKeys where null=True.

To assign the members of a related set in one fell swoop, assign to it from any iterable object, for example:

b = Blog.objects.get(id=1)

b.entry_set = [e1, e2]

If the clear() method is available, any pre-existing objects will be removed from the entry_set before all objects in the iterable (in this case, a list) are added to the set. If the clear() method is not available, all objects in the iterable will be added without removing any existing elements.

Each reverse operation described in this section has an immediate effect on the database. Every addition, creation, and deletion is immediately and automatically saved to the database.
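The rules for remove(), clear(), and bulk assignment can be summarized in a toy model. This is a sketch only: `RelatedSet` and its `assign()` method are hypothetical names standing in for the reverse Manager and for `b.entry_set = iterable`. Django omits remove() and clear() entirely when null=True is not set, whereas this sketch raises from inside the method, which is a simplification.

```python
class Entry:
    def __init__(self, headline):
        self.headline = headline
        self.blog = None


class RelatedSet:
    """Toy stand-in for a reverse ForeignKey manager (not Django's class)."""

    def __init__(self, owner, nullable):
        self.owner = owner
        self.nullable = nullable  # mirrors null=True on the ForeignKey
        self.members = []

    def add(self, *objs):
        for obj in objs:
            obj.blog = self.owner
            if obj not in self.members:
                self.members.append(obj)

    def remove(self, *objs):
        if not self.nullable:
            # Simplification: Django would not define remove() at all here.
            raise AttributeError("remove() needs a ForeignKey with null=True")
        for obj in objs:
            self.members.remove(obj)
            obj.blog = None  # disassociate; the object itself is kept

    def clear(self):
        self.remove(*list(self.members))

    def assign(self, iterable):
        """Mimic `b.entry_set = iterable`."""
        if self.nullable:
            self.clear()  # existing members are dropped first
        self.add(*iterable)


a1, a2, a3 = Entry("a1"), Entry("a2"), Entry("a3")
nullable_set = RelatedSet(owner="blog b", nullable=True)
nullable_set.add(a1)
nullable_set.assign([a2, a3])  # a1 is removed before a2 and a3 are added

s1, s2 = Entry("s1"), Entry("s2")
strict_set = RelatedSet(owner="blog b", nullable=False)
strict_set.add(s1)
strict_set.assign([s2])  # s1 stays because clear() is unavailable
```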

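Many-to-Many Relationships

Both ends of a many-to-many relationship get automatic API access to the other end. The API works just like the reverse one-to-many relationship described in the previous section.

The only difference is in the attribute naming: the model that defines the ManyToManyField uses the attribute name of that field itself, whereas the model on the other end uses the lowercased name of the original model plus _set (just like reverse one-to-many relationships).

An example makes this easier to understand: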

e = Entry.objects.get(id=3)

e.authors.all() # Returns all Author objects for this Entry.

e.authors.count()

e.authors.filter(name__contains='John')

a = Author.objects.get(id=5)

a.entry_set.all() # Returns all Entry objects for this Author.

Like ForeignKey , ManyToManyField can specify related_name . In the preceding example, if the ManyToManyField in Entry had specified related_name='entries' , then each Author instance would have an entries attribute instead of entry_set .

How Are the Backward Relationships Possible?

Other object-relational mappers require you to define relationships on both sides. The Django developers believe this is a violation of the DRY (Don't Repeat Yourself) principle, so Django requires you to define the relationship on only one end. But how is this possible, given that a model class doesn't know which other model classes are related to it until those other model classes are loaded?

The answer lies in the INSTALLED_APPS setting. The first time any model is loaded, Django iterates over every model in INSTALLED_APPS and creates the backward relationships in memory as needed. Essentially, one of the functions of INSTALLED_APPS is to tell Django the entire model domain.
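The idea of building backward relationships from one registry of models can be sketched as follows. This is an illustration, not Django's loading code: the minimal `ForeignKey` class, the `INSTALLED_MODELS` list (standing in for the models discovered through INSTALLED_APPS), and `build_backward_relations()` are all hypothetical.

```python
class ForeignKey:
    """Minimal stand-in that only records its target model."""

    def __init__(self, to, related_name=None):
        self.to = to
        self.related_name = related_name


class Blog:
    pass


class Author:
    pass


class Entry:
    blog = ForeignKey(Blog)


# Hypothetical stand-in for every model found via INSTALLED_APPS.
INSTALLED_MODELS = [Blog, Author, Entry]


def build_backward_relations(models):
    """Map each model to the reverse attribute names that point at it."""
    backward = {model: [] for model in models}
    for model in models:
        for field in vars(model).values():
            if isinstance(field, ForeignKey):
                name = field.related_name or model.__name__.lower() + "_set"
                backward[field.to].append(name)
    return backward


relations = build_backward_relations(INSTALLED_MODELS)
```

Because the loop sees every model at once, `Blog` learns about `entry_set` even though the relationship is declared only on `Entry`.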

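Queries Over Related Objects

Queries involving related objects follow the same rules as queries involving normal value fields. When specifying the value for a query to match, you may use either an object instance itself or the primary key value for the object.

For example, if you have a Blog object b with id=5, the following three queries would be identical:

Entry.objects.filter(blog=b) # Query using object instance

Entry.objects.filter(blog=b.id) # Query using id from instance

Entry.objects.filter(blog=5) # Query using id directly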
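The equivalence of the three queries can be modeled by normalizing the lookup value to a primary key before comparing. This is a plain-Python sketch; `to_pk()` and `filter_by_blog()` are hypothetical helpers, and the rows are dictionaries rather than model instances.

```python
class Blog:
    def __init__(self, id, name):
        self.id = id
        self.name = name


def to_pk(value):
    """Return the primary key for a model instance or a raw key."""
    return getattr(value, "id", value)


def filter_by_blog(rows, blog):
    key = to_pk(blog)
    return [row for row in rows if row["blog_id"] == key]


entries = [
    {"headline": "A", "blog_id": 5},
    {"headline": "B", "blog_id": 7},
]

b = Blog(5, "Beatles Blog")
by_instance = filter_by_blog(entries, b)
by_instance_id = filter_by_blog(entries, b.id)
by_literal_id = filter_by_blog(entries, 5)
```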

Deleting Objects

The delete method, conveniently, is named delete() . This method immediately deletes the object and has no return value:

e.delete()

      You can also delete objects in bulk. Every QuerySet has a delete() method, which deletes all members of that QuerySet . For example, this deletes all Entry objects with a pub_date year of 2005:

      Entry.objects.filter(pub_date__year=2005).delete()

      When Django deletes an object, it emulates the behavior of the SQL constraint ON DELETE CASCADE. In other words, any objects that had foreign keys pointing at the object to be deleted will be deleted along with it, for example:

      b = Blog.objects.get(pk=1)

      # This will delete the Blog and all of its Entry objects.

      b.delete()
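The cascading behavior can be illustrated with in-memory data. This is a sketch of the effect only, not of how Django issues its DELETE statements; `delete_with_cascade()` and the sample rows are hypothetical.

```python
def delete_with_cascade(blogs, entries, blog_id):
    """Drop one blog and every entry whose foreign key points at it."""
    remaining_blogs = {pk: name for pk, name in blogs.items() if pk != blog_id}
    remaining_entries = [e for e in entries if e["blog_id"] != blog_id]
    return remaining_blogs, remaining_entries


blogs = {1: "Beatles Blog", 2: "Cheddar Talk"}
entries = [
    {"headline": "Lennon", "blog_id": 1},
    {"headline": "Brie", "blog_id": 2},
    {"headline": "Ringo", "blog_id": 1},
]

blogs_after, entries_after = delete_with_cascade(blogs, entries, 1)
```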

      Note that delete() is the only QuerySet method that is not exposed on a Manager itself. This is a safety mechanism to prevent you from accidentally requesting Entry.objects.delete() and deleting all the entries. If you do want to delete all the objects, then you have to explicitly request a complete query set:

      Entry.objects.all().delete()

      Extra Instance Methods

      In addition to save() and delete() , a model object might get any or all of the following methods.

      get_FOO_display()

      For every field that has choices set, the object will have a get_FOO_display() method, where FOO is the name of the field. This method returns the human-readable value of the field. For example, in the following model:

      GENDER_CHOICES = (
          ('M', 'Male'),
          ('F', 'Female'),
      )

      class Person(models.Model):
          name = models.CharField(max_length=20)
          gender = models.CharField(max_length=1, choices=GENDER_CHOICES)

      Each Person instance will have a get_gender_display() method:

      >>> p = Person(name='John', gender='M')
      >>> p.save()
      >>> p.gender
      'M'
      >>> p.get_gender_display()
      'Male'
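The way a get_FOO_display() method might be generated for each field with choices can be sketched in plain Python. This is not Django's mechanism; `add_display_methods()` is a hypothetical helper, and falling back to the raw value for an unknown key is an assumption of the sketch.

```python
GENDER_CHOICES = (
    ("M", "Male"),
    ("F", "Female"),
)


class Person:
    def __init__(self, name, gender):
        self.name = name
        self.gender = gender


def add_display_methods(cls, choices_by_field):
    """Attach a get_<field>_display() method for every field with choices."""
    for field, choices in choices_by_field.items():
        labels = dict(choices)

        # Default arguments bind the current field and labels to this method.
        def display(self, field=field, labels=labels):
            value = getattr(self, field)
            return labels.get(value, value)  # assumption: unknown values pass through

        setattr(cls, "get_%s_display" % field, display)
    return cls


add_display_methods(Person, {"gender": GENDER_CHOICES})

p = Person(name="John", gender="M")
label = p.get_gender_display()
```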

      get_next_by_FOO(**kwargs) and get_previous_by_FOO(**kwargs)

      For every DateField and DateTimeField that does not have null=True, the object will have get_next_by_FOO() and get_previous_by_FOO() methods, where FOO is the name of the field. These return the next and previous object with respect to the date field, raising the appropriate DoesNotExist exception when no such object exists.

      Both methods accept optional keyword arguments, which should be in the format described in the Field Lookups section.

      Note that in the case of identical date values, these methods will use the ID as a fallback check. This guarantees that no records are skipped or duplicated. For a full example, see the lookup API samples at http://www.djangoproject.com/documentation/0.96/models/lookup/.
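The ID fallback for identical dates amounts to ordering by the pair (date, id). The sketch below shows that ordering on in-memory rows; it is not the SQL Django generates, and the function names and the local `DoesNotExist` class are hypothetical stand-ins.

```python
import datetime


class DoesNotExist(Exception):
    """Stand-in for a model's DoesNotExist exception."""


def sort_key(row):
    # Order by date first, then by id to break ties between equal dates.
    return (row["pub_date"], row["id"])


def get_next_by_pub_date(rows, current):
    later = [row for row in rows if sort_key(row) > sort_key(current)]
    if not later:
        raise DoesNotExist("no later entry")
    return min(later, key=sort_key)


def get_previous_by_pub_date(rows, current):
    earlier = [row for row in rows if sort_key(row) < sort_key(current)]
    if not earlier:
        raise DoesNotExist("no earlier entry")
    return max(earlier, key=sort_key)


same_day = datetime.date(2005, 1, 1)
rows = [
    {"id": 1, "pub_date": same_day},
    {"id": 2, "pub_date": same_day},
    {"id": 3, "pub_date": datetime.date(2005, 1, 2)},
]
```

Walking forward from the first row visits ids 1, 2, 3 in turn, so the two rows sharing a date are neither skipped nor repeated.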

      get_FOO_filename()

      For every FileField , the object will have a get_FOO_filename() method, where FOO is the name of the field. This returns the full filesystem path to the file, according to your MEDIA_ROOT setting.

      Note that ImageField is technically a subclass of FileField, so every model with an ImageField will also get this method.


      get_FOO_url()

      For every FileField , the object will have a get_FOO_url() method, where FOO is the name of the field. This returns the full URL to the file, according to your MEDIA_URL setting. If the value is blank, this method returns an empty string.

      get_FOO_size()

      For every FileField , the object will have a get_FOO_size() method, where FOO is the name of the field. This returns the size of the file, in bytes. (Behind the scenes, it uses os.path.getsize .)

      save_FOO_file(filename, raw_contents)

      For every FileField , the object will have a save_FOO_file() method, where FOO is the name of the field. This saves the given file to the filesystem, using the given file name. If a file with the given file name already exists, Django adds an underscore to the end of the file name (but before the extension) until the file name is available.
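The renaming rule (append underscores before the extension until the name is free) can be sketched as a small function. This is an illustration only: `available_name()` is a hypothetical helper, and the `existing` set stands in for a check against the filesystem.

```python
import os


def available_name(filename, existing):
    """Append underscores before the extension until the name is unused."""
    root, ext = os.path.splitext(filename)
    while root + ext in existing:
        root += "_"
    return root + ext


taken = {"photo.jpg", "photo_.jpg"}
unique = available_name("photo.jpg", taken)
untouched = available_name("notes.txt", taken)
```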

      get_FOO_height() and get_FOO_width()

      For every ImageField , the object will have get_FOO_height() and get_FOO_width() methods, where FOO is the name of the field. This returns the height (or width) of the image, as an integer, in pixels.

      Shortcuts

      As you develop views, you will discover a number of common idioms in the way you use the database API. Django encodes some of these idioms as shortcuts that can be used to simplify the process of writing views. These functions are in the django.shortcuts module.

      get_object_or_404()

      One common idiom is to use get() and raise Http404 if the object doesn't exist. This idiom is captured by get_object_or_404(). This function takes a Django model as its first argument and an arbitrary number of keyword arguments, which it passes to the default manager's get() function. It raises Http404 if the object doesn't exist, for example:

      from django.shortcuts import get_object_or_404

      # Get the Entry with a primary key of 3
      e = get_object_or_404(Entry, pk=3)

      When you provide a model to this shortcut function, the default manager is used to execute the underlying get() query. If you don't want to use the default manager, or if you want to search a list of related objects, you can provide get_object_or_404() with a Manager object instead:

      # Get the author of blog instance e with a name of 'Fred'
      a = get_object_or_404(e.authors, name='Fred')

      # Use a custom manager 'recent_entries' in the search for an

      # entry with a primary key of 3

      e = get_object_or_404(Entry.recent_entries, pk=3)

      get_list_or_404()

      get_list_or_404() behaves the same way as get_object_or_404(), except that it uses filter() instead of get(). It raises Http404 if the list is empty.
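Both shortcuts follow the same pattern: run the query, and turn "nothing found" into Http404. The sketch below re-creates that pattern without Django so it can run on its own. The `Http404` and `DoesNotExist` classes and the `ListManager` are hypothetical stand-ins, the two functions are simplified re-implementations rather than the real `django.shortcuts` code, and the case of several matches for get() is ignored.

```python
class Http404(Exception):
    """Stand-in for django.http.Http404."""


class DoesNotExist(Exception):
    """Stand-in for a model's DoesNotExist exception."""


class ListManager:
    """Hypothetical manager that searches a list of dictionaries."""

    def __init__(self, rows):
        self.rows = rows

    def filter(self, **kwargs):
        return [
            row for row in self.rows
            if all(row.get(key) == value for key, value in kwargs.items())
        ]

    def get(self, **kwargs):
        matches = self.filter(**kwargs)
        if not matches:
            raise DoesNotExist("no matching row")
        return matches[0]  # multiple matches are not handled in this sketch


def get_object_or_404(manager, **kwargs):
    try:
        return manager.get(**kwargs)
    except DoesNotExist:
        raise Http404("No object matches the given query.")


def get_list_or_404(manager, **kwargs):
    results = list(manager.filter(**kwargs))
    if not results:
        raise Http404("No objects match the given query.")
    return results


entries = ListManager([
    {"pk": 3, "headline": "Hello", "year": 2005},
    {"pk": 4, "headline": "Again", "year": 2005},
])

entry = get_object_or_404(entries, pk=3)
year_2005 = get_list_or_404(entries, year=2005)
```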

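      Falling Back to Raw SQL

      If you find yourself needing to write an SQL query that is too complex for Django's database mapper to handle, you can fall back to raw SQL statements.

      The preferred way to do this is to give your model custom methods or custom manager methods that execute the queries. Although nothing in Django requires database queries to live in the model layer, this approach keeps all your data access logic in one place, which is a sound choice from a code organization standpoint. For instructions, see Appendix B.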
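A custom manager method wrapping a raw query might look like the following sketch. To keep it runnable without a Django project, it uses the standard library's `sqlite3` module with an in-memory database in place of Django's database connection; `EntryManager`, `count_per_blog()`, and the `blog_entry` table layout are assumptions made for the illustration.

```python
import sqlite3

# In-memory database standing in for the project's real database.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE blog_entry ("
    "id INTEGER PRIMARY KEY, headline TEXT, blog_id INTEGER)"
)
connection.executemany(
    "INSERT INTO blog_entry (headline, blog_id) VALUES (?, ?)",
    [("Lennon", 1), ("Ringo", 1), ("Brie", 2)],
)


class EntryManager:
    """Hypothetical manager that keeps the raw SQL in one place."""

    def __init__(self, connection):
        self.connection = connection

    def count_per_blog(self):
        cursor = self.connection.cursor()
        cursor.execute(
            "SELECT blog_id, COUNT(*) FROM blog_entry "
            "GROUP BY blog_id ORDER BY blog_id"
        )
        return cursor.fetchall()


counts = EntryManager(connection).count_per_blog()
```

Callers only see `count_per_blog()`, so the SQL can change later without touching the views that use it.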

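      Finally, it's important to note that the Django database layer is merely an interface to your database. You can access your database via other tools, programming languages, or database frameworks; there is nothing Django-specific about your database.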