9.类 Classes
Python's class mechanism adds classes to the language with a minimum
of new syntax and semantics. It is a mixture of the class mechanisms
found in C++ and Modula-3. As is true for modules, classes in Python
do not put an absolute barrier between definition and user, but rather
rely on the politeness of the user not to ``break into the
definition.'' The most important features of classes are retained
with full power, however: the class inheritance mechanism allows
multiple base classes, a derived class can override any methods of its
base class or classes, a method can call the method of a base class with the
same name. Objects can contain an arbitrary amount of private data.
Python 在尽可能不增加新的语法和语义的情况下加入了类机制。这种机制是 C++ 和 Modula-3 的混合。Python中的类没有在用户和定义之间建立一个绝对的屏障,而是依赖于用户自觉的不去“破坏定义”。然而,类机制最重要的功能都完整的保留下来。类继承机制允许多继承,派生类可以覆盖(override)基类中的任何方法,方法中可以调用基类中的同名方法。对象可以包含任意数量的私有成员。
In C++ terminology, all class members (including the data members) are
public, and all member functions are virtual. There are
no special constructors or destructors. As in Modula-3, there are no
shorthands for referencing the object's members from its methods: the
method function is declared with an explicit first argument
representing the object, which is provided implicitly by the call. As
in Smalltalk, classes themselves are objects, albeit in the wider
sense of the word: in Python, all data types are objects. This
provides semantics for importing and renaming. Unlike
C++ and Modula-3, built-in types can be used as base classes for
extension by the user. Also, like in C++ but unlike in Modula-3, most
built-in operators with special syntax (arithmetic operators,
subscripting etc.) can be redefined for class instances.
用 C++ 术语来讲,所有的类成员(包括数据成员)都是公有( public )的,所有的成员函数都是虚拟( virtual )的。没有特定的构造和析构函数。用Modula-3的术语来讲,在成员方法中没有什么简便的方式(shorthands)可以引用对象的成员:方法函数在定义时需要以引用的对象做为第一个参数,调用时则会隐式引用对象。这样就形成了语义上的引入和重命名。( This provides semantics for importing and renaming. )但是,像 C++ 而非 Modula-3 中那样,大多数带有特殊语法的内置操作符(算法运算符、下标等)都可以针对类的需要重新定义。
9.1 有关术语的话题 A Word About Terminology
Lacking universally accepted terminology to talk about classes, I will
make occasional use of Smalltalk and C++ terms. (I would use Modula-3
terms, since its object-oriented semantics are closer to those of
Python than C++, but I expect that few readers have heard of it.)
由于没有什么关于类的通用术语,我从 Smalltalk 和 C++ 中借用一些(我更希望用 Modula-3 的,因为它的面向对象机制比 C++更接近Python,不过我想没多少读者听说过它)。
I also have to warn you that there's a terminological pitfall for
object-oriented readers: the word ``object'' in Python does not
necessarily mean a class instance. Like C++ and Modula-3, and
unlike Smalltalk, not all types in Python are classes: the basic
built-in types like integers and lists are not, and even somewhat more
exotic types like files aren't. However, all Python types
share a little bit of common semantics that is best described by using
the word object.
我要提醒读者,这里有一个面向对象方面的术语陷阱,在 Python 中“对象”这个词不一定指类实例。Python 中并非所有的类型都是类:例如整型、链表这些内置数据类型就不是,甚至某些像文件这样的外部类型也不是,这一点类似于 C++ 和 Modula-3,而不像 Smalltalk。然而,所有的 Python 类型在语义上都有一点相同之处:描述它们的最贴切词语是“对象”。
Objects have individuality, and multiple names (in multiple scopes)
can be bound to the same object. This is known as aliasing in other
languages. This is usually not appreciated on a first glance at
Python, and can be safely ignored when dealing with immutable basic
types (numbers, strings, tuples). However, aliasing has an
(intended!) effect on the semantics of Python code involving mutable
objects such as lists, dictionaries, and most types representing
entities outside the program (files, windows, etc.). This is usually
used to the benefit of the program, since aliases behave like pointers
in some respects. For example, passing an object is cheap since only
a pointer is passed by the implementation; and if a function modifies
an object passed as an argument, the caller will see the change -- this
eliminates the need for two different argument passing mechanisms as in
Pascal.
对象是被特化的,多个名字(在多个作用域中)可以绑定同一个对象。这相当于其它语言中的别名。通常对 Python 的第一印象中会忽略这一点,使用那些不可变的基本类型(数值、字符串、元组)时也可以很放心的忽视它。然而,在 Python 代码调用字典、链表之类可变对象,以及大多数涉及程序外部实体(文件、窗体等等)的类型时,这一语义就会有影响。这通常有助于优化程序,因为别名的行为在某些方面类似于指针。例如,很容易传递一个对象,因为在行为上只是传递了一个指针。如果函数修改了一个通过参数传递的对象,调用者可以接收到变化--在 Pascal 中这需要两个不同的参数传递机制。
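The effect of aliasing on a mutable object can be seen in a minimal sketch. The names append_marker, shared and alias are invented for illustration only.

```python
def append_marker(items):
    # The function receives a reference to the caller's list, not a copy.
    items.append("marker")

shared = [1, 2, 3]
alias = shared                  # a second name bound to the same list object
append_marker(alias)

same_object = alias is shared   # both names refer to one object
```

After the call, the change made inside the function is visible through both names, because only one list ever existed.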
9.2 Python 作用域和命名空间 Python Scopes and Name Spaces
Before introducing classes, I first have to tell you something about
Python's scope rules. Class definitions play some neat tricks with
namespaces, and you need to know how scopes and namespaces work to
fully understand what's going on. Incidentally, knowledge about this
subject is useful for any advanced Python programmer.
在介绍类之前,我首先介绍一些有关 Python 作用域的规则:类的定义非常巧妙的运用了命名空间,要完全理解接下来的知识,需要先理解作用域和命名空间的工作原理。另外,这一切的知识对于任何高级 Python 程序员都非常有用。
Let's begin with some definitions.
我们从一些定义开始。
A namespace is a mapping from names to objects. Most
namespaces are currently implemented as Python dictionaries, but
that's normally not noticeable in any way (except for performance),
and it may change in the future. Examples of namespaces are: the set
of built-in names (functions such as abs(), and built-in
exception names); the global names in a module; and the local names in
a function invocation. In a sense the set of attributes of an object
also form a namespace. The important thing to know about namespaces
is that there is absolutely no relation between names in different
namespaces; for instance, two different modules may both define a
function ``maximize'' without confusion -- users of the modules must
prefix it with the module name.
命名空间是从命名到对象的映射。当前命名空间主要是通过 Python 字典实现的,不过通常不关心具体的实现方式(除非出于性能考虑),以后也有可能会改变其实现方式。以下有一些命名空间的例子:内置命名(像 abs() 这样的函数,以及内置异常名)集,模块中的全局命名,函数调用中的局部命名。某种意义上讲对象的属性集也是一个命名空间。关于命名空间需要了解的一件很重要的事就是不同命名空间中的命名没有任何联系,例如两个不同的模块可能都会定义一个名为“maximize”的函数而不会发生混淆--用户必须以模块名为前缀来引用它们。
By the way, I use the word attribute for any name following a
dot -- for example, in the expression z.real, real is an attribute
of the object z. Strictly speaking, references to names in modules
are attribute references: in the expression modname.funcname,
modname is a module object and funcname is an attribute of it. In
this case there happens to be a straightforward mapping between the
module's attributes and the global names defined in the module: they
share the same namespace!
顺便提一句,我称 Python 中任何一个“.”之后的命名为属性--例如,表达式 z.real 中的 real 是对象 z 的一个属性。严格来讲,从模块中引用命名是引用属性:表达式 modname.funcname 中,modname 是一个模块对象,funcname 是它的一个属性。因此,模块的属性和模块中的全局命名有直接的映射关系:它们共享同一命名空间!
Attributes may be read-only or writable. In the latter case,
assignment to attributes is possible. Module attributes are writable:
you can write "modname.the_answer = 42". Writable attributes may
also be deleted with the del statement. For example,
"del modname.the_answer" will remove the attribute
the_answer from the object named by modname.
属性可以是只读的或可写的。后一种情况下,可以对属性赋值。模块属性是可写的,你可以这样作:"modname.the_answer = 42"。可写的属性也可以用 del 语句删除。例如:"del modname.the_answer" 会从 modname 对象中删除 the_answer 属性。
Name spaces are created at different moments and have different
lifetimes. The namespace containing the built-in names is created
when the Python interpreter starts up, and is never deleted. The
global namespace for a module is created when the module definition
is read in; normally, module namespaces also last until the
interpreter quits. The statements executed by the top-level
invocation of the interpreter, either read from a script file or
interactively, are considered part of a module called
__main__, so they have their own global namespace. (The
built-in names actually also live in a module; this is called
__builtin__.)
不同的命名空间在不同的时刻创建,有不同的生存期。包含内置命名的命名空间在 Python 解释器启动时创建,会一直保留,不被删除。模块的全局命名空间在模块定义被读入时创建,通常,模块命名空间也会一直保存到解释器退出。由解释器在最高层调用执行的语句,不管它是从脚本文件中读入还是来自交互式输入,都是 __main__ 模块的一部分,所以它们也拥有自己的命名空间。(内置命名也同样被包含在一个模块中,它被称作 __builtin__ 。)
The local namespace for a function is created when the function is
called, and deleted when the function returns or raises an exception
that is not handled within the function. (Actually, forgetting would
be a better way to describe what actually happens.) Of course,
recursive invocations each have their own local namespace.
当函数被调用时创建一个局部命名空间,函数返回或抛出一个未在函数内处理的异常时删除。(实际上,说是遗忘更为贴切)。当然,每一个递归调用拥有自己的命名空间。
A scope is a textual region of a Python program where a
namespace is directly accessible. ``Directly accessible'' here means
that an unqualified reference to a name attempts to find the name in
the namespace.
作用域是 Python 程序中的一个文本区域,在这个区域内可以直接访问某个命名空间。这里的“直接访问”是指对命名的非限定引用会尝试在该命名空间中查找这个命名。
Although scopes are determined statically, they are used dynamically.
At any time during execution, there are at least three nested scopes whose
namespaces are directly accessible: the innermost scope, which is searched
first, contains the local names; the namespaces of any enclosing
functions, which are searched starting with the nearest enclosing scope;
the middle scope, searched next, contains the current module's global names;
and the outermost scope (searched last) is the namespace containing built-in
names.
尽管作用域是静态定义,在使用时它们都是动态的。每次执行时,至少有三个命名空间可以直接访问的作用域嵌套在一起:包含局部命名的作用域在最里面,首先被搜索;接着由内向外搜索各层外围函数的命名空间;其次搜索的是中层的作用域,这里包含当前模块的全局命名;最后搜索最外面的作用域,它包含内置命名。
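The search order described above can be sketched with a few small functions. The names limit, outer, inner, uses_global and uses_builtin are made up for this illustration.

```python
limit = "global"              # lives in the module's global namespace

def outer():
    limit = "enclosing"       # local to outer(), enclosing for inner()

    def inner():
        # No local 'limit' here, so the nearest enclosing scope is used.
        return limit

    return inner()

def uses_global():
    # No local or enclosing binding: the module's global name is found.
    return limit

def uses_builtin():
    # 'len' is neither local nor global here; it comes from the built-in names.
    return len("abc")
```

Calling outer() yields the value from the enclosing function, uses_global() falls through to the module, and uses_builtin() reaches the outermost namespace.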
If a name is declared global, then all references and assignments go
directly to the middle scope containing the module's global names.
Otherwise, all variables found outside of the innermost scope are read-only.
如果一个命名声明为全局的,那么所有的赋值和引用都直接针对包含模块全局命名的中级作用域。否则,在最内层作用域之外找到的所有变量都是只读的。
Usually, the local scope references the local names of the (textually)
current function. Outside of functions, the local scope references
the same namespace as the global scope: the module's namespace.
Class definitions place yet another namespace in the local scope.
从文本意义上讲,局部作用域引用当前函数的命名。在函数之外,局部作用域与全局作用域引用同一命名空间:模块命名空间。类定义也是局部作用域中的另一个命名空间。
It is important to realize that scopes are determined textually: the
global scope of a function defined in a module is that module's
namespace, no matter from where or by what alias the function is
called. On the other hand, the actual search for names is done
dynamically, at run time -- however, the language definition is
evolving towards static name resolution, at ``compile'' time, so don't
rely on dynamic name resolution! (In fact, local variables are
already determined statically.)
作用域决定于源程序的文本:一个定义于某模块中的函数的全局作用域是该模块的命名空间,而不是该函数的别名被定义或调用的位置,了解这一点非常重要。另一方面,命名的实际搜索过程是动态的,在运行时确定的——然而,Python 语言也在不断发展,以后有可能会成为静态的“编译”时确定,所以不要依赖于动态解析!(事实上,局部变量已经是静态确定了。)
A special quirk of Python is that assignments always go into the
innermost scope. Assignments do not copy data -- they just
bind names to objects. The same is true for deletions: the statement
"del x" removes the binding of x from the namespace
referenced by the local scope. In fact, all operations that introduce
new names use the local scope: in particular, import statements and
function definitions bind the module or function name in the local
scope. (The global statement can be used to indicate that
particular variables live in the global scope.)
Python 的一个特别之处在于其赋值操作总是在最里层的作用域。赋值不会复制数据——只是将命名绑定到对象。删除也是如此:"del x" 只是从局部作用域的命名空间中删除命名 x。事实上,所有引入新命名的操作都作用于局部作用域。特别是 import 语句和函数定义将模块名或函数绑定于局部作用域。(可以使用 global 语句将变量引入到全局作用域。)
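A small sketch may make the assignment rule concrete. The names counter, bump_local and bump_global are hypothetical and exist only for this example.

```python
counter = 0

def bump_local():
    counter = 100      # binds a new local name; the global is untouched
    return counter

def bump_global():
    global counter     # assignments below now target the module's namespace
    counter = counter + 1
    return counter

local_result = bump_local()
after_local = counter      # still the original global value
global_result = bump_global()
after_global = counter     # changed by the function that declared it global
```

Without the global statement the assignment creates a local binding, so the module-level name only changes in the second function.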
9.3 初识类 A First Look at Classes
Classes introduce a little bit of new syntax, three new object types,
and some new semantics.
类引入了一点新的语法,三种新的对象类型,以及一些新的语义。
9.3.1 类定义语法 Class Definition Syntax
The simplest form of class definition looks like this:
最简单的类定义形式如下:
class ClassName:
    <statement-1>
    .
    .
    .
    <statement-N>
Class definitions, like function definitions
(def statements) must be executed before they have any
effect. (You could conceivably place a class definition in a branch
of an if statement, or inside a function.)
类的定义就像函数定义( def 语句),要先执行才能生效。(你当然可以把它放进 if 语句的某一分支,或者一个函数的内部。)
In practice, the statements inside a class definition will usually be
function definitions, but other statements are allowed, and sometimes
useful -- we'll come back to this later. The function definitions
inside a class normally have a peculiar form of argument list,
dictated by the calling conventions for methods -- again, this is
explained later.
习惯上,类定义语句的内容通常是函数定义,不过其它语句也可以,有时会很有用——后面我们再回过头来讨论。类中的函数定义通常包括了一个特殊形式的参数列表,用于方法调用约定——同样我们在后面讨论这些。
When a class definition is entered, a new namespace is created, and
used as the local scope -- thus, all assignments to local variables
go into this new namespace. In particular, function definitions bind
the name of the new function here.
定义一个类的时候,会创建一个新的命名空间,将其作为局部作用域使用——因此,所有对局部变量的赋值都引入新的命名空间。特别是函数定义将新函数的命名绑定于此。
When a class definition is left normally (via the end), a class
object is created. This is basically a wrapper around the contents
of the namespace created by the class definition; we'll learn more
about class objects in the next section. The original local scope
(the one in effect just before the class definition was entered) is
reinstated, and the class object is bound here to the class name given
in the class definition header (ClassName in the example).
类定义完成时(正常退出),就创建了一个类对象。基本上它是对类定义创建的命名空间进行了一个包装;我们在下一节进一步学习类对象的知识。原始的局部作用域(类定义引入之前生效的那个)得到恢复,类对象在这里绑定到类定义头部的类名(例子中是 ClassName )。
9.3.2 类对象 Class Objects
Class objects support two kinds of operations: attribute references
and instantiation.
类对象支持两种操作:属性引用和实例化。
Attribute references use the standard syntax used for all
attribute references in Python: obj.name. Valid attribute
names are all the names that were in the class's namespace when the
class object was created. So, if the class definition looked like
this:
属性引用使用和 Python 中所有的属性引用一样的标准语法:obj.name。类对象创建后,类命名空间中所有的命名都是有效属性名。所以如果类定义是这样:
class MyClass:
    "A simple example class"
    i = 12345
    def f(self):
        return 'hello world'
then MyClass.i and MyClass.f are valid attribute
references, returning an integer and a method object, respectively.
Class attributes can also be assigned to, so you can change the value
of MyClass.i by assignment. __doc__ is also a valid
attribute, returning the docstring belonging to the class: "A
simple example class".
那么 MyClass.i 和 MyClass.f 是有效的属性引用,分别返回一个整数和一个方法对象。也可以对类属性赋值,你可以通过给 MyClass.i 赋值来修改它。 __doc__ 也是一个有效的属性,返回类的文档字符串: "A simple example class"。
Class instantiation uses function notation. Just pretend that
the class object is a parameterless function that returns a new
instance of the class. For example (assuming the above class):
类的实例化使用函数符号。只要将类对象看作是一个返回新的类实例的无参数函数即可。例如(假设沿用前面的类):
x = MyClass()
creates a new instance of the class and assigns this object to
the local variable x.
以上创建了一个新的类实例并将该对象赋给局部变量 x。
The instantiation operation (``calling'' a class object) creates an
empty object. Many classes like to create objects in a known initial
state. Therefore a class may define a special method named
__init__(), like this:
这个实例化操作(“调用”一个类对象)来创建一个空的对象。很多类都倾向于将对象创建为有初始状态的。因此类可能会定义一个名为 __init__() 的特殊方法,像下面这样:
def __init__(self):
    self.data = []
When a class defines an __init__() method, class
instantiation automatically invokes __init__() for the
newly-created class instance. So in this example, a new, initialized
instance can be obtained by:
类定义了 __init__() 方法的话,类的实例化操作会自动为新创建的类实例调用 __init__() 方法。所以在下例中,可以这样创建一个新的实例:
x = MyClass()
Of course, the __init__() method may have arguments for
greater flexibility. In that case, arguments given to the class
instantiation operator are passed on to __init__(). For
example,
当然,出于弹性的需要, __init__() 方法可以有参数。这种情况下,传给类实例化操作的参数会被传递给 __init__()。例如:
>>> class Complex:
...     def __init__(self, realpart, imagpart):
...         self.r = realpart
...         self.i = imagpart
...
>>> x = Complex(3.0, -4.5)
>>> x.r, x.i
(3.0, -4.5)
9.3.3 实例对象 Instance Objects
Now what can we do with instance objects? The only operations
understood by instance objects are attribute references. There are
two kinds of valid attribute names.
现在我们可以用实例对象作什么?实例对象唯一可用的操作就是属性引用。有两种有效的属性名。
The first I'll call data attributes. These correspond to
``instance variables'' in Smalltalk, and to ``data members'' in
C++. Data attributes need not be declared; like local variables,
they spring into existence when they are first assigned to. For
example, if x is the instance of MyClass created above,
the following piece of code will print the value 16, without
leaving a trace:
第一种称作数据属性。这相当于 Smalltalk 中的“实例变量”或 C++中的“数据成员”。和局部变量一样,数据属性不需要声明,第一次使用时它们就会生成。例如,如果 x 是前面创建的 MyClass 实例,下面这段代码会打印出 16 而不会有任何多余的残留:
x.counter = 1
while x.counter < 10:
    x.counter = x.counter * 2
print x.counter
del x.counter
The second kind of attribute references understood by instance objects
are methods. A method is a function that ``belongs to'' an
object. (In Python, the term method is not unique to class instances:
other object types can have methods as well. For example, list objects have
methods called append, insert, remove, sort, and so on. However,
below, we'll use the term method exclusively to mean methods of class
instance objects, unless explicitly stated otherwise.)
第二种为实例对象所接受的引用属性是方法。方法是属于一个对象的函数。(在 Python 中,方法不止是类实例所独有:其它类型的对象也可有方法。例如,链表对象有 append,insert,remove,sort 等等方法。然而,在这里,除非特别说明,我们提到的方法特指类方法)
Valid method names of an instance object depend on its class. By
definition, all attributes of a class that are (user-defined) function
objects define corresponding methods of its instances. So in our
example, x.f is a valid method reference, since MyClass.f is a
function, but x.i is not, since MyClass.i is not. But x.f is not
the same thing as MyClass.f -- it is a method object, not
a function object.
实例对象的有效名称依赖于它的类。按照定义,类中所有(用户定义)的函数对象对应它的实例中的方法。所以在我们的例子中,x.f 是一个有效的方法引用,因为 MyClass.f 是一个函数。但 x.i 不是,因为 MyClass.i 不是函数。不过 x.f 和 MyClass.f 不同--它是一个方法对象,不是一个函数对象。
9.3.4 方法对象 Method Objects
Usually, a method is called immediately:
通常方法是直接调用的:
x.f()
In our example, this will return the string 'hello world'.
However, it is not necessary to call a method right away: x.f
is a method object, and can be stored away and called at a
later time. For example:
在我们的例子中,这会返回字符串 'hello world'。然而,也不是一定要直接调用方法。 x.f 是一个方法对象,它可以存储起来以后调用。例如:
xf = x.f
while True:
    print xf()
will continue to print "hello world" until the end of time.
会不断的打印 "hello world" 。
What exactly happens when a method is called? You may have noticed
that x.f() was called without an argument above, even though
the function definition for f specified an argument. What
happened to the argument? Surely Python raises an exception when a
function that requires an argument is called without any -- even if
the argument isn't actually used...
调用方法时发生了什么?你可能注意到前面调用 x.f() 时没有传入参数,尽管在 f 的函数定义中指明了一个参数。这个参数怎么了?事实上如果函数调用中缺少参数,Python 会抛出异常--甚至这个参数实际上没什么用……
Actually, you may have guessed the answer: the special thing about
methods is that the object is passed as the first argument of the
function. In our example, the call x.f() is exactly equivalent
to MyClass.f(x). In general, calling a method with a list of
n arguments is equivalent to calling the corresponding function
with an argument list that is created by inserting the method's object
before the first argument.
实际上,你可能已经猜到了答案:方法的特别之处在于实例对象作为函数的第一个参数传给了函数。在我们的例子中,调用 x.f() 相当于 MyClass.f(x)。通常,以 n 个参数的列表去调用一个方法就相当于将方法的对象插入到参数列表的最前面后,以这个列表去调用相应的函数。
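The equivalence between the two call forms can be checked directly. This sketch repeats the MyClass example from earlier in the chapter; via_method and via_function are names chosen only for illustration.

```python
class MyClass:
    "A simple example class"
    i = 12345
    def f(self):
        return 'hello world'

x = MyClass()
via_method = x.f()            # the instance is supplied implicitly
via_function = MyClass.f(x)   # the instance is supplied explicitly
```

Both calls run the same function with the same object as its first argument, so they produce the same result.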
If you still don't understand how methods work, a look at the
implementation can perhaps clarify matters. When an instance
attribute is referenced that isn't a data attribute, its class is
searched. If the name denotes a valid class attribute that is a
function object, a method object is created by packing (pointers to)
the instance object and the function object just found together in an
abstract object: this is the method object. When the method object is
called with an argument list, it is unpacked again, a new argument
list is constructed from the instance object and the original argument
list, and the function object is called with this new argument list.
如果你还是不理解方法的工作原理,了解一下它的实现也许有帮助。引用非数据属性的实例属性时,会搜索它的类。如果这个命名确认为一个有效的函数对象类属性,就会将实例对象和函数对象封装进一个抽象对象:这就是方法对象。以一个参数列表调用方法对象时,它被重新拆封,用实例对象和原始的参数列表构造一个新的参数列表,然后函数对象调用这个新的参数列表。
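The packing and unpacking just described can be imitated in ordinary Python. This is only a sketch of the idea, not the interpreter's real method type; BoundMethodSketch, greet and Thing are hypothetical names.

```python
class BoundMethodSketch:
    """Hypothetical stand-in for a method object; not Python's real type."""
    def __init__(self, instance, function):
        self.instance = instance      # the object the method was looked up on
        self.function = function      # the plain function found in the class

    def __call__(self, *args):
        # Build the new argument list: instance first, then the original args.
        return self.function(self.instance, *args)

def greet(self, name):
    return "hello " + name + " from " + self.label

class Thing:
    pass

t = Thing()
t.label = "t"
m = BoundMethodSketch(t, greet)   # pack instance and function together
result = m("world")               # unpack and call greet(t, "world")
```

Calling the packed object with one argument ends up calling the function with two, which mirrors what happens for x.f().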
9.4 一些说明 Random Remarks
〔有些内容可能需要明确一下……〕
Data attributes override method attributes with the same name; to
avoid accidental name conflicts, which may cause hard-to-find bugs in
large programs, it is wise to use some kind of convention that
minimizes the chance of conflicts. Possible conventions include
capitalizing method names, prefixing data attribute names with a small
unique string (perhaps just an underscore), or using verbs for methods
and nouns for data attributes.
同名的数据属性会覆盖方法属性,为了避免可能的命名冲突--这在大型程序中可能会导致难以发现的 bug --最好以某种命名约定来避免冲突。可选的约定包括方法的首字母大写,数据属性名前缀小写(可能只是一个下划线),或者方法使用动词而数据属性使用名词。
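A short sketch shows how a data attribute can hide a method of the same name. Widget and size are invented for this example.

```python
class Widget:
    def size(self):
        return 10

w = Widget()
before = w.size()                 # the method found in the class
w.size = 3                        # a data attribute with the same name
after = w.size                    # the instance attribute now hides the method
still_in_class = Widget.size(w)   # the class attribute itself is unchanged
```

Once the instance carries its own size attribute, w.size() would fail because an integer is not callable, which is the kind of hard-to-find bug a naming convention helps to avoid.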
Data attributes may be referenced by methods as well as by ordinary
users (``clients'') of an object. In other words, classes are not
usable to implement pure abstract data types. In fact, nothing in
Python makes it possible to enforce data hiding -- it is all based
upon convention. (On the other hand, the Python implementation,
written in C, can completely hide implementation details and control
access to an object if necessary; this can be used by extensions to
Python written in C.)
数据属性可以由方法引用,也可以由普通用户(客户)调用。换句话说,类不能实现纯抽象数据类型。事实上 Python 中没有什么办法可以强制隐藏数据--一切都基于约定的惯例。(另一方面讲,Python 的实现是用 C 写成的,如果有必要,可以用 C 来编写 Python 扩展,完全隐藏实现的细节,控制对象的访问。)
Clients should use data attributes with care -- clients may mess up
invariants maintained by the methods by stamping on their data
attributes. Note that clients may add data attributes of their own to
an instance object without affecting the validity of the methods, as
long as name conflicts are avoided -- again, a naming convention can
save a lot of headaches here.
客户应该小心使用数据属性--客户可能会因为随意修改数据属性而破坏了本来由方法维护的数据一致性。需要注意的是,客户只要注意避免命名冲突,就可以随意向实例中添加数据属性而不会影响方法的有效性--再次强调,命名约定可以省去很多麻烦。
There is no shorthand for referencing data attributes (or other
methods!) from within methods. I find that this actually increases
the readability of methods: there is no chance of confusing local
variables and instance variables when glancing through a method.
从方法内部引用数据属性(以及其它方法!)没有什么快捷的方式。我认为这事实上增加了方法的可读性:即使粗略的浏览一个方法,也不会有混淆局部变量和实例变量的机会。
Conventionally, the first argument of methods is often called self.
This is nothing more than a convention: the name self
has absolutely no special meaning to Python. (Note,
however, that by not following the convention your code may be less
readable by other Python programmers, and it is also conceivable that
a class browser program be written which relies upon such a
convention.)
习惯上,方法的第一个参数命名为 self。这仅仅是一个约定:对 Python 而言,self 绝对没有任何特殊含义。(然而要注意的是,如果不遵守这个约定,别的 Python 程序员阅读你的代码时会有不便,而且有些类浏览程序也是遵循此约定开发的。)
Any function object that is a class attribute defines a method for
instances of that class. It is not necessary that the function
definition is textually enclosed in the class definition: assigning a
function object to a local variable in the class is also ok. For
example:
类属性中的任何函数对象在类实例中都定义为方法。不是必须要将函数定义代码写进类定义中,也可以将一个函数对象赋给类中的一个变量。例如:
# Function defined outside the class
def f1(self, x, y):
    return min(x, x+y)

class C:
    f = f1
    def g(self):
        return 'hello world'
    h = g
Now f, g and h are all attributes of class
C that refer to function objects, and consequently they are all
methods of instances of C -- h being exactly equivalent
to g. Note that this practice usually only serves to confuse
the reader of a program.
现在 f,g 和 h 都是类 C 的属性,引用的都是函数对象,因此它们都是 C 实例的方法-- h 严格等于 g。要注意的是这种习惯通常只会迷惑程序的读者。
Methods may call other methods by using method attributes of the self
argument:
通过 self 参数的方法属性,方法可以调用其它的方法:
class Bag:
    def __init__(self):
        self.data = []
    def add(self, x):
        self.data.append(x)
    def addtwice(self, x):
        self.add(x)
        self.add(x)
Methods may reference global names in the same way as ordinary
functions. The global scope associated with a method is the module
containing the class definition. (The class itself is never used as a
global scope!) While one rarely encounters a good reason for using
global data in a method, there are many legitimate uses of the global
scope: for one thing, functions and modules imported into the global
scope can be used by methods, as well as functions and classes defined
in it. Usually, the class containing the method is itself defined in
this global scope, and in the next section we'll find some good
reasons why a method would want to reference its own class!
方法可以像引用普通的函数那样引用全局命名。与方法关联的全局作用域是包含类定义的模块。(类本身永远不会做为全局作用域使用!)尽管很少有好的理由在方法中使用全局数据,全局作用域确有很多合法的用途:其一是方法可以调用导入全局作用域的函数和方法,也可以调用定义在其中的类和函数。通常,包含此方法的类也会定义在这个全局作用域,在下一节我们会了解为何一个方法要引用自己的类!
9.5 继承 Inheritance
Of course, a language feature would not be worthy of the name ``class''
without supporting inheritance. The syntax for a derived class
definition looks as follows:
当然,如果一种语言不支持继承,“类”就没有什么意义。派生类的定义如下所示:
class DerivedClassName(BaseClassName):
    <statement-1>
    .
    .
    .
    <statement-N>
The name BaseClassName must be defined in a scope containing
the derived class definition. Instead of a base class name, an
expression is also allowed. This is useful when the base class is
defined in another module:
命名 BaseClassName(示例中的基类名)必须与派生类定义在一个作用域内。除了类,还可以用表达式,基类定义在另一个模块中时这一点非常有用:
class DerivedClassName(modname.BaseClassName):
Execution of a derived class definition proceeds the same as for a
base class. When the class object is constructed, the base class is
remembered. This is used for resolving attribute references: if a
requested attribute is not found in the class, it is searched in the
base class. This rule is applied recursively if the base class itself
is derived from some other class.
派生类定义的执行过程和基类是一样的。构造派生类对象时,就记住了基类。这在解析属性引用的时候尤其有用:如果在类中找不到请求调用的属性,就搜索基类。如果基类是由别的类派生而来,这个规则会递归的应用上去。
There's nothing special about instantiation of derived classes:
DerivedClassName() creates a new instance of the class. Method
references are resolved as follows: the corresponding class attribute
is searched, descending down the chain of base classes if necessary,
and the method reference is valid if this yields a function object.
派生类的实例化没有什么特殊之处:DerivedClassName() (示例中的派生类)创建一个新的类实例。方法引用按如下规则解析:搜索对应的类属性,必要时沿基类链逐级搜索,如果找到了函数对象这个方法引用就是合法的。
Derived classes may override methods of their base classes. Because
methods have no special privileges when calling other methods of the
same object, a method of a base class that calls another method
defined in the same base class, may in fact end up calling a method of
a derived class that overrides it. (For C++ programmers: all methods
in Python are effectively virtual.)
派生类可能会覆盖其基类的方法。因为方法调用同一个对象中的其它方法时没有特权,基类的方法调用同一个基类的方法时,可能实际上最终调用了派生类中的覆盖方法。(对于 C++ 程序员来说,Python中的所有方法本质上都是虚方法。)
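The "effectively virtual" behaviour can be seen when a base class method calls another method that a derived class overrides. Base, Derived, describe and name are names made up for this sketch.

```python
class Base:
    def describe(self):
        # self.name() is looked up on the actual instance at call time.
        return "I am " + self.name()

    def name(self):
        return "Base"

class Derived(Base):
    def name(self):          # overrides Base.name
        return "Derived"

base_text = Base().describe()
derived_text = Derived().describe()
```

describe() is defined only in Base, yet for a Derived instance it ends up calling the overriding name() method.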
An overriding method in a derived class may in fact want to extend
rather than simply replace the base class method of the same name.
There is a simple way to call the base class method directly: just
call "BaseClassName.methodname(self, arguments)". This is
occasionally useful to clients as well. (Note that this only works if
the base class is defined or imported directly in the global scope.)
派生类中的覆盖方法可能是想要扩充而不是简单的替代基类中的重名方法。有一个简单的方法可以直接调用基类方法,只要调用:"BaseClassName.methodname(self, arguments)"。有时这对于客户也很有用。(要注意的是只有基类在同一全局作用域定义或导入时才能这样用。)
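Extending a base class method rather than replacing it might look like the following sketch. Logger and PrefixLogger are hypothetical classes used only to show the call form.

```python
class Logger:
    def __init__(self):
        self.lines = []

    def log(self, message):
        self.lines.append(message)

class PrefixLogger(Logger):
    def log(self, message):
        # Extend, rather than replace, the base class behaviour by
        # calling the base class method directly with self.
        Logger.log(self, "[prefix] " + message)

p = PrefixLogger()
p.log("started")
```

The overriding method adds its own step and then hands the work to the base class method of the same name.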
9.5.1 多继承 Multiple Inheritance
Python supports a limited form of multiple inheritance as well. A
class definition with multiple base classes looks as follows:
Python同样有限的支持多继承形式。多继承的类定义形如下例:
class DerivedClassName(Base1, Base2, Base3):
    <statement-1>
    .
    .
    .
    <statement-N>
The only rule necessary to explain the semantics is the resolution
rule used for class attribute references. This is depth-first,
left-to-right. Thus, if an attribute is not found in
DerivedClassName, it is searched in Base1, then
(recursively) in the base classes of Base1, and only if it is
not found there, it is searched in Base2, and so on.
这里唯一需要解释的语义是解析类属性的规则。顺序是深度优先,从左到右。因此,如果在 DerivedClassName (示例中的派生类)中没有找到某个属性,就会搜索 Base1 ,然后(递归的)搜索其基类,如果最终没有找到,就搜索 Base2,以此类推。
(To some people breadth first -- searching Base2 and
Base3 before the base classes of Base1 -- looks more
natural. However, this would require you to know whether a particular
attribute of Base1 is actually defined in Base1 or in
one of its base classes before you can figure out the consequences of
a name conflict with an attribute of Base2. The depth-first
rule makes no difference between direct and inherited attributes of
Base1.)
(有些人认为广度优先--在搜索 Base1 的基类之前搜索 Base2 和 Base3 --看起来更为自然。然而,如果Base1和Base2之间发生了命名冲突,你需要了解这个属性是定义于Base1还是Base1的基类中。而深度优先不区分属性继承自基类还是直接定义。)
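The depth-first, left-to-right rule can be illustrated with a hierarchy in which the base classes share no common ancestor. All class and method names below are hypothetical. The sketch deliberately avoids a shared base class, because later Python versions are known to order the search differently in that situation.

```python
class Root1:
    def who(self):
        return "Root1"

class Base1(Root1):
    pass

class Base2:
    def who(self):
        return "Base2"

    def only_in_base2(self):
        return "Base2 only"

class Derived(Base1, Base2):
    pass

d = Derived()
found = d.who()                # Base1, then its base Root1, before Base2
fallback = d.only_in_base2()   # not found via Base1, so Base2 is searched
```

The attribute inherited through Base1 wins over the one defined directly in Base2, which is exactly the consequence the parenthetical remark above discusses.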
It is clear that indiscriminate use of multiple inheritance is a
maintenance nightmare, given the reliance in Python on conventions to
avoid accidental name conflicts. A well-known problem with multiple
inheritance is a class derived from two classes that happen to have a
common base class. While it is easy enough to figure out what happens
in this case (the instance will have a single copy of ``instance
variables'' or data attributes used by the common base class), it is
not clear that these semantics are in any way useful.
显然不加限制的使用多继承会带来维护上的噩梦,因为 Python 中只依靠约定来避免命名冲突。多继承一个很有名的问题是派生继承的两个基类都是从同一个基类继承而来。目前还不清楚这在语义上有什么意义,然而很容易想到这会造成什么后果(实例会有一个独立的“实例变量”或数据属性复本作用于公共基类。)
9.6 私有变量 Private Variables
There is limited support for class-private
identifiers. Any identifier of the form __spam (at least two
leading underscores, at most one trailing underscore) is now textually
replaced with _classname__spam, where classname is the
current class name with leading underscore(s) stripped. This mangling
is done without regard of the syntactic position of the identifier, so
it can be used to define class-private instance and class variables,
methods, as well as globals, and even to store instance variables
private to this class on instances of other classes. Truncation
may occur when the mangled name would be longer than 255 characters.
Outside classes, or when the class name consists of only underscores,
no mangling occurs.
Python 对类的私有成员提供了有限的支持。任何形如 __spam(以至少双下划线开头,至多单下划线结尾)的标识符随即都被替代为 _classname__spam,其中 classname 是去掉前导下划线的当前类名。这种混淆不关心标识符的语法位置,所以可用来定义私有类实例和类变量、方法,以及全局变量,甚至于将其它类的实例保存为私有变量。混淆名长度超过255个字符的时候可能会发生截断。在类的外部,或类名只包含下划线时,不会发生混淆。
Name mangling is intended to give classes an easy way to define
``private'' instance variables and methods, without having to worry
about instance variables defined by derived classes, or mucking with
instance variables by code outside the class. Note that the mangling
rules are designed mostly to avoid accidents; it still is possible for
a determined soul to access or modify a variable that is considered
private. This can even be useful in special circumstances, such as in
the debugger, and that's one reason why this loophole is not closed.
(Buglet: derivation of a class with the same name as the base class
makes use of private variables of the base class possible.)
命名混淆意在给出一个在类中定义“私有”实例变量和方法的简单途径,避免派生类的实例变量定义产生问题,或者与外界代码中的变量搞混。要注意的是混淆规则主要目的在于避免意外错误,被认作为私有的变量仍然有可能被访问或修改。在特定的场合它也是有用的,比如调试的时候,这也是一直没有堵上这个漏洞的原因之一(小漏洞:派生类和基类取相同的名字就可以使用基类的私有变量。)
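A brief sketch shows both sides of name mangling: a derived class cannot clash with the base class's private name by accident, yet the mangled name remains reachable. Account and Savings are invented classes for this illustration.

```python
class Account:
    def __init__(self):
        self.__balance = 0            # stored as _Account__balance

    def deposit(self, amount):
        self.__balance = self.__balance + amount
        return self.__balance

class Savings(Account):
    def __init__(self):
        Account.__init__(self)
        self.__balance = "unrelated"  # stored as _Savings__balance

s = Savings()
total = s.deposit(50)
mangled = sorted(name for name in vars(s) if name.endswith("__balance"))
peek = s._Account__balance            # still reachable from outside the class
```

The instance ends up with two separate attributes, one per class, and the last line is the loophole the text mentions for debuggers and determined users.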
Notice that code passed to exec, eval() or execfile()
does not consider the classname of the invoking
class to be the current class; this is similar to the effect of the
global statement, the effect of which is likewise restricted to
code that is byte-compiled together. The same restriction applies to
getattr(), setattr() and delattr(), as well as
when referencing __dict__ directly.
要注意的是传入 exec,eval() 或 execfile() 的代码不会将调用它们的类视作当前类,这与 global 语句的情况类似,global 的作用局限于“同一批”进行字节编译的代码。同样的限制也适用于 getattr(),setattr() 和 delattr(),以及直接引用 __dict__ 的时候。
9.7 补充 Odds and Ends
Sometimes it is useful to have a data type similar to the Pascal
``record'' or C ``struct'', bundling together a couple of named data
items. An empty class definition will do nicely:
有时类似于Pascal中“记录(record)”或C中“结构(struct)”的数据类型很有用,它将一组已命名的数据项绑定在一起。一个空的类定义可以很好的实现它:
class Employee:
    pass

john = Employee() # Create an empty employee record

# Fill the fields of the record
john.name = 'John Doe'
john.dept = 'computer lab'
john.salary = 1000
A piece of Python code that expects a particular abstract data type
can often be passed a class that emulates the methods of that data
type instead. For instance, if you have a function that formats some
data from a file object, you can define a class with methods
read() and readline() that gets the data from a string
buffer instead, and pass it as an argument.
如果某一段 Python 代码需要一个特定的抽象数据类型,通常可以传入一个模拟了该数据类型方法的类来代替。例如,如果你有一个用于从文件对象中格式化数据的函数,你可以定义一个带有 read() 和 readline() 方法的类,以此从字符串缓冲读取数据,然后将该类的对象作为参数传入前述的函数。
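The idea above might look like the following minimal sketch. StringReader and first_line_upper are hypothetical names invented for illustration; the function only assumes that its argument has a readline() method.

```python
class StringReader(object):
    """Hypothetical stand-in for a file object, reading from a string."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def read(self):
        # Return everything that has not been consumed yet.
        rest = self.text[self.pos:]
        self.pos = len(self.text)
        return rest

    def readline(self):
        # Return up to and including the next newline, like a file does.
        end = self.text.find('\n', self.pos)
        end = len(self.text) if end == -1 else end + 1
        line = self.text[self.pos:end]
        self.pos = end
        return line


def first_line_upper(source):
    """Hypothetical formatter: works with anything that has readline()."""
    return source.readline().strip().upper()


reader = StringReader('name: John Doe\ndept: computer lab\n')
header = first_line_upper(reader)
remainder = reader.read()
print(header)
print(remainder)
```

The same first_line_upper() call would accept a real file object, since it never checks the type of its argument.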
Instance method objects have attributes, too: m.im_self is the
instance object to which the method m is bound, and m.im_func is the
function object corresponding to the method.
实例方法对象也有属性: m.im_self 是一个实例方法所属的对象,而 m.im_func 是这个方法对应的函数对象。
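A brief sketch of these two attributes, with a hypothetical Greeter class. It uses the spellings __self__ and __func__, which are the names for im_self and im_func in Python 2.6 and later and the only names available in Python 3.

```python
class Greeter(object):
    def __init__(self, name):
        self.name = name

    def greet(self):
        return 'hello from ' + self.name


g = Greeter('john')
m = g.greet                 # a bound method object

bound_to = m.__self__       # same object as m.im_self in Python 2
func = m.__func__           # same object as m.im_func in Python 2

# Calling the underlying function with the instance is what m() does.
result = func(bound_to)
print(result)
```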
9.8 异常也是类 Exceptions Are Classes Too
User-defined exceptions are identified by classes as well. Using this
mechanism it is possible to create extensible hierarchies of exceptions.
用户自定义异常也可以是类。利用这个机制可以创建可扩展的异常体系。
There are two new valid (semantic) forms for the raise statement:
以下是两种新的有效(语义上的)异常抛出形式:
    raise Class, instance

    raise instance
In the first form, instance must be an instance of Class or of a
class derived from it. The second form is a shorthand for:
第一种形式中,instance 必须是 Class 或其派生类的一个实例。第二种形式是以下形式的简写:
raise instance.__class__, instance
A class in an except clause is compatible with an exception if it is the same
class or a base class thereof (but not the other way around -- an
except clause listing a derived class is not compatible with a base
class). For example, the following code will print B, C, D in that
order:
发生的异常其类型如果是异常子句中列出的类,或者是其派生类,那么它们就是相符的(反过来说--发生的异常其类型如果是异常子句中列出的类的基类,它们就不相符)。例如,以下代码会按顺序打印B,C,D:
    class B:
        pass
    class C(B):
        pass
    class D(C):
        pass

    for c in [B, C, D]:
        try:
            raise c()
        except D:
            print "D"
        except C:
            print "C"
        except B:
            print "B"
Note that if the except clauses were reversed (with
"except B" first), it would have printed B, B, B -- the first
matching except clause is triggered.
要注意的是如果异常子句的顺序颠倒过来( "except B" 在最前),它就会打印B,B,B--第一个匹配的异常子句被触发。
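The effect of clause order can be checked with the sketch below. As an assumption for portability, the classes derive from Exception (required in Python 3, whereas the plain classes above can only be raised in Python 2), and the helper names derived_first and base_first are hypothetical.

```python
class B(Exception):
    pass

class C(B):
    pass

class D(C):
    pass


def derived_first(cls):
    # Most specific clause first: each class reaches its own handler.
    try:
        raise cls()
    except D:
        return 'D'
    except C:
        return 'C'
    except B:
        return 'B'


def base_first(cls):
    # Base class first: it matches B, C and D, so later clauses never run.
    try:
        raise cls()
    except B:
        return 'B'
    except C:
        return 'C'
    except D:
        return 'D'


print([derived_first(c) for c in (B, C, D)])
print([base_first(c) for c in (B, C, D)])
```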
When an error message is printed for an unhandled exception which is a
class, the class name is printed, then a colon and a space, and
finally the instance converted to a string using the built-in function
str().
打印一个未处理的类异常的错误信息时,先打印类名,然后是一个冒号和一个空格,最后是用内置函数 str() 将实例转换得到的字符串。
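The format described above can be imitated by hand, as in this rough sketch. ParseError and final_traceback_line are hypothetical names, and the real interpreter may additionally prefix the class name with its module for classes defined outside the main script.

```python
class ParseError(Exception):
    pass


def final_traceback_line(exc):
    # Class name, then a colon and a space, then str() of the instance.
    return '%s: %s' % (exc.__class__.__name__, str(exc))


try:
    raise ParseError('unexpected token')
except ParseError as err:
    line = final_traceback_line(err)

print(line)
```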
9.9 迭代器 Iterators
By now, you've probably noticed that most container objects can be looped
over using a for statement:
现在你可能注意到大多数容器对象都可以用 for 遍历:
    for element in [1, 2, 3]:
        print element
    for element in (1, 2, 3):
        print element
    for key in {'one':1, 'two':2}:
        print key
    for char in "123":
        print char
    for line in open("myfile.txt"):
        print line
This style of access is clear, concise, and convenient. The use of iterators
pervades and unifies Python. Behind the scenes, the for
statement calls iter() on the container object. The
function returns an iterator object that defines the method
next() which accesses elements in the container one at a
time. When there are no more elements, next() raises a
StopIteration exception which tells the for loop
to terminate. This example shows how it all works:
这种形式的访问清晰、简洁、方便。这种迭代器的用法在 Python 中普遍而且统一。在后台,for 语句在容器对象中调用 iter() 。 该函数返回一个定义了 next() 方法的迭代器对象,它在容器中逐一访问元素。没有后续的元素时,next()抛出一个 StopIteration 异常通知 for 语句循环结束。以下是其工作原理的示例:
    >>> s = 'abc'
    >>> it = iter(s)
    >>> it
    <iterator object at 0x00A1DB50>
    >>> it.next()
    'a'
    >>> it.next()
    'b'
    >>> it.next()
    'c'
    >>> it.next()

    Traceback (most recent call last):
      File "<pyshell#6>", line 1, in -toplevel-
        it.next()
    StopIteration
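What the for statement does behind the scenes can be approximated by hand. In this sketch the function name loop_by_hand is hypothetical, and the built-in next(it) is used in place of it.next() so that the code runs on Python 2.6 and later as well as Python 3.

```python
def loop_by_hand(container):
    """Collect the elements the way a for loop would visit them."""
    seen = []
    it = iter(container)           # step 1: ask the container for an iterator
    while True:
        try:
            element = next(it)     # step 2: fetch one element at a time
        except StopIteration:      # step 3: the iterator signals exhaustion
            break
        seen.append(element)
    return seen


print(loop_by_hand('abc'))
print(loop_by_hand((1, 2, 3)))
```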
Having seen the mechanics behind the iterator protocol, it is easy to add
iterator behavior to your classes. Define an __iter__() method
which returns an object with a next() method. If the class defines
next(), then __iter__() can just return self:
了解了迭代器协议的后台机制,就可以很容易的给自己的类添加迭代器行为。定义一个 __iter__() 方法,使其返回一个带有 next() 方法的对象。如果这个类已经定义了 next(),那么 __iter__() 只需要返回self:
    >>> class Reverse:
        "Iterator for looping over a sequence backwards"
        def __init__(self, data):
            self.data = data
            self.index = len(data)
        def __iter__(self):
            return self
        def next(self):
            if self.index == 0:
                raise StopIteration
            self.index = self.index - 1
            return self.data[self.index]

    >>> for char in Reverse('spam'):
        print char

    m
    a
    p
    s
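The other case mentioned above, where __iter__() returns a separate object instead of self, might be sketched like this. Countdown and CountdownIterator are hypothetical names, and the method is defined as __next__ with a next alias so that the sketch runs under both Python 3 and Python 2.

```python
class Countdown(object):
    """Container: each call to __iter__() hands out a fresh iterator."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


class CountdownIterator(object):
    """Iterator: keeps the position, so the container itself stays reusable."""

    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self

    def __next__(self):
        if self.current == 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

    next = __next__                # Python 2 spelling of the same method


c = Countdown(3)
first_pass = list(c)
second_pass = list(c)              # works again: a new iterator each time
print(first_pass)
print(second_pass)
```

Keeping the position in a separate iterator object means the container can be looped over more than once, whereas an object that returns self from __iter__() is exhausted after a single pass.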
9.10 生成器 Generators
Generators are a simple and powerful tool for creating iterators. They are
written like regular functions but use the yield statement whenever
they want to return data. Each time next() is called, the
generator resumes where it left off (it remembers all the data values and
which statement was last executed). An example shows that generators can
be trivially easy to create:
生成器是创建迭代器的简单而强大的工具。它们写起来就像普通函数,只是在需要返回数据的时候使用 yield 语句。每次 next() 被调用时,生成器会从上次离开的位置恢复执行(它会记住所有的数据值以及最后执行的语句)。以下示例演示了生成器可以非常简单地创建出来:
    >>> def reverse(data):
        for index in range(len(data)-1, -1, -1):
            yield data[index]

    >>> for char in reverse('golf'):
        print char

    f
    l
    o
    g
Anything that can be done with generators can also be done with class based
iterators as described in the previous section. What makes generators so
compact is that the __iter__() and next() methods are
created automatically.
前一节中描述了基于类的迭代器,它能作的每一件事生成器也能作到。因为自动创建了 __iter__() 和 next() 方法,生成器显得如此简洁。
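That the iterator methods come for free can be checked directly. This sketch reuses the reverse() generator from the example above and calls the built-in next() rather than the Python 2 method gen.next(), so it should run on either version.

```python
def reverse(data):
    for index in range(len(data) - 1, -1, -1):
        yield data[index]


gen = reverse('golf')

# A generator object is its own iterator: __iter__() returns the object itself.
is_own_iterator = iter(gen) is gen

first = next(gen)      # advance by hand once
rest = list(gen)       # let list() consume whatever is left

print(is_own_iterator)
print(first)
print(rest)
```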
Another key feature is that the local variables and execution state
are automatically saved between calls. This makes the function easier to
write and much clearer than an approach using instance variables like
self.index and self.data.
另外一个关键的功能是两次调用之间的局部变量和执行状态都自动保存了下来。这样函数编写起来就比使用 self.index 和 self.data 这样的实例变量的方法容易得多,也清晰得多。
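The saved state is easiest to see with a local variable that accumulates across calls. In this sketch, running_total is a hypothetical generator whose local variable total survives between successive next() calls without being stored on any object.

```python
def running_total(numbers):
    total = 0                  # an ordinary local variable
    for n in numbers:
        total += n
        yield total            # execution pauses here; total is preserved


totals = running_total([1, 2, 3, 4])
a = next(totals)               # runs until the first yield
b = next(totals)               # resumes after the yield with total intact
remaining = list(totals)

print(a)
print(b)
print(remaining)
```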
In addition to automatic method creation and saving program state, when
generators terminate, they automatically raise StopIteration.
In combination, these features make it easy to create iterators with no
more effort than writing a regular function.
除了自动创建方法和保存程序状态之外,当生成器终结时,还会自动抛出 StopIteration 异常。综上所述,这些功能使得创建迭代器并不比编写一个普通函数更费事。
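The automatic StopIteration can be observed by calling next() once more than the generator has values. The generator two_items below is a hypothetical example that contains no raise statement of its own.

```python
def two_items():
    yield 'first'
    yield 'second'
    # Falling off the end of the function is what ends the iteration.


g = two_items()
collected = [next(g), next(g)]

try:
    next(g)                    # nothing left to yield
    finished = False
except StopIteration:
    finished = True

print(collected)
print(finished)
```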
Footnotes
- ... namespace!9.1
Except for one thing. Module objects have a secret read-only
attribute called __dict__ which returns the dictionary
used to implement the module's namespace; the name
__dict__ is an attribute but not a global name.
Obviously, using this violates the abstraction of namespace
implementation, and should be restricted to things like
post-mortem debuggers.