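5. Expressions

This chapter explains the meaning of the elements of expressions in Python.

Syntax Notes: In this and the following chapters, extended BNF notation will be used to describe syntax, not lexical analysis. When (one alternative of) a syntax rule has the form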

name ::=  othername

and no semantics are given, the semantics of this form of name are the same as for othername.

5.1. Arithmetic conversions

When a description of an arithmetic operator below uses the phrase “the numeric arguments are converted to a common type,” the arguments are coerced using the coercion rules listed at Coercion rules. If both arguments are standard numeric types, the following coercions are applied:

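  • If either argument is a complex number, the other is converted to complex;
  • otherwise, if either argument is a floating point number, the other is converted to floating point;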
  • otherwise, if either argument is a long integer, the other is converted to long integer;
  • otherwise, both must be plain integers and no conversion is necessary.

Some additional rules apply for certain operators (e.g., a string left argument to the ‘%’ operator). Extensions can define their own coercions.

5.2. Atoms

Atoms are the most basic elements of expressions. The simplest atoms are identifiers or literals. Forms enclosed in reverse quotes or in parentheses, brackets or braces are also categorized syntactically as atoms. The syntax for atoms is:

atom      ::=  identifier | literal | enclosure
enclosure ::=  parenth_form | list_display
               | generator_expression | dict_display | set_display
               | string_conversion | yield_atom

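5.2.1. Identifiers (Names)

An identifier occurring as an atom is a name. See section Identifiers and keywords for the lexical definition and section Naming and binding for documentation of naming and binding.

When the name is bound to an object, evaluation of the atom yields that object. When a name is not bound, an attempt to evaluate it raises a NameError exception.

Private name mangling: When an identifier that textually occurs in a class definition begins with two or more underscore characters and does not end in two or more underscores, it is considered a private name of that class. Private names are transformed to a longer form before code is generated for them. The transformation inserts the class name, with leading underscores removed and a single underscore inserted, in front of the name. For example, the identifier __spam occurring in a class named Ham will be transformed to _Ham__spam. This transformation is independent of the syntactical context in which the identifier is used. If the transformed name is extremely long (longer than 255 characters), implementation-defined truncation may happen. If the class name consists only of underscores, no transformation is done.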
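The mangling rule can be observed directly. The following is a minimal sketch, reusing the Ham and __spam names from the paragraph above; the helper names (spam, mangled_names and so on) are illustrative only, and the code is written to run on both Python 2.7 and Python 3.

```python
class Ham(object):
    def __init__(self):
        # Written as __spam, but stored as _Ham__spam because the
        # identifier appears textually inside the class definition.
        self.__spam = 42

    def spam(self):
        # The same transformation applies here, so the lookup succeeds.
        return self.__spam


ham = Ham()
mangled_names = sorted(vars(ham))       # attribute names as actually stored
value_via_method = ham.spam()
value_via_mangled = ham._Ham__spam      # the longer form works from outside
# A string passed to hasattr() is not an identifier in the class body,
# so it is not transformed and the short name is not found.
has_unmangled = hasattr(ham, "__spam")
```

Note that the transformation is purely textual: it gives no access protection, it only makes accidental name clashes between a class and its subclasses less likely.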

5.2.2. Literals

Python supports string literals and various numeric literals:

literal ::=  stringliteral | integer | longinteger
             | floatnumber | imagnumber

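Evaluation of a literal yields an object of the given type (string, integer, long integer, floating point number, complex number) with the given value. The value may be approximated in the case of floating point and imaginary (complex) literals. See section Literals for details.

All literals correspond to immutable data types, and hence the object's identity is less important than its value. Multiple evaluations of literals with the same value (either the same occurrence in the program text or a different occurrence) may obtain the same object or a different object with the same value.

5.2.3. Parenthesized forms

A parenthesized form is an optional expression list enclosed in parentheses: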

parenth_form ::=  "(" [expression_list] ")"

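A parenthesized expression list yields whatever that expression list yields: if the list contains at least one comma, it yields a tuple; otherwise, it yields the single expression that makes up the expression list.

An empty pair of parentheses yields an empty tuple object. Since tuples are immutable, the rules for literals apply (i.e., two occurrences of the empty tuple may or may not yield the same object).

Note that tuples are not formed by the parentheses, but rather by use of the comma operator. The exception is the empty tuple, for which parentheses are required: allowing unparenthesized "nothing" in expressions would cause ambiguities and allow common typos to pass uncaught.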
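As a small illustration of the point that the comma, not the parentheses, creates a tuple, consider the sketch below. The variable names are hypothetical.

```python
single = (1)         # parentheses only group: this is the integer 1
one_tuple = (1,)     # the trailing comma makes it a tuple
bare_tuple = 1, 2    # no parentheses are needed for a tuple
empty = ()           # the one case where parentheses are required

kinds = [type(v).__name__ for v in (single, one_tuple, bare_tuple, empty)]
```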

5.2.4. List displays

A list display is a possibly empty series of expressions enclosed in square brackets:

list_display        ::=  "[" [expression_list | list_comprehension] "]"
list_comprehension  ::=  expression list_for
list_for            ::=  "for" target_list "in" old_expression_list [list_iter]
old_expression_list ::=  old_expression [("," old_expression)+ [","]]
old_expression      ::=  or_test | old_lambda_expr
list_iter           ::=  list_for | list_if
list_if             ::=  "if" old_expression [list_iter]

A list display yields a new list object. Its contents are specified by providing either a list of expressions or a list comprehension. When a comma-separated list of expressions is supplied, its elements are evaluated from left to right and placed into the list object in that order. When a list comprehension is supplied, it consists of a single expression followed by at least one for clause and zero or more for or if clauses. In this case, the elements of the new list are those that would be produced by considering each of the for or if clauses a block, nesting from left to right, and evaluating the expression to produce a list element each time the innermost block is reached [1].

5.2.5. Displays for sets and dictionaries

For constructing a set or a dictionary Python provides special syntax called “displays”, each of them in two flavors:

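  • either the container contents are listed explicitly, or
  • they are computed via a set of looping and filtering instructions, called a comprehension.

Common syntax elements for comprehensions are: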

comprehension ::=  expression comp_for
comp_for      ::=  "for" target_list "in" or_test [comp_iter]
comp_iter     ::=  comp_for | comp_if
comp_if       ::=  "if" expression_nocond [comp_iter]

The comprehension consists of a single expression followed by at least one for clause and zero or more for or if clauses. In this case, the elements of the new container are those that would be produced by considering each of the for or if clauses a block, nesting from left to right, and evaluating the expression to produce an element each time the innermost block is reached.

Note that the comprehension is executed in a separate scope, so names assigned to in the target list don’t “leak” in the enclosing scope.

5.2.6. Generator expressions

A generator expression is a compact generator notation in parentheses:

generator_expression ::=  "(" expression comp_for ")"

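A generator expression yields a new generator object. Its syntax is the same as for comprehensions, except that it is enclosed in parentheses instead of brackets or curly braces.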

Variables used in the generator expression are evaluated lazily when the next() method is called for the generator object (in the same fashion as normal generators). However, the leftmost for clause is immediately evaluated, so that an error produced by it can be seen before any other possible error in the code that handles the generator expression. Subsequent for clauses cannot be evaluated immediately since they may depend on the previous for loop. For example: (x*y for x in range(10) for y in bar(x)).
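The difference between the leftmost for clause and the later ones can be made visible with an iterable expression that fails when evaluated. This is a sketch only; failing, make_outer and make_inner are hypothetical names, and the code is written to run on both Python 2.7 and Python 3.

```python
def failing():
    raise ValueError("iterable expression was evaluated")


def make_outer():
    # failing() is the leftmost iterable, so it is evaluated as soon as
    # the generator expression itself is evaluated.
    return (x for x in failing())


def make_inner():
    # Here failing() belongs to the second for clause, so it is only
    # evaluated once iteration actually starts.
    return (x * y for x in range(3) for y in failing())


try:
    make_outer()
    outer_error_at_creation = False
except ValueError:
    outer_error_at_creation = True

gen = make_inner()                 # no error yet
try:
    next(gen)
    inner_error_on_first_next = False
except ValueError:
    inner_error_on_first_next = True
```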

The parentheses can be omitted on calls with only one argument. See section Calls for the details.

5.2.7. Dictionary displays

A dictionary display is a possibly empty series of key/datum pairs enclosed in curly braces:

dict_display       ::=  "{" [key_datum_list | dict_comprehension] "}"
key_datum_list     ::=  key_datum ("," key_datum)* [","]
key_datum          ::=  expression ":" expression
dict_comprehension ::=  expression ":" expression comp_for

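A dictionary display yields a new dictionary object.

If a comma-separated sequence of key/datum pairs is given, they are evaluated from left to right to define the entries of the dictionary: each key object is used as a key into the dictionary to store the corresponding datum. This means that you can specify the same key multiple times in the key/datum list, and the final dictionary's value for that key will be the last one given.

A dict comprehension, in contrast to list and set comprehensions, needs two expressions separated with a colon followed by the usual "for" and "if" clauses. When the comprehension is run, the resulting key and value elements are inserted in the new dictionary in the order they are produced.

Restrictions on the types of the key values are listed earlier in section The standard type hierarchy. (To summarize, the key type should be hashable, which excludes all mutable objects.) Clashes between duplicate keys are not detected; the last datum (textually rightmost in the display) stored for a given key value prevails.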
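A brief sketch of the rules above: duplicate keys are silently resolved in favour of the rightmost datum, a dict comprehension takes a key: value pair, and an unhashable key is rejected. The variable names are illustrative.

```python
# The key "a" appears twice; the rightmost datum wins and no error is raised.
d = {"a": 1, "b": 2, "a": 3}

# A dict comprehension uses two expressions separated by a colon.
squares = {n: n * n for n in range(4)}

# Mutable objects such as lists are not hashable and cannot be keys.
try:
    {[1, 2]: "list as key"}
    unhashable_rejected = False
except TypeError:
    unhashable_rejected = True
```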

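5.2.8. Set displays

A set display is denoted by curly braces and distinguishable from dictionary displays by the lack of colons separating keys and values: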

set_display ::=  "{" (expression_list | comprehension) "}"

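A set display yields a new mutable set object, the contents being specified by either a sequence of expressions or a comprehension. When a comma-separated list of expressions is supplied, its elements are evaluated from left to right and added to the set object. When a comprehension is supplied, the set is constructed from the elements resulting from the comprehension.

An empty set cannot be constructed with {}; this literal constructs an empty dictionary.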
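For illustration, the following sketch contrasts {} with set() and shows both flavors of set display. The variable names are hypothetical.

```python
empty_braces = {}                    # an empty dictionary, not a set
empty_set = set()                    # the way to spell an empty set
letters = {"a", "b", "a"}            # explicit display; duplicates collapse
unique = {c for c in "hello"}        # set comprehension

empty_braces_type = type(empty_braces).__name__
empty_set_type = type(empty_set).__name__
```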

5.2.9. String conversions

A string conversion is an expression list enclosed in reverse (a.k.a. backward) quotes:

string_conversion ::=  "`" expression_list "`"

A string conversion evaluates the contained expression list and converts the resulting object into a string according to rules specific to its type.

If the object is a string, a number, None, or a tuple, list or dictionary containing only objects whose type is one of these, the resulting string is a valid Python expression which can be passed to the built-in function eval() to yield an expression with the same value (or an approximation, if floating point numbers are involved).

(In particular, converting a string adds quotes around it and converts “funny” characters to escape sequences that are safe to print.)

Recursive objects (for example, lists or dictionaries that contain a reference to themselves, directly or indirectly) use ... to indicate a recursive reference, and the result cannot be passed to eval() to get an equal value (SyntaxError will be raised instead).

The built-in function repr() performs exactly the same conversion in its argument as enclosing it in parentheses and reverse quotes does. The built-in function str() performs a similar but more user-friendly conversion.

5.2.10. Yield expressions

yield_atom       ::=  "(" yield_expression ")"
yield_expression ::=  "yield" [expression_list]

New in version 2.5.

The yield expression is only used when defining a generator function, and can only be used in the body of a function definition. Using a yield expression in a function definition is sufficient to cause that definition to create a generator function instead of a normal function.

When a generator function is called, it returns an iterator known as a generator. That generator then controls the execution of a generator function. The execution starts when one of the generator’s methods is called. At that time, the execution proceeds to the first yield expression, where it is suspended again, returning the value of expression_list to generator’s caller. By suspended we mean that all local state is retained, including the current bindings of local variables, the instruction pointer, and the internal evaluation stack. When the execution is resumed by calling one of the generator’s methods, the function can proceed exactly as if the yield expression was just another external call. The value of the yield expression after resuming depends on the method which resumed the execution.

All of this makes generator functions quite similar to coroutines; they yield multiple times, they have more than one entry point and their execution can be suspended. The only difference is that a generator function cannot control where the execution should continue after it yields; the control is always transferred to the generator's caller.

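5.2.10.1. Generator-iterator methods

This subsection describes the methods of a generator iterator. They can be used to control the execution of a generator function.

Note that calling any of the generator methods below when the generator is already executing raises a ValueError exception.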

generator.next()

Starts the execution of a generator function or resumes it at the last executed yield expression. When a generator function is resumed with a next() method, the current yield expression always evaluates to None. The execution then continues to the next yield expression, where the generator is suspended again, and the value of the expression_list is returned to next()’s caller. If the generator exits without yielding another value, a StopIteration exception is raised.

generator.send(value)

Resumes the execution and “sends” a value into the generator function. The value argument becomes the result of the current yield expression. The send() method returns the next value yielded by the generator, or raises StopIteration if the generator exits without yielding another value. When send() is called to start the generator, it must be called with None as the argument, because there is no yield expression that could receive the value.

generator.throw(type[, value[, traceback]])

Raises an exception of type type at the point where generator was paused, and returns the next value yielded by the generator function. If the generator exits without yielding another value, a StopIteration exception is raised. If the generator function does not catch the passed-in exception, or raises a different exception, then that exception propagates to the caller.

generator.close()

Raises a GeneratorExit at the point where the generator function was paused. If the generator function then raises StopIteration (by exiting normally, or due to already being closed) or GeneratorExit (by not catching the exception), close returns to its caller. If the generator yields a value, a RuntimeError is raised. If the generator raises any other exception, it is propagated to the caller. close() does nothing if the generator has already exited due to an exception or normal exit.

Here is a simple example that demonstrates the behavior of generators and generator functions:

>>> def echo(value=None):
...     print "Execution starts when 'next()' is called for the first time."
...     try:
...         while True:
...             try:
...                 value = (yield value)
...             except Exception, e:
...                 value = e
...     finally:
...         print "Don't forget to clean up when 'close()' is called."
...
>>> generator = echo(1)
>>> print generator.next()
Execution starts when 'next()' is called for the first time.
1
>>> print generator.next()
None
>>> print generator.send(2)
2
>>> generator.throw(TypeError, "spam")
TypeError('spam',)
>>> generator.close()
Don't forget to clean up when 'close()' is called.
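As a complementary sketch, the generator below uses send() to feed values in and receive results back, which is the coroutine-like usage described above. running_average is a hypothetical example function; the built-in next() is used instead of the next() method so that the code also runs on Python 3.

```python
def running_average():
    total = 0.0
    count = 0
    average = None
    while True:
        # The value passed to send() becomes the result of this yield.
        value = yield average
        total += value
        count += 1
        average = total / count


gen = running_average()
first = next(gen)               # run to the first yield; it yields None
averages = [gen.send(v) for v in (10, 20, 60)]
gen.close()                     # raises GeneratorExit at the paused yield

try:
    next(gen)
    finished = False
except StopIteration:           # a closed generator is exhausted
    finished = True
```

The generator has to be advanced to its first yield before a value other than None can be sent, because until then no yield expression is waiting to receive it.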

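See also

PEP 342 - Coroutines via Enhanced Generators
The proposal to enhance the API and syntax of generators, making them usable as simple coroutines.

5.3. Primaries

Primaries represent the most tightly bound operations of the language. Their syntax is: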

primary ::=  atom | attributeref | subscription | slicing | call

5.3.1. Attribute references

An attribute reference is a primary followed by a period and a name:

attributeref ::=  primary "." identifier

The primary must evaluate to an object of a type that supports attribute references, e.g., a module, list, or an instance. This object is then asked to produce the attribute whose name is the identifier. If this attribute is not available, the exception AttributeError is raised. Otherwise, the type and value of the object produced is determined by the object. Multiple evaluations of the same attribute reference may yield different objects.

5.3.2. Subscriptions

A subscription selects an item of a sequence (string, tuple or list) or mapping (dictionary) object:

subscription ::=  primary "[" expression_list "]"

The primary must evaluate to an object of a sequence or mapping type.

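If the primary is a mapping, the expression list must evaluate to an object whose value is one of the keys of the mapping, and the subscription selects the value in the mapping that corresponds to that key. (The expression list is a tuple except if it has exactly one item.)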

If the primary is a sequence, the expression list must evaluate to a plain integer. If this value is negative, the length of the sequence is added to it (so that, e.g., x[-1] selects the last item of x.) The resulting value must be a nonnegative integer less than the number of items in the sequence, and the subscription selects the item whose index is that value (counting from zero).

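A string's items are characters. A character is not a separate data type but a string of exactly one character.

5.3.3. Slicings

A slicing selects a range of items in a sequence object (e.g., a string, tuple or list). Slicings may be used as expressions or as targets in assignment or del statements. The syntax for a slicing: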

slicing          ::=  simple_slicing | extended_slicing
simple_slicing   ::=  primary "[" short_slice "]"
extended_slicing ::=  primary "[" slice_list "]"
slice_list       ::=  slice_item ("," slice_item)* [","]
slice_item       ::=  expression | proper_slice | ellipsis
proper_slice     ::=  short_slice | long_slice
short_slice      ::=  [lower_bound] ":" [upper_bound]
long_slice       ::=  short_slice ":" [stride]
lower_bound      ::=  expression
upper_bound      ::=  expression
stride           ::=  expression
ellipsis         ::=  "..."

There is ambiguity in the formal syntax here: anything that looks like an expression list also looks like a slice list, so any subscription can be interpreted as a slicing. Rather than further complicating the syntax, this is disambiguated by defining that in this case the interpretation as a subscription takes priority over the interpretation as a slicing (this is the case if the slice list contains no proper slice nor ellipses). Similarly, when the slice list has exactly one short slice and no trailing comma, the interpretation as a simple slicing takes priority over that as an extended slicing.

The semantics for a simple slicing are as follows. The primary must evaluate to a sequence object. The lower and upper bound expressions, if present, must evaluate to plain integers; the defaults are zero and sys.maxint, respectively. If either bound is negative, the sequence's length is added to it. The slicing now selects all items with index k such that i <= k < j where i and j are the specified lower and upper bounds. This may be an empty sequence. It is not an error if i or j lie outside the range of valid indexes (such items don't exist so they aren't selected).

The semantics for an extended slicing are as follows. The primary must evaluate to a mapping object, and it is indexed with a key that is constructed from the slice list, as follows. If the slice list contains at least one comma, the key is a tuple containing the conversion of the slice items; otherwise, the conversion of the lone slice item is the key. The conversion of a slice item that is an expression is that expression. The conversion of an ellipsis slice item is the built-in Ellipsis object. The conversion of a proper slice is a slice object (see section The standard type hierarchy) whose start, stop and step attributes are the values of the expressions given as lower bound, upper bound and stride, respectively, substituting None for missing expressions.
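The key construction for extended slicings can be inspected with a class whose __getitem__() simply returns the key it receives. ShowKey is a hypothetical helper written for this sketch; it is not part of the language or the standard library.

```python
class ShowKey(object):
    """Return the key built from the slice list, unchanged."""

    def __getitem__(self, key):
        return key


show = ShowKey()
long_slice = show[1:10:2]        # a lone proper slice becomes a slice object
partial = show[::3]              # missing expressions are replaced by None
multi = show[1:2, ..., 5]        # a comma makes the key a tuple
```

Here long_slice is slice(1, 10, 2), partial is slice(None, None, 3), and multi is the tuple (slice(1, 2, None), Ellipsis, 5).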

5.3.4. Calls

A call calls a callable object (e.g., a function) with a possibly empty series of arguments:

call                 ::=  primary "(" [argument_list [","]
                          | expression genexpr_for] ")"
argument_list        ::=  positional_arguments ["," keyword_arguments]
                            ["," "*" expression] ["," keyword_arguments]
                            ["," "**" expression]
                          | keyword_arguments ["," "*" expression]
                            ["," "**" expression]
                          | "*" expression ["," keyword_arguments] ["," "**" expression]
                          | "**" expression
positional_arguments ::=  expression ("," expression)*
keyword_arguments    ::=  keyword_item ("," keyword_item)*
keyword_item         ::=  identifier "=" expression

A trailing comma may be present after the positional and keyword arguments but does not affect the semantics.

The primary must evaluate to a callable object (user-defined functions, built-in functions, methods of built-in objects, class objects, methods of class instances, and certain class instances themselves are callable; extensions may define additional callable object types). All argument expressions are evaluated before the call is attempted. Please refer to section Function definitions for the syntax of formal parameter lists.

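If keyword arguments are present, they are first converted to positional arguments, as follows. First, a list of unfilled slots is created for the formal parameters. If there are N positional arguments, they are placed in the first N slots. Next, for each keyword argument, the identifier is used to determine the corresponding slot (if the identifier is the same as the first formal parameter name, the first slot is used, and so on). If the slot is already filled, a TypeError exception is raised. Otherwise, the value of the argument is placed in the slot, filling it (even if the expression is None, it fills the slot). When all arguments have been processed, the slots that are still unfilled are filled with the corresponding default value from the function definition. (Default values are calculated, once, when the function is defined; thus, a mutable object such as a list or dictionary used as default value will be shared by all calls that don't specify an argument value for the corresponding slot; this should usually be avoided.) If there are any unfilled slots for which no default value is specified, a TypeError exception is raised. Otherwise, the list of filled slots is used as the argument list for the call.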
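Two consequences of the slot-filling rule are sketched below: a mutable default value is shared between calls, and filling a slot twice raises TypeError. The function names append_shared and append_fresh are hypothetical.

```python
def append_shared(item, bucket=[]):
    # The default list is created once, when the def statement runs.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # A common way to avoid sharing: create the list inside the call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


shared_first = append_shared(1)
shared_second = append_shared(2)      # the same list object as before
fresh_first = append_fresh(1)
fresh_second = append_fresh(2)

try:
    # The positional argument fills the slot for "item"; the keyword
    # argument then targets the same, already filled slot.
    append_fresh(1, item=2)
    duplicate_slot_error = False
except TypeError:
    duplicate_slot_error = True
```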

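An implementation may provide built-in functions whose positional parameters do not have names, even if they are "named" for the purpose of documentation, and which therefore cannot be supplied by keyword. In CPython, this is the case for functions implemented in C that use PyArg_ParseTuple() to parse their arguments.

If there are more positional arguments than there are formal parameter slots, a TypeError exception is raised, unless a formal parameter using the syntax *identifier is present; in this case, that formal parameter receives a tuple containing the excess positional arguments (or an empty tuple if there were no excess positional arguments).

If any keyword argument does not correspond to a formal parameter name, a TypeError exception is raised, unless a formal parameter using the syntax **identifier is present; in this case, that formal parameter receives a dictionary containing the excess keyword arguments (using the keywords as keys and the argument values as corresponding values), or a (new) empty dictionary if there were no excess keyword arguments.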

If the syntax *expression appears in the function call, expression must evaluate to an iterable. Elements from this iterable are treated as if they were additional positional arguments; if there are positional arguments x1, …, xN, and expression evaluates to a sequence y1, …, yM, this is equivalent to a call with M+N positional arguments x1, …, xN, y1, …, yM.

A consequence of this is that although the *expression syntax may appear after some keyword arguments, it is processed before the keyword arguments (and the **expression argument, if any – see below). So:

>>> def f(a, b):
...     print a, b
...
>>> f(b=1, *(2,))
2 1
>>> f(a=1, *(2,))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: f() got multiple values for keyword argument 'a'
>>> f(1, *(2,))
1 2

It is unusual for both keyword arguments and the *expression syntax to be used in the same call, so in practice this confusion does not arise.

If the syntax **expression appears in the function call, expression must evaluate to a mapping, the contents of which are treated as additional keyword arguments. In the case of a keyword appearing in both expression and as an explicit keyword argument, a TypeError exception is raised.
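The **expression rule can be sketched as follows. The function describe and the dictionary options are hypothetical names used only for this illustration.

```python
def describe(a, b=0, **extra):
    # Excess keyword arguments are collected in the extra dictionary.
    return a, b, sorted(extra)


options = {"b": 2, "c": 3}

# The mapping's contents act as additional keyword arguments.
ok = describe(1, **options)

try:
    # "b" is given both explicitly and inside the mapping.
    describe(1, b=5, **options)
    clash_detected = False
except TypeError:
    clash_detected = True
```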

Formal parameters using the syntax *identifier or **identifier cannot be used as positional argument slots or as keyword argument names. Formal parameters using the syntax (sublist) cannot be used as keyword argument names; the outermost sublist corresponds to a single unnamed argument slot, and the argument value is assigned to the sublist using the usual tuple assignment rules after all other parameter processing is done.

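A call always returns some value, possibly None, unless it raises an exception. How this value is computed depends on the type of the callable object.

If it is:

a user-defined function:

The code block for the function is executed, passing it the argument list. The first thing the code block will do is bind the formal parameters to the arguments; this is described in section Function definitions. When the code block executes a return statement, this specifies the return value of the function call.

a built-in function or method:

The result is up to the interpreter; see Built-in Functions for the descriptions of built-in functions and methods.

a class object:

A new instance of that class is returned.

a class instance method:

The corresponding user-defined function is called, with an argument list that is one longer than the argument list of the call: the instance becomes the first argument.

a class instance:

The class must define a __call__() method; the effect is then the same as if that method was called.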
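The last two cases can be illustrated with a small callable class. Multiplier is a hypothetical example; the three calls below are meant to show that calling the instance, calling its __call__() method, and calling the underlying function with the instance as an explicit first argument all have the same effect.

```python
class Multiplier(object):
    def __init__(self, factor):
        self.factor = factor

    def __call__(self, value):
        return self.factor * value


double = Multiplier(2)                        # calling the class object
via_call = double(21)                         # calling the class instance
via_method = double.__call__(21)              # calling the instance method
via_function = Multiplier.__call__(double, 21)  # instance passed explicitly
```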

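5.4. The power operator

The power operator binds more tightly than unary operators on its left; it binds less tightly than unary operators on its right. The syntax is: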

power ::=  primary ["**" u_expr]

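Thus, in an unparenthesized sequence of power and unary operators, the operators are evaluated from right to left (this does not constrain the evaluation order for the operands): -1**2 results in -1.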
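A few expressions that follow from this grammar rule, shown as a short sketch with illustrative variable names:

```python
negated_square = -1 ** 2       # parsed as -(1 ** 2)
squared_negative = (-1) ** 2   # parentheses change the grouping
negative_exponent = 2 ** -1    # unary minus on the right binds more tightly
right_to_left = 2 ** 3 ** 2    # parsed as 2 ** (3 ** 2)
```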

The power operator has the same semantics as the built-in pow() function, when called with two arguments: it yields its left argument raised to the power of its right argument. The numeric arguments are first converted to a common type. The result type is that of the arguments after coercion.

With mixed operand types, the coercion rules for binary arithmetic operators apply. For int and long int operands, the result has the same type as the operands (after coercion) unless the second argument is negative; in that case, all arguments are converted to float and a float result is delivered. For example, 10**2 returns 100, but 10**-2 returns 0.01. (This last feature was added in Python 2.2. In Python 2.1 and before, if both arguments were of integer types and the second argument was negative, an exception was raised).

Raising 0.0 to a negative power results in a ZeroDivisionError. Raising a negative number to a fractional power results in a ValueError.

5.5. Unary arithmetic and bitwise operations

All unary arithmetic and bitwise operations have the same priority:

u_expr ::=  power | "-" u_expr | "+" u_expr | "~" u_expr

The unary - (minus) operator yields the negation of its numeric argument.

The unary + (plus) operator yields its numeric argument unchanged.

The unary ~ (invert) operator yields the bitwise inversion of its plain or long integer argument. The bitwise inversion of x is defined as -(x+1). It only applies to integral numbers.

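In all three cases, if the argument does not have the proper type, a TypeError exception is raised.

5.6. Binary arithmetic operations

The binary arithmetic operations have the conventional priority levels. Note that some of these operations also apply to certain non-numeric types. Apart from the power operator, there are only two levels, one for multiplicative operators and one for additive operators: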

m_expr ::=  u_expr | m_expr "*" u_expr | m_expr "//" u_expr | m_expr "/" u_expr
            | m_expr "%" u_expr
a_expr ::=  m_expr | a_expr "+" m_expr | a_expr "-" m_expr

The * (multiplication) operator yields the product of its arguments. The arguments must either both be numbers, or one argument must be an integer (plain or long) and the other must be a sequence. In the former case, the numbers are converted to a common type and then multiplied together. In the latter case, sequence repetition is performed; a negative repetition factor yields an empty sequence.

The / (division) and // (floor division) operators yield the quotient of their arguments. The numeric arguments are first converted to a common type. Plain or long integer division yields an integer of the same type; the result is that of mathematical division with the ‘floor’ function applied to the result. Division by zero raises the ZeroDivisionError exception.

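The % (modulo) operator yields the remainder from the division of the first argument by the second. The numeric arguments are first converted to a common type. A zero right argument raises the ZeroDivisionError exception. The arguments may be floating point numbers, e.g., 3.14%0.7 equals 0.34 (since 3.14 equals 4*0.7 + 0.34). The modulo operator always yields a result with the same sign as its second operand (or zero); the absolute value of the result is strictly smaller than the absolute value of the second operand [2].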

The integer division and modulo operators are connected by the following identity: x == (x/y)*y + (x%y). Integer division and modulo are also connected with the built-in function divmod(): divmod(x, y) == (x/y, x%y). These identities don’t hold for floating point numbers; there similar identities hold approximately where x/y is replaced by floor(x/y) or floor(x/y) - 1 [3].
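The sign rule and the integer identities can be checked with a short sketch. It uses // rather than / so that the same code performs floor division on both Python 2.7 and Python 3; for plain integers in Python 2 the two operators give the same result. The variable names are illustrative.

```python
pairs = [(7, 3), (-7, 3), (7, -3), (-7, -3)]

# The remainder takes the sign of the second operand.
remainders = [x % y for x, y in pairs]

# x == (x // y) * y + (x % y) for integers.
identity_holds = all(x == (x // y) * y + (x % y) for x, y in pairs)

# divmod() returns the same quotient and remainder as a pair.
divmod_matches = all(divmod(x, y) == (x // y, x % y) for x, y in pairs)
```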

In addition to performing the modulo operation on numbers, the % operator is also overloaded by string and unicode objects to perform string formatting (also known as interpolation). The syntax for string formatting is described in the Python Library Reference, section String Formatting Operations.

Deprecated since version 2.3: The floor division operator, the modulo operator, and the divmod() function are no longer defined for complex numbers. Instead, convert to a floating point number using the abs() function if appropriate.

The + (addition) operator yields the sum of its arguments. The arguments must either both be numbers or both sequences of the same type. In the former case, the numbers are converted to a common type and then added together. In the latter case, the sequences are concatenated.

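The - (subtraction) operator yields the difference of its arguments. The numeric arguments are first converted to a common type.

5.7. Shifting operations

The shifting operations have lower priority than the arithmetic operations: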

shift_expr ::=  a_expr | shift_expr ( "<<" | ">>" ) a_expr

These operators accept plain or long integers as arguments. The arguments are converted to a common type. They shift the first argument to the left or right by the number of bits given by the second argument.

A right shift by n bits is defined as division by pow(2, n). A left shift by n bits is defined as multiplication with pow(2, n). Negative shift counts raise a ValueError exception.
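The definitions above can be checked numerically. This sketch assumes that "division" for a right shift means floor division, which is what integer division does; the variable names are illustrative.

```python
values = [5, -5, 1024]

# A left shift by 3 bits multiplies by pow(2, 3).
left_ok = all((v << 3) == v * pow(2, 3) for v in values)

# A right shift by 3 bits is floor division by pow(2, 3),
# so negative values round towards minus infinity.
right_ok = all((v >> 3) == v // pow(2, 3) for v in values)
negative_example = -5 >> 1

count = -1
try:
    1 << count
    negative_count_error = False
except ValueError:
    negative_count_error = True
```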

Note

In the current implementation, the right-hand operand is required to be at most sys.maxsize. If the right-hand operand is larger than sys.maxsize an OverflowError exception is raised.

5.8. Binary bitwise operations

Each of the three bitwise operations has a different priority level:

and_expr ::=  shift_expr | and_expr "&" shift_expr
xor_expr ::=  and_expr | xor_expr "^" and_expr
or_expr  ::=  xor_expr | or_expr "|" xor_expr

The & operator yields the bitwise AND of its arguments, which must be plain or long integers. The arguments are converted to a common type.

The ^ operator yields the bitwise XOR (exclusive OR) of its arguments, which must be plain or long integers. The arguments are converted to a common type.

The | operator yields the bitwise (inclusive) OR of its arguments, which must be plain or long integers. The arguments are converted to a common type.

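5.9. Comparisons

Unlike C, all comparison operations in Python have the same priority, which is lower than that of any arithmetic, shifting or bitwise operation. Also unlike C, expressions like a < b < c have the interpretation that is conventional in mathematics: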

comparison    ::=  or_expr ( comp_operator or_expr )*
comp_operator ::=  "<" | ">" | "==" | ">=" | "<=" | "<>" | "!="
                   | "is" ["not"] | ["not"] "in"

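Comparisons yield boolean values: True or False.

Comparisons can be chained arbitrarily, e.g., x < y <= z is equivalent to x < y and y <= z, except that y is evaluated only once (but in both cases z is not evaluated at all when x < y is found to be false).

Formally, if a, b, c, …, y, z are expressions and op1, op2, …, opN are comparison operators, then a op1 b op2 c ... y opN z is equivalent to a op1 b and b op2 c and ... y opN z, except that each expression is evaluated at most once.

Note that a op1 b op2 c doesn't imply any kind of comparison between a and c, so that, e.g., x < y > z is perfectly legal (though perhaps not pretty).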
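The "evaluated at most once" and short-circuit properties of chained comparisons can be observed by recording which operands are evaluated. The helper probe and the list evaluated are hypothetical names introduced for this sketch.

```python
evaluated = []


def probe(name, value):
    """Record that an operand was evaluated, then return its value."""
    evaluated.append(name)
    return value


# All three operands are needed; y is evaluated only once.
result = probe("x", 1) < probe("y", 5) <= probe("z", 5)
order_when_true = list(evaluated)

del evaluated[:]

# x < y is false, so z is never evaluated.
short = probe("x", 9) < probe("y", 5) <= probe("z", 5)
order_when_short = list(evaluated)
```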

The forms <> and != are equivalent; for consistency with C, != is preferred; where != is mentioned below <> is also accepted. The <> spelling is considered obsolescent.

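5.9.1. Value comparisons

The operators <, >, ==, >=, <=, and != compare the values of two objects. The objects do not need to have the same type.

Chapter Objects, values and types states that objects have a value (in addition to type and identity). The value of an object is a rather abstract notion in Python: for example, there is no canonical access method for an object's value. Also, there is no requirement that the value of an object should be constructed in a particular way, e.g. comprised of all its data attributes. Comparison operators implement a particular notion of what the value of an object is. One can think of them as defining the value of an object indirectly, by means of their comparison implementation.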

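Types can customize their comparison behavior by implementing a __cmp__() method or rich comparison methods like __lt__(), described in Basic customization.

The default behavior for equality comparison (== and !=) is based on the identity of the objects. Hence, equality comparison of instances with the same identity results in equality, and equality comparison of instances with different identities results in inequality. A motivation for this default behavior is the desire that all objects should be reflexive (i.e. x is y implies x == y).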

The default order comparison (<, >, <=, and >=) gives a consistent but arbitrary order.

(This unusual definition of comparison was used to simplify the definition of operations like sorting and the in and not in operators. In the future, the comparison rules for objects of different types are likely to change.)

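The behavior of the default equality comparison, that instances with different identities are always unequal, may be in contrast to what types will need that have a sensible definition of object value and value-based equality. Such types will need to customize their comparison behavior, and in fact, a number of built-in types have done that.

The following list describes the comparison behavior of the most important built-in types.

  • Numbers of built-in numeric types (Numeric Types: int, float, long, complex) and of the standard library types fractions.Fraction and decimal.Decimal can be compared within and across their types, with the restriction that complex numbers do not support order comparison. Within the limits of the types involved, they compare mathematically (algorithmically) correct without loss of precision.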

  • Strings (instances of str or unicode) compare lexicographically using the numeric equivalents (the result of the built-in function ord()) of their characters. [4] When comparing an 8-bit string and a Unicode string, the 8-bit string is converted to Unicode. If the conversion fails, the strings are considered unequal.

  • Instances of tuple or list can be compared only within each of their types. Equality comparison across these types results in inequality, and ordering comparison across these types gives an arbitrary order.

    These sequences compare lexicographically using comparison of corresponding elements, whereby reflexivity of the elements is enforced.

    In enforcing reflexivity of elements, the comparison of collections assumes that for a collection element x, x == x is always true. Based on that assumption, element identity is compared first, and element comparison is performed only for distinct elements. This approach yields the same result as a strict element comparison would, if the compared elements are reflexive. For non-reflexive elements, the result is different than for strict element comparison.

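    Lexicographical comparison between built-in collections works as follows:

    • For two collections to compare equal, they must be of the same type, have the same length, and each pair of corresponding elements must compare equal (for example, [1,2] == (1,2) is false because the type is not the same).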
    • Collections are ordered the same as their first unequal elements (for example, cmp([1,2,x], [1,2,y]) returns the same as cmp(x,y)). If a corresponding element does not exist, the shorter collection is ordered first (for example, [1,2] < [1,2,3] is true).
  • Mappings (instances of dict) compare equal if and only if they have equal (key, value) pairs. Equality comparison of the keys and values enforces reflexivity.

    Outcomes other than equality are resolved consistently, but are not otherwise defined. [5]

  • Most other objects of built-in types compare unequal unless they are the same object; the choice whether one object is considered smaller or larger than another one is made arbitrarily but consistently within one execution of a program.
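The enforced reflexivity of collection elements is easiest to see with a not-a-number float, which is a non-reflexive value (it compares unequal to itself). The sketch below reflects CPython behavior as understood here and uses illustrative variable names.

```python
nan = float("nan")

nan_equals_itself = (nan == nan)          # NaN is not equal to itself

# The list comparison checks element identity first, so the same NaN
# object on both sides makes the lists compare equal.
same_object_in_lists = ([nan] == [nan])

# Two distinct NaN objects fall through to ==, which is false.
distinct_nans = ([float("nan")] == [float("nan")])

cross_type = ([1, 2] == (1, 2))           # different types: never equal
shorter_first = ([1, 2] < [1, 2, 3])      # the shorter prefix orders first
```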

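User-defined classes that customize their comparison behavior should follow some consistency rules, if possible:

  • Equality comparison should be reflexive. In other words, identical objects should compare equal:

    x is y implies x == y

  • Comparison should be symmetric. In other words, the following expressions should have the same result:

    x == y and y == x

    x != y and y != x

    x < y and y > x

    x <= y and y >= x

  • Comparison should be transitive. The following (non-exhaustive) examples illustrate that:

    x > y and y > z implies x > z

    x < y and y <= z implies x < z

  • Inverse comparison should result in the boolean negation. In other words, the following expressions should have the same result:

    x == y and not x != y

    x < y and not x >= y (for total ordering)

    x > y and not x <= y (for total ordering)

    The last two expressions apply to totally ordered collections (e.g. to sequences, but not to sets or mappings). See also the total_ordering() decorator.

  • The hash() result should be consistent with equality. Objects that are equal should either have the same hash value, or be marked as unhashable.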

Python does not enforce these consistency rules.
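One possible way to follow these rules in a user-defined class is sketched below, using the functools.total_ordering() decorator mentioned above. The Version class and its key attribute are hypothetical; __ne__() is written out explicitly on the assumption that the code should also behave consistently on Python 2, where != is not derived from ==.

```python
import functools


@functools.total_ordering
class Version(object):
    def __init__(self, major, minor):
        self.key = (major, minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.key == other.key

    def __ne__(self, other):
        # Keep != the exact negation of ==.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.key < other.key

    def __hash__(self):
        # Equal objects share a key, hence the same hash value.
        return hash(self.key)


a, b, c = Version(1, 0), Version(1, 0), Version(2, 3)
equal_and_same_hash = (a == b) and (hash(a) == hash(b))
negation_consistent = (a == b) == (not a != b)
symmetric = (a < c) == (c > a)
inverse = (a < c) == (not a >= c)
```

The decorator supplies the remaining ordering methods from __lt__() and __eq__(), which keeps the symmetric and inverse comparisons in agreement without writing each one by hand.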

5.9.2. Membership test operations

The operators in and not in test for membership. x in s evaluates to True if x is a member of s, and False otherwise. x not in s returns the negation of x in s. All built-in sequences and set types support this as well as dictionary, for which in tests whether the dictionary has a given key. For container types such as list, tuple, set, frozenset, dict, or collections.deque, the expression x in y is equivalent to any(x is e or x == e for e in y).

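For the string and bytes types, x in y is True if and only if x is a substring of y. An equivalent test is y.find(x) != -1. Empty strings are always considered to be a substring of any other string, so "" in "abc" will return True.

For user-defined classes which define the __contains__() method, x in y returns True if y.__contains__(x) returns a true value, and False otherwise.

For user-defined classes which do not define __contains__() but do define __iter__(), x in y is True if some value z with x == z is produced while iterating over y. If an exception is raised during the iteration, it is as if in raised that exception.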

Lastly, the old-style iteration protocol is tried: if a class defines __getitem__(), x in y is True if and only if there is a non-negative integer index i such that x == y[i], and all lower integer indices do not raise IndexError exception. (If any other exception is raised, it is as if in raised that exception).
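The three lookup strategies for user-defined classes can be compared side by side. The classes below are hypothetical and each implements only one of the protocols.

```python
class WithContains(object):
    def __contains__(self, item):
        return item == "magic"


class WithIter(object):
    def __iter__(self):
        return iter([1, 2, 3])


class WithGetitem(object):
    # Old-style protocol: indexed from 0 until IndexError is raised.
    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index * 10


via_contains = ("magic" in WithContains(), "other" in WithContains())
via_iter = (2 in WithIter(), 5 in WithIter())
via_getitem = (20 in WithGetitem(), 25 in WithGetitem())
substring = ("" in "abc", "bc" in "abc", "ac" in "abc")
```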

The operator not in is defined to have the inverse true value of in.

5.9.3. Identity comparisons

The operators is and is not test for object identity: x is y is true if and only if x and y are the same object. x is not y yields the inverse truth value. [6]

5.10. Boolean operations

or_test  ::=  and_test | or_test "or" and_test
and_test ::=  not_test | and_test "and" not_test
not_test ::=  comparison | "not" not_test

In the context of Boolean operations, and also when expressions are used by control flow statements, the following values are interpreted as false: False, None, numeric zero of all types, and empty strings and containers (including strings, tuples, lists, dictionaries, sets and frozensets). All other values are interpreted as true. (See the __nonzero__() special method for a way to change this.)

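The operator not yields True if its argument is false, False otherwise.

The expression x and y first evaluates x; if x is false, its value is returned; otherwise, y is evaluated and the resulting value is returned.

The expression x or y first evaluates x; if x is true, its value is returned; otherwise, y is evaluated and the resulting value is returned.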

(Note that neither and nor or restrict the value and type they return to False and True, but rather return the last evaluated argument. This is sometimes useful, e.g., if s is a string that should be replaced by a default value if it is empty, the expression s or 'foo' yields the desired value. Because not has to invent a value anyway, it does not bother to return a value of the same type as its argument, so e.g., not 'foo' yields False, not ''.)
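A short sketch of the return values described above; the variable names are illustrative.

```python
default_used = "" or "foo"              # empty string is false: yields 'foo'
value_kept = "bar" or "foo"             # first operand is true: yields 'bar'
first_false = 0 and "never evaluated"   # yields 0, the false first operand
last_true = 1 and "second"              # yields the second operand
negated = not "foo"                     # not always yields True or False
```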

5.11. Conditional Expressions

New in version 2.5.

conditional_expression ::=  or_test ["if" or_test "else" expression]
expression             ::=  conditional_expression | lambda_expr

Conditional expressions (sometimes called a "ternary operator") have the lowest priority of all Python operations.

The expression x if C else y first evaluates the condition, C (not x); if C is true, x is evaluated and its value is returned; otherwise, y is evaluated and its value is returned.
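The evaluation order of a conditional expression, and the fact that only one branch is evaluated, can be observed with a recording helper. The names steps and step are hypothetical.

```python
steps = []


def step(name, value):
    """Record that a sub-expression was evaluated, then return its value."""
    steps.append(name)
    return value


picked = step("x", "yes") if step("C", True) else step("y", "no")
order_when_true = list(steps)

del steps[:]
other = step("x", "yes") if step("C", False) else step("y", "no")
order_when_false = list(steps)
```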

See PEP 308 for more details about conditional expressions.

5.12. Lambdas

lambda_expr     ::=  "lambda" [parameter_list]: expression
old_lambda_expr ::=  "lambda" [parameter_list]: old_expression

Lambda expressions (sometimes called lambda forms) have the same syntactic position as expressions. They are a shorthand to create anonymous functions; the expression lambda parameters: expression yields a function object. The unnamed object behaves like a function object defined with

def <lambda>(parameters):
    return expression

See section Function definitions for the syntax of parameter lists. Note that functions created with lambda expressions cannot contain statements.
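A brief sketch comparing a lambda expression with the equivalent def statement; the names add and add_def are illustrative.

```python
# A lambda expression with a default parameter value...
add = lambda a, b=10: a + b


# ...behaves like this named function.
def add_def(a, b=10):
    return a + b


lambda_result = add(1)           # b falls back to its default
def_result = add_def(1)
lambda_name = add.__name__       # the function object itself is unnamed
```

The only visible difference between the two objects here is the name: the lambda reports '<lambda>' while the def reports 'add_def'.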

5.13. Expression lists

expression_list ::=  expression ( "," expression )* [","]

An expression list containing at least one comma yields a tuple. The length of the tuple is the number of expressions in the list. The expressions are evaluated from left to right.

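The trailing comma is required only to create a single tuple (a.k.a. a singleton); it is optional in all other cases. A single expression without a trailing comma doesn't create a tuple, but rather yields the value of that expression. (To create an empty tuple, use an empty pair of parentheses: ().)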
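The cases described above, written out as a short sketch with illustrative variable names:

```python
pair = 1, 2            # a comma makes a tuple; parentheses are not needed
trailing = 1, 2,       # an optional trailing comma changes nothing
single = 1,            # the trailing comma is what makes a one-element tuple
not_a_tuple = (1)      # parentheses alone just yield the value 1
empty = ()             # an empty pair of parentheses is the empty tuple
```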

5.14. Evaluation order

Python evaluates expressions from left to right. Notice that while evaluating an assignment, the right-hand side is evaluated before the left-hand side.

In the following lines, expressions will be evaluated in the arithmetic order of their suffixes:

expr1, expr2, expr3, expr4
(expr1, expr2, expr3, expr4)
{expr1: expr2, expr3: expr4}
expr1 + expr2 * (expr3 - expr4)
expr1(expr2, expr3, *expr4, **expr5)
expr3, expr4 = expr1, expr2
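Two of the lines above can be checked with a recording helper. The expr function and the order list are hypothetical names used only for this sketch, and the subscripted targets are an assumption chosen to make the left-hand side of the assignment observable.

```python
order = []


def expr(n):
    """Record when operand n is evaluated, then return it."""
    order.append(n)
    return n


# Operands are evaluated left to right, regardless of operator precedence.
total = expr(1) + expr(2) * (expr(3) - expr(4))
arithmetic_order = list(order)

del order[:]
box = {}
# The right-hand side is evaluated before the targets on the left.
box[expr(3)], box[expr(4)] = expr(1), expr(2)
assignment_order = list(order)
```

Precedence decides how the operands are grouped, not when they are evaluated, which is why the arithmetic line still records its operands from left to right.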

5.15. Operator precedence

The following table summarizes the operator precedences in Python, from lowest precedence (least binding) to highest precedence (most binding). Operators in the same box have the same precedence. Unless the syntax is explicitly given, operators are binary. Operators in the same box group left to right (except for comparisons, including tests, which all have the same precedence and chain from left to right, as described in section Comparisons, and exponentiation, which groups from right to left).

Operator                                                  Description
lambda                                                    Lambda expression
if - else                                                 Conditional expression
or                                                        Boolean OR
and                                                       Boolean AND
not x                                                     Boolean NOT
in, not in, is, is not, <, <=, >, >=, <>, !=, ==          Comparisons, including membership tests and identity tests
|                                                         Bitwise OR
^                                                         Bitwise XOR
&                                                         Bitwise AND
<<, >>                                                    Shifts
+, -                                                      Addition and subtraction
*, /, //, %                                               Multiplication, division, remainder [7]
+x, -x, ~x                                                Positive, negative, bitwise NOT
**                                                        Exponentiation [8]
x[index], x[index:index], x(arguments...), x.attribute    Subscription, slicing, call, attribute reference
(expressions...), [expressions...], {key: value...}, `expressions...`  Binding or tuple display, list display, dictionary display, string conversion
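A few rows of the table can be spot-checked as follows. Each entry pairs an unparenthesized expression with the value implied by the grouping in the table; the list name checks is illustrative.

```python
checks = [
    (1 + 2 * 3, 7),            # * binds tighter than +
    (1 << 2 + 1, 8),           # + binds tighter than <<: 1 << (2 + 1)
    (4 | 2 & 1, 4),            # & binds tighter than |: 4 | (2 & 1)
    (-2 ** 2, -4),             # ** binds tighter than unary minus on its left
    (2 ** -1, 0.5),            # ...but less tightly than unary minus on its right
    (2 ** 3 ** 2, 512),        # exponentiation groups right to left
    (not 1 == 2, True),        # comparison binds tighter than not
    (1 < 2 == 2, True),        # comparisons chain: (1 < 2) and (2 == 2)
    ((lambda: 1 if False else 2)(), 2),   # lambda binds least tightly of all
]

all_match = all(actual == expected for actual, expected in checks)
```

Adding explicit parentheses is still the clearer choice in real code; the sketch only confirms how the unparenthesized forms are grouped.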

Footnotes

[1]In Python 2.3 and later releases, a list comprehension “leaks” the control variables of each for it contains into the containing scope. However, this behavior is deprecated, and relying on it will not work in Python 3.
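[2]While abs(x%y) < abs(y) is true mathematically, for floats it may not be true numerically due to roundoff. For example, and assuming a platform on which a Python float is an IEEE 754 double-precision number, in order that -1e-100 % 1e100 have the same sign as 1e100, the computed result is -1e-100 + 1e100, which is numerically exactly equal to 1e100. The function math.fmod() returns a result whose sign matches the sign of the first argument instead, and so returns -1e-100 in this case. Which approach is more appropriate depends on the application.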
[3]If x is very close to an exact integer multiple of y, it’s possible for floor(x/y) to be one larger than (x-x%y)/y due to rounding. In such cases, Python returns the latter result, in order to preserve that divmod(x,y)[0] * y + x % y be very close to x.
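[4]The Unicode standard distinguishes between code points (e.g. U+0041) and abstract characters (e.g. “LATIN CAPITAL LETTER A”). While most abstract characters in Unicode are represented by only one code point, a number of abstract characters can also be represented by a sequence of more than one code point. For example, the abstract character “LATIN CAPITAL LETTER C WITH CEDILLA” can be represented as a single precomposed character at code position U+00C7, or as a sequence of a base character at code position U+0043 (LATIN CAPITAL LETTER C), followed by a combining character at code position U+0327 (COMBINING CEDILLA).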

The comparison operators on unicode strings compare at the level of Unicode code points. This may be counter-intuitive to humans. For example, u"\u00C7" == u"\u0043\u0327" is False, even though both strings represent the same abstract character “LATIN CAPITAL LETTER C WITH CEDILLA”.

To compare strings at the level of abstract characters (that is, in a way intuitive to humans), use unicodedata.normalize().

[5]Earlier versions of Python used lexicographic comparison of the sorted (key, value) lists, but this was very expensive for the common case of comparing for equality. An even earlier version of Python compared dictionaries by identity only, but this caused surprises because people expected to be able to test a dictionary for emptiness by comparing it to {}.
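[6]Due to automatic garbage-collection, free lists, and the dynamic nature of descriptors, you may notice seemingly unusual behaviour in certain uses of the is operator, like those involving comparisons between instance methods, or constants. Check their documentation for more info.
[7]The % operator is also used for string formatting; the same precedence applies.
[8]The power operator ** binds less tightly than an arithmetic or bitwise unary operator on its right, that is, 2**-1 is 0.5.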
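The numeric claim in footnote [2] and the Unicode claim in footnote [4] can be checked with a short sketch. It assumes IEEE 754 double-precision floats, as footnote [2] does, and the choice of the "NFC" normalization form is an assumption made for this example, since the footnote does not name a form.

```python
import math
import unicodedata

# Footnote [2]: % follows the sign of the second operand, fmod() the first.
wrapped = -1e-100 % 1e100
fmod_result = math.fmod(-1e-100, 1e100)

# Footnote [4]: two spellings of LATIN CAPITAL LETTER C WITH CEDILLA.
precomposed = u"\u00C7"
decomposed = u"\u0043\u0327"

raw_equal = precomposed == decomposed     # compared code point by code point
normalized_equal = (
    unicodedata.normalize("NFC", precomposed)
    == unicodedata.normalize("NFC", decomposed)
)
```

Normalizing both operands to the same form before comparing is what makes the two spellings compare equal.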