《Fluent Python》第10章序列的修改、散列和切片

前言

不要检查它是不是鸭子、它的叫声像不像鸭子、它的走路姿势像不像鸭子,等等。具体检查什么取决于你想使用语言的哪些行为。(comp.lang.python,2000 年 7月 26 日) ————Alex Martelli

多维向量

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53


from array import array
from math import sqrt
import reprlib

class Vector:
    typecode = 'd'

    def __init__(self, components):
        self._components = array(self.typecode, components)

    def __iter__(self):
        return iter(self._components)
    
    def __repr__(self):
        components = reprlib.repr(self._components)
        components = components[components.find('['):-1]
        return 'Vector({})'.format(components)
    
    def __str__(self):
        return str(tuple(self))
    
    def __bytes__(self):
        return (bytes([ord(self.typecode)]) + bytes(self._components))
    
    def __eq__(self, other):
        return tuple(self) == tuple(other)
    
    def __abs__(self):
        return sqrt(sum(x*x for x in self))

    def __bool__(self):
        return bool(abs(self))
    
    @classmethod
    def frombytes(cls, octets):
        typecode = chr(octets[0])
        memv = memoryview(octets[1:]).cast(typecode)
        return cls(memv)
    

v1 = Vector([3.1, 4.2])
print(v1)
>>>(3.1, 4.2)

v2 = Vector([3.0, 4.0, 5.0])
print(v2)
>>>(3.0, 4.0, 5.0)

v3 = Vector(range(100))
print('%r' % v3)
>>>Vector([0.0, 1.0, 2.0, 3.0, 4.0, ...])

reprlib.repr用于生成大型结构或递归结构的安全表示形式,它会限制输出字符串的长度,用...表示截断的部分。

序列协议

在面向对象编程中,协议是非正式的接口,只在文档中定义,在代码中不定义
Python的序列协议只需要实现__len__和__getitem__两个方法

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20


class Sequence:
    def __init__(self, seq):
        self.seq = list(seq)
    
    def __len__(self):
        return len(self.seq)
    
    def __getitem__(self, idx):
        return self.seq[idx]
    

s = Sequence([1, 2, 3, 4])
print(s)
>>><__main__.Sequence object at 0x7f5e1b940860>
print(len(s))
>>>4
print(s[1])
>>>2
print(s[2:])
>>>[3, 4]

我们说Sequence类是序列，不是因为我给他起名为sequence，而是它的行为像序列，这才是重点，结合本文前言那段话，人们称这样的类为鸭子类型

切片原理

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20


class MySeq:
    def __getitem__(self, idx):
        return idx

s = MySeq()
print(s[2333])
>>>2333

print(s[1:4])
>>>slice(1, 4, None)

print(s[1:4:2])
>>>slice(1, 4, 2)

# 注意下面这两个例子
print(s[1:4:2, 9])
>>>(slice(1, 4, 2), 9)

print(s[1:4:2, 7:9])
>>>(slice(1, 4, 2), slice(7, 9, None))

可以看到最后两个例子，返回的是slice类组成的元组

`slice.indices`

slice类提供indices()方法, 用于优雅地处理缺失索引和负数索引,以及长度超过目标序列的切片，将非法的下标转换成合法的下标

1
2
3
4


print(slice(None, 10, 2).indices(5))
>>>(0, 5, 2)
print(slice(-3, None, None).indices(5))
>>>(2, 5, 1)

'ABCDE'[:10:2]等同于'ABCDE'[0:5:2]
'ABCDE'[-3:]等同于'ABCDE[2:5:1]'

更完美的切片

认真观察可以发现，Sequence类实例的切片是利用了list的切片功能，返回的也是list, 下面将其完善为切片后仍旧为Sequence对象

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29


class Sequence:
    def __init__(self, seq):
        self.seq = list(seq)
    
    def __len__(self):
        return len(self.seq)
    
    def __getitem__(self, idx):
        cls = type(self)
        if isinstance(idx, slice):
            return cls(self.seq[idx])
        elif isinstance(idx, numbers.Integral):
            return self.seq[idx]
        else:
            msg = '{cls.__name__} indices must be integers'
            raise TypeError(msg.format(cls=cls))

    def __repr__(self):
        fmt = '<Sequence({txt})>'
        seq = ', '.join(str(x) for x in self.seq)
        return fmt.format(txt=seq)

s = Sequence([1, 2, 3, 4])
print(s)
<Sequence(1, 2, 3, 4)>

s1 = s[2:]
print(s1)
<Sequence(3, 4)>

`setattr`和`getattr`

__setattr__和__getattr__必须成对出现，以防止对象的行为不一致
如果想允许修改分量,可以使用__setitem__方法,支持v[0] = 1.1这样的赋值,以及(或者)实现__setattr__方法,支持v.x = 1.1这样的赋值

归约函数

归约函数(reduce()、sum()、any()、all())把序列或有限的可迭代对象变成一个聚合结果.它们的关键思想是,把一系列值归约成单个值

`zip()`函数

zip函数的名字取自拉链系结物(zipper fastener),因为这个物品用于把两个拉链边的链牙咬合在一起,这形象地说明了zip(left, right)的作用。zip函数与文件压缩没有关系。

1
2
3
4
5
6
7


L = list(zip(range(3), 'ABC'))
print(L)
>>>[(0, 'A'), (1, 'B'), (2, 'C')]

L = list(zip(range(3), 'ABC', [0.0, 1.1, 2.2, 3.3]))
print(L)
>>>[(0, 'A', 0.0), (1, 'B', 1.1), (2, 'C', 2.2)]

前言#

多维向量#

序列协议#

切片原理#

slice.indices#

更完美的切片#

__setattr__和__getattr__#

归约函数#

zip()函数#

前言