Articles in the Python category - 小毕网_XBe


2025-03-02
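A Deep Dive into Python's asyncio Library

All of the code so far has been asynchronous. Sleeping, for example, has to use asyncio.sleep rather than the blocking time.sleep. So how do you get concurrency out of asyncio when part of the logic is synchronous? The answer is run_in_executor. As I said at the start, developers rarely create Future objects themselves; the main exception is run_in_executor, which runs a synchronous function in an executor.

Synchronous code

    import asyncio
    import concurrent.futures
    import time

    def a():
        time.sleep(1)
        return 'A'

    async def b():
        await asyncio.sleep(1)
        return 'B'

    def show_perf(func):
        print('*' * 20)
        start = time.perf_counter()
        asyncio.run(func())
        print(f'{func.__name__} Cost: {time.perf_counter() - start}')

    async def c1():
        loop = asyncio.get_running_loop()
        await asyncio.gather(
            loop.run_in_executor(None, a),
            b()
        )

    In : show_perf(c1)
    ********************
    c1 Cost: 1.0027242230000866

As you can see, run_in_executor wraps the synchronous function in an awaitable Future, so it runs concurrently with the coroutine b: the total cost is about 1 second rather than 2.

One detail to watch: a is an ordinary function and must not be written as a coroutine. The following definition is wrong and gives no concurrency, because time.sleep blocks the event loop:

    async def a():
        time.sleep(1)
        return 'A'

Since a contains no asynchronous code, do not define it with async def. Instead, wrap this kind of logic in a coroutine with loop.run_in_executor:

    async def c():
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, a)

Does that make sense? The first argument of loop.run_in_executor(None, a) is a concurrent.futures.Executor instance. Passing None selects the default executor, which is a thread pool:

    In : loop._default_executor
    Out: <concurrent.futures.thread.ThreadPoolExecutor at 0x112b60e80>

Of course we can also use a process pool by passing the executor explicitly:

    async def c3():
        loop = asyncio.get_running_loop()
        with concurrent.futures.ProcessPoolExecutor() as e:
            print(await asyncio.gather(
                loop.run_in_executor(e, a),
                b()
            ))

    In : show_perf(c3)
    ********************
    ['A', 'B']
    c3 Cost: 1.0218078890000015

Next section: A Deep Dive into Python's asyncio Library - Thread Pools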
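As a further illustration of running synchronous functions in the default executor, here is a minimal sketch that submits several blocking calls at once. The names blocking_io, run_blocking_concurrently and timed_run are hypothetical helpers invented for this example; the only asyncio APIs used are asyncio.run, asyncio.gather and loop.run_in_executor, which accepts positional arguments after the function.

```python
import asyncio
import time


def blocking_io(delay, label):
    """Stand-in for synchronous work (hypothetical helper)."""
    time.sleep(delay)
    return label


async def run_blocking_concurrently():
    loop = asyncio.get_running_loop()
    # Extra positional arguments after the function are forwarded to it.
    futures = [
        loop.run_in_executor(None, blocking_io, 0.2, name)
        for name in ("A", "B", "C")
    ]
    # gather preserves the order of its arguments in the result list.
    return await asyncio.gather(*futures)


def timed_run():
    """Run the coroutine and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(run_blocking_concurrently())
    return results, time.perf_counter() - start


if __name__ == "__main__":
    print(timed_run())
```

Because the three calls run in separate worker threads, the elapsed time should be close to one sleep rather than three. Keyword arguments are not forwarded by run_in_executor; functools.partial is the usual way to bind them.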
2025-03-02
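A Deep Dive into Python's asyncio Library

run_in_executor, which is used for synchronous code, does what its name says: it puts a synchronous function into an executor, which can be a thread pool or a process pool. In addition, run_coroutine_threadsafe lets you submit a coroutine to an event loop running in another thread, and it is thread-safe.

Multithreading

    import asyncio
    from functools import partial
    from threading import Thread

    # a() is a coroutine here: it sleeps for 1 second, then returns 'A'

    def start_loop(loop):
        asyncio.set_event_loop(loop)
        loop.run_forever()

    def shutdown(loop):
        loop.stop()

    async def b1():
        new_loop = asyncio.new_event_loop()
        t = Thread(target=start_loop, args=(new_loop,))
        t.start()
        future = asyncio.run_coroutine_threadsafe(a(), new_loop)
        print(future)
        print(f'Result: {future.result(timeout=2)}')
        new_loop.call_soon_threadsafe(partial(shutdown, new_loop))

    In : await b1()
    <Future at 0x107edf4e0 state=pending>
    Result: A

There are several details to note here:

- The coroutine should be submitted from a different thread than the one in which the target event loop runs, so asyncio.new_event_loop() is used to create a new event loop.
- Before the coroutine can execute, the newly created event loop must be running, so it has to be started with something like start_loop.
- After that, asyncio.run_coroutine_threadsafe can run the coroutine a. It returns a concurrent.futures.Future object.
- The output shows that the future starts out pending, because the coroutine a sleeps for 1 second before returning its result.
- future.result(timeout=2) retrieves the result. The timeout must be longer than the time a takes to run, otherwise TimeoutError is raised. Note that this call blocks the calling thread while it waits.
- The new event loop runs in its own thread, and loop.run_forever would keep the program from exiting, so the loop has to be stopped at the end. That is why call_soon_threadsafe is used to schedule the shutdown callback, which stops the event loop and lets the thread finish.

A word on call_soon_threadsafe: as the name suggests, it is the thread-safe version of call_soon, used to schedule a callback on a loop from another thread. By the way, asyncio.run_coroutine_threadsafe uses it under the hood.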
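The same pattern can also be driven from plain synchronous code, with the worker thread joined and the loop closed afterwards. This is a minimal sketch, not the author's code: run_in_background_loop is a hypothetical helper, and the coroutine a is redefined here with a shorter sleep so the example is self-contained.

```python
import asyncio
import threading


async def a():
    # Short sleep so the example finishes quickly.
    await asyncio.sleep(0.1)
    return "A"


def run_in_background_loop(coro_func):
    """Run coro_func() on an event loop owned by a helper thread."""
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True)
    thread.start()
    try:
        # Returns a concurrent.futures.Future usable from this thread.
        future = asyncio.run_coroutine_threadsafe(coro_func(), loop)
        return future.result(timeout=2)
    finally:
        # Stop the loop from its own thread, then clean up.
        loop.call_soon_threadsafe(loop.stop)
        thread.join()
        loop.close()


if __name__ == "__main__":
    print(run_in_background_loop(a))
```

Joining the thread before closing the loop ensures run_forever has returned, since closing a loop that is still running raises an error.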
2025-03-02
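Python Metaclasses Explained

What is a metaclass?

Before trying to understand metaclasses, let us review OOP and classes in Python.

Object Oriented Programming, or OOP for short, is a widely known programming paradigm. It treats the object as the basic unit of a program and encapsulates data and behavior inside it, which gives good reusability, flexibility and extensibility. OOP has two basic concepts, classes and objects:

- A class is a piece of code that describes how to create an object. It describes a collection of objects with the same attributes and methods, and defines the attributes and methods shared by every object in that collection.
- An object is an instance of a class.

Here is an example:

    In : class ObjectCreator(object):
    ...:     pass
    ...:

    In : my_object = ObjectCreator()

    In : my_object
    Out: <__main__.ObjectCreator at 0x1082bbef0>

But classes in Python are more than that:

    In : print(ObjectCreator)
    <class '__main__.ObjectCreator'>

ObjectCreator can be printed, so the class itself is also an object! And since classes are objects, you can create them dynamically, just like any other object. I run into this need in my day-to-day work, for example when mocking data. Suppose a function func takes one argument:

    In : def func(instance):
    ...:     print(instance.a, instance.b)
    ...:     print(instance.method_a(10))
    ...:

In normal use, the instance passed in meets the requirements (it has attributes a and b and a method method_a). But when I want to debug func on its own, I have to "fabricate" one. Without a metaclass, it would look like this:

    In : def generate_cls(a, b):
    ...:     class Fake(object):
    ...:         def method_a(self, n):
    ...:             return n
    ...:     Fake.a = a
    ...:     Fake.b = b
    ...:     return Fake
    ...:

    In : ins = generate_cls(1, 2)()

    In : ins.a, ins.b, ins.method_a(10)
    Out: (1, 2, 10)

You will notice that this is not really "dynamic creation":

- The class name (Fake) is awkward to change.
- The more attributes and methods the class needs, the more code you have to add, which is inflexible.

Here is what I usually do instead:

    In : def method_a(self, n):
    ...:     return n
    ...:

    In : ins = type('Fake', (), {'a': 1, 'b': 2, 'method_a': method_a})()

    In : ins.a, ins.b, ins.method_a(10)
    Out: (1, 2, 10)

This brings us to the type function. Its original purpose is to tell you the type of an object:

    In : type(1)
    Out: int

    In : type('1')
    Out: str

    In : type(ObjectCreator)
    Out: type

    In : type(ObjectCreator())
    Out: __main__.ObjectCreator

In addition, as shown above, type can create classes dynamically: it takes a description of a class as its arguments (name, base classes, attribute dict) and returns a class. The thing that creates classes is a "metaclass":

    MyClass = type('MyClass', (), {})

This usage works because type is in fact a metaclass. As a metaclass, type is what Python uses behind the scenes to create all classes.

Python has a saying: "Everything is an object." That includes integers, strings, functions and classes. All of them are objects, and all of them are created from a class:

    In : age = 35

    In : age.__class__
    Out: int

    In : name = 'bob'

    In : name.__class__
    Out: str
    ...

Now, what is the __class__ of any __class__?

    In : age.__class__.__class__
    Out: type

    In : name.__class__.__class__
    Out: type
    ...

If you like, you can call type a "class factory". type is the built-in metaclass in Python, and of course you can also create your own.

Creating your own metaclass

In Python 2, you can add a __metaclass__ attribute when you create a class:

    class Foo(object):
        __metaclass__ = something...
        [...]

If you do this, Python uses the metaclass to create the class Foo. Python looks for __metaclass__ in the class definition. If it finds it, Python uses it to create the class object Foo. If it does not, Python uses type to create the class.

In Python 3 the syntax changed slightly:

    class Simple1(object, metaclass=something...):
        [...]

In essence it is the same. Take a metaclass example from a talk I gave four years ago (mainly so that I can recommend that you read ...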
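To make the Python 3 metaclass= syntax from this excerpt concrete, here is a minimal sketch of a custom metaclass. UpperAttrMeta and Foo are hypothetical names chosen for illustration; the example only relies on the fact that a metaclass can subclass type and override __new__ to adjust the class namespace before the class is created.

```python
class UpperAttrMeta(type):
    """Metaclass that upper-cases every non-dunder attribute name."""

    def __new__(mcls, name, bases, namespace):
        upper = {
            (key if key.startswith("__") else key.upper()): value
            for key, value in namespace.items()
        }
        # Delegate the actual class creation to type.
        return super().__new__(mcls, name, bases, upper)


class Foo(metaclass=UpperAttrMeta):
    bar = "bip"


if __name__ == "__main__":
    print(hasattr(Foo, "bar"), hasattr(Foo, "BAR"), type(Foo).__name__)
```

Just as type(1) is int, type(Foo) here is UpperAttrMeta: the metaclass is simply the class of the class.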
2025-03-02
How to Sort a List of Dicts in Python

Problem: I am using the MongoDB group function (it is similar to SQL's GROUP BY) to aggregate some results for a project. The feature is very cool, but it does not sort the grouped data.

Solution: here is how to sort the data. (It is only one line of Python, but it is hard to remember how to do it.) DATA is the output of the MongoDB group function. I want to sort this list by 'ups_ad'.

    from pprint import pprint

    DATA = [
        {u'avg': 2.9165000000000001, u'count': 10.0, u'total': 29.165000000000003, u'ups_ad': u'10.194.154.49:80'},
        {u'avg': 2.6931000000000003, u'count': 10.0, u'total': 26.931000000000001, u'ups_ad': u'10.194.155.176:80'},
        {u'avg': 1.9860909090909091, u'count': 11.0, u'total': 21.847000000000001, u'ups_ad': u'10.195.71.146:80'},
        {u'avg': 1.742818181818182, u'count': 11.0, u'total': 19.171000000000003, u'ups_ad': u'10.194.155.48:80'}
    ]

    # Sort by the string value of 'ups_ad' (lexicographic order)
    data_sorted = sorted(DATA, key=lambda item: item['ups_ad'])
    pprint(data_sorted)

Result:

    [{u'avg': 2.9165000000000001, u'count': 10.0, u'total': 29.165000000000003, u'ups_ad': u'10.194.154.49:80'},
     {u'avg': 2.6931000000000003, u'count': 10.0, u'total': 26.931000000000001, u'ups_ad': u'10.194.155.176:80'},
     {u'avg': 1.742818181818182, u'count': 11.0, u'total': 19.171000000000003, u'ups_ad': u'10.194.155.48:80'},
     {u'avg': 1.9860909090909091, u'count': 11.0, u'total': 21.847000000000001, u'ups_ad': u'10.195.71.146:80'}]

References:
HowTo / Sorting - PythonInfo Wiki
Built-in function sorted - Python documentation
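Sorting on the 'ups_ad' string compares characters, so '10.194.155.176:80' comes before '10.194.155.48:80'. If numeric address order is wanted instead, a key function can parse the value first. The sketch below uses a trimmed, hypothetical subset of the post's data and the standard-library operator.itemgetter and ipaddress.ip_address; host_port_key is a helper name invented for this example.

```python
from ipaddress import ip_address
from operator import itemgetter

# Trimmed sample rows; only the fields needed for the demonstration.
rows = [
    {"ups_ad": "10.194.155.176:80", "count": 10.0},
    {"ups_ad": "10.194.155.48:80", "count": 11.0},
    {"ups_ad": "10.194.154.49:80", "count": 10.0},
]

# Same behavior as the lambda version: plain string comparison.
by_text = sorted(rows, key=itemgetter("ups_ad"))


def host_port_key(row):
    """Split 'host:port' and compare as (IPv4Address, int)."""
    host, port = row["ups_ad"].rsplit(":", 1)
    return ip_address(host), int(port)


by_address = sorted(rows, key=host_port_key)

if __name__ == "__main__":
    print([r["ups_ad"] for r in by_text])
    print([r["ups_ad"] for r in by_address])
```

The two orderings differ only for the .176 and .48 hosts, which is exactly where string and numeric comparison disagree.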
2025-03-02
Descriptors in Python

A descriptor is a way to reuse the same access logic across multiple attributes. It can "hijack" operations that would otherwise go to self.__dict__. A descriptor is usually a class that defines at least one of the three methods __get__, __set__ and __delete__, and it feels like "delegating one class's operations to another class". Static methods, class methods and property are all built on descriptor classes.

Let us start with a simple descriptor example:

    class MyDescriptor(object):
        _value = ''

        def __get__(self, instance, klass):
            return self._value

        def __set__(self, instance, value):
            self._value = value.swapcase()

    class Swap(object):
        swap = MyDescriptor()

Note that under Python 2 MyDescriptor must be a new-style class. Also note that the value is stored on the descriptor itself, so it is shared by every Swap instance. Let us try it:

    In [1]: from descriptor_example import Swap

    In [2]: instance = Swap()

    In [3]: instance.swap  # no AttributeError, because access to swap is intercepted by the descriptor class
    Out[3]: ''

    In [4]: instance.swap = 'make it swap'  # __set__ updates _value

    In [5]: instance.swap
    Out[5]: 'MAKE IT SWAP'

    In [6]: instance.__dict__  # __dict__ is not used: it was hijacked
    Out[6]: {}

That is the power of descriptors. If the familiar staticmethod and classmethod are unclear to you, seeing what they would look like implemented in Python may help:

    >>> class myStaticMethod(object):
    ...     def __init__(self, method):
    ...         self.staticmethod = method
    ...     def __get__(self, object, type=None):
    ...         return self.staticmethod
    ...
    >>> class myClassMethod(object):
    ...     def __init__(self, method):
    ...         self.classmethod = method
    ...     def __get__(self, object, klass=None):
    ...         if klass is None:
    ...             klass = type(object)
    ...         def newfunc(*args):
    ...             return self.classmethod(klass, *args)
    ...         return newfunc

What are descriptors good for in real production projects? First, look at how Field is used in MongoEngine:

    from mongoengine import *

    class Metadata(EmbeddedDocument):
        tags = ListField(StringField())
        revisions = ListField(IntField())

    class WikiPage(Document):
        title = StringField(required=True)
        text = StringField()
        metadata = EmbeddedDocumentField(Metadata)

There are many Field types, and their base class is in fact a descriptor. Here is a simplified version that shows the principle:

    class BaseField(object):
        name = None

        def __init__(self, **kwargs):
            self.__dict__.update(kwargs)
            ...

        def __get__(self, instance, owner):
            return instance._data.get(self.name)

        def __set__(self, instance, value):
            ...
            instance._data[self.name] = value

The source code of many projects looks complicated, but once you peel away the layers the principle is very simple; what is complicated is the business logic.

Next, look at cached_property in Werkzeug, a dependency of Flask:

    class _Missing(object):
        def __repr__(self):
            return 'no value'

        def __reduce__(self):
            return '_missing'

    _missing = _Missing()

    class cached_property(property):
        def __init__(self, func, name=None, doc=None):
            self.__name__ = name or func.__name__
            self.__module__ = func.__module__
            self.__doc__ = doc or func.__doc__
            self.func = func

        def __set__(self, obj, value):
            obj.__dict__[self.__name__] = value

        def __get__(self, obj, type=None):
            if obj is None:
                return self
            value = obj.__dict__.get(self.__name__, _missing)
            if value is _missing:
                value = self.func(obj)
                obj.__dict__[self.__name__] = value
            return value

The class name already tells you that this caches a property. If the code is hard to follow, just try it:

    class Foo(object):
        @cached_property
        def bar(self):
            print('Call me!')
            return 42

Call it:

    In [1]: from cached_property import Foo
       ...: foo = Foo()
       ...:

    In [2]: foo.bar
    Call me!
    Out[2]: 42

    In [3]: foo.bar
    Out[3]: 42

From the second access to bar onward, the cached result is used and the method body is not executed again.

So much for uses of descriptors. Now let us write a descriptor that validates a field:

    class Quantity(object):
        def __init__(self, name):
            self.name = name

        def __set__(self, instance, value):
            if value > 0:
                instance.__dict__[self.name] = value
            else:
                raise ValueError('value must be > 0')

    class Rectangle(object):
        height = Quantity('height')
        width = Quantity('width')

        def __init__(self, height, width):
            self.height = height
            self.width = width

        @property
        def area(self):
            return self.height * self.width

Let us try it:

    In [1]: from rectangle import Rectangle

    In [2]: r = Rectangle(10, 20)

    In [3]: r.area
    Out[3]: 200

    In [4]: r = Rectangle(-1, 20)
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-5-5a7fc56e8a> in <module>()
    ----> 1 r = Rectangle(-1, 20)

    /Users/dongweiming/mp/2017-03-23/rectangle.py in __init__(self, height, width)
         15
         16     def __init__(self, height, width):
    ---> 17         self.height = height
         18         self.width = width
         19

    /Users/dongweiming/mp/2017-03-23/rectangle.py in __set__(self, instance, value)
          7             instance.__dict__[self.name] = value
          8         else:
    ----> 9             raise ValueError('value must be > 0')
         10
         11

    ValueError: value must be > 0

See? The value is validated inside the descriptor class. This is how ORMs do it! But the implementation above has a drawback: it is not very automatic. In height = Quantity('height'), the attribute and the name passed to Quantity both have to be called height. Can we avoid specifying the name? Certainly, although the implementation becomes quite a bit more complicated:

    class Quantity(object):
        __counter = 0

        def __init__(self):
            cls = self.__class__
            prefix = cls.__name__
            index = cls.__counter
            self.name = '_{}#{}'.format(prefix, index)
            cls.__counter += 1

        def __get__(self, instance, owner):
            if instance is None:
                return self
            return getattr(instance, self.name)
        ...

    class Rectangle(object):
        height = Quantity()
        width = Quantity()
        ...

The name of each Quantity is the class name plus a counter. The counter is incremented by 1 every time a Quantity is instantiated, which keeps the names distinct.

One thing worth mentioning is this snippet in __get__:

    if instance is None:
        return self

You will see it in many places, for example in the BaseField of MongoEngine mentioned earlier. The reason is that accessing the attribute through the class, as in Rectangle.height, calls __get__ with instance set to None. Without this check, getattr(None, self.name) would raise AttributeError, so the descriptor returns itself instead.

PS: the inspiration for this comes from "Fluent Python", which contains another example that I think is very well designed. It addresses how to extend the code cleanly when there are many kinds of things to validate. Suppose that besides checking that the value is greater than 0, we also have to check that it is not blank and that it is a number (doing all three checks in one method would be acceptable too; this is just a demonstration). We start with an abc base class:

    import abc

    class Validated(abc.ABC):
        __counter = 0

        def __init__(self):
            cls = self.__class__
            prefix = cls.__name__
            index = cls.__counter
            self.name = '_{}#{}'.format(prefix, index)
            cls.__counter += 1

        def __get__(self, instance, owner):
            if instance is None:
                return self
            else:
                return getattr(instance, self.name)

        def __set__(self, instance, value):
            value = self.validate(instance, value)
            setattr(instance, self.name, value)

        @abc.abstractmethod
        def validate(self, instance, value):
            """return validated value or raise ValueError"""

Now, to add a new kind of check, you only need to add a class that inherits from Validated and implements the check in its validate method:

    class Quantity(Validated):
        def validate(self, instance, value):
            if value <= 0:
                raise ValueError('value must be > 0')
            return value

    class NonBlank(Validated):
        def validate(self, instance, value):
            value = value.strip()
            if len(value) == 0:
                raise ValueError('value cannot be empty or blank')
            return value

All the descriptors shown so far are classes. Can one be implemented with a function instead? Yes, it can:

    def quantity():
        try:
            quantity.counter += 1
        except AttributeError:
            quantity.counter = 0
        storage_name = '_{}:{}'.format('quantity', quantity.counter)

        def qty_getter(instance):
            return getattr(instance, storage_name)

        def qty_setter(instance, value):
            if value > 0:
                setattr(instance, storage_name, value)
            else:
                raise ValueError('value must be > 0')

        return property(qty_getter, qty_setter)
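On Python 3.6 and later, the problem of naming each descriptor's storage automatically can also be handled with the __set_name__ hook, which Python calls on a descriptor when the owning class is created. The following is a minimal sketch of that alternative, not the approach taken in the post; the storage_name attribute and the leading-underscore naming scheme are choices made for this example.

```python
class Quantity:
    """Validating descriptor that learns its attribute name automatically."""

    def __set_name__(self, owner, name):
        # Called once per attribute when the owner class body is executed.
        self.storage_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed through the class: return the descriptor itself.
            return self
        return getattr(instance, self.storage_name)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError("value must be > 0")
        setattr(instance, self.storage_name, value)


class Rectangle:
    height = Quantity()
    width = Quantity()

    def __init__(self, height, width):
        self.height = height
        self.width = width

    @property
    def area(self):
        return self.height * self.width


if __name__ == "__main__":
    print(Rectangle(10, 20).area)
```

With this hook no counter is needed, and the stored names (_height, _width) are easier to recognize when inspecting an instance's __dict__.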