I read an article on Python garbage collection and am using this post to take notes. Original article: http://www.digi.com/wiki/developer/index.php/Python_Garbage_Collection
###Introduction to Python Memory Management
Unlike C and C++, Python allocates and frees memory fully automatically, using two mechanisms:
- reference counting
- garbage collection
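Before Python 2.0, reference counting was the only memory management mechanism.
How it works: Python records how many references point to each object. When a reference to the object is removed, the count goes down. Once the count reaches 0, the object is freed.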
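To make the counting rule concrete, here is a small sketch that assumes CPython. `Payload`, `refcount_deltas` and `freed_at_zero` are names made up for this illustration. It only looks at how the count changes, because the absolute number reported by `sys.getrefcount` includes temporary references and varies between interpreter versions.

```python
import sys
import weakref


class Payload:
    """Plain object used to observe reference counting (hypothetical name)."""


def refcount_deltas():
    """Return how the count changes after adding and after removing an alias."""
    obj = Payload()
    base = sys.getrefcount(obj)      # absolute value is version dependent
    alias = obj                      # one more reference to the same object
    after_alias = sys.getrefcount(obj) - base
    del alias                        # reference removed, count drops again
    after_del = sys.getrefcount(obj) - base
    return after_alias, after_del


def freed_at_zero():
    """Return True if the object is gone once its last reference is deleted."""
    obj = Payload()
    probe = weakref.ref(obj)         # a weak reference does not keep obj alive
    del obj                          # count reaches 0, CPython frees it at once
    return probe() is None


print(refcount_deltas(), freed_at_zero())
```

The weak reference is just a probe: it lets the code ask whether the object still exists without adding to its reference count.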
This approach is efficient, but it has some caveats. For example, it cannot reclaim objects that are caught in a reference cycle (which feels a bit like a deadlock).
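A reference cycle can be reproduced in a few lines. The sketch below assumes CPython; `Node` and `cycle_survives_refcounting` are hypothetical names. The automatic collector is switched off for the duration so that only reference counting is at work until `gc.collect()` is called explicitly.

```python
import gc
import weakref


class Node:
    """Object that can point at a partner (hypothetical name)."""

    def __init__(self, name):
        self.name = name
        self.partner = None


def cycle_survives_refcounting():
    """Return (alive before gc.collect(), alive after gc.collect())."""
    was_enabled = gc.isenabled()
    gc.disable()                     # keep the automatic collector out of the way
    try:
        a = Node("a")
        b = Node("b")
        a.partner = b                # a -> b
        b.partner = a                # b -> a, the cycle is closed
        probe = weakref.ref(a)
        del a, b                     # no outside references remain
        alive_before = probe() is not None   # counts never reached 0
        gc.collect()                 # the cycle detector finds the pair
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        if was_enabled:
            gc.enable()


print(cycle_survives_refcounting())
```

Each object keeps the other's count at 1, so neither count can reach 0 on its own; only the cycle detector can free them.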
Note that automatic garbage collection does not run once Python has already exhausted memory. At that point you either handle the exception that is raised, or the program crashes.
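One way to handle that exception is to collect garbage manually and retry once. This is only a sketch under assumptions: `call_with_gc_retry` is a hypothetical helper, and whether a retry succeeds depends on how much of the occupied memory was actually held by unreachable cycles. A `MemoryError` is not always recoverable.

```python
import gc


def call_with_gc_retry(task):
    """Run task(); on MemoryError, collect garbage once and try again.

    `task` is any zero-argument callable. A second MemoryError is not
    caught and propagates to the caller.
    """
    try:
        return task()
    except MemoryError:
        gc.collect()                 # free unreachable cycles before retrying
        return task()
```

Retrying only once keeps the helper from looping forever when the memory is genuinely in use.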
This is aggravated by the fact that the automatic garbage collection places high weight upon the NUMBER of free objects, not on how large they are. Thus any portion of your code which frees up large blocks of memory is a good candidate for running manual garbage collection.
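In CPython this shows up in the collector's tuning knobs, which are expressed as object counts rather than bytes. The sketch below reads them with `gc.get_threshold()` and `gc.get_count()`; `gc_trigger_info` is a hypothetical helper name, and the default values differ between Python versions, so none are assumed here.

```python
import gc


def gc_trigger_info():
    """Return the count-based trigger for the youngest generation."""
    threshold0, _threshold1, _threshold2 = gc.get_threshold()
    count0, _count1, _count2 = gc.get_count()
    return {
        # An automatic collection starts when allocations minus
        # deallocations of tracked objects exceed this number.
        "gen0_threshold": threshold0,
        # Current value of that allocation counter.
        "gen0_count": count0,
    }


print(gc_trigger_info())
```

Because the trigger is a count, freeing one very large object moves the counter no more than freeing one tiny object does.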
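Although reference cycles should be avoided in code as much as possible, you still need a way to deal with them.
Manual garbage collection is a good way to free the memory held by reference cycles.
The following creates several reference cycles: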
import sys, gc
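The listing above stops after its import line. As a stand-in, and not the original article's code, here is a minimal sketch that creates a handful of cycles and reclaims them manually. `make_cycles` and `collect_cycles` are hypothetical names, and CPython is assumed.

```python
import gc


def make_cycles(count):
    """Create `count` self-referencing lists and drop all outside references."""
    for _ in range(count):
        cell = []
        cell.append(cell)            # the list contains itself: a one-object cycle


def collect_cycles(count=5):
    """Return the number of unreachable objects gc.collect() reports."""
    gc.collect()                     # start from a clean slate
    was_enabled = gc.isenabled()
    gc.disable()                     # make sure only the manual call collects
    try:
        make_cycles(count)
        return gc.collect()          # at least one object per cycle created
    finally:
        if was_enabled:
            gc.enable()


print(collect_cycles())
```

`gc.collect()` returns the number of unreachable objects it found, which makes it easy to log how much garbage a given code path leaves behind.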
- Event-based: For example, when a user disconnects from the application or when the application is known to enter an idle state.
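- Do not run garbage collection too freely; it can hurt performance badly, because every memory object has to be evaluated.
- Run manual garbage collection once your application has started up and settled into steady-state operation.
- Run manual garbage collection after infrequently run sections of code which use and then free large blocks of memory.
- When a section of code is timing-sensitive, run manual garbage collection before or after it.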
- Do not run garbage collection too freely, as it can take considerable time to evaluate every memory object within a large system. For example, one team having memory issues tried calling gc.collect() between every step of a complex start-up process, increasing the boot time by 20 times (2000%). Running it more than a few times per day - without specific design reasons - is likely a waste of device resources.
- Run manual garbage collection after your application has completed start up and moves into steady-state operation. This frees potentially huge blocks of memory used to open and parse files, to build and modify object lists, and even code modules never to be used again. For example, one application reading XML configuration files was consuming about 1.5MB of temporary memory during the process. Without manual garbage collection, there is no way to predict when that 1.5MB of memory will be returned to the Python memory pools for reuse.
- Run manual garbage collection after infrequently run sections of code which use and then free large blocks of memory. For example, consider running garbage collection after a once-per-day task which evaluates thousands of data points, creates an XML 'report', and then sends that report to a central office via FTP or SMTP/email. One application doing such daily reports was creating over 800K worth of temporary sorted lists of historical data. Piggy-backing gc.collect() on such daily chores has the nice side-effect of running it once per day for 'free'.
- Consider manually running garbage collection either before or after timing-critical sections of code to prevent garbage collection from disturbing the timing. As an example, an irrigation application might sit idle for 10 minutes, then evaluate the status of all field devices and make adjustments. Since delays during system adjustment might affect field device battery life, it makes sense to manually run garbage collection as the gateway is entering the idle period AFTER the adjustment process - or run it every sixth or tenth idle period. This ensures that garbage collection won't be triggered automatically during the next timing-sensitive period.
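The guidelines above could be wired into an application roughly as follows. This is a sketch, not code from the original article: `collect_afterwards` and `IdleCollector` are hypothetical helpers, and the default interval of six idle periods simply echoes the irrigation example.

```python
import gc
import time
from contextlib import contextmanager


@contextmanager
def collect_afterwards(label, log=print):
    """Run a block of work, then collect once and report the cost."""
    try:
        yield
    finally:
        start = time.perf_counter()
        unreachable = gc.collect()
        elapsed = time.perf_counter() - start
        log(f"{label}: gc.collect() found {unreachable} objects in {elapsed:.4f}s")


class IdleCollector:
    """Collect on every Nth idle period instead of on every one."""

    def __init__(self, every=6):
        if every < 1:
            raise ValueError("every must be at least 1")
        self.every = every
        self._idle_periods = 0

    def on_idle(self):
        """Call when entering an idle period; return True if a collection ran."""
        self._idle_periods += 1
        if self._idle_periods % self.every == 0:
            gc.collect()
            return True
        return False


# Usage: piggy-back a collection on start-up or on a once-per-day task.
with collect_afterwards("startup"):
    config = [[i, str(i)] for i in range(1000)]   # stand-in for real parsing work
```

Logging the elapsed time of each manual collection gives a direct measure of whether it is being run too often for the size of the system.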