
dispy: a distributed parallel computing framework built on asyncoro

    jamiesun · 2012-09-03 15:13:48 +08:00 · 3543 views
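    dispy is a distributed parallel computing framework built on asyncoro, and a good illustration of what asyncoro can do.

    The framework is also very lean, with only four components: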

    * dispy.py (client) provides two ways of creating "clusters": JobCluster when only one instance of dispy may run and SharedJobCluster when multiple instances may run (in separate processes). If JobCluster is used, the scheduler contained within dispy.py will distribute jobs on the server nodes; if SharedJobCluster is used, a separate scheduler (dispyscheduler) must be running.

    * dispynode.py executes jobs on behalf of dispy. dispynode must be running on each of the (server) nodes that form the cluster.

    * dispyscheduler.py is needed only when SharedJobCluster is used; this provides a scheduler that can be shared by multiple dispy users.

    * dispynetrelay.py is needed when nodes are located across different networks; this relays information about nodes on a network to the scheduler. If all the nodes are on the same network, there is no need for dispynetrelay: the scheduler and nodes automatically discover each other.

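    In most cases, dispy and dispynode are enough to get the job done.

    1. Server side:

    dispynode is the server-side component. It needs no code; you just run it as a daemon with the right arguments, for example: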

    dispynode.py -c 2 -i 192.168.0.10 -p 51348 -s secret

    This instance uses 2 CPU cores and serves on 192.168.0.10:51348; secret is the shared key used to encrypt messages. For more options, see [http://dispy.sourceforge.net/dispynode.html](http://dispy.sourceforge.net/dispynode.html)
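If you start nodes from a script, it can help to build that command line in one place. A minimal sketch, using only the four flags shown above (-c, -i, -p, -s); the helper name dispynode_argv is made up for illustration and is not part of dispy:

```python
import shlex


def dispynode_argv(cpus, ip_addr, port, secret, program='dispynode.py'):
    # Hypothetical helper: assembles the dispynode command line from the
    # flags used in the post. It does not validate the values.
    return [program, '-c', str(cpus), '-i', ip_addr, '-p', str(port), '-s', secret]


if __name__ == '__main__':
    argv = dispynode_argv(2, '192.168.0.10', 51348, 'secret')
    # Print a shell-safe version of the command for inspection.
    print(' '.join(shlex.quote(arg) for arg in argv))
    # To actually launch the node (needs dispy installed on this host):
    # import subprocess; subprocess.Popen(argv)
```

Passing the list form to subprocess avoids shell quoting problems if the secret contains spaces or special characters.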

    2. Client side:

    A simple example:

    #!/usr/bin/env python
    def compute(n):
        # imports go inside the function because it runs on the remote node
        import time, socket
        time.sleep(n)
        host = socket.gethostname()
        return (host, n)

    if __name__ == '__main__':
        import dispy, random
        cluster = dispy.JobCluster(compute, nodes=['192.168.0.10', '192.168.3.11'])
        jobs = []
        for n in range(20):
            job = cluster.submit(random.randint(5, 20))
            job.id = n  # tag each job so it can be identified later
            jobs.append(job)
        # cluster.wait()
        for job in jobs:
            host, n = job()  # waits for the job to finish and returns its result
            print('%s executed job %s at %s with %s' % (host, job.id, job.start_time, n))
            # other fields of 'job' that may be useful:
            # print(job.stdout, job.stderr, job.exception, job.ip_addr, job.start_time, job.end_time)
        cluster.stats()
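If you don't have a cluster handy, the same submit-then-collect pattern can be tried on a single machine. Note that the sketch below is not dispy: it swaps in the standard library's concurrent.futures.ThreadPoolExecutor as a local stand-in, and run_locally is a made-up helper name.

```python
import socket
import time
from concurrent.futures import ThreadPoolExecutor


def compute(n):
    # Same shape as the dispy job above: sleep, then report where it ran.
    time.sleep(n)
    return (socket.gethostname(), n)


def run_locally(delays, workers=4):
    # Submit everything first, then collect results in submission order,
    # mirroring the cluster.submit(...) loop followed by the job() loop.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(compute, d) for d in delays]
        return [f.result() for f in futures]


if __name__ == '__main__':
    for job_id, (host, n) in enumerate(run_locally([0.01, 0.02, 0.03])):
        print('%s executed job %s with %s' % (host, job_id, n))
```

The point is only the control flow: all jobs are queued up front, and the caller blocks on each result in turn. With dispy the work is spread across nodes instead of local threads.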


    JobCluster can also hand results to a callback:

    dispy.JobCluster(compute,nodes=['192.168.0.10', '192.168.3.11'],callback=callback)
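The post doesn't show what callback looks like, so here is one possible shape. It assumes dispy calls the function with the job object, and that the job exposes its return value as job.result next to the job.id and job.exception fields mentioned above; treat those as assumptions and check them against the dispy docs for your version. The stand-in jobs at the bottom are fake objects so the sketch runs without a cluster.

```python
import threading
from types import SimpleNamespace

results = {}
failures = {}
results_lock = threading.Lock()


def callback(job):
    # Assumption: the callback may run in a different thread than the
    # main program, so shared state is guarded with a lock.
    with results_lock:
        if getattr(job, 'exception', None):
            failures[job.id] = job.exception
        else:
            # Assumption: 'result' holds what compute() returned.
            results[job.id] = job.result


if __name__ == '__main__':
    # Fake job objects for a dry run; a real cluster would supply these.
    callback(SimpleNamespace(id=0, result=('node-a', 5), exception=None))
    callback(SimpleNamespace(id=1, result=None, exception='Traceback ...'))
    print(results, failures)
```

Keeping the callback short and pushing heavier processing elsewhere is usually the safer design, since it runs while the cluster is still handing back other jobs.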

    Besides a Python function, the computation can also be a program that lives on the server nodes, for example:

    cluster = dispy.JobCluster('/some/program', nodes=['192.168.0.10'])

    You can also skip writing code altogether and use it as a command-line tool:

    dispy.py -f /some/file1 -f file2 -a "arg11 arg12" -a "arg21 arg22" -a "arg3" /some/program

    For the full documentation, see [http://dispy.sourceforge.net/index.html](http://dispy.sourceforge.net/index.html)

    If you find Celery a bit heavy, give [dispy](http://dispy.sourceforge.net/index.html) a try.