Tag archive: PHP multi-threaded programming

PHP Multi-Threaded Programming: Pool in Detail

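A Pool object is a container for multiple Worker objects and also their controller. A thread pool is a higher-level abstraction over Worker functionality, including managing references in the way pthreads requires.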

Excerpted from stub.php in the examples directory of the pthreads package:

/**
 * pthreads extension stub file for code completion purposes
 */

...........

class Pool
{
    /**
     * The maximum number of Worker threads allowed in this Pool
     *
     * @var integer
     */
    protected $size;

    /**
     * The name of the Worker class for this Pool
     *
     * @var string
     */
    protected $class;

    /**
     * The array of Worker threads for this Pool
     *
     * @var array|Worker[]
     */
    protected $workers;

    /**
     * The array of Stackables submitted to this Pool for execution
     *
     * @var array|Threaded[]
     */
    protected $work;

    /**
     * The constructor arguments to be passed by this Pool to new Workers upon construction
     *
     * @var array
     */
    protected $ctor;

    /**
     * The numeric identifier for the last Worker used by this Pool
     *
     * @var integer
     */
    protected $last;

    /**
     * Construct a new Pool of Workers
     *
     * @param integer $size The maximum number of Workers this Pool can create
     * @param string $class The class for new Workers
     * @param array $ctor An array of arguments to be passed to new Workers
     *
     * @link http://www.php.net/manual/en/pool.__construct.php
     */
    public function __construct($size, $class, array $ctor = array()) {}

    /**
     * Shuts down all Workers, and collect all Stackables, finally destroys the Pool
     *
     * @link http://www.php.net/manual/en/pool.__destruct.php
     */
    public function __destruct() {}

    /**
     * Collect references to completed tasks
     *
     * Allows the Pool to collect references determined to be garbage by the given collector
     *
     * @param callable $collector
     *
     * @link http://www.php.net/manual/en/pool.collect.php
     */
    public function collect(callable $collector) {}

    /**
     * Resize the Pool
     *
     * @param integer $size The maximum number of Workers this Pool can create
     *
     * @link http://www.php.net/manual/en/pool.resize.php
     */
    public function resize($size) {}

    /**
     * Shutdown all Workers in this Pool
     *
     * @link http://www.php.net/manual/en/pool.shutdown.php
     */
    public function shutdown() {}

    /**
     * Submit the task to the next Worker in the Pool
     *
     * @param Threaded $task The task for execution
     *
     * @return int the identifier of the Worker executing the object
     */
    public function submit(Threaded $task) {}

    /**
     * Submit the task to the specific Worker in the Pool
     *
     * @param int $worker The worker for execution
     * @param Threaded $task The task for execution
     *
     * @return int the identifier of the Worker that accepted the object
     */
    public function submitTo($worker, Threaded $task) {}
}

This file exists to help IDEs with code completion. Its comments are more detailed than the official documentation.

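$size		Pool size, i.e. the maximum number of Workers
$class		Class name of the Worker
$workers	Array of Workers, holding up to $size Workers
$work		Array of work items, each a Stackable; it grows with every submit because it holds a reference to the submitted work
$ctor		Array of arguments for the Worker constructor
$last		Index of the most recently used Worker; the index is a key into the $workers array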

Next, look at the constructor:

public function __construct($size, $class, array $ctor = array()) {}

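$size is the size of the Pool and maps to the Pool's $size property. $class is the Worker class name and maps to $class. $ctor is an array of arguments passed to the Worker constructor and maps to $ctor. The Pool creates up to $size Worker objects of type $class (the stub describes $size as a maximum, and as far as I know the Workers are created on demand as tasks are submitted rather than all at once), and each Worker is constructed with the elements of $ctor as its constructor arguments.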

The Pooling.php example from examples:

<?php
class WebWorker extends Worker {

	public function __construct(SafeLog $logger) {
		$this->logger = $logger;
	}
	
	protected $logger;	
}

class WebWork extends Stackable {
	
	public function isComplete() {
		return $this->complete;
	}
	
	public function run() {
		$this->worker
			->logger
			->log("%s executing in Thread #%lu",
				  __CLASS__, $this->worker->getThreadId());
		$this->complete = true;
	}
	
	protected $complete;
}

class SafeLog extends Stackable {
	
	protected function log($message, $args = []) {
		$args = func_get_args();	
		
		if (($message = array_shift($args))) {
			echo vsprintf(
				"{$message}\n", $args);
		}
	}
}


$pool = new Pool(8, \WebWorker::class, [new SafeLog()]);

$pool->submit($w=new WebWork());
$pool->submit(new WebWork());
$pool->submit(new WebWork());
$pool->submit(new WebWork());
$pool->submit(new WebWork());
$pool->submit(new WebWork());
$pool->submit(new WebWork());
$pool->submit(new WebWork());
$pool->submit(new WebWork());
$pool->submit(new WebWork());
$pool->submit(new WebWork());
$pool->submit(new WebWork());
$pool->submit(new WebWork());
$pool->submit(new WebWork());
$pool->shutdown();

$pool->collect(function($work){
	return $work->isComplete();
});

var_dump($pool);

Output:

/usr/local/php-5.5.15/bin/php Pooling.php 
WebWork executing in Thread #140585499506432
WebWork executing in Thread #140585419667200
WebWork executing in Thread #140585409177344
WebWork executing in Thread #140585398687488
WebWork executing in Thread #140585388197632
WebWork executing in Thread #140585377707776
WebWork executing in Thread #140585366894336
WebWork executing in Thread #140585499506432
WebWork executing in Thread #140585388197632
WebWork executing in Thread #140585377707776
WebWork executing in Thread #140585409177344
WebWork executing in Thread #140585017014016
WebWork executing in Thread #140585398687488
WebWork executing in Thread #140585419667200
object(Pool)#1 (6) {
  ["size":protected]=>
  int(8)
  ["class":protected]=>
  string(9) "WebWorker"
  ["workers":protected]=>
  array(0) {
  }
  ["work":protected]=>
  array(0) {
  }
  ["ctor":protected]=>
  array(1) {
    [0]=>
    object(SafeLog)#2 (0) {
    }
  }
  ["last":protected]=>
  int(6)
}

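Because the Pool object is dumped at the very end, the Workers have already been destroyed, which is why workers shows array(0). Note that last is 6 while the Pool size is 8 and 14 Work objects were submitted in total. Tasks are handed out round-robin, so the 14th Work went to the Worker at index 13 % 8 = 5; last then appears to be advanced after each submit, which leaves it at 14 % 8 = 6, the index the next submit would use. The output is consistent with this: six thread IDs appear twice and two appear once.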
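The arithmetic behind last can be checked with a small simulation in plain PHP. This is only a sketch: simulateSubmits() is a hypothetical helper, not part of pthreads, and it assumes the Pool wraps its index back to 0 when it reaches $size and increments it after every submit.

```php
<?php
// Hypothetical helper (not part of pthreads): simulates round-robin
// selection of a worker index for a number of submits.
// Assumption: the index wraps to 0 at $size and is incremented after use.
function simulateSubmits($size, $submits) {
    $last = 0;
    $used = array();
    for ($i = 0; $i < $submits; $i++) {
        if ($last >= $size) {
            $last = 0;          // wrap around to the first worker
        }
        $used[] = $last;        // this submit goes to worker index $last
        $last++;                // advance for the next submit
    }
    return array('used' => $used, 'last' => $last);
}

$r = simulateSubmits(8, 14);
echo "14th task ran on worker index ", end($r['used']), "\n"; // 5
echo "last = ", $r['last'], "\n";                              // 6
```

Under these assumptions indices 0 to 5 each receive two tasks and indices 6 and 7 one task each, which matches the thread IDs in the output above.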

Original article. Please keep the source attribution when reposting.
Permalink: http://blog.ifeeline.com/1136.html

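PHP Multi-Threaded Programming: Recycling Contexts (Official Documentation)

Taken from http://pthreads.org/tutorials/recycle.html. I originally added a rough Chinese translation; the original English is kept below together with my notes.

Recycling Contexts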

In line with our mantra, always consider employing the Worker and Stackable model of execution.

The Worker model allows a context (Thread) to be reused, so if many tasks need to be executed asynchronously to the process (or creating Thread), but those tasks themselves are suitable to be executed synchronously then using a single context rather than many is a huge win.
Worker and Stackable Brief
The patterns used to create Workers and Stackable are much the same as those used to create Threads.

class WebWorker extends Worker {
    public function run(){}
}
 
class WebRequest extends Stackable {
    public $url;
    public $data;
     
    public function __construct($url) {
        $this->url = $url;
    }
     
    public function run() {
        $response = file_get_contents($this->url);
         
        if ($response) {
            /* process response */
             
            $this->data = array($response);
        }
    }
}
 
$worker = new WebWorker();
$worker->start();
 
$work = array();
 
/* create some random work */
foreach (range(0, 10) as $index) {
    /* retain reference to the work */
    $work[$index] = new WebRequest(
        "http://pthreads.org/?query={$index}");
         
    /* stack the work for execution */
    $worker->stack($work[$index]);
}
 
/* shutting down waits for the execution of 
    anything previously stacked, then joins the Worker */
$worker->shutdown();
 
foreach ($work as $task) {
    var_dump ($task->data);
}

In the example above, many requests are processed by the same context. The aim should be to always use Workers and Stackables wherever possible, it is much more efficient to create one context (Worker), rather than many contexts (Threads).

Asynchronous Logging
A good example of using Workers in the real world might be an implementation of asynchronous logging.
In the real world, an asynchronous logger would be much more complex than the following example, for the purposes of a tutorial we will just write logs to standard output.

class LogEntry extends Stackable {
    public $message;
    public $args;
      
    public function __construct($message, $args = null) {
        $this->message = $message;
        $this->args = $args;
    }
      
    public function run() {
        /* for simplicity */
        vprintf(
            "{$this->message}\n", $this->args);
    }
}
  
class Logger extends Worker {
    static $instance = null;
    static $work = [];
 
    public function __construct() {
        $this->start();
    }
      
    public static function log($message, $args = null) {
        if (self::$instance == null) {
            self::$instance = new Logger();
        }
 
        $args = func_get_args();
        if ($args) {
            $wid = count(self::$work);
            self::$work[$wid]=new LogEntry(
                array_shift($args),
                $args
            );
            self::$instance->stack(self::$work[$wid]);
        }
    }
      
    public function run() {}
}
Logger::log("Hello %s", "World");
Logger::log("Bye :)");

In the example above, it is the Worker that actually performs the writing of logs – however that is performed.

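Note: in this example the Worker never exits. Each call to log() instantiates a LogEntry and stacks it, and the LogEntry's run() method then performs the logging in another thread.

Permalink: http://blog.ifeeline.com/1125.html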

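PHP Multi-Threaded Programming: Introduction (Official Documentation)

This document comes from https://gist.github.com/krakjoe/6437782. It covers a good deal about multi-threaded programming in PHP and is worth reading. I originally translated it into Chinese, hoping to do a little better than machine translation; the original English is kept below together with my notes.

A Brief Introduction to Multi-Threading in PHP
Foreword
Execution
Sharing
Synchronization
Pitfalls
WTF ??

Preface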

If you are a PHP programmer who spends a lot of time at the console, or someone who is interested in high performance modern programming of PHP, this document is for you.
The intention here is to provide information concise and short enough that you (and the community at large) remember it; in the hope that one day all of this will be common knowledge among PHP programmers.
By the end of the document, you should have a clear understanding of how, and why, pthreads exists, and executes.
If you have any comments, suggestions or insults please forward them to krakjoe@php.net

Insults will be ignored.

Foreword

Since PHP4, May 22nd 2000, PHP has been equipped to execute isolated instances of the interpreter in multiple threads within a single process without any context interfering with another. We call this TSRM, it is a rarely studied omnipresent part of PHP that nobody really talks about.
If you have ever used XAMPP or PHP on Windows, it’s likely that you used a threaded PHP without even knowing it.
TSRM has the ability to create isolated instances of the interpreter, which is how pthreads executes userland threads in PHP. The instances of the interpreter are as isolated as they are when executing any threaded build of PHP, the Apache2 Worker MPM PHP5 Module for example. The job of pthreads is to facilitate communication and synchronization between the otherwise isolated contexts.
Exactly how TSRM works is beyond the scope of this document, and would only confuse the reader (and subject), suffice to say that PHP has been able to work in a multi-threaded environment for more than a decade. The implementation is stable; there is however one well known, but completely misunderstood pitfall, which I shall explain the facts of, and clarify: PHP is a wrapper around third parties, every part of PHP is implemented like this, if a third party does not implement their library in a re-entrant (thread safe) way then the PHP wrapper for that library will fail and or cause unexpected behaviour during execution. A well known example of such a library is locale. It should be clear that this is beyond the control of PHP or pthreads. Such libraries are well known (documented) and or obvious, the vast majority of extensions will have no problem executing in a pthreads application.
Threading in user land was never a concern for the PHP team, and it remains as such today. You should understand that in the world where PHP does its business, there’s already a defined method of scaling – add hardware. Over the many years PHP has existed, hardware has got cheaper and cheaper and so this became less and less of a concern for the PHP team. While it was getting cheaper, it also got much more powerful; today, our mobile phones and tablets have dual and quad core architectures and plenty of RAM to go with it, our desktops and servers commonly have 8 or 16 cores, 16 and 32 gigabytes of RAM, though we may not always be able to have two within budget and having two desktops is rarely useful for most of us.
In addition to the concerns of the PHP team, there are concerns of the programmer: PHP was written for the non-programmer, it is many hobbyists native tongue. The reason PHP is so easily adopted is because it is an easy language to learn and write. Multi-threaded programming is not easy for most, even with the most coherent and reliable API, there are different things to think about, and many misconceptions. The PHP group do not wish for user land multi-threading to be a core feature, it has never been given serious attention – and rightly so. PHP should not be complex, for everyone.
All things considered, there are still benefits to be had from allowing PHP to utilize its production ready and tested features to allow a means of making the most out of what we have, when adding more isn’t always an option, and for a lot of tasks is never really needed if you can take advantage of all you have.

A note about nothing, or more precisely, sharing nothing: The architecture of PHP is referred to as Shared Nothing, this simply means that whenever PHP services a request, via any SAPI, its environment, in the sense of the data structures PHP requires to operate, are isolated from one another. On the surface, pthreads would appear to violate this standard and break the architecture that keeps PHP executing. Relax, this is not so. In fact, another job of pthreads (that is never evident to the programmer) is to maintain that architecture; it does this utilizing copy-on-read and copy-on-write semantics and carefully programmed mutex manipulation. The upshot of this is, any time a user does anything, in the sense of reading or writing to an object, or executing its methods, it is safe to assume that the operation was safe and there is no need for further action like the explicit use of mutex by the programmer.
Terms in the foreword that are new to the reader should now be researched, as they may appear throughout this document

Execution

Threading is about dividing your instructions into units of execution, and distributing those units among your processors &| cores in such a way as to maximize the throughput of your application.

This should always be done using as few threads as possible.
pthreads exposes two models of execution. The Thread model and the Worker model, they expose much of the same functionality to the programmer, and are internally very similar, with one key difference: what they consider to be the unit of execution.
A Thread is representative of both an interpreter context and a unit of execution (that’s its ::run method).
A Worker is representative of an interpreter context; its ::run method is used to configure that context. The unit of execution for this model is the Stackables, more precisely Stackable::run.
(My note: a Worker runs the run() method of every object pushed onto its stack in a separate thread; in effect this reuses the thread that executed Worker::run. See http://blog.ifeeline.com/1115.html.)
When the programmer calls Thread::start, a new thread is created, a PHP interpreter context is initialized and then (safely) manipulated to mirror the context that made the call to ::start. Execution continues concurrently in both contexts at this point. Execution in the Thread is passed to the ::run method of the Thread. At the end of the ::run method the context for the Thread is destroyed.
(My note: my reading of this paragraph is that calling start() initializes an interpreter instance, which then interprets and executes the code in run().)
When the programmer calls Worker::start, a new thread is created, a PHP interpreter context is again initialized in the same way as a normal Thread, when execution in the Worker leaves Worker::run, the Worker begins to pop Stackables from the stack and execute them in the order they were stacked. If there are no items on the stack the Worker will wait for some to appear. The Worker will continue to do this until Worker::shutdown is called. If Worker::shutdown is called while items remain on the stack they will be executed first and the context that called Worker::shutdown will block until shutdown can occur.
(My note: this paragraph explains how a Worker operates; it is important to understand.)
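The stack-then-shutdown behaviour described above can be pictured with a toy, single-threaded model in plain PHP. ToyWorker and ToyTask are hypothetical names made up for this sketch; nothing here uses pthreads or real threads. It only shows the ordering: tasks are executed in the order they were stacked, and shutdown() returns only after everything queued has run.

```php
<?php
// Toy, single-threaded model of a Worker's stack. ToyTask and ToyWorker
// are hypothetical classes for illustration, not part of pthreads.
class ToyTask {
    public $name;

    public function __construct($name) {
        $this->name = $name;
    }

    public function run() {
        return "ran {$this->name}";
    }
}

class ToyWorker {
    private $stack;
    public $log = array();

    public function __construct() {
        $this->stack = new SplQueue();
    }

    // Like Worker::stack(): remember the task for later execution.
    public function stack(ToyTask $task) {
        $this->stack->enqueue($task);
    }

    // Like Worker::shutdown(): run everything still queued, in the order
    // it was stacked, and only then return to the caller.
    public function shutdown() {
        while (!$this->stack->isEmpty()) {
            $this->log[] = $this->stack->dequeue()->run();
        }
    }
}

$worker = new ToyWorker();
foreach (array('a', 'b', 'c') as $name) {
    $worker->stack(new ToyTask($name));
}
$worker->shutdown();
print_r($worker->log);
```

A real Worker runs the tasks in its own thread while the caller keeps going; the model leaves that out on purpose and keeps only the queue discipline.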
Great care should be taken to avoid wasting contexts unnecessarily, starting a Thread or Worker is not free. Where you can, use the Worker model, this almost eliminates the tendency to be wasteful while multi-threading. Almost, but not completely …
(My note: in other words, starting one carries real overhead.)
There is a tendency to be wasteful; it’s a common misunderstanding to think that threading anything can make it faster, it cannot. More threads does not always equate to more throughput, in the same way as more water does not always equate to wetter.
Thinking outside the box is a prerequisite of a good multi-threaded programmer; common sense should dictate that more water does mean wetter, but if you consider the central point of the bottom of the bowl: Once it is wet, it does not matter how much water you place on top, it cannot get wetter …
(My note: an odd analogy, but the point is simply that the more threads you start, the more resources you consume; system performance can drop and efficiency may suffer, so more threads does not always mean faster.)
Too much water, or threads, and you will drown.
The author of pthreads will not take responsibility for drowning programmers, or their code.

Sharing

Threading would be rather useless if threads could not manipulate a common set of data, which appears to be a problem in a shared nothing architecture. I don’t see shared nothing as a hindrance, I see it as a rather big helpful push in the right direction.
One of the normal problems for a programmer writing multi-threaded code is the safety and synchronization of data, it is normally very very easy to corrupt an array if 10 threads manipulate it at once.
Shared Nothing solves this problem; if no two contexts ever manipulate the same data then they cannot corrupt each others stack, the architecture is maintained along with its stability.
Objects descending from pthreads utilize a thread safe member storage table that works slightly differently to any other objects. When you write a member to such an object, the table is locked, the data is copied, and then stored in the table and the lock is released. When a subsequent read of that member occurs, the table is locked, the data is copied for return and the lock is released. This means that no two contexts ever manipulate the same physical data – Share Nothing.
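The copy-on-write, copy-on-read behaviour of the member table can be imitated in plain PHP. This is a toy model only: CopyingStore is a hypothetical class, there is no locking, and serialize() merely stands in for whatever copying pthreads performs internally. What it shows is the consequence for the programmer: a value read from the table is a private copy, so changing it does not change what is stored.

```php
<?php
// Toy model only: CopyingStore is a hypothetical class, and serialize()
// stands in for the copying that pthreads does internally.
class CopyingStore {
    private $table = array();

    public function write($name, $value) {
        $this->table[$name] = serialize($value);   // store a copy
    }

    public function read($name) {
        return unserialize($this->table[$name]);   // hand back a fresh copy
    }
}

$store = new CopyingStore();
$store->write('arr', array(1, 2, 3));

$copy = $store->read('arr');
$copy[] = 4;                            // changes only the local copy

echo count($store->read('arr')), "\n";  // 3: the stored value is untouched

$store->write('arr', $copy);            // write the whole value back
echo count($store->read('arr')), "\n";  // 4
```

If the model is a fair picture, it suggests a habit for real pthreads code: read a member, modify the local copy, then assign the whole value back to the member.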
Some data does not lend itself to being easily copied, PHP has a solution to this in the form of the serialization API. Serialization is utilized on arrays, and objects not descended from pthreads. Objects descended from pthreads are never serialized, as such you should always use pthreads objects as containers for data you intend to manipulate in multiple contexts.
(My note: the way to share data across threads is to use pthreads-derived objects as containers, because operations on such objects are thread safe, as the previous paragraph explained.)
All objects descending from pthreads can be manipulated, by any context with a reference, as arrays and objects, they also include methods for manipulating members in a thread safe manner. There shouldn’t be a kind of data set you cannot implement with what is exposed by pthreads, and basic sets (arrays) are built in.
(My note: that is, the methods of pthreads-derived objects can manipulate the object's members in a thread-safe way.)

This is all done in such a way that minimizes memory usage while still maintaining architecture and safety. It may seem wasteful, but it’s a small price to pay, that diminishes with the price of memory.

Synchronization

Sharing isn’t enough, the last piece of the puzzle is synchronization. This is going to be a topic completely alien to a lot of programmers.
While you are executing, and sharing, you must also be able to control when to share, and when to execute; it is no good trying to manipulate data that does not exist !!
Synchronization can be used to put a thread into a receptive, but sleepy state, known as waiting, and can be used to awaken such a thread, known as notifying.
Synchronizing with a unit of execution is easy, but does come with a danger of misuse, which I hope to give a brief, simple explanation of that will stick in your mind and help you to avoid misuse.
Make this your mantra: Only ever wait FOR something

$this->synchronized(function(){
    $this->wait();
});

The above code looks simple enough, but what or who is it waiting for, and what happens if whatever they are waiting for has already sent notification … waiting forever is the price for not paying attention to your own mantra.
The syntax of synchronization may look a bit strange, here’s an explanation that gives you a good reason to keep typing all that stuff: when you call ::synchronized a mutex (lock) is acquired, when you call ::wait, that mutex is atomically locked and unlocked to allow other contexts to acquire it while the waiting context blocks on a condition waiting for notification.

Waiting for something looks like this:


$this->synchronized(function(){
    if (!$this->data) {
        $this->wait();
    }
});
/* I can manipulate $this->data and know it exists */

While notification looks like this:

$that->synchronized(function($that){
    $that->data = "some";
    $that->notify();
}, $that);

In the notification example, you ensure that the context that is waiting is not left hanging around forever because if you have acquired the synchronization lock and the object is not waiting then it need not wait (by the time it can acquire the synchronization lock the data is already set). A call to notify will ensure if you managed to acquire the synchronization lock because it was atomically released by the waiting thread, the waiting thread is awoken and will continue executing.
This kind of explicit synchronization can make for powerful programming, study it well.

Pitfalls

The garbage collection built into PHP was never prepared for this kind of prolonged execution, if pthreads followed the PHP way and edited reference counts of objects when we accepted them (as an argument to a method, or as the data for a member property), then memory usage soars, it becomes difficult to retain control of your own code.
So we do not do the done thing; in a pthreads application, you are responsible for the objects you create, you are also responsible for retaining a reference to objects that are going to be executed, or accessed from other executing contexts, until that execution or access has taken place.
(My note: this concept is important; you have to manage what you create yourself.)
This circumvents the problem of out of control memory usage, but it creates another problem; dreaded segfaults.
Segmentation faults occur when you instruct a processor to address memory that it cannot access, they result in abortion of execution. The prime suspect when you encounter segmentation faults during development is objects being referenced that were already destroyed in the context that originally created the object.
Avoiding these segmentation faults sounds much more complex than it in reality is, this can be illustrated best with a (bad) example:

class W extends Worker {
    public function run(){}
}
class S extends Stackable {
    public function run(){}
}
/* 1 */
$w = new W();
/* 2 */
$j = array(
    new S(), new S(), new S()
);
/* 3 */
foreach ($j as $job)
    $w->stack($job);
/* 4 */
$j = array();
$w->start();
$w->shutdown();

The above example will always segfault; steps 1-3 are perfectly normal, but before the Worker is started the stacked objects are deleted, resulting in a segfault when the Worker is allowed to start. Your code will not always look so explicit, but if you can see a route where this could conceivably happen, then program a different way.
(My note: the current thread must make sure the other threads have returned, that is, join them. When using a Worker, always call shutdown() at the end, otherwise you may hit a segfault.)
Other symptoms of this kind of programming error are the fatal error

Call to a member function member() on a non-object in /my/code.php

and the notice

Trying to get property of non-object in /my/code.php

If you experience these errors, carefully look over your code and make sure everything you have passed to any other context exists all the time it is being referenced or executed in any other context.

This is probably the hardest part of creating applications with pthreads, but it doesn’t take a lot to avoid; plan with care, and program with even more care.
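For contrast with the bad example above, here is a sketch of the same steps with the references kept alive until the Worker has finished. It is my own rearrangement, not code from the original document. It assumes pthreads 2.x (where both Worker and Stackable exist); KeepW, KeepS and $finished are names invented for the sketch, and the guard simply skips everything when the extension is missing.

```php
<?php
// Assumes pthreads 2.x, where Worker and Stackable both exist.
// KeepW, KeepS and $finished are hypothetical names for this sketch.
if (extension_loaded('pthreads') && class_exists('Stackable')) {
    class KeepW extends Worker {
        public function run() {}
    }
    class KeepS extends Stackable {
        public function run() {}
    }

    $w = new KeepW();
    $w->start();

    /* keep a reference to every job for as long as the Worker may touch it */
    $jobs = array(new KeepS(), new KeepS(), new KeepS());
    foreach ($jobs as $job) {
        $w->stack($job);
    }

    /* shutdown() blocks until the stacked jobs have been executed */
    $w->shutdown();

    /* only now are the references dropped */
    $jobs = array();
    $finished = true;
} else {
    echo "pthreads 2.x is not available; nothing to run\n";
    $finished = false;
}
```

The only change that matters is the order: the array holding the jobs is emptied after shutdown() returns, not before the Worker gets to run them.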

WTF ??

I hear the criticism that I have taken something simple, that’s PHP, and made it more complex by exposing this kind of functionality. I hear you; I would argue that I have taken something complex, and made it relatively simple.

Something being complex, or difficult, is no kind of justification for avoiding it. The complexity of anything should decrease as your knowledge increases, if it does not, then you are not taking in the right kind of information. This is the nature of learning.

To the idea that I haven’t made anything simple; oh rly? If the task is simple: get two things done at once, the implementation is simple. The fact that you are even considering complex ideas is the thing you should be paying attention to!!

To the rest of the nay-sayers: Progress is made by pushing forwards, when we all push at once, we make more progress !!

Even if you hate the idea, I hope I’ve said enough to convince you to give it a try before you form a long lasting opinion that will affect your decisions in the future, what is the worst that can happen !?

Permalink: http://blog.ifeeline.com/1118.html

PHP Multi-Threaded Programming: the Fetch Example

This example comes from the examples in pthreads-2.0.7, the PECL extension package for PHP.

<?php
class TestObject {
	public $val;
}

class Fetching extends Thread {
	public function run(){
		echo "Begin Fetching run method: ".Thread::getCurrentThreadId()."\n";
		/*
		* of course ...
		*/
		$this->sym = 10245;
		$this->arr = array(
			"1", "2", "3"
		);
		
		/*
		* objects do work, no preparation needed ...
		* read/write objects isn't finalized ..
		* so do the dance to make it work ...
		*/
		$obj = new TestObject();
		$obj->val = "testval";
		$this->obj = $obj;
		
		/*
		* will always work
		*/
		$this->objs = serialize($this->obj);
		
		/*
		* nooooooo
		*/
		$this->res = fopen("php://stdout", "w");
		
		/*
		* tell the waiting process we have created symbols and fetch will succeed
		*/

		$this->synchronized(function(){
			echo "Begin ".Thread::getCurrentThreadId()." notify.\n"; 
		    	$this->notify();
			echo "End ".Thread::getCurrentThreadId()." notify.\n";
		});
		
		/* wait for the process to be finished with the stream */
		$this->synchronized(function(){
		    echo "Begin ".Thread::getCurrentThreadId()." wait.\n";
			$this->wait();
			echo "End ".Thread::getCurrentThreadId()." wait.\n";
		});
		echo "End Fetching run method: ".Thread::getCurrentThreadId()."\n";
	}
}

$thread = new Fetching();

$thread->start();

$thread->synchronized(function($me){
	echo "Begin ".Thread::getCurrentThreadId()." wait.\n";
    $me->wait();
	echo "End ".Thread::getCurrentThreadId()." wait.\n";
}, $thread);
/*
* we just got notified that there are symbols waiting
*/
foreach(array("sym", "arr", "obj", "objs", "res") as $symbol){
	printf("\$thread->%s: ", $symbol);	
	$fetched = $thread->$symbol;
	if ($fetched) {
		switch($symbol){
			/*
			* manual unserialize
			*/
			case "objs":
				var_dump(unserialize($fetched));
			break;
			
			default: var_dump($fetched);
		}
	}
	printf("\n");
}

/* notify the thread so it can destroy resource */
$thread->synchronized(function($me){
    $me->notify();
}, $thread);

To make the flow visible I added some echo statements. Here is the output:

/usr/local/php-5.5.15/bin/php Fetch.php
Begin Fetching run method: 140107248678656
Begin 140107248678656 notify.
End 140107248678656 notify.
Begin 140107248678656 wait.
Begin 140107443247040 wait.

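As soon as the thread object's start() method is called, its run() method begins executing. Here the thread ID seen inside run() is 140107248678656, and run() keeps going until it blocks at the wait() near its end. The thread ID stays the same the whole way, which shows that all of this code runs in one thread.

Next, the following code runs: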

$thread->synchronized(function($me){
	echo "Begin ".Thread::getCurrentThreadId()." wait.\n";
    $me->wait();
	echo "End ".Thread::getCurrentThreadId()." wait.\n";
}, $thread);

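This code blocks the current thread (Begin 140107443247040 wait.). It runs in the same thread as the script's global code, so nothing after it executes. The notify() in run() had already fired before this wait() began, so nothing is left to wake the main thread; both threads are now blocked and the program hangs forever.

The other case does not deadlock: if the code above runs before the notify() in run() executes, the main thread is woken up: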

/usr/local/php-5.5.15/bin/php Fetch.php
Begin Fetching run method: 139715053770496
Begin 139715248338880 wait.
Begin 139715053770496 notify.
End 139715053770496 notify.
End 139715248338880 wait.
Begin 139715053770496 wait.
$thread->sym: int(10245)

$thread->arr: array(3) {
  [0]=>
  string(1) "1"
  [1]=>
  string(1) "2"
  [2]=>
  string(1) "3"
}

$thread->obj: object(TestObject)#2 (1) {
  ["val"]=>
  string(7) "testval"
}

$thread->objs: object(TestObject)#2 (1) {
  ["val"]=>
  string(7) "testval"
}

$thread->res: resource(4) of type (stream)

End 139715053770496 wait.
End Fetching run method: 139715053770496

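Within a single thread, once the code blocks, nothing after it runs until the thread is woken up again. A thread object's run() method executes in its own thread, and the global code also runs in a thread of its own.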
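The hang in the first run comes from waiting unconditionally. A way around it, following the "only ever wait FOR something" advice quoted earlier, is to pair each wait() with a flag and only wait while the flag is unset. The sketch below is my own variation, not the shipped example: ReadyFetching, $ready and $done are names I made up, it assumes the pthreads extension is loaded, and it does nothing when it is not.

```php
<?php
// Sketch only: ReadyFetching, $ready and $done are hypothetical names.
// Requires the pthreads extension; without it the script does nothing.
if (!extension_loaded('pthreads')) {
    echo "pthreads is not loaded; nothing to run\n";
    $sym = null;
} else {
    class ReadyFetching extends Thread {
        public $ready;  // set once the symbols exist
        public $done;   // set once the main thread has read them
        public $sym;

        public function run() {
            $this->sym = 10245;

            /* publish the flag and notify under the same lock */
            $this->synchronized(function () {
                $this->ready = true;
                $this->notify();
            });

            /* wait FOR something: the main thread setting $done */
            $this->synchronized(function () {
                while (!$this->done) {
                    $this->wait();
                }
            });
        }
    }

    $thread = new ReadyFetching();
    $thread->start();

    /* if run() already notified, $ready is set and no wait() happens */
    $thread->synchronized(function ($me) {
        while (!$me->ready) {
            $me->wait();
        }
    }, $thread);

    $sym = $thread->sym;

    $thread->synchronized(function ($me) {
        $me->done = true;
        $me->notify();
    }, $thread);

    $thread->join();
}
```

Because each flag is checked while holding the synchronization lock, it should no longer matter which side gets there first: a notification that has already been sent leaves the flag set, so the other side skips the wait.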

Permalink: http://blog.ifeeline.com/1111.html