月度归档:2014年09月

PHP框架Yaf 之 运行流程

官方给的流程图:
yaf_sequence

Yaf是一个以PHP扩展方式添加的框架,它的内部运行过程无法像一般PHP框架一样可以打开源代码查看跟踪看个明白。但是它跟一般的的MVC框架的运行流程是大同小异的。特别是如果你熟悉Zend Framework框架的流程,你要熟悉Yaf的框架,基本上学习成本可以说是零。

Yaf借用了很多Zend Framework 1.x的概念。基本就是它的C语言实现版,不过它比Zend Framework的MVC实现简化了一些。下面总结一下运行流程。

首先从入口程序开始:

<?php
define('APPLICATION_PATH', dirname(__FILE__));

$application = new Yaf_Application( APPLICATION_PATH . "/conf/application.ini");
print_r($application);

$application->bootstrap();
$application->run();

Yaf_Application对象生成后,输出这个对象看看有啥东西:

Yaf_Application Object
(
    [config:protected] => Yaf_Config_Ini Object
        (
            [_config:protected] => ...
            [_readonly:protected] => 1
        )

    [dispatcher:protected] => Yaf_Dispatcher Object
        (
            [_router:protected] => Yaf_Router Object
                (
                    [_routes:protected] => Array
                        (
                            [_default] => Yaf_Route_Static Object
                                (
                                )

                        )

                    [_current:protected] => 
                )

            [_view:protected] => 
            [_request:protected] => Yaf_Request_Http Object

            [_plugins:protected] => Array
                (
                )

            [_auto_render:protected] => 1
            [_return_response:protected] => 
            [_instantly_flush:protected] => 
            [_default_module:protected] => Index
            [_default_controller:protected] => Index
            [_default_action:protected] => index
        )

    [_modules:protected] => Array
        (
            [0] => Index
        )

    [_running:protected] => 
    [_environ:protected] => product
    [_err_no:protected] => 0
    [_err_msg:protected] => 
)

看到Yaf_Application对象生成后,内部就生成了Yaf_Config_Ini 和 Yaf_Dispatcher,还有_modules(定义了哪些模块),_environ(决定使用哪个配置节),Yaf_Config_Ini对象保存了配置文件的信息,Yaf_Dispatcher叫分发器,玛尼这个就是一般意义上的前段控制器吧,它内部就丰富一些,有路由器,Yaf_Request_Http对象,插件数组,默认的模块默认的控制器和方法(Action),看起来,实例化的东西不算多吧。

第二部就是允许Yaf_Application对象的bootstrap()方法,这个实际就是Bootstrap类(可通过配置修改),这个类实际没有任何内容,不过在它里面可以定义_init开头的方法,这些方法, 都接受一个参数:Yaf_Dispatcher $dispatcher,它会被自动按照顺序执行:

class Bootstrap extends Yaf_Bootstrap_Abstract{
    	public function _initConfig() {
		//把配置保存起来
		$arrConfig = Yaf_Application::app()->getConfig();
		Yaf_Registry::set('config', $arrConfig);
	}
	public function _initPlugin(Yaf_Dispatcher $dispatcher) {
		//注册一个插件
		$objSamplePlugin = new SamplePlugin();
		$dispatcher->registerPlugin($objSamplePlugin);
	}
}

先看看这里自定义的_initConfig方法,把配置对象取出来,然后写入到Yaf_Registry中,这样在任何代码中都可以通过Yaf_Registry::get(‘config’)获得这个对象的引用,当要全局共享数据时,这个方法就非常有用。

接下来看看_initPlugin方法,它里面要注册插件,所以携带了Yaf_Dispatcher对象引用。这个引用跟前面输出的那个分发器是同一个对象(全局只有一个分发器或叫前端控制器),插件注册后就会写入到分发器的_plugins数组中。

一般在这个Bootstrap中对资源进行初始化。

这个bootstrap()完成之后,就开始run()方法。这个方法中首先要触发的routerStartup事件,所有绑定到这个事件的方法这个时候首先被执行。然后启动路由器匹配路由,把结果更新到Request对象(找出模块 控制器 和 方法),然后触发routerShutdown事件。我们可以在这个事件中绑定一个方法输出分发器对象看看:

//在注册了的插件中添加如下代码
Yaf_Dispatcher Object
(
    [_router:protected] => Yaf_Router Object
        (
            [_routes:protected] => Array
                (
                    [_default] => Yaf_Route_Static Object
                        ()
                )

            [_current:protected] => _default
        )

    [_view:protected] => 
    [_request:protected] => Yaf_Request_Http Object
        (
            [module] => Index
            [controller] => Vfeelit
            [action] => zend
            [method] => GET
            [params:protected] => Array
                (
                )

            [language:protected] => 
            [_exception:protected] => 
            [_base_uri:protected] => /sample
            [uri:protected] => /sample/vfeelit/zend
            [dispatched:protected] => 
            [routed:protected] => 1
        )

    [_plugins:protected] => Array
        (
            [0] => SamplePlugin Object
                (
                )

        )

    [_auto_render:protected] => 1
    [_return_response:protected] => 
    [_instantly_flush:protected] => 
    [_default_module:protected] => Index
    [_default_controller:protected] => Index
    [_default_action:protected] => index
)

注意看Yaf_Request_Http对象,它的module、controler、action都确定了(路由匹配的结果),它的routed对应的值是1,表示已经路由完成,但是dispatched还没有值,就是还没有进行分发。

接下来触发dispatchLoopStart时间,进入分发循环,在具体分发前,先触发preDispatch,然后是开始分发(具体逻辑不知,一般是路由器匹配路由后,由路由器分发),这里发生的逻辑应该是把控制器的类文件装载进来,开始控制器的初始化(自动调用init方法)然后执行对应的Action,注意,如果调用了Forward将会设置isDispatched为FALSE(Forward的调用会导致request对象发送变化),但是当前的Action不会停止,直到返回后判断isDispatched为FALSE后再次进入循环进行新的分发过程。Action调用完毕后,判断是否是autoRender,如果是就初始化Render(就是视图),这个时候$dispatch对象中的_view才被填充。返回后开始触发postDispatch。

接下来判断IsDispatched(请求对象中记录)来是否进行是否再次分发过程。如果被标记为已分发,接下来就是触发dispatchLoopShutdown。

然后判断是否是autoResponse,如果是就把响应返回给客户端。

Yaf MVC整体流程大概如此,总体上,路由器与路由匹配上,跟一般MVC实现不太一样,另外就是视图渲染上,没有涉及到布局的内容(可以自己实现),在Model上,没有任何体现。所以Yaf是轻型MVC。如果要追求速度,它应该是简单的,这样才能被可靠依赖。

把这个流程图记住,开发就会得心应手。

原创文章,转载务必保留出处。
永久链接:http://blog.ifeeline.com/1197.html

PHP框架Yaf 之 引入和使用Zend Framework组件

Yaf中没有提供所谓的ORM,实际上Yaf本身保持自己是一个高性能的MVC框架足够了,因为其它的东西完全可以重用别的类库,比如使用Zend Framework的组件,由于Zend Framework的每个组件是高度自治的以致于它可以独立使用,你或者觉得Zend Framework的MVC实现太笨重,没有问题的,你可以用Yaf来代替它或者自己来实现一个MVC,举例:Magento这个开源程序数据库层就基于Zend Framework的数据库封装,它重用了Zend Framework的组件,但是MVC是自己实现的。

以下是在Yaf中使用Zend Framework组件(以数据库组件为例)。
首先,我这里使用的是Zend Framework 1.x的组件,虽然Zend Framework 2.x已经发现很久了,由于对2.x没什么研究,所以…

Yaf中类的加载使用Yaf Autoloader来负责类的加载,文档原话:Yaf在自启动的时候, 会通过SPL注册一个自己的Autoloader, 出于性能的考虑, 对于框架相关的MVC类, Yaf Autoloader只以目录映射的方式尝试一次。在Yaf中当遇到一个类需要加载时,它扫描的目录是php.ini中设置的ap.library目录,这个目录存放全局类,没有找到就到配置文件的ap.library目录中寻找,这个目录中的类一般可以认为是所谓的本地类(本项目专用),为了跳过首先去加载全局类,可以调用Yaf_Loader的registerLocalNamespace方法, 来申明那些类前缀是本地类。明白这个就好了,我这里直接把Zend类库放入到项目的library中。

我这里打算使用Zend Framework的Zend_Db组件,在Bootstrap类中放入方法:

        public function _initDatabaseAdapter(){
                $dbConfig = Yaf_Application::app()->getConfig()->get("db");
                $dbAdapter = $dbConfig->get("adapter");
                $dbParams = $dbConfig->get("params")->toArray();

                $db = Zend_Db::factory($dbAdapter, $dbParams);
                Zend_Db_Table::setDefaultAdapter($db);

                Yaf_Registry::set('db', $db);
        }

获取数据库配置信息,然后调用Zend_Db::factory方法,把返回的适配器用Yaf_Registry来保存。

数据库配置信息如下(config/application.ini中):

db.adapter = 'pdo_mysql'
db.params.host = '127.0.0.1'
db.params.username = 'root'
db.params.password = 'root'
db.params.dbname = 'ai_manage'
db.params.charset = 'utf8'

然后开始运行,报告如下错误:

Warning: require_once(Zend/Db/Adapter/Pdo/Abstract.php): failed to open stream: No such file or directory in /usr/local/httpd-2.2.27/htdocs/vfeelit/application/library/Zend/Db/Adapter/Pdo/Mysql.php on line 27

查看Zend/Db/Adapter/Pdo/Mysql.php的27行:

require_once 'Zend/Db/Adapter/Pdo/Abstract.php';

原来Zend_Db_Adapter_Pdo_Mysql继承自Zend_Db_Adapter_Pdo_Mysql_Abstract,而这个类是首先要加载进来的。玛尼的,自动装载函数无法装载Zend_Db_Adapter_Pdo_Mysql_Abstract还是怎么的,非要用那么粗暴的方法?我既然能装载Zend_Db_Adapter_Pdo_Mysql,自然也能装载Zend_Db_Adapter_Pdo_Mysql_Abstract,有点不理解。不过Zend Framework 1.x中大量出现这个粗暴装载的方法,好吧,require_once自然是搜索include_path,当前include_path自然没有包含Zend类库,那么修改一下include_path好了:

//在Bootstrap中添加如下函数
        public function _initIncludePath(){
                $config = Yaf_Application::app()->getConfig();
                set_include_path(implode(PATH_SEPARATOR, array(
                      realpath($config->get("application")->directory .'/library'),
                      get_include_path(),
                )));
	}

再次运行,就没有报错了。这说明成功使用了Zend_Db组件。

下面从数据库中读一段数据出来:

//控制器方法
        public function zendAction(){
                $db = Yaf_Registry::get('db');
                $result = $db->query("select id,order_id from ai_order where id > 6000 order by rand() limit 10");
		foreach($result->fetchAll() as $row){
			echo "ID {$row['id']} -- {$row['order_id']}\n";
		}
		return false;
	}

代码执行成功。

以上代码直接编写SQL语句是不是有点过时的感觉?没有问题,换一种搞法,Zend_Db_Table实现了所谓ORM,这里来个简单的例子:

//从ai_order中获取数据
//先建立一个Model
AiOrder.php 
<?php
class AiOrderModel extends Zend_Db_Table_Abstract{
	protected $_name = "ai_order";
}

//然后在控制器中
$aiOrder = new AiOrderModel();
// SELECT * FROM ai_order WHERE id = "1"
$row = $aiOrder->fetchRow("id = 1");
$row->name = "Vfeelit";

$row->save();

以上把id=1的记录映射到$row中,然后修改字段,调用save()方法数据就更新到数据表了。这玛尼的就是最简单的ORM了。

ORM在遇到复杂的情况时就会比较复杂繁琐,ORM的理念就是让你忘记数据库,所有一切都是对象。实际情况是小项目直接写SQL好了。大一点的项目需要多人开发时最好采用一套ORM,有了它可以统一开发模式,不至于人人都写SQL代码。就我个人来说,不怎么喜欢ORM。

原创文章,转载务必保留出处。
永久链接:http://blog.ifeeline.com/1194.html

PHP框架Yaf 之 安装

Yaf是一个C语言编写的PHP框架.

Yaf的优点:
1 用C语言开发的PHP框架, 相比原生的PHP, 几乎不会带来额外的性能开销.
2 所有的框架类, 不需要编译, 在PHP启动的时候加载, 并常驻内存.
3 更短的内存周转周期, 提高内存利用率, 降低内存占用率.
4 灵巧的自动加载. 支持全局和局部两种加载规则, 方便类库共享.
5 高性能的视图引擎.
6 高度灵活可扩展的框架, 支持自定义视图引擎, 支持插件, 支持自定义路由等等.
7 内建多种路由, 可以兼容目前常见的各种路由协议.
8 强大而又高度灵活的配置文件支持. 并支持缓存配置文件, 避免复杂的配置结构带来的性能损失.
9 在框架本身,对危险的操作习惯做了禁止.
10 更快的执行速度, 更少的内存占用.

以上介绍来自http://www.laruence.com/manual/的介绍。实际上,快是其主要优点。Yaf和Zend Framework 1.x很类似(Yaf有着和Zend Framework相似的API, 相似的理念),如果有使用过Zend Framework 1.x经验,那么入门Yaf非常容易。实际上,Yaf仅仅是一个MVC框架,要跟Zend Framework相比的话,也只能跟Zend Framework MVC部分对比,由于我有使用过Zend Framework的经历,平心而论,Zend Framework的MVC实现尽管非常优雅,但是它的性能实在有点低效,所以做开发时我尽量不使用Zend Framework的MVC。Zend Framework除了实现MVC,还提供了很多组件(类库),比如数据库的抽象层,数据缓存抽象API等非常有价值。这些完全可以作为Yaf的补充,Yaf专注于一个快速轻巧的MVC实现是非常值得肯定的。

Yaf由于是PHP的扩展实现,自然也是有缺点的,比如使用门槛高,虚拟主机基本不会让你去编译扩展;你不知道MVC的内部运行过程(除非你有深厚C功底去阅读它的源代码);一旦有Bug,你需要被动等待修复…

以下是我机器上的安装Yaf的过程:

[root@vfeelit ~]# wget http://pecl.php.net/get/yaf-2.2.9.tgz
[root@vfeelit ~]# tar zxvf yaf-2.2.9.tgz
[root@vfeelit ~]# cd yaf-2.2.9
[root@vfeelit yaf-2.2.9]# /usr/local/php-5.5.15/bin/phpize
Configuring for:
PHP Api Version:         20121113
Zend Module Api No:      20121212
Zend Extension Api No:   220121212
[root@vfeelit yaf-2.2.9]# ./configure --with-php-config=/usr/local/php-5.5.15/bin/php-config
#...
[root@vfeelit yaf-2.2.9]# make 
#...
[root@vfeelit yaf-2.2.9]# make install

#编译的扩展在
/usr/local/php-5.5.15/lib/php/extensions/no-debug-zts-20121212/yaf.so
#修改php.ini让PHP装载yaf.so
vi /usr/local/php-5.5.15/etc/php.ini
#添加
extension = "yaf.so"
#查看yaf.so是否已经可以被加载
/usr/local/php-5.5.15/bin/php -m | grep yaf
yaf
#重启httpd(PHP作为httpd的模块启动,需要重启httpd重新读取php.ini文件)

安装完毕后需要一个Demo,可以到https://github.com/laruence/php-yaf下载Yaf工具包,进入到tools/cg,然后运行:

[root@vfeelit cg]# chmod 755 yaf_cg 
[root@vfeelit cg]# ./yaf_cg vfeelit
DONE
[root@vfeelit cg]# ls
output  README.md  templates  yaf_cg
[root@vfeelit cg]# ls output
vfeelit

你指定的应用vfeelit目录结构已经生成,它在output目录下,实际上你也可以指定保存的目录(./yaf_cg vfeelit 路径)。

注:这里的yaf_cg实际是一个PHP脚本,但是为何能直接就运行了?难道不需要PHP解释器?打开这个脚本查看到:

#!/usr/bin/env php

这个指令的意思就是使用当前的PHP解释这个脚本。

我们把vfeelit拷贝到网站跟目录下:

[root@vfeelit output]# cp -a vfeelit/ /usr/local/httpd-2.2.27/htdocs/

在vfeelit目录中有.htaccess文件,它的作用是把所有请求都重新定位到index.php这个入口文件上,所以在httpd的配置中htdocs目录一定要配置AllowOverride指令为All,否则.htaccess不起作用。

看看目录结构:

[root@vfeelit vfeelit]# tree
.
├── application
│   ├── Bootstrap.php
│   ├── controllers
│   │   ├── Error.php
│   │   └── Index.php
│   ├── library
│   │   └── readme.txt
│   ├── models
│   │   └── Sample.php
│   ├── plugins
│   │   └── Sample.php
│   └── views
│       ├── error
│       │   └── error.phtml
│       └── index
│           └── index.phtml
├── conf
│   └── application.ini
├── index.php
└── readme.txt

9 directories, 11 files

看这个目录结构,是不是跟Zend Framework 1.x的组织结构很像?

再然后就是在浏览器中访问:

curl http://xxx/vfeelit/index/
Hello World! I am Stranger

curl http://xxx/vfeelit/index/index/index/name/vfeelit
Hello World! I am vfeelit

原创文章,转载务必保留出处。
永久链接:http://blog.ifeeline.com/1192.html

PHP多线程编程应用之多线程采集

由于要采集一些1688.com上面的产品,并且按照格式保存数据,下载所有图片,之前写过采集的程序,但是是单线程采集,产品一多,下载图片将耗费非常多的时间(不能充分利用带宽,只能一个个来,A下载完成后才能下载B),并且通过nginx或apache发起请求还常常遇到超时问题,所以这次打算使用命令行下的PHP多线程编程来干这个事,以下是程序段,备忘:

<?php
ini_set("display_errors","0");
ini_set("max_execution_time","7200");
ini_set("memory_limit","1024M");

require_once dirname(__FILE__) . '/phpQuery/phpQuery.php';
require_once dirname(__FILE__) . '/phpexcel/Classes/PHPExcel.php';

//Tool
class CrawTool{
	public static function getDataUseCurl($url='', $data='', $post=false){
		if('' == $url){ return false; }

        $ch = curl_init();
        curl_setopt($ch, CURLOPT_HEADER, '');
       	curl_setopt($ch, CURLOPT_URL, trim($url));
        curl_setopt($ch, CURLOPT_HEADER, false);
        curl_setopt($ch, CURLOPT_TIMEOUT, 30);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
        if($post === false){
			curl_setopt($ch, CURLOPT_POST, false);
      	}else{
          	curl_setopt($ch, CURLOPT_POST, true);
          	if('' != $data){
           		curl_setopt($ch, CURLOPT_POSTFIELDS,$data);
          	}
     	}
      	if ((int)preg_match('/^HTTPS/i', $url) > 0) {
        	curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
	        curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
        }

        curl_setopt($ch,CURLOPT_RETURNTRANSFER,true);
        curl_setopt($ch,CURLOPT_FOLLOWLOCATION,true);
        curl_setopt($ch,CURLOPT_MAXREDIRS,10);

        curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:21.0) Gecko/20100101 Firefox/21.0");
        $result = curl_exec($ch);
        $errn = curl_errno($ch);
        curl_close($ch);

        if((int)$errn == 0){
                return $result;
        }

        return false;
	}
}


//Get List
class CrawListWorker extends Worker{
	public function __construct(CrawList $cl){
		$this->setCrawList($cl);
	}
	public function setCrawList(CrawList $cl){
		$this->crawList = $cl;
	}
	public function run(){}
}
class CrawList extends Stackable{
	public function run(){}
}
class CrawListWork extends Stackable{
	public function __construct($url){
		$this->setUrl($url);
	}
	public function setUrl($url){
		$this->url = trim($url);
	}
	public function run(){
		$url_return = CrawTool::getDataUseCurl($this->url);
		if($url_return !== false){         
         	preg_match_all("/.*<div\s+class=\"image\">\s*<a\s+href=\"(.*)\"\s+title.*/",iconv("gbk","utf-8",$url_return),$ls);
			$links = $ls[1];
			foreach($links as $l){
            	preg_match('/http:\/\/.*offer\/([0-9]{1,16})\.html/',$l,$pid);
				$id = $pid[1];
               	$arr = array("id"=>$id,"url"=>$l);
               	$this->worker->crawList[$id] = $arr;
         	}
		}
	}
}

//Get Product Detail
class CrawItemWorker extends Worker{
	public function __construct(CrawItems $cis){
		$this->setCrawItems($cis);
	}
	public function setCrawItems(CrawItems $cis){
		$this->crawItems = $cis;
	}
	public function run(){}
}
class CrawItems extends Stackable{
	public function run(){}
}
class CrawItemWork extends Stackable{
	public $url = '';
	public function __construct($url){
		$this->setUrl(trim($url));
	}
	public function setUrl($url){
		$this->url = $url;
	}
	public function run(){
		preg_match('/http:\/\/.*offer\/([0-9]{1,16})\.html/',$this->url,$pid);
        $id = $pid[1];

		$url_return = CrawTool::getDataUseCurl($this->url);
		if($url_return === false){ return false; }
		$rd = iconv("gbk","utf-8",$url_return);
		
		$d = phpQuery::newDocument($rd);
       	// 描述数据连接
       	$data_tfs_url   	= pq("#desc-lazyload-container")->attr("data-tfs-url");
		$jiage				= pq("td.price-title")->next()->find("span.value")->text();
		$mingchen	 		= pq("#mod-detail-hd h1")->text();               	
		$zhongliang        	= pq('td.de-feature:contains("重量")')->next()->text();
		$huohao 			= pq('td.de-feature:contains("货号")')->next()->text();
		$xiaoshouxuliehao 	= pq('td.de-feature:contains("销售序列号")')->next()->text();
		$caizhi				= pq('td.de-feature:contains("材质")')->next()->text();
		$fengge				= pq('td.de-feature:contains("风格")')->next()->text();
		$pinpai				= pq('td.de-feature:contains("品牌")')->next()->text();
		$yangshi			= pq('td.de-feature:contains("样式")')->next()->text();
		$zaoxing			= pq('td.de-feature:contains("造型")')->next()->text();
		$chuligongyi		= pq('td.de-feature:contains("处理工艺")')->next()->text();

		preg_match_all("/\"original\"\:\"(http\:\/\/[^<>]+\.jpg)\"/",$rd,$main_imgs);
		if(isset($main_imgs[1])){
			$tupian["main"] = $main_imgs[1];
		}else{
			$tupian["main"] = array();
		}
		
		preg_match_all("/<span\s+class=\"image\"\s+title=\"([^<>]+)\"\s+data-imgs=/",$rd,$colors);
		if(isset($colors[1])){
			$yanse = $colors[1];
		}else{
			$yanse = array();
		}		
		
		
		$ss = CrawTool::getDataUseCurl($data_tfs_url);

       	preg_match("/.*\'(.*)\'.*/", $ss,$r);
		$thtmll = iconv("gbk","utf-8",$r[1]);
       	$doc = phpQuery::newDocument($thtmll);
		//$imgs = pq("p img");
	   	$des = pq('table:contains("重量")');  // 定位表格

		$dzhongliang 		= $des->find("td:contains('重量')")->next()->text();
      	$dchicun 			= $des->find("td:contains('尺寸')")->next()->text();
       	$ddiandu 			= $des->find("td:contains('电镀')")->next()->text();
       	$dcailiao 			= $des->find("td:contains('材料')")->next()->text();
       	$dbaozhuang 		= $des->find("td:contains('包装')")->next()->text();
		
		preg_match_all("/<img[^<]+src=\"([^<\"]+)\"[^<]+\/>/",$thtmll,$imgs);
		$imgs = array_unique($imgs[1]);
		foreach($imgs as $iurl){
			$tupian["des"][] = preg_replace("/\?.*/",'',$iurl);
		}

      	if(trim($dzhongliang) != ''){ 
			$zhongliang = $dzhongliang; 
		}
		
		$data = array(
			"id"=>$id,
			"url"=>$this->url,
			"jiage"=>$jiage,
			"yanse"=>$yanse,
			"mingchen"=>$mingchen,
			"zhongliang"=>$zhongliang,
			"huohao"=>$huohao,
			"xiaoshouxuliehao"=>$xiaoshouxuliehao,
			"caizhi"=>$caizhi,
			"fengge"=>$fengge,
			"pinpai"=>$pinpai,
			"yangshi"=>$yangshi,
			"zaoxing"=>$zaoxing,
			"chuligongyi"=>$chuligongyi,
			"dchicun"=>$dchicun,
			"ddiandu"=>$ddiandu,
			"dcailiao"=>$dcailiao,
			"dbaozhuang"=>$dbaozhuang,
			"tupian"=>$tupian
		);		

		$this->worker->crawItems[$id]=$data;
	}
}

//Download Images
class CrawImageWorker extends Worker{
	public function run(){}
}

class CrawImageWork extends Stackable{
	public function __contruct($from, $to){
		$this->setFrom($from);
		$this->setTo($to);
	}
	public function setFrom($f){
		$this->from = $f;
	}
	public function setTo($t){
		$this->to = $t;
	}
	public function grabImage($url, $filename=""){
    	if($url == ""){return FALSE;}
      	$extt = strrchr($url, ".");
       	$ext = strtolower($extt);
       	if($ext != ".gif" && $ext != ".jpg" && $ext != ".png" && $ext != ".bmp"){
			echo $url." not support.";return FALSE;
   		}

		if($filename == ""){
      		$filename = time()."$extt";
      	}

		ob_start();
  		readfile($url);
       	$img = ob_get_contents();
      	ob_end_clean();
       	$size = strlen($img);
       	$fp2 = fopen($filename , "a");
     	fwrite($fp2, $img);
     	fclose($fp2);

		return $filename;
	}
	public function run(){
		if(!file_exists($this->to)){
			$this->grabImage($this->from, $this->to);
		}	
	}
}

///////////////////////////////////////////////////////////////
$start = microtime(true);

$url = $argv[1];
preg_match('/http:\/\/(.*)\.1688\.com.*offerlist_(.*)\.htm$/',$url,$pmt);
$shop = $pmt[1];                        //店铺名称
$offlist = "offerlist_".$pmt[2];        //目录编号

// 目录链接
$shop_all_url = "http://".$shop.".1688.com/page/".$offlist.".htm";
$base = dirname(__FILE__);
$new_base = $base. "/".$shop."_".$offlist;

echo "\n\n## Start to Get '".$shop_all_url."'\n";

if((int)$argc < 2){
	echo "## \$argv[1] miss. exit...\n";
	exit;
}

if(!is_dir($new_base)){
 	mkdir($new_base);
}

$html = iconv("gbk","utf-8",file_get_contents($shop_all_url."?pageNum=1"));
phpQuery::newDocument($html);

// 页数
$page_total = (int)pq("li > em.page-count")->text();
echo "## Total ".$page_total." pages.\n";

echo "## Start to get List...\n";
$craw_list = new CrawList();
$pool = new Pool(5,"CrawListWorker",array($craw_list));

for($i=1;$i<=$page_total;$i++){
	$pool->submit(new CrawListWork($shop_all_url."?pageNum=".$i));
}
$pool->shutdown();
unset($pool);

echo "## Start to get Items...\n";
$craw_items = new CrawItems();
$pool_items = new Pool(10,"CrawItemWorker",array($craw_items));
$tt = 0;
foreach($craw_list as $i=>$clt){
	//if($tt>5) break;
	$pool_items->submit(new CrawItemWork($clt['url']));
	$tt++;
}
$pool_items->shutdown();
unset($pool_items);

$excude = array();
echo "## Start to get Images...\n";
$imgs_pool = new Pool(10,"CrawImageWorker",array());

$jj=1;
foreach($craw_items as $idx=>$data){
	//if($jj>3){ break; }
	$iidx = $data["id"];
	$imgss = $data["tupian"];
	$to = $new_base."/".$iidx;
	@mkdir($to,0777,true);
	$i=0;
	foreach($data["tupian"]["main"] as $igs){
		if(in_array($igs,$excude)){ continue; }
		$c = new CrawImageWork($igs,$to."/main_".$i.".jpg");
		$c->setFrom($igs);
		$c->setTo($to."/main_".$i.".jpg");
	
		$imgs_pool->submit($c);
		unset($c);
		$i++;
	}
	$i=0;
	foreach($data["tupian"]["des"] as $igs){
		if(in_array($igs,$excude)){ continue; }
		$c = new CrawImageWork($igs,$to."/des_".$i.".jpg");
		$c->setFrom($igs);
		$c->setTo($to."/des_".$i.".jpg");
	
		$imgs_pool->submit($c);
		unset($c);
		$i++;
	}

	$jj++;
}
$imgs_pool->shutdown();
unset($imgs_pool);

// General PHPExcel
$objPHPExcel = new PHPExcel();
$objPHPExcel->getProperties()->setCreator("vfeelit@qq.com")->setLastModifiedBy("vfeelit@qq.com");

$objPHPExcel->setActiveSheetIndex(0);
$objActSheet = $objPHPExcel->getActiveSheet();
$objActSheet->setTitle('采集表');

//涉及到的列
$colmns = array('A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T');
//列宽
foreach($colmns as $cls){
	$objActSheet->getColumnDimension($cls)->setWidth(20);
}
$objActSheet->getColumnDimension('B')->setWidth(60);
$objActSheet->getColumnDimension('T')->setWidth(120);

$objActSheet->setCellValue('A1','编号');
$objActSheet->setCellValue('B1','名称');
$objActSheet->setCellValue('C1','图片');
$objActSheet->setCellValue('D1','颜色');
$objActSheet->setCellValue('E1','价格');
$objActSheet->setCellValue('F1','重量');
$objActSheet->setCellValue('G1','货号');
$objActSheet->setCellValue('H1','销售序列号');
$objActSheet->setCellValue('I1','材质');
$objActSheet->setCellValue('J1','风格');
$objActSheet->setCellValue('K1','品牌');
$objActSheet->setCellValue('L1','样式');
$objActSheet->setCellValue('M1','造型');
$objActSheet->setCellValue('N1','处理工艺');
$objActSheet->setCellValue('O1','尺寸');
$objActSheet->setCellValue('P1','电镀');
$objActSheet->setCellValue('Q1','材料');
$objActSheet->setCellValue('R1','包装');
$objActSheet->setCellValue('S1','链接');
$objActSheet->setCellValue('T1','图片');

$jj=2;
foreach($craw_items as $idx=>$data){
	//if($jj>3){ break; }
	$iidx = $data["id"];
	$imgss = $data["tupian"];
	$to = $new_base."/".$iidx;
	@mkdir($to,0777,true);
	$i=0; $the_main_pic=''; $mainimgs='';
	foreach($data["tupian"]["main"] as $igs){
		if(in_array($igs,$excude)){ continue; }
		if($i == 0){
			$the_main_pic = $to."/main_".$i.".jpg";
		}
		$mainimgs .= $igs."\r\n\r\n";
		$i++;
	}
	$i=0; $desimgs='';
	foreach($data["tupian"]["des"] as $igs){
		if(in_array($igs,$excude)){ continue; }
		$desimgs .= $igs."\r\n\r\n";
		$i++;
	}
	//行高	
	$objActSheet->getRowDimension($jj)->setRowHeight(95);

	//单元格对齐
	foreach($colmns as $c){
		$align = $objActSheet->getStyle($c.$jj)->getAlignment();
		$align->setHorizontal(PHPExcel_Style_Alignment::HORIZONTAL_LEFT);
		$align->setVertical(PHPExcel_Style_Alignment::VERTICAL_TOP);
		$align->setWrapText(TRUE);
	}
	
	$objActSheet->getStyle('A'.$jj)->getNumberFormat()->setFormatCode(PHPExcel_Style_NumberFormat::FORMAT_NUMBER);
	$objActSheet->setCellValue('A'.$jj,$data["id"]);

	$objActSheet->setCellValue('B'.$jj,$data["mingchen"]);
	
	//主图
	$fsig = $the_main_pic;
	if(file_exists($fsig)){
		//添加图片
		$objDrawing = new PHPExcel_Worksheet_Drawing();
		$objDrawing->setName('Product Image');					//设置名字
		$objDrawing->setDescription('');						//图片描述
			
		$objDrawing->setPath($fsig);							//图片路径
			
		//$objDrawing->setHeight(120);							//图片宽度 像素
		$objDrawing->setWidth(120);								//图片宽度 像素
		$objDrawing->setCoordinates('C'.$jj); 					//坐标 放入哪个单元格
		//$objDrawing->setOffsetX(10);							//针对单元格 水平偏移量
		//$objDrawing->setOffsetY(10);							//针对单元格 垂直偏移量
		//$objDrawing->setRotation(15);   						//旋转
		$objDrawing->setWorksheet($objActSheet);				//应用到工作表
	}else{
		$objActSheet->setCellValue('C'.$jj,"主图没有下载");	
	}	
	
	$colors = '';
	foreach($data["yanse"] as $color){
		$colors .= $color."\r\n";
	}
	$objActSheet->setCellValue('D'.$jj,$colors);
	
	$objActSheet->setCellValue('E'.$jj,$data["jiage"]);
	$objActSheet->setCellValue('F'.$jj,$data["zhongliang"]);
	$objActSheet->setCellValue('G'.$jj,$data["huohao"]);
	$objActSheet->setCellValue('H'.$jj,$data["xiaoshouxuliehao"]);
	$objActSheet->setCellValue('I'.$jj,$data["caizhi"]);
	$objActSheet->setCellValue('J'.$jj,$data["fengge"]);
	$objActSheet->setCellValue('K'.$jj,$data["pinpai"]);
	$objActSheet->setCellValue('L'.$jj,$data["yangshi"]);
	$objActSheet->setCellValue('M'.$jj,$data["zaoxing"]);
	$objActSheet->setCellValue('N'.$jj,$data["chuligongyi"]);
	$objActSheet->setCellValue('O'.$jj,$data["dchicun"]);
	$objActSheet->setCellValue('P'.$jj,$data["ddiandu"]);
	$objActSheet->setCellValue('Q'.$jj,$data["dcailiao"]);
	$objActSheet->setCellValue('R'.$jj,$data["dbaozhuang"]);
	$objActSheet->setCellValue('S'.$jj,$data["url"]);
	$objActSheet->setCellValue('T'.$jj,$mainimgs."\r\n\r\n".$desimgs);
	
	$jj++;
}
$objWriter = PHPExcel_IOFactory::createWriter($objPHPExcel,'Excel5');
$objWriter->save($base."/".$shop."_".$offlist.".xls");
	
$end = microtime(true);
echo "## Time Use:".($end-$start)."\n\n\n";

采集产品链接时开了5个线程,采集产品信息时开了10个线程,下载图片时开了10个线程。 这里是模拟浏览器的正常访问采集,所以需要设置一下php.ini文件的user_agent值。由于浏览器本身就是一个多线程程序,它可以同时发起多个连接,所以服务器端是绝对要允许同一个客户端开多个链接,所以我们可以模拟浏览器访问,但是线程数一多就可能被识别为非浏览器访问,那可能就会被堵塞。

测试:

/usr/local/php-5.5.15/bin/php craw.php "http://xxx.1688.com/page/offerlist_xxx.htm"


## Start to Get 'http://xxx.1688.com/page/offerlist_xxx.htm'
## Total 2 pages.
## Start to get List...
## Start to get Items...
## Start to get Images...
## Time Use:306.9495668411255


一共采集了40个产品,包括下载所有图片,共用了306秒(5分钟),如果线程数开大点估计时间会更加少(还取决于下载数据,毕竟图片多)。如果是顺序采集(单线程),至少30分钟吧。这个多线程编程的效果非常明显。

原创文章,转载务必保留出处。
永久链接:http://blog.ifeeline.com/1188.html

PHPExcel之生成Excel文档实例

关于PHPExcel,可以参考:
PHPExcel 单元格格式操作对照表
PHPExcel用户文档

require_once dirname(__FILE__) . '/phpexcel/Classes/PHPExcel.php';


// General PHPExcel
$objPHPExcel = new PHPExcel();
$objPHPExcel->getProperties()->setCreator("vfeelit@qq.com")->setLastModifiedBy("vfeelit@qq.com");

$objPHPExcel->setActiveSheetIndex(0);
$objActSheet = $objPHPExcel->getActiveSheet();
$objActSheet->setTitle('采集表');

//涉及到的列
$colmns = array('A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T');
//列宽
foreach($colmns as $cls){
	$objActSheet->getColumnDimension($cls)->setWidth(20);
}
$objActSheet->getColumnDimension('B')->setWidth(60);
$objActSheet->getColumnDimension('T')->setWidth(120);

$objActSheet->setCellValue('A1','编号');
$objActSheet->setCellValue('B1','名称');
$objActSheet->setCellValue('C1','图片');
$objActSheet->setCellValue('D1','颜色');
$objActSheet->setCellValue('E1','价格');
$objActSheet->setCellValue('F1','重量');
$objActSheet->setCellValue('G1','货号');
$objActSheet->setCellValue('H1','销售序列号');
$objActSheet->setCellValue('I1','材质');
$objActSheet->setCellValue('J1','风格');
$objActSheet->setCellValue('K1','品牌');
$objActSheet->setCellValue('L1','样式');
$objActSheet->setCellValue('M1','造型');
$objActSheet->setCellValue('N1','处理工艺');
$objActSheet->setCellValue('O1','尺寸');
$objActSheet->setCellValue('P1','电镀');
$objActSheet->setCellValue('Q1','材料');
$objActSheet->setCellValue('R1','包装');
$objActSheet->setCellValue('S1','链接');
$objActSheet->setCellValue('T1','图片');

$jj=2;
foreach($craw_items as $idx=>$data){
	//if($jj>3){ break; }
	$iidx = $data["id"];
	$imgss = $data["tupian"];
	$to = $new_base."/".$iidx;
	@mkdir($to,0777,true);
	$i=0; $the_main_pic=''; $mainimgs='';
	foreach($data["tupian"]["main"] as $igs){
		if(in_array($igs,$excude)){ continue; }
		if($i == 0){
			$the_main_pic = $to."/main_".$i.".jpg";
		}
		$mainimgs .= $igs."\r\n\r\n";
		$i++;
	}
	$i=0; $desimgs='';
	foreach($data["tupian"]["des"] as $igs){
		if(in_array($igs,$excude)){ continue; }
		$desimgs .= $igs."\r\n\r\n";
		$i++;
	}
	//行高	
	$objActSheet->getRowDimension($jj)->setRowHeight(95);

	//单元格对齐
	foreach($colmns as $c){
		$align = $objActSheet->getStyle($c.$jj)->getAlignment();
		$align->setHorizontal(PHPExcel_Style_Alignment::HORIZONTAL_LEFT);
		$align->setVertical(PHPExcel_Style_Alignment::VERTICAL_TOP);
		$align->setWrapText(TRUE);
	}
	
	$objActSheet->getStyle('A'.$jj)->getNumberFormat()->setFormatCode(PHPExcel_Style_NumberFormat::FORMAT_NUMBER);
	$objActSheet->setCellValue('A'.$jj,$data["id"]);

	$objActSheet->setCellValue('B'.$jj,$data["mingchen"]);
	
	//主图
	$fsig = $the_main_pic;
	if(file_exists($fsig)){
		//添加图片
		$objDrawing = new PHPExcel_Worksheet_Drawing();
		$objDrawing->setName('Product Image');			//设置名字
		$objDrawing->setDescription('');			//图片描述
			
		$objDrawing->setPath($fsig);				//图片路径
			
		//$objDrawing->setHeight(120);				//图片宽度 像素
		$objDrawing->setWidth(120);				//图片宽度 像素
		$objDrawing->setCoordinates('C'.$jj); 			//坐标 放入哪个单元格
		//$objDrawing->setOffsetX(10);				//针对单元格 水平偏移量
		//$objDrawing->setOffsetY(10);				//针对单元格 垂直偏移量
		//$objDrawing->setRotation(15);   			//旋转
		$objDrawing->setWorksheet($objActSheet);		//应用到工作表
	}else{
		$objActSheet->setCellValue('C'.$jj,"主图没有下载");	
	}	
	
	$colors = '';
	foreach($data["yanse"] as $color){
		$colors .= $color."\r\n";
	}
	$objActSheet->setCellValue('D'.$jj,$colors);
	
	$objActSheet->setCellValue('E'.$jj,$data["jiage"]);
	$objActSheet->setCellValue('F'.$jj,$data["zhongliang"]);
	$objActSheet->setCellValue('G'.$jj,$data["huohao"]);
	$objActSheet->setCellValue('H'.$jj,$data["xiaoshouxuliehao"]);
	$objActSheet->setCellValue('I'.$jj,$data["caizhi"]);
	$objActSheet->setCellValue('J'.$jj,$data["fengge"]);
	$objActSheet->setCellValue('K'.$jj,$data["pinpai"]);
	$objActSheet->setCellValue('L'.$jj,$data["yangshi"]);
	$objActSheet->setCellValue('M'.$jj,$data["zaoxing"]);
	$objActSheet->setCellValue('N'.$jj,$data["chuligongyi"]);
	$objActSheet->setCellValue('O'.$jj,$data["dchicun"]);
	$objActSheet->setCellValue('P'.$jj,$data["ddiandu"]);
	$objActSheet->setCellValue('Q'.$jj,$data["dcailiao"]);
	$objActSheet->setCellValue('R'.$jj,$data["dbaozhuang"]);
	$objActSheet->setCellValue('S'.$jj,$data["url"]);
	$objActSheet->setCellValue('T'.$jj,$mainimgs."\r\n\r\n".$desimgs);
	
	$jj++;
}
$objWriter = PHPExcel_IOFactory::createWriter($objPHPExcel,'Excel5');
$objWriter->save($base."/".$shop."_".$offlist.".xls");

生成一个Excel文档程序框架,在最后如果需要直接下载文档,可以改为:

	header("Content-Type: application/force-download");
	header("Content-Type: application/octet-stream");
	header("Content-Type: application/download");
	header('Content-Disposition:inline;filename="'.$outputFileName.'"');
	header("Content-Transfer-Encoding: binary");
	header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");
	header("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT");
	header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
	header("Pragma: no-cache");
	
	$objWriter = PHPExcel_IOFactory::createWriter($objPHPExcel,'Excel5');
	$objWriter->save('php://output');

此文为备忘….

原创文章,转载务必保留出处。
永久链接:http://blog.ifeeline.com/1183.html

PHP多线程编程 之 在线程中共享

PHP多线程编程扩展pthreads提供在多线程中共享的方案是共享一个线程对象。任何外部对这个共享线程对象的初始化都是拷贝。

参考文章:PHP多线程编程 之 基本概念(官方文档)中关于数据储存 和 资源部分:
——————————
Data Storage
As a rule of thumb, any data type that can be serialized can be used as a member of a Threaded object, it can be read and written from any context with a reference to the Threaded Object. Not every type of data is stored serially, basic types are stored in their true form. Complex types, Arrays, and Objects that are not Threaded are stored serially; they can be read and written to the Threaded Object from any context with a reference.
根据经验,任何可以被序列化的数据类型都可以用来作为Threaded对象的成员,它可被任何有引用到Threaded对象的上下文读写(只要有引用就可以读写,注意是任何上下文)。不是所有的数据类都是序列化存储的,基础类型使用它原来的形式存储。那些不是Threaded对象的复杂类型,数组和对象也是序列化存储的;可以在任何线程中把它们读写到Threaded对象中,只要上下文有Threaded对象的引用。
With the exception of Threaded Objects any reference used to set a member of a Threaded Object is separated from the reference in the Threaded Object; the same data can be read directly from the Threaded Object at any time by any context with a reference to the Threaded Object.
在Threaded对象中除了Threaded对象,任何用来设置Threaded对象的成员的引用都是从引用分离的(意思是用来设置Threaded对象属性的引用,不管是值还是什么,除了Threaded对象,都是用它的一份拷贝,理解这个是很重要的,比如你希望传递一个Threaded对象外的引用(非Threaded引用,比如数组)初始化Threaded对象,你务必知道这是个克隆操作)
Resources
The extensions and functionality that define resources in PHP are completely unprepared for this kind of environment; pthreads makes provisions for resources to be shared among contexts, however most types of resource WILL have problems. Most of the time, resources should NOT be shared among contexts, and when they are they should be basic types like streams and sockets.
PHP中扩展或功能定义的资源完全没有为这种类型为环境而准备;对于在上下文之间被共享的资源pthreads做了规定,然而大多数的资源类型将有问题。大多时候,资源不应该在上下文之间被共享,否则它们应该是基本的类型,比如streams和sockets。
Officially, resources remain unsupported.
——————————

资源类型不受支持。比如要在多线程中共享一个数据库链接,这个操作是错误的。一般如果要共享内容,可以构建一个Threaded类型的对象作为容器,然后传递到不同的线程中,不同线程对这个共享对象的操作是线程安全的。不过,如果拷贝了一个资源类型到这个对象中,你会发现,在不同线程中会报告这个资源不可用,所以当出现需要共享资源类型时,多线程操作将无比困难。

<?php
class MyWorker extends Worker{
        public function __construct($arr,$mutex){
                $this->arr = $arr;
		$this->mutex = $mutex;
        }
        public function run(){

        }
}

class MyWork extends Stackable{
        public function __construct($i){
                $this->i = $i;
        }
	public function run(){
		if($this->worker->mutex)
			Mutex::lock($this->worker->mutex);

		$this->worker->arr[] = "MyWork \$i is:".$this->i;
		echo $this->i." ---> ";
		print_r($this->worker->arr);

                if($this->worker->mutex)
                        Mutex::unlock($this->worker->mutex);
		
	}
}

$test_array = array(1,2);
$mutex = Mutex::create();

$pool = new Pool(5,"MyWorker",array($test_array,$mutex));

$test_array[] = 3;

for($i=0;$i<10;$i++){
	$pool->submit(new MyWork($i));
}

$pool->shutdown();

print_r($test_array);

输出:

/usr/local/php-5.5.15/bin/php tarr.php 
0 ---> Array
(
    [0] => 1
    [1] => 2
)
1 ---> Array
(
    [0] => 1
    [1] => 2
)
2 ---> Array
(
    [0] => 1
    [1] => 2
)
3 ---> Array
(
    [0] => 1
    [1] => 2
)
4 ---> Array
(
    [0] => 1
    [1] => 2
)
9 ---> Array
(
    [0] => 1
    [1] => 2
)
8 ---> Array
(
    [0] => 1
    [1] => 2
)
7 ---> Array
(
    [0] => 1
    [1] => 2
)
5 ---> Array
(
    [0] => 1
    [1] => 2
)
6 ---> Array
(
    [0] => 1
    [1] => 2
)
Array
(
    [0] => 1
    [1] => 2
    [2] => 3
)

这里试图拿一个PHP的原生数组对象作为多线程之间共享的内容的容器,很不幸,从结果来看,每个Worker对象直接拷贝了这个数组,并且拷贝的这个数组无法被操作。所以如果希望在多线程之间共享一个PHP原生数组是不被支持的,同样,非pthreads对象和资源也是无法在多线程之间共享。要在多线程之间共享,必须是pthreads对象。

把PHP数组改成一个Stackable对象:

<?php
class MyWorker extends Worker{
        public function __construct($arr,$mutex){
                $this->arr = $arr;
		$this->mutex = $mutex;
        }
        public function run(){

        }
}

class MyWork extends Stackable{
        public function __construct($i){
                $this->i = $i;
        }
	public function run(){
		if($this->worker->mutex)
			Mutex::lock($this->worker->mutex);

		$this->worker->arr[] = "MyWork \$i is:".$this->i;
		echo $this->i." ---> ";
		print_r($this->worker->arr);

                if($this->worker->mutex)
                        Mutex::unlock($this->worker->mutex);
		
	}
}
class MyArray extends Stackable{
	public function run(){}
}

$test_array = new MyArray();
$test_array[] = 1;
$test_array[] = 2;

$mutex = Mutex::create();

$pool = new Pool(5,"MyWorker",array($test_array,$mutex));

$test_array[] = 3;

for($i=0;$i<10;$i++){
	$pool->submit(new MyWork($i));
}

$pool->shutdown();

print_r($test_array);

输出:

/usr/local/php-5.5.15/bin/php tarr.php 
0 ---> MyArray Object
(
    [0] => 1
    [1] => 2
    [2] => 3
    [3] => MyWork $i is:0
)
1 ---> MyArray Object
(
    [0] => 1
    [1] => 2
    [2] => 3
    [3] => MyWork $i is:0
    [4] => MyWork $i is:1
)
2 ---> MyArray Object
(
    [0] => 1
    [1] => 2
    [2] => 3
    [3] => MyWork $i is:0
    [4] => MyWork $i is:1
    [5] => MyWork $i is:2
)
3 ---> MyArray Object
(
    [0] => 1
    [1] => 2
    [2] => 3
    [3] => MyWork $i is:0
    [4] => MyWork $i is:1
    [5] => MyWork $i is:2
    [6] => MyWork $i is:3
)
4 ---> MyArray Object
(
    [0] => 1
    [1] => 2
    [2] => 3
    [3] => MyWork $i is:0
    [4] => MyWork $i is:1
    [5] => MyWork $i is:2
    [6] => MyWork $i is:3
    [7] => MyWork $i is:4
)
6 ---> MyArray Object
(
    [0] => 1
    [1] => 2
    [2] => 3
    [3] => MyWork $i is:0
    [4] => MyWork $i is:1
    [5] => MyWork $i is:2
    [6] => MyWork $i is:3
    [7] => MyWork $i is:4
    [8] => MyWork $i is:6
)
7 ---> MyArray Object
(
    [0] => 1
    [1] => 2
    [2] => 3
    [3] => MyWork $i is:0
    [4] => MyWork $i is:1
    [5] => MyWork $i is:2
    [6] => MyWork $i is:3
    [7] => MyWork $i is:4
    [8] => MyWork $i is:6
    [9] => MyWork $i is:7
)
8 ---> MyArray Object
(
    [0] => 1
    [1] => 2
    [2] => 3
    [3] => MyWork $i is:0
    [4] => MyWork $i is:1
    [5] => MyWork $i is:2
    [6] => MyWork $i is:3
    [7] => MyWork $i is:4
    [8] => MyWork $i is:6
    [9] => MyWork $i is:7
    [10] => MyWork $i is:8
)
5 ---> MyArray Object
(
    [0] => 1
    [1] => 2
    [2] => 3
    [3] => MyWork $i is:0
    [4] => MyWork $i is:1
    [5] => MyWork $i is:2
    [6] => MyWork $i is:3
    [7] => MyWork $i is:4
    [8] => MyWork $i is:6
    [9] => MyWork $i is:7
    [10] => MyWork $i is:8
    [11] => MyWork $i is:5
)
9 ---> MyArray Object
(
    [0] => 1
    [1] => 2
    [2] => 3
    [3] => MyWork $i is:0
    [4] => MyWork $i is:1
    [5] => MyWork $i is:2
    [6] => MyWork $i is:3
    [7] => MyWork $i is:4
    [8] => MyWork $i is:6
    [9] => MyWork $i is:7
    [10] => MyWork $i is:8
    [11] => MyWork $i is:5
    [12] => MyWork $i is:9
)
MyArray Object
(
    [0] => 1
    [1] => 2
    [2] => 3
    [3] => MyWork $i is:0
    [4] => MyWork $i is:1
    [5] => MyWork $i is:2
    [6] => MyWork $i is:3
    [7] => MyWork $i is:4
    [8] => MyWork $i is:6
    [9] => MyWork $i is:7
    [10] => MyWork $i is:8
    [11] => MyWork $i is:5
    [12] => MyWork $i is:9
)

这次不同线程可以操作共享的MyArray对象。这里语句:

$test_array[] = 3;

在生成Pool对象之后执行,从结果可以见,Pool的Worker对象也是在用到时,Pool才实例化之。那么,如果用MyArray作为共享对象,并且在这个对象中有一个成员是PHP的原生数组,那么线程能不能操作这个数组呢?实验证明,也是不可以的。多线程之间只能共享pthreads对象,并且这个对象中的数组、非pthreads对象和资源都是无法被共享的。以下例子试图共享一个数据库链接对象:

<?php
class MyWorker extends Worker{
        public function __construct(MyShare $ms,$mutex){
                $this->share = $ms;
		$this->mutex = $mutex;
        }
        public function run(){}
}

class MyWork extends Stackable{
        public function __construct($i){
                $this->i = $i;
        }
	public function run(){
		if($this->worker->mutex)
			Mutex::lock($this->worker->mutex);

		echo $this->i." ---> ";
		if ($result = mysqli_query($this->worker->share->link,"select id,order_id from ai_order where id > 1000 limit 1")){
			while(($row = mysqli_fetch_assoc($result))) {		
				print_r($row);
			}
		}
                if($this->worker->mutex)
                        Mutex::unlock($this->worker->mutex);
		
	}
}
class MyShare extends Stackable{
	public $link = null;
	public function __construct(){
		$this->link = mysqli_connect("localhost","root","root","ai_manage",3306,"/var/lib/mysql/mysql.sock");
	}
	public function run(){}
}

////////////////////////////////
$link = mysqli_connect("localhost","root","root","ai_manage",3306,"/var/lib/mysql/mysql.sock");
if ($result = mysqli_query($link,"select id,order_id from ai_order where id > 1000 limit 1")){
	while(($row = mysqli_fetch_assoc($result))) {
        	print_r($row);
     	}
}

////////////////////////////////

$myshare = new MyShare();
$mutex = Mutex::create();

$pool = new Pool(5,"MyWorker",array($myshare,$mutex));

for($i=0;$i<2;$i++){
	$pool->submit(new MyWork($i));
}

$pool->shutdown();

输出:

/usr/local/php-5.5.15/bin/php tarr.php 
Array
(
    [id] => 1001
    [order_id] => 60197769529633
)
0 ---> PHP Warning:  mysqli_query(): Couldn't fetch mysqli in /root/tarr.php on line 19

Warning: mysqli_query(): Couldn't fetch mysqli in /root/tarr.php on line 19
1 ---> PHP Warning:  mysqli_query(): Couldn't fetch mysqli in /root/tarr.php on line 19

Warning: mysqli_query(): Couldn't fetch mysqli in /root/tarr.php on line 19

首先证明,mysqli_connect()函数是可以链接上数据库的:

Array
(
    [id] => 1001
    [order_id] => 60197769529633
)

但是在MyShare类的构造函数中的mysqli_connect()没能成功返回资源的引用。所以当Work对象的run方法中调用$this->worker->share->link时就失败了,所以返回了Couldn’t fetch mysql这样的错误。

既然在MyShare内部无法生成和保存数据链接资源类型,那么在外部通过构造函数带入看看是否可以,修改程序如下:

class MyShare extends Stackable{
        public $link = null;
        public function __construct($link){
                $this->link = $link;
        }
        public function run(){}
}

$link = mysqli_connect("localhost","root","root","ai_manage",3306,"/var/lib/mysql/mysql.sock");
if ($result = mysqli_query($link,"select id,order_id from ai_order where id > 1000 limit 1")){
        while(($row = mysqli_fetch_assoc($result))) {
                print_r($row);
        }
}

$myshare = new MyShare($link);

输出了跟上面程序一样的错误。

到这里,已经充分证明,资源类型在多线程之间共享是不可行的,甚至是在pthreads对象中保存资源类型变量都无法完成(因为pthreads对象不知道如何保存它,pthreads对象的数据保存是序列化数据,但是资源类型无法序列化),同样PHP数组和非pthreads对象也不行(拷贝后序列化存储,但是线程无法操作)。如果希望一个对象在多线程之间共享,看来只能继承Stackable类。这个在多线程之间共享的对象成员不能包含PHP原生的数组和非pthareads对象(实际可以包含,但是线程无法操作)。这个pthreads的实现对多线程变量共享进行了诸多限制,实际上线程需要操作数据库是很正常的,但是pthreads无法实现,只能通过一些技巧实现,比如把结果在共享对象中进行标记,在线程退出回到主线程后批量操作数据库。

最后给出多线程编程的一个标准框架:

<?php
class MyWorker extends Worker{
        public function __construct(MyShare $ms,$mutex){
                $this->share = $ms;
                $this->mutex = $mutex;
        }
        public function run(){}
}

class MyWork extends Stackable{
        public function __construct(){
        }
	//如果要操作共享对象,操作共享对象的代码最好进行互斥操作
        public function run(){
                if($this->worker->mutex)
                        Mutex::lock($this->worker->mutex);

		//执行线程代码,MyShare可以当做数组一样来使
		$this->worker->share[] = rand(1,100);

                if($this->worker->mutex)
                        Mutex::unlock($this->worker->mutex);

        }
}
class MyShare extends Stackable{
        public function __construct(){}
        public function run(){}
}

$myshare = new MyShare();
$mutex = Mutex::create();

$pool = new Pool(5,"MyWorker",array($myshare,$mutex));

for($i=0;$i<50;$i++){
        $pool->submit(new MyWork());
}

$pool->shutdown();
Mutex::destroy($mutex);