Tag archive: xml

Problems converting XML to a PHP array (or JSON)

Example:

<root>
   <books>
      <book>1</book>
      <book>2</book>
   </books>
</root>

//
<root>
   <books>
      <book>1</book>
   </books>
</root>

These two documents could be converted into the following arrays:

[
    "root"=>[
        "books" =>[
              ["name" => "book", "value" => 1],
              ["name" => "book", "value" => 2]
         ]
    ]
]
////
[
    "root"=>[
        "books" => ["name" => "book", "value" => 1]
    ]
]

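This result is clearly unsatisfactory. In the first XML document, books maps to a two-dimensional array; in the second, books maps to a one-dimensional array. In other words, before iterating over the data you first have to check whether it is one-dimensional or multi-dimensional. Always turning a books element with a single child into a two-dimensional array with one entry looks convenient. But sometimes an element like books can only ever have one child, and then the value has to be dug out of a two-dimensional array for no reason.

Put differently, the XML structure alone does not say whether books holds one element or many (XML does offer schema support for describing this, which is not discussed here). So a conversion function that maps a single child directly onto books, and multiple children onto a two-dimensional array, is not wrong in itself; it is just very inconvenient to use. Solving the problem requires knowing whether books is single-valued or multi-valued, and that knowledge belongs to one specific data structure, so it cannot be generalized. Compared with JSON, XML really is clumsy.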
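A minimal sketch of the "you have to know in advance" idea, assuming SimpleXML is available. The function name xmlToArray and the $listTags parameter are hypothetical, invented for this illustration, and the array shape is simpler than the name/value pairs shown above. Tags listed in $listTags always become lists; every other tag is treated as single-valued, so a repeated undeclared tag would simply be overwritten.

```php
<?php
/**
 * Convert a SimpleXMLElement into nested arrays.
 * $listTags is knowledge supplied by the caller: the names of child
 * elements that must always be lists, even when only one is present.
 * This cannot be inferred from the XML itself.
 */
function xmlToArray(SimpleXMLElement $node, array $listTags)
{
    // A node without child elements is a leaf: return its text.
    if ($node->count() === 0) {
        return (string) $node;
    }

    $result = [];
    foreach ($node->children() as $name => $child) {
        $value = xmlToArray($child, $listTags);
        if (in_array($name, $listTags, true)) {
            $result[$name][] = $value; // declared list: always append
        } else {
            $result[$name] = $value;   // assumed single-valued
        }
    }
    return $result;
}

$one = xmlToArray(
    simplexml_load_string('<root><books><book>1</book></books></root>'),
    ['book']
);
$two = xmlToArray(
    simplexml_load_string('<root><books><book>1</book><book>2</book></books></root>'),
    ['book']
);

// Both results can be walked with the same loop, no shape check needed.
foreach ([$one, $two] as $doc) {
    foreach ($doc['books']['book'] as $book) {
        echo $book, "\n";
    }
}
```

The point is only that the list of multi-valued tags comes from outside the document, which is exactly why a fully generic converter cannot do this.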

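For a specific data structure, you know in advance which elements are multi-valued and which are single-valued, so they can be handled specially. The following example shows how the PHP eBay SDK converts XML to an array: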

$xml = '<GetCategorySpecificsResponse xmlns="urn:ebay:apis:eBLBaseComponents">'.$cat->asXml().'</GetCategorySpecificsResponse>';
$xmlParser = new \DTS\eBaySDK\Parser\XmlParser('\DTS\eBaySDK\Trading\Types\GetCategorySpecificsResponseType');
$array = $xmlParser->parse($xml)->toArray();

The output of this conversion is normalized, because DTS\eBaySDK\Trading\Types\GetCategorySpecificsResponseType declares the type of every element:

class GetCategorySpecificsResponseType extends \DTS\eBaySDK\Trading\Types\AbstractResponseType
{
    private static $propertyTypes = array(
        'Recommendations' => array(
            'type' => 'DTS\eBaySDK\Trading\Types\RecommendationsType',
            'unbound' => true,
            'attribute' => false,
            'elementName' => 'Recommendations'
        ),
        'TaskReferenceID' => array(
            'type' => 'string',
            'unbound' => false,
            'attribute' => false,
            'elementName' => 'TaskReferenceID'
        ),
        'FileReferenceID' => array(
            'type' => 'string',
            'unbound' => false,
            'attribute' => false,
            'elementName' => 'FileReferenceID'
        )
    );
}

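This is time-consuming and laborious: every known data structure has to be mapped out by hand.

Below is the code I used when downloading the eBay category specifics. It splits the large file and stores each piece by category ID: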

        $xmlFile = 'ebay/category-specifics-'.$site.'.xml';
        if(!\Storage::has($xmlFile)) {
            echo "File $xmlFile does not exist; download and extract it first\n";
            return;
        }
        $xml = simplexml_load_file(storage_path('app/'.$xmlFile));
        
        foreach($xml->Recommendations as $cat) {
            // CategoryID is a SimpleXMLElement; cast it to get a plain string
            $cid = (string) $cat->CategoryID;
            // Wrap the fragment in a response envelope so the SDK parser accepts it
            // (a separate variable avoids overwriting $xml, which is being iterated)
            $fragment = '<GetCategorySpecificsResponse xmlns="urn:ebay:apis:eBLBaseComponents">'.$cat->asXml().'</GetCategorySpecificsResponse>';
            $xmlParser = new \DTS\eBaySDK\Parser\XmlParser('\DTS\eBaySDK\Trading\Types\GetCategorySpecificsResponseType');
            
            $cacheDir = 'ebay/site/'.$siteName."/specifics";
            $cacheFile = $cacheDir.'/'.$cid.".json";
            @mkdir(storage_path('app/'.$cacheDir), 0777, true);
            
            if(\Storage::has($cacheFile)) {
                \Storage::delete($cacheFile);
            }
            \Storage::put($cacheFile,json_encode($xmlParser->parse($fragment)->toArray()));
            unset($cid, $fragment, $xmlParser, $cacheDir, $cacheFile);
        }

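After this conversion, only the JSON needs to be fetched. On the server side, decoding it back into an array makes it easy to use; for a client, passing the JSON string along is enough. There is no need to worry that a child which may hold multiple elements was not converted into a two-dimensional array when it happens to hold only one, because it is always a two-dimensional array (the handling is uniform).

Returning JSON directly is much simpler. Most APIs developed today use JSON; that is the trend.

Converting XML to a PHP array

PHP itself has no built-in function that converts an XML document or string directly into a PHP array. I found a third-party function by searching and am saving it here.

Original link: http://www.bin-co.com/php/scripts/xml2array/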

<?php
/**
 * xml2array() will convert the given XML text to an array in the XML structure.
 * Link: http://www.bin-co.com/php/scripts/xml2array/
 * Arguments : $contents - The XML text
 *                $get_attributes - 1 or 0. If this is 1 the function will get the attributes as well as the tag values - this results in a different array structure in the return value.
 *                $priority - Can be 'tag' or 'attribute'. This changes the structure of the resulting array. For 'tag', the tags are given more importance.
 * Return: The parsed XML in an array form. Use print_r() to see the resulting array structure.
 * Examples: $array =  xml2array(file_get_contents('feed.xml'));
 *              $array =  xml2array(file_get_contents('feed.xml'), 1, 'attribute');
 */
function xml2array($contents, $get_attributes=1, $priority = 'tag') {
    if(!$contents) return array();

    if(!function_exists('xml_parser_create')) {
        //print "'xml_parser_create()' function not found!";
        return array();
    }

    //Get the XML parser of PHP - PHP must have this module for the parser to work
    $parser = xml_parser_create('');
    xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, "UTF-8"); # http://minutillo.com/steve/weblog/2004/6/17/php-xml-and-character-encodings-a-tale-of-sadness-rage-and-data-loss
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
    xml_parse_into_struct($parser, trim($contents), $xml_values);
    xml_parser_free($parser);

    if(!$xml_values) return array(); //Nothing parsed; keep the return type consistent

    //Initializations
    $xml_array = array();
    $parent = array();
    $opened_tags = array();
    $arr = array();

    $current = &$xml_array; //Reference

    //Go through the tags.
    $repeated_tag_index = array();//Multiple tags with same name will be turned into an array
    foreach($xml_values as $data) {
        unset($attributes,$value);//Remove existing values, or there will be trouble

        //This command will extract these variables into the foreach scope
        // tag(string), type(string), level(int), attributes(array).
        extract($data);//We could use the array by itself, but this is cooler.

        $result = array();
        $attributes_data = array();
        
        if(isset($value)) {
            if($priority == 'tag') $result = $value;
            else $result['value'] = $value; //Put the value in a assoc array if we are in the 'Attribute' mode
        }

        //Set the attributes too.
        if(isset($attributes) and $get_attributes) {
            foreach($attributes as $attr => $val) {
                if($priority == 'tag') $attributes_data[$attr] = $val;
                else $result['attr'][$attr] = $val; //Set all the attributes in a array called 'attr'
            }
        }

        //See tag status and do the needed.
        if($type == "open") {//The starting of the tag '<tag>'
            $parent[$level-1] = &$current;
            if(!is_array($current) or (!in_array($tag, array_keys($current)))) { //Insert New tag
                $current[$tag] = $result;
                if($attributes_data) $current[$tag. '_attr'] = $attributes_data;
                $repeated_tag_index[$tag.'_'.$level] = 1;

                $current = &$current[$tag];

            } else { //There was another element with the same tag name

                if(isset($current[$tag][0])) {//If there is a 0th element it is already an array
                    $current[$tag][$repeated_tag_index[$tag.'_'.$level]] = $result;
                    $repeated_tag_index[$tag.'_'.$level]++;
                } else {//This section will make the value an array if multiple tags with the same name appear together
                    $current[$tag] = array($current[$tag],$result);//This will combine the existing item and the new item together to make an array
                    $repeated_tag_index[$tag.'_'.$level] = 2;
                    
                    if(isset($current[$tag.'_attr'])) { //The attribute of the last(0th) tag must be moved as well
                        $current[$tag]['0_attr'] = $current[$tag.'_attr'];
                        unset($current[$tag.'_attr']);
                    }

                }
                $last_item_index = $repeated_tag_index[$tag.'_'.$level]-1;
                $current = &$current[$tag][$last_item_index];
            }

        } elseif($type == "complete") { //Tags that ends in 1 line '<tag />'
            //See if the key is already taken.
            if(!isset($current[$tag])) { //New Key
                $current[$tag] = $result;
                $repeated_tag_index[$tag.'_'.$level] = 1;
                if($priority == 'tag' and $attributes_data) $current[$tag. '_attr'] = $attributes_data;

            } else { //If taken, put all things inside a list(array)
                if(isset($current[$tag][0]) and is_array($current[$tag])) {//If it is already an array...

                    // ...push the new element into that array.
                    $current[$tag][$repeated_tag_index[$tag.'_'.$level]] = $result;
                    
                    if($priority == 'tag' and $get_attributes and $attributes_data) {
                        $current[$tag][$repeated_tag_index[$tag.'_'.$level] . '_attr'] = $attributes_data;
                    }
                    $repeated_tag_index[$tag.'_'.$level]++;

                } else { //If it is not an array...
                    $current[$tag] = array($current[$tag],$result); //...Make it an array using the existing value and the new value
                    $repeated_tag_index[$tag.'_'.$level] = 1;
                    if($priority == 'tag' and $get_attributes) {
                        if(isset($current[$tag.'_attr'])) { //The attribute of the last(0th) tag must be moved as well
                            
                            $current[$tag]['0_attr'] = $current[$tag.'_attr'];
                            unset($current[$tag.'_attr']);
                        }
                        
                        if($attributes_data) {
                            $current[$tag][$repeated_tag_index[$tag.'_'.$level] . '_attr'] = $attributes_data;
                        }
                    }
                    $repeated_tag_index[$tag.'_'.$level]++; //0 and 1 index is already taken
                }
            }

        } elseif($type == 'close') { //End of tag '</tag>'
            $current = &$parent[$level-1];
        }
    }
    
    return($xml_array);
}  

Test:

$xml = '<?xml version="1.0" encoding="utf-8"?>
<root>
  	<ele id="1">1</ele>
	<ele id="2">2</ele>
	<bok>
		<one ids="111" name="vfeelit">111</one>
		<one ids="222" name="ifeeline">222</one>
		<two>2</two>
	</bok>
</root>';

$arr = xml2array($xml,1,'attribute');
print_r($arr);

Output:

Array
(
    [root] => Array
        (
            [ele] => Array
                (
                    [0] => Array
                        (
                            [value] => 1
                            [attr] => Array
                                (
                                    [id] => 1
                                )

                        )

                    [1] => Array
                        (
                            [value] => 2
                            [attr] => Array
                                (
                                    [id] => 2
                                )

                        )

                )

            [bok] => Array
                (
                    [one] => Array
                        (
                            [0] => Array
                                (
                                    [value] => 111
                                    [attr] => Array
                                        (
                                            [ids] => 111
                                            [name] => vfeelit
                                        )

                                )

                            [1] => Array
                                (
                                    [value] => 222
                                    [attr] => Array
                                        (
                                            [ids] => 222
                                            [name] => ifeeline
                                        )

                                )

                        )

                    [two] => Array
                        (
                            [value] => 2
                        )

                )

        )

)

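Elements with the same name are automatically collected into an array. Each element maps to an associative array in which the key value holds the element's text and the key attr holds its attributes.

PHP XML handling - SimpleXML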
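As a point of comparison, here is a minimal sketch of a different, widely used shortcut: round-tripping a SimpleXML object through JSON. This is not the xml2array() function above, and the helper name simpleXmlToArray is made up for this example. It shows the same behaviour: repeated elements become a list, while a lone element stays a plain string.

```php
<?php
// Hypothetical helper: SimpleXML object -> JSON text -> associative array.
function simpleXmlToArray(string $xml): array
{
    $element = simplexml_load_string($xml);
    if ($element === false) {
        return []; // the input could not be parsed
    }
    return json_decode(json_encode($element), true);
}

$many = simpleXmlToArray('<root><ele>1</ele><ele>2</ele></root>');
$one  = simpleXmlToArray('<root><ele>1</ele></root>');

var_dump(is_array($many['ele'])); // repeated element: a list
var_dump(is_array($one['ele']));  // single element: a plain string
```

So code that consumes either result still has to check the shape before looping, which is the problem described in the first post.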

The SimpleXML extension provides a very simple and easily usable toolset to convert XML to an object that can be processed with normal property selectors and array iterators.

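This extension requires the libxml PHP extension. That means building with --enable-libxml, although this happens implicitly because libxml is enabled by default.

Note: SimpleXML is not the only extension that needs the libxml PHP extension; many other XML extensions need it too. The libxml PHP extension in turn depends on the system packages libxml2 and libxml2-devel, so install those before building PHP and then enable the libxml PHP extension when compiling. There is no need to pass --enable-libxml explicitly because it is the default. If the system libxml2 library cannot be found, the PHP build may fail; in that case --with-libxml-dir can be used to point to the directory where libxml2 is installed.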

Installation
This extension is enabled by default. It can be disabled at compile time with: --disable-simplexml
Note: Before PHP 5.1.2, --enable-simplexml is required to enable this extension.

Loading XML
1 Loading a file
The simplexml_load_file() function loads an XML file into an object:
object simplexml_load_file(string filename [, string class_name])
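If the file cannot be loaded, FALSE is returned. If the optional class_name argument is given, an object of that class is returned. The class named by class_name must extend SimpleXMLElement.

2 Loading a string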
object simplexml_load_string(string data)

3 Loading an XML DOM document
The simplexml_import_dom() function converts a DOM document node into a SimpleXML node.
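The three loading routes side by side, as a small sketch. The sample XML and the temporary file are made up for the example; the temporary file only stands in for a real path on disk.

```php
<?php
$xmlText = '<?xml version="1.0"?><root><item>a</item></root>';

// 1. Load from a string.
$fromString = simplexml_load_string($xmlText);

// 2. Load from a file (a temporary file stands in for a real one).
$path = tempnam(sys_get_temp_dir(), 'sxml');
file_put_contents($path, $xmlText);
$fromFile = simplexml_load_file($path);
unlink($path);

// 3. Import from an existing DOM document.
$dom = new DOMDocument();
$dom->loadXML($xmlText);
$fromDom = simplexml_import_dom($dom);

foreach ([$fromString, $fromFile, $fromDom] as $root) {
    // A failed load does not yield a SimpleXMLElement, so check first.
    if (!($root instanceof SimpleXMLElement)) {
        echo "load failed\n";
        continue;
    }
    echo $root->getName(), ': ', $root->item, "\n";
}
```

All three variables end up as SimpleXMLElement objects over the same content, so the rest of the code does not care which route was used.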

Parsing XML

<?xml version='1.0' standalone='yes'?>
<movies>
 <movie>
  <title>PHP: Behind the Parser</title>
  <characters>
   <character>
    <name>Ms. Coder</name>
    <actor>Onlivia Actora</actor>
   </character>
   <character>
    <name>Mr. Coder</name>
    <actor>El Act</actor>
   </character>
  </characters>
  <plot>
   So, this language. It's like, a programming language. Or is it a
   scripting language? All is revealed in this thrilling horror spoof
   of a documentary.
  </plot>
  <great-lines>
   <line>PHP solves all my web problems</line>
  </great-lines>
  <rating type="thumbs">7</rating>
  <rating type="stars">5</rating>
 </movie>
</movies>

// Output of var_dump()
object(SimpleXMLElement)#1 (1) {
  ["movie"]=>
  object(SimpleXMLElement)#2 (5) {
    ["title"]=>
    string(22) "PHP: Behind the Parser"
    ["characters"]=>
    object(SimpleXMLElement)#3 (1) {
      ["character"]=>
      array(2) {
        [0]=>
        object(SimpleXMLElement)#5 (2) {
          ["name"]=>
          string(9) "Ms. Coder"
          ["actor"]=>
          string(14) "Onlivia Actora"
        }
        [1]=>
        object(SimpleXMLElement)#6 (2) {
          ["name"]=>
          string(9) "Mr. Coder"
          ["actor"]=>
          string(6) "El Act"
        }
      }
    }
    ["plot"]=>
    string(162) "
   So, this language. It's like, a programming language. Or is it a
   scripting language? All is revealed in this thrilling horror spoof
   of a documentary.
  "
    ["great-lines"]=>
    object(SimpleXMLElement)#4 (1) {
      ["line"]=>
      string(30) "PHP solves all my web problems"
    }
    ["rating"]=>
    array(2) {
      [0]=>
      string(1) "7"
      [1]=>
      string(1) "5"
    }
  }
}

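As the output shows, the XML document is mapped onto objects and arrays. When a node has more than one child with the same name, those children are exposed as an array (each array element is itself wrapped as an object; in practice every child can be treated as if it were wrapped in an array). Node attributes are also reachable with array syntax, which makes the document very convenient to work with.

Most common operations:
Opening an XML document or saving to an XML document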

simplexml_load_file()
simplexml_load_string()
simplexml_import_dom()

Navigating to child nodes

// Example
$xml = simplexml_load_string($xmlstr);
$xml->movie->title; // the title node (cast to string for its text); same as $xml->movie[0]->title[0]
$xml->movie->characters->character[0]; // the first character node
$xml->movie->rating[0]['type']; // the type attribute of the first rating node

Reading or setting a node value

echo $xml->movie->title;  // prints the node value directly
$xml->movie->title = "Hello SimpleXML"; // assign directly

Reading or setting an attribute value

echo $xml->movie->rating[0]['type'];
$xml->movie->rating[0]['type'] = "Set Attribute.";
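The snippets above put together into one runnable sketch. The XML here is a trimmed copy of the movies document from the parsing section, and the variable names are only for illustration.

```php
<?php
$xmlstr = <<<XML
<movies>
 <movie>
  <title>PHP: Behind the Parser</title>
  <characters>
   <character><name>Ms. Coder</name></character>
   <character><name>Mr. Coder</name></character>
  </characters>
  <rating type="thumbs">7</rating>
  <rating type="stars">5</rating>
 </movie>
</movies>
XML;

$xml = simplexml_load_string($xmlstr);

// Reading: property access returns SimpleXMLElement objects,
// so cast to string when a plain value is needed.
$title      = (string) $xml->movie->title;
$secondName = (string) $xml->movie->characters->character[1]->name;
$type       = (string) $xml->movie->rating[0]['type'];

echo $title, "\n", $secondName, "\n", $type, "\n";

// Writing: assign to the property (node value) or array key (attribute).
$xml->movie->title = 'Hello SimpleXML';
$xml->movie->rating[0]['type'] = 'points';

echo $xml->movie->title, "\n", $xml->movie->rating[0]['type'], "\n";
```

The explicit (string) casts matter mostly when the values are compared or stored; echo performs the conversion on its own.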

Adding or deleting nodes (attributes)

// Adding or deleting nodes (attributes) is very easy
Add: just create one by assigning to it
Delete: just unset() it

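Of course, SimpleXMLElement's addChild() and addAttribute() can be used as well, and nodes or attributes can also be fetched with children() and attributes().

In addition, use getNamespaces() to get the XML namespaces, getName() to get a node's name, and count() to get the number of child elements. To output the current XML as a string, use asXML().

That covers most of this extension. It makes working with XML very simple and truly lives up to its name.

Permalink: http://blog.ifeeline.com/319.html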
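A short sketch of adding, deleting and inspecting nodes with the methods named above. The sample document and the values (year, director, country) are invented for the example.

```php
<?php
$xml = simplexml_load_string(
    '<movies><movie><title>T</title><rating type="stars">5</rating></movie></movies>'
);
$movie = $xml->movie[0];

// Add by plain assignment.
$movie->year = '2004';          // creates <year>2004</year>
$movie->rating['scale'] = '10'; // creates the attribute scale="10"

// Add with the explicit methods.
$director = $movie->addChild('director', 'A. Coder');
$director->addAttribute('country', 'NZ');

// Delete with unset().
unset($movie->title);          // removes the <title> node
unset($movie->rating['type']); // removes the type attribute

// Inspect what is left.
$names = [];
foreach ($movie->children() as $child) {
    $names[] = $child->getName();
}
echo implode(', ', $names), "\n";
echo $movie->count(), " child elements\n";

foreach ($movie->rating->attributes() as $attrName => $attrValue) {
    echo $attrName, '=', $attrValue, "\n";
}

// Serialize the whole document back to a string.
echo $xml->asXML(), "\n";
```

Assignment is the shortest way to add something; addChild() is handy when the new node is needed right away, as with $director here.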