The PHP DOM library lets you read and modify HTML and XML once you load the markup into a new PHP DOMDocument object. Parsing is the key concept here: it means reading the source markup and converting it into a structure that your code can navigate and change.
For reading and changing HTML and XML, the DOM library is a convenient option, as the extension is bundled with PHP and enabled by default in most builds.
PHP DOMDocument: Main Tips
- The XML data parser called DOM lets you manipulate XML documents in your PHP code.
- DOM is a tree-based parser (as opposed to event-based XML parsers).
XML Data: Example of Tree-Based Structure
To understand how DOM views XML data, let's analyze the following code example:
<?xml version="1.0" encoding="UTF-8"?>
<from>Me</from>
XML data, as seen by DOM, has a tree-type structure:
- The XML document itself is level 1.
- The root element, which is <from>, is level 2.
- The text element, which is Me, is level 3.
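The three levels can be checked with a short recursive walk over the tree. This is a minimal sketch: the helper name describe_levels is hypothetical (it is not part of the DOM library), and the XML is loaded from a string with loadXML() so that the example runs on its own.

```php
<?php
// Hypothetical helper (not part of the DOM library): returns one line per
// node, showing its depth in the tree.
function describe_levels(DOMNode $node, int $level = 1): array
{
    // Text nodes are all named "#text", so show their content instead.
    $label = $node->nodeType === XML_TEXT_NODE ? $node->nodeValue : $node->nodeName;
    $lines = ["level $level: $label"];

    // Only descend into nodes that actually have children.
    if ($node->hasChildNodes()) {
        foreach ($node->childNodes as $child) {
            $lines = array_merge($lines, describe_levels($child, $level + 1));
        }
    }
    return $lines;
}

$doc = new DOMDocument();
$doc->loadXML('<?xml version="1.0" encoding="UTF-8"?><from>Me</from>');

$levels = describe_levels($doc);
print implode("\n", $levels) . "\n";
// Prints:
// level 1: #document
// level 2: from
// level 3: Me
```

The document node reports its name as #document, which is why level 1 is shown that way.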
DOMDocument Properties
The following properties can help you find out information about your XML document:
Property | Description |
---|---|
actualEncoding | Deprecated. It is a read-only property, representing the encoding of a document. |
config | Deprecated. When DOMDocument::normalizeDocument() is called, this configuration property applies. |
doctype | Represents the Document Type Declaration related to a document. |
documentElement | Provides direct access to the root element of the document. |
documentURI | Represents the document location. Returns NULL if location is not found. |
encoding | Represents document encoding as indicated by the XML declaration. |
formatOutput | Helps to organize output with necessary spaces and indentation. |
implementation | Represents the DOMImplementation object which manages the document. |
preserveWhiteSpace | Keeps redundant white space when set to TRUE (the default). |
recover | Proprietary. Turns on recovery mode. This attribute is not from DOM documentation but from libxml. |
resolveExternals | When set to TRUE, this attribute loads external entities from a doctype declaration. It is convenient for adding characters in XML documents. |
standalone | Deprecated. Indicates whether document is standalone. The same as xmlStandalone. |
strictErrorChecking | Throws a DOMException when errors are detected. Defaults to TRUE. |
substituteEntities | Proprietary. Indicates whether entities should be substituted. It is not a part of DOM specification and is unique to libxml. |
validateOnParse | Used for loading and validating against DTD. |
version | Deprecated. Represents the XML version. Same as xmlVersion. |
xmlEncoding | Indicates the encoding of the XML document. NULL for when encoding is not found. |
xmlStandalone | Indicates whether the XML document is standalone. FALSE when this information is not found. |
xmlVersion | Indicates the version number of the XML document. If no declaration is found but document supports XML, the version is 1.0. |
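As a rough illustration of how a few of these properties are read and set, the sketch below loads a small XML string (a shortened, made-up version of a note document) and inspects it. The variable names are chosen for this example only.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<?xml version="1.0" encoding="UTF-8"?><note><to>You</to><from>Me</from></note>');

// Read-only information taken from the XML declaration and the tree.
$version  = $doc->xmlVersion;                 // version from the declaration
$encoding = $doc->encoding;                   // encoding from the declaration
$rootName = $doc->documentElement->nodeName;  // name of the root element

print "version: $version\n";
print "encoding: $encoding\n";
print "root element: $rootName\n";

// formatOutput is writable: ask for indented output when saving.
$doc->formatOutput = true;
$formatted = $doc->saveXML();
print $formatted;
```

Properties such as formatOutput affect how the document is saved, while preserveWhiteSpace and validateOnParse need to be set before loading to have any effect.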
Parsing Code Example
In our example below, we will be using an XML document called note.xml:
<?xml version="1.0" encoding="UTF-8"?>
<note>
<to>You</to>
<from>Me</from>
<heading>The Game</heading>
<body>You lost it.</body>
</note>
How to Load and Output XML Data
This is the script you need to load the XML file into a DOMDocument and print it:
<?php
$xml_doc = new DOMDocument();
$xml_doc->load('note.xml');
print $xml_doc->saveXML();
?>
Here are the steps of this code example:
- We create a new XML DOM Document.
- We use the load() function to load XML data into the object.
- We print the information from the XML DOMDocument we created.
Note: the saveXML() function serializes the internal XML tree into a string, which is then ready to be displayed.
We get the following output:
You Me The Game You lost it.
The browser hides the tags and shows only the text. If you select View source in your browser, the XML source will appear:
<?xml version="1.0" encoding="UTF-8"?>
<note>
<to>You</to>
<from>Me</from>
<heading>The Game</heading>
<body>You lost it.</body>
</note>
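Since saveXML() returns an ordinary string, one possible way to show the tags in the page itself, rather than only in View source, is to escape that string before printing. The sketch below is an assumption-laden variant of the script above: it writes a temporary copy of the note so that it can run on its own, and it checks the return value of load() before continuing.

```php
<?php
// Create a temporary copy of the note so the example is self-contained.
$path = tempnam(sys_get_temp_dir(), 'note');
file_put_contents(
    $path,
    '<?xml version="1.0" encoding="UTF-8"?>' . "\n" .
    '<note><to>You</to><from>Me</from><heading>The Game</heading><body>You lost it.</body></note>'
);

$xml_doc = new DOMDocument();

// load() returns false when the file cannot be read or parsed.
if ($xml_doc->load($path) === false) {
    unlink($path);
    throw new RuntimeException('Could not load the XML file');
}

// saveXML() gives back the whole document as a string.
$xml_string = $xml_doc->saveXML();

// Escaping turns < and > into entities, so a browser displays the tags.
print htmlspecialchars($xml_string);

unlink($path); // remove the temporary file
```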
Executing Loops in XML Files
To load the XML file, access the root element and then iterate through its child nodes, you can use a foreach loop:
<?php
$xml_doc = new DOMDocument();
$xml_doc->load('note.xml');
$i = $xml_doc->documentElement;
foreach ($i->childNodes AS $item) {
print $item->nodeName . " = " . $item->nodeValue . "<br>";
}
?>
This code example follows the same steps: it creates a new XML DOM Document containing the data from the note.xml file. Then, we apply foreach to the child nodes of the root element to print the nodeName and nodeValue of each one.
This is the output we get in such a case:
#text =
to = You
#text =
from = Me
#text =
heading = The Game
#text =
body = You lost it.
#text =
Note: notice the #text entries in our example. The white space between elements is kept in the tree as text nodes, which can cause issues if your code expects only elements.
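One way to avoid the white space entries is sketched below. It sets preserveWhiteSpace to false before loading and, as an extra safeguard, skips anything that is not an element node. The XML is loaded from a string with loadXML() so that the example runs without the note.xml file.

```php
<?php
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<note>
<to>You</to>
<from>Me</from>
<heading>The Game</heading>
<body>You lost it.</body>
</note>
XML;

$xml_doc = new DOMDocument();
$xml_doc->preserveWhiteSpace = false; // must be set before loading
$xml_doc->loadXML($xml);

$pairs = [];
foreach ($xml_doc->documentElement->childNodes as $item) {
    // Skip text nodes, comments and anything else that is not an element.
    if ($item->nodeType !== XML_ELEMENT_NODE) {
        continue;
    }
    $pairs[] = $item->nodeName . ' = ' . $item->nodeValue;
}

print implode("\n", $pairs) . "\n";
// Prints:
// to = You
// from = Me
// heading = The Game
// body = You lost it.
```

The nodeType check alone is enough to hide the #text entries. Setting preserveWhiteSpace to false additionally keeps the blank text nodes out of the tree in the first place.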
Learn to Make PHP Parse HTML
You can also use a PHP DOMDocument to read and modify HTML markup. The following code example shows how you can create a PHP DOMDocument containing HTML:
<?php
$doc = new DOMDocument();
$doc->loadHTML("<html><body>Example<br></body></html>");
echo $doc->saveHTML();
?>
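Once the HTML is loaded, the tree can be changed before it is saved again. The following is a minimal sketch rather than a recommended recipe: it appends a paragraph to the body, and the text "Added by DOM" is made up for the example. The call to libxml_use_internal_errors() is optional here and only keeps parser warnings for imperfect markup from being printed.

```php
<?php
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML('<html><body>Example<br></body></html>');
libxml_clear_errors();

// Find the <body> element and append a new paragraph to it.
$body = $doc->getElementsByTagName('body')->item(0);
$paragraph = $doc->createElement('p', 'Added by DOM');
$body->appendChild($paragraph);

// saveHTML() serializes the modified tree back into an HTML string.
$html = $doc->saveHTML();
print $html;
```

The saved markup may include a doctype declaration that the parser adds on its own, so the output is not necessarily identical to the input plus the new paragraph.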
PHP DOMDocument: Summary
- Using the DOM library, you can handle XML and HTML documents.
- DOM stands for Document Object Model and is a tree-based XML parser.
- Its functionality is built into PHP.