

Using PHP DOMDocument: Code Examples Explained

Reading time 4 min
Published Aug 8, 2017
Updated Oct 15, 2019

The PHP DOM extension lets you read and modify HTML and XML once you load the markup into a new DOMDocument object. Parsing means converting the raw markup text into a structured representation that your code can navigate.

For reading and changing HTML and XML, the DOM extension is a convenient choice because it ships with PHP and is enabled by default in standard builds.

PHP DOMDocument: Main Tips

  • The DOM parser lets you manipulate XML documents in your PHP code.
  • DOM is a tree-based parser (as opposed to event-based XML parsers).

XML Data: Example of Tree-Based Structure

To understand how DOM views XML data, let's analyze the following code example:

Example
<?xml version="1.0" encoding="UTF-8"?>
<from>Me</from>

XML data, as seen by DOM, has a tree-type structure:

  • The XML document itself is level 1.
  • The root element, which is <from>, is level 2.
  • The text node, which contains Me, is level 3.
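
The three levels can be checked with a short recursive walk. This is only a sketch: describe_tree() is a hypothetical helper written for this illustration (it is not part of the DOM extension), and the XML is loaded from a string with loadXML() so the snippet runs without any file.

```php
<?php
// Hypothetical helper (not part of the DOM extension): lists every node
// together with its depth, counting the document itself as level 1.
function describe_tree(DOMNode $node, int $level = 1): array
{
    // Show text nodes by their content and all other nodes by their name.
    $label = $node->nodeType === XML_TEXT_NODE
        ? '"' . $node->nodeValue . '"'
        : $node->nodeName;
    $lines = ["level $level: $label"];

    if ($node->hasChildNodes()) {
        foreach ($node->childNodes as $child) {
            $lines = array_merge($lines, describe_tree($child, $level + 1));
        }
    }
    return $lines;
}

$doc = new DOMDocument();
$doc->loadXML('<?xml version="1.0" encoding="UTF-8"?><from>Me</from>');

$tree_lines = describe_tree($doc);
print implode("\n", $tree_lines) . "\n";
// level 1: #document
// level 2: from
// level 3: "Me"
```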

DOMDocument Properties

The following properties can help you find out information about your XML document:

Property Description
actualEncoding Deprecated. It is a read-only property, representing the encoding of a document.
config Deprecated. When DOMDocument::normalizeDocument() is called, this configuration property applies.
doctype Represents the Document Type Declaration related to a document.
documentElement Provides direct access to the root element of a document.
documentURI Represents the document location. Returns NULL if location is not found.
encoding Represents document encoding as indicated by the XML declaration.
formatOutput Helps to organize output with necessary spaces and indentation.
implementation Represents the DOMImplementation object which manages the document.
preserveWhiteSpace Specifies whether redundant white space is kept when a document is loaded. The default is TRUE (white space is kept).
recover Proprietary. Turns on recovery mode. This attribute is not from DOM documentation but from libxml.
resolveExternals When set to TRUE, this attribute loads external entities from a doctype declaration. It is convenient for adding characters in XML documents.
standalone Deprecated. Indicates whether document is standalone. The same as xmlStandalone.
strictErrorChecking Throws a DOMException when errors are detected. The default is TRUE.
substituteEntities Proprietary. Indicates whether entities should be substituted. It is not a part of DOM specification and is unique to libxml.
validateOnParse When set to TRUE, the document is validated against its DTD while it is loaded. The default is FALSE.
version Deprecated. Represents the XML version. Same as xmlVersion.
xmlEncoding Indicates the encoding of the XML document. NULL for when encoding is not found.
xmlStandalone Indicates whether the XML document is standalone. FALSE when this information is not found.
xmlVersion Indicates the version number of the XML document. If no declaration is found but document supports XML, the version is 1.0.
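
As a rough illustration of how a few of these properties behave, the sketch below reads documentElement, xmlEncoding and xmlVersion and sets preserveWhiteSpace and formatOutput. The inline XML string and the variable names are assumptions made for this example only.

```php
<?php
$doc = new DOMDocument();

// Parsing options such as preserveWhiteSpace only take effect
// if they are set before the document is loaded.
$doc->preserveWhiteSpace = false;
$doc->loadXML('<?xml version="1.0" encoding="UTF-8"?><note><to>You</to></note>');

// Read-only information about the loaded document.
$root_name = $doc->documentElement->nodeName; // name of the root element
$encoding  = $doc->xmlEncoding;               // taken from the XML declaration
$version   = $doc->xmlVersion;

print "root: $root_name, encoding: $encoding, version: $version\n";

// formatOutput affects serialization, so it can be set after loading.
$doc->formatOutput = true;
print $doc->saveXML();
```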

Parsing Code Example

In our example below, we will be using an XML document called note.xml:

Example
<?xml version="1.0" encoding="UTF-8"?>
<note>
  <to>You</to>
  <from>Me</from>
  <heading>The Game</heading>
  <body>You lost it.</body>
</note>

How to Load and Output XML Data

Use the following script to load an XML file into a DOMDocument and print it:

Example
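<?php
  // Create an empty DOM document
  $xml_doc = new DOMDocument();
  // Parse the note.xml file into the document
  $xml_doc->load('note.xml');

  // Serialize the document to an XML string and print it
  print $xml_doc->saveXML();
?>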

Here are the steps of this code example:

  1. We create a new DOMDocument object.
  2. We use the load() method to read the XML data from a file into that object.
  3. We print the contents of the DOMDocument we created.

Note: the saveXML() method serializes the internal XML tree into a string, which is then ready to be displayed.
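
Loading can fail when the XML is malformed. Here is a minimal sketch of one way to handle that, assuming the XML arrives as a string rather than a file; parse_and_serialize() is a hypothetical helper name, while loadXML(), libxml_use_internal_errors(), libxml_get_errors() and libxml_clear_errors() are standard PHP functions.

```php
<?php
// Collect libxml parse errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical helper: returns the serialized XML, or null when parsing fails.
function parse_and_serialize(string $xml): ?string
{
    $xml_doc = new DOMDocument();
    if ($xml_doc->loadXML($xml) === false) {
        foreach (libxml_get_errors() as $error) {
            print 'Parse error: ' . trim($error->message) . "\n";
        }
        libxml_clear_errors();
        return null;
    }
    return $xml_doc->saveXML();
}

$good = parse_and_serialize('<note><to>You</to><from>Me</from></note>');
$bad  = parse_and_serialize('<note><to>You</note>'); // mismatched tags
```

Checking the return value this way keeps a broken document from being printed as if it had loaded correctly.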

We get the following output:

You Me The Game You lost it.

The browser shows only the text content because it does not render the XML tags. If you select View source in your browser, you will see the XML markup itself:

Example
<?xml version="1.0" encoding="UTF-8"?>
<note>
  <to>You</to>
  <from>Me</from>
  <heading>The Game</heading>
  <body>You lost it.</body>
</note>
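
If you want the markup itself to be visible on the page rather than only in View source, one option is to escape it before printing. The sketch below assumes an inline XML string and uses the standard htmlspecialchars() function; the pre wrapper is only a presentation choice.

```php
<?php
$xml_doc = new DOMDocument();
$xml_doc->loadXML('<note><to>You</to><from>Me</from></note>');

// Escape <, > and " so the browser displays the tags as text
// instead of interpreting them as markup.
$escaped = htmlspecialchars($xml_doc->saveXML());

print '<pre>' . $escaped . '</pre>';
```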

Executing Loops in XML Files

To iterate over the parsed data, load the document and loop through the child nodes of the root element with foreach:

Example
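<?php
  $xml_doc = new DOMDocument();
  $xml_doc->load('note.xml');

  // The root element of the document (<note>)
  $root = $xml_doc->documentElement;
  // Print the name and value of every child node of the root
  foreach ($root->childNodes as $item) {
    print $item->nodeName . " = " . $item->nodeValue . "<br>";
  }
?>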

This code example follows the same steps: it creates a new DOMDocument containing the data from the note.xml file. Then a foreach loop prints the nodeName and nodeValue of each child node of the root element.

This is the output we get in such a case:

Example
#text = 
to = You
#text = 
from = Me
#text = 
heading = The Game
#text = 
body = You lost it.
#text =

Note: the output contains empty #text entries. They are the white space (line breaks and indentation) between elements, which the DOM parser keeps as text nodes by default. Account for them when you iterate, or they may cause issues.
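
Two common ways to deal with those white space nodes are sketched below: filtering by node type while iterating, or setting preserveWhiteSpace to FALSE before loading. The XML is given as an inline string (an assumption made so the snippet runs on its own), and the variable names are illustrative.

```php
<?php
$xml = "<note>\n  <to>You</to>\n  <from>Me</from>\n</note>";

// Option 1: keep the white space nodes but skip everything
// that is not an element while iterating.
$doc = new DOMDocument();
$doc->loadXML($xml);

$pairs = [];
foreach ($doc->documentElement->childNodes as $item) {
    if ($item->nodeType === XML_ELEMENT_NODE) {
        $pairs[] = $item->nodeName . ' = ' . $item->nodeValue;
    }
}
print implode("\n", $pairs) . "\n";

// Option 2: drop whitespace-only text nodes during parsing.
$clean = new DOMDocument();
$clean->preserveWhiteSpace = false; // must be set before loading
$clean->loadXML($xml);

$count = $clean->documentElement->childNodes->length;
print "child nodes without white space: $count\n";
```

Option 1 leaves the document untouched, which matters if you later save it and want the original indentation; option 2 is simpler when you only read the data.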

Learn to Make PHP Parse HTML

DOMDocument can also parse and modify HTML. The following code example shows how to create a DOMDocument containing HTML:

Example
<?php
$doc = new DOMDocument();
$doc->loadHTML("<html><body>Example<br></body></html>");
echo $doc->saveHTML();
?>
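
Once HTML is loaded, it can be changed through the same DOM methods. The following is a small sketch, assuming a made-up page with a single link whose href should be rewritten; the markup and the /old and /new paths are purely illustrative.

```php
<?php
$doc = new DOMDocument();
$doc->loadHTML('<html><body><p>Example</p><a href="/old">Link</a></body></html>');

// Find every <a> element and point it at a new address.
foreach ($doc->getElementsByTagName('a') as $link) {
    $link->setAttribute('href', '/new');
}

// Serialize the modified tree back to an HTML string.
$html = $doc->saveHTML();
print $html;
```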

PHP DOMDocument: Summary

  • Using the DOM extension, you can handle XML and HTML documents.
  • DOM stands for Document Object Model and is a tree-based parser.
  • The extension is bundled with PHP and enabled by default.