GitHub - tristangoossens/php-sitemap-generator: PHP script for generating a XML sitemap for your website

Sitemap generator

Object based PHP script that generates a XML sitemap with the given config options. I made this script because I wanted to automate making a sitemap for google indexing and because there were not a lot of open source sitemap generators out there.

Sitemap format: http://www.sitemaps.org/protocol.html

Features

Feel free to help me implement any of the missing features or add extra features

Generate a sitemap for your website
Multiple options for generating sitemaps
Option to only look through certain filetypes
Load client side Javascript content when crawling
Parse all relative link types (// , # , ?) and more

Installation

Installing this script is simply just downloading both sitemap_config and sitemap_generator and placing them into your project(same directory).

Usage

After installing the script you can use the script by including it into your script

include "/path/to/sitemap-generator.php";

And initializing the class by calling the constructor

// Create an object of the generator class passing the config file
$smg = new SitemapGenerator(include("sitemap-config.php"));
// Run the generator
$smg->GenerateSitemap();

Config

You can alter some of the configs settings by changing the config values.

// Site to crawl and create a sitemap for.
// <Syntax> https://www.your-domain-name.com/ or http://www.your-domain-name.com/
"SITE_URL" => "https://student-laptop.nl/",

// Boolean for crawling external links.
// <Example> *Domain = https://www.student-laptop.nl* , *Link = https://www.google.com* <When false google will not be crawled>
"ALLOW_EXTERNAL_LINKS" => false,

// Boolean for crawling element id links.
// <Example> <a href="#section"></a> will not be crawled when this option is set to false
"ALLOW_ELEMENT_LINKS" => false,

// If set the crawler will only index the anchor tags with the given id.
// If you wish to crawl all links set the value to ""
// <Example> <a id="internal-link" href="/info"></a> When CRAWL_ANCHORS_WITH_ID is set to "internal-link" this link will be crawled
// but <a id="external-link" href="https://www.google.com"></a> will not be crawled.
"CRAWL_ANCHORS_WITH_ID" => "",

// Array with absolute links or keywords for the pages to skip when crawling the given SITE_URL.
// <Example> https://student-laptop.nl/info/laptops or you can just input student-laptop.nl/info/ and it will not crawl anything in that directory
// Try to be as specific as you can so you dont skip 300 pages
"KEYWORDS_TO_SKIP" => array(
    "http://localhost/student-laptop/index", // I already have a href for root ("/") on my page so skip this page
    "/student-laptop/student-laptop.nl/", // Invalid link example
),

// Location + filename where the sitemap will be saved.
"SAVE_LOC" => "sitemap.xml",

// Static priority value for sitemap
"PRIORITY" => 1,

// Static update frequency
"CHANGE_FREQUENCY" => "daily",

// Date changed (today's date)
"LAST_UPDATED" => date('Y-m-d'),

Output

Example output when generating a sitemap using this script

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- 3 total links-->
  <!-- PHP-sitemap-generator by https://github.com/tristangoossens -->
  <url>
    <loc>https://student-laptop.nl/</loc>
    <lastmod>2021-03-10</lastmod>
    <changefreq>daily</changefreq>
    <priority>1</priority>
  </url>
  <url>
    <loc>https://student-laptop.nl/underConstruction</loc>
    <lastmod>2021-03-10</lastmod>
    <changefreq>daily</changefreq>
    <priority>1</priority>
  </url>
  <url>
    <loc>https://student-laptop.nl/article?article_id=1</loc>
    <lastmod>2021-03-10</lastmod>
    <changefreq>daily</changefreq>
    <priority>1</priority>
  </url>
</urlset>

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
README.md		README.md
sitemap-config.php		sitemap-config.php
sitemap-generator.php		sitemap-generator.php
sitemap.xml		sitemap.xml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Sitemap generator

Features

Installation

Usage

Config

Output

About

Releases

Packages

Contributors 2

Languages

tristangoossens/php-sitemap-generator

Folders and files

Latest commit

History

Repository files navigation

Sitemap generator

Features

Installation

Usage

Config

Output

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages