large sitemap index woes

Asked by Priya Jain, 3 hours ago

hey folks,

we've made some progress on the dynamic sitemap generation, but now we're hitting a wall with the sitemap index for really large sites. it's causing some serious headaches.

  • The Core Problem: our main sitemap index file either takes forever to build/update, or it doesn't reflect changes to the sub-sitemaps fast enough.
  • Symptoms & Technical Details:
    • Slow generation times, sometimes leading to server timeouts.
    • Google Search Console reports 'couldn't fetch' for the sitemap index occasionally.
    • We're seeing stale lastmod dates for some of the sub-sitemaps in the index, even after they've been updated. it's really frustrating.
  • Example (Illustrative):
    <?xml version="1.0" encoding="UTF-8"?>
    <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <sitemap>
        <loc>https://example.com/sitemap_products_1.xml.gz</loc>
        <lastmod>2023-10-26T10:00:00+00:00</lastmod> <!-- This date is often stale -->
      </sitemap>
      <sitemap>
        <loc>https://example.com/sitemap_products_2.xml.gz</loc>
        <lastmod>2023-10-26T10:00:00+00:00</lastmod>
      </sitemap>
      <!-- ... hundreds more ... -->
    </sitemapindex>
    

    and sometimes, we get a timeout:

    [2023-11-01 04:30:15] cron.ERROR: SitemapIndexGenerationFailed: Process timed out after 300 seconds.
    
  • Our Current Approach: we're using a database query to build the sitemap index, aggregating lastmod from individual sitemap files.
  • Seeking Advice On:
    • Strategies for optimizing sitemap index generation speed for millions of URLs.
    • Best practices for ensuring lastmod dates in the sitemap index are always accurate and timely.
    • Any caching mechanisms or distributed generation techniques for massive sitemap index files.
  • Closing: Really looking for some expert insights here.
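
for reference, the index-building step described under "Our Current Approach" boils down to something like the Python sketch below. this is illustrative only, not our production code: `build_sitemap_index` and the `(loc, lastmod)` tuples are hypothetical names, and it assumes the per-file lastmod values have already been collected (from the database or anywhere else) as timezone-aware datetimes.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(entries):
    """Render a sitemap index document from (loc, lastmod) pairs.

    `entries` is a hypothetical input: an iterable of (url, datetime)
    tuples, one per sub-sitemap, with timezone-aware datetimes.
    """
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        node = ET.SubElement(root, "sitemap")
        ET.SubElement(node, "loc").text = loc
        # W3C datetime format, matching the example index above
        ET.SubElement(node, "lastmod").text = lastmod.isoformat(timespec="seconds")
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    sample = [
        (
            "https://example.com/sitemap_products_1.xml.gz",
            datetime(2023, 10, 26, 10, 0, 0, tzinfo=timezone.utc),
        ),
    ]
    print(build_sitemap_index(sample))
```

the rendering itself is cheap even for hundreds of entries, so the slow part is presumably wherever the `(loc, lastmod)` pairs come from, which is the piece we'd most like advice on.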
