Sitemaps: XML, HTML, etc.

  • Crumb da dumdum
    Corporal

    • Aug 2009
    • 12

    Sitemaps: XML, HTML, etc.

    I am currently reading "The Secrets to Promoting Your Website Online," and it draws a clear distinction between the XML and the HTML site map. What I understand is that the HTML site map is what I have typed at the bottom of each page of my website, for example: www.difesacanecorso.com/thestud.html

    Also, is it better to have an HTML site map at the bottom of each page or a separate page of site map where it is listed out in outline form? Should I do both?

    Lastly, what is the other site map? The one the bots use (I guess). How do I make one? I admit I have not finished the e-book, as I am making changes as I go along, so it is slowing me down. Does the e-book eventually cover XML site maps and how to build them, or is there a helpful video out there I should watch?

    Thanks!
  • Vasili
    Moderator

    • Mar 2006
    • 14683

    #2
    Re: Site maps - HTML vs XML and what are pipes?

    *
    Every website should have these 'Core' elements of compliance and optimization properly installed: an XML Sitemap and a Robots.txt file. It should also consider the optional HTML Sitemap, which serves both Visitors and SE's as an aid to navigation and relevance.


    An HTML Sitemap (different from a 'Textlink Menu' or 'HTML Navigation') is used primarily to give Visitors an overview of your website: it conveys the organization and hierarchy of your Content, and it provides a way to navigate to pages that may not be linked anywhere else, such as 'hidden pages.' These are pages you may not consider important enough to promote, but which you include in your site to deepen its relevance and increase its valuation for optimization purposes. They are hidden only in the sense that their links appear solely in the HTML Sitemap and not in any navigational Menu; their Content remains 'archived' on the site and in compliance as long as each page is mentioned (with at least one page link) in the HTML Sitemap, the XML Sitemap, and the Robots.txt file, and is thus accessible to both Visitors and bots, contributing to overall site relevance metrics and Search Inquiries.

    An HTML Sitemap is also different from a page you create to show 'nested navigation' (the relationships between pages in sets or sub-sets of links to those pages), which is simply a functional visual for Visitors to use in addition to the main menu scheme. An HTML Sitemap can be auto-created as a single compliant file by the same generator that creates an XML Sitemap (which makes it easy for you); once it is generated, you simply add a plain text link to that file (usually in the footer, commonly labeled 'Sitemap') for both Visitors and bots to access easily.
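    As a rough sketch of the outline form (the page names below are placeholders drawn from this thread, not a definitive layout), an HTML Sitemap page is just a nested list of plain links:

        <!-- minimal HTML sitemap page; page names are placeholders -->
        <ul>
          <li><a href="index.html">Home</a></li>
          <li><a href="thestud.html">The Stud</a>
            <ul>
              <li><a href="photogallery.html">Photo Gallery</a></li>
            </ul>
          </li>
          <li><a href="contact.html">Contact</a></li>
        </ul>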

    The very important XML Sitemap is created specifically for the bots and Search Engines, as an aid to help them methodically and organizationally cache and understand your website. It also provides a framework by which to measure and verify the relevance that your page titles, page content, photo titles, and navigational links should all mirror, for greater import to the SE's as they evaluate the overall worth of your site against millions of others. XML Sitemaps are also auto-generated; once the file is complete, it too is linked via a simple text link (also usually in the footer, labeled simply 'XML', with the link written as http://www.YourSite.com/sitemap.xml). Because it is intended for bot and SE use only, it should be "Visitor disabled" by placing a transparent Shape over the text link to prevent persons from clicking on it.
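    For reference, generated sitemap.xml files follow the standard www.sitemaps.org protocol. A minimal sketch, assuming a placeholder domain and dates (the generator fills in the real values for you):

        <?xml version="1.0" encoding="UTF-8"?>
        <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
          <url>
            <loc>http://www.YourSite.com/index.html</loc>
            <lastmod>2009-08-15</lastmod>
            <changefreq>monthly</changefreq>
            <priority>1.0</priority>
          </url>
          <url>
            <loc>http://www.YourSite.com/contact.html</loc>
          </url>
        </urlset>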

    The Robots.txt file is also important to any site, as it declares the "Rules" for the bots and SE's to follow as they scour your site. It is the primary effective means of controlling what the bots see or don't see (and cache) on your site, and the only real way to "emphasize" which pages are most important to be cached and evaluated (read: "promoted"), and to forbid images, pages, or other specific elements from being cached or "seen" by any bots. Any pages, images, or links disallowed by the robots.txt file should also be present in your XML Sitemap file, as explained in the second post of the thread mentioned below. The robots.txt file does not need any link on your site whatsoever: once it is uploaded into your Root Directory as a file, it will be one of the first things the SE's and bots look for when they arrive to do their work.
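    A minimal robots.txt sketch (the disallowed path is a placeholder; the Sitemap line points the bots to your XML file):

        User-agent: *
        Disallow: /private-page.html
        Sitemap: http://www.YourSite.com/sitemap.xml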


    Remember! Create and install an updated XML Sitemap, Robots.txt, and HTML Sitemap after every update to your website (new pages, changed links, added/deleted titled images, etc.) so they accurately reflect the structure of your website: any deviation from 100% accuracy of relevance will result in a penalty from the SE's.


    READ MORE ABOUT THE XML SITEMAP AND THE ROBOTS.txt FILE USE AND CREATION IN THIS THREAD
    The HTML Sitemap is also mentioned in that thread, though less prominently; all three files can be auto-generated at >> www.xml-sitemaps.com


    *
    For a deeper understanding, simply use a Google Search to find the many resources that offer various explanations, one of which is sure to present the material in a manner that you are comfortable learning from.
    . VodaWebs....Luxury Group
    * Success Is Potential Realized *


    • Crumb da dumdum
      Corporal

      • Aug 2009
      • 12

      #3
      That was super helpful!

      I will go back to all of my pages, remove the footer that carries my 'homemade' site map, and make a separate page that is the HTML site map in outline form, as you suggested. I will post a link to that at the bottom of each page. This will certainly make updating a whole lot easier: instead of adding a link to the new page on every page, I will only have to update one page. Regarding the XML sitemap and ROBOTS.txt, I am still clueless. I will read the posts you suggested and see if that helps. Thanks again Vasili!

      I remember reading this before and I am still confused. Other than pages I have made, like www.difesacanecorso.com/contact.html or www...com/photogallery.html, I have no idea how to choose which files I want to disallow. The article keeps using images as an example. Why would someone not want images to be searched? Maybe I need "ROBOTS.txt for Dummies".

      Also, someone else asked exactly what I was thinking: so I make this file in Notepad and save it. 1. What do I save it as? 2. Where do I put it? Someone else commented to 'just pop it into your public_html folder'. I have no idea what that even means. I think I'll work on my HTML site map for now and give you time to try to make me a little more internet competent.

      Thanks again, Vasili, you are always a saving grace for me.

      3 Posts Merged As One


      • Vasili
        Moderator

        • Mar 2006
        • 14683

        #4
        Re: Site maps - HTML vs XML and what are pipes?

        *
        Be sure you understand the concepts fully as detailed above, and remember to implement the required elements (disabling the XML file link with a transparent Shape over it, etc.) to assure the proper function of any added file or Rules convention.

        A.
        Originally posted by Crumb da dumdum
        I remember reading this before and I am still confused. Other than pages I have made, like www.difesacanecorso.com/contact.html or www...com/photogallery.html, I have no idea how to choose which files I want to disallow. The article keeps using images as an example. Why would someone not want images to be searched? Maybe I need "ROBOTS.txt for Dummies".
        Whether or not you go to extra lengths to 'protect' your Titled images from being plagiarized, by using a watermark, label, or other method (most do not even think of implementing any protection for their proprietary images, specially titled or not), Google especially will separately cache any image it has access to, making it available in its ever-growing library for anyone who runs a simple Search to download and use however they want, with no permission from you required.

        To thwart the uncontrolled re-distribution of images that you consider uniquely yours, the robots.txt file establishes the Rules that all bots and SE's follow when visiting and caching your site resources: creating a "Disallow" (or 'do not follow') Rule for a photo gallery page, or for pages with specific images, will prevent those pages from being included in any public cache that is ripe for global plagiarism.
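        As a sketch, assuming your gallery page is the one you mentioned (adjust the paths to your actual file names), the Rules might read:

            User-agent: *
            Disallow: /photogallery.html

            User-agent: Googlebot-Image
            Disallow: /

        The second block is optional: it asks Google's image bot to skip the entire site, which is the bluntest way to keep your images out of Google's image library.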


        B.
        Originally posted by Crumb da dumdum
        Also, someone else asked exactly what I was thinking: so I make this file in Notepad and save it. 1. What do I save it as? 2. Where do I put it? Someone else commented to 'just pop it into your public_html folder'. I have no idea what that even means. I think I'll work on my HTML site map for now and give you time to try to make me a little more internet competent. Thanks again, Vasili, you are always a saving grace for me.
        1. Please re-read the suggested Thread carefully .... it explains fully how to establish the logical Rules that will govern the SE's and bots as they visit your site.

        2. You can auto-generate all of the files at the www.xml-sitemaps.com website, and you may find the additional explanations and Tips informative as you refine your understanding of the fine points of Internet Protocol and Optimization.

        3. The files will automatically be saved in the proper format (sitemap.xml, sitemap.html, robots.txt) as they are created for you; you will need to download them onto your system prior to uploading them to the Root Directory (public_html/) of your hosting account. There is no need to make any changes to them before uploading, and you must accurately create text links on your pages using these same file names (i.e. http://www.YourDomain.com/sitemap.xml - see the sketch after this list).

        4. Once you have downloaded the properly formatted, auto-generated files to your system (it is suggested you create a separate, special folder to keep them in, for handy access and to prevent confusion with similar files for other websites), it is simple to transfer them to your Root Directory using the BlueFTP tool included in BlueVoda itself, which makes proper identification of directories and folders much more obvious than the cPanel FILE MANAGER Uploader tool.
        From the BlueVoda Toolbar: TOOLS > FTP MANAGER ....
        *Please review this Tutorial to learn how to use this tool effectively.
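        For illustration, a minimal sketch of the footer text links mentioned in Step 3 (the domain is a placeholder):

            <!-- footer textlinks to the sitemap files; replace the domain with your own -->
            <a href="http://www.YourDomain.com/sitemap.html">Sitemap</a>
            <a href="http://www.YourDomain.com/sitemap.xml">XML</a>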


        * Be sure that any Rule created in the robots.txt is also included in the sitemap.xml file, and vice-versa: they must mirror each other with 100% accuracy to avoid penalty.

        * To manually edit or update the sitemap.xml file, the robots.txt file, or the sitemap.html file, use only the Notepad program included on your PC, as this program (unlike MS Word or even OpenOffice) does not apply any text formatting that would interfere with proper code function or file 'readability' by bots or SE's.
        . VodaWebs....Luxury Group
        * Success Is Potential Realized *


        • Crumb da dumdum
          Corporal

          • Aug 2009
          • 12

          #5
          Re: Site maps - HTML vs XML and what are pipes?

          I think I get it a little better. I'm sure it will make more sense as I am doing it. I am making some big changes to my website. When I looked at www.xml-sitemaps.com, the third line made mention of broken links. Right now I have a ton, since I am adding new pages. I will finish my new design in BlueVoda and then move on to the XML site map. Ugh, maybe I am getting ahead of myself here.

          I was making my own HTML sitemap by using the 'text' option in BlueVoda and just outlining my website. Is this bad? Is it better to use www.xml-sitemaps.com to make my sitemap.html?

          Hopefully this thread does not disappear so I can reference it when I have finished making my updates. Thanks again, I'm sure I'll have more questions soon.


          • Vasili
            Moderator

            • Mar 2006
            • 14683

            #6
            Re: Sitemaps: HTML, XML, etc.?

            Originally posted by Crumb da dumdum
            I was making my own HTML sitemap by using the 'text' option in BlueVoda and just outlining my website. Is this bad? Is it better to use www.xml-sitemaps.com to make my sitemap.html?
            Using HTML Textlinks in the footer of your pages is the best way to compensate for the lack of hard-coded links the SE's require to demonstrate a relationship between pages, a relationship that goes missing if a website relies primarily on dynamic scripted links that are not 'read' by the SE's at all (Drop-Down Menus, Sliding Menus, roll-over buttons, etc.). HTML Links are basically "Text Links" by nature, and are nothing like the HTML Sitemap file, which is truly a file in code format listing all the pages, images, shapes, and every other element of a website, noted in HTML (each entry then becoming a usable link to that object). This format is simply easy for the bots and SE's to follow, ensuring they note every item and page of a website as they organize their caching; a file so formatted is definitely not "Visitor-friendly". It is created in HTML for the various bots that primarily read HTML rather than XML or other coding. XML is simply "the new code on the block," if that helps explain why two [different] sitemap files are now required.
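            For illustration, footer Text Links are ordinary HTML anchors written directly into the page (the page names here are placeholders):

                <a href="index.html">Home</a> | <a href="contact.html">Contact</a> |
                <a href="photogallery.html">Photo Gallery</a> | <a href="sitemap.html">Sitemap</a>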

            As explained in the first reply above, your inclination to provide an "HTML Sitemap Page" for Visitors is entirely your choice, and it is usually seen as an added bonus as Visitors drill in to the pages and topical Content that interest them most. Mixing up the terminology is common, however; your understanding of a 'Sitemap' in this sense needs to re-focus on the fact that you are using Text-based Hyperlinks, arranged in a visually organized manner on a page, to show the hierarchy of your site's structure, whether or not this "Sitemap" includes otherwise un-mentioned pages or resources not present in your Dynamic menu or Primary navigation scheme. HTML Sitemap pages are also especially helpful from a site design standpoint, for example when a site has too many pages to list all the Text Links in the footer without becoming an eyesore just to compensate for the Dynamic Menu used.
            *The debate lives on whether the 'Sitemap' page fully compensates for the lack of every page being "hard-linked" on/to the Index page itself via HTML links (in a simple menu, as part of composed Content, or added in the footer) when Dynamic navigation is used primarily, since the Index page is the single most important page of any site. But the evidence (i.e. SERP results) from many thousands of sites so far seems to show there is no negative aspect, as long as the site has properly created robots.txt, sitemap.xml, and sitemap.html files in place for bots and SE's to follow naturally, as a means of compensating for Dynamic unreadability and any "missing" footer textlinks. It follows that a single "referring" HTML Textlink to a 'Sitemap Page', in addition to the required HTML Textlink to the XML Sitemap file, is all the hard-coded linking needed to fulfill the protocol requirements.

            You can certainly create a Text Link in the footer of each page labeled "Sitemap" that takes Visitors to a single website Page with the nested navigation displayed as Text Links in a hierarchy; this, in addition to any Primary or Dynamic menu scheme, will allow Visitors to see a complete overview of your site and navigate more intelligently.


            1. Finish the updates and changes to your site completely, including the "Sitemap" page with Text Links in a hierarchy layout.

            2. Once all pages have been re-published with the site-wide navigation updates (including the "Sitemap" text link in the footer to your 'Sitemap Page'), return to www.xml-sitemaps.com to generate a sitemap.xml file and a sitemap.html file, and save them to your local system.

            3. Create a robots.txt file and carefully edit the Rules if necessary. Once your robots.txt file is accurate, open your sitemap.xml file in Notepad to make sure it reflects each and every specific Rule created in your robots.txt file.

            4. Upload all three files using BlueFTP to your Root Directory to complete the task.


            If necessary, review each of the replies in this thread to understand the concepts completely before beginning, and follow the outlined steps precisely for the predictable results that are necessary for the optimal performance of your website.


            THREAD CLOSED
            . VodaWebs....Luxury Group
            * Success Is Potential Realized *

