I have a WordPress site where I want to stop search engines from crawling an entire directory. I know I can do this in the robots.txt file (in the root of the site) by adding a "Disallow" line for that directory.
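To be concrete, I mean something along these lines, where /private/ is just a placeholder for the actual directory name:

    User-agent: *
    Disallow: /private/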
However, on the same site I am using the "XML Sitemap" plugin to automatically build and submit a sitemap.xml whenever any content on the site changes. Unfortunately, there is no way to automatically stop the plugin from listing pages within the directory that I do not want crawled. Each time I add a new page within that directory, I have to manually exclude it from the sitemap (the plugin allows for this).
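So until I manually exclude a new page, the generated sitemap ends up containing an entry like this for it (the URL is only an example):

    <url>
      <loc>http://example.com/private/new-page/</loc>
    </url>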
My question is: which takes precedence, the robots.txt file or the sitemap.xml file? In other words, if a page is listed in sitemap.xml, will search engines still crawl it when its parent directory is disallowed in robots.txt?