I had a page with an obscure URL that wasn't linked from any other page on my site or anywhere else on the web. My sitemap file didn't include this page either.
Yet I just saw that Google indexed the page and it shows up in search results.
How can that happen?
The URL for this page was completely obscure, like:
Google's job is to index everything it can, so they're going to take advantage of everything that isn't illegal or ethically questionable. There are about four ways that I can think of, and I'm sure Google uses them all:
1. Links between pages. The most obvious. Even though you don't think there are any pages that link to yours, it's possible that you haven't accounted for something. I had pages getting unexpectedly indexed, and later discovered that there was a special system page that listed all pages on my site.
2. Pages submitted to them via https://www.google.com/webmasters/tools/submit-url
3. Pages visited through Google tools and utilities, possibly including things like Chrome and the Google Toolbar. (Yes, there's a cost associated with free things. There's no such thing as a free lunch.)
4. Information it can glean from requests made to Google itself (like pulling the HTTP referer out of the request).
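To make point 4 concrete, here is a minimal sketch of how any server that receives a request can read the address of the page the visitor came from. The header text, the domain names, and the `referring_url` helper are made up for illustration; this only shows the mechanism and says nothing about how Google actually processes requests.

```python
from email.parser import Parser


def referring_url(raw_headers: str):
    """Return the value of the Referer header from raw HTTP header text, or None."""
    # HTTP headers use the same "Name: value" layout as email headers,
    # so the standard library's email parser can read them.
    headers = Parser().parsestr(raw_headers, headersonly=True)
    return headers.get("Referer")  # header lookup is case-insensitive


# Hypothetical headers a browser might send after the visitor clicks a link
# on the "obscure" page that points to some other site.
raw = (
    "Host: www.example.org\n"
    "Referer: https://example.com/some-obscure-page\n"
    "User-Agent: ExampleBrowser/1.0\n"
)

print(referring_url(raw))
```

So if the obscure page contains even one outbound link or third-party resource, the receiving server may learn that the page exists.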
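If you don't want a page indexed by Google and other search engines, the officially sanctioned way is to tell them so. A robots.txt file asks crawlers not to fetch the page, and a noindex robots meta tag (or X-Robots-Tag HTTP header) asks them not to list it in results. Note that robots.txt only blocks crawling, so a URL that is discovered some other way can still show up in results, and a crawler has to be able to fetch the page in order to see a noindex directive on it. Neither mechanism makes the page private; if it really must stay hidden, put it behind authentication.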
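As a rough illustration of what a robots.txt rule looks like and how a well-behaved crawler interprets it, here is a small sketch using Python's standard `urllib.robotparser`. The `/private/` path, the example.com URLs, and the `is_allowed` helper are assumptions made up for the example, not anything from the original site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask every crawler to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A crawler that honors the file would skip the first URL and fetch the second.
print(is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/private/page.html"))
print(is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/index.html"))
```

Keep in mind that robots.txt is itself publicly readable, so listing a secret path in it tells anyone who looks where that path is.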