Note: These are rough notes from the W3C Workshop on eGovernment and the Web, held in Washington, DC on June 18th-19th.
Google's aim is to work with government and to evangelize Google's free services.
Google's biggest focus isn't site search; it's Web search. There was a recent NY Times article on search-results quality. The other side of search is crawling pages to discover all the content that exists.
One of the biggest barriers is content hidden behind a search form. Another is a robots.txt file that tells crawlers not to crawl the site.
The Bureau of Alcohol, Tobacco, and Firearms blocks all search-engine crawling with a robots.txt file; as a result, Google's index doesn't even recognize the acronym ATF.
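The effect of a site-wide robots.txt block like the one described can be sketched with Python's standard-library robots.txt parser. The robots.txt content and the URLs below are hypothetical, for illustration only:

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt that blocks every crawler from the entire site,
# as described for the ATF website.
robots_txt = """\
User-agent: *
Disallow: /
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# No page on the site may be fetched by any crawler, so nothing
# from the site can ever appear in a search engine's index.
print(rp.can_fetch("Googlebot", "https://www.example.gov/press/"))  # False
```

Because the crawler never sees a single page, the search engine has no text to associate with the site, which is why even the agency's own acronym can fail to surface it in results.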
The US government is the world's largest publisher of data. The data is put into databases for easy access, but unless it is correctly structured a search engine cannot crawl it.
This is important because of the value that is placed on public-sector information. People trust .gov more than .com: it's unbiased and free.
Microsoft, Yahoo, Ask, and Google have come together to agree on a common Sitemap standard, which can make Web content accessible to search-engine crawlers. It is fairly easy to implement.
There are 4 parameters per URL (the Sitemaps 0.9 protocol's tags):
- URL location (loc)
- last modification date (lastmod)
- change frequency (changefreq)
- priority (priority)
PlainLanguage.gov successfully implemented the sitemaps protocol in around 8 hours. Now the site is being crawled and added to search results. When there are changes to the site, the sitemap is updated and uploaded.
There are now partnerships with about four states.
Searching has become the de facto way for people to find public-sector information.