Google now does DNS resolving too

Google now does DNS resolving too. You can switch if you’re using OpenDNS.

Introducing Google Public DNS: A new DNS resolver from Google

Today, as part of our efforts to make the web faster, we are announcing Google Public DNS, a new experimental public DNS resolver.

The DNS protocol is an important part of the web’s infrastructure, serving as the Internet’s “phone book”. Every time you visit a website, your computer performs a DNS lookup. Complex pages often require multiple DNS lookups before they complete loading. As a result, the average Internet user performs hundreds of DNS lookups each day, that collectively can slow down his or her browsing experience.

  • Speed: Resolver-side cache misses are one of the primary contributors to sluggish DNS responses. Clever caching techniques can help increase the speed of these responses. Google Public DNS implements prefetching: before the TTL on a record expires, we refresh the record continuously, asynchronously and independently of user requests for a large number of popular domains. This allows Google Public DNS to serve many DNS requests in the round trip time it takes a packet to travel to our servers and back.
  • Security: DNS is vulnerable to spoofing attacks that can poison the cache of a nameserver and can route all its users to a malicious website. Until new protocols like DNSSEC get widely adopted, resolvers need to take additional measures to keep their caches secure. Google Public DNS makes it more difficult for attackers to spoof valid responses by randomizing the case of query names and including additional data in its DNS messages.
  • Validity: Google Public DNS complies with the DNS standards and gives the user the exact response his or her computer expects without performing any blocking, filtering, or redirection that may hamper a user’s browsing experience.

(Source: Google)
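
The case randomization mentioned under Security is often called 0x20 encoding. Here is a rough PHP sketch of the idea, not Google's actual code: the resolver scrambles the letter case of the name it asks for, and only trusts a reply that echoes exactly the same casing. The function names `randomize_case` and `response_matches` are made up for illustration.

```php
<?php
// Hypothetical helper: randomize the letter case of a DNS query name.
function randomize_case(string $name): string
{
    $out = '';
    foreach (str_split($name) as $ch) {
        // Only ASCII letters have a case to flip; digits, dots and hyphens pass through.
        if (ctype_alpha($ch)) {
            $ch = random_int(0, 1) === 1 ? strtoupper($ch) : strtolower($ch);
        }
        $out .= $ch;
    }
    return $out;
}

// Hypothetical check: accept a response only if it echoes the exact casing we sent.
function response_matches(string $sent, string $echoed): bool
{
    return $sent === $echoed;
}

$query = randomize_case('www.example.com');
echo $query, "\n";                                     // casing varies on every run
var_dump(response_matches($query, $query));            // bool(true)
var_dump(strcasecmp($query, 'www.example.com') === 0); // bool(true): still the same name
```

An attacker who wants to forge a reply now has to guess the casing as well, which gets harder the more letters the name has.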

Just use the following name servers:

[code lang="bash"]nameserver[/code]

Nice numbers. They got those from Level 3.

In 1996, Larry Page made this post

This was posted in 1996:

From: Lawrence Page

I have a web robot which is a Java app. I need to be able to set the User-Agent field in the HTTP header in order to be a good net citizen (so people know who is accessing their server). Anyone have any ideas?

Right now, Java sends a request that includes something like:

User-Agent: Java/1.0beta2

I’d rather not rewrite all the HTTP stuff myself. I tried just searching in the JDK for the Java/1.0beta2 figuring I could just change the string, but I couldn’t find it. Perhaps it is stored as a unicode string?

An easy method of setting the User-Agent field should probably be added to Java, so people can properly identify their programs.

Thanks, Larry Page

A web robot? Guess what it is.

[via Guyro]

Bing is now default search engine on IE6

What a terrible practice. Microsoft appears innocent by claiming it is currently investigating a solution.

Bing Is Now Your Default Search Engine On IE6, Whether You Like It Or Not

The Next Web reports that users of Internet Explorer 6 are being forced to use Bing as their default search engine — even if they’ve manually switched their preference to another search provider, like Google. Attempts to switch the browser to something other than Bing result in an error message.

While the vast majority of users affected probably won’t even notice the change, some are beginning to complain (you can find threads in Google’s forums here and here). Microsoft has confirmed the issue to Search Engine Roundtable, explaining that it is currently investigating a solution. (Source: Techcrunch)

By the way, Bing is currently my second-highest referrer.

List of stop words

Stop words, sometimes known as stopwords or noise words (in the case of SQL Server), is the name given to words which are filtered out prior to, or after, processing of natural language data (text). Hans Peter Luhn, one of the pioneers in information retrieval, is credited with coining the phrase and using the concept in his design. It is controlled by human input and not automated. This is sometimes seen as a negative approach to the natural articles of speech as mentioned above. (Source: Wikipedia)

Here’s a list of stop words, compiled from Mark Sanderson’s Information Retrieval linguistic utilities stop words list. It has been formatted as a PHP array for easy use:

[code lang="php"]$stop_words = array("a", "about", "above", "across", "after", "afterwards", "again", "against", "all", "almost", "alone", "along", "already", "also", "although", "always", "am", "among", "amongst", "amoungst", "amount", "an", "and", "another", "any", "anyhow", "anyone", "anything", "anyway", "anywhere", "are", "around", "as", "at", "back", "be", "became", "because", "become", "becomes", "becoming", "been", "before", "beforehand", "behind", "being", "below", "beside", "besides", "between", "beyond", "bill", "both", "bottom", "but", "by", "call", "can", "cannot", "cant", "co", "computer", "con", "could", "couldnt", "cry", "de", "describe", "detail", "do", "done", "down", "due", "during", "each", "eg", "eight", "either", "eleven", "else", "elsewhere", "empty", "enough", "etc", "even", "ever", "every", "everyone", "everything", "everywhere", "except", "few", "fifteen", "fifty", "fill", "find", "fire", "first", "five", "for", "former", "formerly", "forty", "found", "four", "from", "front", "full", "further", "get", "give", "go", "had", "has", "hasnt", "have", "he", "hence", "her", "here", "hereafter", "hereby", "herein", "hereupon", "hers", "herself", "him", "himself", "his", "how", "however", "hundred", "i", "ie", "if", "in", "inc", "indeed", "interest", "into", "is", "it", "its", "itself", "keep", "last", "latter", "latterly", "least", "less", "ltd", "made", "many", "may", "me", "meanwhile", "might", "mill", "mine", "more", "moreover", "most", "mostly", "move", "much", "must", "my", "myself", "name", "namely", "neither", "never", "nevertheless", "next", "nine", "no", "nobody", "none", "noone", "nor", "not", "nothing", "now", "nowhere", "of", "off", "often", "on", "once", "one", "only", "onto", "or", "other", "others", "otherwise", "our", "ours", "ourselves", "out", "over", "own", "part", "per", "perhaps", "please", "put", "rather", "re", "same", "see", "seem", "seemed", "seeming", "seems", "serious", "several", "she", "should", "show", "side",
"since", "sincere", "six", "sixty", "so", "some", "somehow", "someone", "something", "sometime", "sometimes", "somewhere", "still", "such", "system", "take", "ten", "than", "that", "the", "their", "them", "themselves", "then", "thence", "there", "thereafter", "thereby", "therefore", "therein", "thereupon", "these", "they", "thick", "thin", "third", "this", "those", "though", "three", "through", "throughout", "thru", "thus", "to", "together", "too", "top", "toward", "towards", "twelve", "twenty", "two", "un", "under", "until", "up", "upon", "us", "very", "via", "was", "we", "well", "were", "what", "whatever", "when", "whence", "whenever", "where", "whereafter", "whereas", "whereby", "wherein", "whereupon", "wherever", "whether", "which", "while", "whither", "who", "whoever", "whole", "whom", "whose", "why", "will", "with", "within", "without", "would", "yet", "you", "your", "yours", "yourself", "yourselves");[/code]

And here is a list of Google stop words. I can’t recall where I got it from, but numerous sites carry this information. Once again it is formatted as a PHP array, which you can quite easily convert to a Java array:

[code lang="php"]$google_stop_words = array("I", "a", "about", "an", "are", "as", "at", "be", "by", "com", "de", "en", "for", "from", "how", "in", "is", "it", "la", "of", "on", "or", "that", "the", "this", "to", "was", "what", "when", "where", "who", "will", "with", "und", "www");[/code]

This is useful for filtering out common words in an English paragraph that may be deemed insignificant. This is one of the things I used to implement something like a tag discoverer based on word frequency.
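
To give an idea of how that works, here is a minimal sketch of a word-frequency tag discoverer. It is not my original code: the function name `discover_tags`, the shortened stop word list and the sample sentence are all made up for this example, and in practice you would pass in one of the full arrays.

```php
<?php
// A short sample only; swap in one of the full stop word arrays for real use.
$stop_words = array("a", "about", "an", "and", "the", "is", "of", "on", "to", "in", "it", "for");

// Hypothetical helper: return the $limit most frequent non-stop words in $text.
function discover_tags(string $text, array $stop_words, int $limit = 5): array
{
    // Lowercase the text and split on anything that is not a letter a-z.
    $words = preg_split('/[^a-z]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);

    // Lowercase and flip the list so each stop word check is a key lookup.
    $stop = array_flip(array_map('strtolower', $stop_words));

    $counts = array();
    foreach ($words as $word) {
        if (isset($stop[$word])) {
            continue; // skip stop words
        }
        $counts[$word] = ($counts[$word] ?? 0) + 1;
    }

    arsort($counts); // highest count first, keys preserved
    return array_slice($counts, 0, $limit, true);
}

$text = "The DNS resolver caches DNS records, and the cache makes DNS lookups faster.";
print_r(discover_tags($text, $stop_words, 3)); // "dns" comes first with a count of 3
```

Splitting on non-letters is crude (it breaks “can’t” into “can” and “t”), but it keeps the sketch short. The order of words with equal counts is not something to rely on.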

I compete with Colbie Caillat in Google for “justrealized”

I was googling my website and just realized that it did not come up top. The top result is a song by Colbie Caillat called “Realize” on YouTube. No doubt it’s a beautiful song.

(Competing with Colbie Caillat in Google.)

And here’s the song. I couldn’t watch the one right at the top of the results due to a region restriction on YouTube. Oh come on, there is next to no benefit in restricting a music video to certain regions only.

Colbie Caillat – Realize – Roxy – Hollywood CA

Colbie Marie Caillat (born May 28, 1985 in Newbury Park, California) is an American pop singer-songwriter and guitarist from Malibu, California. (Okay I took that from Wikipedia)

Gathering iPhone developers at Microsoft

It felt kinda funny so I just had to take a picture. I was at MobFest last week and Microsoft was the venue sponsor.

Why develop for iPhone in Microsoft

When Microsoft got their interns to talk about their projects, a few people stood up and left. So basically they weren’t there for Microsoft at all. It’s kinda sad and funny at the same time. I, too, wasn’t there for Microsoft products. I am working in the building next to them, so we just walked over to check MobFest out. After MobFest, Arzhou, Raine and Uzyn had some supper (or dinner). And yesterday, I was at Microsoft Singapore again for RIAction and they had lots of Adobe stuff there. Google and Yahoo! were there too.

I think it is really generous of them to let competitors’ products step into the company. It requires a certain amount of openness and generosity to allow that to happen. I mean, they could have rejected these things and no one would have said that they were selfish. After all, it’s not in their commercial interest to host these sorts of events. Sure, they could slip in a bit of Microsoft talk here and there, but they don’t really have to do that either. I don’t think the company deserves to be made the butt of jokes all the time.

Google explains why all sites may harm your computer

This is Google’s response after telling all its users that every site on the World Wide Web may harm their computer. “Very simply, human error,” they confessed. It’s a huge mistake and has definitely shaken people’s confidence a little. But being truthful about the whole incident, without using the word “whoops” (like Dreamhost did), is still good PR.

“This site may harm your computer” on every search result?!?!

What happened? Very simply, human error. Google flags search results with the message “This site may harm your computer” if the site is known to install malicious software in the background or otherwise surreptitiously. We do this to protect our users against visiting sites that could harm their computers. We maintain a list of such sites through both manual and automated methods. We work with a non-profit called to come up with criteria for maintaining this list, and to provide simple processes for webmasters to remove their site from the list.

We periodically update that list and released one such update to the site this morning. Unfortunately (and here’s the human error), the URL of ‘/’ was mistakenly checked in as a value to the file and ‘/’ expands to all URLs. Fortunately, our on-call site reliability team found the problem quickly and reverted the file. Since we push these updates in a staggered and rolling fashion, the errors began appearing between 6:27 a.m. and 6:40 a.m. and began disappearing between 7:10 and 7:25 a.m., so the duration of the problem for any particular user was approximately 40 minutes. (Source: Google Blog)
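
Google doesn’t say how entries in that file are matched against URLs, so the following is only a toy PHP sketch under my own assumption of simple substring matching. The function `is_flagged` and the list entries are made up. It shows how one stray “/” entry ends up flagging everything:

```php
<?php
// Hypothetical matcher: a URL is flagged when it contains any listed pattern.
// Substring matching is an assumption, not Google's documented behaviour.
function is_flagged(string $url, array $patterns): bool
{
    foreach ($patterns as $pattern) {
        if ($pattern !== '' && strpos($url, $pattern) !== false) {
            return true;
        }
    }
    return false;
}

$good_list = array('malware.example/');      // made-up entry
$bad_list  = array('malware.example/', '/'); // same list plus the stray "/"

var_dump(is_flagged('http://www.example.org/', $good_list)); // bool(false)
var_dump(is_flagged('http://www.example.org/', $bad_list));  // bool(true)
```

Every URL contains a “/”, so with that entry in the list no site can ever come out clean.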

I was a little troubled yesterday and had to use Yahoo for searches, but I wasn’t too concerned. Here’s StopBadware’s side of the story.