Using the Google SOAP Search API - Performing a Google Search
Google searches are performed with the doGoogleSearch method, which returns a GoogleSearchResult object. The method takes ten arguments, in this order: a string containing your license key; a string containing the query; an integer giving the zero-based index of the first result to retrieve; an integer giving the maximum number of results to retrieve (at most ten); a boolean that turns search filtering on or off (filtering eliminates similar results); a string restricting results to a certain country or topic; a boolean that turns SafeSearch on or off; a string restricting results to a certain language; and two more strings that once specified input and output encoding but are now disregarded.
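With ten positional arguments, it is easy to swap two of them by accident. One way to guard against that is to collect the meaningful ones in a small request object before making the call. The sketch below is only an illustration: SearchRequest and its member names are hypothetical and not part of the Google API, and treating a maximum below one as an error is an assumption rather than documented behavior.

```csharp
using System;

// Hypothetical helper, not part of the Google SOAP Search API: it names the
// arguments of doGoogleSearch and checks the two numeric ones.
public sealed class SearchRequest
{
    public const int MaxResultsLimit = 10;   // the article: at most ten per call

    public string Key = "";                  // license key
    public string Query = "";                // query text
    public int Start = 0;                    // zero-based index of the first result
    public int MaxResults = MaxResultsLimit; // results wanted from this call
    public bool Filter = true;               // hide near-duplicate results
    public string Restrict = "";             // country or topic restriction
    public bool SafeSearch = true;
    public string LanguageRestrict = "";     // language restriction

    // Throws if the request breaks one of the limits described above.
    public void Validate()
    {
        if (string.IsNullOrEmpty(Key))
            throw new ArgumentException("A license key is required.");
        if (Start < 0)
            throw new ArgumentOutOfRangeException("Start", "Start is zero-based.");
        // Assumption: a request for fewer than one result is treated as a mistake.
        if (MaxResults < 1 || MaxResults > MaxResultsLimit)
            throw new ArgumentOutOfRangeException("MaxResults", "Must be between 1 and 10.");
    }
}
```

After calling Validate on a request r, the fields would be passed in order, with two empty strings for the disregarded encoding arguments: google.doGoogleSearch(r.Key, r.Query, r.Start, r.MaxResults, r.Filter, r.Restrict, r.SafeSearch, r.LanguageRestrict, "", "").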
Here, we query Google with “Developer Shed” and store the results in search:
GoogleSearchResult search = google.doGoogleSearch("x0x0",
"Developer Shed", 0, 10, true, "", true, "", "", "");
You will, of course, have to substitute the first argument with your own license key.
The line above tells the SOAP Search API to return up to ten results, starting from the very first one. Search filtering is turned on, which, as stated earlier, trims the list by eliminating near-duplicates. SafeSearch is turned on, and the results are not restricted to any language, topic or country.
We can obtain the results of the query through search, starting with some general information:
Console.WriteLine("Query \"{0}\" completed in {1} seconds.",
search.searchQuery, search.searchTime);
Console.WriteLine("Estimated number of results: " +
search.estimatedTotalResultsCount);
Above, we display the query text, the amount of time the query took and the estimated number of results.
We can also access the results themselves (well, ten of them). Here, we iterate through the results and print a summary:
foreach (ResultElement result in search.resultElements)
{
    Console.WriteLine("\n" + result.title);
    Console.WriteLine(result.URL);
    Console.WriteLine(result.snippet);
}
As you can see, resultElements contains an array of ResultElement objects, and a foreach loop iterates through it. Unfortunately, the titles and snippets also contain a bit of HTML markup, which we may not always want. A regular expression removes it easily enough:
// Requires: using System.Text.RegularExpressions;
Regex stripHtml = new Regex("<(.+?)>");
foreach (ResultElement result in search.resultElements)
{
    Console.WriteLine("\n" + stripHtml.Replace(result.title, ""));
    Console.WriteLine(result.URL);
    Console.WriteLine(stripHtml.Replace(result.snippet, ""));
}
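Removing tags does not touch character entities such as &amp;amp;. If the returned text contains any (an assumption, since only HTML markup is mentioned here), they can be decoded after the tags are stripped. The following is a minimal sketch: ResultText and Clean are hypothetical names, while Regex and WebUtility.HtmlDecode come from the .NET class library.

```csharp
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper: strips tags, then decodes character entities.
public static class ResultText
{
    // Same pattern as stripHtml above: the shortest run between < and >.
    private static readonly Regex Tags = new Regex("<(.+?)>");

    public static string Clean(string html)
    {
        if (html == null) return "";
        // Strip tags first so that an encoded "&lt;b&gt;" survives as literal text.
        string withoutTags = Tags.Replace(html, "");
        return WebUtility.HtmlDecode(withoutTags);
    }
}
```

Inside the loop, ResultText.Clean(result.title) and ResultText.Clean(result.snippet) would then take the place of the two stripHtml.Replace calls.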
Now let's say we need the next five results. To retrieve them, simply increase the starting index by ten and set the maximum number of results to five:
search = google.doGoogleSearch("x0x0", "Developer Shed",
10, 5, true, "", true, "", "", "");
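Since each call returns at most ten results, fetching a longer list means making several calls with a growing starting index. The sketch below only works out which (start, maximum) pairs to request; Paging and Plan are hypothetical names, and the service itself is not called.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: splits a wanted range of results into calls of at most ten.
public static class Paging
{
    public const int MaxPerCall = 10; // limit stated in the article

    // Returns { start, count } pairs covering `total` results beginning at `first`.
    public static List<int[]> Plan(int first, int total)
    {
        if (first < 0) throw new ArgumentOutOfRangeException("first");
        if (total < 0) throw new ArgumentOutOfRangeException("total");

        var calls = new List<int[]>();
        for (int start = first; start < first + total; start += MaxPerCall)
        {
            // The last call may ask for fewer than ten results.
            int count = Math.Min(MaxPerCall, first + total - start);
            calls.Add(new[] { start, count });
        }
        return calls;
    }
}
```

Paging.Plan(0, 15) yields the two calls used in this article: start 0 with a maximum of 10, then start 10 with a maximum of 5. Each pair supplies the third and fourth arguments of doGoogleSearch.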
The results can be modified significantly by turning filtering off:
GoogleSearchResult noFilter = google.doGoogleSearch
("x0x0", "Developer Shed", 0, 10, false, "", true, "", "", "");
This produces a number of similar results. In most cases that is undesirable, since a single query then returns a smaller variety of information. Filtering can nonetheless be turned off, as we did above, should you have a reason to do so.
Making use of country, topic and language restrictions is easy. For example, the following query returns only German-language results from Germany. The country of a result is determined by its top-level domain and IP address.
GoogleSearchResult germanSearch = google.doGoogleSearch
("x0x0", "Geschichte", 0, 10, true, "countryDE", true, "lang_de",
"", "");
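The two restriction strings above suggest a simple naming pattern: "country" followed by an upper-case country code, and "lang_" followed by a language code. The sketch below builds strings on that assumption only. Restrictions, Country and Language are hypothetical names, and the pattern is inferred from these two examples, so any value should be checked against Google's published list before use.

```csharp
using System;

// Hypothetical helper: builds restriction strings shaped like the two examples above.
public static class Restrictions
{
    // "de" -> "countryDE" (assumed pattern: "country" + upper-case code).
    public static string Country(string code)
    {
        if (string.IsNullOrEmpty(code))
            throw new ArgumentException("A country code is required.", "code");
        return "country" + code.ToUpperInvariant();
    }

    // "de" -> "lang_de" (assumed pattern: "lang_" + code, case left as given).
    public static string Language(string code)
    {
        if (string.IsNullOrEmpty(code))
            throw new ArgumentException("A language code is required.", "code");
        return "lang_" + code;
    }
}
```

Restrictions.Country("de") and Restrictions.Language("de") produce the two strings passed in the German query above.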
Google also provides a few topics that searches can be restricted to: American government (“unclesam”), Linux (“linux”), Macintosh (“mac”) and FreeBSD (“bsd”). Here, we restrict results to American government:
GoogleSearchResult govSearch = google.doGoogleSearch
("x0x0", "John Paul Jones", 0, 10, true, "unclesam", true, "",
"", "");
Google provides a full list of country, topic and language restrictions on the SOAP Search API website:
http://www.google.com/apis/reference.html#2_4
More By Peyton McCullough