28 August 2008

Module 4

Log Entry of Three Search Results:

URL: http://www.xml.com/pub/a/2007/03/14/a-relational-view-of-the-semantic-web.html

Author: Andrew Newman, 14 March 2007

Institution: O’Reilly XML.com

Summary: “A Relational View of the Semantic Web:”

Software that falls into the category of “Web 2” can search databases with applications designed to examine all of the data, identify the combinations in which pieces of data appear, and interpret the relevance of that data for the user. This software is flexible: it does not need to conform to a rigid database design to conduct a search. The outcome is that data in any database is integrated and accessible to the user. Creating the most suitable software to operate within the Semantic Web and produce these results is therefore the goal of numerous software developers.

Search Engine: Yahoo

URL: http://www.sciam.com/article.cfm?id=the-semantic-web

Author: Tim Berners-Lee, James Hendler and Ora Lassila

Institution: Scientific American magazine May, 2001

Summary: “The Semantic Web:”

(An extract from the article)

“The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation. The first steps in weaving the Semantic Web into the structure of the existing Web are already under way. In the near future, these developments will usher in significant new functionality as machines become much better able to process and "understand" the data that they merely display at present.”

Search Engine: Copernic Agent

URL: http://news.cnet.com/8301-13953_3-9891949-80.html

Author: Posted by Dan Farber 11/3/2008

Institution: cnetnews.com

Summary: “Tim Berners-Lee: Google could be superseded by the semantic Web:”

The web is evolving towards complete integration of data, a tool that will reach into every part of society. A new wave of software applications has already begun to search for and use information automatically. These applications go beyond the ability of current search engines such as Google by creating connections between all sources of information. How quickly the new technology is adopted will depend on drivers such as commerce, usability and cooperation between providers to set standards for recording data. All of these factors need to develop to the point where they produce a cheap, user-friendly and integrated search system.

Module 4

Searching the Web:

Log Entry 1: - Standard Google Search for “Semantic Web”.

(above): These are the top 7 results of a Google search for “Semantic Web”. This is a Web index search, which matches the two words in either the title or the content of a web page. The top results contain the two words together and in order, but further down the list the words may appear in different areas of the web page. That is why the search returns 7,980,000 results.

To make the search more relevant it would be helpful to conduct a further search within the first set of results to focus on a topic such as “applications” or “issues”.


Log Entry 2a: - Modified Yahoo Search, using advanced search features: “Semantic Web” in the page title only.

(above): results 912,000 sites: This is a search on the same topic using Yahoo, made more specific. I asked the search to look for the words “Semantic Web” in that order and only in the title of the page. This should help ensure that each document found is primarily about the Semantic Web and does not merely mention it in passing. Even with this one modification, the results have been narrowed to roughly a tenth of the Google count (912,000 against 7,980,000, or about 11%). Three resources appear in both searches, and the modified search introduces a new article. The main differences will be further down the list, because results that refer to the Semantic Web only in their content have been excluded.

These results, while still numerous, ensure that the initial search returns sources that are primarily about this topic.
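The difference between the broad search and the title-only search can be illustrated with a short sketch. This is only an illustration of the idea, not how Yahoo implements the feature; the `Page` type, the two matching functions and the sample pages are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """A hypothetical, simplified web page: just a title and body text."""
    title: str
    body: str


def matches_anywhere(page: Page, phrase: str) -> bool:
    """Broad search: the phrase may appear in the title or in the body."""
    needle = phrase.lower()
    return needle in page.title.lower() or needle in page.body.lower()


def matches_title_only(page: Page, phrase: str) -> bool:
    """Title-only search: the phrase must appear, in order, in the title."""
    return phrase.lower() in page.title.lower()


# Made-up sample pages, used purely for illustration.
pages = [
    Page("The Semantic Web", "An extension of the current Web."),
    Page("Search engine news", "A passing comment on the semantic web."),
    Page("Database design", "Nothing relevant here."),
]

broad = [p for p in pages if matches_anywhere(p, "Semantic Web")]
narrow = [p for p in pages if matches_title_only(p, "Semantic Web")]
```

Here `broad` keeps the page that only mentions the phrase in passing, while `narrow` drops it, which is the narrowing effect described in this log entry.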

Log Entry 2b: - Deep Web search engine (CompletePlanet) “Semantic Web”.

(above): results 1 site: This is a search using the CompletePlanet “Deep Web Directory” with the same query. The result is a website for a directory-type search engine that uses Semantic Web applications. CompletePlanet conducts a database search, and the Semantic Web is a concept rather than a reference term likely to be found in a database, which probably explains the single result.

Log Entry 2c: - Copernic Search Manager using 6 Search Engines “Semantic Web”.


(above): results 22 sites (less 4 sponsored sites): This is the Copernic search agent, which combines the results of multiple search engines. This search used 6 search engines, and the Yahoo engine was modified with the same filter as in Log Entry 2a (i.e. search for the words “Semantic Web” in titles only) to give the results of a “Deep Web Search”. Copernic has combined duplicate sites and labeled which engines found each one.

There are 3 sites in this search that appear in the broader Google search.
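The way a search manager combines duplicates and labels which engines found each site can be sketched as follows. This is a minimal sketch under stated assumptions and does not describe Copernic's actual method: the `normalize` rule, the `merge_results` function, the engine names and the example URLs are all hypothetical.

```python
def normalize(url: str) -> str:
    """Assumed rule for treating two URLs as the same site: ignore case,
    the scheme, a leading "www." and any trailing slash."""
    url = url.strip().lower()
    for prefix in ("http://", "https://"):
        if url.startswith(prefix):
            url = url[len(prefix):]
    if url.startswith("www."):
        url = url[len("www."):]
    return url.rstrip("/")


def merge_results(results_by_engine: dict) -> dict:
    """Map each distinct site to the list of engines that returned it."""
    merged = {}
    for engine, urls in results_by_engine.items():
        for url in urls:
            engines = merged.setdefault(normalize(url), [])
            if engine not in engines:
                engines.append(engine)
    return merged


# Made-up result lists from two hypothetical engines.
sample = {
    "engine_a": ["http://www.example.org/semantic-web/", "http://example.com/rdf"],
    "engine_b": ["http://example.org/semantic-web", "http://example.net/owl"],
}

merged = merge_results(sample)
for site, engines in merged.items():
    print(site, "found by", ", ".join(engines))
```

The four raw results collapse to three distinct sites, with the shared site labeled as found by both engines. The same mapping could be used to count how many merged sites also appear in a separate search, such as the Google results from Log Entry 1.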


Summary: The results from the Google search are of a general nature, but the first 5 displayed give a broad range of topics and reference material with which to start a research project. The Copernic search manager gives less information about each site, so it is difficult to judge a site’s relevance from the summary of the source. Copernic has a “relevancy indicator” that is used to order the search results, but I would hesitate to rely on this indicator alone and would check the data personally to determine each site’s relevance to my research.


The Copernic search manager has performed satisfactorily (overlooking the first 4 sponsored sites, which may not appear in a paid-for version), as it has reduced the duplication of search results. Given a specific research project or issue to narrow the topic further, and enough practice with the application to be satisfied with its operation, this type of search manager should produce a narrow range of good-quality results.

The Google and Yahoo results provide more details about the source than the other two search engines. The quality of the results is affected by the topic being researched but in this case the results in Google, Yahoo and Copernic are on a par with each other.