29 August 2008

Module 4 - Annotation

Summary:

The Web has made huge amounts of information available to society and has been a technological leap in communication and in data storage, retrieval and display. Even so, storage and retrieval of data have depended on the ability of search software to find data stored in structured formats (schemas). Data must be recognized as legitimate and contain a complete set of attributes before it can be retrieved.

Creating suitable applications to perform these “relational” tasks means developing software that is not constrained to selecting data stored under a predetermined schema. This software needs to be “agile” enough to collect, store and retrieve data in a format that responds to its quality and quantity and can evolve as data is changed or added.


The relational database relies on the primary key to link related data from different tables, and it will not retrieve and display that data if the primary key is missing or different. The Semantic Web, however, has the potential to retrieve data from unrelated databases, which requires a more flexible alternative to the traditional notion of the primary key.
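
The contrast described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tables, the `author_id` key and the `article:` and `person:` identifiers are all made up for the example, and the triple store is reduced to a plain set of subject-predicate-object statements rather than a real Semantic Web database.

```python
# Hypothetical relational data: two tables linked by a primary key (author_id).
authors = {1: {"name": "Andrew Newman"}}
articles = [
    {"title": "A Relational View of the Semantic Web", "author_id": 1},
    {"title": "An Unlinked Article", "author_id": None},  # key is missing
]


def relational_join(articles, authors):
    """Return (title, author name) pairs only where the key matches."""
    return [
        (row["title"], authors[row["author_id"]]["name"])
        for row in articles
        if row["author_id"] in authors
    ]


# The same kind of facts written as subject-predicate-object statements.
# Statements from two unrelated sources can simply be pooled, because each
# one names its subject with a shared identifier rather than a table key.
source_a = {
    ("article:relational-view", "title", "A Relational View of the Semantic Web"),
    ("article:relational-view", "author", "person:newman"),
}
source_b = {("person:newman", "name", "Andrew Newman")}
pooled = source_a | source_b


def objects(triples, subject, predicate):
    """All objects stated for a given subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


joined = relational_join(articles, authors)
author_ids = objects(pooled, "article:relational-view", "author")
author_names = {name for a in author_ids for name in objects(pooled, a, "name")}
```

In the relational half, the article with the missing key silently drops out of `joined`. In the pooled half, the author's name is found by following a shared identifier from one source into the other, which is the kind of flexibility the paragraph above attributes to the Semantic Web.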


SPARQL is a query language for RDF, the data format used on the Semantic Web. It plays a role similar to that of SQL for relational databases and can operate alongside the existing Web data storage structure whilst accessing Semantic Web data. In the example of a database, the Semantic Web has access to information which reaches beyond the data defined in a single table.


The ability of this type of software to relate data in one database table to data in another database will create “one huge database” and make the next evolutionary leap in data storage and retrieval.

Preference:

  1. The summary shown for the article was an introduction to a number of articles on the topic of the Semantic Web and not directly attributable to the content of the researched article. Therefore, although that summary was relevant to the Semantic Web, the article by Andrew Newman specifically describes a type of software capable of searching and interacting in the Semantic Web environment. It contains some technical information which, although very detailed, was not helpful to the conceptual discussion that I was looking for.

The annotation which I have written is a summary of this particular article. I would prefer it because it gives a more informed description of what the article actually contains.

  2. External users would prefer to use the annotation that I have written. It takes longer to read than the “snapshot” on the original source, but a description of the key elements of the article saves more time than a general summary of all the articles located at this site.

Newman, Andrew, “A Relational View of the Semantic Web”. March 14th, 2007. Retrieved from: http://www.xml.com/pub/a/2007/03/14/a-relational-view-of-the-semantic-web.html

Module 4

Evaluating Web Sites:

Type of Content:

The website I have chosen is O’ReillyXML.com, which hosts the article I am evaluating. This is a media website that publishes articles and resource material. The information is more reliable than that on a site such as a blog but not as proven as a research paper. This site, however, reproduces topical articles from many newspapers and magazines.

Search Engine: Yahoo “Semantic Web”

The description of this article, although brief, has more credence because of the website that it appears on, i.e. the “Semantic Web Resource Center”. This site is a collection of commercial and public resources for research groups and has a catalogue of articles on the Semantic Web.

The headline of this article is: “A Relational View of the Semantic Web”.

The “blurb” reads “Creating a Web of machine readable information, leading the Web to its fullest potential…”. However, this is an introduction to a list of articles about the Semantic Web and is not directly attributable to this article.

URL: http://www.xml.com/pub/a/2007/03/14/a-relational-view-of-the-semantic-web.html

Author: Andrew Newman, 14th March 2007

Institution: O’ReillyXML.com

Summary: “A Relational View of the Semantic Web”

Abstract:

Software which falls into the category of “Web 2” has the ability to search and find information drawn from databases, using applications that are designed to search all data, locate and identify the combinations in which bits of data appear, and interpret the relevance of the data for the user. This new software is flexible enough that it does not need to conform to rigid database design features to conduct a search. The outcome is that data integrated from any database becomes available and accessible to the user. Therefore, creating the most suitable software to operate within the Semantic Web and produce these results is the goal of numerous software developers.

Relevance to Purpose:

I selected this article for its relevance to the concept of Meta Data as the data required to produce quality search results in research. The concept of the Semantic Web is an evolutionary step in using Meta Data, and the title implies that the relational aspect between types of Meta Data is an important factor in the development of the Semantic Web.

Purpose of the Site:

This site would be described as commercial because it invites subscription to its service for a fee and carries an abundance of advertising material. It sells books online, and reference material can be downloaded for a fee. However, the article is valid and the information can be tracked to a reliable source.


O’ReillyXML.com is a resource and publishing center. Membership is required to post to the site, but articles are free to read. It would be a useful search tool for future reference.

Author:

The author of this article is recorded in the search results. The facts noted in the article are suitably referenced in the bibliography for authenticity. The credentials of the author are available on a separate reference page:


Andrew Newman

“Andrew Newman is currently working for the University of Queensland's eResearch centre and part time on his Honours. He has previously worked on Kowari and continues to actively support the RDF API for Java, JRDF. His current interests include SPARQL, defeasible logic, agile databases using RDF, ontology development, and software development methodologies.”

As the source is a media website, the article should be seen as an opinion on a topical issue and not a research document.

Publisher:

The article is published by “© 2008, O'Reilly Media, Inc.”. The Website produces a directory of articles about the topic but states at the foot of the page that copyright is owned by the individual authors. There is an invitation on the site to submit articles on topics that are new and “it helps if you know that we tend to publish "high end" books rather than books for dummies, and generally don't want yet another book on a topic that's already well covered.”

Content Bias/Balance:

The articles discuss a new area of Web use, the Semantic Web. The article doesn’t lend itself to being biased unless support for the Semantic Web concept counts as bias. There is a substantial amount of information and speculation that is supported by explanatory data and a bibliography.

Coverage:

Other articles about the topic are published on the same site and are listed in the directory. Using Google, I also found 27 related sites.

Currency:

The date of the article was recorded in the search results. The development of this concept will make each article time sensitive because of the changing technology. However, this site seems to be a popular resource for publications and attracts current information on the topic. A search of this site on the Semantic Web produced 9,842 references to similar articles. The range of posting dates appears to be from 2000 to 2008, but the articles are not listed in date order, so finding this information required a manual search.

Signs of Recognition:


Links to http://www.xml.com/semweb/ (above) – 6 (3 are blogs, 2 are linked to the publisher’s Webpage and 1 is an educational reference).

Tags for this Site:


Tags for www.xml.com (above) – 13 bookmarked items from this site, used by 5,585 people. Is this a significant number of tags for this type of site? A search in del.icio.us for “Semantic Web” resulted in 23,915 bookmarked items by thousands of people, which suggests that www.xml.com is not an especially popular website for this information.

Blog reaction to website www.xml.com: 1,432. The “authority rating” is respectable for blogs on a specialized topic. A search for blogs with references to the “semantic web” returned 8,281. Not all references from xml.com are about the semantic web, so more comparison would have to be made to determine whether bloggers think this site is popular.


Blog reaction to website www.xml.com: 1,432. The “authority rating” for these blogs is similar to that of the previous ones, but more comparison with other sites would be needed to gauge relative popularity.

Citations of Articles by other Researchers: This is a search using the title of the article to see whether it has been cited in any research articles. The search found that the article had been cited in one published paper.

Module 4

Boolean searching task:

The biggest number of hits relating to these key words:

Using OR:

Semantic OR Web: would initiate a search for pages containing either of the individual words or both words combined.

Semantic

http://www.google.com.au/search?hl=en&as_qdr=all&q=semantic&btnG=Search&meta

Web

http://www.google.com.au/search?hl=en&as_qdr=all&q=web&btnG=Search&meta

Semantic OR Web

http://www.google.com.au/search?hl=en&as_q=&as_epq=&as_oq=semantic+web&as_eq=&num=10&lr=&as_filetype=&ft=i&as_sitesearch=&as_qdr=all&as_rights=&as_occt=any&cr=&as_nlo=&as_nhi=&safe=images

Search Items        Results

Semantic            23,600,000

Web                 4,210,000,000

Semantic OR Web     4,230,000,000
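
As a rough check on these figures, an OR search can never return fewer pages than the larger single-word search, nor more than the two single-word counts added together. The sketch below applies that reasoning to the three numbers recorded above. The function name `or_bounds` is made up for the example, and the search engine's counts are treated as rounded estimates, so the implied overlap is only indicative.

```python
def or_bounds(count_a, count_b):
    """Bounds on |A OR B|, since |A OR B| = |A| + |B| - |A AND B|."""
    lowest = max(count_a, count_b)  # one set lies entirely inside the other
    highest = count_a + count_b     # the two sets share no pages at all
    return lowest, highest


# Result counts recorded in the table above.
semantic = 23_600_000
web = 4_210_000_000
reported_or = 4_230_000_000

low, high = or_bounds(semantic, web)
consistent = low <= reported_or <= high

# Pages that would have to contain both words for the three counts to agree.
implied_overlap = semantic + web - reported_or
```

Here `consistent` is true, so the reported OR count sits inside the possible range, very close to the upper end.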

Information most relevant to what I actually wanted to look for:

Using Semantic AND Web AND Concepts: would initiate a search for sites that contained all of these words. As I was searching for information on the concept of the Semantic Web, I could exclude references to “applications” by using NOT and ().

(Semantic AND Web AND Concepts) NOT applications: http://www.google.com.au/search?hl=en&as_q=&as_epq=Semantic+Web+concepts&as_oq=&as_eq=applications&num=10&lr=&as_filetype=&ft=i&as_sitesearch=&as_qdr=all&as_rights=&as_occt=title&cr=&as_nlo=&as_nhi=&safe=images

Using Google, a Boolean search with these parameters returned 49 results.

Google doesn’t seem to recognize the Boolean operators AND and NOT and substitutes its own “Implied Boolean Logic” instead: terms are combined with AND by default, and the + and - symbols are used to require or exclude a term. The following site address provides a list of corresponding Boolean Operators used by Google: http://www.googleguide.com/advanced_operators_reference.html
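
To make the logic concrete, here is a minimal sketch of how (Semantic AND Web AND Concepts) NOT applications behaves on a tiny collection, and how the same request looks when written in the implied style with a space for AND and - for NOT. The four pages and both function names are hypothetical, invented only for this illustration. It does not reproduce how any real search engine ranks or counts results.

```python
# Hypothetical mini-collection: page name -> words that appear on the page.
pages = {
    "page1": {"semantic", "web", "concepts"},
    "page2": {"semantic", "web", "concepts", "applications"},
    "page3": {"semantic", "web"},
    "page4": {"web", "applications"},
}


def search(pages, all_of=(), none_of=()):
    """Pages containing every word in all_of (AND) and no word in none_of (NOT)."""
    required, excluded = set(all_of), set(none_of)
    return sorted(
        name
        for name, words in pages.items()
        if required <= words and not (excluded & words)
    )


def implied_query(all_of=(), none_of=()):
    """Write the same request in the implied style: space for AND, '-' for NOT."""
    return " ".join(list(all_of) + ["-" + word for word in none_of])


hits = search(pages, all_of=["semantic", "web", "concepts"], none_of=["applications"])
query = implied_query(["semantic", "web", "concepts"], ["applications"])
```

Only `page1` satisfies the expression: `page2` is removed by the NOT term, and `page3` and `page4` fail the AND terms. The value of `query` is the string that would be typed into the search box.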

Information coming only from Universities:

  1. http://www.google.com.au/search?hl=en&as_q=Semantic+Web+concepts+universities&as_epq=&as_oq=&as_eq=&num=10&lr=&as_filetype=&ft=i&as_sitesearch=&as_qdr=all&as_rights=&as_occt=any&cr=&as_nlo=&as_nhi=&safe=images – This search added AND Universities. Results: 849,000. The term “Universities” doesn’t limit the results to university sites; it only matches pages that contain the word.

  2. Specialised search engines or nominated types of sites such as *.edu will focus on data from unique areas: http://www.google.com.au/search?hl=en&as_q=&as_epq=Semantic+Web&as_oq=&as_eq=&num=10&lr=&as_filetype=&ft=i&as_sitesearch=.edu&as_qdr=all&as_rights=&as_occt=title&cr=&as_nlo=&as_nhi=&safe=images This search is powered by Google but only brought up results from educational institutions. Results: 15,400.
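
The search addresses above can also be assembled programmatically rather than through the advanced search form. The sketch below uses Python's standard `urllib.parse` module and only parameter names that appear in the addresses recorded in this task. The meanings given in the comments are assumptions inferred from those addresses and from the advanced search form, and the function name `advanced_search_url` is made up for the example.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def advanced_search_url(exact_phrase="", excluded="", site="", occurs="any"):
    """Build a search address from a few parameters seen in the URLs above."""
    params = {
        "hl": "en",
        "as_epq": exact_phrase,  # assumed: phrase to match
        "as_eq": excluded,       # assumed: words to leave out
        "as_sitesearch": site,   # assumed: restrict to a site or domain
        "as_occt": occurs,       # assumed: where the terms occur, e.g. "title"
    }
    return "http://www.google.com.au/search?" + urlencode(params)


# The university-only search: the phrase in the page title, limited to .edu.
url = advanced_search_url(exact_phrase="Semantic Web", site=".edu", occurs="title")

# Read the address back to confirm which fields were actually set.
fields = parse_qs(urlparse(url).query)
```

Building the address this way makes it easier to repeat a search with one modification at a time, for example changing only the `site` value.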

Different search engines would return different results, and each search would need several modifications to narrow it down to the best quality information.