Oldest known version of this page was edited on 2004-07-27 18:37:32 by DarTar []
Relevance
Most web surfing is done with a specific goal in mind: one has a question and is looking for an answer to it. Of course, the question can be more or less precise, ranging from that of the traveler searching for the first train leaving today from London to Cambridge to that of the art lover looking for nice paintings. In any case, the web surfer is looking for relevant web sites: sites that will answer his needs as straightforwardly as possible. In our examples, the first web surfer will be satisfied with a site containing the schedule of trains departing from London, but he will be even more satisfied with a page displaying nothing but today’s schedule of trains going from London to Cambridge. Likewise, the art lover may be happy with a site displaying pictures of sculptures and paintings (a museum site, for instance), but since he is looking for paintings only, he may be more satisfied with a site displaying a selection of the most beautiful paintings.
Of course, quality also comes into the story: we may prefer a high-quality but less relevant site to a low-quality but highly relevant one. Our art lover, for instance, may prefer the museum site to the site offering a selection of paintings if the selection is badly done. Relevance and quality both have a role in determining the choices and actions of web surfers. Unsurprisingly, these two properties also play an essential role in the structuring of the Web. The role and importance of quality are analysed in the thread e-authorities. So let us see how the Web is structured according to the principle of relevance. As with quality, two actions come into play: first, webmasters make links from their site to relevant sites only; second, search engines implement algorithms that display results as relevant as possible to a specific query.
1. Relevance and hyperlinking.
The hyperlinking strategy based on relevance causes the structure of the Web to include niches.
A niche is a set of web pages which include numerous links to each other but few links to other pages.
A site, for instance, constitutes a particular niche of web pages, since it links mostly to its own pages (at least most sites do). But niches may also form around themes, institutions, cultural groups, etc.
Examples: pages on nuclear physics; official pages owned by the French government; hackers’ favorite sites; …
The consequence of the existence of niches, for the surfer, is that once one is within a niche, the probability of leaving it is lower than the probability of remaining within it.
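The claim above can be illustrated with a minimal sketch. It assumes a toy link graph and a surfer who follows links chosen at random; the page names and the function `stay_probability` are hypothetical, introduced here only for illustration. The share of links leaving niche pages that point back into the niche serves as a rough estimate of the probability of remaining in the niche at the next click.

```python
def stay_probability(links, niche):
    """Share of the outgoing links of niche pages that point back into the niche.

    links: dict mapping a page to the list of pages it links to (toy data).
    niche: set of pages assumed to form the niche.
    """
    internal = 0
    total = 0
    for page in niche:
        for target in links.get(page, []):
            total += 1
            if target in niche:
                internal += 1
    # With no outgoing links at all, there is nothing to follow.
    return internal / total if total else 0.0


# Hypothetical link graph: three physics pages that mostly link to each other.
links = {
    "physics/a": ["physics/b", "physics/c"],
    "physics/b": ["physics/a", "physics/c"],
    "physics/c": ["physics/a", "news/x"],
    "news/x": ["news/y"],
}
niche = {"physics/a", "physics/b", "physics/c"}

p_stay = stay_probability(links, niche)
p_exit = 1.0 - p_stay
print(f"stay: {p_stay:.2f}, exit: {p_exit:.2f}")
```

In this toy graph only one of the six links leaving niche pages points outside, so a random click is far more likely to stay in the niche than to leave it.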
As a surfer, you may have had the feeling that after some time your clicks make you go round in circles: you have already covered all the pages of the niche.
Normally, the surfer is willing to remain within a niche: once he has found the area of the Web relevant to his purpose, he is happy to remain in that area. Thus, niches serve the epistemic goals of web surfers. Yet the existence of niches has some drawbacks: a site which is relevant but lies outside its thematically associated niche is unlikely to be visited. Thus, the surfer may not have access to some relevant documents because of contingent factors in the structuring of the Web.
Research Questions: What is the epistemic value of niches? What are the factors that influence or allow the constitution of a niche? What is the traditional counterpart, if any, of a niche?
2. Relevance and search engines.
The algorithm that search engines implement to retrieve the documents relevant to a query from the enormous mass of available documents is based on word occurrence. A document is attributed a high relevance score if it contains a high rate of the words entered in the search box (keywords), in a form that puts these words forward (bold, hyperlink, title, …). A specific algorithm is also implemented for multiple-keyword searches; it gives a higher score to a document when the keywords are close to one another in the document.
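A minimal sketch of such a scoring scheme follows. It is not the algorithm of any actual search engine: the function names, the weight given to title words and the shape of the proximity bonus are all assumptions chosen for illustration. The title stands in for any form that puts words forward.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def min_span(tokens, keywords):
    """Length of the shortest run of tokens containing every keyword, or None."""
    last_seen = {}
    best = None
    for i, tok in enumerate(tokens):
        if tok in keywords:
            last_seen[tok] = i
            if len(last_seen) == len(keywords):
                span = i - min(last_seen.values()) + 1
                if best is None or span < best:
                    best = span
    return best


def relevance(title, body, query, emphasis_weight=3.0):
    """Keyword rate, with title hits weighted more (assumed weight),
    multiplied by a bonus when the keywords occur close together in the body."""
    keywords = set(tokenize(query))
    title_tokens = tokenize(title)
    body_tokens = tokenize(body)
    total = len(title_tokens) + len(body_tokens)
    if total == 0 or not keywords:
        return 0.0
    body_hits = sum(1 for w in body_tokens if w in keywords)
    title_hits = sum(1 for w in title_tokens if w in keywords)
    rate = (body_hits + emphasis_weight * title_hits) / total
    # Proximity bonus (assumed form): 2.0 when keywords are adjacent,
    # tending to 1.0 as they drift apart; 1.0 for single-keyword queries.
    span = min_span(body_tokens, keywords)
    bonus = 1.0 + len(keywords) / span if span and len(keywords) > 1 else 1.0
    return rate * bonus


query = "London Cambridge"
close_doc = relevance("London Cambridge trains",
                      "The train from London to Cambridge leaves at nine", query)
far_doc = relevance("Cities",
                    "London is a large city. Many people study far away "
                    "in the old town of Cambridge", query)
print(close_doc > far_doc)
```

The first document scores higher on both counts: the keywords appear in its title, and they sit close together in its body.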
Thus, while the processes of quality assessment rely on the dynamic structure of the Web (see e-authorities), the processes through which relevance is assessed involve only the algorithms of search engines, which take as input the content and the form of documents.
Relevance and quality assessment procedures: Each property is assessed by its own algorithm; the two algorithms respectively provide a mark for relevance and a mark for quality. The two marks are then multiplied to give the order in which URLs will be displayed in the SERP (search engine result page) answering a specific query.
Research questions: What are the epistemological consequences of this Relevance*Quality assessment procedure of search engines? How does this procedure differ from the traditional assessment procedures of scientific institutions?
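The multiplication step can be sketched as follows. The URLs and marks are made-up values, and the function `rank` is a hypothetical name; only the rule of ordering by the product of the two marks comes from the description above.

```python
def rank(results):
    """Order URLs by relevance mark * quality mark, highest product first.

    results: list of (url, relevance_mark, quality_mark) tuples.
    """
    ordered = sorted(results, key=lambda r: r[1] * r[2], reverse=True)
    return [url for url, _relevance, _quality in ordered]


# Hypothetical marks for the query of the art lover looking for paintings.
candidates = [
    ("paintings.example/selection", 0.9, 0.4),  # highly relevant, low quality
    ("museum.example/collection", 0.6, 0.9),    # less relevant, high quality
    ("trains.example/today", 0.1, 0.8),         # barely relevant
]

serp = rank(candidates)
for position, url in enumerate(serp, start=1):
    print(position, url)
```

With these marks the museum site is listed first: its high quality outweighs its lower relevance, which matches the preference described earlier for a high-quality but less relevant site.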
Current research project:
The Transdisciplinary Semantics of Key Words
(Christophe Heintz)
I will argue that a major mode of information retrieval of the information age, namely the one provided by search engines, strongly facilitates interdisciplinary research. If one uses a term for a keyword search, one gets a set of documents that dictate how the term is to be used, and thus its meaning. Yet a particularity of keyword search, as opposed to browsing, is that it goes across disciplines. Thus search engines break disciplinary boundaries and favor keyword-clustered research.
Search engines provide a new way to extract relevant information through the use of keywords. Whether in libraries or on the Internet, researchers can now retrieve all the information that is clustered around a word or a set of words. What is specifically new with this mode of information retrieval is that it is independent of disciplines: the search results will span disciplines, including any document that contains the keywords of the query. This ignorance of disciplines by search engines has often been seen as a failure to provide relevant results. I will point out the positive aspects of this transdisciplinary search and show that the consequent de-contextualisation of keywords provides a rich soil for interdisciplinary scientific research.
First, a scientist researching a specific topic with a keyword search will be confronted with work from other disciplines that uses the same keywords. Historical and semantic reasons make it highly probable that this work is relevant to the scientist’s research, thus providing motives for interdisciplinary research.
Second, knowledge of the meaning of keywords is being emancipated from disciplinary boundaries: while traditional search by browsing library shelves leads scientists to be familiar only with discipline-biased uses of a word, keyword search allows scientists to know how the keyword is used across disciplines. The predicted consequence is an ‘attunement’ of scientists’ understanding of the meanings of keywords and a lowering of disciplinary semantic incommensurability. Terms used as keywords will develop a transdisciplinary meaning.