MS Computer Science Virtual University of Pakistan

CS726 – Information Retrieval Techniques Viva Preparation

Question 01 What is Information Retrieval (IR)?
Answer: IR is the technique of storing, retrieving, and often disseminating recorded data, especially through the use of a computerized system.


Question 02 What is precision in Information Retrieval (IR)?
Answer: Precision (P) is the fraction of retrieved documents that are relevant, that is, the percentage of retrieved documents that are in fact relevant to the query.

Precision = Number of relevant documents retrieved / Total number of documents retrieved


Question 03 What is recall in Information Retrieval (IR)?
Answer: Recall (R) is the fraction of relevant documents that are retrieved. This is the percentage of documents that are relevant to the query and were in fact retrieved.

Recall = Number of relevant documents retrieved / Total number of relevant documents
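
The precision and recall formulas can be checked with a small Python sketch. The function name precision_recall and the document ids below are hypothetical, chosen only for illustration; both measures are taken as 0 when their denominator is empty, which is an assumption rather than part of the definitions.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    # Assumption: report 0.0 when a denominator would be zero.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 4 documents retrieved, 5 relevant overall, 3 in common.
p, r = precision_recall(
    ["d1", "d2", "d3", "d4"],
    ["d1", "d2", "d3", "d7", "d9"],
)
print(p, r)  # 0.75 0.6
```

Here 3 of the 4 retrieved documents are relevant (precision 0.75), while only 3 of the 5 relevant documents were found (recall 0.6).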



Question 04 What are the models used in Information Retrieval (IR)?
Answer: The three classic models are the Boolean model, the vector space model, and the probabilistic model.


Question 05 What is the Boolean model used in Information Retrieval (IR)?
Answer: The Boolean retrieval model is a model for information retrieval in which any query can be posed as a Boolean expression of terms, that is, terms combined with the operators AND, OR, and NOT. The model views each document as just a set of words.
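
A minimal sketch of Boolean retrieval, assuming a toy collection of three made-up documents and a simple inverted index (term to set of document ids). The names docs, index, and postings are hypothetical. Because each document is treated as a set of words, AND, OR, and NOT map directly onto set intersection, union, and difference.

```python
from collections import defaultdict

# Hypothetical toy collection; each document is viewed as a set of words.
docs = {
    1: "information retrieval with boolean queries",
    2: "vector space model for retrieval",
    3: "boolean algebra and logic",
}

# Inverted index: term -> set of ids of the documents that contain it.
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in set(text.split()):
        index[term].add(doc_id)


def postings(term):
    """Return the set of document ids containing term (empty if unseen)."""
    return index.get(term, set())


both = postings("boolean") & postings("retrieval")     # boolean AND retrieval
either = postings("boolean") | postings("retrieval")   # boolean OR retrieval
without = postings("retrieval") - postings("boolean")  # retrieval AND NOT boolean

print(both, either, without)  # {1} {1, 2, 3} {2}
```

Note that the result is an unranked set: a document either satisfies the Boolean expression or it does not.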


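Question 06 How do we get relevant documents in Information Retrieval (IR)?
Answer: The system matches the query against the document collection and returns the documents it judges relevant; precision and recall are then used to check how many of the returned documents are actually relevant.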


Question 07 What are the ingredients of Information Retrieval (IR)?
Answer:

• Document representation

• Query formulation

• Query processing


Question 08 How do we measure the relevance of retrieved documents in IR?
Answer: The two most frequent and basic measures of information retrieval effectiveness are precision and recall. Together they measure how well a system retrieves relevant documents.


Question 09 What is TF-IDF?
Answer: In information retrieval, tf-idf, short for term frequency–inverse document frequency, is a numerical statistic intended to reflect how important a word is to a document in a collection or corpus. It is often used as a weighting factor in information retrieval, text mining, and user modeling.
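
As an illustration, the sketch below computes tf-idf for a tiny made-up corpus. It assumes one common variant (raw term count for tf, and log(N / df) for idf); other weighting schemes exist, and the function name tf_idf and the sample documents are hypothetical.

```python
import math
from collections import Counter

# Hypothetical corpus: each document is a list of tokens.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "cats and dogs".split(),
]


def tf_idf(term, doc, corpus):
    """tf-idf of term in doc, assuming tf = raw count and idf = log(N / df)."""
    tf = Counter(doc)[term]                   # occurrences of term in this document
    df = sum(1 for d in corpus if term in d)  # number of documents containing term
    if df == 0:
        return 0.0                            # term never appears in the corpus
    return tf * math.log(len(corpus) / df)


w_the = tf_idf("the", docs[0], docs)  # frequent in the document, but common in the corpus
w_cat = tf_idf("cat", docs[0], docs)  # rarer in the corpus, so weighted higher
print(w_cat > w_the)  # True
```

Although "the" occurs twice in the first document and "cat" only once, "cat" receives the higher weight because it appears in fewer documents of the collection.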


Question 10 What is the Probabilistic Model?
Answer: The probabilistic model computes the similarity coefficient between a query and a document as the probability that the document is relevant to the query.


Question 11 What is a Ranking Function?
Answer: A function that assigns a score to each document with regard to a given query, so that documents can be ordered by that score.
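
A minimal sketch of a ranking function. The scoring rule used here (counting occurrences of query terms in the document) is a deliberately simple assumption for illustration, not a standard IR formula, and the names score, rank, and the sample documents are hypothetical.

```python
def score(query, doc):
    """Hypothetical scoring rule: total occurrences of query terms in doc."""
    terms = doc.lower().split()
    return sum(terms.count(t) for t in query.lower().split())


def rank(query, docs):
    """Return (doc_id, score) pairs sorted from highest to lowest score."""
    scored = [(doc_id, score(query, text)) for doc_id, text in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


docs = {
    "d1": "retrieval models for information retrieval",
    "d2": "cooking recipes",
    "d3": "information theory",
}
ranking = rank("information retrieval", docs)
print(ranking)  # [('d1', 3), ('d3', 1), ('d2', 0)]
```

In practice the score would come from a retrieval model, for example tf-idf weights or a relevance probability, but the shape is the same: score every document against the query, then sort.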


Question 12 What is a Query Vector in IR?
Answer: A query represented as a vector in the term vector space is called a query vector. The query is typically treated as a short document and is likewise tf-idf weighted.
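
The sketch below shows one way to build a query vector, assuming the vocabulary and idf values come from the document collection and that tf is a raw count with idf = log(N / df). The names vocab, idf, and to_vector and the sample documents are hypothetical; query terms outside the vocabulary are simply ignored here.

```python
import math
from collections import Counter

# Hypothetical collection: each document is a list of tokens.
docs = [
    "information retrieval models".split(),
    "vector space retrieval".split(),
    "probabilistic models".split(),
]

# One vector dimension per distinct term in the collection.
vocab = sorted({t for d in docs for t in d})

# idf is computed from the documents only (assumed variant: log(N / df)).
idf = {t: math.log(len(docs) / sum(1 for d in docs if t in d)) for t in vocab}


def to_vector(tokens):
    """Map a token list (document or query) to a tf-idf vector over vocab."""
    counts = Counter(tokens)
    return [counts[t] * idf[t] for t in vocab]


# The query goes through the same function as a document would.
query_vector = to_vector("vector retrieval".split())
print(dict(zip(vocab, query_vector)))
```

Only the dimensions for "retrieval" and "vector" are non-zero, and the query lives in the same space as the document vectors, so the two can be compared directly.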


Question 13 What is a Similarity Measure?
Answer: A similarity measure is a function that computes the degree of similarity between two vectors.
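
Cosine similarity is one widely used similarity measure in the vector space model; the sketch below assumes it as the example, with plain Python lists standing in for the vectors. Returning 0 for a zero-length vector is an assumption made to avoid division by zero.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: a zero vector is similar to nothing
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1, 2], [2, 4])  # parallel vectors, close to 1.0
orthogonal = cosine_similarity([1, 0], [0, 1])      # no shared terms, exactly 0.0
print(same_direction, orthogonal)
```

Because the measure depends only on direction, a long document and a short query that use terms in the same proportions still score close to 1.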