
Supriya Patil


Publications by Supriya Patil (bibliography)


» 2002 «


Choudhari, Prashant, Davulcu, Hasan, Joglekar, Abhishek, More, Akshay, Mukherjee, Saikat, Patil, Supriya and Ramakrishnan, I. V. (2002): YellowPager: a tool for ontology-based mining of service directories from web sources. In: Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2002. p. 458.

The web has established itself as the dominant medium for electronic commerce. Realizing that its global reach provides significant market and business opportunities, service providers, both large and small, are advertising their services on the web. A number of them operate their own web sites promoting their services at length, while others are merely listed in a referral site. Aggregating all of the providers into a queryable service directory makes it easy for customers to locate the one best suited to their needs.

YellowPager is a tool for creating service directories by mining web sources. Service directories created by YellowPager have several merits compared to those generated by existing practices, which typically require participation by service providers (e.g. Verizon's SuperYellowPages.com). First, the information content will be rich. Second, since the process is automated and repeatable, the content can always be kept current. Finally, the same process can be readily adapted to different domains.

YellowPager builds service directories by mining the web through a combination of keyword-based search engines, web agents, text classifiers and novel extraction algorithms. The extraction is driven by a services ontology consisting of a taxonomy of service concepts and their associated attributes (such as names and addresses) and type descriptions for the attributes. In addition, the ontology associates an extractor function with each attribute. Applying the function to a web page identifies all the occurrences of the attribute in that page.

YellowPager's mining algorithm consists of a training step followed by classification and extraction steps. In the training step, a classifier is trained to identify web pages relevant to the service of interest. The classification step searches for the particular service of interest using a keyword-based web search engine and retrieves all the matching web pages. From these pages, the relevant ones are identified using the classifier. The final step is the extraction of attribute values associated with the service from these pages. Each web page is parsed into a DOM tree and the extractor functions are applied. All of the attributes corresponding to a service provider are then aggregated. This can pose difficulties, especially when a page lists multiple service providers. Using a novel concept of scoring and conflict resolution to prevent erroneous associations of attributes with service provider entities in the page, the algorithm aggregates all the attribute occurrences correctly. The extractor function may not be complete, in the sense that it cannot always identify all the attributes in a page. By exploiting the regularity of the sequence in which attributes occur in referral pages, the mining algorithm automatically learns generalized patterns to locate attributes that the extractor function misses. The distinguishing aspects of YellowPager's extraction algorithm are: (i) it is unsupervised, and (ii) the attribute values in the pages are extracted independently of any page-specific relationships that may exist among the markup tags.

YellowPager has been used by a large pet food producer to build a directory of veterinarian service providers in the United States. The resulting database was found to be much larger and richer than those found in Vetquest, Vetworld, and the Super Yellow pages. YellowPager is implemented in Java and is interfaced to Rainbow, a library utility in C that is used for classification. The tool will demonstrate the creation of a service directory for any service domain by mining web sources.
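
The idea of an ontology that attaches an extractor function to each attribute can be illustrated with a minimal sketch. This is not YellowPager's code (the tool itself is written in Java and works on DOM trees); it is a Python approximation over plain text, and every name in it (ServiceConcept, regex_extractor, extract_attributes, the regular expressions, and the sample page) is a hypothetical chosen for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List

# An extractor maps page text to every occurrence of one attribute.
Extractor = Callable[[str], List[str]]


def regex_extractor(pattern: str) -> Extractor:
    """Build an extractor from a regular expression (illustrative assumption)."""
    compiled = re.compile(pattern)
    return lambda text: [m.group(0) for m in compiled.finditer(text)]


@dataclass
class ServiceConcept:
    """One ontology node: a service concept and its attribute extractors."""
    name: str
    extractors: Dict[str, Extractor]


# Hypothetical ontology entry; the patterns are deliberately simplistic.
VETERINARIAN = ServiceConcept(
    name="veterinarian",
    extractors={
        "phone": regex_extractor(r"\(\d{3}\) \d{3}-\d{4}"),
        "zip_code": regex_extractor(r"\b\d{5}\b"),
    },
)


def extract_attributes(concept: ServiceConcept, page_text: str) -> Dict[str, List[str]]:
    """Apply each attribute's extractor to the page and collect all matches."""
    return {attr: fn(page_text) for attr, fn in concept.extractors.items()}


# Made-up sample page text, not taken from the paper.
page = "Happy Paws Clinic, 12 Main St, Stony Brook, NY 11790. Call (631) 555-0100."
result = extract_attributes(VETERINARIAN, page)
print(result)
```

The sketch covers only the step of applying extractor functions to a single page. Page classification, the scoring and conflict resolution used to associate attributes with the right provider, and the learning of generalized patterns described in the abstract are outside its scope.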

Copyrights may apply

 

Changes to this page (author)

21 Feb 2010: Enabled abstracts to be shown on Supriya Patil's author page.
24 Jun 2007: Author was added to the bibliography

Publication statistics

Publication period: 2002-2002
Publication count: 1
Number of co-authors: 6



Productive colleagues

Supriya Patil's 3 most productive colleagues in number of publications:

I. V. Ramakrishnan: 20
Hasan Davulcu: 10
Saikat Mukherjee: 6


Collaboration count

Number of publications with 3 favourite co-authors:

Saikat Mukherjee: 1
I. V. Ramakrishnan: 1
Akshay More: 1

 

 

Page information

Page maintainer: The Editorial Team
URL: http://www.interaction-design.org/references/authors/supriya_patil.html