
Wednesday, August 8, 2007

New search engine for tables is developed

U.S. computer scientists have created a search engine that can identify and extract tables from PDF documents, as well as index and rank the results.


The search engine, called TableSeer, was developed by Pennsylvania State University researchers. Its ranking algorithm also identifies tables found in frequently cited documents and weighs that factor in the search results, Assistant Professor Prasenjit Mitra said.
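The article does not describe how TableSeer combines these signals, so the following is only a minimal sketch of the general idea of weighing citation counts alongside query relevance. The names `TableHit`, `combined_score`, `rank`, and the `citation_weight` parameter are hypothetical, and the logarithmic damping is an assumption chosen for illustration, not the researchers' formula.

```python
import math
from dataclasses import dataclass


@dataclass
class TableHit:
    table_id: str
    text_score: float  # hypothetical relevance of the table to the query
    citations: int     # citation count of the document containing the table


def combined_score(hit: TableHit, citation_weight: float = 0.3) -> float:
    # Assumption: log1p dampens very large citation counts so that a
    # heavily cited document cannot completely swamp query relevance.
    return hit.text_score + citation_weight * math.log1p(hit.citations)


def rank(hits, citation_weight: float = 0.3):
    # Sort hits from highest to lowest combined score.
    return sorted(
        hits,
        key=lambda h: combined_score(h, citation_weight),
        reverse=True,
    )
```

With `citation_weight` set to 0 the ordering falls back to text relevance alone, which makes it easy to see how much the citation factor changes a result list.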


Mitra said TableSeer is believed to be the first search engine designed for tables.


Although some software can identify and extract tables from text, existing software cannot search for tables across documents, Mitra said. TableSeer automates that process, capturing data not only within the table, but also in tables' titles and footnotes. In addition, it enables column-name-based searches so a user can search for a particular column in a table.
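One way to picture a column-name-based search is an index that records which part of a table each word came from. The sketch below is an illustration under that assumption, not TableSeer's implementation; the `Table` and `TableIndex` classes, the `field_name` parameter, and the sample tables are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Table:
    table_id: str
    title: str
    columns: list
    footnote: str = ""


def tokenize(text):
    # Lowercase each word and strip simple surrounding punctuation.
    words = (w.strip(".,;:()").lower() for w in text.split())
    return [w for w in words if w]


class TableIndex:
    def __init__(self):
        # Maps (field, token) to the set of table ids containing that token.
        self._postings = defaultdict(set)

    def add(self, table):
        for tok in tokenize(table.title):
            self._postings[("title", tok)].add(table.table_id)
        for tok in tokenize(table.footnote):
            self._postings[("footnote", tok)].add(table.table_id)
        for col in table.columns:
            for tok in tokenize(col):
                self._postings[("column", tok)].add(table.table_id)

    def search(self, query, field_name=None):
        # Restrict to one field (e.g. "column") or search all of them.
        fields = [field_name] if field_name else ["title", "column", "footnote"]
        result = None
        for tok in tokenize(query):
            matches = set()
            for f in fields:
                matches |= self._postings.get((f, tok), set())
            # Every query word must match (AND semantics).
            result = matches if result is None else result & matches
        return sorted(result or [])


# Hypothetical sample data, for illustration only.
index = TableIndex()
index.add(Table("t1", "Boiling points of solvents",
                ["Solvent", "Boiling point (C)"], "Measured at 1 atm."))
index.add(Table("t2", "Solvent densities", ["Solvent", "Density"]))
```

Calling `index.search("boiling", field_name="column")` limits matching to column names, while omitting `field_name` also searches titles and footnotes.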


The development of TableSeer is part of an open-source project funded by the National Science Foundation.


TableSeer can be tested online at http://chemxseer.ist.psu.edu. The source code will be made available near the completion of the project, the researchers said.


Copyright 2007 by United Press International. All Rights Reserved.

