X Searching Engine

Zhishuang Zhang
Faculty Advisors: Val Tannen, Zachary Ives

    XML (eXtensible Markup Language) has become an important data storage and interchange format because it can describe structured data, is platform independent, and is human readable. DTD (Document Type Definition) and XML Schema (XSD) [1] are two ways to specify the structure of an XML document, so searching through XSDs and DTDs is an efficient and effective way to find XML data. This project aims to build a search engine for XSDs in order to locate XML data. First, XSDs are parsed and indexed by keyword. The search engine then answers queries against this index database. An XSD is itself an XML document and is therefore structured data. In addition to searching by keyword, searching by structure, that is, by the relations between keywords, makes the search results more accurate and helpful.
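    The parse, index, and search steps described above can be illustrated with a minimal sketch. It is not the project's implementation: the class name XsdIndex, its methods, the sample schema, and the choice to model "structure" as pairs of (enclosing element name, nested element name) are all assumptions made for illustration, and the index is held in memory rather than in a database.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace prefix that ElementTree attaches to XML Schema tags.
XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical sample schema, used only for this illustration.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="author" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


class XsdIndex:
    """In-memory stand-in for the index database (an assumption)."""

    def __init__(self):
        # keyword (element name) -> ids of schemas declaring it
        self.keywords = defaultdict(set)
        # (enclosing element name, nested element name) -> schema ids
        self.edges = defaultdict(set)

    def add(self, schema_id, xsd_text):
        """Parse one XSD and record its keywords and structure."""
        root = ET.fromstring(xsd_text)
        self._walk(root, None, schema_id)

    def _walk(self, node, enclosing, schema_id):
        name = enclosing
        if node.tag == XS + "element" and node.get("name"):
            name = node.get("name").lower()
            self.keywords[name].add(schema_id)
            if enclosing is not None:
                # Record the relation between the two keywords.
                self.edges[(enclosing, name)].add(schema_id)
        for child in node:
            self._walk(child, name, schema_id)

    def search_keyword(self, word):
        """Schemas that declare an element with this name."""
        return set(self.keywords.get(word.lower(), set()))

    def search_structure(self, parent, child):
        """Schemas in which `child` is declared inside `parent`."""
        return set(self.edges.get((parent.lower(), child.lower()), set()))


index = XsdIndex()
index.add("books.xsd", SAMPLE_XSD)
print(index.search_keyword("title"))
print(index.search_structure("book", "title"))
print(index.search_structure("title", "book"))
```

    In this sketch a keyword query for "title" matches any schema that declares a title element, while the structural query additionally requires that title be declared inside book, which is the kind of restriction that lets a structural search filter out schemas that merely share a keyword.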

    This project does not include a web crawler. It focuses on how to retain enough information about the structure of each XSD in the index database and how to search against that database, rather than on how to discover XSDs on the internet.