Project Summary: Overview of an Internet Search Engine
McClure has worked with one of the top five Internet Search Portals since its beginnings. The portal’s internal development staff maintains and develops the site’s front end; we developed the back end that processes all of the site’s requests. The scale of the site’s operation is as follows:
- 800,000 - 1,000,000 unique visitors per day and 30,000,000+ unique visitors per month.
- 50,000,000 hits per day.
- 4,000,000 database lookups per day.
- 2,000,000 database inserts per day.
- 300,000 searches per day.
- 200,000+ click-thrus per day.
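To put these daily totals in perspective, the short sketch below converts them to average per-second rates. It assumes traffic is spread evenly across the day, which real traffic is not, so actual peak rates would be higher. The dictionary keys are illustrative labels, not names from the system itself.

```python
# Average per-second rates implied by the daily figures above.
# Assumption: load is spread evenly over 24 hours (peaks will be higher).
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Illustrative labels for the daily volumes listed above.
daily_volumes = {
    "hits": 50_000_000,
    "database lookups": 4_000_000,
    "database inserts": 2_000_000,
    "searches": 300_000,
    "click-thrus": 200_000,
}

def average_per_second(per_day: int) -> float:
    """Return the average number of events per second for a daily total."""
    return per_day / SECONDS_PER_DAY

if __name__ == "__main__":
    for name, per_day in daily_volumes.items():
        print(f"{name}: {average_per_second(per_day):.1f} per second on average")
```

On these assumptions, 50,000,000 hits per day works out to roughly 579 hits per second on average, and 4,000,000 database lookups per day to roughly 46 per second.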
What did we do?
Based on the terms being searched, we retrieve XML from external search result providers. We also retrieve XML newsfeeds from external news sources in a way that places no load on the web server. For every search and click-thru, we track the domain name through which the user entered the system. We provide associated searches for search terms and a function that spell-checks the input search terms. In addition, we provide a large number of administrative functions through a secure, browser-based section of the site.
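Two of these steps, reading a provider's XML response and recording the domain a user arrived from, can be illustrated with a minimal sketch. It is not the portal's actual code: the XML layout, the element names (results, result, title, url), and the function names are all hypothetical, since each real provider defines its own schema.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical provider response; real providers define their own schemas.
SAMPLE_XML = """<results term="used cars">
  <result><title>Example Autos</title><url>http://autos.example.com/</url></result>
  <result><title>Car Listings</title><url>http://cars.example.org/list</url></result>
</results>"""

def parse_results(xml_text):
    """Turn a provider's XML response into a list of title/url dictionaries."""
    root = ET.fromstring(xml_text)
    return [
        {
            "title": item.findtext("title", default=""),
            "url": item.findtext("url", default=""),
        }
        for item in root.iter("result")
    ]

def referring_domain(referrer_url):
    """Return the lowercase host name of a referrer URL, or None if absent."""
    host = urlparse(referrer_url).hostname
    return host.lower() if host else None

if __name__ == "__main__":
    for hit in parse_results(SAMPLE_XML):
        print(hit["title"], hit["url"])
    print(referring_domain("http://Partner.Example.com/search?q=used+cars"))
```

In a sketch like this, the referring domain would be stored alongside each search or click-thru record so that traffic can later be grouped by the domain it came through.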
About the Application
The Internet Search Portal’s project requirements include:
- Scaling to a large number of users.
- Integrating with external XML search result providers.
- Integrating with external XML newsfeeds, resulting in timely updates that do not place a load on the system.
- Saving search terms for future analysis.
- Saving click-thrus for future analysis.
In addition to these requirements, security has been a top priority for this project. Because of the large number of viruses, worms, and hackers on the Internet, security is a very important feature of any online system. Since we became involved in the security of the portal’s systems, the portal has had no security-related problems. Since 2001, the Internet has seen Code Red and Nimda, worms that attack open, default vulnerabilities in Microsoft IIS, and this Internet Search Portal’s systems are constantly under these kinds of worm and hacker attacks. With our help, none of the attacks has been successful.