How to Search the Web
Understanding what the Internet is and how it works can enhance your online experience.
- The Internet itself is a worldwide network of interconnected computers that allows users to access and transfer information remotely. The information viewed on the Internet is actually not on the Internet at all, but rather on other computers, and viewed via the Internet. A useful analogy would be to think of the Internet as a phone network: Just as a telephone provides people with the ability to contact distant locations and exchange information verbally, the Internet allows users to contact faraway locations and exchange information electronically.
- The physical composition of the Internet is the system of wires, fiber-optic cables, routers and circuits that make this connection possible. Many people view the Internet as an abstract body of information floating around in “cyberspace.” This is not the case at all. The Internet is a worldwide network of computer networks, and the information accessed resides on the connected computers themselves.
- The most popular service accessed through the Internet is the World Wide Web (WWW). The Web and the Internet are often treated as synonymous, but they are two different things. Whereas the Internet is the means of accessing information, the Web is the collection of pages through which that information is displayed. Web pages are collections of files and documents stored on computers around the world, formatted in a markup language called HTML (hypertext markup language). The Web is navigated using a technology called hypertext: documents containing embedded pathways that, when clicked, direct users to other documents. These pathways, called hyperlinks or links for short, can take the form of words, phrases, icons or graphics. They create interconnectedness between files and documents, which is what gives the World Wide Web its image as a “web.”
- A Web browser is a computer program that allows users to access the Internet and view information on the Web. It does this by interpreting HTML files and displaying them as “pages” on a user’s computer. Browsers are designed to make navigating the Web’s pages easy by taking advantage of hypertext. Popular browsers include Internet Explorer, Firefox, Safari and Netscape. Browsers with the proper “plug-ins” (software add-ons that permit users to open specific file types) allow users to view documents, watch videos, listen to audio files, chat with other users, play games and watch animations.
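To make the idea of hypertext concrete, here is a minimal Python sketch of how a program might pull the links out of an HTML page, much as a browser does before making them clickable. The page content, the `LinkCollector` class and the `extract_links` function are invented for this illustration; real browsers do a great deal more.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the target (href) and visible text of every <a> link."""

    def __init__(self):
        super().__init__()
        self.links = []     # list of (href, text) pairs found so far
        self._href = None   # href of the link currently being read
        self._text = []     # text fragments inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears between <a ...> and </a>.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# Hypothetical page content, made up for this example.
SAMPLE_PAGE = """
<html><body>
  <p>Read about <a href="https://example.org/spiders">Web spiders</a>
  or visit the <a href="/directory">site directory</a>.</p>
</body></html>
"""


def extract_links(html):
    """Return every (href, link text) pair in the given HTML string."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


print(extract_links(SAMPLE_PAGE))
```

Each pair holds the address a link points to and the highlighted words a reader would click.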
Finding information on the Web is like being a police detective: Your information is only as good as your sources. Learn how to evaluate Web site credibility with the links below.
- If you're using information for a lighthearted e-mail, the source isn't that important. If you're conducting research for a professional report, you had better be sure your information is legitimate. Checking sources may sound arduous, but there are a few crucial questions that can help:
Who is the author? What are the author's credentials? Look at the domain (the last part of the Web address, for example: .com, .org or .edu). This generally tells you what kind of site you're using: .edu sites, and academic addresses such as .ac.uk, belong to educational institutions; .com and .biz sites are typically commercial; and .gov sites are U.S. government sites. Other Web address endings can indicate the country of origin of the site. Some domains are sponsored and therefore heavily regulated (.jobs, .museum and .travel are a few examples), while others are not sponsored. Learn more about top-level domains (TLDs).
Who is making the information available? How is the site being funded? Are they trying to sell you something? Does the site appear to have any social or political biases? The “About Us” section of a site is a good place to start but it shouldn’t be the end of your research. One way to look for additional company or author information is to try the name in a search engine. For an author, try searching the name along with key subject words to check for any additional work or credentials.
When was the information first published? Has it been updated recently? Many Web pages indicate when they were created and last revised. Check the bottom of the page for a copyright date or look for a date near the byline of an article. Without a date, the timeliness of the information is difficult to evaluate.
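If you check many addresses, the domain ending can be read off automatically. The following is a rough Python sketch using the standard library's `urllib.parse`. The `DOMAIN_HINTS` table and both function names are assumptions made for this example, and the descriptions simply paraphrase this guide rather than any official registry.

```python
from urllib.parse import urlparse

# Rough descriptions paraphrased from this guide; not an official registry.
DOMAIN_HINTS = {
    "edu": "educational institution",
    "gov": "U.S. government",
    "com": "commercial",
    "biz": "commercial",
}


def domain_ending(url):
    """Return the last label of the host name, e.g. 'edu'.

    The URL must include a scheme such as https://, otherwise
    urlparse cannot identify the host and '' is returned.
    """
    host = urlparse(url).hostname or ""
    return host.rsplit(".", 1)[-1].lower() if "." in host else ""


def describe(url):
    """Pair the domain ending with a hint about the kind of site."""
    ending = domain_ending(url)
    hint = DOMAIN_HINTS.get(ending, "unknown; possibly a country code or another TLD")
    return ending, hint


print(describe("https://www.example.edu/about"))
print(describe("https://example.co.uk/"))
```

A domain ending is only a first clue. It says nothing about who wrote a particular page, so the other questions in this section still apply.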
Search engines are online software programs designed to help users locate relevant Web sites, and are some of the most highly trafficked sites out there. Understanding how search engines work can help you get the results you want and sort through the irrelevant, misleading results you’ll undoubtedly encounter.
- When you enter a keyword (that is, a significant term or phrase related to the Web site you hope to find) into a search engine, you're not searching the entirety of the Web. Rather, you're searching the list of Web sites that the search engine has indexed (which can be in the billions). If the search engine hasn't added a Web site to its index, it cannot include it in the search results.
- Search engines sift through text on Web pages using computer programs called spiders. "Spiders" crawl on the "Web," get it?
- Spiders are very fast but they can travel only through the hyperlinks that connect Web sites. If a page isn't linked to any other pages, spiders can't find it. The part of the World Wide Web that is not linked is called the "invisible Web" or the "deep Web." It may contain information highly relevant to your search. To find resources on the invisible Web, see "The Invisible Web" and "Web Directories" sections of this guide.
- Search engines don't know why you want information; they simply find information according to the words you've entered. These results are not recommendations, because search engines don't judge the quality of each site's content. They use mathematical equations (or algorithms) to rank results, and the formula may have little to do with a site's legitimacy or value to you.
- Companies have gotten wise to the way that search engines work. This has created an environment where Web pages are created and customized with the goal of appearing near the top of a search engine’s results list regardless of their credibility or usefulness. This practice is called "search engine optimization," and it's one reason that not all of your search results will be relevant or trustworthy.
- The "Help," "About" or "Preferences" sections of a search engine site often have helpful tips for using that particular search engine to your advantage. For example, if you’re looking for a definition, Google tells you to add “define:” to the beginning of your keyword. Thus, a search for “define: search engine” in Google will give you a list of definitions for “search engine” from around the Web. Similar tricks are innumerable, and all search engines have them. Google has a complete list of “search operators.”
- There is more than one kind of search engine. General search engines, also called “horizontal” search engines, search for all types of information. “Vertical” search engines search only within certain topics. “Meta” search engines search other search engines. Using the kind that does exactly what you need can improve your search results.
- If you have trouble finding the information you want, ask yourself: Is my keyword too general? Too specific? Are there useful synonyms? Could related topics be more effective?
- Do you get so many results that you can’t find the sites that answer your question? Here’s how to reduce the number of results:
Use more than one word in your search. For example, type "chicken salad sandwich" instead of just "sandwich."
Try to be more specific in your terms. If you want a panini, type "panini" instead of "sandwich."
Use "AND" between two words instead of simply typing them one after the other (for example, "soup AND sandwich"), and your results will include only sites that contain both terms.
Use "NOT" to exclude certain terms from your results: "sandwich NOT bologna."
- Want more search results? Try using fewer words when you search. Typing "Reuben sandwich" instead of "classic Reuben sandwich" will yield more results. By using the term "OR" and trying a few related words at once ("sandwich OR gyro OR panini"), you broaden your search to include sites that contain any one of the terms.
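The ideas above (spiders following links, an index of words, and AND, OR and NOT searches) can be sketched in a few lines of Python. Everything here is a toy under stated assumptions: the five-page `PAGES` "Web" and the function names are invented for the example, and real search engines use far more elaborate crawling and ranking.

```python
from collections import deque

# Hypothetical mini-Web: page name -> (page text, pages it links to).
PAGES = {
    "home":    ("soup and sandwich menu", ["reuben", "panini"]),
    "reuben":  ("classic reuben sandwich", ["home"]),
    "panini":  ("grilled panini sandwich with soup", ["bologna"]),
    "bologna": ("bologna sandwich", []),
    "orphan":  ("secret gyro recipe", []),  # no other page links here
}


def crawl(start):
    """Follow hyperlinks breadth-first from one page, as a spider would."""
    seen, queue = {start}, deque([start])
    while queue:
        page = queue.popleft()
        for target in PAGES[page][1]:
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def build_index(pages):
    """Map each word to the set of crawled pages that contain it."""
    index = {}
    for page in pages:
        for word in PAGES[page][0].split():
            index.setdefault(word, set()).add(page)
    return index


# Only pages the spider reached end up in the index.
INDEX = build_index(crawl("home"))


def having(word):
    """Return the indexed pages that contain the given word."""
    return INDEX.get(word.lower(), set())


print(sorted(having("soup") & having("sandwich")))     # soup AND sandwich
print(sorted(having("sandwich") - having("bologna")))  # sandwich NOT bologna
print(sorted(having("sandwich") | having("gyro") | having("panini")))  # OR
print(sorted(having("gyro")))  # empty: the unlinked page was never indexed
```

AND narrows the results to pages containing both words, NOT removes pages, and OR widens the results. The "orphan" page never appears, because no link leads the spider to it.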
Aside from Google, there are hundreds of alternative search engines, each with its own set of merits. Get tips on choosing a search engine and find some worth considering when your go-to engine fails you.
- A majority of search engines have features that allow users to search specifically for images, videos, news, blogs and much more. Links to these categories are generally found above the search bar, and need only to be clicked to activate the specialty search features.
- Each search engine’s index of sites is unique; each has a different formula for spidering through them. This means there can be significant variation in the results that different engines generate for the same search terms. For example, visit the site Zuula. Zuula allows you to search across multiple platforms by putting them all in one location. After entering your search term, you'll be given a typical-looking results page. What makes it unique is that by clicking the tabs listed across the top of the page, you'll be given the results for your search term on each of the search engines listed. Google, Yahoo, Live, Gigablast, Exalead, Alexa, Entireweb, Mahalo and Mojeek are all in one place.
- These sites represent a selection of the best search engines, and are more than enough for most Web users. But as this continually updated list on Wikipedia demonstrates, search engines come in all varieties, and their number is vast.
Many of the Web’s most extensive sites work like libraries. These database sites keep their information tucked away in the stacks, and if you want something, you have to ask for it. Although search engines may visit these libraries, they rarely make it past the lobby, and they refuse to ask the librarian for help. This causes them (and you) to miss out on the massive amount of information stocked in the back rooms. This hidden material is referred to as part of the "deep" or "invisible" Web.
- A brief explanation of the invisible Web:
Information in databases can only be accessed by direct searches (searches from within the site itself), which prevents search engines from finding it.
White pages, electronic books, online journals, image files, newspaper archives, dictionary definitions and patents are examples of the kinds of content found in databases. Frequently updated or changing information, like ticket prices and job listings, is also part of the deep Web.
Although its exact size is unknown, the deep Web is believed to be 400–550 times larger than the surface Web (the area accessible to search engines).
- One trick for finding databases with standard search engines is as simple as adding the term “database” to your search query. Instead of “Buddhism,” try “Buddhism database.” By doing this, you are using the search engine to find a gateway to more information, rather than the information itself.
- Online databases occasionally require users to pay for access to their content. Schools and libraries subscribe to various database services, so consult your librarian for a list of resources that they may make available to you. Otherwise, consider your research goals to determine whether paying is worthwhile.
Web directories are lists of hand-selected sites compiled by Web users and organized into categorical tree structures to help users locate sites with content that's relevant to their research. Web directories are very useful when you want to find sites related to a specific category.
- Web directories are browsable collections of links, assembled by humans and classified by subject.
- Web directories generally fall into two categories: scholarly (assembled, edited and annotated by experts and professionals) or commercial (reliant on site traffic and advertising to operate).
- To find subject directories, simply add the term “directory” to your search query. This will lead you to a page of preselected sources on the topic you're searching.
- Each directory has a different focus, and you’ll need one that suits your individual needs. For instance, if you want to find sites on video game cheat codes, a commercial directory like the Open Directory Project (also known as dmoz) would be a good start. If you want access to sociology journals, try a scholarly directory such as ipl2.
- Because directories only contain the title, URL and sometimes a brief description of the sites listed and not a site’s full text, your searches within a directory will be most successful if you're more general than specific. Keep this in mind when forming your search terms: It may be best to begin with a broad topic to reduce the chances of eliminating valuable sites. For example, if you're looking for information on Picasso, start by searching for “modern art,” then explore the sites listed.
- Browse a directory only if it lacks a search function. If you’re browsing, you’ll need to guess which subject heading your topic would fall under at each layer. If at any point you follow an incorrect topic, you’ll miss the link that you're looking for.
- To ensure quality, ask yourself some questions: How are the links selected, and by what criteria are they judged? Are the links accompanied by descriptions? Are these descriptions written by directory staff or by Web site creators themselves? For answers to these questions, visit the "Web Site Credibility" section of this guide.
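To see why broad search terms work better in a directory, consider this small Python sketch of a categorical tree. The `DIRECTORY` contents, the example.org addresses and the function names are all hypothetical. The point is only that a directory search sees titles and descriptions, never the full text of the listed sites.

```python
# Hypothetical directory: categories nest; leaves are lists of entries.
DIRECTORY = {
    "Arts": {
        "Modern Art": [
            {"title": "Modern Art Timeline",
             "url": "https://example.org/modern-art",
             "description": "Overview of modern art movements and painters"},
        ],
        "Photography": [
            {"title": "Photo Basics",
             "url": "https://example.org/photo",
             "description": "Introductory photography tutorials"},
        ],
    },
    "Games": {
        "Video Games": [
            {"title": "Cheat Code Archive",
             "url": "https://example.org/cheats",
             "description": "Video game cheat codes by console"},
        ],
    },
}


def browse(path):
    """Walk down the tree one subject heading at a time."""
    node = DIRECTORY
    for heading in path:
        node = node[heading]  # a wrong guess raises KeyError
    return node


def entries(node):
    """Yield every entry under a category, at any depth."""
    if isinstance(node, list):
        yield from node
    else:
        for child in node.values():
            yield from entries(child)


def search(query):
    """Match the query against titles and descriptions only."""
    q = query.lower()
    return [e["title"] for e in entries(DIRECTORY)
            if q in e["title"].lower() or q in e["description"].lower()]


print(search("Picasso"))     # too specific for titles and descriptions
print(search("modern art"))  # the broader topic finds the listing
print(browse(["Arts", "Modern Art"])[0]["url"])
```

In this toy directory, a search for "Picasso" finds nothing even though the listed site might discuss him at length, while "modern art" leads straight to it.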
In social bookmarking, a community of users compiles an index by collectively submitting (or bookmarking) their favorite sites, and then tags them (assigns each site a set of keywords) so that they'll turn up when you search. With social bookmarking, you'll learn how popular a Web site is and get only results you know other users find useful or interesting.
- If you're still scratching your head about what social bookmarking is, watch this YouTube video for an explanation.
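For readers who prefer code to video, here is a minimal Python sketch of the tagging idea. The users, addresses and tags in `BOOKMARKS` are invented, and `search_by_tag` is a hypothetical name. Actual social bookmarking sites may weigh popularity quite differently.

```python
from collections import Counter

# Hypothetical bookmarks: (user, url, tags). All names are invented.
BOOKMARKS = [
    ("ana", "https://example.org/reuben", {"sandwich", "recipe"}),
    ("ben", "https://example.org/reuben", {"sandwich", "lunch"}),
    ("cal", "https://example.org/panini", {"sandwich"}),
    ("ana", "https://example.org/soup", {"recipe"}),
]


def search_by_tag(tag):
    """Return (url, number of users who used the tag) pairs, most popular first."""
    counts = Counter(url for _, url, tags in BOOKMARKS if tag in tags)
    return counts.most_common()


print(search_by_tag("sandwich"))
```

A site appears in the results only because at least one person chose to bookmark and tag it, and the count beside it shows how many users did so.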
Most standard sources of information aren't adequate for academic purposes; what you need is the information and in-depth research found only in scholarly sources. Use the sites below to find scholarly sources online that can aid your research or writing.
- Scholarly sources aren't meant to be easy to read or understand. They are often first-hand sources, or come from people and organizations that deal specifically with your topic of interest.
- The Online Education Database's article, "Research Beyond Google: 119 Authoritative, Invisible, and Comprehensive Resources," details Google alternatives ranging from the invisible Web to search engines specializing in things like art, government data and transportation.
Almost all of the information you find on the Internet is copyrighted. All copyright and intellectual property laws also apply to the Internet, so become familiar with terms-of-use policies and how to cite a source.
- Citing is the act of attributing borrowed ideas used in your own work to the authors or locations from which you took them.
- Always assume something is copyrighted and, as a general rule, ask permission before using it.
- If you want to find free-use material, read the findingDulcinea Free-Use Media Web Guide.
- Unsure of what Web content is free to use and what you need to cite? See the findingDulcinea Web Guide to Plagiarism Prevention for sites on citation and plagiarism information.