Web scraping  

From The Art and Popular Culture Encyclopedia

Jump to: navigation, search

Related e

Wikipedia
Wiktionary
Shop


Featured:

Web scraping (or Web harvesting, Web data extraction) is a computer technique of extracting information from websites using specially coded software programs. Usually, such software programs simulate human exploration of the Web by either implementing low-level Hypertext Transfer Protocol (HTTP), or embedding certain full-fledged Web browsers, such as the Internet Explorer (IE) and the Mozilla Web browser. Web scraping is closely related to Web indexing, which indexes Web content using a bot and is a universal technique adopted by most search engines. In contrast, Web scraping focuses more on the transformation of unstructured Web content, typically in HTML format, into structured data that can be stored and analyzed in a central local database or spreadsheet. Web scraping is also related to Web automation, which simulates human Web browsing using computer software. Exemplary uses of Web scraping include online price comparison, weather data monitoring, website change detection, Web research, Web content mashup and Web data integration.

See also




Unless indicated otherwise, the text in this article is either based on Wikipedia article "Web scraping" or another language Wikipedia page thereof used under the terms of the GNU Free Documentation License; or on research by Jahsonic and friends. See Art and Popular Culture's copyright notice.

Personal tools