Method comparison
View the selected methods side by side; rows that differ are marked with ≠.
| | Remote web scraping | API-based data collection |
|---|---|---|
| Field | Survey methodology | Survey methodology |
| Family | Process / pipeline | Process / pipeline |
| Year of origin ≠ | 2000s–2010s (cloud infrastructure era) | 2000s–2010s (formalized as a research method) |
| Originator ≠ | Distributed computing and web automation communities | Emerged from computational social science and web 2.0 platform practices |
| Type ≠ | Automated remote data collection technique | Digital data collection technique |
| Seminal work ≠ | Mitchell, R. (2018). Web Scraping with Python: Collecting More Data from the Modern Web (2nd ed.). O'Reilly Media. ISBN: 978-1491985571 | Salganik, M. J. (2018). Bit by Bit: Social Research in the Digital Age. Princeton University Press. ISBN: 9780691158648 |
| Also known as | cloud web scraping, server-side scraping, remote automated data extraction, distributed web scraping | API data harvesting, API-driven data collection, programmatic data retrieval, API research data collection |
| Related ≠ | 3 | 5 |
| Summary ≠ | Remote web scraping is a data collection approach in which automated scripts or bots harvest publicly accessible web content (text, tables, metadata, or links) while running on remote servers or cloud infrastructure rather than on the researcher's local machine. This separation allows continuous, large-scale, or geographically distributed crawling that local setups cannot sustain, which makes the approach particularly suited to longitudinal or high-volume data collection. | API-based data collection is a systematic technique in which a researcher sends structured requests to an application programming interface to retrieve data automatically from digital platforms, databases, or services. It is a primary method in computational social science for gathering large-scale social media records, government open data, financial data streams, and scientific repository content in machine-readable formats such as JSON or XML, enabling reproducible and scalable data acquisition that manual collection cannot match. |
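As a rough illustration of the API-based approach summarized in the table, the sketch below pages through a JSON API and collects every record. It is only a sketch under stated assumptions: the `fetch` callable, the `records` and `next_cursor` field names, and the in-memory `fake_fetch` stand-in are all hypothetical and do not correspond to any particular platform's API. A real collector would replace `fake_fetch` with an HTTP request and would also need to respect the provider's rate limits and terms of use.

```python
import json
from typing import Callable, Iterator, Optional


def collect_records(fetch: Callable[[Optional[str]], str]) -> Iterator[dict]:
    """Yield every record from a cursor-paginated JSON API.

    `fetch(cursor)` is a hypothetical callable that returns the raw JSON
    text of one page; `cursor` is None for the first page.
    """
    cursor = None
    while True:
        page = json.loads(fetch(cursor))   # parse the machine-readable response
        yield from page["records"]         # assumed field holding the data rows
        cursor = page.get("next_cursor")   # assumed field pointing to the next page
        if cursor is None:                 # no further pages: stop requesting
            break


# In-memory stand-in for a remote API, so the sketch runs without a network.
_FAKE_PAGES = {
    None: {"records": [{"id": 1}, {"id": 2}], "next_cursor": "p2"},
    "p2": {"records": [{"id": 3}], "next_cursor": None},
}


def fake_fetch(cursor: Optional[str]) -> str:
    """Return one page of the fake API as JSON text."""
    return json.dumps(_FAKE_PAGES[cursor])


records = list(collect_records(fake_fetch))
print(len(records))  # 3
```

Passing the fetch step in as a parameter keeps the pagination logic separate from the transport, so the same loop can be exercised offline and then pointed at a real endpoint.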