Scraping Best Practices Investigator - Python


About the Job:

Your key objective will be to advance Scrapinghub’s knowledge of web technologies and web scraping best practices.

This is not a production role. Instead, you’ll be given the time and resources to iteratively, and with scientific rigor, test hypotheses and produce a research-backed knowledge base for other developers at Scrapinghub.

Although you won't work on specific customer projects, your work will help fuel growth across all of Scrapinghub’s Data business (Professional Services & Data on Demand). Your success will be measured by your ability to iterate quickly and produce assets that are useful to other Shubbers.

Job Responsibilities:

    • Create and execute well-designed experiments (repeatable, multiple treatments, testable variables, controls, replication) to learn more about how best to complete web scraping projects
    • Produce well-written, indexed reports of your findings (similar to publishing in an academic journal, though not nearly as lengthy)
    • Propose new experiments to run
    • Work with the Team Lead to prioritize the backlog of experiments
    • Maintain best-practice guides, based on your findings, for other Shubbers who will be implementing client solutions
    • Propose changes to Scrapinghub’s other products (Crawlera, Scrapy Cloud, etc.) or Scrapy itself based on your findings

Job Requirements:

    • Excellent communication in written English.
    • A strong understanding of the Scientific Method and the ability to continuously implement a process that follows it with rigor.
    • A logical, measurement-backed approach to prioritizing projects, and enjoyment of working with others who do the same.
    • Familiarity with techniques and tools for crawling, extracting, and processing data, as well as with asynchronous communication and distributed systems.
    • A strong knowledge of Python along with a broad general programming background and strong problem-solving skills.
    • Enjoyment of working across several teams and communicating with your end customers (other Shubbers).