Grab - site scraping framework

Grab can help you:

  • Extract data from web sites
  • Work with web-service APIs
  • Automate activity on a web site
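
To make the first item concrete, here is a minimal sketch of what "extracting data" from a page can look like. It deliberately uses only the Python 3 standard library and does not show Grab's own API (see the docs for that). The names `LinkExtractor`, `extract_links` and `SAMPLE_PAGE` are hypothetical and exist only for this illustration, and the page is an inline string so nothing is fetched over the network.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> element.

    Hypothetical helper for illustration; not part of Grab.
    """

    def __init__(self):
        super().__init__()
        self.links = []      # accumulated (href, text) results
        self._href = None    # href of the <a> currently being read, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Anchors without an href leave _href as None and are skipped.
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a href=...>.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, link text) tuples found in an HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Assumed sample document standing in for a downloaded page.
SAMPLE_PAGE = """
<html><body>
  <h1>Catalog</h1>
  <a href="/item/1">First item</a>
  <a href="/item/2">Second <b>item</b></a>
</body></html>
"""

links = extract_links(SAMPLE_PAGE)
```

A scraping framework layers network requests, cookies, forms and more convenient selectors on top of this kind of parsing step.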

Important information:

If you want to use Grab on Windows, you should download our build of the pycurl library. It fixes a pycurl bug that causes some POST requests to fail. Download link: pycurl-ssl-7.19.0.win32-py2.7.msi

Discussions in python-grab

  • 23 September 07:35: Please explain how to work with regular expressions, IndexError: list index out of range
  • 21 September 18:52: Re: Spider. Several requests in one task.
  • 20 September 13:16: And again win32 ImportError: DLL load failed:
  • 18 September 20:44: Parsing a site, with the result as a MySQL dump
  • 13 September 21:24: UnicodeEncodeError: 'ascii' codec can't encode...

Documentation

The docs are at docs.grablib.org. They were originally written in Russian, and I am now translating them into English.

Here are the English docs, which are still incomplete.

How to help Grab project

  1. Write about Grab on your blog or on a popular discussion board such as reddit or Hacker News
  2. Report a bug and describe it in detail
  3. Implement a new feature and submit a pull request
  4. Order a site-scraping project from DataLab
