Skip to content

Commit

Permalink
Update README.md
Browse files Browse the repository at this point in the history
  • Loading branch information
AndrewKorzh authored Aug 8, 2024
1 parent bb60de0 commit e403d71
Showing 1 changed file with 18 additions and 0 deletions.
18 changes: 18 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,22 @@ DOWNLOADER_MIDDLEWARES = {
'scrapypuppeteer.middleware.PuppeteerServiceDownloaderMiddleware': 1042
}

PUPPETEER_SERVICE_URL = 'http://localhost:3000'

#To run locally (without scrapy-puppeteer-service started), you need to enable the setting:
PUPPETEER_LOCAL = True
```
For local execution it is also necessary to install Chromium for Pyppeteer.

## Configuration

You should have [scrapy-puppeteer-service](https://github.com/ispras/scrapy-puppeteer-service) started.
Then add its URL to `settings.py` and enable puppeteer downloader middleware:
```python
DOWNLOADER_MIDDLEWARES = {
'scrapypuppeteer.middleware.PuppeteerServiceDownloaderMiddleware': 1042
}

PUPPETEER_SERVICE_URL = 'http://localhost:3000'
```

Expand Down Expand Up @@ -50,6 +66,7 @@ There is a parent `PuppeteerResponse` class from which other response classes ar
Here is a list of them all:
- `PuppeteerHtmlResponse` - has `html` and `cookies` properties
- `PuppeteerScreenshotResponse` - has `screenshot` property
- `PuppeteerHarResponse` - has `har` property
- `PuppeteerJsonResponse` - has `data` property and `to_html()` method which tries to transform itself to `PuppeteerHtmlResponse`
- `PuppeteerRecaptchaSolverResponse(PuppeteerJsonResponse, PuppeteerHtmlResponse)` - has `recaptcha_data` property

Expand All @@ -66,6 +83,7 @@ Here is the list of available actions:
- `Click(selector, click_options, wait_options)` - click on element on page
- `Scroll(selector, wait_options)` - scroll page
- `Screenshot(options)` - take screenshot
- `Har()` - to get the HAR file, pass the `har_recording=True` argument to `PuppeteerRequest` at the start of execution.
- `RecaptchaSolver(solve_recaptcha)` - find or solve recaptcha on page
- `CustomJsAction(js_function)` - evaluate JS function on page

Expand Down

0 comments on commit e403d71

Please sign in to comment.