Overview

A service for retrieving HTTP resources asynchronously. Self-hosted within a lovely collection of Docker containers.

Short Description

Send a POST request containing url and callback values and, optionally, headers and parameters values. The service responds straight away with a request ID. The content for the given URL will eventually be retrieved and sent in a POST request to the specified callback URL.

+--------------+                                                            +--------------+
|              |                                                            |              |
|              | POST http://localhost:8001/                                |              |
|              | url=http://example.com/                                    |              |
|              | callback=http://callback.example.com/                      |              |
|              | headers={                                                  |              |
|              |     "User-Agent":"Chrome, honest"                          |              |
|              | }                                                          |              |
|              | parameters={                                               |              |
|              |     "cookies": {                                           |              |
|              |     }                                                      |              |
|              | }                                                          |              |
|              |                                                            |              |
|              |                                                            |              |
|              |                                                            | Asynchronous |
| Your         |                                                            | HTTP         |
| application  | +--------------------------------------------------------> | Retriever    |
|              |                                                            |              |
|              |                                                            |              |
|              |                                                            |              |
|              |                                                            |              |
|              |                       HTTP 200 OK                          |              |
|              |                       Content-Type: application/json       |              |
|              |                                                            |              |
|              |                       "118e35f631be802c41bec5c9dfb0f415"   |              |
|              | <--------------------------------------------------------+ |              |
+--------------+                                                            +--------------+
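
The request in the diagram above could be built along these lines. This is a minimal sketch, not the service's documented client: it assumes the fields are sent form-encoded and that headers and parameters are passed as JSON-encoded strings, which is how the diagram reads. The function names `build_retrieval_request` and `submit` are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_retrieval_request(service_url, url, callback, headers=None, parameters=None):
    """Build (but do not send) a POST request like the one in the diagram."""
    fields = {"url": url, "callback": callback}
    # Assumption: headers and parameters travel as JSON-encoded form values.
    if headers is not None:
        fields["headers"] = json.dumps(headers)
    if parameters is not None:
        fields["parameters"] = json.dumps(parameters)
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(service_url, data=body, method="POST")


def submit(request):
    """Send the request and return the request ID from the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_retrieval_request(
    "http://localhost:8001/",
    url="http://example.com/",
    callback="http://callback.example.com/",
    headers={"User-Agent": "Chrome, honest"},
    parameters={"cookies": {}},
)
# request_id = submit(request)  # requires the service to be running locally
```

Keep the returned request ID: it is the only thing that ties the eventual callback to this request.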

… some time passes …

+-------------+                                                             +--------------+
|             |                     POST http://callback.example.com/       |              |
|             |                     {                                       |              |
|             |                       "request_id": "118e35f631be802c41b…", |              |
|             |                       "status": "success",                  |              |
|             |                       "headers": {                          |              |
|             |                         "content-type": "text/html;"        |              |
| Your        |                       },                                    |              |
| callback    |                       "content": "PGRvY3R5cGUgaHRtbD4="     | Asynchronous |
| handler     |                     }                                       | HTTP         |
|             |                                                             | Retriever    |
|             | <---------------------------------------------------------+ |              |
+-------------+                                                             +--------------+
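
A callback handler then needs to decode the body it receives. The sketch below assumes the callback body is JSON shaped like the one in the diagram and that content is always base64-encoded; how failed retrievals are reported is not shown here, so only the success case is handled. `parse_callback` is a hypothetical helper name.

```python
import base64
import json


def parse_callback(body):
    """Decode a callback body into (request_id, status, headers, content)."""
    payload = json.loads(body)
    # Assumption: "content" is base64-encoded, as in the diagram.
    content = base64.b64decode(payload["content"])
    return payload["request_id"], payload["status"], payload["headers"], content


sample_body = json.dumps({
    "request_id": "118e35f631be802c41b…",  # truncated in the diagram, kept as-is
    "status": "success",
    "headers": {"content-type": "text/html;"},
    "content": "PGRvY3R5cGUgaHRtbD4=",
})

request_id, status, headers, content = parse_callback(sample_body)
print(status, content)  # success b'<doctype html>'
```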

Why?

Pretty much every modern programming ecosystem provides a means for making HTTP requests and handling the resulting responses. You already get synchronous HTTP out of the box, possibly asynchronous HTTP as well. Using whatever HTTP functionality your programming ecosystem provides is fine most of the time.

Want to retrieve the content of arbitrary URLs often? No, you probably don’t. But if you do, you will periodically run into failure cases.

We don’t like failure cases: temporary service unavailability, intermittent internal server errors, unpredictable rate-limiting responses.

To reliably retrieve an arbitrary HTTP resource, you need to be able to retry after a given period for those odd cases where a request fails right now but could (maybe would) succeed a little later. That introduces state (remembering what to retrieve), and you need something to perform the retry at the right time (some form of delayable background job processing).
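
To make that concrete, here is a toy illustration of the two ingredients: remembered state and delayed retries. It is not how this service is implemented; `RetryQueue`, its in-memory heap, the 60-second delay and the attempt limit are all assumptions made for the example.

```python
import heapq


class RetryQueue:
    """In-memory stand-in for delayable background job processing."""

    def __init__(self):
        self._jobs = []  # heap of (due_time, url, attempt): the remembered state

    def schedule(self, url, due_time, attempt=1):
        heapq.heappush(self._jobs, (due_time, url, attempt))

    def run_due(self, now, fetch, retry_delay=60, max_attempts=3):
        """Run every job due at `now`; reschedule the ones that fail."""
        results = {}
        while self._jobs and self._jobs[0][0] <= now:
            _, url, attempt = heapq.heappop(self._jobs)
            try:
                results[url] = fetch(url)
            except OSError:  # urllib.error.URLError is a subclass of OSError
                if attempt < max_attempts:
                    self.schedule(url, now + retry_delay, attempt + 1)
        return results


calls = []


def flaky_fetch(url):
    """Hypothetical fetcher that fails once, then succeeds."""
    calls.append(url)
    if len(calls) == 1:
        raise OSError("503 Service Unavailable")
    return "content of " + url


queue = RetryQueue()
queue.schedule("http://example.com/", due_time=0)
first = queue.run_due(now=0, fetch=flaky_fetch)    # fails, rescheduled for t=60
second = queue.run_due(now=60, fetch=flaky_fetch)  # succeeds on the retry
```

A real version would also need persistent storage and a worker that wakes up at the right time, which is the part you would otherwise rebuild in every application.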

You could re-write the means for doing so for every application you create that needs to retrieve resources over HTTP. Or you could not. Up to you really.

Production Readiness

Not production ready