📜 Framework-agnostic package to load items from any paginated JSON API into a Laravel lazy collection via async HTTP requests.

🐼 Lazy JSON Pages

Framework agnostic package using asynchronous HTTP requests and generators to load paginated items of JSON APIs into Laravel lazy collections.

Need to load heavy JSON with no pagination? Consider using Lazy JSON instead.

Install

In a Laravel application, all you need to do is require the package:

composer require cerbero/lazy-json-pages

Otherwise, you also need to register the lazy collection macro:

use Cerbero\LazyJsonPages\Macro;
use Illuminate\Support\LazyCollection;

LazyCollection::macro('fromJsonPages', new Macro());

Usage

Loading paginated items of JSON APIs into a lazy collection is possible by calling the collection itself or the included helper:

$items = LazyCollection::fromJsonPages($source, $path, $config);

$items = lazyJsonPages($source, $path, $config);

The source from which paginated items are fetched can be either a PSR-7 request or a Laravel HTTP client response:

// the Guzzle request is just an example, any PSR-7 request can be used as well
$source = new GuzzleHttp\Psr7\Request('GET', 'https://paginated-json-api.test');

// Lazy JSON Pages integrates well with Laravel and supports its HTTP client responses
$source = Http::get('https://paginated-json-api.test');

Lazy JSON Pages only changes the page query parameter when fetching pages. This means that if the first request was authenticated (e.g. via bearer token), the requests to fetch the other pages will be authenticated as well.
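To make the idea concrete, here is a minimal sketch of rewriting only the page query parameter of a URL. The `withPage()` helper is hypothetical and not part of Lazy JSON Pages; it ignores ports, fragments and headers, which a real PSR-7 request would carry over unchanged.

```php
<?php

// Hypothetical helper, not part of Lazy JSON Pages: returns $url with only
// the page query parameter replaced, leaving every other parameter untouched.
function withPage(string $url, int $page, string $pageName = 'page'): string
{
    $parts = parse_url($url);
    parse_str($parts['query'] ?? '', $query);
    $query[$pageName] = $page;

    $base = $parts['scheme'] . '://' . $parts['host'] . ($parts['path'] ?? '');

    return $base . '?' . http_build_query($query);
}

echo withPage('https://paginated-json-api.test/items?filter=active&page=1', 2), PHP_EOL;
// https://paginated-json-api.test/items?filter=active&page=2
```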

The second argument, $path, is the key within JSON APIs holding the paginated items. The path supports dot-notation so if the key is nested, we can define its nesting levels with dots. For example, given the following JSON:

{
    "data": {
        "results": [
            {
                "id": 1
            },
            {
                "id": 2
            }
        ]
    }
}

the path to the paginated items would be data.results. All nested JSON keys can be defined with dot-notation, including the keys to set in the configuration.
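As a rough illustration of how a dot-notation path can be resolved against decoded JSON, consider the sketch below. The `valueAtPath()` function is an assumption made for this example and is not the package's own implementation.

```php
<?php

// Illustrative helper (an assumption, not the package's code): walks a decoded
// JSON array one dot-separated segment at a time.
function valueAtPath(array $data, string $path, mixed $default = null): mixed
{
    foreach (explode('.', $path) as $segment) {
        if (!is_array($data) || !array_key_exists($segment, $data)) {
            return $default;
        }

        $data = $data[$segment];
    }

    return $data;
}

$decoded = json_decode('{"data": {"results": [{"id": 1}, {"id": 2}]}}', true);

echo count(valueAtPath($decoded, 'data.results')), PHP_EOL; // 2
```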

APIs are all different so Lazy JSON Pages allows us to define tailored configurations for each of them. The configuration can be set with the following variants:

// assume that the integer indicates the number of pages
// to be used when the number is known (e.g. via previous HTTP request)
lazyJsonPages($source, $path, 10);

// assume that the string indicates the JSON key holding the number of pages
lazyJsonPages($source, $path, 'total_pages');

// set the config with an associative array
// both snake_case and camelCase keys are allowed
lazyJsonPages($source, $path, [
    'items' => 'total_items',
    'per_page' => 50,
]);

// set the config through its fluent methods
use Cerbero\LazyJsonPages\Config;

lazyJsonPages($source, $path, function (Config $config) {
    $config->items('total_items')->perPage(50);
});

The configuration depends on the type of pagination. Various paginations are supported, including length-aware and cursor paginations.

Length-aware paginations

The term "length-aware" indicates all paginations that show at least one of the following numbers:

  • the total number of pages
  • the total number of items
  • the number of the last page

Lazy JSON Pages only needs one of these numbers to work properly. When setting the number of items, we can also define the number of items shown per page (if we know it) to save some more memory. The following are all valid configurations:

// configure the total number of pages:
$config = 10;
$config = 'total_pages';
$config = ['pages' => 'total_pages'];
$config->pages('total_pages');

// configure the total number of items:
$config = ['items' => 500];
$config = ['items' => 'total_items'];
$config = ['items' => 'total_items', 'per_page' => 50];
$config->items('total_items');
$config->items('total_items')->perPage(50);

// configure the number of the last page:
$config = ['last_page' => 10];
$config = ['last_page' => 'last_page_key'];
$config = ['last_page' => 'https://paginated-json-api.test?page=10'];
$config->lastPage(10);
$config->lastPage('last_page_key');
$config->lastPage('https://paginated-json-api.test?page=10');

Depending on the API, the last page may be indicated as a number or as a URL. Lazy JSON Pages supports both.
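The arithmetic behind these options can be sketched in plain PHP. Both helpers below are hypothetical and only show one plausible way to turn a total number of items, or a last-page URL, into a page number; the package may work differently internally.

```php
<?php

// Assumed arithmetic: the number of pages is the item count divided by the
// page size, rounded up so that a partially filled last page is still counted.
function pagesFromItems(int $totalItems, int $perPage): int
{
    return (int) ceil($totalItems / $perPage);
}

// Assumed parsing: read the page number out of a last-page URL.
function pageFromUrl(string $url, string $pageName = 'page'): ?int
{
    parse_str((string) parse_url($url, PHP_URL_QUERY), $query);

    return isset($query[$pageName]) ? (int) $query[$pageName] : null;
}

echo pagesFromItems(500, 50), PHP_EOL;                                // 10
echo pageFromUrl('https://paginated-json-api.test?page=10'), PHP_EOL; // 10
```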

By default this package assumes that the name of the page query parameter is page and that the first page is 1. If that is not the case, we can update the defaults by adding this configuration:

$config->pageName('page_number')->firstPage(0);
// or
$config = [
    'page_name' => 'page_number',
    'first_page' => 0,
];
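To see what a different first page means in practice, the sketch below lists the page numbers that would be requested. The `pageNumbers()` helper is hypothetical and only illustrates the offset; it is not taken from the package.

```php
<?php

// Assumed behaviour for illustration: list every page number to request,
// starting from a configurable first page.
function pageNumbers(int $totalPages, int $firstPage = 1): array
{
    if ($totalPages < 1) {
        return [];
    }

    return range($firstPage, $firstPage + $totalPages - 1);
}

echo implode(', ', pageNumbers(3, 0)), PHP_EOL; // 0, 1, 2
```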

When dealing with a lot of data, it's a good idea to fetch only 1 item (or a few if 1 is not allowed) on the first page to count the total number of pages/items without wasting memory and then fetch all the calculated pages with many more items.

We can do that with the "per page" setting by passing:

  • the new number of items to show per page
  • the query parameter holding the number of items per page

$source = new Request('GET', 'https://paginated-json-api.test?page_size=1');

$items = lazyJsonPages($source, $path, function (Config $config) {
    $config->pages('total_pages')->perPage(500, 'page_size');
});

Some APIs do not allow requesting only 1 item per page. In these cases we can specify the number of items present on the first page as the third argument:

$source = new Request('GET', 'https://paginated-json-api.test?page_size=5');

$items = lazyJsonPages($source, $path, function (Config $config) {
    $config->pages('total_pages')->perPage(500, 'page_size', 5);
});
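The reason the first-page size matters can be shown with some simple arithmetic. The sketch below is an assumption about how the page count might be recalculated, not the package's actual code: the page count reported by the first response is converted into an upper bound of items, then divided by the new page size. Because it is an upper bound, the result can occasionally include one trailing empty page.

```php
<?php

// Hypothetical recalculation: pages reported with a small page size are
// converted into the number of pages needed with a larger page size.
function pagesAfterResize(int $reportedPages, int $firstPageSize, int $newPageSize): int
{
    $maxItems = $reportedPages * $firstPageSize; // upper bound of total items

    return (int) ceil($maxItems / $newPageSize);
}

echo pagesAfterResize(1200, 1, 500), PHP_EOL; // 3
echo pagesAfterResize(200, 5, 500), PHP_EOL;  // 2
```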

As always, we can either set the configuration through the Config object or with an associative array:

$config = [
    'pages' => 'total_pages',
    'per_page' => [500, 'page_size', 5],
];

From now on we will just use the object-oriented version for brevity. Also note that the "per page" strategy can be used with any of the configurations seen so far:

$config->pages('total_pages')->perPage(500, 'page_size');
// or
$config->items('total_items')->perPage(500, 'page_size');
// or
$config->lastPage('last_page_key')->perPage(500, 'page_size');

Cursor and next-page paginations

Some APIs show only the number or cursor of the next page in all pages. We can tackle this kind of pagination by indicating the JSON key holding the next page:

$config->nextPage('next_page_key');

The JSON key may hold a number, a cursor or a URL. Lazy JSON Pages supports all of them.
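The general shape of next-page pagination can be sketched with a generator. Everything below is illustrative: `$fakeApi` stands in for real HTTP responses and `itemsByCursor()` is a hypothetical function, not part of the package.

```php
<?php

// Fake API keyed by cursor; it stands in for real HTTP responses (hypothetical data).
$fakeApi = [
    'start' => ['data' => [['id' => 1], ['id' => 2]], 'next_page_key' => 'abc'],
    'abc'   => ['data' => [['id' => 3]], 'next_page_key' => null],
];

// Yields items page by page, following the next-page key until it is empty.
function itemsByCursor(callable $fetch, string $firstCursor, string $itemsKey, string $nextKey): Generator
{
    $cursor = $firstCursor;

    while ($cursor !== null) {
        $page = $fetch($cursor);

        yield from $page[$itemsKey];

        $cursor = $page[$nextKey] ?? null;
    }
}

$fetch = fn (string $cursor): array => $fakeApi[$cursor];

foreach (itemsByCursor($fetch, 'start', 'data', 'next_page_key') as $item) {
    echo $item['id'], PHP_EOL; // prints 1, 2 and 3 on separate lines
}
```

Since each page is only fetched when the loop asks for more items, at most one page is held in memory at a time.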

Fine-tuning the pages fetching process

Lazy JSON Pages provides a number of settings to adjust the way HTTP requests are sent to fetch pages. For example, pages can be requested in chunks so that only a few streams are kept in memory at once:

$config->chunk(3);

The configuration above fetches 3 pages concurrently, loads the paginated items into a lazy collection and proceeds with the next 3 pages. Chunking reduces memory usage at the expense of speed. No chunking is set by default, but it is recommended when dealing with a lot of data.
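Conceptually, chunking splits the list of pages into groups that are processed one group at a time. The sketch below uses plain `array_chunk()` and assumes, for illustration only, that pages 2 to 10 are still to be fetched; it does not send any request.

```php
<?php

// Pages still to be fetched (assumed for this example).
$pages = range(2, 10);

foreach (array_chunk($pages, 3) as $chunk) {
    // In a real client, the pages in $chunk would be requested concurrently
    // here and their items consumed before moving on to the next chunk.
    echo implode(', ', $chunk), PHP_EOL;
}
// 2, 3, 4
// 5, 6, 7
// 8, 9, 10
```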

To minimize memory usage, Lazy JSON Pages can fetch pages synchronously, i.e. one by one. Beware that this is also the slowest solution:

$config->sync();

We can also set how many HTTP requests we want to send concurrently. By default 10 pages are fetched asynchronously:

$config->concurrency(25);

Every HTTP request has a timeout of 5 seconds by default, but some APIs may be slow to respond. In this case we may need to set a higher timeout:

$config->timeout(15);

When a request fails, it has up to 3 attempts to succeed. This number can of course be adjusted as needed:

$config->attempts(5);

The backoff strategy allows us to wait some time before sending other requests when a page fails to load. The package provides an exponential backoff by default: when a request fails, it is retried after 0, 1, 4, 9 seconds and so on (the delays grow with the square of the attempt count). This strategy can also be overridden:

$config->backoff(function (int $attempt) {
    return $attempt ** 2 * 100;
});

The above backoff strategy will wait for 100, 400, 900 milliseconds and so on.
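We can check those delays by running the same formula in isolation. The closure below mirrors the callback shown above; the attempt numbers 1, 2 and 3 are assumed inputs for the sake of the example.

```php
<?php

// Same formula as the custom backoff above, returning milliseconds.
$backoff = fn (int $attempt): int => $attempt ** 2 * 100;

$delays = array_map($backoff, [1, 2, 3]);

echo implode(', ', $delays), PHP_EOL; // 100, 400, 900
```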

Putting it all together, this is one of the possible configurations:

$source = new Request('GET', 'https://paginated-json-api.test?page_size=1');

$items = lazyJsonPages($source, 'data.results', function (Config $config) {
    $config
        ->pages('total_pages')
        ->perPage(500, 'page_size')
        ->chunk(3)
        ->timeout(15)
        ->attempts(5)
        ->backoff(fn (int $attempt) => $attempt ** 2 * 100);
});

$items
    ->filter(fn (array $item) => $this->isValid($item))
    ->map(fn (array $item) => $this->transform($item))
    ->each(fn (array $item) => $this->save($item));

Handling errors

As seen above, we can mitigate potentially faulty HTTP requests with backoffs, timeouts and retries. When we reach the maximum number of attempts and a request keeps failing, an OutOfAttemptsException is thrown.

When caught, this exception provides information about what went wrong, including the actual exception that was thrown, the pages that failed to be fetched and the paginated items that were loaded before the failure happened:

use Cerbero\LazyJsonPages\Exceptions\OutOfAttemptsException;

try {
    $items = lazyJsonPages($source, $path, $config);
} catch (OutOfAttemptsException $e) {
    // the actual exception that was thrown
    $e->original;
    // the pages that failed to be fetched
    $e->failedPages;
    // a LazyCollection with items loaded before the error
    $e->items;
}
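The interplay between attempts, backoff and the final exception can be sketched as a generic retry loop. This is not the package's implementation: `RetriesExhausted` and `withRetries()` are hypothetical names introduced for the example, and the backoff is assumed to return milliseconds.

```php
<?php

// Hypothetical stand-in for the package's exception; the name is an assumption.
final class RetriesExhausted extends RuntimeException
{
}

// Runs $request up to $attempts times, sleeping between failed attempts.
function withRetries(callable $request, int $attempts, callable $backoff): mixed
{
    for ($attempt = 1; $attempt <= $attempts; $attempt++) {
        try {
            return $request();
        } catch (Throwable $e) {
            if ($attempt === $attempts) {
                // Keep the original failure available via getPrevious().
                throw new RetriesExhausted('All attempts failed', 0, $e);
            }

            usleep($backoff($attempt) * 1000); // $backoff returns milliseconds
        }
    }

    throw new LogicException('Attempts must be at least 1');
}

$calls = 0;

// Fails twice, then succeeds on the third call.
$result = withRetries(function () use (&$calls) {
    if (++$calls < 3) {
        throw new RuntimeException('temporary failure');
    }

    return 'ok';
}, 5, fn (int $attempt): int => $attempt);

echo $result, ' after ', $calls, ' calls', PHP_EOL; // ok after 3 calls
```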

Change log

Please see CHANGELOG for more information on what has changed recently.

Testing

composer test

Contributing

Please see CONTRIBUTING and CODE_OF_CONDUCT for details.

Security

If you discover any security related issues, please email [email protected] instead of using the issue tracker.

Credits

License

The MIT License (MIT). Please see License File for more information.
