When making any kind of GET request to the eagle.io HTTP API, you expect to get some results back. These results could be in the form of timeseries data, nodes, or some kind of metadata such as events. But if there are too many results, the API call can fail. So how do you ensure you can reliably get the results you need, even if you don't know ahead of time exactly how many results there will be?
While the documentation recommends limiting both the size and rate of requests, this article will focus on limiting the size. Specifically, how to split up requests for very large amounts of data, so that the data is returned via a number of smaller and predictably sized responses rather than one unpredictably large response.
The following technique applies only to node and event data; limiting timeseries data requires a slightly different approach, which will be discussed in a future article. The examples that follow all focus on limiting results from the GET /api/v1/events API endpoint, but the same technique can be applied to both the GET /api/v1/nodes and GET /api/v1/nodes/:id/events API endpoints.
The GET /api/v1/events API endpoint will, as the name suggests, retrieve a list of all events available to the authenticated API key. We can (and definitely should) provide a time range when making a request to this endpoint, so let's try to get all events from the month of July. As usual when prototyping or testing API endpoints, we will use the excellent Postman tool as our API client:
Disaster! Our request has timed out after 50 seconds, presumably because there are so many events this month that the response would have been too large.
Let's try the same request, but this time with a limit of 1000 events. This is done by adding &limit=1000 to the end of the URL:
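For readers scripting this rather than using Postman, the same limited request can be sketched in Python. This is only a sketch: the X-Api-Key authentication header follows the eagle.io documentation, but the key value is a placeholder you must supply yourself.

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://api.eagle.io/api/v1/events"

def build_events_url(start_time, end_time, limit=1000, skip=0):
    """Build the events URL with an explicit time range, limit and skip."""
    params = {"startTime": start_time, "endTime": end_time, "limit": limit}
    if skip:
        params["skip"] = skip
    return f"{BASE_URL}?{urlencode(params)}"

def fetch_events(api_key, start_time, end_time, limit=1000, skip=0):
    """Fetch one page of events; the API key goes in the X-Api-Key header."""
    url = build_events_url(start_time, end_time, limit, skip)
    req = urllib.request.Request(url, headers={"X-Api-Key": api_key})
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)

# Build the URL for July's events with a limit of 1000:
print(build_events_url("2021-07-01T00:00:00Z", "2021-08-01T00:00:00Z"))
```

Note that the time range and limit are passed as ordinary query parameters, so the same helper covers the skip requests introduced next.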
That worked a lot better; we got 1000 events in about 2 seconds. But to get the next batch of events, we need to add a skip option, which will skip over the 1000 records we already have. Now we have &limit=1000&skip=1000 at the end of the URL:
Note that this second batch of events starts with an event that occurred about 2 days into the month. This means we would need to make approximately 15 API calls in order to retrieve the full range of events for the month, assuming the density of events remains reasonably constant. All subsequent API calls would be identical except for the skip value, which will increase by 1000 each time, e.g.:
Assuming these API calls are made within a script, probably as some kind of loop that adds 1000 to the skip value each time, how do we know when to stop? Since endTime=2021-08-01T00:00:00Z has been specified, we will eventually have retrieved all the events for July. The clue that all the events have been retrieved is when the number of events returned by a specific API call is less than the limit, i.e. less than 1000. To be sure, though, our script would probably want to keep going until an empty set of events is returned.
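Putting that stopping rule into code, the loop might look like the following sketch. Here fetch_page is an assumed stand-in for whatever function performs the actual HTTP call (such as a wrapper around Postman-style requests); the loop itself only manages limit and skip.

```python
def iter_all_events(fetch_page, limit=1000):
    """Yield every event by paging with limit/skip until an empty page returns."""
    skip = 0
    while True:
        page = fetch_page(limit=limit, skip=skip)
        if not page:       # empty page: we are definitely done
            break
        yield from page
        skip += limit      # next call skips everything retrieved so far

# Example with a fake fetcher standing in for the real API call:
fake_events = [{"id": i} for i in range(2500)]
fake_fetch = lambda limit, skip: fake_events[skip:skip + limit]
print(len(list(iter_all_events(fake_fetch))))  # prints 2500 (4 calls made)
```

With 2500 events and a limit of 1000, the loop makes four calls: three that return data (1000, 1000 and 500 events) and a final empty one that confirms there is nothing left.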
The big advantage of this method is that we are guaranteed a consistent number of entries in each response, even when we are not sure of the density of entries in the database. One downside is that we don't know in advance how many API calls will be needed to retrieve the full set of entries. And if we keep making API calls in a tight loop, there is a possibility of hitting rate limits that govern how many requests may be made in a given period of time. Therefore, a careful programmer will design their loop to be respectful of rate limits, which will be the subject of a further article.
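While proper rate-limit handling deserves its own article, the crudest form of respect is simply pausing between consecutive requests. The sketch below assumes the same fetch_page stand-in as before, and the 0.5-second pause is an arbitrary placeholder, not a value from the eagle.io documentation; tune it to the API's published limits.

```python
import time

def iter_all_events_politely(fetch_page, limit=1000, pause=0.5):
    """The same limit/skip loop, sleeping between calls to ease rate-limit pressure."""
    skip = 0
    while True:
        page = fetch_page(limit=limit, skip=skip)
        if not page:
            break
        yield from page
        skip += limit
        time.sleep(pause)  # crude throttle between consecutive requests
```

A more robust loop would also inspect rate-limit response headers or back off on errors, but a fixed pause is often enough for occasional bulk exports.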