cat /dev/brain

Some Better Practices For Using Requests in API Clients

As someone who has written a few API client libraries and works on Requests a bit, I realize I have some overlapping knowledge worth sharing. These are patterns I've found while building my own API client libraries and helping developers of other client libraries. I hesitate to call these "Best Practices" because I'm sure there are better ways of providing this functionality depending on how you design your library, but the underlying idea of giving users more choice and freedom should shine through regardless.

Before we get to that, though, let's start with some basics.

Requests Connection Pooling - Cruise Control for Better Performance

One of the things that Requests does for its users is provide connection pooling, but only when the user uses a Session object. So API client libraries should use Session objects under the hood and share a Session as much as possible. Take, for example, github3.py. Users create a GitHub object and can retrieve other objects with it: Users, Repositories, Issues, Pull Requests, etc. Any object created by that GitHub instance (or by an object it created) shares the same Session and thus the same connection pool. That also means they all share the same credentials and the same configuration.

Now, if someone creates a different GitHub object, it receives a new Session (unless the user passes in a specific one to share with another instance). Sessions are therefore not shared between everything, only between objects created from the same ancestor (for lack of a better term).
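A minimal sketch of that sharing pattern might look like the following. The Client and Repository classes are hypothetical stand-ins for illustration, not github3.py's actual classes; the only Requests API used is requests.Session.

```python
import requests


class Repository:
    """Hypothetical child object that reuses its parent's Session."""

    def __init__(self, name, session):
        self.name = name
        self.session = session


class Client:
    """Hypothetical top-level client (not github3.py's real API)."""

    def __init__(self, session=None):
        # Create a Session only when the caller did not supply one.
        self.session = session if session is not None else requests.Session()

    def repository(self, name):
        # Children receive the parent's Session, and with it the same
        # connection pool, credentials, and configuration.
        return Repository(name, self.session)


first = Client()
second = Client()
repo = first.repository("example")
```

Here repo shares first's Session, while second has its own unless a Session is passed in explicitly.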

Requests Adapters - Fine Tuning Your HTTP Experience

In the past, I have written about retries in which we create a new HTTPAdapter to configure how Requests performs retries in a granular way. Adapters, however, are fantastic tools that allow users to configure much more than just retries. For example, if you want to change the connection pool limit you can simply specify:

import requests

my_session = requests.Session()
my_adapter = requests.adapters.HTTPAdapter(pool_maxsize=20)
my_session.mount('https://', my_adapter)

In fact, there are several options that most people do not usually need but which users can specify:

  • max_retries - This can be an integer or a Retry object as documented in my other blog post
  • pool_connections - The number of connection pools to cache (roughly one pool per host you talk to)
  • pool_maxsize - The maximum number of connections to keep in a connection pool
  • pool_block - Whether or not the connection pool should block waiting for a fresh connection. This defaults to False which means that we will always create a new connection rather than wait for a previous request to finish and return the connection to the pool.

By overriding some of these defaults, you can tailor how Requests handles HTTP to your needs.
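As a rough sketch, all four options can be combined on one adapter. The specific numbers and the api.example.com host below are assumptions chosen for illustration, not recommendations; Retry comes from urllib3, which Requests depends on.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Illustrative values only; tune them for your own service.
retry = Retry(
    total=3,                           # at most three retries per request
    backoff_factor=0.5,                # sleep between attempts, growing each time
    status_forcelist=[502, 503, 504],  # also retry on these response codes
)

adapter = HTTPAdapter(
    max_retries=retry,    # an integer would work here too
    pool_connections=10,  # number of connection pools to cache
    pool_maxsize=20,      # connections kept per pool
    pool_block=True,      # wait for a free connection instead of opening more
)

session = requests.Session()
session.mount("https://", adapter)

# get_adapter() shows which adapter would handle a given URL.
chosen = session.get_adapter("https://api.example.com/v1")
```

Nothing is sent over the network here; the adapter only takes effect once the session makes a request to a matching URL.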

Pulling It All Together

As an API client developer, you may want to set up a default number of retries so your users aren't retrying requests manually. In that case, you might do something like this:

import attr
import requests

@attr.s
class APIClient(object):
    # Required attributes; attrs generates __init__ from these.
    username = attr.ib()
    password = attr.ib()
    url = attr.ib()

    def __attrs_post_init__(self):
        self.session = requests.Session()
        self.session.mount(
            'https://',
            requests.adapters.HTTPAdapter(max_retries=5)
        )
        # Alternatively, you may just want to scope it to the API and do
        # self.session.mount(
        #     self.url,
        #     requests.adapters.HTTPAdapter(max_retries=5)
        # )

But perhaps a user wants to specify a Session subclass that they've made because it sends metrics to statsd for them. In that case, you might then expand your API to look like:

import attr
import requests

@attr.s
class APIClient(object):
    username = attr.ib()
    password = attr.ib()
    url = attr.ib()
    # Optional: users may pass their own Session (or subclass) instead.
    session = attr.ib(default=attr.Factory(requests.Session))

    def __attrs_post_init__(self):
        self.session.mount(
            'https://',
            requests.adapters.HTTPAdapter(max_retries=5)
        )
        # Alternatively, you may just want to scope it to the API and do
        # self.session.mount(
        #     self.url,
        #     requests.adapters.HTTPAdapter(max_retries=5)
        # )

This, however, will always override whatever adapters the user may have specified. You also don't really want your APIClient class to accept the user's adapter class directly. Instead, you can offer a function that returns the specific keyword arguments you want passed to an adapter:

import attr
import requests

def http_adapter_kwargs():
    return {
        'max_retries': 5,
    }

def _make_new_session():
    session = requests.Session()
    session.mount(
        'https://',
        requests.adapters.HTTPAdapter(**http_adapter_kwargs())
    )
    return session

@attr.s
class APIClient(object):
    username = attr.ib()
    password = attr.ib()
    url = attr.ib()
    # Falls back to a Session with the library's default adapter settings.
    session = attr.ib(default=attr.Factory(_make_new_session))

Users would then do something like this:

import apiclient

session = MySession()
session.mount('https://', MyAdapter(**apiclient.http_adapter_kwargs()))
client = apiclient.APIClient(
    'username',
    'secrete-passw0rd',
    url='...',
    session=session,
)

This lets users who know what they need configure it themselves, while the library still provides the defaults it deems necessary when they don't. It also gives the user a choice about whether to accept the library's defaults. Perhaps the user found that Requests was making too many retry attempts, so they decided to lower the count or turn retries off entirely because failing fast matters more to them than eventually succeeding. The result is a client library that stays easy and convenient to use, especially as users' needs evolve and become more complex.
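For instance, a user who prefers to fail fast could start from the library's defaults and change just one of them. This is only a sketch: http_adapter_kwargs is redefined locally so the snippet stands alone, and the plain Session and HTTPAdapter stand in for whatever subclasses the user actually has.

```python
import requests
from requests.adapters import HTTPAdapter


def http_adapter_kwargs():
    """Library defaults, mirroring the function shown earlier."""
    return {"max_retries": 5}


# Start from the library's defaults, then turn retries off.
kwargs = http_adapter_kwargs()
kwargs["max_retries"] = 0
fail_fast_adapter = HTTPAdapter(**kwargs)

session = requests.Session()
session.mount("https://", fail_fast_adapter)
```

Because the function returns a fresh dictionary on every call, changing the user's copy does not affect the library's defaults for anyone else.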

Side Note

If, instead, you only want to configure an adapter for your API service (as opposed to all https:// URLs your session might handle), then your code sample might look more like this:

import attr
import requests

def http_adapter_kwargs():
    return {
        'max_retries': 5,
    }

def _make_new_session(url):
    session = requests.Session()
    session.mount(
        url,
        requests.adapters.HTTPAdapter(**http_adapter_kwargs())
    )
    return session

@attr.s
class APIClient(object):
    username = attr.ib()
    password = attr.ib()
    url = attr.ib()
    # None means "build a Session scoped to self.url" in __attrs_post_init__.
    session = attr.ib(default=None)

    def __attrs_post_init__(self):
        if self.session is None:
            self.session = _make_new_session(self.url)
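To see what scoping to a URL prefix does, here is a small sketch using Session.get_adapter, which returns the adapter a session would use for a given URL. The api.example.com and other.example.org hosts are placeholders.

```python
import requests
from requests.adapters import HTTPAdapter

session = requests.Session()
api_adapter = HTTPAdapter(max_retries=5)

# Mount only on the API's URL prefix rather than on all of https://.
session.mount("https://api.example.com/", api_adapter)

# URLs under the prefix use the API adapter.
for_api = session.get_adapter("https://api.example.com/users")

# Everything else falls back to the Session's default https:// adapter.
for_other = session.get_adapter("https://other.example.org/")
```

The retry settings then apply only to requests made to the API, and any other URL the session handles keeps Requests' defaults.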