The ApiClient¶
The ApiClient communicates with the Coscine REST API and enables the user to make requests to and get responses from Coscine.
Cache¶
To speed things up it makes use of a request cache that stores
responses from Coscine for re-use. As a consequence not every
function provided by the Coscine Python SDK that makes a request
to Coscine actually sends a request. Some may only send a request
once and later on revert to the cached response.
Other users in Coscine may make changes in the meantime, thereby effectively
invalidating cached data. To make sure that this does not lead to
inconsistencies, only constant data is cacheable and even that kind of data
has a limited lifetime in the cache. Once that lifetime ends, the cached
response is refreshed automatically on the next access.
Since constant data in Coscine rarely changes, the cache may be saved
to a file and re-used in later sessions. This can be disabled by the user.
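The lifetime-based behavior described above can be illustrated with a small sketch. This is not the SDK's actual cache; the class `TimedCache` and its methods are hypothetical names chosen for illustration, and the only assumption taken from the text is that an entry is re-fetched on the next access once its lifetime has ended.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TimedCache:
    """Hypothetical cache with a limited lifetime per entry (illustration only)."""

    def __init__(self, lifetime_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.lifetime = lifetime_seconds
        self.clock = clock
        # Maps a key to (time the value was stored, value).
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str, fetch: Callable[[], Any]) -> Any:
        """Return the cached value, calling fetch() only if it is missing or expired."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.lifetime:
            return entry[1]  # still fresh: no request is sent
        value = fetch()  # missing or expired: send the request again
        self._entries[key] = (now, value)
        return value

    def clear(self) -> None:
        """Drop all cached entries."""
        self._entries.clear()
```

Passing the clock as a parameter is a design choice of this sketch: it makes the expiry logic easy to exercise without actually waiting for the lifetime to pass.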
Initialization and configuration 🎢¶
Once Python and the coscine package are installed, these few lines of code are all we need to get started:
import coscine
token = "Our API Token"
client = coscine.ApiClient(token)
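To keep the token out of the source file, it can be read from the environment before the client is created. The following is a minimal sketch under stated assumptions: the variable name `COSCINE_API_TOKEN` and the helper `read_token` are hypothetical and not part of the SDK; only the call `coscine.ApiClient(token)` is taken from the lines above.

```python
import os


def read_token(variable: str = "COSCINE_API_TOKEN") -> str:
    """Read the API token from an environment variable (hypothetical name)."""
    token = os.environ.get(variable, "").strip()
    if not token:
        raise RuntimeError(f"environment variable {variable} is not set")
    return token


# Usage, assuming the coscine package is installed and the variable is set:
#   import coscine
#   client = coscine.ApiClient(read_token())
```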
Parallelization 🚀¶
While it does not come with built-in support for parallel requests,
the Coscine Python SDK can easily be used with concurrency in mind
using external measures such as a thread pool. The following snippet
uses the ThreadPoolExecutor type provided by the standard library module
concurrent.futures. In that snippet the executor manages 8 threads to which
it delegates function calls. We no longer have to wait until one function
returns but instead can have multiple functions run at the same time,
in this case 8, since there are 8 threads. By increasing the number of
threads we can increase the number of operations that we are able to
do concurrently. However, computing resources are generally scarce and
we should also avoid sending too many requests to Coscine at once, so
that we do not trigger the rate limiter and get a temporary timeout.
A reasonable number of threads is a low multiple of the number of
processor cores of the current machine, or exactly the number of cores.
Anything in the order of 2 to 16 threads will suffice.
from concurrent.futures import ThreadPoolExecutor, wait

# `project` is assumed to be a project object obtained from the client earlier.
emails = ["user@example.org", "user2@example.org"]  # ... and so on
with ThreadPoolExecutor(8) as executor:  # 8 threads
    futures = []
    for email in emails:
        future = executor.submit(project.invite, email)
        futures.append(future)
    # The tasks are now running in several threads. We are still in the main
    # thread at this point and can either process something else in the
    # meantime or wait for the threads with our function to finish.
    # Once they are finished we can process the results.
    wait(futures)
    # All futures arrived at this point since we were waiting for them
    for future in futures:
        function_return_value = future.result()
        print(function_return_value)
If you need the results returned by these functions, you need to wait for their futures. executor.submit returns a Future object immediately, but the result of the function is not available right away since the function is running in another thread. The main thread therefore does not block and proceeds with whatever comes next, while the result arrives in the future. When we are done sending requests to Coscine and want to process the results, we can wait for the remaining futures to complete and then process them.
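As an alternative to waiting for all futures at once, results can be handled as each future completes, which also makes it easy to deal with failures per call. The sketch below uses `as_completed` from the standard library. The function `invite` is a purely illustrative stand-in for a call such as `project.invite`, and `invite_all` is a hypothetical helper; neither is part of the SDK.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Dict, Iterable, Tuple


def invite(email: str) -> str:
    """Stand-in for a call such as project.invite (illustration only)."""
    if "@" not in email:
        raise ValueError(f"not an email address: {email}")
    return f"invited {email}"


def invite_all(emails: Iterable[str],
               threads: int = 4) -> Tuple[Dict[str, str], Dict[str, Exception]]:
    """Run invite() for every address and collect results and errors separately."""
    succeeded: Dict[str, str] = {}
    failed: Dict[str, Exception] = {}
    with ThreadPoolExecutor(threads) as executor:
        # Remember which future belongs to which address.
        future_to_email = {executor.submit(invite, e): e for e in emails}
        # as_completed yields each future as soon as its function has finished.
        for future in as_completed(future_to_email):
            email = future_to_email[future]
            try:
                succeeded[email] = future.result()
            except Exception as error:  # result() re-raises the function's exception
                failed[email] = error
    return succeeded, failed
```

Collecting errors in a separate dictionary means one failed call does not prevent the remaining results from being processed.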
For more information on launching parallel tasks with Python, refer to the official documentation.
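The sizing guidance for the thread pool (a low multiple of the number of cores, somewhere between 2 and 16 threads) can be sketched as a small helper. The function `pick_thread_count` and its parameters are hypothetical names for this illustration; `os.cpu_count()` is the standard library call and may return `None`, which the sketch treats as a single core.

```python
import os


def pick_thread_count(multiplier: int = 2, lower: int = 2, upper: int = 16) -> int:
    """Return cores * multiplier, clamped to the range [lower, upper]."""
    cores = os.cpu_count() or 1  # cpu_count() may return None
    return max(lower, min(upper, cores * multiplier))
```

The returned value could then be passed to `ThreadPoolExecutor` in place of the fixed 8 used in the snippet above.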
Caching¶
The Client uses caching internally. Sometimes you may run into cache consistency issues.
To clear the cache use:
ApiClient.session.cache.clear()
Pagination¶
Coscine deals out large amounts of data in a paginated way. The Coscine Python
SDK uses the maximum page size of 50 entries per page. The number of pages
that can be fetched by the SDK is soft-limited. Upon initialization of the
ApiClient instance, the max_pages argument can be set to an appropriate
value. By default it is set to 65535 and may either be restricted
or extended. The default value allows one to fetch 65535*50=3276750
items.
The limit can be adjusted at runtime by accessing the max_pages attribute
of an ApiClient object.
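How a page limit caps the number of items can be illustrated with a generic sketch. This is not the SDK's pagination code: `fetch_all`, `max_items` and `fake_fetch_page` are hypothetical names, and the fake page source stands in for real requests. Only the page size of 50 and the default limit of 65535 are taken from the text above.

```python
from typing import Callable, List

PAGE_SIZE = 50  # page size stated above


def max_items(max_pages: int, page_size: int = PAGE_SIZE) -> int:
    """Upper bound on the number of items reachable with the given page limit."""
    return max_pages * page_size


def fetch_all(fetch_page: Callable[[int], List[int]], max_pages: int) -> List[int]:
    """Collect pages until a short page is returned or max_pages is reached."""
    items: List[int] = []
    for page_number in range(1, max_pages + 1):
        page = fetch_page(page_number)
        items.extend(page)
        if len(page) < PAGE_SIZE:  # a short page means there is no more data
            break
    return items


def fake_fetch_page(page_number: int) -> List[int]:
    """Stand-in for a paginated request: serves 120 items in pages of 50."""
    data = list(range(120))
    start = (page_number - 1) * PAGE_SIZE
    return data[start:start + PAGE_SIZE]
```

With the fake source, a limit of 2 pages yields only the first 100 of the 120 items, while the default limit of 65535 pages is large enough to return all of them.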