coscine package#

Submodules#

coscine.client module#

Provides the Coscine ApiClient.

class coscine.client.ApiClient(token: str, language: str = 'en', base_url: str = 'https://coscine.rwth-aachen.de', enable_caching: bool = True, verbose: bool = True, verify_certificate: bool = True, timeout: float = 60.0, retries: int = 3, use_native: bool = True)[source]#

Bases: object

The ApiClient communicates with Coscine by sending requests to and receiving response data from Coscine.

Parameters:
  • token (str) – To be able to use the Coscine REST API one has to supply their Coscine API token. Every Coscine user can create their own set of API tokens for free in their Coscine user profile. For security reasons Coscine API Tokens are only valid for a certain amount of time after which they are deactivated.

  • language (str) – Just like in the web interface of Coscine one can select a language preset for the Coscine API. This will localize all multi-language vocabularies and application profiles to the selected language. The language can later be switched on the fly.

  • base_url (str) – Coscine is Open Source software and hosted on various domains. Via the base_url setting, the API user can specify which instance they would like to connect to. By default this is set to the Coscine instance of RWTH Aachen.

  • verify_certificate (bool) – Whether to verify the SSL server certificate of the Coscine server. By default this is enabled as it provides some form of protection against spoofing, but on test instances or “fresh” installs certificates are often not used. To be able to use the Coscine Python SDK with such instances, the verify setting should be turned off.

  • verbose (bool) – By disabling the verbose setting one can stop the Python SDK from printing to the command line interface (stdout). The stderr file handle is unaffected by this. This setting is particularly helpful in case you wish to disable the banner on initialization.

  • enable_caching (bool) – Enabling caching allows the Python SDK to store some of the responses it gets from Coscine in a RequestCache. This cache is always active at runtime but is only saved to and loaded from a file when the caching setting is enabled. Entries in the cache files are valid for a certain amount of time until they are refreshed. With caching enabled the Python SDK is much faster.

  • timeout (float (seconds)) – The timeout threshold for Coscine to respond to a request. If Coscine does not answer within the specified amount of seconds, an exception is raised. Note that setting a timeout is very important since otherwise your application may hang indefinitely if it never gets a response.

  • use_native – If enabled, up- and downloads are performed via the native providers. In the case of an s3 resource that equates to using boto3 behind the scenes. If set to False, the route via Coscine is taken, which is usually about 30% slower, less stable and has size and bandwidth limitations.
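The constructor parameters above can be collected in one place before the client is created. The following is a minimal sketch, assuming the token is supplied through an environment variable. The variable names `COSCINE_API_TOKEN`, `COSCINE_LANGUAGE` and `COSCINE_TIMEOUT` and the helper `client_settings` are hypothetical and not part of the SDK. The final call is left as a comment because it needs the installed package and a valid token.

```python
import os


def client_settings(env=None):
    """Collect keyword arguments for ApiClient from environment variables.

    The variable names are illustrative assumptions, not SDK conventions.
    """
    env = os.environ if env is None else env
    return {
        "token": env.get("COSCINE_API_TOKEN", ""),
        "language": env.get("COSCINE_LANGUAGE", "en"),
        "timeout": float(env.get("COSCINE_TIMEOUT", "60")),
        "verbose": False,  # suppress the banner and other stdout output
    }


settings = client_settings({"COSCINE_API_TOKEN": "placeholder-token"})

# With the package installed and a real token:
# from coscine.client import ApiClient
# client = ApiClient(**settings)
```

Keeping the token out of the source code also makes it easier to replace once it expires.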

BANNER: str = "                      _              \n                     (_)             \n    ___ ___  ___  ___ _ _ __   ___   \n   / __/ _ \\/ __|/ __| | '_ \\ / _ \\  \n  | (_| (_) \\__ \\ (__| | | | |  __/  \n   \\___\\___/|___/\\___|_|_| |_|\\___|  \n ___________________________________ \n  Coscine Python SDK 0.10.4   \n  https://coscine.de/                \n"#
api_tokens() list[ApiToken][source]#

Retrieves the list of Coscine API tokens that have been created by the owner of the same API token that was used to initialize the Coscine Python SDK ApiClient.

application_profile(profile_uri: str) ApplicationProfile[source]#

Retrieves a specific application profile via its uri.

Parameters:

profile_uri – The uri of the application profile, e.g. https://purl.org/coscine/ap/base/ (the trailing slash is important).

application_profiles() list[ApplicationProfileInfo][source]#

Retrieves the list of all application profiles that are currently available in Coscine.

base_url: str#
create_project(name: str, display_name: str, description: str, start_date: date, end_date: date, principal_investigators: str, disciplines: list[Discipline], organizations: list[Organization], visibility: Visibility, keywords: list[str] | None = None, grant_id: str = '') Project[source]#

Creates a new Coscine project.

Parameters:
  • name – The project’s name.

  • display_name – The project’s display name (how it appears in the web interface).

  • description – The project description.

  • start_date – Date when the project starts.

  • end_date – Date when the project ends.

  • principal_investigators – The project PIs.

  • disciplines – List of associated scientific disciplines.

  • organizations – List of organizations partaking in the project.

  • visibility – Project metadata visibility (relevant for search).

  • keywords – List of project keywords (relevant for search).

  • grant_id – The project’s grant ID.
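The parameters above combine with the lookup methods `discipline()`, `organization()` and `visibility()` documented on this page. The sketch below wraps the call in a hypothetical helper, `create_demo_project`, that accepts any object offering those methods. All project values are placeholders, and the ror uri must be supplied by the caller.

```python
from datetime import date, timedelta


def create_demo_project(client, ror_uri):
    """Create a one-year project using lookups provided by the client.

    `client` is expected to behave like an ApiClient; the texts below
    are placeholder values chosen for illustration only.
    """
    start = date.today()
    return client.create_project(
        name="Demo project",
        display_name="Demo",
        description="Created from a script for demonstration purposes.",
        start_date=start,
        end_date=start + timedelta(days=365),
        principal_investigators="Jane Doe",
        disciplines=[client.discipline("Computer Science 409")],
        organizations=[client.organization(ror_uri)],
        visibility=client.visibility("Project Members"),
        keywords=["demo"],
    )
```

Resolving disciplines, organizations and visibility through the client first means the call receives the object types named in the signature rather than plain strings.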

delete(*args, **kwargs) ApiResponse[source]#

Sends a DELETE request to the Coscine REST API.

Parameters:

Refer to ApiClient.request() for a list of parameters.

Raises:

Refer to ApiClient.request() for a list of exceptions.

discipline(name: str) Discipline[source]#

Returns the discipline that matches the name. Valid names would be, for example:
  • “Jurisprudence 113”

  • “Materials Science 406”

  • “Medicine 205”

  • “Computer Science 409”

  • …

disciplines() list[Discipline][source]#

Retrieves the list of scientific disciplines available in Coscine.

enable_caching(enable: bool) None[source]#

Enables or disables request caching on demand.

get(*args, **kwargs) ApiResponse[source]#

Sends a GET request to the Coscine REST API. For a list of exceptions that are raised by this method or a detailed list of parameters, refer to ApiClient.request().

Parameters:

Refer to ApiClient.request() for a list of parameters.

Raises:

Refer to ApiClient.request() for a list of exceptions.

Returns:

The “data”: { … } section of the JSON-response.

Return type:

dict

static handle_request_exception(exception: Exception) None[source]#
Raises:
property language: str#

The language setting of the ApiClient. This may be set to “en” for English or “de” for German. By default it is set to English, but it can be changed on the fly even after the ApiClient has been instantiated.

languages() list[Language][source]#

Retrieves all languages available in Coscine.

latest_version() str[source]#

Retrieves the version string of the latest version of this package hosted on PyPI. Useful for checking whether the currently used version is outdated and whether an update should be performed.

Examples

>>> if client.version != client.latest_version():
...     print("Module outdated.")
...     print("Run 'py -m pip install --upgrade coscine'.")
license(name: str) License[source]#

Returns the license that matches the name.

licenses() list[License][source]#

Retrieves a list of all licenses available in Coscine.

maintenances() list[MaintenanceNotice][source]#

Retrieves the list of current active maintenance notices for Coscine.

options(*args, **kwargs) ApiResponse[source]#

Sends an OPTIONS request to the Coscine REST API.

Parameters:

Refer to ApiClient.request() for a list of parameters.

Raises:

Refer to ApiClient.request() for a list of exceptions.

organization(ror_uri: str) Organization[source]#

Looks up an organization based on its ror uri.

post(*args, **kwargs) ApiResponse[source]#

Sends a POST request to the Coscine REST API.

Parameters:

Refer to ApiClient.request() for a list of parameters.

Raises:

Refer to ApiClient.request() for a list of exceptions.

project(key: str, attribute: property = <property object>, toplevel: bool = True) Project[source]#

Returns a single Coscine Project via one of its properties.

Parameters:
  • key – The value of the property to filter by.

  • attribute – The property/attribute of the project to filter by.

  • toplevel – If set to True, only toplevel projects are searched. Set it to False to include all (sub-)projects in the search.

Raises:
projects(toplevel: bool = True) list[Project][source]#

Retrieves a list of all Coscine projects that the creator of the Coscine API token is currently a member of.

Parameters:

toplevel – If set to True, only toplevel projects are retrieved. Set it to False to include all (sub-)projects in the results.

put(*args, **kwargs) ApiResponse[source]#

Sends a PUT request to the Coscine REST API.

Parameters:

Refer to ApiClient.request() for a list of parameters.

Raises:

Refer to ApiClient.request() for a list of exceptions.

request(method: str, *args, stream: bool = False, **kwargs) ApiResponse[source]#

Sends a request to the Coscine REST API. This method is used internally. As a user of the ApiClient you should use the methods ApiClient.get(), ApiClient.post(), ApiClient.put(), ApiClient.delete(), ApiClient.options() instead of directly calling ApiClient.request().

Parameters:
  • method – The HTTP method to use for the request: GET, PUT, POST, DELETE, OPTIONS, etc.

  • *args – Any number of arguments to forward to the requests.Request()

  • stream – If set to True, the response will be streamed. This means that the response will be split up and arrive in multiple chunks. When attempting to download files, the stream parameter must be set to True. Otherwise it should be left at False.

  • **kwargs – Any number of keyword arguments to forward to requests.Request()

Raises:

See coscine.ApiClient.handle_request_exception

resource_type(name: str) ResourceType[source]#

Returns the ResourceType that matches the name. Here name refers to the resource specificType, e.g. “rdsrwth” instead of the general type “rds”. Mapping between specificType -> generalType:
  • “rdss3rwth” -> “rdss3”

  • “linked” -> “linked”

  • “rdss3wormrwth” -> “rdss3worm”

  • “rdsrwth” -> “rds”

  • “rdstudo” -> “rds”

  • “gitlab” -> “gitlab”

  • “rdss3nrw” -> “rdss3”

  • “rdss3ude” -> “rdss3”

  • “rdsude” -> “rds”

  • “rdss3tudo” -> “rdss3”

  • “rdsnrw” -> “rds”
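The specificType to generalType mapping listed for resource_type() can be kept as a plain dictionary when scripts need to group resources by their general type. The names `SPECIFIC_TO_GENERAL`, `general_type` and `specific_types` are hypothetical helpers for illustration; only the mapping values are taken from the description above.

```python
# Mapping copied from the resource_type() description above.
SPECIFIC_TO_GENERAL = {
    "rdss3rwth": "rdss3",
    "linked": "linked",
    "rdss3wormrwth": "rdss3worm",
    "rdsrwth": "rds",
    "rdstudo": "rds",
    "gitlab": "gitlab",
    "rdss3nrw": "rdss3",
    "rdss3ude": "rdss3",
    "rdsude": "rds",
    "rdss3tudo": "rdss3",
    "rdsnrw": "rds",
}


def general_type(specific_type):
    """Return the general type for a specific resource type name."""
    return SPECIFIC_TO_GENERAL[specific_type]


def specific_types(general):
    """Return all specific type names that share one general type, sorted."""
    return sorted(s for s, g in SPECIFIC_TO_GENERAL.items() if g == general)
```

The list of types offered by a given Coscine instance may differ, so `ApiClient.resource_types()` remains the authoritative source at runtime.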

resource_types() list[ResourceType][source]#

Retrieves a list of all resource types available in Coscine.

role(name: str) ProjectRole[source]#

Returns the role that matches the name.

roles() list[ProjectRole][source]#

Returns all roles that are available in Coscine, e.g. “Member”, “Owner”, …

search(query: str, category: str | None = None) list[SearchResult][source]#

Sends a search request to Coscine and returns the results.

Parameters:
  • query – The search query

  • category – The search can optionally be restricted to one of these categories: “metadata”, “project” or “resource”

self() User[source]#

Returns the owner of the Coscine API token that was used to initialize the ApiClient.

send_request(request: Request, stream: bool = False) ApiResponse[source]#

Sends a requests request to Coscine.

Parameters:
  • request – The request that previously has been created with requests.Request().

  • stream – If set to True, the data transfer will be streamed, i.e. performed in chunks.

session: CachedSession#
timeout: float#
uri(*args) str[source]#

Constructs a URI for requests to the Coscine REST API. This method creates URLs relative to the ApiClient.base_url and escapes URL arguments for compliance with HTTP.

Parameters:

*args – Any number of arguments that should be included in the URI. The arguments do not have to be of type string, but should be str() serializable.

Examples

>>> ApiClient.uri("application-profiles", "profiles", profile_uri)
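The escaping described for uri() matters most when an argument is itself a URI, as with application profiles. The sketch below shows the general idea using only the standard library. It is not the SDK's implementation: `build_uri` is a hypothetical function, and the real method may insert an API path prefix that is not shown here.

```python
from urllib.parse import quote


def build_uri(base_url, *args):
    """Join str()-serializable arguments onto base_url, percent-encoding each.

    safe="" makes quote() escape "/" and ":" too, so a URI passed as an
    argument stays a single path segment.
    """
    path = "/".join(quote(str(arg), safe="") for arg in args)
    return f"{base_url.rstrip('/')}/{path}"


example = build_uri(
    "https://coscine.rwth-aachen.de",
    "application-profiles",
    "profiles",
    "https://purl.org/coscine/ap/base/",
)
```

Encoding each argument separately keeps the slashes that separate path segments distinct from slashes that belong to an argument.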
users(query: str) list[User][source]#

Searches for users.

validate_pid(pid: str) bool[source]#

Checks the given PID for validity.

verbose: bool#
verify: bool#
property version: str#

Coscine Python SDK version string. For example: “1.0.0”

visibilities() list[Visibility][source]#

Retrieves the list of visibility options available in Coscine.

visibility(name: str) Visibility[source]#

Returns the visibility that matches the name. Valid names are:
  • “Project Members”

  • “Public”

vocabulary(class_uri: str) Vocabulary[source]#

Retrieves the vocabulary for the class.

Parameters:

class_uri – The instance class uri, e.g. http://purl.org/dc/dcmitype

class coscine.client.ApiResponse(client: ApiClient, request: Request, response: Response)[source]#

Bases: object

Models the response data object sent by the Coscine REST API upon a successful request.

property current_page: int#

The page number of the current page.

property data: dict#

The response data as a dict if it arrived in JSON format.

property has_next: bool#

Evaluates to True if there are more pages available.

property has_previous: bool#

Evaluates to True if there is at least one preceding page available.

property is_paginated: bool#

Evaluates to True if the response is paginated, i.e. divided onto multiple pages.

property json: dict#

The full response data as a dict if it arrived in JSON format. Includes response metadata.

property page_size: int#

The page size of the current response data.

pages() Iterator[ApiResponse][source]#

Returns all pages of the response. This may result in additional requests.

response: Response#
property status_code: int#

The status code of the response as set by Coscine.

property total_data_count: int#

The total amount of data items available.

property total_pages: int#

The total number of pages for the specified PageSize. The PageSize is by default set to the maximum of 50.

property trace_id: str#

The Trace ID for Coscine internal error handling.
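The pagination properties of ApiResponse relate to each other through simple arithmetic. The sketch below illustrates that relationship under two assumptions that the text above does not confirm: pages are numbered starting at 1, and the last page may be partially filled. `page_summary` is a hypothetical helper, not part of the SDK.

```python
import math


def page_summary(total_data_count, page_size=50, current_page=1):
    """Derive page counts and neighbours from the total item count.

    Assumes 1-based page numbering; page_size defaults to the documented
    maximum of 50.
    """
    total_pages = math.ceil(total_data_count / page_size)
    return {
        "total_pages": total_pages,
        "has_next": current_page < total_pages,
        "has_previous": current_page > 1,
    }


first_page = page_summary(120)
last_page = page_summary(120, current_page=3)
```

With 120 items and a page size of 50, three requests are needed, which is why `pages()` may trigger additional requests.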

coscine.common module#

Provides common classes shared among multiple modules.

class coscine.common.AcademicTitle(data: dict)[source]#

Bases: object

Models the Academic Titles available in Coscine.

property id: str#

Unique and constant Coscine internal identifier for the respective Academic Title.

property name: str#

The name of the Academic Title, e.g. “Prof.” or “Dr.”

class coscine.common.ApiToken(data: dict)[source]#

Bases: object

This class models the Coscine API token.

property created: date#

Timestamp of when the API token was created.

property expired: bool#

Evaluates to True if the API token is expired.

property expires: date#

Timestamp of when the API token will expire.

property id: str#

Unique Coscine-internal identifier for the API token.

property name: str#

The name assigned to the API token by the creator upon creation.

property owner: str#

Unique Coscine-internal user id of the owner of the API token.
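Because API tokens are only valid for a limited time, a script may want to warn before a token runs out. The sketch below works on plain `date` values such as those returned by the `expires` property. The helper names and the 14-day warning threshold are assumptions made for illustration.

```python
from datetime import date


def days_until_expiry(expires, today=None):
    """Return the number of days from today until the expiry date."""
    today = date.today() if today is None else today
    return (expires - today).days


def needs_renewal(expires, warn_days=14, today=None):
    """Return True if the token expires within warn_days (or already has)."""
    return days_until_expiry(expires, today) <= warn_days


# With a client at hand, each token could be checked like this:
# for token in client.api_tokens():
#     if needs_renewal(token.expires):
#         print(f"Token '{token.name}' should be renewed soon.")
```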

class coscine.common.Discipline(data: dict)[source]#

Bases: object

Models the disciplines available in Coscine.

property id: str#

The Coscine-internal unique identifier for the discipline.

property name: str#

The human-readable name of the discipline.

serialize() dict[source]#

Returns the machine-readable representation of the discipline data instance.

property uri: str#

The uri of the discipline.

class coscine.common.Language(data: dict)[source]#

Bases: object

Models the languages available in Coscine.

property abbreviation: str#

The abbreviated name of the language option.

property id: str#

Unique and constant Coscine internal identifier for the respective language option.

property name: str#

The full name of the language option.

class coscine.common.License(data: dict)[source]#

Bases: object

Models the licenses available in Coscine.

property id: str#

The Coscine-internal unique identifier for the license.

property name: str#

The human-readable name of the license.

serialize() dict[source]#

Returns the machine-readable representation of the license data instance.

class coscine.common.MaintenanceNotice(data: dict)[source]#

Bases: object

Models maintenance notices set in Coscine.

property body: str#

The body or description of the notice.

property ends_date: date#

Date when the maintenance ends.

The URL link to the detailed maintenance notice.

property starts_date: date#

Date when the maintenance goes active.

property title: str#

The title or name of the maintenance notice.

property type: str#

The type of maintenance.

class coscine.common.Organization(data: dict)[source]#

Bases: object

Models organization information for organizations in Coscine.

property email: str#

Contact email address of the organization.

property name: str#

The full name of the organization.

serialize() dict[source]#

Returns the machine-readable representation of the organization data instance.

property uri: str#

The organization’s ror uri.

class coscine.common.SearchResult(data: dict)[source]#

Bases: object

This class models the search results returned by Coscine upon a search request.

property source: str#

The source text that matches in some way or another the search query.

property type: str#

The search category the result falls into (e.g. project).

property uri: str#

Link to the result (i.e. a project or resource or file).

class coscine.common.User(data: dict)[source]#

Bases: object

This class provides an interface around userdata in Coscine.

property display_name: str#

The full name of a Coscine user as displayed in the Coscine web interface.

property email: str | list[str]#

The email address or list of email addresses of a user. In case the user has not associated an email address with their account ‘None’ is returned.

property first_name: str#

The first name of a Coscine user.

property id: str#

The unique Coscine-internal user id for a user.

property last_name: str#

The family name of a Coscine user.

property title: AcademicTitle | None#

The academic title of a user. In case the user has not set a title in their user profile ‘None’ is returned.

class coscine.common.Visibility(data: dict)[source]#

Bases: object

Models the visibility settings available in Coscine.

property id: str#

Coscine-internal identifier for the visibility setting.

property name: str#

Human-readable name of the visibility setting.

serialize() dict[source]#

Returns the machine-readable representation of the visibility data instance.

coscine.exceptions module#

The Coscine Python SDK ships with its own set of exceptions. All exceptions raised by the Coscine Python SDK are derived from a common base exception class called “CoscineException”.

exception coscine.exceptions.AuthenticationError[source]#

Bases: CoscineException

Failed to authenticate with the API token supplied by the user.

exception coscine.exceptions.CoscineException[source]#

Bases: Exception

Coscine Python SDK base exception. Inherited by all other Coscine Python SDK exceptions.

exception coscine.exceptions.NotFoundError[source]#

Bases: CoscineException

The droids you were looking for have not been found. Move along!

exception coscine.exceptions.RequestRejected[source]#

Bases: CoscineException

The request has reached the Coscine servers but has been rejected for whatever reason there may be. This exception is most likely thrown in case of ill-formatted requests.

exception coscine.exceptions.TooManyResults[source]#

Bases: CoscineException

Two or more instances match the property provided by the user but the Coscine Python SDK expected just a single instance to match.
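The difference between NotFoundError and TooManyResults is easiest to see in a lookup that expects exactly one match. The sketch below uses local stand-in exception classes so that it runs without the SDK installed. The helper `exactly_one` is hypothetical, and raising NotFoundError for zero matches is an assumption based on the class descriptions above.

```python
class NotFoundError(Exception):
    """Stand-in for coscine.exceptions.NotFoundError."""


class TooManyResults(Exception):
    """Stand-in for coscine.exceptions.TooManyResults."""


def exactly_one(items, predicate):
    """Return the single item matching predicate, or raise."""
    matches = [item for item in items if predicate(item)]
    if not matches:
        raise NotFoundError("no item matches")
    if len(matches) > 1:
        raise TooManyResults(f"{len(matches)} items match, expected one")
    return matches[0]


names = ["Solar Cells", "Solar Storage", "Wind Turbines"]
unique = exactly_one(names, lambda name: name.startswith("Wind"))
```

Since all SDK exceptions derive from CoscineException, calling code can catch the base class when the distinction does not matter.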

coscine.metadata module#

Provides functions and classes around the handling of metadata.

class coscine.metadata.ApplicationProfile(client: ApiClient, data: dict)[source]#

Bases: ApplicationProfileInfo

An application profile defines how metadata can be specified.

Parameters:
  • client (ApiClient) – A Coscine Python SDK ApiClient for access to settings and requests.

  • data – ApplicationProfileInfo data as received by Coscine.

client: ApiClient#
property definition: str#

The actual application profile in text/turtle format.

fields() list[FormField][source]#

Returns the list of metadata fields with their properties as specified in the application profile.

graph: rdflib.Graph#
lock = <unlocked _thread.lock object>#
query(query: str, **kwargs) list[source]#

Performs a SPARQL query on the application profile and returns the results as a list of rows, with each row containing as many columns as selected in the SPARQL query.

Warning

Note that rdflib SPARQL queries are NOT thread-safe! Under the hood pyparsing is invoked, which leads to a lot of trouble if used in a multithreaded context. To avoid any problems the Coscine Python SDK employs a lock on this function - only one thread can use it at any given time. TODO: Open pull request at rdflib and make rdflib itself thread-safe.

Parameters:
  • query – A SPARQL query string.

  • **kwargs – Any number of keyword arguments to pass onto rdflib.query()

property target_class: str#

Returns the target class of the application profile. If no target class is present, the application profile URI is used as a fallback.

class coscine.metadata.ApplicationProfileInfo(data: dict)[source]#

Bases: object

Many different application profiles are available in Coscine. To be able to get information on a specific application profile or all application profiles, the ApplicationProfileInfo datatype is provided.

property description: str#

A description of the application profile.

property name: str#

The human-readable name of the application profile.

property uri: str#

The uri of the application profile.

class coscine.metadata.FileMetadata(data: dict)[source]#

Bases: object

The existing metadata to a file as returned by the Coscine API. This metadata is by default in machine-readable format and not human-readable.

property created: datetime#

The timestamp when the metadata was assigned.

property definition: str#

The actual metadata in rdf turtle format.

fixed_graph(resource: Resource) rdflib.Graph[source]#

Patches the file metadata knowledge graph to include the file path as its root subject.

graph() Graph[source]#

The metadata parsed as rdflib graph.

property is_latest: bool#

Returns True if the current metadata is the newest metadata for the file.

items() list[dict[str, str]][source]#

Returns the list of metadata values in the format:

>>> [{
>>>     "path": "...",
>>>     "value": "...",
>>>     "datatype": "..."
>>> }]

property path: str#

Path/Identifier of the metadata field.

property type: str#

Datatype of the value as a string.

property version: str#

Current metadata version string. The version is a Unix timestamp.

property versions: list[str]#

List of all metadata version strings. Versions are unix timestamps.

class coscine.metadata.FormField(client: ApiClient, data: dict)[source]#

Bases: object

A FormField represents a MetadataField that has been specified in an application profile. The FormField has numerous properties which restrict the range of values that can be assigned to a metadata field. It is thus very important for the validation of metadata and ensures the consistency of metadata.

append(value: bool | date | datetime | Decimal | int | float | str | time | timedelta, serialized: bool = False) None[source]#

If the field accepts a list of values, one can use the append method to add another value to the end of that list.

property class_uri: str#

In case the field is controlled by a vocabulary, the class_uri specifies the link to the instances of the vocabulary. These can then be fetched via ApiClient.instances(class_uri)

clear() None[source]#

Clears all values of all metadata fields.

client: ApiClient#
property datatype: type#

Restricts the datatype of values that can be assigned to the field.

deserialize(value: str) bool | date | datetime | Decimal | int | float | str | time | timedelta[source]#

Unmarshals the value and returns the pythonic representation.

property has_selection: bool#

Evaluates to True if the field values are controlled by a predefined selection of values.

property has_vocabulary: bool#

Evaluates to True if the field values are controlled by a vocabulary.

property identifiers: list[Literal] | list[URIRef]#

The list of values as rdflib identifiers.

property invisible: bool#

FormFields can be set to invisible in the Coscine resource metadata default value settings. Invisible FormFields are not displayed.

property is_controlled: bool#

Evaluates to True if the field is either controlled by a vocabulary or a selection.

property is_required: bool#

Evaluates to True if the field must be assigned a value before it can be sent to Coscine alongside the other metadata.

property language: str#

The language setting of the field. This influences the field name and the values of fields controlled by a vocabulary or selection.

property literals: list[Literal]#

The field as rdflib.Literal ready for use with rdflib. The literal has the appropriate datatype as specified in the SHACL application profile. This should be used as Coscine is very strict with its verification: There is apparently a difference between xsd:int and xsd:integer, I kid you not!

property max_count: int#

The maximum amount of values that can be given to the field.

property max_length: int#

Specifies the maximum permissible length of the value. For values of type string this would equal the maximum string length.

property min_count: int#

The minimum count of values that the field must receive. If the count is greater than 0, the field is a required one, as it will always need a value.

property min_length: int#

Specifies the minimum required length of the value. For values of type string this would equal the minimum string length.

property name: str#

The human-readable name of the field, as displayed in the Coscine web interface.

property node: str#

The node property of the metadata field, if present.

property order: int#

The order of appearance of the field. The metadata fields are often displayed in a list in some sort of user interface. This property simply states at which position the field should appear.

property path: str#

The path of the FormField, acting as a unique identifier.

property selection: list[str]#

Some fields have a predefined selection of values that the user can choose from. In that case other values are not permitted.

property serial: list[str]#

Serializes the metadata value to Coscine format. That means that for vocabulary controlled fields, the human-readable value is translated to the machine-readable unique identifier. This property can also be set with the metadata value received by the Coscine API, which is already in machine-readable format and will be translated to human-readable internally.

serialize() list[str][source]#

Serializes the form field values into machine readable format.

validate(value: bool | date | datetime | Decimal | int | float | str | time | timedelta) None[source]#

Validates whether the value matches the specification of the FormField. Does not return anything but instead raises all sorts of exceptions.
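The kinds of checks implied by the FormField properties (datatype, min_length, max_length, selection) can be sketched as a standalone function. This is an illustration of the idea only: `validate_value` is hypothetical, and the exceptions raised by the SDK's own validate() are not specified here and may differ.

```python
def validate_value(value, datatype=str, min_length=None, max_length=None,
                   selection=None):
    """Raise if value violates the given restrictions; return None otherwise."""
    if not isinstance(value, datatype):
        raise TypeError(
            f"expected {datatype.__name__}, got {type(value).__name__}"
        )
    if selection is not None and value not in selection:
        raise ValueError(f"{value!r} is not one of {selection!r}")
    length = len(str(value))
    if min_length is not None and length < min_length:
        raise ValueError(f"value is shorter than {min_length} characters")
    if max_length is not None and length > max_length:
        raise ValueError(f"value is longer than {max_length} characters")


# Passes silently: a string of permitted length from the selection.
validate_value("Public", selection=["Project Members", "Public"], max_length=20)
```

Like the documented method, the sketch returns nothing on success and signals every violation through an exception.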

property values: list[bool | date | datetime | Decimal | int | float | str | time | timedelta]#

This is the value of the metadata field in human-readable form. For the machine-readable form that is sent to Coscine use the property FormField.serial! Setting a value can only be done by using the appropriate datatype. If the FormField.max_count is greater than 1, you may assign a list of values to the field.

property vocabulary: Vocabulary#

In the case that the field has a value for the class_uri property, it is controlled by a vocabulary.

property xsd_type: str | None#

The string representation of the xsd datatype. For example: http://www.w3.org/2001/XMLSchema#int

class coscine.metadata.Instance(data: dict)[source]#

Bases: object

A (vocabulary) instance is an entry inside of a vocabulary. It maps from a human-readable name to a unique uniform resource identifier.

property graph_uri: str#

The uniform resource identifier of the graph. If entered in a web browser, it should yield the definition of the graph.

Example:

>>> http://www.dfg.de/dfg_profil/gremien/fachkollegien/faecher/

property instance_uri: str#

The uniform resource identifier of the instance. If entered in a web browser, it should yield the definition of the instance.

Example:

>>> http://www.dfg.de/dfg_profil/gremien/fachkollegien/liste/
>>> index.jsp?id=112#112-03 -> The item with id 112 within the graph

property name: str#

The display name of the instance.

property subclass_of: str#

Identifies the subclass of the instance.

Example:

>>> http://www.dfg.de/dfg_profil/gremien/fachkollegien/liste/
>>> index.jsp?id=112 -> subclass of the item

property type_uri: str#

Identifies the type of instance.

Example:

>>> http://www.dfg.de/dfg_profil/gremien/fachkollegien/liste/
>>> index.jsp?id=112#112-03 -> Commonly the same as instance_uri

class coscine.metadata.MetadataForm(resource: Resource, fixed_values: bool = True)[source]#

Bases: object

The metadata form makes the metadata fields that have been defined in an application profile accessible to users.

Parameters:
  • resource (Resource) – Coscine resource instance

  • fixed_values – If set to true, the fixed values set in the resource are applied when creating the application profile. If set to false, they are ignored and an empty metadata form is returned.

clear() None[source]#

Clears all values.

defaults() None[source]#

Parses the fixed and default value settings of a resource. This also includes visibility settings for metadata fields.

field(key: str) FormField[source]#

Looks up a metadata field via its name.

fields() list[FormField][source]#

The list of metadata fields that can be filled in as defined in the application profile.

graph() Graph[source]#

Returns the metadata as a knowledge graph.

items() list[tuple[str, list[bool | date | datetime | Decimal | int | float | str | time | timedelta]]][source]#

Returns key, value pairs for all metadata fields

keys() list[str][source]#

Returns the list of names of all metadata fields.

parse(data: FileMetadata) None[source]#

Parses existing metadata that was received from Coscine.

path(path: str) FormField[source]#

Looks up a metadata field via its path.

resource: Resource#
serialize(path: str) dict[source]#

Prepares and validates metadata for sending to Coscine. Requires the file path of the file in Coscine as an argument.

Parameters:

path – The path in Coscine to the FileObject that you would like to attach metadata to.

test() None[source]#

Auto-fills the MetadataForm with a set of predefined values. Every field is filled in.

validate() bool[source]#

Validates the metadata against the resource application profile SHACL.

values() list[list[bool | date | datetime | Decimal | int | float | str | time | timedelta]][source]#

Returns the list of values of all metadata fields.

class coscine.metadata.Vocabulary(data: list[Instance])[source]#

Bases: object

The Vocabulary contains all instances of a class and provides an interface to easily check whether a term is contained in the set of instances and to query the respective instance.

graph() Graph[source]#

Returns the vocabulary as an rdflib knowledge graph.

keys() list[str][source]#

Returns the list of keys that are contained inside of the vocabulary. This equals the set of names of the class instances.

resolve(value: str) bool | date | datetime | Decimal | int | float | str | time | timedelta[source]#

This method takes a value and returns its corresponding key. It can be considered the reverse of Vocabulary[key] -> value, namely Vocabulary[value] -> key, but that cannot be expressed in Python, hence this method.
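The reverse lookup that resolve() describes can be illustrated with an ordinary dictionary. The sketch below is not the Vocabulary implementation. `resolve_key` and the sample entries are hypothetical, and raising KeyError for unknown values is an assumption.

```python
def resolve_key(mapping, value):
    """Return the first key whose value equals the given value."""
    for key, candidate in mapping.items():
        if candidate == value:
            return key
    raise KeyError(value)


# Placeholder vocabulary: human-readable names mapped to identifiers.
sample = {
    "Image": "http://example.org/types/Image",
    "Text": "http://example.org/types/Text",
}
name = resolve_key(sample, "http://example.org/types/Text")
```

A forward lookup is a single dictionary access, while the reverse direction has to scan the entries, which is why it is offered as a separate method.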

coscine.metadata.xsd_to_python(xmltype: str) type[source]#

Converts an XMLSchema XSD datatype string to a native Python datatype class instance.
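A conversion of this kind is typically a table lookup. The sketch below maps a few common XSD datatype URIs to Python types. The table is an assumption chosen to match the value types listed for FormField, and the SDK's own mapping may cover more or different types. `xsd_to_python_sketch` is a hypothetical name used to avoid confusion with the real function.

```python
from datetime import date, datetime, time, timedelta
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"

# Illustrative subset; both xsd:int and xsd:integer map to Python's int.
XSD_TO_PYTHON = {
    "boolean": bool,
    "date": date,
    "dateTime": datetime,
    "decimal": Decimal,
    "double": float,
    "duration": timedelta,
    "float": float,
    "int": int,
    "integer": int,
    "string": str,
    "time": time,
}


def xsd_to_python_sketch(xmltype):
    """Return the Python type for a full XSD datatype URI."""
    if not xmltype.startswith(XSD):
        raise ValueError(f"not an XSD datatype URI: {xmltype}")
    return XSD_TO_PYTHON[xmltype[len(XSD):]]
```

Note that the mapping is lossy in the other direction: Python's `int` alone cannot tell whether a field expects xsd:int or xsd:integer, which is why FormField exposes `xsd_type` and `literals` separately.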

coscine.project module#

class coscine.project.Project(client: ApiClient, data: dict)[source]#

Bases: object

Projects in Coscine contain resources.

add_member(user: User, role: ProjectRole) None[source]#

Adds the project member of another project to the current project. The owner of the Coscine API token must be a member of the other project.

client: ApiClient#
create_resource(name: str, display_name: str, description: str, license: License, visibility: Visibility, disciplines: list[Discipline], resource_type: ResourceType, quota: int, application_profile: ApplicationProfileInfo, usage_rights: str = '', keywords: list[str] | None = None) Resource[source]#

Creates a new Coscine resource within the project.

Parameters:
  • name – The full name of the resource.

  • display_name – The shortened display name of the resource.

  • description – The description of the resource.

  • license – License for the resource contents.

  • visibility – Resource metadata visibility (relevant for search).

  • disciplines – Associated/Involved scientific disciplines.

  • resource_type – The Coscine resource type.

  • quota – Resource quota in GB (irrelevant for linked data resources).

  • application_profile – The metadata application profile for the resource.

  • usage_rights – Data usage notes.

  • keywords – Keywords (relevant for search).

property created: date#

Timestamp of when the project was created. If 1998-01-01 is returned, then the created value is erroneous or missing.

property creator: str#

Project creator user ID.

delete() None[source]#

Deletes the project on the Coscine servers. Be careful when using this function in your code, as users should be prevented from accidentally triggering it! It is best to ask the user to confirm that they really wish to delete their project before calling this function.

property description: str#

The project description.

property disciplines: list[Discipline]#

Scientific disciplines the project is involved with.

property display_name: str#

The shortened project name as displayed in the Coscine web interface.

download(path: str = './') None[source]#

Downloads the project to the local directory path.

property end_date: date#

End of project lifecycle timestamp.

property grant_id: str#

Project grant id.

property id: str#

Unique Coscine-internal project identifier.

invitations() list[ProjectInvitation][source]#

Returns the list of all outstanding project invitations.

invite(email: str, role: ProjectRole) None[source]#

Invites an external user via their email address to the Coscine project.

property keywords: list[str]#

Project keywords for better discoverability.

match(attribute: property, key: str) bool[source]#

Attempts to match the project via the given property and property value. Filterable properties:

  • Project.id

  • Project.pid

  • Project.name

  • Project.display_name

  • Project.url

Returns:

  • True – If it's a match ♥

  • False – Otherwise :(
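Passing a property object as the attribute argument may look unusual, so here is a minimal sketch of how such a match could work. Item is a hypothetical stand-in class, and comparing via the property's getter is an assumption about the mechanism, not the SDK's confirmed implementation.

```python
class Item:
    """Hypothetical stand-in for an object with read-only properties."""

    def __init__(self, name: str, display_name: str) -> None:
        self._name = name
        self._display_name = display_name

    @property
    def name(self) -> str:
        return self._name

    @property
    def display_name(self) -> str:
        return self._display_name

    def match(self, attribute: property, key: str) -> bool:
        # A property object exposes its getter function as `fget`.
        return attribute.fget(self) == key


items = [
    Item("Solaris Research Project", "Solaris"),
    Item("Chest X-Ray CNN", "X-Ray"),
]
# The property is referenced on the class (Item.display_name), not on an instance.
hits = [item for item in items if item.match(Item.display_name, "Solaris")]
print(hits[0].name)
```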

members() list[ProjectMember][source]#

Returns the list of all members of the current project.

property name: str#

The full project name as set in the project settings.

property organizations: list[Organization]#

Organizations participating in the project.

property pid: str#

Project Persistent Identifier.

property principal_investigators: str#

The project investigators.

quotas() list[ProjectQuota][source]#

Returns the project storage quotas.

remove_member(member: ProjectMember) None[source]#

Removes the member from the project. Does not invalidate the member object in Python - it is up to the API user to not use that variable again.

resource(key: str, attribute: property = <property object>) Resource[source]#

Returns a single resource via one of its properties. The key can be specified to match any of the ResourceProperty items.

resources() list[Resource][source]#

Retrieves a list of all resources of the project.

serialize() dict[source]#

Marshals the project metadata into machine-readable format.

property slug: str#

Project slug - usually a combination of the original project name and some arbitrary Coscine-internal data appended to it.

property start_date: date#

Start of project lifecycle timestamp.

update() None[source]#

Updates a Coscine project’s settings. To update certain properties just access the properties of the coscine.Project class directly and call Project.update() when done.

property url: str#

Project URL - makes the project accessible in the web browser.

property visibility: Visibility#

Project visibility setting.

class coscine.project.ProjectInvitation(data: dict)[source]#

Bases: object

Models external user invitations via email in Coscine.

property email: str#

The email address of the invited user.

property expires: date#

Timestamp of when the invitation expires.

property id: str#

Unique Coscine-internal identifier for the invitation.

property issuer: User#

The user in Coscine who sent the invitation.

property project_id: str#

Project ID of the project the invitation applies to.

property role: ProjectRole#

Role assigned to the invited user.

class coscine.project.ProjectMember(project: Project, data: dict)[source]#

Bases: object

This class models the members of a Coscine project.

property id: str#

Unique Coscine-internal project member identifier.

project: Project#
property role: ProjectRole#

The role of the member within the project.

property user: User#

The user in Coscine that represents the project member.

class coscine.project.ProjectQuota(data: dict)[source]#

Bases: object

Projects have a set of storage space quotas. This class models the quota data returned by Coscine.

property allocated: int#

The allocated storage space in bytes.

property maximum: int#

The maximum available storage space in bytes.

property project_id: str#

The ID of the associated project.

property resource_quotas: list[ResourceQuota]#

The list of used resource quotas for the project.

property resource_type: ResourceType#

The associated resource type.

property total_reserved: int#

The total reserved storage space in bytes.

property total_used: int#

The total used storage space in bytes.

class coscine.project.ProjectRole(data: dict)[source]#

Bases: object

Models roles that can be assumed by project members within a Coscine project.

property description: str#

Description for the role.

property id: str#

Unique and constant Coscine-internal identifier of the role.

property name: str#

Name of the role.

coscine.resource module#

Provides an interface around resources in Coscine.

class coscine.resource.FileObject(resource: Resource, data: dict, metadata: FileMetadata | None = None)[source]#

Bases: object

Models files or file-like objects in Coscine resources.

assign_metadata(metadata: list[FileMetadata]) None[source]#

Assigns locally available metadata to the file object. This is mostly used internally and can be ignored by most users.

property client: ApiClient#

The Coscine ApiClient associated with the resource instance.

property created: date#

Timestamp of when the file has been uploaded.

delete() None[source]#

Deletes the FileObject remotely on the Coscine server.

property directory: str#

The directory the file object is located in, if it is in a folder.

download(path: str = './', recursive: bool = False) None[source]#

Downloads the file to the local computer. If path ends in a filename, the whole path is used as the target. Otherwise the filename of the file is appended to the given path. If recursive is True, the full path of the file inside the resource is appended to path instead; in that case all folders on that path must already exist.
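The path rules can be illustrated with a small helper. This is only a sketch: local_target is hypothetical, and treating any path that does not end in "/" as a filename is an assumption, since the documentation does not say how the SDK distinguishes the two cases.

```python
import posixpath


def local_target(path: str, file_path: str, recursive: bool = False) -> str:
    """Compute where a remote file would be written locally (illustrative)."""
    if not path.endswith("/"):
        # Assumption: anything not ending in "/" names the output file itself.
        return path
    if recursive:
        # Keep the folder structure of the file inside the resource.
        return posixpath.join(path, file_path)
    # Otherwise only the bare filename is appended.
    return posixpath.join(path, posixpath.basename(file_path))
```

For the recursive case, the parent folder of the returned target would have to exist before downloading, for example by calling os.makedirs(..., exist_ok=True) on it first.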

property download_expires: datetime#

The timestamp when the FileObject.download_url will expire.

property download_url: str#

The download URL for the file.

property extension: str#

The file type extension.

property filetype: str#

The file’s filetype.

Examples

>>> ".png"
>>> ".txt"
>>> ""
property is_folder: bool#

Evaluates to True when the FileObject represents a folder and not an actual file.

metadata(refresh: bool = True) FileMetadata | None[source]#

Returns the metadata of the file. This might use a cached version of file metadata or make a request, if no cached version is available.

metadata_form(refresh: bool = True) MetadataForm[source]#

Returns the metadata of the file or an empty metadata form if no metadata has been attached to the file.

property modified: date#

Timestamp of when the file was last modified.

property name: str#

The filename of the file. Includes the file type extension.

Examples

>>> foo.txt
>>> bar.png
property path: str#

The path to the file. Usually equivalent to the filename, except when the file is a directory or contained within a directory, which may occur regularly in S3 resources.

Examples

>>> chest_xray.png
>>> pneumonia/lung_ap.png
>>> pneumonia/
resource: Resource#
property size: int#

The size of the file contents in bytes.

stream(fp: BinaryIO) None[source]#

Streams file contents.

stream_blob(fp: BinaryIO) None[source]#

Streams file contents from the Coscine Blob API.

stream_s3(handle: BinaryIO) None[source]#

Works only on rdss3 resources and should not be called on other resource types! Bypasses Coscine and uploads directly to the underlying s3 storage.

property type: str#

The type of the file object in the file tree.

Examples

>>> Leaf
>>> Tree
update(handle: BinaryIO | bytes | str, progress: Callable[[int], None] | None = None) None[source]#

Uploads a file-like object to a resource in Coscine.

Parameters:
  • handle (BinaryIO | bytes | str) – A binary file handle that supports reading or bytes or str.

  • progress (Callable "def function(int)") – Optional callback function that gets occasionally called during the upload with the progress in bytes.

update_metadata(metadata: MetadataForm | dict) None[source]#

Updates the metadata for a file object or creates new metadata if there has not been metadata assigned yet.

class coscine.resource.Resource(project: Project, data: dict)[source]#

Bases: object

Models a Coscine Resource object.

property access_url: str#

Resource Access URL via PID

property application_profile: ApplicationProfile#

The application profile of the resource.

property archived: bool#

Evaluates to True when the resource is set to archived.

client: ApiClient#
property created: date#

Timestamp of when the resource was created.

property creator: str#

The Coscine user id of the resource creator.

delete() None[source]#

Deletes the Coscine resource on the Coscine servers, along with all files and metadata contained within it. Special care should be taken when using this method in code so as not to accidentally trigger a delete on a whole resource. This method is therefore best combined with additional input from the user, e.g. by prompting them with the message “Do you really want to delete the resource? (Y/N)”.
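A confirmation prompt of this kind could be wrapped as follows. This is a minimal sketch: delete_with_confirmation and the Deletable protocol are hypothetical names and not part of the SDK; the helper works with any object that offers a delete() method.

```python
from typing import Callable, Protocol


class Deletable(Protocol):
    """Anything with a delete() method, e.g. a resource or a project."""

    def delete(self) -> None: ...


def delete_with_confirmation(
    target: Deletable,
    label: str,
    ask: Callable[[str], str] = input,
) -> bool:
    """Call target.delete() only after an explicit 'y' answer."""
    answer = ask(f"Do you really want to delete {label}? (Y/N) ")
    if answer.strip().lower() != "y":
        return False  # anything but "y"/"Y" aborts the deletion
    target.delete()
    return True
```

With a resource at hand, the call might look like delete_with_confirmation(resource, resource.display_name). The ask parameter exists so that the prompt can be replaced in tests or graphical front ends.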

property description: str#

The resource description.

property disciplines: list[Discipline]#

The scientific disciplines set for the resource.

property display_name: str#

Shortened resource name as displayed in the Coscine web interface.

download(path: str = './') None[source]#

Downloads the resource to the local directory given by path.

file(path: str) FileObject[source]#

Returns a single file of the resource via its unique path.

file_index() list[dict][source]#

Returns a file index with the following data:

{
    file-path: {
        filename: “foo”,
        filesize: 0,
        download: http://example.org/foo,
        expires: datetime
    },
    …
}

This index can easily be serialized to JSON format and made publicly available. Coscine currently prohibits external users from downloading files in a resource. To make data publicly available, one can instead publish this file index, which needs to be updated at regular intervals to ensure that the download URLs do not expire. By hosting the JSON-serialized representation of this index on a free-to-use platform such as GitHub, one can access it via browser or via software and thus use Coscine as a storage provider in software and data publications, regardless of whether people have access to Coscine.
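Serializing such an index needs one extra step, because the json module cannot encode datetime values on its own. The sketch below assumes the dictionary shape shown above; the sample entry and the to_json helper are made up for illustration.

```python
import json
from datetime import datetime, timezone

# Sample index in the documented shape; all values are made up.
index = {
    "pneumonia/lung_ap.png": {
        "filename": "lung_ap.png",
        "filesize": 0,
        "download": "http://example.org/foo",
        "expires": datetime(2030, 1, 1, tzinfo=timezone.utc),
    }
}


def to_json(file_index: dict) -> str:
    """Serialize the index, writing datetime values as ISO 8601 strings."""

    def encode(value: object) -> str:
        # Called by json.dumps for every object it cannot encode itself.
        if isinstance(value, datetime):
            return value.isoformat()
        raise TypeError(f"Cannot serialize {type(value).__name__}")

    return json.dumps(file_index, default=encode, indent=2)


document = to_json(index)
print(document)
```

Consumers of the published file can compare the expires field against the current time to decide whether a download URL is still usable.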

files(path: str = '', recursive: bool = False, with_metadata: bool = False) list[FileObject][source]#

Retrieves the list of files that are contained in the resource. Via an additional single API call the metadata for those files can be fetched and made available in the returned files.

Parameters:
  • path – You can limit the set of returned files to a path. The path may be the path to a single file in which case a list containing that single file will be returned. Or it may point to a “directory” in which case all the files contained in that “directory” are returned.

  • recursive – S3 resources may have folders inside them. Set the recursive parameter to True to also fetch all files contained in these folders.

  • with_metadata – If set to True the files are returned along with their metadata. This internally requires another API request which is considerably slower (1 to 2 seconds). However if you plan on manipulating each file's metadata this is the way to go. Otherwise you would have to make an API call to fetch the metadata for each file, which in the case of large resources will prove to be very painful… :)

property fixed_values: dict#

The resource's default metadata values.

graph() Graph[source]#

Returns a knowledge graph with the full set of file object metadata.

property id: str#

Unique Coscine-internal resource identifier.

property keywords: list[str]#

List of keywords for better discoverability.

property license: License | None#

The license used for the resource data.

match(attribute: property, key: str) bool[source]#

Attempts to match the resource via the given property and property value. Filterable properties:

  • Resource.id

  • Resource.pid

  • Resource.name

  • Resource.display_name

  • Resource.url

Returns:

  • True – If it's a match ♥

  • False – Otherwise :(

metadata(path: str = '') list[FileMetadata][source]#

Returns the full set of metadata for each file in the resource.

metadata_form(fixed_values: bool = True) MetadataForm[source]#

Returns the resource metadata form.

Parameters:

fixed_values – If set to true, the fixed values set in the resource are applied when creating the application profile. If set to false, they are ignored and an empty metadata form is returned.

mkdir(path: str, metadata: MetadataForm | None = None) None[source]#

Creates a folder inside of a resource. Should work for all resource types.

property name: str#

Full resource name as displayed in the resource settings.

property pid: str#

The persistent identifier assigned to the resource.

post_metadata(metadata: dict) None[source]#

Creates metadata for a file object for the first time. The file must not have metadata assigned already; if it does, use put_metadata() instead!

project: Project#
put_metadata(metadata: dict) None[source]#

Updates existing metadata of a file object. If the file object does not yet have metadata, use post_metadata()!

query(sparql: str) list[FileObject][source]#

Runs a SPARQL query on the underlying resource knowledge graph and returns the file objects whose metadata matches the query. IMPORTANT: The query must (!) include ?path as a variable/column. Otherwise it will get rejected and a ValueError is raised.

Examples

>>> resource.query("SELECT ?path ?p ?o { ?path ?p ?o . }")
>>> project = client.project("Solaris")
>>> resource = project.resource("Chest X-Ray CNN")
>>> files = resource.query(
...     "SELECT ?path WHERE { "
...     "    ?path dcterms:creator ?creator . "
...     "     FILTER(?creator != 'Dr. Akula') "
...     "}"
... )
>>> for file in files:
...     print(file.path)
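The ?path requirement can be checked locally before a query is sent. The helper below is a hypothetical sketch, not the check performed by the SDK: it uses a simplified regular expression that only inspects the variable list of a plain SELECT query and does not handle the full SPARQL grammar.

```python
import re


def selects_path(sparql: str) -> bool:
    """Check that ?path is among the projected variables of a SELECT query."""
    # Capture everything between SELECT and the opening brace of the pattern,
    # skipping an optional WHERE keyword.
    match = re.search(
        r"SELECT\s+(.*?)\s*(?:WHERE\s*)?\{",
        sparql,
        re.IGNORECASE | re.DOTALL,
    )
    if match is None:
        return False
    return "?path" in match.group(1).split()
```

A caller could use it as a guard, for example raising its own error with a helpful message when selects_path(query) is False.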
property quota: ResourceQuota#

The resource's storage quota.

serialize() dict[source]#

Serializes Coscine Resource metadata into machine-readable representation.

property type: ResourceType#

The resource’s resource type.

update() None[source]#

Updates the Coscine resource's settings. Change the values locally via the setter properties, then call Resource.update() to apply them.

update_metadata(metadata: dict) None[source]#

Updates metadata of a file. In case no metadata has yet been assigned to the file, it will create new metadata. This method basically incorporates both post_metadata() and put_metadata() into one, choosing the appropriate method when applicable. This comes at the cost of possibly sending two requests, where one would have sufficed.
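The "update, otherwise create" behaviour can be sketched generically. Everything here is an assumption for illustration: the helper name, the order of the two attempts, and the use of LookupError as the signal for missing metadata are hypothetical and may not match what the SDK does internally.

```python
from typing import Callable


def update_or_create(
    put: Callable[[dict], None],
    post: Callable[[dict], None],
    metadata: dict,
) -> str:
    """Try to update first; create if nothing exists yet."""
    try:
        put(metadata)
        return "updated"
    except LookupError:  # assumed signal for "no metadata assigned yet"
        # Second request: this is the extra cost mentioned above.
        post(metadata)
        return "created"
```

When it is already known whether metadata exists, calling post_metadata() or put_metadata() directly avoids the possible second request.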

upload(path: str, handle: BinaryIO | bytes | str, metadata: MetadataForm | dict | None = None, progress: Callable[[int], None] | None = None) None[source]#

Uploads a file-like object to a resource in Coscine.

Parameters:
  • path – The path the file shall assume inside of the Coscine resource. Not the path on your local harddrive! The terms path, key and filename can be used interchangeably.

  • handle – A binary file handle that supports reading or a set of bytes or a string that can be utf-8 encoded.

  • metadata – Metadata for the file that matches the resource application profile.

  • progress – Optional callback function that gets occasionally called during the upload with the progress in bytes.
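A progress callback matching the Callable[[int], None] signature could look like the sketch below. make_progress_printer is a hypothetical helper, and it assumes that each call reports the number of bytes transferred since the previous call; if the SDK reports a running total instead, the accumulation step would have to be dropped.

```python
from typing import Callable


def make_progress_printer(total_bytes: int) -> Callable[[int], None]:
    """Build a callback that accumulates reported bytes and prints a percentage."""
    transferred = 0

    def on_progress(bytes_read: int) -> None:
        nonlocal transferred
        transferred += bytes_read  # assumption: each call reports a delta
        percent = 100 * transferred / total_bytes if total_bytes else 100.0
        print(f"{transferred}/{total_bytes} bytes ({percent:.0f}%)")

    return on_progress
```

The returned function would then be passed as the progress argument, e.g. progress=make_progress_printer(size), where size is the length of the data being uploaded.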

property url: str#

Resource URL - makes the resource accessible in the web browser.

property usage_rights: str#

The usage rights specified for the data inside the resource.

property visibility: Visibility#

The Coscine visibility setting for the resource.

class coscine.resource.ResourceQuota(data: dict)[source]#

Bases: object

Models the Coscine resource quota data.

property reserved: int#

The reserved quota for the resource.

property resource_id: str#

The associated Coscine resource id.

serialize() dict[source]#
property used: int#

The used quota in bytes.

property used_percentage: float#

The ratio of used up quota in relation to the available quota.

class coscine.resource.ResourceType(data: dict)[source]#

Bases: object

Models the resource types available in Coscine.

property active: str#

Whether the resource type is enabled on the Coscine instance.

property general_type: str#

General resource type, e.g. rdss3

property id: str#

Coscine-internal resource type identifier.

property options: ResourceTypeOptions#

The resource’s resource type specific options.

serialize() dict[str, dict][source]#

Serializes to resourceTypeOptions {}, not type.

property specific_type: str#

Specific resource type, e.g. rdss3rwth

class coscine.resource.ResourceTypeOptions(data: dict | None = None)[source]#

Bases: object

Options and settings regarding the resource type. Mainly provides an interface to resource type specific attributes such as S3 access credentials for resources of type rds-s3.

property access_key_read: str#

The S3 access key for reading.

property access_key_write: str#

The S3 access key for writing.

property bucket_name: str#

The S3 bucket name.

property endpoint: str#

The S3 endpoint.

property secret_key_read: str#

The S3 secret key for reading.

property secret_key_write: str#

The S3 secret key for writing.

property size: int#

The size setting of the resource type in gibibytes (GiB).

coscine.resource.progress_callback(progress_bar: tqdm, bytes_read: int, fn: Callable[[int], None] | None = None) None[source]#

Updates the progress bar and calls a callback if one has been specified.

Module contents#

The Coscine Python SDK provides a high-level interface to the Coscine REST API.