Designing your own manager

If you'd like to extend the functionality of the package, please feel free to open a pull request on the project's GitHub.

To support another storage medium, inherit from the Manager abstract base class and implement the abstract methods it declares. You can then expose your new Manager through the Python entry point system.

Important

stow uses the entry point stow_managers to find managers

Add your managers to this entry point to integrate seamlessly with the stow stateless interface and connect utilities.
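As a rough illustration, the sketch below shows a hypothetical registration and how the entry points in a group can be listed with the standard library. The names mymanager, my_package and MyManager are placeholders, not part of stow. How stow itself loads the group is not shown in this document, so treat the discovery function as an assumption.

```python
from importlib.metadata import entry_points

# Hypothetical registration, as it might be passed to setuptools in a setup.py:
#   setup(..., entry_points=ENTRY_POINTS)
# "mymanager", "my_package" and "MyManager" are placeholder names.
ENTRY_POINTS = {
    "stow_managers": [
        "mymanager = my_package.manager:MyManager",
    ]
}


def discover_managers(group: str = "stow_managers") -> dict:
    """Map entry point names to their (unloaded) entry point objects."""
    eps = entry_points()

    if hasattr(eps, "select"):
        # Python 3.10+ exposes a selection interface
        found = eps.select(group=group)
    else:
        # Python 3.8 and 3.9 return a plain dict of group name to entries
        found = eps.get(group, [])

    return {ep.name: ep for ep in found}
```

Calling load() on one of the returned entry point objects imports and returns the registered class.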

Base classes

Managers should be implemented as either a LocalManager or RemoteManager

from stow.manager import LocalManager, RemoteManager

The main functions on Manager use a method, localise, to get an absolute path to the artefacts they want to interact with. This method is responsible for ensuring the artefact's availability to the other methods, and it is the key difference between the LocalManager and RemoteManager.

A LocalManager can access its artefacts directly, whereas a RemoteManager must retrieve its artefacts before it can work with them.

Each base class implements localise for its own situation. The RemoteManager implementation is a lot more involved, as it avoids pulling and pushing information any more than it needs to.

localise makes use of your abstract methods defined below to uphold the interface of Manager and does not need to be re-implemented.
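The difference between the two can be pictured with a toy sketch. This is not stow's implementation: ToyLocal, ToyRemote and the in-memory dict standing in for remote storage are all hypothetical, and treating localise as a context manager is an assumption made for the illustration.

```python
import contextlib
import os
import tempfile


class ToyLocal:
    """Artefacts already live on disk, so localise only resolves the path."""

    def __init__(self, root: str):
        self._root = root

    @contextlib.contextmanager
    def localise(self, managerPath: str):
        yield os.path.join(self._root, managerPath.lstrip("/"))


class ToyRemote:
    """Artefacts live elsewhere (here a dict) and must be fetched first."""

    def __init__(self):
        self._store = {}  # hypothetical stand-in for remote storage
        self.pushes = 0   # counts uploads, to show unchanged data is not re-sent

    @contextlib.contextmanager
    def localise(self, managerPath: str):
        with tempfile.TemporaryDirectory() as directory:
            local = os.path.join(directory, os.path.basename(managerPath))

            # Pull: make the artefact available locally if it exists remotely
            if managerPath in self._store:
                with open(local, "wb") as handle:
                    handle.write(self._store[managerPath])

            yield local

            # Push: only send data back when it differs from what is stored
            if os.path.exists(local):
                with open(local, "rb") as handle:
                    data = handle.read()
                if self._store.get(managerPath) != data:
                    self._store[managerPath] = data
                    self.pushes += 1
```

In both cases the caller works with an ordinary local path inside the with block. Only the remote variant has to do work before and after.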

Note

You may inherit from the Manager base class directly if you wish but you will have to implement the localise method in addition to the other abstract methods. I'd only suggest doing this if you have very special behaviour you want to express.

If you do find yourself in this situation, please consider contributing this special behaviour back to the original project as its own abstract base class to help others.

Abstract methods

abstract method

_abspath(managerPath)

Return the absolute path on the backend provider from the standardised manager path.

Parameters
  • managerPath (str) The manager relative path which is to be converted to an absolute path
Returns (str)

The manager absolute path

Examples

For the filesystem, this will be the full absolute path to the object. For s3 this is the key of the object.

>>> stow.connect(manager='FS', path='/home/ubuntu')._abspath('/hello/there')
'/home/ubuntu/hello/there'
>>> stow.connect(manager='s3', bucket='bucket-example')._abspath('/hello/there')
'hello/there'
def _abspath(self, managerPath: str) -> str:
    path = self.join(self._path, managerPath, joinAbsolutes=True)

    if os.name == 'nt':
        path = path.replace('/', '\\')

    return path
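For a key-based backend such as s3, the conversion might instead look like the sketch below. The function name and the optional prefix parameter are hypothetical. The only behaviour taken from the examples above is that '/hello/there' maps to the key 'hello/there'.

```python
def manager_path_to_key(managerPath: str, prefix: str = "") -> str:
    """Convert a manager path into an object key (no leading slash)."""
    # Keys are not rooted, so drop the leading separator of the manager path
    key = managerPath.lstrip("/")

    # Optionally nest every key beneath a prefix (an assumption for this sketch)
    if prefix:
        key = prefix.strip("/") + "/" + key

    return key
```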
abstract method

_identifyPath(managerPath)

For the path given, create an Artefact for the object at that location on the manager, but do not add it into the manager. If no object exists, return None.

Parameters
  • managerPath (str) The manager path of the artefact to identify
Returns (typing.Union[Artefact, None])

The artefact object that represents the item on disk or None if nothing exists

def _identifyPath(self, managerPath: str):

    abspath = self._abspath(managerPath)

    if os.path.exists(abspath):

        stats = os.stat(abspath)

        # Created time
        createdTime = datetime.datetime.utcfromtimestamp(stats.st_ctime)
        createdTime = pytz.UTC.localize(createdTime)

        # Modified time
        modifiedTime = datetime.datetime.utcfromtimestamp(stats.st_mtime)
        modifiedTime = pytz.UTC.localize(modifiedTime)

        # Access time
        accessedTime = datetime.datetime.utcfromtimestamp(stats.st_atime)
        accessedTime = pytz.UTC.localize(accessedTime)

        if os.path.isfile(abspath):
            return File(
                self,
                managerPath,
                stats.st_size,
                modifiedTime,
                createdTime,
                accessedTime,
            )

        elif os.path.isdir(abspath):
            return Directory(
                self,
                managerPath,
                createdTime=createdTime,
                modifiedTime=modifiedTime,
                accessedTime=accessedTime,
            )

    return None
abstract method

_get(source, destination)

Fetch the artefact and download its data to the local destination path provided.

The existence of the file to collect has already been checked so this function can be written to assume its existence

Parameters
  • source (Artefact) The source object and context that is to be downloaded
  • destination (str) The local path to where the source is to be written
def _get(self, source: Artefact, destination: str):

    # Convert source path
    sourceAbspath = self._abspath(source.path)

    # Identify download method
    method = shutil.copytree if os.path.isdir(sourceAbspath) else shutil.copy

    # Download
    method(sourceAbspath, destination)
abstract method

_getBytes(source)

Fetch the file artefact's contents directly. This avoids having to write the contents of files to disk for some of the other operations.

The existence of the file to collect has already been checked so this function can be written to assume its existence

Parameters
  • source (Artefact) The source object and context that is to be downloaded
Returns (bytes)

The bytes content of the file

def _getBytes(self, source: Artefact) -> bytes:
    with open(self._abspath(source.path), "rb") as handle:
        return handle.read()
abstract method

_put(source, destination)

Put the local filesystem object onto the underlying manager implementation using the absolute paths given.

To avoid user error, an artefact cannot be placed onto a Directory unless an overwrite toggle has been passed; the toggle is False by default. This should protect users from accidentally deleting a directory.

In the event that they do want to do so, the deletion of the directory will be handled before this function is called. Therefore there is no need to check for or protect against it. (famous last words)

Parameters
  • source (str) A local absolute path to an artefact (File or Directory)
  • destination (str) The manager path for the artefact (converted internally with _abspath)
def _put(self, source: str, destination: str):
    # Convert destination path
    destinationAbspath = self._abspath(destination)

    # Ensure the destination
    os.makedirs(os.path.dirname(destinationAbspath), exist_ok=True)

    # Select the put method
    method = shutil.copytree if os.path.isdir(source) else shutil.copy

    # Perform the putting
    method(source, destinationAbspath)
abstract method

_putBytes(fileBytes, destination)

Put the bytes of a file object onto the underlying manager implementation using the absolute path given.

This function allows processes to avoid writing files to disk, for speedier transfers.

If it's not possible to transmit bytes, I'd recommend writing the bytes to a tempfile and then calling the _put method.

Parameters
  • fileBytes (bytes) The bytes of the file
  • destination (str) The manager path the bytes are to be written to
def _putBytes(self, fileBytes: bytes, destination: str):

    # Convert destination path
    destinationAbspath = self._abspath(destination)

    # Make sure the destination directory exists
    os.makedirs(os.path.dirname(destinationAbspath), exist_ok=True)

    # Write the byte file
    with open(destinationAbspath, "wb") as handle:
        handle.write(fileBytes)
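Where a backend cannot accept bytes directly, the tempfile fallback suggested above could be sketched as follows. The mixin name is hypothetical, and the only thing it assumes of the host class is a _put(source, destination) method with the signature documented on this page.

```python
import os
import tempfile


class TempfilePutBytesMixin:
    """Hypothetical mixin: implement _putBytes in terms of _put."""

    def _putBytes(self, fileBytes: bytes, destination: str):
        # Write the bytes to a temporary local file...
        with tempfile.TemporaryDirectory() as directory:
            localPath = os.path.join(directory, "payload")
            with open(localPath, "wb") as handle:
                handle.write(fileBytes)

            # ...then hand the local file to the ordinary put method while it
            # still exists (the directory is removed when the block exits)
            self._put(localPath, destination)
```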
abstract method

_cp(source, destination)

Method for copying an artefact local to the manager to another location on the manager. Implementing this avoids having to download data from a manager only to re-upload it.

If there isn't a method of duplicating the data on the manager, you can call self._put(self._abspath(source.path), destination)

This means the behaviour defaults to the put action.

Parameters
  • source (Artefact) the manager local source artefact
  • destination (str) a manager abspath path for destination
def _cp(self, source: Artefact, destination: str):
    self._put(self._abspath(source.path), destination)
abstract method

_mv(source, destination)

Method for moving an artefact local to the manager to another location on the manager. Implementing this avoids having to download data from a manager only to re-upload it.

If there isn't a method of moving the data on the manager, you can call self._put(self._abspath(source.path), destination) followed by self._rm(source)

This means the behaviour defaults to the put action followed by a delete of the original artefact, achieving the same goal.

Parameters
  • source (Artefact) the manager local source file
  • destination (str) a manager abspath path for destination
def _mv(self, source: Artefact, destination: str):

    # Convert the source and destination
    source, destination = self._abspath(source.path), self._abspath(destination)

    # Ensure the destination location
    os.makedirs(os.path.dirname(destination), exist_ok=True)

    # Move the source artefact
    os.rename(source, destination)
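The implementation above relies on a native rename. Where a backend has no such operation, a put-then-delete fallback might be sketched like this. The mixin name is hypothetical, and _put, _rm and _abspath are assumed to have the signatures documented on this page (in particular, _rm is given the source artefact itself).

```python
class PutThenRemoveMoveMixin:
    """Hypothetical mixin: implement _mv as a copy followed by a delete."""

    def _mv(self, source, destination: str):
        # Upload the source data to the new location...
        self._put(self._abspath(source.path), destination)

        # ...and only then delete the original, so that a failed put
        # leaves the source untouched
        self._rm(source)
```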
abstract method

_ls(directory)

List all artefacts that are present at the directory's location and add them into the manager.

Parameters
  • directory (str) The manager path to the directory whose content is to be indexed
def _ls(self, directory: str):

    # Get a path to the folder
    abspath = self._abspath(directory)

    # Iterate over the folder, identify every object and add the created artefact to the manager
    for art in os.listdir(abspath):
        self._addArtefact(
            self._identifyPath(
                self.join(directory, art, separator='/')
            )
        )
abstract method

_rm(artefact)

Delete the underlying artefact data on the manager.

To avoid possible user error in deleting directories, the user must have already indicated that they want to delete everything

Parameters
  • artefact (Artefact) The artefact on the manager to be deleted
def _rm(self, artefact: Artefact):

    # Convert the artefact
    artefact = self._abspath(artefact.path)

    # Select method for deleting
    method = shutil.rmtree if os.path.isdir(artefact) else os.remove

    # Remove the artefact
    method(artefact)
abstract classmethod

_signatureFromURL(url)

Create the signature that can be passed to the init of the manager to create a new instance, using the information in the url ParseResult object created by the stateless interface.

Parameters
  • url (ParseResult) The result of passing the stateless path through urllib.parse.urlparse
Returns (tuple)

The signature (dict) used to initialise a manager of this type, loaded with information from the url, and the relpath (str): the manager relative path for the artefact that may have been referenced

Raises
  • Error Errors due to missing information and so on
@classmethod
def _signatureFromURL(cls, url: urllib.parse.ParseResult):
    return {"path": "/"}, os.path.abspath(os.path.expanduser(url.path))
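For a backend addressed by something other than a filesystem path, the same split might look like the sketch below. The function and the 'bucket' keyword are illustrative assumptions for an s3-style URL rather than stow's actual implementation. urllib.parse.urlparse is the standard library call named above.

```python
import urllib.parse


def signature_from_url(url: urllib.parse.ParseResult):
    """Split a parsed URL into (init kwargs, manager relative path)."""
    if not url.netloc:
        # Mirrors the "Raises" note: missing information is an error
        raise ValueError("URL does not name a bucket: {}".format(url.geturl()))

    # The network location selects the manager, the path selects the artefact
    return {"bucket": url.netloc}, url.path or "/"
```

For example, parsing "s3://bucket-example/hello/there" gives a netloc of "bucket-example" and a path of "/hello/there", which become the init keyword argument and the manager relative path respectively.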
abstract method

toConfig()

Generate a config which can be unpacked into the connect interface to initialise this manager. To be used to serialise and deserialise a manager object.

NOTE Defaulted values or environment variables are not guaranteed to be saved

Returns (dict)

A dictionary of the kwargs of the init of the manager

def toConfig(self):
    return {'manager': 'FS', 'path': self._path}
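A round trip through such a config might look like the sketch below. ToyManager, the MANAGERS registry and this connect function are hypothetical stand-ins so that the example runs without stow installed. With the real package, the dictionary would be unpacked into stow.connect instead.

```python
import json


class ToyManager:
    """Hypothetical manager whose whole state is a single path."""

    def __init__(self, path: str):
        self._path = path

    def toConfig(self) -> dict:
        return {"manager": "TOY", "path": self._path}


# Hypothetical registry mapping the 'manager' key to a class
MANAGERS = {"TOY": ToyManager}


def connect(manager: str, **kwargs):
    """Stand-in for a connect interface: pick the class, pass on the kwargs."""
    return MANAGERS[manager](**kwargs)


# Serialise the manager, then rebuild an equivalent one from the text
original = ToyManager("/home/ubuntu")
serialised = json.dumps(original.toConfig())
restored = connect(**json.loads(serialised))
```

Because the config holds only plain values, it can be stored as JSON and later used to recreate the manager.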

Special cases

Depending on the storage medium, it may be more efficient to load (read the metadata of) multiple artefacts simultaneously. s3, for example, returns the metadata for all files at a level when asked. It would be more efficient to instantiate all of these objects at that point rather than singling out any one object.

This can be achieved by overriding the _loadArtefact method on the Manager, which is the method used internally to create/ensure an artefact object.

def _loadArtefact(self, managerPath: str) -> Artefact:

    if managerPath in self._paths:
        # Artefact was previously loaded and can be returned normally
        return super()._loadArtefact(managerPath)

    try:
        # Ensure the owning directory and fetch the directory object
        directory = self._ensureDirectory(self.dirname(managerPath))

    except (exceptions.ArtefactNotFound, exceptions.ArtefactTypeError) as e:
        raise exceptions.ArtefactNotFound("Cannot locate artefact {}".format(managerPath)) from e

    # Add all artefacts of the directory into the manager at this level
    self._ls(directory.path)
    directory._collected = True

    # Return the now instantiated artefact
    if managerPath in self._paths:
        return self._paths[managerPath]

    else:
        raise exceptions.ArtefactNotFound("Cannot locate artefact {}".format(managerPath))