cabinets 0.7.0

Cabinets

cabinets is a Python library that provides a consistent interface for file operations
across multiple storage platforms. File extensions are dynamically detected to allow
automatic serialization and deserialization of Python objects.
cabinets supports a variety of protocols and file format parsers natively, and new
protocols or parsers can be easily registered.

Table of contents

Sample Usage
    Read a file
    Write a file
    List files in a directory
    Reading and Writing with Other Protocols
Built-in Protocols and Parsers
    Protocols
    Parsers
Protocol Configuration
Custom Protocols and Parsers
    Adding Cabinets
    Adding Parsers
    Loading Plugins
Contributing

Sample Usage
Read a file
Set up a test file in your local filesystem:
import json

obj = {'test': 1}

# Write to test.json so the cabinets.read() call below finds the file
with open('test.json', 'w') as fh:
    json.dump(obj, fh)

Read back and parse the file using cabinets:
import cabinets

new_obj = cabinets.read('test.json')

That's it! The file is loaded and parsed in just one line.
Write a file
cabinets also supports creating files. We can rewrite the first example using
only cabinets.
import cabinets

obj = {'test': 1}
cabinets.create('test.json', obj)

new_obj = cabinets.read('test.json')

assert new_obj == obj

List files in a directory
In some situations, you may need to know what files are in a directory before doing
any operations. cabinets provides a list function to assist with this.
import cabinets

obj = {'test': 1}
cabinets.create('example/test.json', obj)
cabinets.create('example/test2.yaml', obj)
cabinets.create('example/subdir/test3.txt', "test")

assert cabinets.list('example/') == ['test.json', 'test2.yaml']
assert cabinets.list('example/subdir/') == ['test3.txt']

Important: For simplicity, cabinets restricts the output of list to only file types.
Subdirectories are excluded, and must be queried separately.
Future versions may include a flag in list for returning subdirectories as well.
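
For the local filesystem, the files-only behaviour described above can be approximated with the standard library. The sketch below is not cabinets' implementation: list_files and list_files_recursive are hypothetical helpers, shown only to illustrate how nested files can be collected when each subdirectory has to be queried separately.

```python
import os
import tempfile


def list_files(directory: str) -> list[str]:
    """Return the sorted names of regular files directly inside `directory`."""
    with os.scandir(directory) as entries:
        return sorted(entry.name for entry in entries if entry.is_file())


def list_files_recursive(directory: str) -> list[str]:
    """Return sorted paths, relative to `directory`, of every file below it."""
    found = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), directory)
            found.append(rel.replace(os.sep, "/"))
    return sorted(found)


def build_example_tree(base: str) -> str:
    """Create the example/ layout from the text above and return its path."""
    example = os.path.join(base, "example")
    os.makedirs(os.path.join(example, "subdir"))
    for rel in ("test.json", "test2.yaml", os.path.join("subdir", "test3.txt")):
        with open(os.path.join(example, rel), "w") as fh:
            fh.write("test")
    return example


with tempfile.TemporaryDirectory() as tmp:
    example = build_example_tree(tmp)
    print(list_files(example))            # files only, no 'subdir' entry
    print(list_files_recursive(example))  # includes 'subdir/test3.txt'
```

The recursive helper is one way to work around the missing subdirectory flag; with cabinets itself you would call list once per directory you care about.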

Reading and Writing with Other Protocols
Using cabinets allows you to interact with multiple file storage protocols depending
on the URI you specify. In the previous examples, we used read() and create() to
operate within our local file system; that's because cabinets assumes the file://
protocol by default. Luckily, accessing other storage systems is just as easy!
For example, operating on a file on AWS S3 is done exactly the same way:
import cabinets

# Read JSON file from your filesystem
local_obj = cabinets.read('file://test.json')

# Write that object to a file in AWS S3
cabinets.create('s3://test.json', local_obj)

# Read back the same file from AWS S3
remote_obj = cabinets.read('s3://test.json')

assert local_obj == remote_obj

The above example will read a file from the local filesystem and create a new file
containing the same data, at the same path in S3.
By prefixing the path with {protocol}:// we specify how and where cabinets should
look for a file. Using file:// (default if none specified) tells cabinets to use that
path on the local filesystem. Using s3:// on the other hand instructs cabinets to
perform operations against that path in AWS S3.
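
The prefix rule can be pictured as a small string split. This is a minimal sketch, assuming a plain '://' separator and a 'file' default as described above; split_uri and DEFAULT_PROTOCOL are hypothetical names, not part of the cabinets API.

```python
DEFAULT_PROTOCOL = "file"


def split_uri(uri: str) -> tuple[str, str]:
    """Split '{protocol}://{path}' into (protocol, path), defaulting to 'file'."""
    protocol, separator, path = uri.partition("://")
    if not separator:
        # No '://' found, so the whole string is treated as a local path.
        return DEFAULT_PROTOCOL, uri
    return protocol, path


print(split_uri("test.json"))
print(split_uri("file://test.json"))
print(split_uri("s3://bucket/test.json"))
```

Under this sketch, 'test.json' and 'file://test.json' resolve to the same protocol and path, which matches the default described above.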

NOTE: The S3Cabinet may require initial configuration for the s3 protocol to
function properly. See Protocol Configuration for details.

See all the natively supported protocols below.
Built-in Protocols and Parsers
Protocols

Local File System (file://)
S3 (s3://)

Parsers

YAML (.yml, .yaml)
JSON (.json)
Python Pickle (.pickle)
CSV (beta) (.csv)
TXT (.txt)

import cabinets

# .foo file in local filesystem
local_foo_data = cabinets.read('file://test.foo')

# .foo file in S3
s3_foo_data = cabinets.read('s3://test.foo')

Protocol Configuration
Some storage platform protocols may require additional configuration parameters to be
set before they can be used. Each Cabinet subclass can expose
a set_configuration(**config) class method to take care of any required initial setup.
from cabinets.cabinet.s3_cabinet import S3Cabinet

# set the AWS S3 region to us-west-2 and specify an access key
S3Cabinet.set_configuration(region_name='us-west-2', aws_access_key_id=...)

# use specific Cabinet to avoid protocol prefix
S3Cabinet.read('bucket-us-west-2/test.json')
# or use generic Cabinet with protocol prefix
import cabinets

cabinets.read('s3://bucket-us-west-2/test.json')

See the documentation of specific Cabinet classes for what configuration parameters
are available.
Additionally, there is a top-level set_configuration() function so that importing
specific Cabinet subclasses is not required. Simply pass the desired protocol as the
first argument.
import cabinets

# *OPTIONAL*: set the AWS S3 region to us-west-2 and specify an access key
cabinets.set_configuration('s3', region_name='us-west-2', aws_access_key_id=...)

# use generic Cabinet with protocol prefix
cabinets.read('s3://bucket-us-west-2/test.json')
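
To illustrate how a per-class set_configuration() and a top-level set_configuration(protocol, ...) can fit together, here is a minimal sketch. It makes no claim about how cabinets stores configuration internally; BaseCabinet, DemoS3Cabinet, DemoFileCabinet and _PROTOCOLS are hypothetical stand-ins, and nothing here touches AWS.

```python
from typing import Any


class BaseCabinet:
    """Hypothetical stand-in for a Cabinet base class."""

    _config: dict[str, Any] = {}

    @classmethod
    def set_configuration(cls, **config: Any) -> None:
        # Rebind on the subclass so each protocol keeps its own settings.
        cls._config = {**cls._config, **config}


class DemoS3Cabinet(BaseCabinet):
    """Hypothetical S3-like cabinet; performs no network access."""


class DemoFileCabinet(BaseCabinet):
    """Hypothetical local-file cabinet."""


# Hypothetical protocol lookup table.
_PROTOCOLS: dict[str, type[BaseCabinet]] = {
    "s3": DemoS3Cabinet,
    "file": DemoFileCabinet,
}


def set_configuration(protocol: str, **config: Any) -> None:
    """Forward configuration to whichever class handles `protocol`."""
    try:
        cabinet_cls = _PROTOCOLS[protocol]
    except KeyError:
        raise ValueError(f"unknown protocol: {protocol!r}") from None
    cabinet_cls.set_configuration(**config)


set_configuration("s3", region_name="us-west-2")
print(DemoS3Cabinet._config)    # settings applied to the S3 stand-in only
print(DemoFileCabinet._config)  # unaffected
```

Keeping the settings on the class rather than on an instance fits the class-method style used throughout this README.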

Custom Protocols and Parsers
cabinets is designed to be fully extensible with new protocols and parsers. If your
desired storage platform or file format is not listed above, you can still use it
with cabinets!
Adding Cabinets
New protocol connections can be added by subclassing abstract base class Cabinet, and
registering the class to one or more protocol identifiers:
from cabinets import Cabinet, register_protocols

@register_protocols('foo')
class FooCabinet(Cabinet):

    @classmethod
    def set_configuration(cls, **kwargs):
        # Set up any necessary configuration parameters for "foo" protocol
        ...

    @classmethod
    def read_content(cls, path: str) -> bytes:
        # Custom logic for reading bytes from a path using "foo" protocol
        ...

    @classmethod
    def create_content(cls, path: str, content: bytes):
        # Custom logic for writing bytes to a path using "foo" protocol
        ...

    @classmethod
    def delete_content(cls, path):
        # Custom logic for deleting the object at a path using "foo" protocol
        ...

Here we define a FooCabinet, and register it to the protocol identifier foo. Once
this class is loaded, any cabinets function calls using the foo:// prefix will be
processed with this class. This means if we called:
import cabinets
from ... import FooCabinet # ensure FooCabinet is loaded

cabinets.read('foo://example.json')

The first call that occurs will be FooCabinet.read_content('example.json'), and that
result is then parsed by the JSONParser before being returned.

NOTE: In order for the protocols to be registered, the class definition must be
run at least once. Make sure the modules where your custom Cabinet classes are defined
are imported somewhere before they are used, or use the built-in plugin system
(see Loading Plugins).

Adding Parsers
cabinets also supports custom extension parsing in the exact same way:
from typing import Any

from cabinets.parser import Parser, register_extensions

@register_extensions('bar')
class BarParser(Parser):
    @classmethod
    def load_content(cls, content: bytes):
        # Parse bytes from "bar" file format into a Python object
        ...

    @classmethod
    def dump_content(cls, data: Any):
        # Dump a Python object into bytes in the "bar" file format
        ...

Now if we redo our above example using the .bar extension:
import cabinets
from ... import FooCabinet, BarParser # ensure FooCabinet and BarParser are loaded

cabinets.read('foo://example.bar')

This statement is roughly equivalent to:
BarParser.load_content(FooCabinet.read_content('example.bar'))

and should return a Python object from your Foo cabinet, using your Bar parser!
Loading Plugins
As mentioned in the example above, your custom Cabinet and Parser classes must be
executed in order to be added to the internal cache cabinets uses for protocol and
extension lookup. If your custom classes are imported before any cabinets functions
are used, then this won't be an issue. However, in many use cases there is no reason to
import those classes aside from usage with cabinets functions. Instead of requiring
each class to be imported manually at the start of your program,
cabinets can search a specified path for new Cabinet and Parser classes, and load
them automatically.
Specifying the PLUGIN_PATH environment variable will cause cabinets to search for
subdirectories called cabinet and parser in that path. Modules residing within those
directories will be searched for Cabinet and Parser subclasses respectively.
PLUGIN_PATH
├── cabinet
│   └── foo_cabinet.py
└── parser
    ├── bar_parser.py
    └── baz_parser.py

If the above FooCabinet and BarParser classes are placed in foo_cabinet.py
and bar_parser.py, they will be loaded and registered to their specified cache without
needing to be referenced anywhere else in the program.
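
Directory-based loading of this kind can be built on importlib. The following is a minimal sketch of the general technique, not the loader cabinets ships: load_plugin_modules and the demo_plugin_ module-name prefix are hypothetical, and the demo builds a throwaway directory instead of reading PLUGIN_PATH.

```python
import importlib.util
import tempfile
from pathlib import Path
from types import ModuleType


def load_plugin_modules(
    plugin_path: str, subdirs: tuple[str, ...] = ("cabinet", "parser")
) -> dict[str, ModuleType]:
    """Import every .py file found in the named subdirectories of plugin_path."""
    modules: dict[str, ModuleType] = {}
    for subdir in subdirs:
        directory = Path(plugin_path) / subdir
        if not directory.is_dir():
            continue  # a missing subdirectory is simply skipped
        for file in sorted(directory.glob("*.py")):
            name = f"demo_plugin_{subdir}_{file.stem}"
            spec = importlib.util.spec_from_file_location(name, str(file))
            module = importlib.util.module_from_spec(spec)
            # Executing the module runs its class definitions, and with them
            # any registration decorators applied to those classes.
            spec.loader.exec_module(module)
            modules[name] = module
    return modules


if __name__ == "__main__":
    # Build a throwaway plugin directory so the sketch runs anywhere; a real
    # program would take the location from the PLUGIN_PATH environment variable.
    with tempfile.TemporaryDirectory() as tmp:
        parser_dir = Path(tmp) / "parser"
        parser_dir.mkdir()
        (parser_dir / "bar_parser.py").write_text("LOADED = True\n")
        print(sorted(load_plugin_modules(tmp)))
```

Executing each module is enough to trigger decorator-based registration, so nothing else in the program needs to reference the plugin classes.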
Contributing
This package is open source (see LICENSE), so please feel free to contribute by
submitting a pull request, creating an issue, or contacting the authors directly.
Authors and Contributors

Lucas Lofaro (Co-Author): lucasmlofaro@gmail.com
Sam Hollenbach (Co-Author): samhollenbach@gmail.com