bdrcutil 1.0.5
BDRC-UTIL
Overview
Development
Deployment
Installation
Debian requirements
MacOS requirements
Contents
Publicly available scripts
locators
migrate works
log_dip
User Guides
log_dip
Synopsis
Argument structure
Argument hints
Deep Archive and Inversion
Inversion
Sync and Deep Archive
API
TODO: Document API
bdrc-util Changelog
BDRC-UTIL
Overview
BDRC UTIL is a python package containing modules for use by the Buddhist
Digital Resource Center. It is offered to the public under the MIT
License. This document describes its
contents and features.
Although publicly available, BDRC does not support this project for use
by others. We will not respond to questions about its features and
functionality.
Development
archive-ops uses python packages from archive-ops/venv
Deployment
# be in project main dir
python -m setup bdist_wheel
# test
twine upload --verbose -r testpypi dist/bdrc_util-x.MM.mm-py3-none-any.whl
# prod
twine upload --verbose dist/bdrc_util-x.MM.mm-py3-none-any.whl
Installation
The package is published on PyPI.org as bdrc-util.
Debian requirements
The pip component mysqlclient needs this package (and its dependencies)
in order to install:
sudo apt install default-libmysqlclient-dev
MacOS requirements
The pip component mysqlclient needs this package (and its dependencies)
in order to install:
brew install mysql
Contents
Publicly available scripts
As defined in setup.py
locators
Maps a work and a destination parent to a specific directory using
various BDRC mapping schemes
migrate works
Scripts to migrate and log works into BDRC’s 2021 Archival strategy
log_dip
Log creation and distribution of Distribution Information Packages
(DIPs). DIP is an OAIS term to describe a unit of publication.
User Guides
log_dip
The command log_dip is intended for use by BDRC staff to instrument
their publication activities. log_dip takes arguments from the shell
and transfers them into a database table.
Synopsis
log_dip --help
usage: log_dip | -d DBAppSection:DbAppFile log_dip [OPTIONS] [dip_source_path] [dip_dest_path]

Logs a number of different publication strategies

positional arguments:
  source_path           Source path (optional) - string
  dest_path             Destination path (optional) - string

options:
  -h, --help            show this help message and exit
  -d DRSDBCONFIG, --drsDbConfig DRSDBCONFIG
                        specify section:configFileName
  -l {info,warning,error,debug,critical}, --log-level {info,warning,error,debug,critical}
                        choice values are from python logging module
  -a ACTIVITY_TYPE, --activity_type ACTIVITY_TYPE
                        Activity type
  -w WORK_NAME, --work_name WORK_NAME
                        work being distributed
  -i DIP_ID, --dip_id DIP_ID
                        ID to update
  -r ACTIVITY_RETURN_CODE, --activity_return_code ACTIVITY_RETURN_CODE
                        Integer result of operation.
  -b BEGIN_TIME, --begin_time BEGIN_TIME
                        time of beginning - ')yyyy-mm-dd hh:mm:ss bash format date +'%Y-%m-%d
                        %R:%S'
  -e END_TIME, --end_time END_TIME
                        time of end.Default is invocation time. yyyy-mm-dd hh:mm:ss bash format
                        date + '%Y-%m-%d %R:%S'
  -c COMMENT, --comment COMMENT
                        Any text up to 4GB in length
  -s DIP_SOURCE_PATH, --dip_source_path DIP_SOURCE_PATH
                        Source path (optional) - string
  -t DIP_DEST_PATH, --dip_dest_path DIP_DEST_PATH
                        Destination path (optional) - string
  -L, --resolve-sym-links
                        True to resolve file paths, false to accept input as is
  -n INVENTORY, --inventory INVENTORY
                        path to inventory (only used for ARCHIVE)
Argument structure
log_dip creates a database record that captures the beginning or end of
a DIP event.
All its operations return an opaque identifier which can reference the
record.
You reference the record later by one of two methods:
passing in the id from the initial (or a subsequent) call. In bash, this
would be invoked as:
dip_id=$(log_dip --drsDbConfig sec:some.config --begin_time "2021-05-11 01:23:45" --activity_type DRS --work_name W12345)
log_dip -d sec:some.config --activity_return_code 42 --end_time "2021-05-11 12:34:56" --dip_id "$dip_id"
using the work name, activity type and begin time:
log_dip -d sec:some.config -b "2021-05-11 01:23:45" -a DRS -w W12345
log_dip -d sec:some.config -b "2021-05-11 01:23:45" -a DRS -w W12345 -r 42 -e "2021-05-11 12:34:56"
Both of the above examples perform the same function:
log the start of a DRS job for work W12345 at “2021-05-11 01:23:45”
log the end_time of the job at “2021-05-11 12:34:56”, with a return
code of 42
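The two referencing methods above can also be driven from Python. This is a minimal sketch, not part of bdrc-util: the helper names are hypothetical, the command name follows the Synopsis, and it assumes (as the bash `$(...)` capture implies) that log_dip prints the record identifier on standard output.

```python
import subprocess


def begin_args(db_config, work_name, activity_type, begin_time):
    """Argument list that logs the start of a DIP event."""
    return ["log_dip", "-d", db_config, "-b", begin_time,
            "-a", activity_type, "-w", work_name]


def end_args_by_id(db_config, dip_id, return_code, end_time):
    """Method 1: reference the record by the id returned earlier."""
    return ["log_dip", "-d", db_config, "-i", dip_id,
            "-r", str(return_code), "-e", end_time]


def end_args_by_key(db_config, work_name, activity_type, begin_time,
                    return_code, end_time):
    """Method 2: reference the record by (work name, activity type, begin time)."""
    return begin_args(db_config, work_name, activity_type, begin_time) + [
        "-r", str(return_code), "-e", end_time]


def run_log_dip(args):
    """Run log_dip and return what it printed.

    Assumption: the printed text is the opaque record identifier.
    """
    done = subprocess.run(args, capture_output=True, text=True, check=True)
    return done.stdout.strip()
```

A caller would pass the result of `run_log_dip(begin_args(...))` as `dip_id` to `end_args_by_id`. Building the arguments as a list avoids shell quoting problems with the space inside the timestamps.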
Argument hints
To give an end time, you must give all the job id information, either
in the id, or with the (work_name, begin_time, activity_type) tuple.
You can add as much information as you want in one call. If you’ve
captured the begin time, you can create a call which logs everything at
the same time (this is not the best practice, because it eliminates
the system’s ability to check for in-progress jobs). This is
perfectly legal:
log_dip -d sec:some.config -b "2021-05-11 01:23:45" -a DRS -w W12345 -r 42 -e "2021-05-11 12:34:56" -c "Hi Mom, I'm re-writing history"
Begin and end dates are fussy: in shell, the format for generating
the date log_dip requires is: date +'%Y-%m-%d %R:%S' (for macOS with
GNU coreutils, and for GNU/Linux systems)
You can update some DIP log properties:
comments
end time
operation return code
Because they form the tuple that identifies the transaction, you
cannot modify:
work name
begin time
activity type
dip_external_id (this is a read-only argument supplied by the
caller of log_dip)
In this example, the comments field is updated.
dip_log_id=$(log_dip -d sec:some.config -b "2021-05-11 01:23:45" -a DRS -w W12345 -r 42 -e "2021-05-11 12:34:56" -c "Experienced some discomfort")
log_dip -d sec:some.config -i "$dip_log_id" -c "But it passed."
Any property not given in the command line is preserved. (The example
above preserves the begin and end times of the DIP transaction.)
The comment field is a free-form text field of up to 4GB in length.
You can store XML or JSON data in it for later use, such as error
messages or summary information about the process or the
objects being processed. Update: the deep-archive utility
reads the comment field for coded data.
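The date format and the structured-comment hint can be sketched in Python as follows. The helper names and the JSON keys are hypothetical, chosen only for illustration. The coded format that the deep-archive utility expects in the comment field is not documented here, so this shows only generic JSON storage.

```python
import json
from datetime import datetime

# Same layout as the shell command date +'%Y-%m-%d %R:%S'
DIP_TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def dip_time(moment=None):
    """Format a datetime (default: now) as 'yyyy-mm-dd hh:mm:ss'."""
    return (moment or datetime.now()).strftime(DIP_TIME_FORMAT)


def json_comment(**details):
    """Serialize arbitrary summary details for the -c/--comment argument."""
    return json.dumps(details, sort_keys=True)


begin = dip_time(datetime(2021, 5, 11, 1, 23, 45))
comment = json_comment(error="timeout", files_sent=3)
```

Here `begin` is suitable for `-b` and `comment` for `-c`. Reading the comment back with `json.loads` recovers the original keys and values.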
Deep Archive and Inversion
Inversion
In version 1.0.2 of bdrc-util the deep-archive utility was
created to send separate image groups to Glacier Deep Archive. This
allows large works to be sent as separate, smaller segments. (It also
allows other material that is not categorized by image group to be
sent to Glacier.) The process packages all the media types (sources,
archive, images) for an image group into one bagged zip file.
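The idea of gathering every media type for one image group into a single zip file can be sketched as below. This is an illustration only and not the deep-archive implementation: the directory layout (work root, then media type, then image group), the function name, and the zip member names are all assumptions, and the sketch does not produce a bag (no manifests or checksums are written).

```python
import zipfile
from pathlib import Path
from typing import List

# Media types named in the text above
MEDIA_TYPES = ("sources", "archive", "images")


def zip_image_group(work_root: Path, image_group: str, out_zip: Path) -> List[str]:
    """Zip <work_root>/<media>/<image_group>/** for every media type.

    The layout is hypothetical. Returns the member names written.
    """
    members = []
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        for media in MEDIA_TYPES:
            ig_dir = work_root / media / image_group
            if not ig_dir.is_dir():
                continue  # this media type has nothing for the image group
            for path in sorted(ig_dir.rglob("*")):
                if path.is_file():
                    # Keep the media type in the member name so the
                    # segments can be told apart after extraction
                    arcname = path.relative_to(work_root).as_posix()
                    zf.write(path, arcname)
                    members.append(arcname)
    return members
```

Each image group yields its own zip file, which matches the goal of sending a large work as separate, smaller segments.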
Sync and Deep Archive
archive-ops-1087 - sync by image group specifies enhancements to the
sync process to sync fragments of image groups. README.md documents
these requirements and provides examples.
API
A simple API, inspired by openpecha.buda.api, is provided as a
central library for commonly used utilities, including Legacy Hack Image
Group Translation.
TODO: Document API
To use in your code, pip install 'bdrc-util>=0.9.44' (the quotes keep the
shell from treating >= as a redirection)
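Code that depends on the API can check the installed version against this minimum at run time. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and the parser assumes plain dotted numeric versions like those in the changelog.

```python
from importlib.metadata import PackageNotFoundError, version

MINIMUM = (0, 9, 44)  # first version that provides the api, per the text above


def parse_version(text):
    """Turn '1.0.5' into (1, 0, 5). Assumes numeric dotted versions only."""
    return tuple(int(part) for part in text.split(".")[:3])


def meets_minimum(installed_text, minimum=MINIMUM):
    """True when the installed version is at least the minimum."""
    return parse_version(installed_text) >= minimum


def installed_bdrc_util_ok():
    """Check the bdrc-util distribution in the current environment."""
    try:
        return meets_minimum(version("bdrc-util"))
    except PackageNotFoundError:
        return False
```

Tuple comparison avoids the pitfall of comparing version strings directly, where "0.9.5" would sort after "0.9.44".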
bdrc-util Changelog
version   commit     Comments
1.0.5     (many)     Integration fixes
1.0.4     (many)     Support volume-manifest-builder by image group
1.0.3     1dfef221   Silence deep archive empty file error
1.0.2     (many)     Invert works for deep archive
1.0.1     ccd9865    dip_log passes db config to ORM
0.9.48    9573f3c    optional symlink resolution
0.9.47    192c43f4   Add s3pathlib to install requirements
0.9.46    e14b3a6    decommission web in favor of api
0.9.45    89724ee    Raise pageSize for Get volumes
0.9.44    TBD        Move Resolvers to api
0.9.43    013242a    caching to reduce load on server
0.9.42    146bc43a   support buda-dld
0.9.41    0d01394    print, don't return from disk_ig_from_buda
0.9.40               Rename get_image_groups
0.9.39               Added measure archive fixity; Shorten log file name
0.9.38               Added RST documentation to setup; Added minimum requirement for bdrc-db-lib
0.9.34               Use external address for resolver
0.9.32    be754999   Create entry points for image group renaming
0.9.31    192eea17   (not released) single entry point for image group renames
0.9.30    83c5062a   Add Work calculation size to script