thegmu_nextcloud_tools package

git_sync_data module

git_sync_data.py

Class designed specifically for TheGMUNextCloudGitSync.

  1. Tracks the files and directories to sync.

  2. File and directory methods such as md5sum calculations.

class thegmu_nextcloud_tools.git_sync_data.GitSyncData[source]

Bases: object

GitSyncData: files, directories and methods to sync git and nextcloud data.

static get_current_seconds()[source]

The epoch seconds returned as integer value.

get_directories(root_dir, include_dirs=None, exclude_dirs=None, empty_only=False)[source]

os.walk with filter to only return directories.

Parameters

empty_only – If True, return only empty directories. The empty_only use case is the removal of directories in NextCloud that are not found in Git.
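The empty_only filter can be illustrated with a minimal sketch built on os.walk. The function name find_directories is hypothetical, the include/exclude filtering is omitted, and this is not the package's implementation.

```python
import os


def find_directories(root_dir, empty_only=False):
    """Return directory paths under root_dir, relative to root_dir.

    With empty_only=True, keep only directories that contain no files
    and no subdirectories. (Hypothetical helper, for illustration.)
    """
    found = []
    for dir_path, dir_names, file_names in os.walk(root_dir):
        if dir_path == root_dir:
            continue  # Skip the root itself.
        if empty_only and (dir_names or file_names):
            continue  # Not empty, so filter it out.
        found.append(os.path.relpath(dir_path, root_dir))
    return sorted(found)
```

In this sketch a directory that holds only empty subdirectories does not count as empty. Removing nested empty directories would take repeated passes or a bottom-up walk.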

get_directory_files(root_dir, include_dirs=None, exclude_dirs=None)[source]

Use os.walk to recursively get all directory files using include/exclude lists

static get_files_exclude_regex(exclude_dirs)[source]

regex: compile list of exclude dirs to one expression to match.
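One way to compile a list of directory names into a single expression is sketched below. The name build_exclude_regex and the choice to match a name as a whole path component are assumptions for illustration; the actual pattern built by get_files_exclude_regex may differ.

```python
import re


def build_exclude_regex(exclude_dirs):
    """Compile directory names into one pattern (hypothetical helper).

    The pattern matches a '/'-separated relative path in which any
    whole component equals one of the excluded names.
    """
    if not exclude_dirs:
        return None
    names = "|".join(re.escape(name) for name in exclude_dirs)
    return re.compile(r"(?:^|/)(?:%s)(?:/|$)" % names)


exclude_regex = build_exclude_regex([".git", "Transfer"])
is_excluded = bool(exclude_regex.search("docs/.git/config"))
```

Escaping each name with re.escape keeps the dot in ".git" from matching arbitrary characters.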

static get_files_include_regex(root_dir, include_dirs)[source]

regex: compile list of include dirs to one expression to match.

static get_md5sum(md5sum_file)[source]

Python md5sum checksum for file similarity comparison.

Keep In Mind

GIT repositories are not designed for large binary files. Audio, video, and images are generally not suitable for GIT. The files being synced are therefore expected to be small, and each file is read into main memory in full.

Parameters

md5sum_file – any file that fits in memory.
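A minimal sketch of an in-memory md5sum using the standard hashlib module. The function name md5sum_of_file is hypothetical, and the whole file is read at once, matching the note above.

```python
import hashlib


def md5sum_of_file(path):
    """Return the hex MD5 digest of a file read fully into memory."""
    with open(path, "rb") as handle:
        return hashlib.md5(handle.read()).hexdigest()
```

MD5 is used here only to detect changed content, not for security.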

get_sync_elapsed_seconds()[source]

Elapsed seconds since setting self.sync_data[‘start_seconds’]

init_sync_data()[source]

init_sync_data when processing data only.

sync_data is a dictionary holding file and directory listings.

set_file_md5sum(dir_class, root_dir)[source]

set_file_md5sum: set the md5sum for all files in root_dir.

Only call this after all the files for the class of directory have been set, either all the nextcloud or git files.

Files for both classes are filtered based upon the include and exclude lists of each class.

Parameters
  • dir_class – One of (nextcloud, sync).

  • root_dir – os.path.join(root_dir, file path) == absolute path.

set_files_nextcloud(nextcloud_dir, include_dirs, exclude_dirs)[source]

Call get_directory_files() on the configured NextCloud dir.

set_files_sync(sync_dir, include_dirs, exclude_dirs)[source]

Call get_directory_files() on the configured git sync dir.

set_files_to_copy(nextcloud_dir, git_dir)[source]

Determine files in GIT that have been added or changed relative to nextcloud.

  1. If the file in git has changed (md5sum) then replace it.

  2. The Transfer directory files were already excluded.

  3. Set a reason for any file processed as NEW or DIFF.

  4. TODO: put the reasons in the config file.

Parameters
  • nextcloud_dir – root nextcloud directory.

  • git_dir – root git sync directory.
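The NEW and DIFF reasons can be sketched as a comparison of two mappings. This assumes each side is available as a dictionary from relative path to md5sum, which is an assumption about the data layout; classify_files_to_copy is a hypothetical name.

```python
def classify_files_to_copy(git_md5sums, nextcloud_md5sums):
    """Map each git file that must be copied to a reason.

    NEW: the file is in git but not in NextCloud.
    DIFF: the file is in both but the md5sums differ.
    """
    to_copy = {}
    for path, md5sum in git_md5sums.items():
        if path not in nextcloud_md5sums:
            to_copy[path] = "NEW"
        elif nextcloud_md5sums[path] != md5sum:
            to_copy[path] = "DIFF"
    return to_copy
```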

set_files_to_delete()[source]

Determine files in nextcloud that have been added relative to GIT and delete them.

  1. If the file is in NextCloud but not git, delete it.

  2. Set a reason for any file processed as CLOUD UPLOAD.

  3. TODO: put the reasons in the config file.
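The delete rule amounts to a set difference. The sketch below assumes both file listings are iterables of relative paths; the name classify_files_to_delete is hypothetical.

```python
def classify_files_to_delete(git_files, nextcloud_files):
    """Map each NextCloud file that is absent from git to a reason."""
    uploads = set(nextcloud_files) - set(git_files)
    return {path: "CLOUD UPLOAD" for path in sorted(uploads)}
```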

set_start_seconds()[source]

Set Linux epoch start seconds as integer value.

class thegmu_nextcloud_tools.git_sync_data.GitSyncDataException[source]

Bases: object

GitSyncData: exceptions specific to the GitSyncData module.

script module

script.py

Command line script utilities.

Most notably runcmd() that runs Linux bash command strings and outputs to console as needed.

exception thegmu_nextcloud_tools.script.ScriptException[source]

Bases: Exception

Exception specific to this file, notably the ‘initlock’ function.

thegmu_nextcloud_tools.script.begin()[source]

Print timestamp() with BEGIN.

thegmu_nextcloud_tools.script.end()[source]

Print timestamp() with END.

thegmu_nextcloud_tools.script.env_string_replace(env_string, env_vars=None, empty_sub=False)[source]

Replace all environment variables in a string with the environment variable value.

  1. If “env_vars” list is given then ignore os.environ and use this list.

  2. ${HOME}: all substitutions variables require curly braces as in ${HOME}.

  3. If “env_vars” dict is given then both keys and values are taken from the dictionary.

  4. If an environment variable does not exist then no substitution is made.

  5. If empty_sub is True then if an environment variable does not exist it will be replaced with an empty string.

Parameters
  • env_string – the string in which environment variables are substituted.

  • env_vars – substitute for the os.environ mapping of environment variables.

  • empty_sub – If True then variables not found are removed, otherwise they are left as is in the string.
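The substitution rules can be sketched with a regular expression that only recognizes the ${NAME} form. The name replace_env is hypothetical, and env_vars is assumed to be a dictionary; this is an illustration, not the module's code.

```python
import os
import re

# Only the curly-brace form ${NAME} is recognized; $NAME is left alone.
ENV_PATTERN = re.compile(r"\$\{(\w+)\}")


def replace_env(env_string, env_vars=None, empty_sub=False):
    """Substitute ${NAME} from env_vars, or from os.environ if None."""
    values = os.environ if env_vars is None else env_vars

    def substitute(match):
        name = match.group(1)
        if name in values:
            return str(values[name])
        # Unknown variable: drop it or leave the text untouched.
        return "" if empty_sub else match.group(0)

    return ENV_PATTERN.sub(substitute, env_string)
```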

thegmu_nextcloud_tools.script.fatal(msg, exit_ok=True)[source]

Prints timestamp() with ‘FATAL’ to stderr prepended to a msg. Exit with -1 unless exit_ok is False.

Parameters

exit_ok – If True then call sys.exit(-1) after printing message.

thegmu_nextcloud_tools.script.get_hostname(host_name_only=False)[source]

Return socket.gethostbyaddr() string of the FQDN, or fully qualified domain name.

Parameters

host_name_only – If True, return the short name only. Example: localhost.localdomain -> localhost

thegmu_nextcloud_tools.script.get_log_frame()[source]

stack trace log frame of current function.

thegmu_nextcloud_tools.script.getnow(now_type=None, target_datetime=None)[source]

getnow() get a log file timestamp.

Parameters
  • now_type – format option:

      None: 20190401:103226.82
      ‘script’: 2019-04-01-10:32:26
      ‘script_date’: 2019-04-01
      ‘SQL’: 2019-04-01-10:32:26

  • target_datetime – datetime.datetime() object. When None then defaults to datetime.now().
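The documented formats can be reproduced with strftime, as in this sketch. The name format_now is hypothetical, and reading the trailing ".82" in the default format as hundredths of a second is an assumption.

```python
from datetime import datetime

# strftime patterns taken from the documented examples.
FORMATS = {
    "script": "%Y-%m-%d-%H:%M:%S",
    "script_date": "%Y-%m-%d",
    "SQL": "%Y-%m-%d-%H:%M:%S",
}


def format_now(now_type=None, target_datetime=None):
    """Format target_datetime, or the current time when it is None."""
    moment = datetime.now() if target_datetime is None else target_datetime
    if now_type is None:
        # Assumption: the fraction is hundredths of a second.
        hundredths = moment.microsecond // 10000
        return moment.strftime("%Y%m%d:%H%M%S") + ".%02d" % hundredths
    return moment.strftime(FORMATS[now_type])
```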

thegmu_nextcloud_tools.script.initlock(lockpath)[source]

Use a lock file to ensure only a single instance of a script is running at any one time.

  1. The process id, PID, is the first and only line of the lock file.

  2. Subsequent calls to initlock() check whether the process corresponding to the PID is running. If it is not, the lock is acquired; otherwise the program exits.

Note

initlock fails if Linux account permission is denied by file permissions.

Parameters

lockpath – The file name to hold the PID, for example: /tmp/myscript.sh.lock
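The PID lock file pattern might look like the sketch below. The names pid_is_running and acquire_lock are hypothetical. Unlike initlock(), which exits the program, this sketch returns False so the caller decides what to do. It targets Linux/Unix and does not guard against two processes racing to create the file.

```python
import os


def pid_is_running(pid):
    """Return True if a process with this PID exists (Unix only)."""
    try:
        os.kill(pid, 0)  # Signal 0 tests for existence; nothing is sent.
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # The process exists but belongs to another user.
    return True


def acquire_lock(lockpath):
    """Return True if the lock was taken, False if a live process holds it."""
    if os.path.exists(lockpath):
        with open(lockpath) as handle:
            first_line = handle.readline().strip()
        if first_line.isdigit() and pid_is_running(int(first_line)):
            return False
    # No lock file, or a stale one: record our PID as the only line.
    with open(lockpath, "w") as handle:
        handle.write("%d\n" % os.getpid())
    return True
```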

thegmu_nextcloud_tools.script.msg_error_code(msg, error_code)[source]

Create a new msg with ‘ERRORCODE error_code:’ for keyword log parsing.

Parameters
  • msg – one line log message.

  • error_code – any string but typically digits only.

thegmu_nextcloud_tools.script.print_dashes(msg, for_return=False)[source]

Call print_header() with char ‘-’

thegmu_nextcloud_tools.script.print_hashes(msg, for_return=False)[source]

Call print_header() with char ‘#’

thegmu_nextcloud_tools.script.print_header(char, msg, for_return=False, width=80)[source]

Print 3 lines per message using character line separators of the specified char, for example:

++++++++++++++++++++++++++++++++++++++++
print_header passing char as '+'.
++++++++++++++++++++++++++++++++++++++++

Parameters
  • char – The character to repeat.

  • msg – A one line log message.

  • for_return – If True return as string.

  • width – How many char to repeat, default is 80.
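A minimal sketch of the three-line header, assuming that for_return=False prints and returns nothing. The name header is hypothetical.

```python
def header(char, msg, for_return=False, width=80):
    """Build a separator line, the message, then the separator again."""
    line = char * width
    text = "\n".join((line, msg, line))
    if for_return:
        return text
    print(text)
    return None
```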

thegmu_nextcloud_tools.script.print_sql_comment(msg, for_return=False)[source]

Prefix ‘--’ to every line passed in msg.

Parameters
  • msg – A multi-line log message.

  • for_return – If True then return the string.

thegmu_nextcloud_tools.script.runcmd(cmd, console=False, exception_continue=False, encoding=None, return_code=False)[source]

Characterized subprocess.check_call/check_output design patterns.

Note

stderr is always redirected to stdout.

Parameters
  • cmd – A Linux command.

  • console – If True then cmd is printed first, then output is sent to stdout.

  • exception_continue – If True exceptions become warnings and execution continues.

  • encoding – binary bytes are returned unless encoding is passed as ‘utf-8’, ‘ascii’ or other encoding.

  • return_code – Only return the integer return code value when console is True or an exception occurs.
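The check_output pattern with stderr merged into stdout can be sketched as follows. The name run_command and its (return code, output) return value are assumptions for illustration; the console and exception_continue behaviour of runcmd() is not modelled.

```python
import subprocess


def run_command(cmd, encoding=None):
    """Run a shell command string and return (return_code, output).

    stderr is redirected to stdout. Output is bytes unless an encoding
    such as 'utf-8' is given.
    """
    try:
        output = subprocess.check_output(
            cmd, shell=True, stderr=subprocess.STDOUT, encoding=encoding
        )
        return 0, output
    except subprocess.CalledProcessError as error:
        # A non-zero exit still carries the captured output.
        return error.returncode, error.output
```

Because shell=True passes the string to the shell, only trusted command strings should be run this way.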

thegmu_nextcloud_tools.script.success(msg='success', eol=True, success_name=None)[source]

Print function name with ‘success’ or an optional msg to stdout.

Parameters
  • msg – optional log message.

  • eol – if True then append os.linesep.

  • success_name – Substitute the function name with this string.

thegmu_nextcloud_tools.script.timestamp(msg='TIMESTAMP', for_return=False)[source]

Print a log message prefixed with the timestamp, file name, function name and line number.

Example:

[2019-04-01-11:02:07] case.py.run.605 % hello

Parameters
  • msg – A one line log message.

  • for_return – If True return the msg.

thegmu_nextcloud_tools.script.warn(msg, eol=True, warn_prefix=None)[source]

Prints timestamp(‘WARNING’) to stderr along with the msg. Example:

[2019-04-01-11:07:25] system_test.py.test00_python_stuff.50 % WARNING hello

Parameters
  • msg – A one line log message.

  • eol – If True append os.linesep to the msg.

  • warn_prefix – Replace timestamp(‘WARNING’) with this string.

thegmu_davfs module

thegmu_davfs.py

The TheGMUNextCloudGitSync davfs object.

  1. Mount as root only for security purposes. Directory is not accessible by anyone other than root.

  2. Accommodate quirks with davfs.

  3. Primary usage is to sync files between a remote server and local server. Therefore the APIs copy and delete relative to the root directories being synced.

Note

Don’t use Linux rsync with davfs for gigabyte or larger copies. This object times out after 30 seconds due to vagaries of DavFS and nextCloud processing speeds.

class thegmu_nextcloud_tools.thegmu_davfs.TheGMUDavFS(url, mount_dir, thegmu_log=None, verbose=False)[source]

Bases: object

DavFS mount/umount and copy.

MOUNT_CMD = 'mount -t davfs %s %s < /dev/null'
SUDO = True
UMOUNT_CMD = 'umount %s'

davfs_copy(dav_root, copy_dir, copy_path)[source]

Copy file ‘copy_dir/copy_path’ from local disk to a mounted DavFS directory. The file in DavFS will have the same relative path as copy_path. We rely on Linux commands because we haven’t tested the reliability of the Python libraries and this needs to be done by root.

davfs_delete(dav_root, delete_path)[source]

Delete a file ‘dav_root/delete_path’ from a mounted DavFS directory. We rely on Linux commands because we haven’t tested the reliability of the Python libraries and this needs to be done by root.

davfs_mount()[source]

root mount a davfs mount point.

Note

Credentials need to be set already in /etc/davfs/secrets.

davfs_unmount(fail_okay=False, console=True)[source]

umount a davfs mount point.

delete_files(dav_root, files_to_delete)[source]

Delete files in ‘files_to_delete’, paths must be relative to dav_root.

make_sudo(cmd)[source]

Append sudo to a string.

up_check_or_exit()[source]

Ensure the nextCloud DavFS service is available. The program exits if the server is not up.

thegmu_log module

thegmu_log.py

Standard Python Logging extension.

  1. Interleave dependent package messages using a unique three letter acronym name.

  2. Configure logging to taste.

  3. Default configuration resolves all method names to four characters. This prevents a wavy indent, since ‘warning’ is seven characters, ‘debug’ is five and ‘info’ is four.

  4. If you prefer the standard Python logging names then just create a context as such and pass that in.

  5. GLOBAL class variables are used because the logging module is global state.
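The effect of four-character level names can be shown with the standard logging module alone. This sketch swaps in a custom Formatter rather than TheGMULog's context mechanism. The names FourCharFormatter and make_logger are hypothetical, the level numbers and short names come from DEFAULT_CONTEXT_YAML, and the timestamp is left out of the format.

```python
import io
import logging

# Level order -> four-character name, per the default configuration.
METHOD_NAMES = {10: "debu", 15: "temp", 20: "test", 30: "prog", 40: "warn", 50: "crit"}


class FourCharFormatter(logging.Formatter):
    """Render every level name as a fixed-width four-character label."""

    def format(self, record):
        record.levelname = METHOD_NAMES.get(record.levelno, record.levelname)
        return super().format(record)


def make_logger(tla, stream):
    """Create a logger named by a three letter acronym, writing to stream."""
    logger = logging.getLogger(tla)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(FourCharFormatter("%(name)s.%(levelname)s.%(message)s"))
    logger.addHandler(handler)
    logger.setLevel(10)
    return logger


buffer = io.StringIO()
log = make_logger("ngs", buffer)
log.log(30, "copy started")
log.log(10, "md5sum checked")
```

Each message then starts with a prefix of the same width, such as ngs.prog. or ngs.debu., so interleaved lines stay aligned.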

class thegmu_nextcloud_tools.thegmu_log.TheGMULog(tla=None, context=None)[source]

Bases: object

TheGMULog builds on Python logging which is a quasi singleton design pattern with context switching. YAML configuration is provided.

CLITLA = 'cli'
CURRENT_CONTEXT = {}
DEFAULT_CONTEXT_YAML = "\ncontext: null\ndefault_context: null\ndefault_level: PROGRESS\ndefault_level_environment_variable: GMUPYLOGLEVEL\nmessage_format: '[%(asctime)s] %(name)s.%(levelname)s.%(message)s'\ndate_format: '%m/%d/%Y %H:%M:%S'\nlevel:\n DEBUG:\n method: debu\n order: 10\n TEMP:\n method: temp\n order: 15\n TEST:\n method: test\n order: 20\n PROGRESS:\n method: prog\n order: 30\n WARNING:\n method: warn\n order: 40\n CRITICAL:\n method: crit\n order: 50\nlevel_name_level: null\nlevel_name: null\nlog_formatter: null\nmethod_name: null\nmethod_name_level_name: null\ntla: cli\ntimestamp: null\n "
LOG_FRAME_DEPTH = 2
STDOUT_STREAMHANDLER = <StreamHandler <stdout> (NOTSET)>
TLA_CONTEXT = {}

context_check(context)[source]

Validate the state of the context passed in.

context_switch(context)[source]

Set up the logging.

Parameters

context – is a Python dictionary initially copied from DEFAULT_CONTEXT_YAML.

static get_default_context_copy()[source]

copy.deepcopy the DEFAULT_CONTEXT_YAML.

get_level_name()[source]

Return the current level name as string, e.g. “DEBUG”.

get_level_name_for_method_name(method_name)[source]

Map a method name to its level string.

Parameters

method_name – the name to map. It is an exception to pass an unregistered method_name.

get_level_name_method_name(level_name)[source]

Map level name string to a class method.

Parameters

level_name – The level name to map. It is an exception to pass an unconfigured level.

classmethod get_log_frame()[source]

Stack trace log frame of current function.

is_current_context(context)[source]

Timestamp is checked in case the context for the tla has been updated.

log_by_level_name(msg)[source]

logging level output using a string instead of constant.

Parameters

msg – Typically a one line log message.

classmethod remove_all_loggers()[source]

Update the global state in the Python package ‘logging’.

remove_existing_logger_handlers()[source]

Sets up the logging Stream handler for the current TLA

set_current_context(current_context)[source]

Update the class global state with the passed-in state.

Parameters

current_context – The new context information to copy verbatim.

set_level_from_string(level_name)[source]

Python logging uses integers, set the integer value mapped to this string.

Parameters

level_name – A previously configured level_name mapped to a level integer. It is an exception to pass an unregistered level_name.

exception thegmu_nextcloud_tools.thegmu_log.TheGMULogException[source]

Bases: Exception

TheGMULog class exception.

thegmu_nextcloud_tools.thegmu_nextcloud_git_sync module

thegmu_nextcloud_git_sync.py

Mirror a git repository, as the system of record, to a dedicated Next Cloud account.

  1. Files are virtually read only in that any Next Cloud uploads are reverted back to the git versions.

  2. Transfer directory will accept uploads. It is expected that Transfer files will be eventually added to the git repository. The Transfer directory is designed with a daily or weekly removal of all uploads.

class thegmu_nextcloud_tools.thegmu_nextcloud_git_sync.TheGMUNextCloudGitSync[source]

Bases: object

Command line script for syncing a Git repo with Next Cloud.

TEMPLATE_YAML = 'data/thegmu_nextcloud_git_sync.template.yaml'
TLA = 'ngs'
USAGE = '\n thegmu_nextcloud_git_sync [-t,--today_report, -y,--yesterday_report, -v,--verbose] thegmu_nextcloud_git_sync.yaml\n\n Synchronize files between a git repository and an NextCloud installation.\n The git repository and NextCloud directories are specified in a YAML configuration file.\n\n'
copy_git_files(local_test_dir=None)[source]

Copy git files to the NextCloud mounted DavFS directory.

create_template_file(file_name_only=False)[source]

Copy data/thegmu_nextcloud_git_sync.template.yaml to the current directory.

Parameters

file_name_only – If True return the file name only and do not create the file.

delete_nextcloud_empty_dirs(dav_root, include_dirs=None, exclude_dirs=None)[source]

Delete any empty Next Cloud directories created. Returns the number of directories deleted.

Parameters
  • include_dirs – filter based on a list of dirs to include.

  • exclude_dirs – filter based on a list of dirs to exclude.

delete_nextcloud_files(dav_root)[source]

Delete any new files from Next Cloud uploads.

get_davfs_mount_dir()[source]

get_davfs_mount_dir: configuration determined davfs mount directory. Test uses a different directory than production.

get_mail_report(day=None, previous_day=False, config_file=None)[source]

This will create a text buffer listing all the errors and processed files for a day. Specify previous_day as True to run after midnight for the previous day.

  1. Report all errors first.

  2. Report processed files of specified day only.

  3. The date format is YYYYMMDD, for example 20190308.

get_ubuntu_version()[source]

Ubuntu 20 work around for umount seg fault requires version check.

git_pull_nextcloud()[source]

use sudo -c and execute the command as the designated user.

init(argv)[source]

init_args, init_davfs, init_cfg, init_sync

init_args(argv=None)[source]

argparse.ArgumentParser

Parameters

argv – If None then sys.argv is used. Passing this parameter is for testing purposes.

init_cfg(program_config_file=None)[source]

init_cfg: load the YAML configuration file or die trying.

init_davfs()[source]

Init DavFS with command line args.

init_sync()[source]

init_sync_data and output files.

run(argv=None)[source]

run: sync Next Cloud with Git step by step.

run_step_copy_files()[source]

Step 7: Copy files with a 30 second time check.

run_step_delete()[source]

Step 6: Delete files and dirs in NextCloud without any time limit.

run_step_git_pull()[source]

Step 1. GIT pull source directory.

run_step_init(argv)[source]

Start a new synchronization run.

run_step_set_copy_files()[source]

Step 5. Set the git files to copy either because they are new or the md5sums are different.

run_step_set_delete_files()[source]

Step 4. Set the NextCloud files for delete that are not found in git.

run_step_set_git_files()[source]

Step 2. Set sync files using white list of top level directories and black list of exclusion directories like “.git”. Then set the md5sums.

run_step_set_nextcloud_files()[source]

Step 3. Set existing NextCloud files to all the files except those excluded.

run_step_validate_files()[source]

Step 8: Validate, need to flush the DavFS cache first.

run_step_write_files()[source]

Step 9: write output files.

validate_nextcloud_files()[source]

  1. Time out files.

  2. md5sum errors.

  3. missing files (delete worked but copy failed).

write_errors()[source]

Write errors out in sorted timestamp order.

write_processed_files()[source]

Write files processed out in sorted timestamp order.

class thegmu_nextcloud_tools.thegmu_nextcloud_git_sync.TheGMUNextCloudGitSyncException[source]

Bases: object

Exceptions specific to this module.

thegmu_nextcloud_tools.thegmu_nextcloud_git_sync.main(argv)[source]

main: system test method with new object, use run() on same object.