ymp package¶
-
ymp.
get_config
()[source]¶ Access the current YMP configuration object.
This object might change once during normal execution: it is deleted before passing control to Snakemake. During unit test execution the object is deleted between all tests.
- Return type
-
ymp.
print_rule
= 0¶ Set to 1 to show the YMP expansion process as it is applied to the next Snakemake rule definition.
>>> ymp.print_rule = 1
>>> rule broken:
>>>   ...
>>> ymp make broken -vvv
-
ymp.
snakemake_versions
= ['5.20.1']¶ List of versions this version of YMP has been verified to work with
Subpackages¶
- ymp.cli package
- ymp.stage package
- Submodules
- ymp.stage.base module
- ymp.stage.expander module
- ymp.stage.groupby module
- ymp.stage.pipeline module
- ymp.stage.project module
PandasProjectData
PandasTableBuilder
Project
KEY_BCCOL
KEY_DATA
KEY_IDCOL
KEY_READCOLS
RE_FILE
RE_REMOTE
RE_SRR
choose_fq_columns
choose_id_column
data
encode_barcode_path
fq_names
fwd_fq_names
fwd_pe_fq_names
get_fq_names
get_ids
idcol
iter_samples
minimize_variables
outputs
pe_fq_names
project_name
raw_reads_source_path
rev_pe_fq_names
runs
se_fq_names
source_cfg
source_path
unsplit_path
variables
SQLiteProjectData
- ymp.stage.reference module
- ymp.stage.stack module
- ymp.stage.stage module
Submodules¶
ymp.blast module¶
Parsers for blast output formats 6 (CSV) and 7 (CSV with comments between queries).
-
class
ymp.blast.
BlastParser
[source]¶ Bases:
object
Base class for BLAST parsers
-
FIELD_MAP
= {'% identity': 'pident', 'alignment length': 'length', 'bit score': 'bitscore', 'evalue': 'evalue', 'gap opens': 'gapopen', 'mismatches': 'mismatch', 'q. end': 'qend', 'q. start': 'qstart', 'query acc.': 'qacc', 'query frame': 'qframe', 'query length': 'qlen', 's. end': 'send', 's. start': 'sstart', 'sbjct frame': 'sframe', 'score': 'score', 'subject acc.': 'sacc', 'subject strand': 'sstrand', 'subject tax ids': 'staxids', 'subject title': 'stitle'}¶
-
FIELD_TYPE
= {'bitscore': <class 'float'>, 'evalue': <class 'float'>, 'gapopen': <class 'int'>, 'length': <class 'int'>, 'mismatch': <class 'int'>, 'pident': <class 'float'>, 'qend': <class 'int'>, 'qframe': <class 'int'>, 'qlen': <class 'int'>, 'qstart': <class 'int'>, 'score': <class 'float'>, 'send': <class 'int'>, 'sframe': <class 'int'>, 'sstart': <class 'int'>, 'staxids': <function BlastParser.tupleofint>, 'stitle': <class 'str'>}¶
-
-
class
ymp.blast.
Fmt6Parser
(fileobj)[source]¶ Bases:
ymp.blast.BlastParser
Parser for BLAST format 6 (CSV)
-
Hit
¶ alias of
BlastHit
-
field_types
= [None, None, <class 'float'>, <class 'int'>, <class 'int'>, <class 'int'>, <class 'int'>, <class 'int'>, <class 'int'>, <class 'int'>, <class 'float'>, <class 'float'>]¶
-
fields
= ['qseqid', 'sseqid', 'pident', 'length', 'mismatch', 'gapopen', 'qstart', 'qend', 'sstart', 'send', 'evalue', 'bitscore']¶ Default field types
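As a rough illustration of how the fields and field_types lists above pair up, the following sketch converts one format 6 line into a typed record. It assumes tab-separated columns (the usual layout of BLAST output format 6); the names Hit and parse_fmt6 are hypothetical and are not part of ymp.blast.

```python
from collections import namedtuple

# Column names and converters copied from the Fmt6Parser attributes above.
FIELDS = ['qseqid', 'sseqid', 'pident', 'length', 'mismatch', 'gapopen',
          'qstart', 'qend', 'sstart', 'send', 'evalue', 'bitscore']
FIELD_TYPES = [None, None, float, int, int, int,
               int, int, int, int, float, float]

# Hypothetical record type; the real parser uses its own BlastHit class.
Hit = namedtuple('Hit', FIELDS)


def parse_fmt6(lines):
    """Yield one Hit per non-empty line (assumes tab-separated columns)."""
    for line in lines:
        line = line.rstrip('\n')
        if not line:
            continue
        values = line.split('\t')
        # A converter of None means "keep the raw string".
        yield Hit(*(conv(val) if conv else val
                    for conv, val in zip(FIELD_TYPES, values)))


example = ["q1\ts1\t98.5\t200\t3\t0\t1\t200\t5\t204\t1e-50\t350.0\n"]
hits = list(parse_fmt6(example))
```

Keeping the converters in a list parallel to the column names mirrors the two class attributes and keeps the per-line work to a single zip.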
-
-
class
ymp.blast.
Fmt7Parser
(fileobj)[source]¶ Bases:
ymp.blast.BlastParser
Parses BLAST results in format ‘7’ (CSV with comments)
-
DATABASE
= '# Database: '¶
-
FIELDS
= '# Fields: '¶
-
HITSFOUND
= ' hits found'¶
-
QUERY
= '# Query: '¶
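The marker strings above suggest how a format 7 file can be read: comment lines announce the column layout, and data lines follow. The sketch below is a simplified stand-in, not the Fmt7Parser implementation. It assumes the header lists column titles separated by a comma and a space, that data lines are tab-separated, and it uses only a subset of FIELD_MAP and FIELD_TYPE; parse_fmt7 is a hypothetical name.

```python
FIELDS_PREFIX = '# Fields: '

# Subset of BlastParser.FIELD_MAP and FIELD_TYPE shown earlier.
FIELD_MAP = {'query acc.': 'qacc', 'subject acc.': 'sacc',
             '% identity': 'pident', 'evalue': 'evalue'}
FIELD_TYPE = {'pident': float, 'evalue': float}


def parse_fmt7(lines):
    """Yield one dict per hit, using the most recent '# Fields:' header."""
    fields = None
    for line in lines:
        line = line.rstrip('\n')
        if line.startswith(FIELDS_PREFIX):
            titles = line[len(FIELDS_PREFIX):].split(', ')
            fields = [FIELD_MAP[title] for title in titles]
        elif not line or line.startswith('#'):
            continue  # other comment lines (query, database, hit count)
        else:
            if fields is None:
                raise ValueError('data line before "# Fields:" header')
            values = line.split('\t')
            # Columns without a registered type stay strings.
            yield {name: FIELD_TYPE.get(name, str)(value)
                   for name, value in zip(fields, values)}


sample = [
    "# BLASTN 2.6.0+\n",
    "# Query: q1\n",
    "# Database: db\n",
    "# Fields: query acc., subject acc., % identity, evalue\n",
    "# 1 hits found\n",
    "q1\ts1\t97.0\t1e-10\n",
]
records = list(parse_fmt7(sample))
```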
-
-
ymp.blast.
reader
(fileobj, t=7)[source]¶ Creates a reader for files in BLAST format
>>> with open(blast_file) as infile:
>>>    reader = blast.reader(infile)
>>>    for hit in reader:
>>>       print(hit)
- Parameters
fileobj – iterable yielding lines in blast format
t (
int
) – number of blast format type
- Return type
ymp.blast2gff module¶
ymp.cluster module¶
Module handling talking to cluster management systems
>>> python -m ymp.cluster slurm status <jobid>
-
class
ymp.cluster.
Lsf
[source]¶ Bases:
ymp.cluster.ClusterMS
Talking to LSF
-
states
= {'DONE': 'success', 'EXIT': 'failed', 'PEND': 'running', 'POST_DONE': 'success', 'POST_ERR': 'failed', 'PSUSP': 'running', 'RUN': 'running', 'SSUSP': 'running', 'UNKWN': 'running', 'USUSP': 'running', 'WAIT': 'running'}¶
-
-
class
ymp.cluster.
Slurm
[source]¶ Bases:
ymp.cluster.ClusterMS
Talking to Slurm
-
states
= {'BOOT_FAIL': 'failed', 'CANCELLED': 'failed', 'COMPLETED': 'success', 'COMPLETING': 'running', 'CONFIGURING': 'running', 'DEADLINE': 'failed', 'FAILED': 'failed', 'NODE_FAIL': 'failed', 'PENDING': 'running', 'PREEMPTED': 'failed', 'RESIZING': 'running', 'REVOKED': 'running', 'RUNNING': 'running', 'SPECIAL_EXIT': 'running', 'SUSPENDED': 'running', 'TIMEOUT': 'failed'}¶
-
static
status
(jobid)[source]¶ Print status of job @param jobid to stdout (as needed by snakemake)
Anecdotal benchmarking shows 200ms per invocation, half used by Python startup and half by calling sacct. Using scontrol show job instead of sacct -pbs is faster by 80ms, but finished jobs are purged after an unknown time window.
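To show how the states table is meant to be used, here is a minimal sketch that maps a scheduler state string to one of the three words Snakemake expects. It does not call sacct or scontrol. The function name classify_state, the handling of trailing text such as "CANCELLED by 1000", and the fallback to "running" for unknown states are assumptions made for illustration.

```python
# Copied from the Slurm.states attribute above.
SLURM_STATES = {
    'BOOT_FAIL': 'failed', 'CANCELLED': 'failed', 'COMPLETED': 'success',
    'COMPLETING': 'running', 'CONFIGURING': 'running', 'DEADLINE': 'failed',
    'FAILED': 'failed', 'NODE_FAIL': 'failed', 'PENDING': 'running',
    'PREEMPTED': 'failed', 'RESIZING': 'running', 'REVOKED': 'running',
    'RUNNING': 'running', 'SPECIAL_EXIT': 'running', 'SUSPENDED': 'running',
    'TIMEOUT': 'failed',
}


def classify_state(state_text, default='running'):
    """Translate a raw state string into success, failed or running."""
    words = state_text.split()
    if not words:
        return default  # assumption: no information means "still running"
    # Only the first word is looked up, so "CANCELLED by 1000" still matches.
    return SLURM_STATES.get(words[0], default)


statuses = [classify_state(s) for s in
            ('COMPLETED', 'CANCELLED by 1000', 'PENDING', '')]
```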
-
ymp.common module¶
Collection of shared utility classes and methods
-
class
ymp.common.
AttrDict
[source]¶ Bases:
dict
AttrDict adds accessing stored keys as attributes to dict
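The idea can be sketched in a few lines. This is an illustrative stand-in named AttrDictSketch, not the ymp.common.AttrDict source; in particular, how the real class treats missing keys or attribute assignment is an assumption here.

```python
class AttrDictSketch(dict):
    """dict whose keys can also be read and written as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


cfg = AttrDictSketch(threads=4)
cfg.memory = '8g'          # stored as a regular dict item
threads = cfg.threads      # read back through attribute access
```

Raising AttributeError for missing keys keeps hasattr() and getattr() with a default working as usual.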
-
class
ymp.common.
CacheDict
(cache, name, *args, loadfunc=None, itemloadfunc=None, itemdata=None, **kwargs)[source]¶ Bases:
ymp.common.AttrDict
-
class
ymp.common.
MkdirDict
[source]¶ Bases:
ymp.common.AttrDict
Creates directories as they are requested
ymp.config module¶
-
class
ymp.config.
ConfigExpander
(config_mgr)[source]¶ Bases:
ymp.snakemake.ColonExpander
-
class
Formatter
(expander)[source]¶ Bases:
ymp.snakemake.FormatExpander.Formatter
,ymp.string.PartialFormatter
-
class
ymp.config.
ConfigMgr
(root, conffiles)[source]¶ Bases:
object
Manages workflow configuration
This is a singleton object of which only one instance should be around at a given time. It is available in the rules files as icfg and via ymp.get_config() elsewhere.
ConfigMgr loads and maintains the workflow configuration as given in the ymp.yml files located in the workflow root directory, the user config folder (~/.ymp) and the installation etc folder.
-
CONF_DEFAULT_FNAME
= '/home/docs/checkouts/readthedocs.org/user_builds/ymp/checkouts/stable/src/ymp/etc/defaults.yml'¶
-
CONF_FNAME
= 'ymp.yml'¶
-
CONF_USER_FNAME
= '/home/docs/.ymp/ymp.yml'¶
-
KEY_PIPELINES
= 'pipelines'¶
-
KEY_PROJECTS
= 'projects'¶
-
KEY_REFERENCES
= 'references'¶
-
RULE_MAIN_FNAME
= '/home/docs/checkouts/readthedocs.org/user_builds/ymp/checkouts/stable/src/ymp/rules/Snakefile'¶
-
property
absdir
¶ Dictionary of absolute paths of named YMP directories
-
property
cluster
¶ The YMP cluster configuration.
-
property
conda
¶
-
property
dir
¶ Dictionary of relative paths of named YMP directories
The directory paths are relative to the YMP root workdir.
-
property
ensuredir
¶ Dictionary of absolute paths of named YMP directories
Directories will be created on the fly as they are requested.
-
classmethod
find_config
()[source]¶ Locates ymp config files and ymp root
The root ymp work dir is determined as the first (parent) directory containing a file named ConfigMgr.CONF_FNAME (default ymp.yml).
The stack of config files comprises
1. the default config ConfigMgr.CONF_DEFAULT_FNAME (etc/defaults.yml in the ymp package directory),
2. the user config ConfigMgr.CONF_USER_FNAME (~/.ymp/ymp.yml) and
3. the ymp.yml in the ymp root.
- Returns
root: Root working directory
conffiles: list of active configuration files
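The root lookup described above amounts to walking up the directory tree until the config file name is found. The sketch below shows that walk under stated assumptions: find_root is a hypothetical helper, it returns None when nothing is found, and it does not assemble the stack of config files.

```python
import tempfile
from pathlib import Path

CONF_FNAME = 'ymp.yml'  # same value as ConfigMgr.CONF_FNAME


def find_root(start, conf_fname=CONF_FNAME):
    """Return the first directory at or above start containing conf_fname."""
    start = Path(start).resolve()
    for directory in (start, *start.parents):
        if (directory / conf_fname).is_file():
            return directory
    return None  # assumption: the real method may handle this differently


# Demonstration in a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / CONF_FNAME).write_text('projects: {}\n')
    nested = root / 'a' / 'b'
    nested.mkdir(parents=True)
    found_ok = find_root(nested) == root
```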
-
property
limits
¶ The YMP limits configuration.
-
mem
(base='0', per_thread=None, unit='m')[source]¶ Clamp memory to configuration limits
- Parameters
base – base memory requested
per_thread – additional mem required per allocated thread
unit – output unit (b, k, m, g, t)
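A possible reading of "clamp memory to configuration limits" is sketched below. The helper names, the threads, min_mem and max_mem parameters, the binary unit factors and the treatment of bare numbers as bytes are all assumptions for illustration; the real method takes its limits from the YMP configuration and may differ in each of these details.

```python
UNITS = {'b': 1, 'k': 2**10, 'm': 2**20, 'g': 2**30, 't': 2**40}


def parse_mem(text):
    """Convert strings such as '512m' or '4g' to bytes."""
    text = str(text).strip().lower()
    if text and text[-1] in UNITS:
        return int(float(text[:-1]) * UNITS[text[-1]])
    return int(float(text))  # assumption: a bare number means bytes


def clamp_mem(base='0', per_thread=None, threads=1, unit='m',
              min_mem='0', max_mem='16g'):
    """Add up the request, clamp it to [min_mem, max_mem], convert to unit."""
    total = parse_mem(base)
    if per_thread is not None:
        total += parse_mem(per_thread) * threads
    total = max(parse_mem(min_mem), min(total, parse_mem(max_mem)))
    return total // UNITS[unit]


within_limit = clamp_mem('4g', unit='m')
with_threads = clamp_mem('1g', per_thread='512m', threads=4, unit='g')
over_limit = clamp_mem('64g', unit='g', max_mem='16g')
```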
-
property
pairnames
¶
-
property
pipeline
¶ Configure pipelines
-
property
platform
¶ Name of current platform (macos or linux)
-
property
ref
¶ Configure references
-
property
shell
¶ The shell used by YMP
Change by adding e.g. shell: /path/to/shell to ymp.yml.
-
property
snakefiles
¶ Snakefiles used under this config in parsing order
-
-
class
ymp.config.
OverrideExpander
(cfgmgr)[source]¶ Bases:
ymp.snakemake.BaseExpander
Apply rule attribute overrides from ymp.yml config
Example
Set the wordsize parameter in the bmtagger_bitmask rule to 12:
overrides:
  rules:
    bmtagger_bitmask:
      params:
        wordsize: 12
-
expand
(rule, ruleinfo, **kwargs)[source]¶ Expands RuleInfo object and children recursively.
Will call format() (via format_annotated()) on str items encountered in the tree and wrap encountered functions to be called once the wildcards object is available.
Set ymp.print_rule = 1 before a rule: statement in snakefiles to enable debug logging of recursion.
- Parameters
rule – The snakemake.rules.Rule object to be populated with the data from the RuleInfo object passed from item
item – The item to be expanded. Initially a snakemake.workflow.RuleInfo object into which is recursively descended. May ultimately be None, str, function, int, float, dict, list or tuple.
expand_args – Parameters passed on late expansion (when the dag tries to instantiate the rule into a job).
rec – Recursion level
-
ymp.download module¶
-
class
ymp.download.
FileDownloader
(block_size=4096, timeout=300, parallel=4, loglevel=30, alturls=None, retry=3)[source]¶ Bases:
object
Manages download of a set of URLs
Downloads happen concurrently using asynchronous network IO.
- Parameters
block_size (int) – Byte size of chunks to download
timeout (int) – Aiohttp cumulative timeout
parallel (int) – Number of files to download in parallel
loglevel (int) – Log level for messages sent to logging (errors are sent with loglevel+10)
alturls – List of regexps modifying URLs
retry (int) – Number of times to retry download
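The interplay of the parallel and retry parameters can be illustrated with plain asyncio. The sketch below is not the FileDownloader implementation and makes no network requests: fetch_all and fake_fetch are hypothetical names, the downloader is replaced by a stand-in coroutine, and retry is interpreted as the total number of attempts, which is an assumption.

```python
import asyncio


async def fetch_all(urls, fetch, parallel=4, retry=3):
    """Run fetch(url) for every URL, at most `parallel` at a time."""
    semaphore = asyncio.Semaphore(parallel)

    async def fetch_one(url):
        async with semaphore:
            for attempt in range(1, retry + 1):
                try:
                    return await fetch(url)
                except OSError:
                    if attempt == retry:
                        return None  # give up after the last attempt

    # gather() preserves the order of the input URLs.
    return await asyncio.gather(*(fetch_one(url) for url in urls))


attempts = {}


async def fake_fetch(url):
    """Stand-in for a real download; fails once for URLs ending in 'flaky'."""
    attempts[url] = attempts.get(url, 0) + 1
    await asyncio.sleep(0)
    if url.endswith('flaky') and attempts[url] < 2:
        raise OSError('temporary failure')
    return f'content of {url}'


results = asyncio.run(fetch_all(['a', 'b/flaky'], fake_fetch))
```

A semaphore bounds concurrency without needing a worker pool, which keeps the retry loop local to each URL.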
-
error
(msg, *args, **kwargs)[source]¶ Send error to logger
Message is sent with a log level 10 higher than the default for this object.
- Return type
None
-
log
(msg, *args, modlvl=0, **kwargs)[source]¶ Send message to logger
Honors loglevel set for the FileDownloader object.
ymp.env module¶
This module manages the conda environments.
-
class
ymp.env.
CondaPathExpander
(config, *args, **kwargs)[source]¶ Bases:
ymp.snakemake.BaseExpander
Applies search path for conda environment specifications
File names supplied via rule: conda: "some.yml" are replaced with absolute paths if they are found in any searched directory. Each search_paths entry is appended to the directory containing the top level Snakefile and the directory checked for the filename. Thereafter, the stack of including Snakefiles is traversed backwards. If no file is found, the original name is returned.
-
class
ymp.env.
Env
(env_file=None, dag=None, singularity_img=None, container_img=None, cleanup=None, name=None, packages=None, base='none', channels=None, rule=None)[source]¶ Bases:
ymp.snakemake.WorkflowObject
,snakemake.deployment.conda.Env
Represents YMP conda environment
Snakemake expects the conda environments in a per-workflow directory configured by conda_prefix. YMP sets this value by default to ~/.ymp/conda, which has a greater chance of being on the same file system as the conda cache, allowing for hard linking of environment files.
Within the folder conda_prefix, each environment is created in a folder named by the hash of the environment definition file’s contents and the conda_prefix path. This class inherits from snakemake.deployment.conda.Env to ensure that the hash we use is identical to the one Snakemake will use during workflow execution.
The class provides additional features for updating environments, creating environments dynamically and executing commands within those environments.
Note
This is not called from within the execution. Snakemake instantiates its own Env object purely based on the filename.
Creates an inline defined conda environment
- Parameters
name (Optional[str]) – Name of conda environment (and basename of file)
packages (Union[list, str, None]) – package(s) to be installed into environment. Version constraints can be specified in each package string separated from the package name by whitespace. E.g. "blast =2.6*"
channels (Union[list, str, None]) – channel(s) to be selected for the environment
base (str) – Select a set of default channels and packages to be added to the newly created environment. Sets are defined in conda.defaults in ymp.yml
-
create
(dryrun=False, force=False)[source]¶ Ensure the conda environment has been created
Inherits from snakemake.conda.Env.create
- Behavior of super class
The environment is installed in a folder in conda_prefix named according to a hash of the environment.yaml defining the environment and the value of conda-prefix (Env.hash). The latter is included as installed environments cannot be moved.
If this folder (Env.path) exists, nothing is done.
If a folder named according to the hash of just the contents of environment.yaml exists, the environment is created by unpacking the tar balls in that folder.
- Handling pre-computed environment specs
In addition to freezing environments by maintaining a copy of the package binaries, we allow maintaining a copy of the package binary URLs, from which the archive folder is populated on demand.
If a file {Env.name}.txt exists in conda.spec FIXME
-
property
installed
¶
ymp.exceptions module¶
Exceptions raised by YMP
-
exception
ymp.exceptions.
YmpConfigError
(obj, msg, key=None, exc=None)[source]¶ Bases:
ymp.exceptions.YmpNoStackException
Indicates an error in the ymp.yml config files
-
exception
ymp.exceptions.
YmpNoStackException
(message)[source]¶ Bases:
ymp.exceptions.YmpException
,click.exceptions.ClickException
Exception that does not lead to stack trace on CLI
Inheriting from ClickException makes click print only the self.msg value of the exception, rather than allowing Python to print a full stack trace.
This is useful for exceptions indicating usage or configuration errors. We use this, instead of click.UsageError and friends so that the exceptions can be caught and handled explicitly where needed.
Note that click will call the show method on this object to print the exception. The default implementation from click will just prefix the msg with Error:.
FIXME: This does not work if the exception is raised from within the snakemake workflow as snakemake.snakemake catches and reformats exceptions.
-
exception
ymp.exceptions.
YmpRuleError
(obj, msg)[source]¶ Bases:
ymp.exceptions.YmpNoStackException
Indicates an error in the rules files
This could e.g. be a Stage or Environment defined twice.
- Parameters
-
exception
ymp.exceptions.
YmpStageError
(msg)[source]¶ Bases:
ymp.exceptions.YmpNoStackException
Indicates an error in the requested stage stack
-
exception
ymp.exceptions.
YmpSystemError
(message)[source]¶ Bases:
ymp.exceptions.YmpNoStackException
Indicates problem running YMP with available system software
-
exception
ymp.exceptions.
YmpWorkflowError
(message)[source]¶ Bases:
ymp.exceptions.YmpNoStackException
Indicates an error during workflow execution
E.g. failures to expand dynamic variables
ymp.gff module¶
Implements simple reader and writer for GFF (general feature format) files.
Unfinished
only supports one version, GFF 3.2.3.
no escaping
-
class
ymp.gff.
Attributes
(ID, Name, Alias, Parent, Target, Gap, Derives_From, Note, Dbxref, Ontology_term, Is_circular)¶ Bases:
tuple
Create new instance of Attributes(ID, Name, Alias, Parent, Target, Gap, Derives_From, Note, Dbxref, Ontology_term, Is_circular)
-
property
Alias
¶ Alias for field number 2
-
property
Dbxref
¶ Alias for field number 8
-
property
Derives_From
¶ Alias for field number 6
-
property
Gap
¶ Alias for field number 5
-
property
ID
¶ Alias for field number 0
-
property
Is_circular
¶ Alias for field number 10
-
property
Name
¶ Alias for field number 1
-
property
Note
¶ Alias for field number 7
-
property
Ontology_term
¶ Alias for field number 9
-
property
Parent
¶ Alias for field number 3
-
property
Target
¶ Alias for field number 4
-
class
ymp.gff.
Feature
(seqid, source, type, start, end, score, strand, phase, attributes)¶ Bases:
tuple
Create new instance of Feature(seqid, source, type, start, end, score, strand, phase, attributes)
-
property
attributes
¶ Alias for field number 8
-
property
end
¶ Alias for field number 4
-
property
phase
¶ Alias for field number 7
-
property
score
¶ Alias for field number 5
-
property
seqid
¶ Alias for field number 0
-
property
source
¶ Alias for field number 1
-
property
start
¶ Alias for field number 3
-
property
strand
¶ Alias for field number 6
-
property
type
¶ Alias for field number 2
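The Feature tuple mirrors the nine tab-separated columns of a GFF line. The sketch below shows how such a line could be split into a tuple with the same field names. It is an illustration only: parse_gff_line is a hypothetical helper, attributes are collected in a plain dict rather than the Attributes tuple, and no escaping is handled (matching the limitation noted above).

```python
from collections import namedtuple

# Same field order as ymp.gff.Feature above.
Feature = namedtuple('Feature', ['seqid', 'source', 'type', 'start', 'end',
                                 'score', 'strand', 'phase', 'attributes'])


def parse_gff_line(line):
    """Split one GFF data line into a Feature (no escaping handled)."""
    cols = line.rstrip('\n').split('\t')
    if len(cols) != 9:
        raise ValueError(f'expected 9 columns, got {len(cols)}')
    # Column 9 holds key=value pairs separated by semicolons.
    attributes = dict(pair.split('=', 1)
                      for pair in cols[8].split(';') if pair)
    return Feature(cols[0], cols[1], cols[2], int(cols[3]), int(cols[4]),
                   cols[5], cols[6], cols[7], attributes)


feature = parse_gff_line(
    "contig1\tprodigal\tCDS\t100\t400\t.\t+\t0\tID=gene1;Name=abc\n")
```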
ymp.helpers module¶
This module contains helper functions.
Not all of these are currently in use
-
class
ymp.helpers.
OrderedDictMaker
[source]¶ Bases:
object
odict creates OrderedDict objects in a dict-literal like syntax
>>> my_ordered_dict = odict[
>>>     'key': 'value'
>>> ]
Implementation: odict uses the python slice syntax which is similar to dict literals. The [] operator is implemented by overriding __getitem__. Slices passed to the operator as object[start1:stop1:step1, start2:...], are passed to the implementation as a list of objects with start, stop and step members. odict simply creates an OrderedDict by iterating over that list.
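The slice trick described above can be reproduced in a few lines. The class below is a sketch named OrderedDictMakerSketch, not the ymp.helpers source; the special case for a single key: value pair (which arrives as one slice rather than a tuple of slices) is an assumption about how a complete implementation would need to behave.

```python
from collections import OrderedDict


class OrderedDictMakerSketch:
    """Builds an OrderedDict from slice syntax: odict['a': 1, 'b': 2]."""

    def __getitem__(self, keys):
        # A single 'key': value arrives as one slice, several as a tuple.
        if isinstance(keys, slice):
            keys = (keys,)
        # Each slice carries the key in .start and the value in .stop.
        return OrderedDict((item.start, item.stop) for item in keys)


odict = OrderedDictMakerSketch()
pairs = odict['b': 2, 'a': 1]
single = odict['key': 'value']
```

Since Python 3.7 plain dicts keep insertion order as well, so the trick is mostly of interest for code that needs OrderedDict specifically.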
ymp.nuc2aa module¶
ymp.snakemake module¶
Extends Snakemake Features
-
class
ymp.snakemake.
BaseExpander
[source]¶ Bases:
object
Base class for Snakemake expansion modules.
Subclasses should override the expand() method if they need to work on the entire RuleInfo object or the format() and expands_field() methods if they intend to modify specific fields.
-
expand
(rule, item, expand_args=None, rec=- 1, cb=False)[source]¶ Expands RuleInfo object and children recursively.
Will call format() (via format_annotated()) on str items encountered in the tree and wrap encountered functions to be called once the wildcards object is available.
Set ymp.print_rule = 1 before a rule: statement in snakefiles to enable debug logging of recursion.
- Parameters
rule – The snakemake.rules.Rule object to be populated with the data from the RuleInfo object passed from item
item – The item to be expanded. Initially a snakemake.workflow.RuleInfo object into which is recursively descended. May ultimately be None, str, function, int, float, dict, list or tuple.
expand_args – Parameters passed on late expansion (when the dag tries to instantiate the rule into a job).
rec – Recursion level
-
expands_field
(field)[source]¶ Checks if this expander should expand a Rule field type
- Parameters
field – the field to check
- Returns
True if field should be expanded.
-
-
exception
ymp.snakemake.
CircularReferenceException
(deps, rule)[source]¶ Bases:
ymp.exceptions.YmpRuleError
Exception raised if parameters in rule contain a circular reference
-
class
ymp.snakemake.
ColonExpander
[source]¶ Bases:
ymp.snakemake.FormatExpander
Expander using {:xyz:} formatted variables.
-
regex
= re.compile('\n \\{:\n (?=(\n \\s*\n (?P<name>(?:.(?!\\s*\\:\\}))*.)\n \\s*\n ))\\1\n :\\}\n ', re.VERBOSE)¶
-
spec
= '{{:{}:}}'¶
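To make the {:xyz:} syntax concrete, the sketch below substitutes such tags in a string while leaving ordinary {wildcards} alone. It uses a deliberately simplified regular expression rather than the verbose pattern shown above, and expand_colon_tags with its dictionary lookup is a hypothetical stand-in for the formatter machinery.

```python
import re

# Simplified pattern (assumption): '{:' name ':}' with optional whitespace.
COLON_TAG = re.compile(r'\{:\s*(?P<name>.+?)\s*:\}')


def expand_colon_tags(text, values):
    """Replace {:name:} tags found in values; leave everything else as is."""
    def replace(match):
        name = match.group('name')
        if name in values:
            return str(values[name])
        return match.group(0)  # unknown tag: keep it untouched
    return COLON_TAG.sub(replace, text)


result = expand_colon_tags('{:dir.reports:}/{sample}.html',
                           {'dir.reports': 'reports'})
```

Using a distinct delimiter pair means configuration values can be filled in early while Snakemake wildcards such as {sample} survive for later expansion.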
-
-
class
ymp.snakemake.
DefaultExpander
(**kwargs)[source]¶ Bases:
ymp.snakemake.InheritanceExpander
Adds default values to rules
The implementation simply makes all rules inherit from a defaults rule.
Creates DefaultExpander
Each parameter passed is considered a RuleInfo default value. Where applicable, Snakemake’s argtuples ([],{}) must be passed.
-
class
ymp.snakemake.
ExpandableWorkflow
(*args, **kwargs)[source]¶ Bases:
snakemake.workflow.Workflow
Adds hook for additional rule expansion methods to Snakemake
Constructor for ExpandableWorkflow overlay attributes
This may be called on an already initialized Workflow object.
-
classmethod
activate
()[source]¶ Installs the ExpandableWorkflow
Replaces the Workflow object in the snakemake.workflow module with an instance of this class and initializes default expanders (the snakemake syntax).
-
add_rule
(name=None, lineno=None, snakefile=None, checkpoint=False)[source]¶ Add a rule.
- Parameters
name – name of the rule
lineno – line number within the snakefile where the rule was defined
snakefile – name of file in which rule was defined
-
get_rule
(name=None)[source]¶ Get rule by name. If name is none, the last created rule is returned.
- Parameters
name – the name of the rule
-
global_workflow
= <ymp.snakemake.ExpandableWorkflow object>¶
-
classmethod
load_workflow
(snakefile='/home/docs/checkouts/readthedocs.org/user_builds/ymp/checkouts/stable/src/ymp/rules/Snakefile')[source]¶
-
class
ymp.snakemake.
FormatExpander
[source]¶ Bases:
ymp.snakemake.BaseExpander
Expander using a custom formatter object.
-
class
Formatter
(expander)[source]¶ Bases:
ymp.string.ProductFormatter
-
regex
= re.compile('\n \\{\n (?=(\n (?P<name>[^{}]+)\n ))\\1\n \\}\n ', re.VERBOSE)¶
-
spec
= '{{{}}}'¶
-
exception
ymp.snakemake.
InheritanceException
(msg, rule, parent, include=None, lineno=None, snakefile=None)[source]¶ Bases:
snakemake.exceptions.RuleException
Exception raised for errors during rule inheritance
Creates a new instance of RuleException.
Arguments
message – the exception message
include – iterable of other exceptions to be included
lineno – the line the exception originates
snakefile – the file the exception originates
-
class
ymp.snakemake.
InheritanceExpander
[source]¶ Bases:
ymp.snakemake.BaseExpander
Adds class-like inheritance to Snakemake rules
To avoid redundancy between closely related rules, e.g. rules for single ended and paired end data, YMP allows Snakemake rules to inherit from another rule.
Example
Derived rules are always created with an implicit ruleorder statement, making Snakemake prefer the parent rule if either parent or child rule could be used to generate the requested output file(s).
Derived rules initially contain the same attributes as the parent rule. Each attribute assigned to the child rule overrides the matching attribute in the parent. Where attributes may contain named and unnamed values, specifying a named value overrides only the value of that name while specifying an unnamed value overrides all unnamed values in the parent attribute.
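The override rule for named and unnamed values can be expressed on the (args, kwargs) tuples Snakemake uses for rule attributes. The function below is a sketch of that rule as stated in the paragraph above, with a hypothetical name and hypothetical file names; it is not taken from the InheritanceExpander source.

```python
def merge_argstuple(parent, child):
    """Combine a parent and child (args, kwargs) attribute tuple."""
    parent_args, parent_kwargs = parent
    child_args, child_kwargs = child
    # Any unnamed value in the child replaces all unnamed parent values.
    args = list(child_args) if child_args else list(parent_args)
    # Named values are overridden one name at a time.
    kwargs = {**parent_kwargs, **child_kwargs}
    return args, kwargs


parent_input = (['reads.fq'],
                {'index': 'ref.idx', 'adapters': 'adapters.fa'})
child_input = (['reads.R1.fq', 'reads.R2.fq'], {'index': 'other.idx'})
merged = merge_argstuple(parent_input, child_input)
```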
-
KEYWORD
= 'ymp: extends'¶ Comment keyword enabling inheritance
-
expand
(rule, ruleinfo)[source]¶ Expands RuleInfo object and children recursively.
Will call format() (via format_annotated()) on str items encountered in the tree and wrap encountered functions to be called once the wildcards object is available.
Set ymp.print_rule = 1 before a rule: statement in snakefiles to enable debug logging of recursion.
- Parameters
rule – The snakemake.rules.Rule object to be populated with the data from the RuleInfo object passed from item
item – The item to be expanded. Initially a snakemake.workflow.RuleInfo object into which is recursively descended. May ultimately be None, str, function, int, float, dict, list or tuple.
expand_args – Parameters passed on late expansion (when the dag tries to instantiate the rule into a job).
rec – Recursion level
-
-
class
ymp.snakemake.
NamedList
(fromtuple=None, **kwargs)[source]¶ Bases:
snakemake.io.Namedlist
Extended version of Snakemake’s
Namedlist
Fixes array assignment operator: Writing a field via [] operator updates the value accessed via . operator.
Adds fromtuple to constructor: Builds from Snakemake’s typical (args, kwargs) tuples as present in ruleinfo structures.
Adds update_tuple method: Updates values in (args,kwargs) tuples as present in ruleinfo structures.
Create the object.
Arguments
toclone – another Namedlist that shall be cloned
fromdict – a dict that shall be converted to a Namedlist (keys become names)
-
class
ymp.snakemake.
RecursiveExpander
[source]¶ Bases:
ymp.snakemake.BaseExpander
Recursively expands {xyz} wildcards in Snakemake rules.
-
expands_field
(field)[source]¶ Returns true for all fields but shell:, message: and wildcard_constraints.
We don’t want to mess with the regular expressions in the fields in wildcard_constraints:, and there is little use in expanding message: or shell: as these already have all wildcards applied just before job execution (by format_wildcards()).
-
-
class
ymp.snakemake.
SnakemakeExpander
[source]¶ Bases:
ymp.snakemake.BaseExpander
Expand wildcards in strings returned from functions.
Snakemake does not do this by default, leaving wildcard expansion to the functions provided themselves. Since we never want {input} to be in a string returned as a file, we expand those always.
-
class
ymp.snakemake.
WorkflowObject
(*args, **kwargs)[source]¶ Bases:
object
Base for extension classes defined from snakefiles
This currently encompasses ymp.env.Env and ymp.stage.Stage.
This mixin sets the properties filename and lineno according to the definition source in the rules file. It also maintains a registry within the Snakemake workflow object and provides an accessor method to this registry.
-
property
defined_in
¶
-
ymp.snakemake.
print_ruleinfo
(rule, ruleinfo, func=<bound method Logger.debug of <Logger ymp.snakemake (WARNING)>>)[source]¶ Logs contents of Rule and RuleInfo objects.
-
ymp.snakemake.
ruleinfo_fields
= {'benchmark': {'apply_wildcards': True, 'format': 'string'}, 'conda_env': {'apply_wildcards': True, 'format': 'string'}, 'container_img': {'format': 'string'}, 'docstring': {'format': 'string'}, 'func': {'format': 'callable'}, 'input': {'apply_wildcards': True, 'format': 'argstuple', 'funcparams': ('wildcards',)}, 'log': {'apply_wildcards': True, 'format': 'argstuple'}, 'message': {'format': 'string', 'format_wildcards': True}, 'norun': {'format': 'bool'}, 'output': {'apply_wildcards': True, 'format': 'argstuple'}, 'params': {'apply_wildcards': True, 'format': 'argstuple', 'funcparams': ('wildcards', 'input', 'resources', 'output', 'threads')}, 'priority': {'format': 'numeric'}, 'resources': {'format': 'argstuple', 'funcparams': ('wildcards', 'input', 'attempt', 'threads')}, 'script': {'format': 'string'}, 'shadow_depth': {'format': 'string_or_true'}, 'shellcmd': {'format': 'string', 'format_wildcards': True}, 'threads': {'format': 'int', 'funcparams': ('wildcards', 'input', 'attempt', 'threads')}, 'version': {'format': 'object'}, 'wildcard_constraints': {'format': 'argstuple'}, 'wrapper': {'format': 'string'}}¶ describes attributes of
snakemake.workflow.RuleInfo
ymp.snakemakelexer module¶
ymp.snakemakelexer¶
-
class
ymp.snakemakelexer.
SnakemakeLexer
(*args, **kwds)[source]¶ Bases:
pygments.lexers.python.PythonLexer
-
name
= 'Snakemake'¶
-
tokens
= {'globalkeyword': [(<pygments.lexer.words object>, Token.Keyword)], 'root': [('(rule|checkpoint)((?:\\s|\\\\\\s)+)', <function bygroups.<locals>.callback>, 'rulename'), 'rulekeyword', 'globalkeyword', ('\\n', Token.Text), ('^(\\s*)([rRuUbB]{,2})("""(?:.|\\n)*?""")', <function bygroups.<locals>.callback>), ("^(\\s*)([rRuUbB]{,2})('''(?:.|\\n)*?''')", <function bygroups.<locals>.callback>), ('\\A#!.+$', Token.Comment.Hashbang), ('#.*$', Token.Comment.Single), ('\\\\\\n', Token.Text), ('\\\\', Token.Text), 'keywords', ('(def)((?:\\s|\\\\\\s)+)', <function bygroups.<locals>.callback>, 'funcname'), ('(class)((?:\\s|\\\\\\s)+)', <function bygroups.<locals>.callback>, 'classname'), ('(from)((?:\\s|\\\\\\s)+)', <function bygroups.<locals>.callback>, 'fromimport'), ('(import)((?:\\s|\\\\\\s)+)', <function bygroups.<locals>.callback>, 'import'), 'expr'], 'rulekeyword': [(<pygments.lexer.words object>, Token.Keyword)], 'rulename': [('[a-zA-Z_]\\w*', Token.Name.Class, '#pop')]}¶
-
ymp.sphinxext module¶
This module contains a Sphinx extension for documenting YMP stages and Snakemake rules.
The SnakemakeDomain (name sm) provides the following directives:
-
.. sm:rule::
name
¶ Describes a Snakemake rule
Both directives accept an optional source parameter. If given, a link to the source code of the stage or rule definition will be added. The format of the string passed is filename:line. Referenced Snakefiles will be highlighted with pygments and added to the documentation when building HTML.
The extension also provides an autodoc-like directive:
-
.. autosnake::
filename
¶ Generates documentation from Snakefile filename.
-
class
ymp.sphinxext.
AutoSnakefileDirective
(name, arguments, options, content, lineno, content_offset, block_text, state, state_machine)[source]¶ Bases:
docutils.parsers.rst.Directive
Implements RSt directive
.. autosnake:: filename
The directive extracts docstrings from rules in snakefile and auto-generates documentation.
-
ymp.sphinxext.
BASEPATH
= '/home/docs/checkouts/readthedocs.org/user_builds/ymp/checkouts/stable/src'¶ Path in which YMP package is located
- Type
-
class
ymp.sphinxext.
DomainTocTreeCollector
[source]¶ Bases:
sphinx.environment.collectors.EnvironmentCollector
Add Sphinx Domain entries to the TOC
-
clear_doc
(app, env, docname)[source]¶ Clear data from environment
If we have cached data in environment for document docname, we should clear it here.
- Return type
None
-
merge_other
(app, env, docnames, other)[source]¶ Merge with results from parallel processes
Called if Sphinx is processing documents in parallel. We should merge this from other into env for all docnames.
- Return type
None
-
process_doc
(app, doctree)[source]¶ Process doctree
This is called by read-doctree, so after the doctree has been loaded. The signal is processed in registered first order, so we are called after built-in extensions, such as the sphinx.environment.collectors.toctree extension building the TOC.
- Return type
None
-
select_doc_nodes
(doctree)[source]¶ Select the nodes for which entries in the TOC are desired
This is a separate method so that it might be overriden by subclasses wanting to add other types of nodes to the TOC.
- Return type
List
[Node
]
-
-
class
ymp.sphinxext.
SnakemakeDomain
(env)[source]¶ Bases:
sphinx.domains.Domain
Snakemake language domain
-
data_version
= 0¶
-
directives
= {'rule': <class 'ymp.sphinxext.SnakemakeRule'>, 'stage': <class 'ymp.sphinxext.YmpStage'>}¶
-
get_objects
()[source]¶ Return an iterable of “object descriptions”.
Object descriptions are tuples with six items:
name
Fully qualified name.
dispname
Name to display when searching/linking.
type
Object type, a key in
self.object_types
.docname
The document where it is to be found.
anchor
The anchor name for the object.
priority
How “important” the object is (determines placement in search results). One of:
1
Default priority (placed before full-text matches).
0
Object is important (placed before default-priority objects).
2
Object is unimportant (placed after full-text matches).
-1
Object should not show up in search at all.
-
initial_data
= {'objects': {}}¶
-
label
= 'Snakemake'¶
-
name
= 'sm'¶
-
object_types
= {'rule': <sphinx.domains.ObjType object>, 'stage': <sphinx.domains.ObjType object>}¶
-
resolve_xref
(env, fromdocname, builder, typ, target, node, contnode)[source]¶ Resolve the pending_xref node with the given typ and target.
This method should return a new node, to replace the xref node, containing the contnode which is the markup content of the cross-reference.
If no resolution can be found, None can be returned; the xref node will then be given to the missing-reference event, and if that yields no resolution, replaced by contnode.
The method can also raise sphinx.environment.NoUri to suppress the missing-reference event being emitted.
-
roles
= {'rule': <sphinx.roles.XRefRole object>, 'stage': <sphinx.roles.XRefRole object>}¶
-
-
class
ymp.sphinxext.
SnakemakeRule
(name, arguments, options, content, lineno, content_offset, block_text, state, state_machine)[source]¶ Bases:
ymp.sphinxext.YmpObjectDescription
Directive sm:rule:: describing a Snakemake rule
-
typename
= 'rule'¶
-
-
class
ymp.sphinxext.
YmpObjectDescription
(name, arguments, options, content, lineno, content_offset, block_text, state, state_machine)[source]¶ Bases:
sphinx.directives.ObjectDescription
Base class for RSt directives in SnakemakeDomain
Since this inherits from Sphinx’ ObjectDescription, content generated by the directive will always be inside an addnodes.desc.
- Parameters
source – Specify source position as file:line to create link
-
add_target_and_index
(name, sig, signode)[source]¶ Add cross-reference IDs and entries to
self.indexnode
- Return type
None
-
handle_signature
(sig, signode)[source]¶ Parse rule signature sig into RST nodes and append them to signode.
The return value identifies the object and is passed to
add_target_and_index()
unchanged
-
option_spec
= {'source': <function unchanged>}¶
-
typename
= '[object name]'¶
-
class
ymp.sphinxext.
YmpStage
(name, arguments, options, content, lineno, content_offset, block_text, state, state_machine)[source]¶ Bases:
ymp.sphinxext.YmpObjectDescription
Directive
sm:stage::
describing a YMP stage
-
typename
= 'stage'¶
-
ymp.string module¶
-
exception
ymp.string.
FormattingError
(message, fieldname)[source]¶ Bases:
AttributeError
-
class
ymp.string.
GetNameFormatter
[source]¶ Bases:
string.Formatter
-
class
ymp.string.
OverrideJoinFormatter
[source]¶ Bases:
string.Formatter
Formatter with overridable join method
The default formatter joins all arguments with "".join(args). This class overrides _vformat() with identical code, changing only that line to one that can be overridden by a derived class.
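The idea described above can be sketched roughly as follows. This is a simplified illustration, not the YMP implementation: the names `JoinFormatterSketch`, `UpperJoinFormatter` and the `join` hook are hypothetical, automatic field numbering (`{}`) is left out, and the code relies on the private CPython method `string.Formatter._vformat`, whose signature is assumed to match recent Python 3 releases.

```python
import string


class JoinFormatterSketch(string.Formatter):
    """Hypothetical sketch: a Formatter whose final join step is a method."""

    def join(self, parts):
        # Same behaviour as the standard formatter; subclasses may override.
        return "".join(parts)

    def _vformat(self, format_string, args, kwargs, used_args,
                 recursion_depth, auto_arg_index=0):
        # Simplified copy of the stdlib loop (named and numbered fields only).
        if recursion_depth < 0:
            raise ValueError("Max string recursion exceeded")
        parts = []
        for literal, field_name, spec, conversion in self.parse(format_string):
            if literal:
                parts.append(literal)
            if field_name is not None:
                obj, used = self.get_field(field_name, args, kwargs)
                used_args.add(used)
                obj = self.convert_field(obj, conversion)
                # Expand nested fields in the format spec with the stock code.
                spec, auto_arg_index = super()._vformat(
                    spec, args, kwargs, used_args,
                    recursion_depth - 1, auto_arg_index=auto_arg_index)
                parts.append(self.format_field(obj, spec))
        # The only line that differs from the stdlib: delegate to self.join.
        return self.join(parts), auto_arg_index


class UpperJoinFormatter(JoinFormatterSketch):
    """Example subclass that changes how the pieces are combined."""

    def join(self, parts):
        return "".join(parts).upper()
```

A derived class then only has to replace `join`, as `UpperJoinFormatter` does, without touching the parsing loop.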
-
class
ymp.string.
PartialFormatter
[source]¶ Bases:
string.Formatter
Formats what it can and leaves the remainder untouched
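One way to picture "formats what it can" is the helper below. It is a minimal sketch built on the standard `string.Formatter` parsing methods, not the code of `PartialFormatter`; the function name `partial_format` is hypothetical, and nested fields inside format specs as well as escaped braces are not handled.

```python
import string


def partial_format(template, *args, **kwargs):
    """Expand the fields that can be resolved; re-emit the others verbatim."""
    formatter = string.Formatter()
    out = []
    for literal, field, spec, conversion in formatter.parse(template):
        out.append(literal)
        if field is None:
            continue  # trailing literal text without a field
        try:
            obj, _ = formatter.get_field(field, args, kwargs)
        except (KeyError, IndexError, AttributeError):
            # Unknown field: rebuild the original placeholder text.
            placeholder = "{" + field
            if conversion:
                placeholder += "!" + conversion
            if spec:
                placeholder += ":" + spec
            out.append(placeholder + "}")
        else:
            obj = formatter.convert_field(obj, conversion)
            out.append(formatter.format_field(obj, spec))
    return "".join(out)
```

For example, `partial_format("{a}/{b}.txt", a="x")` fills in `{a}` and keeps `{b}` for a later formatting pass.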
-
class
ymp.string.
ProductFormatter
[source]¶ Bases:
ymp.string.OverrideJoinFormatter
String Formatter that creates a list of strings each expanded using one point in the cartesian product of all replacement values.
If none of the arguments evaluate to lists, the result is a string, otherwise it is a list.
>>> ProductFormatter().format("{A} and {B}", A=[1,2], B=[3,4])
"1 and 3"
"1 and 4"
"2 and 3"
"2 and 4"
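The cartesian-product expansion can be approximated with `itertools.product`. The function below is a hedged sketch of the behaviour described above, not the `ProductFormatter` implementation; `product_format` is a hypothetical name, and only keyword arguments that are lists are treated as multi-valued.

```python
import itertools


def product_format(template, **values):
    """Format template once per point in the product of all list values."""
    list_keys = [key for key, val in values.items() if isinstance(val, list)]
    if not list_keys:
        # No list arguments: behave like an ordinary format call.
        return template.format(**values)
    results = []
    for combo in itertools.product(*(values[key] for key in list_keys)):
        point = dict(values)
        point.update(zip(list_keys, combo))  # one value per list argument
        results.append(template.format(**point))
    return results
```

With the arguments from the example above, this sketch returns the four strings in the same order, as a list.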
-
class
ymp.string.
RegexFormatter
(regex)[source]¶ Bases:
string.Formatter
String Formatter accepting a regular expression defining the format of the expanded tags.
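As an illustration of a formatter driven by a regular expression, the sketch below overrides the public `string.Formatter.parse` method so that tags are located by a caller-supplied pattern instead of `{...}`. The class name `RegexFormatterSketch` and the convention that the pattern contains a named group `name` are assumptions made for this example, not documented behaviour of `RegexFormatter`.

```python
import re
import string


class RegexFormatterSketch(string.Formatter):
    """Hypothetical sketch: tags are whatever the given regex matches."""

    def __init__(self, regex):
        super().__init__()
        self.regex = re.compile(regex)

    def parse(self, format_string):
        # Yield (literal_text, field_name, format_spec, conversion) tuples,
        # the protocol expected by string.Formatter.
        start = 0
        for match in self.regex.finditer(format_string):
            yield format_string[start:match.start()], match.group("name"), "", None
            start = match.end()
        yield format_string[start:], None, None, None
```

With the pattern `<(?P<name>\w+)>`, the string `"<sample>.fq"` is expanded using the keyword argument `sample`, while curly braces are treated as ordinary text.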
ymp.util module¶
-
ymp.util.
R
(code='', **kwargs)[source]¶ Execute R code
This function executes the R code given as a string. Additional arguments are injected into the R environment. The value of the last R statement is returned.
The function requires rpy2 to be installed.
- Parameters
- Yields
value of last R statement
>>> R("1*1", input=input)
-
ymp.util.
file_not_empty
(fn)[source]¶ Checks if a file is not empty, accounting for the gz minimum size of 20 bytes
-
ymp.util.
filter_out_empty
(*args)[source]¶ Removes empty sets of files from input file lists.
Takes a variable number of file lists of equal length and removes indices where any of the files is empty. Strings are converted to lists of length 1.
Returns a generator tuple.
Example: r1, r2 = filter_out_empty(input.r1, input.r2)
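The two helpers above can be sketched as follows. This is an approximation under stated assumptions, not the YMP source: the function names carry a `_sketch` suffix to mark them as hypothetical, gzip files are recognised by their `.gz` suffix, and a `.gz` file of at most 20 bytes is treated as empty.

```python
import os


def file_not_empty_sketch(path):
    """True if the file has content; .gz files need more than 20 bytes."""
    size = os.path.getsize(path)
    if str(path).endswith(".gz"):
        return size > 20  # assumed threshold: size of an empty gzip stream
    return size > 0


def filter_out_empty_sketch(*args):
    """Drop every index at which any of the parallel file lists is empty."""
    # A single file name is treated as a list of length 1.
    lists = [[arg] if isinstance(arg, str) else list(arg) for arg in args]
    kept = [group for group in zip(*lists)
            if all(file_not_empty_sketch(name) for name in group)]
    # Transpose back: one tuple of surviving files per input list.
    return zip(*kept)
```

Unpacking as in `r1, r2 = filter_out_empty_sketch(input.r1, input.r2)` works in this sketch as long as at least one index survives the filter.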
ymp.yaml module¶
-
class
ymp.yaml.
AttrItemAccessMixin
[source]¶ Bases:
object
Mixin class mapping dot to bracket access
Added to classes implementing __getitem__, __setitem__ and __delitem__, this mixin will allow accessing items using dot notation. I.e. “object.xyz” is translated to “object[xyz]”.
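A mixin of this kind can be sketched in a few lines. The class names `AttrItemAccessSketch` and `DotDict` are hypothetical, and the real `AttrItemAccessMixin` may differ in details such as error handling.

```python
class AttrItemAccessSketch:
    """Hypothetical sketch: forward attribute access to item access."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value

    def __delattr__(self, name):
        try:
            del self[name]
        except KeyError:
            raise AttributeError(name) from None


class DotDict(AttrItemAccessSketch, dict):
    """A dict that also accepts dot notation for its keys."""
```

With this in place, `cfg.threads` and `cfg["threads"]` refer to the same entry of a `DotDict` instance `cfg`.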
-
exception
ymp.yaml.
LayeredConfAccessError
[source]¶ Bases:
ymp.yaml.LayeredConfError, KeyError, IndexError
Can’t access
-
class
ymp.yaml.
LayeredConfProxy
(maps, parent=None, key=None)[source]¶ Bases:
ymp.yaml.MultiMapProxy
Layered configuration
-
exception
ymp.yaml.
LayeredConfWriteError
[source]¶ Bases:
ymp.yaml.LayeredConfError
Can’t write
-
class
ymp.yaml.
MultiMapProxy
(maps, parent=None, key=None)[source]¶ Bases:
collections.abc.Mapping, ymp.yaml.MultiProxy, ymp.yaml.AttrItemAccessMixin
Mapping Proxy for layered containers
-
class
ymp.yaml.
MultiMapProxyItemsView
(mapping)[source]¶ Bases:
ymp.yaml.MultiMapProxyMappingView, collections.abc.ItemsView
ItemsView for MultiMapProxy
-
class
ymp.yaml.
MultiMapProxyKeysView
(mapping)[source]¶ Bases:
ymp.yaml.MultiMapProxyMappingView, collections.abc.KeysView
KeysView for MultiMapProxy
-
class
ymp.yaml.
MultiMapProxyMappingView
(mapping)[source]¶ Bases:
collections.abc.MappingView
MappingView for MultiMapProxy
-
class
ymp.yaml.
MultiMapProxyValuesView
(mapping)[source]¶ Bases:
ymp.yaml.MultiMapProxyMappingView, collections.abc.ValuesView
ValuesView for MultiMapProxy
-
class
ymp.yaml.
MultiProxy
(maps, parent=None, key=None)[source]¶ Bases:
object
Base class for layered container structure
-
class
ymp.yaml.
MultiSeqProxy
(maps, parent=None, key=None)[source]¶ Bases:
collections.abc.Sequence, ymp.yaml.MultiProxy, ymp.yaml.AttrItemAccessMixin
Sequence Proxy for layered containers
-
ymp.yaml.
load
(files)[source]¶ Load configuration files
Creates a
LayeredConfProxy
configuration object from a set of YAML files.
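The general idea of a layered configuration can be illustrated with the standard library's `collections.ChainMap`. This is only an analogy, not how `LayeredConfProxy` is implemented: the two dictionaries stand in for parsed YAML files, their names and keys are made up, and the rule that the first layer wins is a property of `ChainMap`, not a documented property of YMP.

```python
from collections import ChainMap

# Hypothetical layers, standing in for the contents of two YAML files.
defaults = {"threads": 1, "tmpdir": "/tmp"}
project = {"threads": 8}

# Lookups search the layers in order; the first layer holding the key wins.
config = ChainMap(project, defaults)

threads = config["threads"]  # found in the project layer
tmpdir = config["tmpdir"]    # falls through to the defaults layer
```

The underlying dictionaries are left unmodified, which is the main appeal of presenting several files through one proxy object.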