Flow

Elements:

Cache(filename[, method, protocol]) Cache flow passing through.
DropContext(*args) A Sequence that transforms a (data, context) flow so that only data remains in the inner sequence.
End Stop sequence here.
Print([before, sep, end, transform]) Print values passing through.

Functions:

get_context(value) Get context from a possible (data, context) pair.
get_data(value) Get data from value (a possible (data, context) pair).
get_data_context(value) Get (data, context) from value (a possible (data, context) pair).
seq_map(seq, container[, one_result]) Map Lena Sequence seq to the container.

Group plots:

GroupBy(group_by) Group data.
GroupPlots(group_by, select[, transform, …]) Group several plots.
GroupScale(scale_to[, allow_zero_scale, …]) Scale a group of data.
Selector(selector) Determine whether an item should be selected.

Iterators:

Chain(*iterables) Chain generators.
CountFrom([start, step]) Generate numbers from start to infinity, with step between values.
ISlice(*args) Slice iterable from start to stop with step.

Split into bins:

SplitIntoBins(seq, arg_func, edges[, transform]) Split analysis into bins.

Elements

Elements form Lena sequences. This group contains miscellaneous elements that do not fit into other categories.

class Cache(filename, method='cPickle', protocol=2)

Cache flow passing through.

On the first run, dump all flow to file (and yield the flow unaltered). On subsequent runs, load all flow from that file in the original order.

Example:

s = Source(
    ReadFiles(),
    ReadEvents(),
    MakeHistograms(),
    Cache("histograms.pkl"),
    MakeStats(),
    Cache("stats.pkl"),
)

If stats.pkl exists, Cache will read the data flow from that file and no other processing will be done. If the stats.pkl cache doesn’t exist, but the cache for histograms exists, it will be used and no previous processing (from ReadFiles to MakeHistograms) will occur. If neither cache is filled yet, processing will run as usual.

Only pickleable objects can be cached (otherwise a pickle.PickleError is raised).

Warning

The pickle module is not secure against erroneous or maliciously constructed data. Never unpickle data from an untrusted source.

filename is the name of the file in which the cache is stored. You can give it a .pkl extension.

method can be pickle or cPickle (a faster pickle). In Python 3 they are the same.

protocol is the pickle protocol version. Version 2 is the highest supported by Python 2. Version 0 is "human-readable" (as noted in the documentation). Version 3 is recommended if compatibility between Python 3 versions is needed. Version 4 was added in Python 3.4. It adds support for very large objects, pickling more kinds of objects, and some data format optimizations.

static alter_sequence(seq)

If the Sequence seq contains a Cache, which has an up-to-date cache, a Source is built based on the flattened seq and returned. Otherwise the seq is returned unchanged.

cache_exists()

Return True if the cache file exists and is readable.

drop_cache()

Remove the cache file if it exists; otherwise do nothing.

If cache exists and is readable, but could not be deleted, LenaEnvironmentError is raised.

run(flow)

Load cache or fill it.

If we can read filename, load flow from there. Otherwise use the incoming flow and fill the cache. All loaded or passing items are yielded.
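The load-or-fill behaviour described above can be illustrated with a minimal, self-contained sketch. It does not use Lena itself: cached_run is a hypothetical helper, and storing the whole flow as one pickled list is an assumption made for brevity, not a statement about Cache's file format.

```python
import os
import pickle
import tempfile


def cached_run(flow, filename, protocol=2):
    """Yield items from the cache file if it exists, otherwise fill it."""
    if os.path.exists(filename):
        # Subsequent runs: the incoming flow is ignored, the cache is replayed.
        with open(filename, "rb") as f:
            for item in pickle.load(f):
                yield item
        return
    # First run: pass items through unchanged and remember them.
    items = []
    for item in flow:
        items.append(item)
        yield item
    # The cache is written only after the whole flow has been consumed.
    with open(filename, "wb") as f:
        pickle.dump(items, f, protocol)


cache_path = os.path.join(tempfile.mkdtemp(), "stats.pkl")
first = list(cached_run(iter([1, 2, 3]), cache_path))
# On the second run the input is not read at all.
second = list(cached_run(iter([]), cache_path))
```

Both first and second hold the same items in the same order, which is the property that lets a cached sequence skip all preceding processing.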

class DropContext(*args)

A Sequence that transforms a (data, context) flow so that only data remains in the inner sequence. Context is restored outside DropContext.

DropContext works for most simple cases as a Sequence, but may not work in more advanced circumstances. For example, since DropContext is not transparent, Split can’t judge whether it has a FillCompute element inside, and this may lead to errors in the analysis. It is recommended to provide context when possible.

*args will form a Sequence.

run(flow)

Run the sequence without context, and generate output flow restoring the context before DropContext.

If the sequence adds a context, the returned context is updated with that.

class End

Stop sequence here.

run(flow)

Exhaust all preceding flow and stop iteration (yield nothing to the following flow).
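As a rough illustration of this behaviour, a generator that consumes its input and yields nothing could look like the following. end_run is a hypothetical name for this sketch, not part of Lena.

```python
def end_run(flow):
    """Consume the whole flow and yield nothing."""
    for _ in flow:
        pass
    return
    yield  # unreachable, but it makes this function a generator


source = iter([1, 2, 3])
result = list(end_run(source))
# The source iterator was exhausted by end_run.
leftover = list(source)
```

Exhausting the preceding flow matters when earlier elements have side effects (for example, writing files) that should still happen.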

class Print(before='', sep='', end='\n', transform=None)

Print values passing through.

before is a string printed before the first element of the item (which may be a container).

sep separates elements, end is appended after the last element.

transform is a function which transforms passing items (for example, it can select its specific fields).

Functions

Functions to deal with data and context, and seq_map().

A value is considered a (data, context) pair, if it is a tuple of length 2, and the second element is a dictionary or its subclass.

get_context(value)

Get context from a possible (data, context) pair.

If context is not found, return an empty dictionary.

get_data(value)

Get data from value (a possible (data, context) pair).

If context is not found, return value.

get_data_context(value)

Get (data, context) from value (a possible (data, context) pair).

If context is not found, (value, {}) is returned.

Since get_data() and get_context() both check whether context is present, this function may be slightly more efficient and compact than the other two.
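The pair rule stated at the start of this section (a tuple of length 2 whose second element is a dictionary) can be written out as a short sketch. The function names below are hypothetical and only mirror the documented behaviour; they are not Lena's implementation.

```python
def is_data_context(value):
    """True if value looks like a (data, context) pair."""
    return (isinstance(value, tuple)
            and len(value) == 2
            and isinstance(value[1], dict))


def split_value(value):
    """Return (data, context), using an empty context if none is present."""
    if is_data_context(value):
        return value
    return (value, {})


with_context = split_value((1, {"name": "x"}))
bare_value = split_value(5)
# A tuple whose second element is not a dict is treated as plain data.
plain_tuple = split_value((1, 2))
```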

seq_map(seq, container, one_result=True)

Map Lena Sequence seq to the container.

For each value from the container, calculate seq.run([value]). The result can be a list or a single value. If one_result is True, the result must be a single value. In this case, if the results contain fewer or more than one element, LenaValueError is raised.

The list of results (lists or single values) is returned. The results are in the same order as read from the container.
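A minimal sketch of this mapping, written without Lena, is shown below. Double is a hypothetical stand-in for a Lena Sequence, seq_map_sketch is a hypothetical name, and the built-in ValueError stands in for LenaValueError.

```python
class Double(object):
    """Hypothetical stand-in for a Lena Sequence: doubles every value."""

    def run(self, flow):
        for val in flow:
            yield 2 * val


def seq_map_sketch(seq, container, one_result=True):
    """Run seq on each value separately and collect the results in order."""
    results = []
    for value in container:
        res = list(seq.run([value]))
        if one_result:
            if len(res) != 1:
                # Lena raises LenaValueError here; ValueError stands in.
                raise ValueError("expected exactly one result")
            res = res[0]
        results.append(res)
    return results


mapped = seq_map_sketch(Double(), [1, 2, 3])
mapped_lists = seq_map_sketch(Double(), [1, 2, 3], one_result=False)
```

With one_result=True each entry is a single value; otherwise each entry is the list of everything the sequence produced for that input.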

Group plots

Group several plots into one.

Since data can be produced in different places, several classes are needed to support this. First, the plots of interest must be selected (for example, one-dimensional histograms). This is done by Selector. Selected plots must be grouped. For example, we may want to plot data x versus Monte-Carlo x, but not data x vs data y. Data is grouped by GroupBy. To preserve the group, we can’t yield it to the following elements, but have to transform the plots inside GroupPlots. We can also scale (normalize) all plots to one using GroupScale.

class GroupBy(group_by)

Group data.

Data is added during update(). Groups are available as the groups attribute.

groups is a mapping from keys (return values of group_by) to lists of items with the same key.

Combine data with same attributes.

group_by is a function that returns distinct hashable results for items from different groups.

It can also be a dot-separated string, which corresponds to a key in the context. For any other type, LenaTypeError is raised.

clear()

Remove all groups.

update(val)

Find a group for val and add it there.

A group key is calculated by group_by. If no such key exists, a new group is created.
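The grouping logic can be sketched in a few lines of plain Python. GroupBySketch is a hypothetical class that supports only the callable form of group_by; the dot-separated string form is not covered here.

```python
class GroupBySketch(object):
    """Collect values into lists keyed by group_by(value)."""

    def __init__(self, group_by):
        self._group_by = group_by
        self.groups = {}

    def update(self, val):
        key = self._group_by(val)
        # Create the group on first use, then add the value to it.
        self.groups.setdefault(key, []).append(val)

    def clear(self):
        self.groups.clear()


# Group (data, context) pairs by a hypothetical "variable" context key.
grouper = GroupBySketch(lambda val: val[1]["variable"])
grouper.update(("data_x", {"variable": "x"}))
grouper.update(("mc_x", {"variable": "x"}))
grouper.update(("data_y", {"variable": "y"}))
groups_snapshot = dict(grouper.groups)
grouper.clear()
```

Here data x and Monte-Carlo x end up in the same group, while data y forms its own.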

class GroupPlots(group_by, select, transform=(), scale_to=None, yield_selected=False)

Group several plots.

Plots to be grouped are chosen by select, which acts as a boolean function. If select is not a Selector, it is converted to that class. See Selector for more options.

Plots are grouped by group_by, which returns different keys for different groups. If it is not an instance of GroupBy, it is converted to that class. See GroupBy for more options.

scale_to is a number or a string. A number means the scale, to which plots must be normalized. A string is a name of the plot to which other plots must be normalized. If scale_to is not an instance of GroupScale, it is converted to that class. If a plot could not be rescaled, LenaValueError is raised. For more options, use GroupScale.

transform is a sequence, which processes individual plots before yielding. For example, transform=(HistToCSV(), writer). transform is called after scale_to.

yield_selected defines whether selected items should be yielded during run like other items. Use it if you want to have both single and combined plots. By default, selected plots are not yielded.

run(flow)

Run the flow and yield final groups.

Each item of the flow is checked with the selector. If it is selected, it is added to groups. Otherwise it is yielded.

After the flow is finished, groups are yielded. Groups are lists of items that have the same keys from group_by. Each group’s context (including empty) is inserted into a list in context.group. The resulting context is updated with the intersection of groups’ contexts. For uniformity, if yield_selected is True, single values are also updated: data is put into a list of one element, and context is updated with the group key. Its value is a copy (not a deep copy) of context’s values, so future updates to subdictionaries which existed during this run will be effective in context.group.

If scale_to was set, plots are normalized to the given value or plot. If that plot was not selected (is missing in the captured group) or its norm could not be calculated, LenaValueError is raised.

class GroupScale(scale_to, allow_zero_scale=False, allow_unknown_scale=False)

Scale a group of data.

scale_to defines the method of scaling. If a number is given, group items are scaled to that. Otherwise it is converted to a Selector, which must return a unique item from the group. Group items will be scaled to the scale of that item.

By default, attempts to rescale a structure with unknown or zero scale raise an error. If allow_zero_scale and allow_unknown_scale are set to True, the corresponding errors are ignored and the structure remains unscaled.

scale(group)

Scale group and return a rescaled group as a list.

The group can contain (structure, context) pairs. The original group is unchanged as long as the structures’ scale method returns a new structure (default for Lena histograms and graphs).

If any item could not be rescaled and options were not set to ignore that, LenaValueError is raised.

class Selector(selector)

Determine whether an item should be selected.

Generally, selected means the result is convertible to True, but other values can be used as well.

The usage of selector depends on its type.

If selector is a class, __call__() checks that the data part of the value is an instance of that class (or of its subclass).

A callable is used as is.

A string means that value’s context must conform to that (as in lena.context.check_context_str()).

selector can be a container. In this case its items are converted to selectors. If selector is a list, boolean or is applied to the results of each item. If it is a tuple, boolean and is applied to the results.

If incorrect arguments are provided, LenaTypeError is raised.

__call__(value)

Check whether value is selected.

If an exception occurs, the result is False. It is safe to use non-existing attributes, etc.
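The dispatch rules for classes, callables, lists and tuples can be summarized in a small sketch. SelectorSketch and _data are hypothetical names; the string form (context checks) is omitted, and TypeError stands in for LenaTypeError.

```python
def _data(value):
    """Return the data part of a possible (data, context) pair."""
    if (isinstance(value, tuple) and len(value) == 2
            and isinstance(value[1], dict)):
        return value[0]
    return value


class SelectorSketch(object):
    def __init__(self, selector):
        # Classes are callable too, so they must be checked first.
        if isinstance(selector, type):
            cls = selector
            self._check = lambda value: isinstance(_data(value), cls)
        elif callable(selector):
            self._check = selector
        elif isinstance(selector, list):
            subs = [SelectorSketch(s) for s in selector]
            self._check = lambda value: any(s(value) for s in subs)
        elif isinstance(selector, tuple):
            subs = [SelectorSketch(s) for s in selector]
            self._check = lambda value: all(s(value) for s in subs)
        else:
            # Lena raises LenaTypeError; TypeError stands in.
            raise TypeError("unsupported selector type")

    def __call__(self, value):
        try:
            return self._check(value)
        except Exception:
            # Any failure means "not selected".
            return False


# Select ints, or strings longer than two characters.
sel = SelectorSketch([int, (str, lambda v: len(v) > 2)])
# A selector that touches a non-existing attribute is still safe to call.
broken = SelectorSketch(lambda v: v.missing_attribute)
```

Swallowing exceptions in __call__ is what makes it safe to write selectors that assume attributes or keys which only some items have.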

Iterators

Adapters to iterators from itertools.

class Chain(*iterables)

Chain generators.

Chain can be used as a Source to generate data.

Example:

>>> c = lena.flow.Chain([1, 2, 3], ['a', 'b'])
>>> list(c())
[1, 2, 3, 'a', 'b']

iterables will be chained during __call__(), that is after the first one is exhausted, the second is called, etc.

__call__()

Generate values from chained iterables.

class CountFrom(start=0, step=1)

Generate numbers from start to infinity, with step between values.

Similar to itertools.count().

__call__()

Yield values from start to infinity with step.
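For comparison, the standard-library function that CountFrom is modelled on behaves as follows. This example uses only itertools, not Lena; since the sequence is infinite, islice is used to take a finite prefix.

```python
from itertools import count, islice

# Start at 10 and advance by 5; take only the first four values.
first_values = list(islice(count(10, 5), 4))
# first_values == [10, 15, 20, 25]
```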

class ISlice(*args)

Slice iterable from start to stop with step.

Initialization:

ISlice(stop)

ISlice(start, stop[, step])

Similar to itertools.islice() or range().

fill_into(element, value)

Fill element with value.

Element must have a fill(value) method.

run(flow)

Yield values from start to stop with step.
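The two initialization forms correspond to the two call forms of itertools.islice, shown here on a plain range. This is standard-library behaviour used for comparison, not a call into Lena.

```python
from itertools import islice

flow = range(10)

# One argument: stop only.
head = list(islice(flow, 3))
# Three arguments: start, stop, step (stop is exclusive).
strided = list(islice(flow, 2, 8, 3))
# head == [0, 1, 2], strided == [2, 5]
```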

Split into bins

Split analysis into groups defined by bins.

class ReduceBinContent(select, transform, drop_bins_context=True)

Transform bin content of histograms.

This class is used when histogram bins contain complex structures. For example, in order to plot a histogram with a 3-dimensional vector in each bin, we shall create 3 histograms corresponding to vector’s components.

select determines which types should be transformed. The types must be given in a list (not a tuple) or as a general Selector. Example: select=[lena.math.vector3, list].

transform is a Sequence or element applied to bin contents. If transform is not a Sequence or an element with run method, it is converted to a Sequence. Example: transform=Split([X(), Y(), Z()]) (provided that you have X, Y, Z variables).

ReduceBinContent creates histograms, which may be plotted, that is bins contain only data without context. By default, context of all bins except one is not used. If drop_bins_context is False, a histogram of bin context is added to context.

In case of wrong arguments, LenaTypeError is raised.

run(flow)

Transform histograms from flow.

Not selected values pass unchanged.

Context is updated with variable, histogram and bin_content. variable and histogram copy context from split_into_bins (if present there). bin_content includes context for an example bin in "example_bin" and (optionally) for all bins in "all_bins".

class SplitIntoBins(seq, arg_func, edges, transform=None)

Split analysis into bins.

seq is a FillComputeSeq sequence, which corresponds to the analysis being compared for different bins. It can be a tuple containing a FillCompute element. Deep copy of seq will be used to produce each bin’s content.

arg_func is a function which takes data and returns argument value used to compute the bin index. A Variable must be provided. Example of a two-dimensional function: arg_func = lena.variables.Variable("xy", lambda event: (event.x, event.y)).

edges is a sequence of arrays containing monotonically increasing bin edges along each dimension. Example: edges = lena.math.mesh((0, 1), 10).

transform is a Sequence, which is applied to results. The final histogram may contain vectors, histograms and any other data the analysis produced. To be able to plot them, transform can extract vector components or do other work to simplify structures. By default, transform is TransformBins. Pass an empty tuple to disable it.

Attributes: bins, edges.

If edges are not increasing, LenaValueError is raised. In case of other argument initialization problems, LenaTypeError is raised.

compute()

Yield a (Histogram, context) pair for each set of compute() results taken from all bins.

Histogram is created from edges and bins taken from compute() for bins. Context is preserved in histogram bins.

SplitIntoBins context is added to context.split_into_bins as histogram (corresponding to edges) and variable (corresponding to arg_func) subcontexts.

In Python 3 the minimum number of compute() among all bins is used. In Python 2, if some bin is exhausted before the others, its content will be filled with None.

fill(val)

Fill the cell corresponding to arg_func(val) with val.

Values outside of edges range are ignored.
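To illustrate how a value can be routed to a cell, here is a one-dimensional sketch based on bisect. bin_index is a hypothetical helper, and two points are assumptions: intervals are treated as half-open (lower <= x < upper, inferred from the default coord_fmt of cell_to_string), and values are appended to a list, whereas SplitIntoBins fills a copy of seq in each bin.

```python
from bisect import bisect_right


def bin_index(edges, x):
    """Return the index of the bin containing x, or None if x is outside."""
    if x < edges[0] or x >= edges[-1]:
        return None
    # bisect_right counts the edges that are <= x.
    return bisect_right(edges, x) - 1


edges = [0, 1, 2, 3]
bins = [[] for _ in range(len(edges) - 1)]
for val in [0.5, 1.0, 2.9, 3.0, -1]:
    ind = bin_index(edges, val)
    if ind is not None:
        # Values outside the edges range are silently skipped.
        bins[ind].append(val)
```

Under these assumptions 3.0 and -1 are dropped and the remaining values fall into the three bins in order.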

class TransformBins(create_edges_str=None)

Transform bins into a flattened sequence.

create_edges_str is a callable, which creates a string from bin’s edges and coordinate names and adds that to context. It is passed parameters (edges, var_context), where var_context is Variable context containing variable names (it can be a single Variable or Combine).

By default, it is cell_to_string().

If create_edges_str is not callable, LenaTypeError is raised.

cell_to_string(cell_edges, var_context=None, coord_names=None, coord_fmt='{}_lte_{}_lt_{}', coord_join='_', reverse=False)

Transform cell edges into a string.

cell_edges is a tuple of pairs (lower bound, upper bound) for each coordinate.

coord_names is a list of coordinate names.

coord_fmt is a string, which defines how to format individual coordinates.

coord_join is a string, which joins coordinate pairs.

If reverse is True, coordinates are joined in reverse order.
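The way these parameters combine can be sketched as follows. cell_to_string_sketch is a hypothetical re-implementation; in particular, the order in which values are substituted into coord_fmt (lower bound, coordinate name, upper bound) is an assumption inferred from the default format string, and var_context is not handled.

```python
def cell_to_string_sketch(cell_edges, coord_names,
                          coord_fmt="{}_lte_{}_lt_{}",
                          coord_join="_", reverse=False):
    """Format each coordinate's bounds and join the pieces into one string."""
    # Assumed substitution order: lower bound, name, upper bound.
    parts = [coord_fmt.format(low, name, high)
             for (low, high), name in zip(cell_edges, coord_names)]
    if reverse:
        parts.reverse()
    return coord_join.join(parts)


cell = ((0, 1), (2, 3))
label = cell_to_string_sketch(cell, ["x", "y"])
label_reversed = cell_to_string_sketch(cell, ["x", "y"], reverse=True)
```

Under these assumptions label reads "0_lte_x_lt_1_2_lte_y_lt_3", a string that can be used, for example, as part of an output file name for each bin.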

get_example_bin(struct)

Return bin with zero index on each axis of the histogram bins.

For example, if the histogram is two-dimensional, return hist[0][0].

struct can be a Histogram or an array of bins.