Output

Output:

MakeFilename([filename, dirname, fileext, ...])

Make file name, file extension and directory name.

PDFToPNG([format, overwrite, verbose, ...])

Convert PDF to image format (by default PNG).

iterable_to_table(iterable[, format_, ...])

Create a table from an iterable.

ToCSV([separator, header, duplicate_last_bin])

Convert data to CSV text.

Write(output_directory[, output_filename, ...])

Write text data to filesystem.

Writer(*args, **kwargs)

Deprecated since version 0.4.

LaTeX utilities:

LaTeXToPDF([overwrite, verbose, create_command])

Run pdflatex binary for LaTeX files.

RenderLaTeX([select_template, template_dir, ...])

Create LaTeX from templates and data.

Output

class MakeFilename(filename=None, dirname=None, fileext=None, prefix=None, suffix=None, overwrite=False)

Make file name, file extension and directory name.

filename is a string, which will be used as a file name without extension (but it can contain a relative path). The string can contain formatting arguments enclosed in double braces. These arguments will be filled from context during __call__(). Example:

MakeFilename("{{variable.type}}/{{variable.name}}")
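
As a rough illustration of how double-brace arguments could be filled from a nested context, here is a minimal sketch in plain Python. It does not use Lena: fill_template and lookup are hypothetical helpers, and the library's own formatting may differ in details.

```python
import re


def lookup(context, dotted_key):
    """Return context["a"]["b"] for the dotted key "a.b"."""
    value = context
    for part in dotted_key.split("."):
        value = value[part]
    return value


def fill_template(template, context):
    """Replace every {{dotted.key}} in template with its value from context."""
    return re.sub(
        r"\{\{\s*([\w.]+)\s*\}\}",
        lambda match: str(lookup(context, match.group(1))),
        template,
    )


# Hypothetical context contents, chosen only for this illustration.
context = {"variable": {"type": "coordinate", "name": "x"}}
filename = fill_template("{{variable.type}}/{{variable.name}}", context)
print(filename)
```

A missing key raises KeyError in this sketch, which a caller could catch to leave the file name unset.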

dirname and fileext set directory name and file extension. They are treated similarly to filename in most aspects.

It is possible to "postpone" file name creation while still providing a part of the future file name through prefix or suffix. They will be added to the file name when it is created. Existing file names are not affected. prefix and suffix cannot be used if the filename argument is given.

For example, if one creates logarithmic plots, but complete file names will be made later, one may use MakeFilename(suffix="_log").

All these arguments must be strings, otherwise LenaTypeError is raised. They may all contain formatting arguments.

By default, values whose context.output already contains filename, dirname or fileext are not updated (they pass unaltered). This can be changed with the keyword argument overwrite. For more options, use lena.context.UpdateContext.

At least one argument must be present, or LenaTypeError will be raised.

__call__(value)

Add output keys to the value’s context.

Formatting context is retrieved from static context and from the context part of the value. The run-time context has higher precedence.

filename, dirname, fileext, if initialized, set respectively context.output.{filename,dirname,fileext} (if they didn’t exist).

If this element sets the file name and the context contains output.prefix or output.suffix, they are prepended to or appended after the file name. After that they are removed from context.output.
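
The way a stored prefix and suffix are consumed when the file name is finally created can be sketched as follows. This is plain Python, not Lena's code: apply_prefix_suffix is a hypothetical helper that works on a copy of context.output.

```python
def apply_prefix_suffix(filename, output):
    """Return (new_filename, new_output) with prefix and suffix consumed."""
    new_output = dict(output)  # do not modify the caller's dictionary
    prefix = new_output.pop("prefix", "")
    suffix = new_output.pop("suffix", "")
    new_output["filename"] = prefix + filename + suffix
    return new_output["filename"], new_output


name, output = apply_prefix_suffix("energy", {"suffix": "_log"})
print(name, output)
```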

If this element adds a prefix or a suffix and they exist in the context, then prefix is prepended before the existing prefix, and suffix is appended after the existing suffix, unless overwrite is set to True: in that case they are overwritten. prefix and suffix always update their existing keys in the context if they could be formatted (which is different for attributes like filename).

If current context can’t be formatted (doesn’t contain all necessary keys for the format string), a key is not updated.

class PDFToPNG(format='png', overwrite=False, verbose=True, timeoutsec=60)

Convert PDF to image format (by default PNG).

Set output format (by default png).

If the resulting file already exists and the pdf is unchanged (which is checked through context.output.changed), conversion is not repeated. To convert all pdfs to images, set overwrite to True (by default it is False).

To disable printing messages during run(), set verbose to False.

timeoutsec is time (in seconds) for subprocess timeout (used only in Python 3). If the timeout expires, the child process will be killed and waited for. The TimeoutExpired exception will be re-raised after the child process has terminated.

This element uses pdftoppm binary internally. pdftoppm can use other output formats, for example jpeg or tiff. See pdftoppm manual for more details.
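
For orientation, a conversion with a timeout could look roughly like the sketch below. The helper names and the exact argument order are assumptions for illustration, not the command this element actually builds; -singlefile and the format flags (-png, -jpeg, -tiff) are standard pdftoppm options.

```python
import subprocess


def pdftoppm_command(pdf_path, output_root, image_format="png"):
    """Build a pdftoppm command list for the given output format."""
    # -singlefile writes one image named after output_root
    # instead of numbered per-page files.
    return ["pdftoppm", "-" + image_format, "-singlefile", pdf_path, output_root]


def convert(pdf_path, output_root, image_format="png", timeoutsec=60):
    """Run pdftoppm; the child process is killed if the timeout expires."""
    command = pdftoppm_command(pdf_path, output_root, image_format)
    # check=True raises CalledProcessError on a non-zero exit status;
    # subprocess.TimeoutExpired is raised after timeoutsec seconds.
    subprocess.run(command, check=True, timeout=timeoutsec)


command = pdftoppm_command("plots/hist.pdf", "plots/hist")
print(command)
```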

run(flow)

Convert PDF files to format.

PDF files are recognized via context.output.filetype. Their paths are assumed to be the data part of the value.

Data yielded is the resulting file name. Context is updated with output.filetype set to format.

Other values are passed unchanged.

iterable_to_table(iterable, format_=None, header='', header_fields=(), row_start='', row_end='', row_separator=',', footer='')

Create a table from an iterable.

The resulting table is yielded line by line. If the header or footer is empty, it is not yielded.

format_ controls the output of individual cells in a row. By default, it uses standard Python representation. For finer control, provide a sequence of formatting options, one for each column. For floating-point values it is recommended to output a fixed, appropriate number of digits, so that the output stays identical between calls regardless of floating-point representation details. Default formatting allows an arbitrary number of columns in each row. For tables to be well-formed, replace missing values in the iterable with a placeholder like "", None, etc.

Each row is prepended with row_start and appended with row_end. If it consists of several columns, they are joined by row_separator. Separators between rows can be added while iterating the result.

This function can be used to convert structures to different formats: csv, html, xml, etc.

Examples:

>>> angles = [(3.1415*i/4, 180*i/4) for i in range(1, 5)]
>>> format_ = ("{:.2f}", "{:.0f}")
>>> header_fields = ("rad", "deg")
>>>
>>> csv_rows = iterable_to_table(
...    angles, format_=format_,
...    header="{},{}", header_fields=header_fields,
...    row_separator=",",
... )
>>> print("\n".join(csv_rows))
rad,deg
0.79,45
1.57,90
2.36,135
3.14,180
>>>
>>> html_rows = iterable_to_table(
...    angles, format_=format_,
...    header="<table>\n" + " "*4 + "<tr><td>{}</td><td>{}</td></tr>",
...    header_fields=header_fields,
...    row_start=" "*4 + "<tr><td>", row_end="</td></tr>",
...    row_separator="</td><td>",
...    footer="</table>"
... )
>>> print("\n".join(html_rows))
<table>
    <tr><td>rad</td><td>deg</td></tr>
    <tr><td>0.79</td><td>45</td></tr>
    <tr><td>1.57</td><td>90</td></tr>
    <tr><td>2.36</td><td>135</td></tr>
    <tr><td>3.14</td><td>180</td></tr>
</table>
>>>

For more complex formatting use templates (see RenderLaTeX).

New in version 0.5.

class ToCSV(separator=',', header=None, duplicate_last_bin=True)

Convert data to CSV text.

The following can be converted:
  • histograms (implemented only for 1- and 2-dimensional histograms);
  • any iterable object (including graph).

separator delimits values in the output text. The result is yielded as one string starting from header.

If duplicate_last_bin is True, then for histograms the content of the last bin is written twice at the end. This may be useful for graphical representation: if the last bin spans from 9 to 10, the plot may end at 9, while this parameter allows one to write the bin content at 10 as well, creating the last horizontal step.
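
The effect of duplicating the last bin can be seen in a small stand-alone sketch. hist1d_rows is a hypothetical function written for this illustration (it is not hist1d_to_csv), and it assumes that each row pairs a left bin edge with the bin content.

```python
def hist1d_rows(edges, bins, separator=",", duplicate_last_bin=True):
    """Yield "edge,content" rows for a 1-dimensional histogram."""
    # zip stops at the shorter sequence, so each bin gets its left edge.
    for left_edge, content in zip(edges, bins):
        yield "{}{}{}".format(left_edge, separator, content)
    if duplicate_last_bin:
        # Repeat the last content at the rightmost edge.
        yield "{}{}{}".format(edges[-1], separator, bins[-1])


edges = [8, 9, 10]  # two bins: from 8 to 9 and from 9 to 10
bins = [5, 7]
rows = list(hist1d_rows(edges, bins))
print("\n".join(rows))
```

With duplicate_last_bin=False the final row at edge 10 is omitted, so a step plot drawn from these rows would stop at 9.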

run(flow)

Convert values from flow to CSV text.

context.output is updated with {"filetype": "csv"}. If a data structure has a method _update_context(context), it also updates the current context during the transform. All data that is not converted is yielded unchanged. If output.duplicate_last_bin is present in the context, it takes precedence over this element’s value. To force the common behaviour, one can manually update the context before this element.

If context.output.to_csv is False, the value is skipped.

Data is yielded as a whole CSV block. To generate CSV line by line, use hist1d_to_csv(), hist2d_to_csv() or iterable_to_table().

hist1d_to_csv(hist, header=None, separator=',', duplicate_last_bin=True)

Yield CSV-formatted strings for a one-dimensional histogram.

hist2d_to_csv(hist, header=None, separator=',', duplicate_last_bin=True)

Yield CSV-formatted strings for a two-dimensional histogram.

class Write(output_directory, output_filename='output', verbose=True, existing_unchanged=False, overwrite=False)

Write text data to filesystem.

output_directory is the base output directory. It can be further appended by the incoming data. Non-existing directories are created.

output_filename is the name for unnamed data. Use it to write only one file.

If no arguments are given, the default is to write to "output.txt" in the current directory (rewritten for every new value), unless a different extension is provided through the context. It is recommended to create the file name explicitly using MakeFilename. The default output file is useful in case of errors, when an explicit file name didn’t work.

verbose sets whether additional information should be printed on the screen. verbose set to False disables runtime messages.

existing_unchanged and overwrite are used during run() to change the handling of existing files. These options are mutually exclusive: their simultaneous use raises LenaValueError.

run(flow)

Only strings (and unicode in Python 2) and objects with a method write are written. The method write must accept a string with the output file path as an argument. If context["output"]["write"] is set to False, a value will not be written. Values that are not written pass unchanged.

The full name of the file to be written (filepath) has the form self.output_directory/dirname/filename.fileext, where dirname, filename and the file extension fileext are searched in context["output"]. If filename is missing, Write’s default filename is used. If fileext is missing, then filetype is used; if it is also absent, the default file extension is "txt". It is usually enough to provide fileext.
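
The lookup order for the path components can be summarised in a short sketch. make_filepath is a hypothetical helper written for this illustration; it follows the rules stated above but is not the code Write runs.

```python
import os


def make_filepath(output_directory, output, default_filename="output"):
    """Compose output_directory/dirname/filename.fileext from context["output"]."""
    filename = output.get("filename", default_filename)
    # fileext has priority, then filetype, then the default "txt".
    # In this sketch an empty string is treated like a missing key.
    fileext = output.get("fileext") or output.get("filetype") or "txt"
    dirname = output.get("dirname", "")
    return os.path.join(output_directory, dirname, filename + "." + fileext)


full = make_filepath("out", {"dirname": "plots", "filename": "x", "fileext": "csv"})
fallback = make_filepath("out", {"filetype": "tex"})
default = make_filepath("out", {})
print(full, fallback, default)
```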

If the resulting file exists and its content is the same as the incoming data, the file is not overwritten (unless it was produced with an object’s method write, which makes it impossible to tell whether the file has changed). If existing_unchanged is True, existing file contents are not checked (they are assumed to be unchanged). If overwrite is True, file contents are not checked, and all data is assumed to be changed. If a file was written, then output.changed is set to True; otherwise, if it was not set before, it is set to False. If in that case output.changed existed, it retains its previous value.
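
The rules for output.changed can be traced in a simplified sketch for string data. write_if_changed is a hypothetical stand-alone function, not the method of Write, and it raises a plain ValueError where the library raises LenaValueError.

```python
import os
import tempfile


def write_if_changed(filepath, text, output,
                     existing_unchanged=False, overwrite=False):
    """Write text to filepath when needed and update output["changed"]."""
    if existing_unchanged and overwrite:
        raise ValueError("existing_unchanged and overwrite are mutually exclusive")
    write = True
    if os.path.exists(filepath) and not overwrite:
        if existing_unchanged:
            write = False  # trust the existing file without reading it
        else:
            with open(filepath) as existing:
                write = existing.read() != text
    if write:
        with open(filepath, "w") as out:
            out.write(text)
        output["changed"] = True
    else:
        # Keep an earlier value from a previous element; otherwise set False.
        output.setdefault("changed", False)
    return output


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "hist.csv")
    first = write_if_changed(path, "0,1\n", {})
    second = write_if_changed(path, "0,1\n", {})
    third = write_if_changed(path, "0,1\n", {"changed": True})
print(first, second, third)
```

The third call shows the last rule: the file is not rewritten, but an existing changed=True from an earlier element is preserved.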

Example: suppose you have a sequence (Histogram, ToCSV, Write, RenderLaTeX, Write, LaTeXToPDF). If both histogram representation and LaTeX template exist and are unchanged, the second Write signals context.output.changed=False, and LaTeXToPDF doesn’t regenerate the plot. If LaTeX template was unchanged, but the previous context from the first Write signals context.output.changed=True, then in the second Write template is not rewritten, but context.output.changed remains True. On the second run, even if we check file contents, the program will run faster for unchanged files even for Write, because read speed is typically higher than write speed.

File name with full path is yielded as data. context.output is updated with fileext and filename (in case they were not present), and filepath, where filename is its base part (without output directory and extension) and filepath is the complete path. If data is equal to context.output.filepath, this means that the file was already written by another Write, and the value is skipped (yielded unchanged).

If context.output.filename is present but empty, LenaRuntimeError is raised.

class Writer(*args, **kwargs)

Deprecated since version 0.4: use Write.

LaTeX

class LaTeXToPDF(overwrite=False, verbose=1, create_command=None)

Run pdflatex binary for LaTeX files.

It runs in parallel (separate process is spawned for each job) and non-interactively.

overwrite sets whether existing unchanged pdfs shall be overwritten during run().

verbose = 0 allows no output messages. 1 prints pdflatex command and output in case of errors. More than 1 prints all pdflatex output.

If you need to run pdflatex (or other executable) with different parameters, provide its command.

create_command is a function which accepts texfile_name, outfilename, output_directory, context (in this order) and returns a list made of the command and its arguments.

Default command is:
["pdflatex", "-halt-on-error", "-interaction", "errorstopmode",
 "-output-directory", output_directory, texfile_name]
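
A custom create_command could be written along the following lines. This is a sketch under the documented signature: the function name is hypothetical, and it swaps in xelatex (which accepts the same options as pdflatex here) purely as an example.

```python
def create_xelatex_command(texfile_name, outfilename, output_directory, context):
    """Return the command and its arguments as a list."""
    # outfilename and context are accepted to match the documented
    # signature, but this sketch does not need them.
    return [
        "xelatex", "-halt-on-error", "-interaction", "errorstopmode",
        "-output-directory", output_directory, texfile_name,
    ]


command = create_xelatex_command("plot.tex", "plot.pdf", "output", {})
print(command)
```

Such a function would then be passed as LaTeXToPDF(create_command=create_xelatex_command).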

run(flow)

Convert all incoming LaTeX files to pdf.

A value from flow corresponds to a TeX file if its context.output.filetype is "tex". Other values pass unchanged.

If the resulting pdf file exists and context.output.changed is set to False, pdf rendering is not run. If context.output.changed is not set, then modification times for .tex and .pdf files are compared: if the template .tex is newer, it is reprocessed. Set the initialization argument overwrite to True to always recreate pdfs. All non-existent files are always created.
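
The decision described above can be condensed into a small sketch. needs_rendering is a hypothetical helper that models the stated rules with os.path.getmtime; it is not the library's implementation.

```python
import os
import tempfile


def needs_rendering(tex_path, pdf_path, changed=None, overwrite=False):
    """Decide whether pdf_path should be regenerated from tex_path."""
    if overwrite or not os.path.exists(pdf_path):
        return True
    if changed is not None:
        return changed
    # changed is not set: rerun only if the .tex file is newer than the pdf.
    return os.path.getmtime(tex_path) > os.path.getmtime(pdf_path)


with tempfile.TemporaryDirectory() as tmp:
    tex = os.path.join(tmp, "plot.tex")
    pdf = os.path.join(tmp, "plot.pdf")
    open(tex, "w").close()
    missing = needs_rendering(tex, pdf)  # the pdf does not exist yet
    open(pdf, "w").close()
    os.utime(tex, (100, 100))  # make the .tex file older than the pdf
    os.utime(pdf, (200, 200))
    up_to_date = needs_rendering(tex, pdf)
    os.utime(tex, (300, 300))  # now the .tex file is newer
    stale = needs_rendering(tex, pdf)
    skipped = needs_rendering(tex, pdf, changed=False)
print(missing, up_to_date, stale, skipped)
```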

class RenderLaTeX(select_template='', template_dir='.', select_data=None, environment=None, from_data=False, verbose=0)

Create LaTeX from templates and data.

select_template is a string or a callable. If a string, it is the name of the template to be used (unless context.output.template overwrites that). If select_template is a callable, it must accept a value from the flow and return template name. If select_template is an empty string (default) and no template could be found in the context, LenaRuntimeError is raised.

template_dir is the path to the directory with templates (used by jinja2.FileSystemLoader). By default, it is the current directory.

select_data is a callable to choose data to be rendered. It should accept a value from flow and return boolean. By default CSV files are selected (see run()).
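
A custom select_data might look like the sketch below. It assumes that a value from the flow is a (data, context) tuple with a nested dictionary as context; the function name and the choice of file types are hypothetical.

```python
def select_text_tables(value):
    """Return True for (data, context) values whose filetype is csv or txt."""
    data, context = value
    filetype = context.get("output", {}).get("filetype")
    return filetype in ("csv", "txt")


selected = select_text_tables(("1,2\n", {"output": {"filetype": "csv"}}))
rejected = select_text_tables(("plot.pdf", {"output": {"filetype": "pdf"}}))
print(selected, rejected)
```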

environment allows user-defined initialisation of jinja Environment. One can use that to add custom filters, tests, global functions, etc. In that case one must set template_dir for that environment manually. Example user initialisation:

import jinja2
from lena.output import RenderLaTeX, jinja_syntax_latex

# import user settings, filters and globals


def render_latex():
    """Construct RenderLaTeX to be used in analysis sequences."""
    loader = jinja2.FileSystemLoader(TEMPLATE_PATH)
    environment = jinja2.Environment(
        loader=loader,
        **jinja_syntax_latex
    )
    environment.filters.update(FILTERS)
    environment.globals.update(GLOBALS)
    return RenderLaTeX(
        select_template=select_template,
        environment=environment
    )

Usually template context is stored in the context part of values. Sometimes, however, the data part contains the needed information (for example, during creation of tables). Set from_data to True to render the data part.

verbose controls the verbosity of output. If it is 1, selected values are printed during run(). If it is 2 or higher, not selected values are printed as well.

run(flow)

Render values from flow to LaTeX.

If no custom select_data was initialized, values with context.output.filetype equal to "csv" are selected by default.

Rendered LaTeX text is yielded as the data part of the tuple (use Write to write it to the filesystem). context.output.filetype is updated to "tex".

Not selected values pass unchanged.