Output

Output:

MakeFilename([filename, dirname, fileext, …]) Make file name, file extension and directory name.
PDFToPNG([format, overwrite, verbose, …]) Convert PDF to image format (by default PNG).
ToCSV([separator, header, duplicate_last_bin]) Convert data to CSV text.
Writer([output_directory, output_filename, …]) Write text data to filesystem.

LaTeX utilities:

LaTeXToPDF([overwrite, verbose, create_command]) Run pdflatex binary for LaTeX files.
RenderLaTeX([select_template, …]) Create LaTeX from templates and data.

Output

class MakeFilename(filename=None, dirname=None, fileext=None, overwrite=False)

Make file name, file extension and directory name.

A single argument can be a string, which will be used as a file name without extension (but it can contain a relative path). The string can contain arguments enclosed in double braces. These arguments will be filled from context during __call__(). Example:

MakeFilename("{{variable.type}}/{{variable.name}}")

By default, values with context.output already containing filename are not updated (returned unchanged). This can be changed using a keyword argument overwrite. If context doesn’t contain all necessary keys for formatting, it will not be updated. For more options, use lena.context.UpdateContext.

Other allowed keywords are filename, dirname, fileext. Their values must be strings, otherwise LenaTypeError is raised. At least one of them must be present, or LenaTypeError will be raised. If a simple check finds unbalanced braces or single braces instead of double ones, LenaValueError is raised.

__call__(value)

Add output keys to the value’s context.

filename, dirname, fileext, if initialized, set respectively context.output.{filename,dirname,fileext}. Only those values are transformed that have no corresponding keys (filename, fileext or dirname) in context.output and for which the current context can be formatted (contains all necessary keys for any of the format strings).
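
As an illustration of the double-brace substitution described above, here is a minimal sketch that does not use Lena itself. The helper names get_nested and format_from_context are hypothetical, and the real MakeFilename may format strings differently. The sketch only shows the rule that a template is filled from nested context keys and that nothing is produced when a key is missing.

```python
import re

# Matches fields such as {{variable.name}}.
_DOUBLE_BRACES = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def get_nested(context, dotted_key):
    """Return context["a"]["b"] for "a.b"; raise KeyError if a part is missing."""
    current = context
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            raise KeyError(dotted_key)
        current = current[part]
    return current


def format_from_context(template, context):
    """Fill {{dotted.key}} fields from context; return None if a key is missing."""
    try:
        return _DOUBLE_BRACES.sub(
            lambda match: str(get_nested(context, match.group(1))), template
        )
    except KeyError:
        # Not all keys are present: the caller should leave the context as is.
        return None


template = "{{variable.type}}/{{variable.name}}"
full = {"variable": {"type": "coordinate", "name": "x"}}
partial = {"variable": {"name": "x"}}
print(format_from_context(template, full))     # coordinate/x
print(format_from_context(template, partial))  # None
```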

class PDFToPNG(format='png', overwrite=False, verbose=True, timeoutsec=60)

Convert PDF to image format (by default PNG).

format sets the output format (by default png).

If the resulting file already exists and the pdf is unchanged (which is checked through context.output.changed), conversion is not repeated. To convert all pdfs to images, set overwrite to True (by default it is False).

To disable printing messages during run(), set verbose to False.

timeoutsec is time (in seconds) for subprocess timeout (used only in Python 3). If the timeout expires, the child process will be killed and waited for. The TimeoutExpired exception will be re-raised after the child process has terminated.
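
The timeout behaviour described above matches that of subprocess.run in the Python 3 standard library. The following sketch shows it with a stand-in command instead of pdftoppm; the function name run_with_timeout is hypothetical and this is not the element's actual code.

```python
import subprocess
import sys


def run_with_timeout(command, timeoutsec):
    """Run command; return True on normal exit, False if the timeout expired."""
    try:
        # On timeout, subprocess.run kills the child, waits for it,
        # and then re-raises subprocess.TimeoutExpired.
        subprocess.run(command, timeout=timeoutsec, check=False)
    except subprocess.TimeoutExpired:
        return False
    return True


# Stand-in commands: one finishes at once, the other sleeps for 5 seconds.
quick = [sys.executable, "-c", "pass"]
slow = [sys.executable, "-c", "import time; time.sleep(5)"]
print(run_with_timeout(quick, timeoutsec=30))
print(run_with_timeout(slow, timeoutsec=0.5))
```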

This element uses pdftoppm binary internally. pdftoppm can use other output formats, for example jpeg or tiff. See pdftoppm manual for more details.

run(flow)

Convert PDF files to format.

PDF files are recognized via context.output.filetype. Their paths are assumed to be the data part of the value.

Data yielded is the resulting file name. Context is updated with output.filetype set to format.

Other values are passed unchanged.

class ToCSV(separator=', ', header=None, duplicate_last_bin=True)

Convert data to CSV text.

These objects are converted:
  • Histogram (implemented only for 1- and 2-dimensional histograms).
  • any object (including Graph) with to_csv method.

separator delimits values in the output text.

header is a string which becomes the first line of the output.

If duplicate_last_bin is True, the content of the last bin is written twice at the end. This may be useful for graphical representation: if the last bin is from 9 to 10, then the plot may end at 9, while this parameter allows writing the bin content at 10, creating the last horizontal step.
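
To make the effect of duplicate_last_bin concrete, here is a small sketch for a one-dimensional histogram given as plain lists. The function hist1d_lines and the line layout (left bin edge, then bin content) are assumptions made for illustration; the exact text produced by ToCSV may differ.

```python
def hist1d_lines(edges, bins, separator=", ", header=None,
                 duplicate_last_bin=True):
    """Yield one "left_edge<separator>content" line per bin."""
    if header is not None:
        yield header
    # zip stops at the shorter sequence, so the last edge is not paired here.
    for left_edge, content in zip(edges, bins):
        yield separator.join([str(left_edge), str(content)])
    if duplicate_last_bin:
        # Repeat the last bin content at the rightmost edge,
        # which closes the final horizontal step of a plot.
        yield separator.join([str(edges[-1]), str(bins[-1])])


edges = [8, 9, 10]   # two bins: [8, 9) and [9, 10]
bins = [3, 5]
print("\n".join(hist1d_lines(edges, bins, header="x, count")))
# x, count
# 8, 3
# 9, 5
# 10, 5
```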

run(flow)

Convert values from flow to CSV text.

Context.output is updated with {"filetype": "csv"}. All data that is not converted is yielded unchanged.

If data has to_csv method, it must accept keyword arguments separator and header and return text.
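
A user-defined object that satisfies this requirement could look like the sketch below. The class Points is hypothetical; the only part taken from the text above is that to_csv accepts the keyword arguments separator and header and returns text.

```python
class Points:
    """A hypothetical container of rows that can export itself as CSV text."""

    def __init__(self, rows):
        self.rows = list(rows)

    def to_csv(self, separator=",", header=None):
        """Return all rows as one CSV string, with an optional header line."""
        lines = [] if header is None else [header]
        lines.extend(
            separator.join(str(item) for item in row) for row in self.rows
        )
        return "\n".join(lines)


points = Points([(0, 1.5), (1, 2.5)])
print(points.to_csv(separator=", ", header="x, y"))
# x, y
# 0, 1.5
# 1, 2.5
```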

If context.output.to_csv is False, the value is skipped.

Data is yielded as a whole CSV block. To generate CSV line by line, use hist1d_to_csv() and hist2d_to_csv().

hist1d_to_csv(hist, header=None, separator=', ', duplicate_last_bin=True)

Yield CSV-formatted strings for a one-dimensional histogram.

hist2d_to_csv(hist, header=None, separator=', ', duplicate_last_bin=True)

Yield CSV-formatted strings for a two-dimensional histogram.

class Writer(output_directory='', output_filename='output', verbose=True, existing_unchanged=False, overwrite=False)

Write text data to filesystem.

output_directory is the base output directory. It can be further appended by the incoming data. Non-existing directories are created.

output_filename is the name for unnamed data. Use it to write only one file.

If no arguments are given, the default is to write to “output.txt” in the current directory, unless a different extension is provided through the context. This file is rewritten for every new value. It is recommended to create the filename explicitly using MakeFilename. The default output file can be useful in case of errors, when an explicit file name didn’t work.

verbose regulates whether additional information should be printed on the screen. verbose set to False disables runtime messages.

existing_unchanged and overwrite are used during run() to change the handling of existing files. They are mutually exclusive: if one tries to use them simultaneously, LenaValueError is raised.

run(flow)

Only strings (and unicode in Python 2) are written. To be written, data must have an "output" dictionary in its context, and context["output"]["writer"] must not be set to False. Other values pass unchanged.

The full name of the file to be written (filepath) has the form self.output_directory/dirname/filename.fileext, where dirname, filename and the file extension fileext are searched in context["output"]. If filename is missing, Writer’s default filename is used. If fileext is missing, then filetype is used; if it is also absent, the default file extension is “txt”. It is usually enough to provide fileext.
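
The lookup order for the path components can be summarised in a short sketch. The function build_filepath is hypothetical and only restates the rules above. Treating a missing dirname as empty and raising a plain RuntimeError for an empty filename (in place of LenaRuntimeError) are assumptions of the sketch.

```python
import os


def build_filepath(output_directory, output, default_filename="output"):
    """Compose output_directory/dirname/filename.fileext from an output dict."""
    filename = output.get("filename", default_filename)
    if filename == "":
        # Stand-in for LenaRuntimeError in this sketch.
        raise RuntimeError("context.output.filename is empty")
    # fileext has priority, then filetype, then the default "txt".
    fileext = output.get("fileext", output.get("filetype", "txt"))
    dirname = output.get("dirname", "")
    return os.path.join(output_directory, dirname, filename + "." + fileext)


print(build_filepath("", {}))
print(build_filepath("out", {"filename": "x", "filetype": "csv"}))
print(build_filepath("out", {"dirname": "plots", "filename": "x",
                             "fileext": "tex", "filetype": "csv"}))
```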

If the resulting file exists and its content is the same as the incoming data, file is not overwritten. If existing_unchanged is True, existing file contents are not checked (they are assumed to be not changed). If overwrite is True, file contents are not checked, and all data is assumed to be changed. If a file was overwritten, output.changed is set to True, otherwise if it was not set before, it is set to False. If in that case output.changed existed, it retains its previous value.

Example: suppose you have a sequence (Histogram, ToCSV, Writer, RenderLaTeX, Writer, LaTeXToPDF). If both histogram representation and LaTeX template exist and are unchanged, the second Writer signals context.output.changed=False, and LaTeXToPDF doesn’t regenerate the plot. If LaTeX template was unchanged, but the previous context from the first Writer signals context.output.changed=True, then in the second Writer template is not rewritten, but context.output.changed remains True. On the second run, even if we check file contents, the program will run faster for unchanged files even for Writer, because read speed is typically higher than write speed.
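
The rules for overwriting and for output.changed can be condensed into one decision function. This is a sketch of the documented behaviour, not Writer's code: writer_decision is a hypothetical name, ValueError stands in for LenaValueError, and treating a file that does not exist yet as always written and changed is an assumption.

```python
def writer_decision(existing_content, new_content, previous_changed=None,
                    existing_unchanged=False, overwrite=False):
    """Return (write_file, changed) for one incoming value.

    existing_content is None when the file does not exist yet.
    previous_changed is None when output.changed was not set before.
    """
    if existing_unchanged and overwrite:
        raise ValueError("existing_unchanged and overwrite are mutually exclusive")

    if existing_content is None:
        write_file = True    # assumption: a missing file is always written
    elif overwrite:
        write_file = True    # contents not checked, data assumed changed
    elif existing_unchanged:
        write_file = False   # contents not checked, file assumed unchanged
    else:
        write_file = existing_content != new_content

    if write_file:
        changed = True
    elif previous_changed is None:
        changed = False
    else:
        changed = previous_changed  # keep the value set upstream
    return write_file, changed


# Unchanged template, but an upstream Writer reported a change:
print(writer_decision("same", "same", previous_changed=True))  # (False, True)
# Unchanged file and nothing reported upstream:
print(writer_decision("same", "same"))                         # (False, False)
```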

File name with full path is yielded as data. Context.output is updated with fileext and filename (in case they were not present), and filepath, where filename is its base part (without output directory and extension) and filepath is the complete path.

If context.output.filename is present but empty, LenaRuntimeError is raised.

LaTeX

class LaTeXToPDF(overwrite=False, verbose=1, create_command=None)

Run pdflatex binary for LaTeX files.

It runs in parallel (separate process is spawned for each job) and non-interactively.

overwrite sets whether existing unchanged pdfs shall be overwritten during run().

verbose = 0 suppresses all output messages, 1 prints pdflatex error messages, and a value greater than 1 prints pdflatex stdout.

If you need to run pdflatex (or other executable) with different parameters, provide its command.

create_command is a function which accepts texfile_name, outfilename, output_directory, context (in this order) and returns a list made of the command and its arguments.
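
A custom create_command could look like the following sketch, which swaps pdflatex for xelatex and keeps the same flags as the documented default command. The function name create_xelatex_command is hypothetical, the argument order follows the description above, and the sample argument values are made up for the demonstration.

```python
def create_xelatex_command(texfile_name, outfilename, output_directory, context):
    """Return the command and its arguments as a list.

    outfilename and context are part of the required signature,
    but this simple sketch does not use them.
    """
    return [
        "xelatex", "-halt-on-error", "-interaction", "batchmode",
        "-output-directory", output_directory, texfile_name,
    ]


# Hypothetical sample values, only to show the shape of the result.
command = create_xelatex_command("plot.tex", "plot.pdf", "output", {})
print(command)
```

Such a function would be passed to the element as LaTeXToPDF(create_command=create_xelatex_command).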

Default command is:
["pdflatex", "-halt-on-error", "-interaction", "batchmode",
 "-output-directory", output_directory, texfile_name]

run(flow)

Convert all incoming LaTeX files to pdf.

A value from flow corresponds to a TeX file if its context.output.filetype is “tex”. Other values pass unchanged.

If the resulting pdf file exists and context.output.changed is not set to True, pdf rendering is not run. Set overwrite to True to always recreate pdfs.

class RenderLaTeX(select_template='', template_path='.', select_data=None)

Create LaTeX from templates and data.

select_template is a string or a callable. If a string, it is the name of the template to be used (unless context.output.template overwrites that). If select_template is a callable, it must accept a value from the flow and return template name. If select_template is an empty string (default) and no template could be found in the context, LenaRuntimeError is raised.

template_path is the path for templates (used in jinja2.FileSystemLoader). By default, it is the current directory.

select_data is a callable to choose data to be rendered. It should accept a value from flow and return boolean. If it is not provided, by default CSV files are selected.
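
Both callables receive a value from the flow. The sketch below assumes that a value is a (data, context) tuple. The template names and the use of context.variable.name are hypothetical choices for illustration, not defaults of RenderLaTeX.

```python
def select_template(value):
    """Return a template name chosen from the value's context."""
    data, context = value
    name = context.get("variable", {}).get("name")
    # Hypothetical naming scheme: one template per variable, else a fallback.
    return "default.tex" if name is None else name + ".tex"


def select_data(value):
    """Return True for CSV values that also describe a variable."""
    data, context = value
    is_csv = context.get("output", {}).get("filetype") == "csv"
    return is_csv and "variable" in context


value = ("0, 1\n1, 2", {"output": {"filetype": "csv"},
                        "variable": {"name": "x"}})
print(select_template(value))  # x.tex
print(select_data(value))      # True
```

These functions would be passed as RenderLaTeX(select_template=select_template, select_data=select_data).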

run(flow)

Render values from flow to LaTeX.

If no select_data was initialized, values with context.output.filetype equal to “csv” are selected by default.

Rendered LaTeX text is yielded in the data part of the tuple (no write to the filesystem occurs). context.output.filetype is updated to “tex”.

Values that are not selected pass unchanged.