cmt.base_tasks.plotting

Plotting tasks.

Class BasePlotTask

class cmt.base_tasks.plotting.BasePlotTask(*args, **kwargs)[source]

Bases: ConfigTaskWithCategory

Task that wraps the parameters shared by all plotting tasks. It cannot be run on its own.

Parameters:
  • feature_names (csv string) – names of features to plot. Uses all features when empty.

  • feature_tags (csv string) – list of tags for filtering features selected via feature names.

  • skip_feature_names (csv string) – names or name patterns of features to skip.

  • skip_feature_tags (csv string) – list of tags of features to skip

  • apply_weights (bool) – whether to apply weights and scaling to all histograms.

  • n_bins (int) – Custom number of bins for plotting, defaults to the value configured by the feature when empty.

  • systematics (csv list) – NOT YET IMPLEMENTED. List of custom systematics to be considered.

  • store_systematics (bool) – whether to store systematic templates inside the output root files.

  • shape_region (str from choice list) – shape region used for QCD computation.

  • remove_horns (bool) – NOT YET IMPLEMENTED. Whether to remove the eta horns present in 2017

  • optimization_method (str) – Optimization method to be used. Only bayesian_blocks available.
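
As a rough illustration of how the four feature-selection parameters above could interact, here is a minimal sketch. It is not taken from the cmt source: the function `select_features`, the dictionary layout of a feature, the use of `fnmatch`-style patterns for skipping, and the "any matching tag" rule are all assumptions made for the example.

```python
from fnmatch import fnmatch


def select_features(features, feature_names=(), feature_tags=(),
                    skip_feature_names=(), skip_feature_tags=()):
    """Return the names of the features that survive all four filters.

    `features` is a list of dicts with "name" and "tags" keys
    (hypothetical layout, chosen only for this sketch).
    """
    selected = []
    for feat in features:
        name = feat["name"]
        tags = set(feat.get("tags", ()))
        # An empty feature_names means "use all features".
        if feature_names and name not in feature_names:
            continue
        # Assumption: a feature passes if it carries at least one requested tag.
        if feature_tags and not tags & set(feature_tags):
            continue
        # Assumption: skip patterns use shell-style wildcards.
        if any(fnmatch(name, pattern) for pattern in skip_feature_names):
            continue
        if tags & set(skip_feature_tags):
            continue
        selected.append(name)
    return selected
```

With three features `lep1_pt`, `lep1_eta` and `bjet1_pt`, passing `skip_feature_names=["*_eta"]` would keep only the two `pt` features in this sketch.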

Class PrePlot

class cmt.base_tasks.plotting.PrePlot(*args, **kwargs)[source]

Bases: DatasetTaskWithCategory, BasePlotTask, LocalWorkflow, HTCondorWorkflow, SGEWorkflow, RDFModuleTask

Performs the filling of histograms for all features considered. If systematics are considered, it also produces the same histograms after applying those.

Parameters:
  • skip_processing (bool) – whether to skip the preprocessing and categorization steps.

  • skip_merging (bool) – whether to skip the MergeCategorization task.

  • preplot_modules_file (str) – filename inside cmt/config/ or ../config/ (w/o extension) with the RDF modules to run.

get_syst_list()[source]

Returns the list of systematics that affect present or past selections and therefore require dedicated input ntuples.

create_branch_map()[source]
Returns:

number of files after merging (usually 1) unless skip_processing == True

Return type:

int

workflow_requires()[source]
requires()[source]

Each branch requires one input file for the central value and two per systematic considered
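
The counting rule above (one central input plus two per systematic) can be sketched as follows. The helper `input_keys`, the key naming scheme, and the assumption that the two variations are "up" and "down" are illustrative only and are not confirmed by the cmt source.

```python
def input_keys(systematics, directions=("up", "down")):
    """Enumerate the inputs one branch would need in this sketch.

    Assumption: the two inputs per systematic are its up and down
    variations; the key names are hypothetical.
    """
    keys = ["central"]
    for syst in systematics:
        for direction in directions:
            keys.append(f"{syst}_{direction}")
    return keys
```

For two systematics this yields five keys: the central one plus an up and a down variation for each.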

output()[source]
Returns:

One file per input file with all histograms to be plotted for each feature

Return type:

.root

get_weight(category, syst_name, syst_direction, **kwargs)[source]

Obtains the product of all weights depending on the category/channel applied. Returns “1” if it’s a data sample or the apply_weights parameter is set to False.

Returns:

Product of all weights to be applied

Return type:

str
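
Since the return type is a string, the product of weights is presumably an expression to be evaluated later rather than a number. A minimal sketch of that behaviour, assuming the weights are given as a list of column names or expressions (the helper `build_weight` and its arguments are hypothetical, not the cmt signature):

```python
def build_weight(weights, is_data=False, apply_weights=True):
    """Join weight expressions into a single product string.

    Returns "1" for data samples or when weights are disabled, mirroring
    the behaviour described above. Returning "1" for an empty list is an
    extra assumption of this sketch.
    """
    if is_data or not apply_weights or not weights:
        return "1"
    # Parenthesise each term so compound expressions multiply safely.
    return " * ".join(f"({w})" for w in weights)
```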

run()[source]

Creates one RDataFrame per input file, runs the desired RDFModules and produces a set of plots for each feature: one for the nominal value and, if available, one for each systematic variation.

Class FeaturePlot

class cmt.base_tasks.plotting.FeaturePlot(*args, **kwargs)[source]

Bases: BasePlotTask, DatasetWrapperTask

Performs the actual histogram plotting: loads the histograms obtained in the PrePlot tasks, rescales them if needed and plots and saves them.

Example command:

law run FeaturePlot --version test --category-name etau --config-name ul_2018 --process-group-name etau --feature-names Htt_svfit_mass,lep1_pt,bjet1_pt,lep1_eta,bjet1_eta --workers 20 --PrePlot-workflow local --stack --hide-data False --do-qcd --region-name etau_os_iso --dataset-names tt_dl,tt_sl,dy_high,wjets,data_etau_a,data_etau_b,data_etau_c,data_etau_d --MergeCategorizationStats-version test_old

Parameters:
  • stack (bool) – whether to show all backgrounds stacked (True) or normalized to 1 (False)

  • do_qcd (bool) – whether to estimate QCD using the ABCD method

  • qcd_wp (str from choice list) – working point to use for QCD estimation

  • qcd_signal_region_wp (str) – region to use as signal region for QCD estimation

  • shape_region (str from choice list) – region to use as shape region for QCD estimation

  • qcd_sym_shape (bool) – whether to symmetrise the shape coming from both possible shape regions

  • qcd_category_name (str) – category name used for the same sign regions in QCD estimation

  • hide_data (bool) – whether to show (False) or hide (True) the data histograms

  • normalize_signals (bool) – whether to normalize signals to the total background yield (True) or not (False)

  • avoid_normalization (bool) – whether to avoid normalizing by cross section and initial number of events

  • blinded (bool) – whether to blind data in specified regions. The blinding ranges are specified using the blinded_range parameter in the Feature definition. This parameter can include a list ([initial, final]) or a list of lists ([[init_1, fin_1], [init_2, fin_2], …])

  • save_png (bool) – whether to save plots in png

  • save_pdf (bool) – whether to save plots in pdf

  • save_root (bool) – whether to write plots in a root file

  • save_yields (bool) – whether to save histogram yields in a json file

  • process_group_name (str) – name of the process grouping

  • bins_in_x_axis (bool) – (NOT YET IMPLEMENTED) whether to plot histograms with the bin numbers in the x axis instead of the actual values

  • plot_systematics (bool) – (NOT YET FULLY IMPLEMENTED) whether to plot histograms with their uncertainties

  • fixed_colors (bool) – whether to plot histograms with their defined colors (False) or fixed colors (True) starting from ROOT color 2.

  • log_y (bool) – whether to set y axis to log scale

  • include_fit (str) – YAML file inside config folder (w/o extension) including input parameters for the fit

  • propagate_syst_qcd (bool) – whether to propagate systematics to qcd background
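
The blinded parameter above relies on a blinded_range that may be either a single [initial, final] pair or a list of such pairs. One way to handle both shapes uniformly is sketched below; the helpers `normalize_blinded_range` and `is_blinded` are hypothetical, and treating the range boundaries as inclusive is an assumption.

```python
def normalize_blinded_range(blinded_range):
    """Turn [lo, hi] or [[lo1, hi1], [lo2, hi2], ...] into a list of tuples."""
    if not blinded_range:
        return []
    if isinstance(blinded_range[0], (list, tuple)):
        return [tuple(r) for r in blinded_range]
    return [tuple(blinded_range)]


def is_blinded(value, blinded_range):
    """True if `value` falls inside any blinded interval (bounds inclusive here)."""
    return any(lo <= value <= hi
               for lo, hi in normalize_blinded_range(blinded_range))
```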

requires()[source]

All requirements needed:

  • Histograms coming from the PrePlot task.

  • Number of total events coming from the MergeCategorizationStats task (to normalize MC histograms).

  • If estimating QCD, FeaturePlot for the three additional QCD regions needed.
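
The three additional regions are what a generic ABCD estimate needs: the yield in the signal region A is extrapolated as N_A = N_B * N_C / N_D, where B provides the shape and C/D the transfer factor. The sketch below shows only that arithmetic. Which cmt regions play the roles of B, C and D, and how non-QCD contributions are subtracted beforehand, is not stated here, so the function and argument names are assumptions.

```python
def abcd_qcd_yield(n_shape, n_num, n_den):
    """Generic ABCD extrapolation: n_shape * n_num / n_den.

    The inputs are assumed to be QCD-like yields (for example data minus
    non-QCD backgrounds) in the three control regions. Returning 0.0 for
    a non-positive denominator is a choice made for this sketch.
    """
    if n_den <= 0:
        return 0.0
    return n_shape * n_num / n_den
```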

get_output_postfix(key='pdf')[source]
Returns:

string to be included in the output filenames

Return type:

str

output()[source]

Output files to be filled: pdf, png, root or json

complete()[source]

The task is complete when all outputs are present.

setup_signal_hist(hist, color)[source]

Method to apply signal format to a histogram

setup_background_hist(hist, color)[source]

Method to apply background format to a histogram

setup_data_hist(hist, color)[source]

Method to apply data format to a histogram

plot(feature, ifeat=0)[source]

Performs the actual plotting.

run()[source]

Splits processes into data, signal and background. Creates the histograms for each process by loading them from the input files, then scales them and applies the corresponding format.
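
The first step of run(), splitting processes into three groups, could look roughly like this. The dictionary layout and the `is_data` / `is_signal` flags are assumptions for the sketch and may not match how cmt process objects expose this information.

```python
def split_processes(processes):
    """Group process names into data, signal and background.

    Each process is a dict with a "name" and optional boolean flags
    "is_data" and "is_signal" (hypothetical layout). Anything that is
    neither data nor signal is treated as background.
    """
    groups = {"data": [], "signal": [], "background": []}
    for proc in processes:
        if proc.get("is_data", False):
            groups["data"].append(proc["name"])
        elif proc.get("is_signal", False):
            groups["signal"].append(proc["name"])
        else:
            groups["background"].append(proc["name"])
    return groups
```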