cmt.base_tasks.preprocessing
Preprocessing tasks.
Class PreCounter
- class cmt.base_tasks.preprocessing.PreCounter(*args, **kwargs)
Bases:
DatasetTask, LocalWorkflow, HTCondorWorkflow, SGEWorkflow, SplittedTask, RDFModuleTask
Counts the events with and without applying the necessary weights. Weights are read from the config file. If they have to be computed first, RDF modules can be run.
Example command:
law run PreCounter --version test --config-name base_config --dataset-name ggf_sm --workflow htcondor --weights-file weight_file
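As a rough illustration of what counting "with and without weights" means, the sketch below keeps a plain event count next to a sum of per-event weights. It is plain Python, not the task's implementation (which runs RDF modules). The event records, the weight names `genWeight` and `puWeight`, and the choice to multiply the weights into one per-event weight are all assumptions made for this example.

```python
from math import prod


def count_events(events, weight_names):
    """Return (raw count, weighted count) for a list of event dicts."""
    n_events = len(events)
    # Assumption: an event's total weight is the product of the listed weights.
    n_weighted = sum(prod(ev[name] for name in weight_names) for ev in events)
    return n_events, n_weighted


# Hypothetical events; in practice the weights would come from the input files.
events = [
    {"genWeight": 1.0, "puWeight": 0.9},
    {"genWeight": -1.0, "puWeight": 1.1},
    {"genWeight": 1.0, "puWeight": 1.0},
]
raw, weighted = count_events(events, ["genWeight", "puWeight"])
print(raw, round(weighted, 6))
```

The two numbers differ whenever weights deviate from one, which is why both are stored.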
Class PreCounterWrapper
- class cmt.base_tasks.preprocessing.PreCounterWrapper(*args, **kwargs)
Bases:
DatasetSystWrapperTask
Wrapper task to run the PreCounter task over several datasets in parallel.
Example command:
law run PreCounterWrapper --version test --config-name base_config --dataset-names tt_dl,tt_sl --PreCounter-weights-file weight_file --workers 2
Class PreprocessRDF
- class cmt.base_tasks.preprocessing.PreprocessRDF(*args, **kwargs)
Bases:
PreCounter, DatasetTaskWithCategory
Performs the preprocessing step: applies a preselection and runs RDF modules.
See requirements in PreCounter.
Example command:
law run PreprocessRDF --version test --category-name base_selection --config-name base_config --dataset-name ggf_sm --workflow htcondor --modules-file modulesrdf --workers 10 --max-runtime 12h
- weights_file = None
Class PreprocessRDFWrapper
- class cmt.base_tasks.preprocessing.PreprocessRDFWrapper(*args, **kwargs)
Bases:
DatasetCategorySystWrapperTask
Wrapper task to run the PreprocessRDF task over several datasets in parallel.
Example command:
law run PreprocessRDFWrapper --version test --category-name base_selection --config-name ul_2018 --dataset-names tt_dl,tt_sl --PreprocessRDF-workflow htcondor --PreprocessRDF-max-runtime 48h --PreprocessRDF-modules-file modulesrdf --workers 10
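The wrapper examples pass parameters to the wrapped task by prefixing them with its name (`--PreprocessRDF-workflow`, `--PreprocessRDF-max-runtime`). The helper below is a hypothetical convenience, not part of cmt or law, that assembles such a command line from two dictionaries. `build_wrapper_command` and its argument names are invented for this sketch.

```python
import shlex


def build_wrapper_command(wrapper, wrapped, wrapper_params, wrapped_params):
    """Assemble a `law run` command string for a wrapper task (hypothetical helper)."""
    args = ["law", "run", wrapper]
    # Parameters of the wrapper itself are passed unprefixed.
    for name, value in wrapper_params.items():
        args += [f"--{name}", str(value)]
    # Parameters meant for the wrapped task carry its name as a prefix.
    for name, value in wrapped_params.items():
        args += [f"--{wrapped}-{name}", str(value)]
    return shlex.join(args)


cmd = build_wrapper_command(
    "PreprocessRDFWrapper",
    "PreprocessRDF",
    {"version": "test", "category-name": "base_selection",
     "config-name": "ul_2018", "dataset-names": "tt_dl,tt_sl", "workers": 10},
    {"workflow": "htcondor", "max-runtime": "48h", "modules-file": "modulesrdf"},
)
print(cmd)
```

The printed command contains the same options as the example above, with the prefixed ones grouped at the end.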
Class Categorization
- class cmt.base_tasks.preprocessing.Categorization(*args, **kwargs)
Bases:
PreprocessRDF
Performs the categorization step: runs RDF modules and applies a post-selection.
Example command:
law run Categorization --version test --category-name etau --config-name base_config --dataset-name tt_dl --workflow local --base-category-name base_selection --workers 10 --feature-modules-file features
- region_name = None
Class CategorizationWrapper
- class cmt.base_tasks.preprocessing.CategorizationWrapper(*args, **kwargs)
Bases:
DatasetCategorySystWrapperTask
Wrapper task to run the Categorization task over several datasets in parallel.
Example command:
law run CategorizationWrapper --version test --category-names etau --config-name base_config --dataset-names tt_dl,tt_sl --Categorization-workflow htcondor --workers 20 --Categorization-base-category-name base_selection
Class MergeCategorization
- class cmt.base_tasks.preprocessing.MergeCategorization(*args, **kwargs)
Bases:
DatasetTaskWithCategory, ForestMerge
Merges the output of the Categorization or PreprocessRDF tasks in order to reduce the parallelization entering the plotting tasks. By default it merges everything into one output file, although a larger number of output files can be set with the merging parameter inside the dataset definition.
In simulated samples, hadd is used to perform the merging. In data samples, haddnano.py (safer but slower) is used instead, to avoid skipping events when the input files have different branches. In either case, one tool or the other can be forced with the parameters --force-hadd and --force-haddnano respectively.
Example command:
law run MergeCategorization --version test --category-name etau --config-name base_config --dataset-name tt_sl --workflow local --workers 4
- Parameters:
from_preprocess (bool) – whether to merge the output of the PreprocessRDF task (True) or of the Categorization task (False, default).
force_hadd (bool) – whether to force hadd as the merging tool.
force_haddnano (bool) – whether to force haddnano.py as the merging tool.
systematic (str) – systematic to use for categorization.
systematic_direction (str) – systematic direction to use for categorization.
- region_name = None
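The choice of merging tool described for MergeCategorization can be summarized as a small decision function. This is a sketch of the documented behaviour, not the task's code: `choose_merge_tool` and its `is_data` argument are hypothetical names, and raising an error when both force flags are set is an assumption, since the documentation does not say how that combination is handled.

```python
def choose_merge_tool(is_data, force_hadd=False, force_haddnano=False):
    """Return the name of the merging tool for a dataset (illustrative only)."""
    if force_hadd and force_haddnano:
        # Assumption: the documentation does not define this combination.
        raise ValueError("force_hadd and force_haddnano cannot both be set")
    if force_hadd:
        return "hadd"
    if force_haddnano:
        return "haddnano.py"
    # Default: haddnano.py for data (tolerates differing branches), hadd for simulation.
    return "haddnano.py" if is_data else "hadd"


print(choose_merge_tool(is_data=False))
print(choose_merge_tool(is_data=True))
print(choose_merge_tool(is_data=True, force_hadd=True))
```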
Class MergeCategorizationWrapper
- class cmt.base_tasks.preprocessing.MergeCategorizationWrapper(*args, **kwargs)
Bases:
DatasetCategorySystWrapperTask
Wrapper task to run the MergeCategorization task over several datasets in parallel.
Example command:
law run MergeCategorizationWrapper --version test --category-names etau --config-name base_config --dataset-names tt_dl,tt_sl --workers 10
Class MergeCategorizationStats
- class cmt.base_tasks.preprocessing.MergeCategorizationStats(*args, **kwargs)
Bases:
DatasetTask, ForestMerge
Merges the output from the PreCounter task in order to reduce the parallelization entering the plotting tasks.
Example command:
law run MergeCategorizationStats --version test --config-name base_config --dataset-name dy_high --workers 10
Class MergeCategorizationStatsWrapper
- class cmt.base_tasks.preprocessing.MergeCategorizationStatsWrapper(*args, **kwargs)
Bases:
DatasetSystWrapperTask
Wrapper task to run the MergeCategorizationStats task over several datasets in parallel.
Example command:
law run MergeCategorizationStatsWrapper --version test --config-name base_config --dataset-names tt_dl,tt_sl --workers 10
Class EventCounterDAS
- class cmt.base_tasks.preprocessing.EventCounterDAS(*args, **kwargs)
Bases:
DatasetTask
Counts the events with and without applying the necessary weights. Weights are read from the config file. If they have to be computed first, RDF modules can be run.
Example command:
law run EventCounterDAS --version test --config-name base_config --dataset-name ggf_sm
- Parameters:
use_secondary_dataset (bool) – whether to use the dataset given in the secondary_dataset parameter of the dataset definition instead of the actual dataset.
Class EventCounterDASWrapper
- class cmt.base_tasks.preprocessing.EventCounterDASWrapper(*args, **kwargs)
Bases:
DatasetSuperWrapperTask
Wrapper task to run the EventCounterDAS task over several datasets in parallel.
Example command:
law run EventCounterDASWrapper --version test --config-name base_config --dataset-names tt_dl,tt_sl --workers 2