Classifier¶
Folder Structure¶
classifier/: main package for classifier
- Machine Learning
  ml/: high-level ML workflows and utilities
  nn/: neural network models
- Task System
  task/: task protocols and command-line interface
  config/: task configurations
  test/: task configurations for testing
- Monitor System
  monitor/: monitor core and components
- Others
  data/: model data archives
  algorithm/: algorithms implemented with torch.Tensor
  compatibility/: 4b analysis related modules
  root/: ROOT I/O utilities
  df/: pd.DataFrame utilities
  process/: multiprocessing utilities
  patch/: unreleased critical bug fixes
pyml.py: runs the classifier jobs; can be used as an executable.
Getting Started¶
Setup Environment¶
Note
The following commands assume that you are in the coffea4bees/python/ directory.
Use Container (Recommended)¶
Warning
You may need to change the apptainer cache and temp directory before pulling any image, especially when the home directory has a limited quota. The directories are controlled by the following environment variables:
export APPTAINER_CACHEDIR=
export APPTAINER_TMPDIR=
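As a minimal sketch of how the two variables above might be set, the helper below creates a directory and points both the cache and the temp location at it. The function name set_apptainer_dirs is hypothetical and not part of this repository, and which directory to use is site-specific.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): point the apptainer
# cache and temp directories at a location with enough free space.
set_apptainer_dirs() {
    local base="$1"   # target directory; the right choice depends on your site
    if [ -z "$base" ]; then
        echo "usage: set_apptainer_dirs <directory>" >&2
        return 1
    fi
    # Create the directory first so apptainer can write to it.
    mkdir -p "$base" || return 1
    export APPTAINER_CACHEDIR="$base"
    export APPTAINER_TMPDIR="$base"
}
```

The function has to be defined and called in the current shell (for example from ~/.bashrc) for the exports to persist, e.g. set_apptainer_dirs "/path/to/large/disk/${USER}/.apptainer", where the path is a placeholder.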
The Docker image is available as:
- docker://chuyuanliu/heptools:ml
- /cvmfs/unpacked.cern.ch/registry.hub.docker.com/chuyuanliu/heptools:ml (only when CVMFS is available)
The image is built from the following configurations:
- base.Dockerfile: base image
- ml.Dockerfile: ml image derived from base image
- base.yml: used by base.Dockerfile
- base-linux.yml: used by base.Dockerfile
- ml.yml: used by ml.Dockerfile
Run the following command to start an interactive shell:
apptainer exec \
-B .:/srv \
--nv \
--pwd /srv \
docker://chuyuanliu/heptools:ml \
bash --init-file /entrypoint.sh
where:
- -B .:/srv: mount the current directory to /srv
- --nv: enable GPU
- --pwd /srv: equivalent to cd /srv when starting the container
- bash --init-file /entrypoint.sh: (important) start a bash shell and run the initialization script.
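The command above can be wrapped so that the CVMFS copy of the image is used when it is present and the Docker URI otherwise. This is only a sketch: the function names heptools_image and start_ml_shell are hypothetical, and it assumes the CVMFS path is an unpacked image directory that apptainer can run directly.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the repository).
CVMFS_IMAGE="/cvmfs/unpacked.cern.ch/registry.hub.docker.com/chuyuanliu/heptools:ml"
DOCKER_IMAGE="docker://chuyuanliu/heptools:ml"

# Print the image to use: the CVMFS directory if it exists, else the Docker URI.
heptools_image() {
    # An optional first argument overrides the CVMFS path (handy for testing).
    local cvmfs_path="${1:-$CVMFS_IMAGE}"
    if [ -d "$cvmfs_path" ]; then
        echo "$cvmfs_path"
    else
        echo "$DOCKER_IMAGE"
    fi
}

# Start the interactive shell with the same options as the documented command.
start_ml_shell() {
    apptainer exec \
        -B .:/srv \
        --nv \
        --pwd /srv \
        "$(heptools_image)" \
        bash --init-file /entrypoint.sh
}
```

After sourcing the snippet, calling start_ml_shell from the coffea4bees/python/ directory would open the container shell. Using the CVMFS copy avoids pulling the image into the local apptainer cache.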
Use Conda¶
The conda environment can be created from the base.yml, base-linux.yml and ml.yml files listed above.
classifier/env.yml is deprecated and not actively maintained.
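One possible way to combine the three files is to create the environment from base.yml and then apply the other two as updates. The sketch below makes several assumptions: the environment name classifier-ml is made up, the files are assumed to be in the current directory, and the order in which they are applied is a guess rather than the documented procedure. The function name create_ml_env and the CONDA_RUNNER variable are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): build one conda
# environment from the three environment files.
create_ml_env() {
    local env_name="${1:-classifier-ml}"   # assumed name, pick your own
    # Set CONDA_RUNNER="echo conda" to print the commands instead of running them.
    local run="${CONDA_RUNNER:-conda}"
    # Left unquoted on purpose so that "echo conda" splits into two words.
    $run env create -n "$env_name" -f base.yml &&
    $run env update -n "$env_name" -f base-linux.yml &&
    $run env update -n "$env_name" -f ml.yml
}
```

Running CONDA_RUNNER="echo conda" create_ml_env first shows the three commands without touching any environment, which is a cheap way to check the file names and order before a long install.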
rogue01/rogue02 specific¶
- change the cache and temp directory for apptainer:
mkdir -p /mnt/scratch/${USER}/.apptainer
- add the following to ~/.bashrc:
export APPTAINER_TMPDIR=/mnt/scratch/${USER}/.apptainer/
export APPTAINER_CACHEDIR=/mnt/scratch/${USER}/.apptainer/
Command-line Interface¶
See the Task System for details.
Setup Auto-completion¶
To register the auto-completion for the current shell session, run the following command:
source classifier/install.sh
To unregister the auto-completion, run:
source classifier/uninstall.sh
Auto-completion is triggered when the command starts with ./pyml.py and the <tab> key is pressed. It dynamically searches for available tasks in the classifier/config directory and suggests task names or arguments.
Help¶
Use the following command to print help for all tasks:
./pyml.py help --all
Training and Evaluation¶
See HCR Training for a complete example of training and evaluating an HCR model for SvB and FvT.
Monitor¶
A monitor is provided to collect logs, progress, resource metrics and other information from worker processes and nodes. See the Monitor System for details.
Histogram¶
Histogramming is handled by dask processors for better performance and compatibility. See Histogram for details.