monitor – Monitor Training of Neural Networks
-
class neuralnet_pytorch.monitor.Monitor(model_name=None, root=None, current_folder=None, print_freq=100, num_iters=None, prefix='run', use_visdom=False, use_tensorboard=False, send_slack=False, with_git=False, **kwargs)
Collects statistics and displays the results using various backends. The collected stats are stored in ‘<root>/<model_name>/<prefix><#id>’, where #id is automatically assigned each time a new run starts.
Examples
The following snippet shows how to plot smoothed training losses and save images from the current iteration, and then display them every 100 iterations.
from neuralnet_pytorch import monitor as mon

mon.model_name = 'foo-model'
mon.set_path()
mon.print_freq = 100

...
for epoch in mon.iter_epoch(range(n_epochs)):
    for data in mon.iter_batch(data_loader):
        loss = net(data)
        mon.plot('training loss', loss, smooth=.99, filter_outliers=True)
        mon.imwrite('input images', data['images'], latest_only=True)
...
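The snippet above passes smooth=.99 to mon.plot without saying how the smoothing is computed. A common interpretation of such a factor is an exponential moving average, and the sketch below illustrates that idea only. The function name smooth_series is hypothetical, and whether the library smooths in exactly this way is an assumption.

```python
def smooth_series(values, smooth=0.99):
    """Return the exponential moving average of `values`.

    Assumption for this sketch: `smooth` is the weight kept from the
    previous smoothed value, and the first value seeds the average.
    """
    smoothed = []
    running = None
    for value in values:
        if running is None:
            running = value  # the first sample starts the average
        else:
            running = smooth * running + (1.0 - smooth) * value
        smoothed.append(running)
    return smoothed


# A noisy "loss" curve that alternates between 1 and 0.
losses = [1.0, 0.0, 1.0, 0.0]
ema = smooth_series(losses, smooth=0.5)
print(ema)
```

A factor close to 1, such as .99, makes the plotted curve react slowly to individual noisy iterations, which is usually what is wanted for a training loss.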
Parameters:
- model_name (str) – name of the model folder. Default: None.
- root (str) – path to store the collected statistics. Default: None.
- current_folder (str) – if given, all the stats will be loaded from the given folder. Default: None.
- print_freq (int) – statistics display frequency, in iterations. Default: 100.
- num_iters (int) – number of iterations per epoch. If specified, training iteration percentage will be displayed along with epoch. Otherwise, it will be automatically calculated in the first epoch. Default: None.
- prefix (str) – prefix for the folder name of each run. Default: 'run'.
- use_visdom (bool) – whether to use Visdom for real-time monitoring. Default: False.
- use_tensorboard (bool) – whether to use Tensorboard for real-time monitoring. Default: False.
- send_slack (bool) – whether to send the statistics to a Slack chatroom. Default: False.
- with_git (bool) – whether to retrieve git information. Default: False.
- kwargs – miscellaneous options for Visdom and other functions.
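The class description says each run is stored in '<root>/<model_name>/<prefix><#id>' with an automatically assigned id. As a rough illustration of that layout, the sketch below picks the next free id in a temporary directory. The helper next_run_folder and its numbering rule (start at 1, then highest existing id plus one) are assumptions made for this example and are not taken from the library.

```python
import os
import tempfile


def next_run_folder(root, model_name, prefix="run"):
    """Create and return '<root>/<model_name>/<prefix><id>' with the next free id.

    Hypothetical helper: the numbering rule is an assumption of this sketch.
    """
    model_dir = os.path.join(root, model_name)
    os.makedirs(model_dir, exist_ok=True)

    # Collect the numeric suffixes of existing '<prefix><id>' folders.
    used_ids = []
    for name in os.listdir(model_dir):
        suffix = name[len(prefix):]
        if name.startswith(prefix) and suffix.isdigit():
            used_ids.append(int(suffix))

    run_dir = os.path.join(model_dir, f"{prefix}{max(used_ids, default=0) + 1}")
    os.makedirs(run_dir)
    return run_dir


with tempfile.TemporaryDirectory() as root:
    first = next_run_folder(root, "foo-model")
    second = next_run_folder(root, "foo-model")
    run_names = [os.path.basename(first), os.path.basename(second)]

print(run_names)
```

Comparing the suffixes as integers rather than strings keeps the ordering correct once the id passes 9.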
-
path
contains all the runs of model_name.
-
current_folder
path to the current run.
-
vis
an instance of Visdom when use_visdom is set to True.
-
writer
an instance of Tensorboard’s SummaryWriter when use_tensorboard is set to True.
-
plot_folder
path to the folder containing the collected plots.
-
file_folder
path to the folder containing the collected files.
-
image_folder
path to the folder containing the collected images.
-
hist_folder
path to the folder containing the collected histograms.
-
clear_hist_stats(key)
removes the collected statistics for histogram plot of the specified key.
Parameters: key – the name of the histogram collection.
Returns: None.
-
clear_mat_stats(key)
removes the collected statistics for matrix plot of the specified key.
Parameters: key – the name of the matrix collection.
Returns: None.
-
clear_num_stats(key)
removes the collected statistics for scalar plot of the specified key.
Parameters: key – the name of the scalar collection.
Returns: None.
-
epoch
returns the current epoch.
Returns: _last_epoch.
-
hist_stats
returns the collected tensors from beginning.
Returns: _hist_since_beginning.
-
iter
returns the current iteration.
Returns: _iter.
-
iter_batch(iterator)
tracks training iteration and returns the item in iterator.
Parameters: iterator – the batch iterator, e.g., enumerate(loader).
Returns: a generator over iterator.
Examples
>>> from neuralnet_pytorch import monitor as mon
>>> mon.print_freq = 1000
>>> data_loader = ...
>>> num_epochs = 10
>>> for epoch in mon.iter_epoch(range(num_epochs)):
...     for idx, data in mon.iter_batch(enumerate(data_loader)):
...         # do something here
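To make the idea of a generator that tracks iterations while passing items through more concrete, here is a minimal standalone sketch. The class IterationCounter and its reports list are hypothetical stand-ins. The point at which the counter advances (after each item has been consumed) is an assumption and may differ from the library.

```python
class IterationCounter:
    """Hypothetical stand-in for the iteration tracking described above."""

    def __init__(self, print_freq=100):
        self.print_freq = print_freq
        self.iter = 0
        self.reports = []  # iterations at which stats would be displayed

    def iter_batch(self, iterator):
        # Yield every item unchanged, then advance the iteration counter.
        for item in iterator:
            yield item
            self.iter += 1
            if self.iter % self.print_freq == 0:
                self.reports.append(self.iter)


counter = IterationCounter(print_freq=2)
batches = ["a", "b", "c", "d", "e"]
seen = [batch for _, batch in counter.iter_batch(enumerate(batches))]
print(seen, counter.iter, counter.reports)
```

Because the wrapper is a generator, the loop body still receives exactly what the wrapped iterator yields, so wrapping enumerate(loader) keeps the (index, batch) pairs intact.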
-
iter_epoch(iterator)
tracks training epoch and returns the item in iterator.
Parameters: iterator – the epoch iterator, e.g., range(num_epochs).
Returns: a generator over iterator.
Examples
>>> from neuralnet_pytorch import monitor as mon
>>> mon.print_freq = 1000
>>> num_epochs = 10
>>> for epoch in mon.iter_epoch(range(mon.epoch, num_epochs)):
...     # do something here
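The example above starts the range at mon.epoch, which lets a restored run continue where it stopped. The sketch below shows that pattern with a hypothetical EpochCounter class. How the library actually stores and restores the last epoch is not shown in this page, so the details here are assumptions.

```python
class EpochCounter:
    """Hypothetical stand-in that remembers the last epoch it yielded."""

    def __init__(self, last_epoch=0):
        self.epoch = last_epoch

    def iter_epoch(self, iterator):
        # Record each epoch before handing it to the loop body.
        for item in iterator:
            self.epoch = item
            yield item


num_epochs = 5

# A fresh run starts from epoch 0.
fresh = EpochCounter()
fresh_epochs = list(fresh.iter_epoch(range(fresh.epoch, num_epochs)))

# A resumed run, assumed to have been restored at epoch 3.
resumed = EpochCounter(last_epoch=3)
resumed_epochs = list(resumed.iter_epoch(range(resumed.epoch, num_epochs)))

print(fresh_epochs, resumed_epochs)
```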
-
load(file, method='pickle', version=-1, **kwargs)
loads from the given file.
Parameters:
- file – name of the saved file without version.
- method – str or callable. If callable, it should be a custom method to load object. There are 3 types of str:
  'pickle': for objects stored with pickle.dump().
  'torch': for objects stored with torch.save().
  'txt': for objects stored with numpy.savetxt().
  Default: 'pickle'.
- version – the version of the saved file to load. Default: -1 (loads the latest version of the saved file).
- kwargs – additional keyword arguments to the underlying load function.
Returns: None.
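The version parameter above defaults to -1, meaning "the latest saved version". The following sketch shows one way such a lookup could work for pickled files. The helpers dump_versioned and load_versioned and the file naming scheme '<name>-<version>.pkl' are hypothetical and chosen only for this illustration. The library's real on-disk naming is not documented in this page.

```python
import os
import pickle
import tempfile


def dump_versioned(folder, name, obj, version):
    # Assumed naming scheme for this sketch: '<name>-<version>.pkl'.
    with open(os.path.join(folder, f"{name}-{version}.pkl"), "wb") as f:
        pickle.dump(obj, f)


def load_versioned(folder, name, version=-1):
    """Load '<name>-<version>.pkl'; version=-1 selects the highest version found."""
    if version == -1:
        versions = []
        for fname in os.listdir(folder):
            stem, ext = os.path.splitext(fname)
            base, _, ver = stem.rpartition("-")
            if ext == ".pkl" and base == name and ver.isdigit():
                versions.append(int(ver))
        if not versions:
            raise FileNotFoundError(f"no saved versions of {name!r} in {folder}")
        version = max(versions)  # numeric comparison, so 10 beats 2
    with open(os.path.join(folder, f"{name}-{version}.pkl"), "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as folder:
    for v in (1, 2, 10):
        dump_versioned(folder, "stats", {"version": v}, v)
    latest = load_versioned(folder, "stats")
    second = load_versioned(folder, "stats", version=2)

print(latest, second)
```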
-
mat_stats
returns the collected scalar statistics from beginning.
Returns: _num_since_beginning.
-
model_name
returns the name of the model.
Returns: _model_name.
-
num_stats
returns the collected scalar statistics from beginning.
Returns: _num_since_beginning.
-
prefix
returns the prefix of saved folders.
Returns: _prefix.
-
read_log(log)
reads a saved log file.
Parameters: log – name of the log file.
Returns: contents of the log file.
-
reset()
factory-resets the monitor object. This includes clearing all the collected data, setting the iteration and epoch counters to 0, and resetting the timer.
Returns: None.
-
run_training(net, solver: torch.optim.optimizer.Optimizer, train_loader, n_epochs: int, closure=None, eval_loader=None, valid_freq=None, start_epoch=None, scheduler=None, scheduler_iter=False, device=None, *args, **kwargs)
Runs the training loop for the given neural network.
Parameters:
- net – must be an instance of Net and Module.
- solver – a solver for optimization.
- train_loader – provides training data for neural net.
- n_epochs – number of training epochs.
- closure – a method to calculate loss in each optimization step. Optional.
- eval_loader – provides validation data for neural net. Optional.
- valid_freq – indicates how often validation is run. Takes effect only if eval_loader is given.
- start_epoch – the epoch from which training will continue. If None, training counter will be set to 0.
- scheduler – a learning rate scheduler. Default: None.
- scheduler_iter – if True, scheduler will run every iteration. Otherwise, it will step every epoch. Default: False.
- device – device to perform calculation. Default: None.
- args – additional arguments that will be passed to neural net.
- kwargs – additional keyword arguments that will be passed to neural net.
Returns: None.
Examples
import neuralnet_pytorch as nnt
from neuralnet_pytorch import monitor as mon


class MyNet(nnt.Net, nnt.Module):
    ...

    def train_procedure(self, batch, *args, **kwargs):
        loss = ...
        mon.plot('train loss', loss)
        return loss

    def eval_procedure(self, batch, *args, **kwargs):
        pred = ...
        loss = ...
        acc = ...
        mon.plot('eval loss', loss)
        mon.plot('eval accuracy', acc)


# define the network, and training and testing loaders
net = MyNet(...)
train_loader = ...
eval_loader = ...
solver = ...
scheduler = ...

# instantiate a Monitor object
mon.model_name = 'my_net'
mon.print_freq = 100
mon.set_path()

# collect the parameters of the network
def save_checkpoint():
    states = {
        'states': mon.epoch,
        'model_state_dict': net.state_dict(),
        'opt_state_dict': solver.state_dict()
    }
    if scheduler is not None:
        states['scheduler_state_dict'] = scheduler.state_dict()

    mon.dump(name='training.pt', obj=states, type='torch', keep=5)

# save a checkpoint after each epoch and keep only the 5 latest checkpoints
mon.schedule(save_checkpoint)

print('Training...')

# run the training loop
mon.run_training(net, solver, train_loader, n_epochs, eval_loader=eval_loader,
                 scheduler=scheduler, valid_freq=val_freq)
print('Training finished!')
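The scheduler_iter and start_epoch parameters described above change how often the scheduler steps and where the epoch counter starts. The skeleton below illustrates those two behaviours in plain Python. CountingScheduler and training_skeleton are hypothetical names, and the loop is a simplified sketch under these assumptions and not the library's training loop.

```python
class CountingScheduler:
    """Stub scheduler that only counts how often step() is called."""

    def __init__(self):
        self.steps = 0

    def step(self):
        self.steps += 1


def training_skeleton(n_epochs, batches_per_epoch, scheduler=None,
                      scheduler_iter=False, start_epoch=None):
    """Return the number of iterations run; no real optimization happens here."""
    first_epoch = 0 if start_epoch is None else start_epoch
    iterations = 0
    for _epoch in range(first_epoch, n_epochs):
        for _batch in range(batches_per_epoch):
            iterations += 1  # a real loop would compute the loss and update weights
            if scheduler is not None and scheduler_iter:
                scheduler.step()  # step once per iteration
        if scheduler is not None and not scheduler_iter:
            scheduler.step()  # step once per epoch
    return iterations


per_epoch = CountingScheduler()
training_skeleton(3, 4, scheduler=per_epoch)

per_iter = CountingScheduler()
training_skeleton(3, 4, scheduler=per_iter, scheduler_iter=True)

resumed_iterations = training_skeleton(3, 4, start_epoch=2)
print(per_epoch.steps, per_iter.steps, resumed_iterations)
```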
-
schedule(func, when=None, *args, **kwargs)
schedules a routine to be executed during every epoch in run_training().
Parameters:
- func – a routine to be executed in run_training().
- when – the moment when func is executed. Currently, the choices are 'begin_epoch', 'end_epoch', 'begin_iter', and 'end_iter'. Default: 'begin_epoch'.
- args – additional arguments to func.
- kwargs – additional keyword arguments to func.
Returns: None.
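The scheduling mechanism described above can be pictured as a small registry of callbacks keyed by the moment at which they fire. The sketch below is a standalone illustration. HookRegistry and its fire method are hypothetical, and mapping when=None to 'begin_epoch' follows the documented default but is otherwise an assumption about the behaviour.

```python
from collections import defaultdict

MOMENTS = ("begin_epoch", "end_epoch", "begin_iter", "end_iter")


class HookRegistry:
    """Hypothetical registry mirroring the schedule(func, when, ...) signature."""

    def __init__(self):
        self._hooks = defaultdict(list)

    def schedule(self, func, when=None, *args, **kwargs):
        when = "begin_epoch" if when is None else when
        if when not in MOMENTS:
            raise ValueError(f"unknown moment: {when!r}")
        self._hooks[when].append((func, args, kwargs))

    def fire(self, when):
        # Call every routine registered for this moment, in registration order.
        for func, args, kwargs in self._hooks[when]:
            func(*args, **kwargs)


events = []
hooks = HookRegistry()
hooks.schedule(events.append, None, "begin")        # default moment
hooks.schedule(events.append, "end_epoch", "end")

for _epoch in range(2):
    hooks.fire("begin_epoch")
    for _batch in range(2):
        hooks.fire("begin_iter")
        hooks.fire("end_iter")
    hooks.fire("end_epoch")

print(events)
```

Note that with this signature, extra positional arguments for func can only be supplied if when is also passed positionally.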