Cleaning ======== .. rubric:: Learning Objectives - Learn how to use the ``clean`` attribute. - Define a custom clean function. In the last section, we changed the heatmap plotting function and had to manually remove the target to have it regenerated. Wouldn't it be nice to not have to do that? Yes, yes it would! The best way to accomplish this is via the ``clean`` attribute. As one might guess, the ``clean`` attribute should be a function which cleans up after the task. Conveniently, the project creator has included an extremely useful function for this, called ``clean_targets``, which just goes ahead and ``rm``'s the target for that task. Here's what it looks like included in our humble download task. .. code:: python from doit.task import clean_targets def task_download_data(): return {'actions': ['curl -OL https://s3.amazonaws.com/pydoit-intermediate/Melee_data.csv.gz'], 'targets': ['Melee_data.csv.gz'], 'uptodate': [run_once], 'clean': [clean_targets]} Now, if we run ``doit clean``, the data file will be removed; if we run ``doit``, all the tasks will be rerun, because they all depend directly or indirectly on the downloaded data file. .. note:: Pydoit's creator clearly saw that removing the targets would be an extremely common task. As such, one does not actually need to import the ``clean_targets`` function; just passing ``[True]`` to ``clean`` will suffice. However, if you would like both remove the targets *and* do something else to revert your task, the list should include ``clean_targets`` and whatever other callable or string you'd like run. ``clean`` can also be a shell command or function that the user defines. For example, we might want to inform the user where the pipeline got the file they just blew away. .. code:: python DATA_URL = 'https://s3.amazonaws.com/pydoit-intermediate/Melee_data.csv.gz' def task_download_data(): def print_url(): print 'File was retrieved from: {0}'.format(DATA_URL) return {'actions': ['curl -OL {0}'.format(DATA_URL)], 'targets': ['Melee_data.csv.gz'], 'uptodate': [run_once], 'clean': [clean_targets, print_url]} The ``clean`` attribute takes the same form as ``actions`` or even ``uptodate``, and so can also be a string to be executed in the shell. .. rubric:: Clean all the things Add ``clean`` attributes to all the currently defined tasks. If you'd like, get creative with your attributes. Include two clean actions in at least one of the tasks (the second doesn't have to do anything groundbreaking).