Setting Up Luigi with a Functional Scheduler History
Is it expected behavior that my Luigi tasks are vanishing from the web interface?
Yeah, they are supposed to go away after a while. Remember that Luigi was designed and built to run the data plumbing at Spotify, with tens of thousands of different jobs running each day. Imagine the clutter if everything stuck around forever. Tasks are visible in the Luigi scheduler web interface while they run and for a short time afterwards. Once they have finished successfully, they are no longer listed.
There is, however, a place where you can see a history of tasks and explore what has been happening, even after that information has been removed from the main page of the web interface. It has to be enabled and configured explicitly in luigi.cfg to be functional; otherwise there simply is no history. You can check by opening the /history URL of the scheduler web interface. If you get a 500 server error page instead of a list of tasks, it's not enabled. Try reloading the page once to be sure.
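The same check can be scripted. The following is only a minimal sketch using the Python standard library: it assumes the scheduler is reachable at http://localhost:8082 (adjust the URL to your setup), and the function names are made up for this example. It relies only on the behavior described above, namely that /history answers with a 500 error when the history is not enabled.

```python
import urllib.error
import urllib.request


def fetch_history_status(base_url="http://localhost:8082", timeout=5.0):
    """Return the HTTP status code of <base_url>/history, or None if unreachable.

    The default base_url is an assumption; point it at your own luigid.
    """
    url = base_url.rstrip("/") + "/history"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Error responses (such as the 500 page) arrive as exceptions.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure and similar.
        return None


def interpret_status(status):
    """Translate a status code into a short human-readable verdict."""
    if status is None:
        return "scheduler unreachable"
    if status == 200:
        return "history enabled"
    if status >= 500:
        return "history probably not enabled"
    return "unexpected response"


if __name__ == "__main__":
    print(interpret_status(fetch_history_status()))
```

Keeping the network call separate from the interpretation makes it easy to reuse the second function in a monitoring script without touching the network.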
You can fix this by adding the following lines to your luigi.cfg. You may need to create the file first; the typical location for it is /etc/luigi/client.cfg, or a luigi.cfg file in the working directory from which you start luigid.
    [scheduler]
    record_task_history = True
    state_path = /home/vagrant/luigi-state.pickle

    [task_history]
    db_connection = sqlite:////home/vagrant/luigi-task-hist.db
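Replace the paths with ones that suit your setup. The four slashes in the sqlite URL are intentional: three belong to the URL prefix and the fourth starts the absolute file path. You will also need to install SQLAlchemy through pip (pip install sqlalchemy) to make it work. Restart luigid, and you should have access to the /history URL, where tasks that run from then on will be displayed.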
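Before restarting luigid, it can help to sanity-check the configuration. The sketch below is an illustration only: it parses the file contents with Python's standard configparser, not with Luigi's own configuration loader, and the helper names are hypothetical. It checks just the two settings the history depends on, plus whether SQLAlchemy can be imported.

```python
import configparser
import importlib.util

# The example configuration from this article; replace it with the
# contents of your own luigi.cfg.
EXAMPLE_CFG = """
[scheduler]
record_task_history = True
state_path = /home/vagrant/luigi-state.pickle

[task_history]
db_connection = sqlite:////home/vagrant/luigi-task-hist.db
"""


def check_history_config(cfg_text):
    """Return a list of problems found in the given config text (empty if none)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    problems = []

    try:
        enabled = parser.getboolean(
            "scheduler", "record_task_history", fallback=False
        )
    except ValueError:
        # The value is present but not a recognizable boolean.
        enabled = False
    if not enabled:
        problems.append("[scheduler] record_task_history is not set to True")

    if not parser.get("task_history", "db_connection", fallback="").strip():
        problems.append("[task_history] db_connection is missing or empty")

    return problems


def sqlalchemy_available():
    """Report whether the sqlalchemy package can be found, without importing it."""
    return importlib.util.find_spec("sqlalchemy") is not None


if __name__ == "__main__":
    for problem in check_history_config(EXAMPLE_CFG):
        print("config problem:", problem)
    if not sqlalchemy_available():
        print("sqlalchemy is not installed; try: pip install sqlalchemy")
```

To check a real file, read it first, for example with open("luigi.cfg").read(), and pass the text to check_history_config.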
You can read more about enabling the history in the official documentation. If you just want to see a functional, pre-configured Luigi setup in action for experimentation, check out the “Try Luigi with Vagrant” article and follow the steps to give it a test drive.