A Luigi Task Which Uses Multiple Outputs of Other Tasks
If you want a single Luigi Task to consume data from several dependencies, return a list of Tasks from its requires() method. Each Task can take its own parameters, and the outputs are then available through self.input() as a list in the same order, so you can process them one after the other:
def requires(self):
    return [ATask(), AnotherTask(), OneMore(with_a_parameter)]
An alternative is to yield the required tasks, which has the same effect with a slightly different syntax. Yields are also handy when the set of required tasks is computed dynamically, for example in a loop:
def requires(self):
    yield ATask()
    yield AnotherTask()
    yield OneMore(with_a_parameter)
Make sure that you are able to process the outputs of all tasks involved. In the best case they are similarly structured. If you don't want to use the data from the other tasks, but only want them to have finished beforehand, you still list them in requires() and simply don't read self.input() for them.