vsupalov

A Luigi Task Which Uses Multiple Outputs of Other Tasks

January 9, 2016

If you want a single Luigi Task to consume data from multiple dependencies, return a list of Tasks from the requires() method. You can pass parameters to each of them. Inside the consuming Task, self.input() then gives you their outputs as a list in the same order, so you can process them one after the other:

def requires(self):
    return [
        ATask(),
        AnotherTask(),
        OneMore(with_a_parameter)
    ]

Alternatively, you can yield the required tasks. The effect is the same, only the syntax differs slightly. Yielding is also handy when the set of required tasks has to be built dynamically, for example in a loop or depending on a parameter.

def requires(self):
    yield ATask()
    yield AnotherTask()
    yield OneMore(with_a_parameter)

Make sure that you are able to process the outputs of all tasks involved. In the best case they are structured similarly. If you don't need the data from the other tasks and only want them to have finished beforehand, you still declare them in requires() in the same way and simply don't read their outputs from self.input().
