vsupalov

A Luigi Task Which Uses Multiple Outputs of Other Tasks

January 09, 2016

If you want a single Luigi Task to consume data from several dependencies, return a list of Tasks from its requires() method. You can pass parameters to each of them. Inside the consuming Task, self.input() then returns their outputs as a list in the same order, so you can process them one after the other:

def requires(self):
    return [
        ATask(),
        AnotherTask(),
        OneMore(with_a_parameter)
    ]

Alternatively, you can yield the required tasks, which has the same effect with a slightly different syntax. Yielding is also handy when the set of required tasks is computed dynamically, for example in a loop.

def requires(self):
    yield ATask()
    yield AnotherTask()
    yield OneMore(with_a_parameter)

Make sure the consuming task is able to process the outputs of all tasks involved; this is easiest when they share a similar structure. If you don’t want to use the data from the other tasks, but only want them to have finished beforehand, you still list them in requires() and simply don’t read self.input(). For a task that does nothing except group other tasks, Luigi also provides luigi.WrapperTask.
