This post and its accompanying notebook come from a lightning talk presented at Python Users Berlin (13.10.2016).

  • dask is a flexible parallel computing library for analytic computing
  • dask.bag provides parallel data analysis abstractions (map, filter, group and count) over collections of Python objects, such as records read from files on disk
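
To give a rough feel for the dask.bag abstraction mentioned above, here is a minimal sketch. The records are hypothetical in-memory data standing in for lines read from files on disk, and the synchronous scheduler is chosen only to keep the example simple and deterministic; it is not taken from the talk's notebook.

```python
import dask.bag as db

# Hypothetical records; in practice these would typically come from files on disk.
records = [
    {"name": "alice", "lang": "python"},
    {"name": "bob", "lang": "r"},
    {"name": "carol", "lang": "python"},
    {"name": "dave", "lang": "julia"},
]

# Split the records into two partitions that can be processed independently.
bag = db.from_sequence(records, npartitions=2)

# Build a lazy pipeline: extract the "lang" field and count occurrences.
# Nothing is computed until .compute() is called.
lang_counts = bag.pluck("lang").frequencies()

# Run on the single-threaded synchronous scheduler (an assumption made here
# for simplicity; by default dask.bag runs work in parallel).
result = dict(lang_counts.compute(scheduler="synchronous"))
print(result)
```

For data that really lives on disk, `db.read_text` with a glob pattern followed by `.map(json.loads)` is a common starting point in place of `db.from_sequence`.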

See notebook