Full Spectrum: Credit and Banking
Who am I?
My name is Hussain Sultan.I am a partner at Full SpectrumAnalytics. I create personalizedanalytics software within banks for the sake of equitable and profitabledecision making.
What problem am I trying to solve?
Lending businesses create and manage valuations and cashflow models that outputthe profitability expectations for customer segments. These models are complexbecause they form a network of equations that need to be scored efficiently andkeep track of inputs/outputs at scale.
How Dask helps
Dask is instrumental in my work for creating efficient cashflow modelmanagement systems and general data science enablement on data lakes.
Dask provides a way to construct the dependencies of cashflow equations as aDAG (using the dask.delayedinterface) and provides a good developer experience for buildingscoring/gamification/model tracking applications.
Why I chose Dask originally
I chose dask for three reasons:
- It was lightweight
- The granular task scheduling approach to scaling both dataframes andarbitrary computations fit my use case well
- It is easy to scale my team with Python programmers
Some of the pain points of using Dask in our problem
It’s hard to get organization buy-in to adopt an open-source technology withoutvendored support and enterprise SLAs.
In a recent project, we had to integrate with the Orc data format that turnedout to be more expensive than I originally anticipated (compounded byenterprise hadoop set-up and encryption requirements). These changes havesince been upstreamed though, and so things are easier now.
Some of the technology that we use around Dask
We deployed on generic internal server with Jenkins scheduling a Jupyternotebook to execute. We built everything out using our internal analyticsplatform. We didn’t have to worry about security because everything was behinda corporate firewall.