Key Technologies / Libraries:
- Parallel computing (Dask)
- Scalable data visualisation (Datashader)
- Geographic data parsing / rendering (Geopandas / Spatialpandas)
- Web-hosted interactive dashboard (Plotly Dash)
Working with big data can be challenging for many reasons. It's even more difficult when the data at hand is unusual, like dealing with geospatial data.
In this project, we used Dask to distribute compute requirements across multiple nodes without leaving the familiar scipy patterns, and we use Datashader to visualise huge datasets almost instantly, as details are only resolved as necessary according to the user's desired zoom levels.
Plotly Dash provides an excellent web dashboard engine, powered by Flask and React, where the user's inputs are translated to the Dask cluster at the back end and translated to insights almost instantly.
While geospatial data can be difficult to work with on vanilla pandas, geopandas and spatialpandas provide extremely capable extensions so that geospatial data can be appropriately parsed and projected onto maps. This particular app example makes use of open-source map layers as well.