“We are going 100% online” are words that no longer surprise us. The same thing happened to the Data Innovation Summit conference.
While the topics of the sessions were as interesting as ever, it’s fair to say the overall conference experience was lacking. Interacting with speakers and exhibitors is something most of us perhaps take for granted at conferences, and it was jarring in its absence. No doubt the human element is a big part of what makes a conference what it is.
It was not all doom and gloom, though. The newfound digitisation has its perks, such as being able to comb through multiple tracks while sitting in the same room. The key ingredient, we feel, was short (~20 min) pre-recorded sessions released at their slot times. This also made it easy to switch to something more interesting if you realised a particular session wasn’t for you. Overall, Hyperight did a great job digitalising a conference that was supposed to be IRL.
In this blog post we share our three best takeaways from this all-digital conference.
- Never again use the word dashboard
- dataOps – focus on software engineering
- AI – slope of enlightenment
Death to the “dashboard”
Ever since entering the Data Science / Analytics scene, some of us have dreaded any work related to so-called “dashboards”. Nick Desbarats boiled the problem with dashboards down to two main issues:
- reaching too many roles, and
- reporting for multiple purposes.
Instead of building a catch-all dashboard, build a specific display catering to a specific purpose. Among the 13 “displays” Nick Desbarats defines, the following two are examples that often overlap in the real world:
- Monitoring Display: Metrics that users study daily. This display should help the user quickly identify problem areas and positive/negative trends. A surprisingly effective example was an entire display filled with metrics, with red/green dots marking changes of interest.
- Performance Display: The difference from the Monitoring display is that here we should enable users to go beyond finding pain points, towards finding potential solutions to problems. Typically filled with fewer metrics and more strategic KPIs.
If you want to know more about the displays Nick mentions, please check him out!
When to storytell
Another common example of reporting for multiple purposes is using storytelling in your display when you shouldn’t. Storytelling has two main focuses: persuading the viewer and educating the viewer. Both are very difficult if your narrative changes over time, so static displays are better suited for storytelling. The display types mentioned above are dynamic displays, and storytelling shouldn’t be used in them.
dataOps – focus on software engineering
There are many factors that prevent companies from actually getting real value from their data efforts. Many of these failed attempts boil down to failed dataOps (data operations).
The word is inspired by devOps (development operations), which connects software development, QA and operations. Throw in data, which by itself is as complicated as the three combined, and you get dataOps. Many speakers at the conference shared their journey of adopting dataOps in their organisation.
The key is a focus on good software engineering. To succeed with dataOps, you need to incorporate the practices below.
- devOps. Version control, CI/CD, and automated, continuous testing. Make it easy to create and deploy to new environments.
- Data Orchestration. Orchestration is the core of dataOps.
- Quality Management. Continuously monitoring, measuring and improving data.
- Agile. Self-organising teams, shorter sprints and engaging the business.
- Democratising data. Make data easily available to all teams and people. Allow each their tool of choice.
- Data Governance. Version control of data, data lineage and compliance.
- Lean. Simplify and automate processes.
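To make the devOps and Quality Management points concrete, here is a minimal sketch of an automated data-quality gate of the kind that could run as a continuous test in a CI pipeline. All names and thresholds are our own illustrative assumptions, not taken from any specific tool or talk:

```python
# Sketch of an automated data-quality check for a CI pipeline.
# Field names and the null-ratio threshold are illustrative assumptions.

def check_quality(rows, required_fields, max_null_ratio=0.05):
    """Fail fast if the dataset is empty or required fields are too sparse."""
    if not rows:
        raise ValueError("dataset is empty")
    # Count missing values per required field.
    null_counts = {field: 0 for field in required_fields}
    for row in rows:
        for field in required_fields:
            if row.get(field) is None:
                null_counts[field] += 1
    # Flag any field whose null ratio exceeds the threshold.
    failures = [
        field for field, count in null_counts.items()
        if count / len(rows) > max_null_ratio
    ]
    if failures:
        raise ValueError(f"fields exceed null threshold: {failures}")
    return True

# Example: run automatically before deploying a pipeline, instead of
# eyeballing one-off spreadsheets.
rows = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},
    {"id": 3, "amount": 12.5},
]
check_quality(rows, ["id"])  # passes: no nulls in "id"
```

The point is not the specific check but that it is versioned, automated, and runs on every change, rather than living in someone's notebook.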
Corona & failed dataOps
Lars Albertsson, founder of Scling, gave a great session on how all covid-19 analysts have failed in their spread analyses, for various reasons. Even the Swedish “Folkhälsomyndigheten” is to blame for presenting somewhat incorrect conclusions about the spread trends in Sweden.
Lars claims the authority is itself aware of this, but that it doesn’t have the dataOps in place to do it properly. Much of the epidemiologists’, analysts’ and statisticians’ work is manual: notebooks and one-off Excel sheets with the latest data. They could have done a much better job with more software engineers on the team to help them set up a proper data workflow.
AI – slope of enlightenment
A lot of talks focused on (or referred to) the current approach to data and how it relates to AI/ML. While the hype for ML (and ML mislabelled as AI) has ballooned over, say, the last five years, it seems we are reaching the slope of enlightenment that follows the peak of the hype curve. We were surprised by how few talks referred to innovations and flashy features for designing ML solutions. Instead we found plenty of simple case studies, words of caution, and realistic, down-to-earth advice. When is ML appropriate, and when is it not?
ML as a normalised concept
Rather than a buzzword to be marketed and flaunted, most developers and organisations have accepted that ML is just another tool in the toolbox. This is perhaps due to the lowered barrier to entry; for instance, most cloud providers offer near plug-and-play solutions nowadays. The hype has grown over the last several years, and perhaps many organisations are now standing with their report cards in hand: what has the AI investment we started in 2017 accomplished, and what value has it actually generated?
As ironic as it might seem at a conference focused on data and AI, more than a few sessions took a somewhat bearish stance on AI, choosing to be sceptical, stepping back, and looking at it from a more complete, business-oriented view. It would be fair to say that the AI/ML industry (for lack of a better term) is transitioning from hype into maturity. This is probably where most organisations will thrive, both as producers and consumers: leveraging ML where it makes sense, to produce the most value. The case studies followed this theme well, with success stories mostly being small solutions used to solve a specific problem. Failures, on the other hand, were instances where developers and teams didn’t consider simpler solutions, or moved too fast without considering the organisation’s needs or capabilities.
If anything replaced AI as the hyped buzzword, it was certainly Data Readiness (fine, it’s two words, but you get the point). It was apparent in how most speakers talked about data: its value to organisations was treated not as something to argue for, but as something obvious and natural. For tech companies to thrive, they need to leverage the data they have and/or generate to provide value. More and more businesses are becoming software businesses, while more and more software businesses are becoming data businesses. An aspiring data-ready organisation should start by increasing data literacy and data maturity. Tying this back to AI: making available data more legible and relevant is, after all, the first (and most time-consuming) step towards a useful ML solution.