The beginning of a data science project is too often like watching a slow-paced action movie. Excitement slowly turns into wondering: when will all the action start? After a while you still don’t have access to the data, and all you wish for is someone who understands what the fields in the data mean and how they are formed in the business processes. And what was the problem we were trying to solve again... building AI, right?
Imagine a data science project where you get a predictive model, a user interface and an API in under three weeks of working hours! I recently worked on a case where we proved that this is possible. However, it required that certain elements were in place. These are the things that I considered critical success factors in our project:
- A data-solvable problem was defined and clear. Make sure you understand whether you have a regression or a classification problem on your hands. Or is it maybe a reinforcement learning or a clustering problem – or are you truly trying to develop general AI?
- The data was not sensitive. There was no need to start building environments only to get access to the data. Just use your own computer and set up your working environment as you see fit. In case you have sensitive data, but you also have an environment for analyzing it already set up, then you’re good to go as well!
- The data export was ready to be shared. For example, a simple CSV file is a good way to quickly get an idea of what the data in a relational database is about and to start experimenting before you have access to the actual data source. You still need someone with data access to get the export to you and update it as needed. Also, make sure there is enough data to make valid predictions.
- The customer was available and committed to the project, bringing domain knowledge and business understanding to the table. Progress was communicated actively, the meetings and demos were well organized, and the project was steered according to the feedback.
- A multi-disciplinary team was available, consisting of a designer, a data scientist and a front-end developer, so a real-looking product could be built straight away. Leaving out the design and user interface development work would, naturally, have made the project even shorter. However, this way we were able to test not only the prediction accuracy of the model but also the usability of the tool with its real future users.
Here is our team: front-end developer Anssi, me and designer Hanna.
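The data-related checks in the list above (what kind of problem is this, and is there enough data in the export?) can be roughed out in a few lines before any real modelling starts. The following is only a minimal sketch: the function name `summarize_export`, the `max_classes` threshold and the example column names are hypothetical, and the regression-versus-classification guess is a crude heuristic, not something we used as-is in the project.

```python
import csv
import io


def summarize_export(csv_text, target_column, max_classes=10):
    """Return basic counts and a rough problem-type guess for one target column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Keep only rows where the target column actually has a value.
    values = [v for v in (row.get(target_column) for row in rows) if v]
    distinct = set(values)

    def is_number(text):
        try:
            float(text)
            return True
        except ValueError:
            return False

    numeric = bool(values) and all(is_number(v) for v in values)
    # Heuristic only: a numeric target with many distinct values hints at
    # regression; anything else is treated as classification.
    if numeric and len(distinct) > max_classes:
        problem = "regression"
    else:
        problem = "classification"

    return {
        "rows": len(rows),
        "missing_target": len(rows) - len(values),
        "distinct_targets": len(distinct),
        "problem_guess": problem,
    }


# Hypothetical example export with a yes/no target column.
sample = "id,churned\n1,yes\n2,no\n3,yes\n4,\n"
print(summarize_export(sample, "churned"))
```

A summary like this does not replace talking to the people who know the data, but it gives you concrete numbers (row count, missing targets) to bring to the first meeting.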
This project was, of course, only the first step – a proof of concept. After it, a decision can be made on whether the defined problem can be solved with the available data. Even if all of these criteria are met, the conclusion might be that, for example, the quality of the data isn’t adequate and there’s no point in moving forward before the quality is improved.
Our case project included a testing period where we compared the tool’s predictions to human predictions and concluded that the project indeed was a success! After this first phase, now that we know the problem is solvable and the solution is useful, we can proceed to the next phase where we worry about database access, proper back end, logs, alerts, automatic model updates etc.
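A comparison like the one in our testing period can be expressed very simply once the real outcomes are known. The sketch below assumes a classification-style task and uses plain accuracy as the metric; both are assumptions made for illustration, and the function names and example labels are hypothetical.

```python
def accuracy(predictions, actuals):
    """Share of predictions that match the actual outcomes."""
    if not actuals or len(predictions) != len(actuals):
        raise ValueError("predictions and actuals must be non-empty and of equal length")
    hits = sum(p == a for p, a in zip(predictions, actuals))
    return hits / len(actuals)


def compare_to_humans(model_predictions, human_predictions, actuals):
    """Score the tool and the human predictions against the same outcomes."""
    return {
        "model": accuracy(model_predictions, actuals),
        "human": accuracy(human_predictions, actuals),
    }


# Hypothetical labels, only to show the call.
actual_outcomes = ["a", "b", "a", "b"]
scores = compare_to_humans(
    model_predictions=["a", "b", "a", "a"],
    human_predictions=["a", "a", "a", "a"],
    actuals=actual_outcomes,
)
print(scores)
```

For a regression problem you would swap accuracy for an error measure such as mean absolute error, but the idea stays the same: score both sets of predictions against the same outcomes.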
If you realize that your project doesn’t meet all five of these success factors, do not despair. Just be prepared to spend some time finding the ways of working that suit your project – but don’t expect quick wins!