- Expecting Machine Learning model training to be faster than writing software
- Still needs lots of supporting software and infrastructure
- Need to ensure the system is robust, scalable, etc.
- Also needs pipelines for data collection, preparation, and training
- Push people to start with a plain software solution first
- No data collected yet
- Also need to use this data regularly, e.g. to generate reports; otherwise it is likely to go stale
- Keep humans in the loop
- For core/critical systems especially
- Curate training data, handle edge cases, review data
- Product launch focused on ML algorithm
- ML optimized for the wrong thing
- e.g. search optimized for engagement (clicks)
- Might learn to serve bad results, since these cause users to click back and try other links (more clicks)
- Is your ML improving things in the real world?
- Need to show impact to stakeholders
- Using a custom algorithm vs pre-trained
- False expectation: because pre-trained models are easy to use, building your own must be easy too
- Not retraining algorithms
- Invest in making the retraining process seamless
- Don’t design your own perception or NLP algorithm
- These seem much easier than they really are
- Existing models are optimized from decades of research
- Always use off-the-shelf models
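
The search example above (optimizing for clicks) can be made concrete with a small sketch. Everything here is a hypothetical illustration rather than a real logging schema: the `Click` record, the 10-second bounce cutoff, and the sample data are all assumptions. It shows how a ranker that serves bad results can score higher on raw clicks while doing worse on a metric that discounts quick click-backs.

```python
from dataclasses import dataclass


@dataclass
class Click:
    query_id: str
    result_rank: int
    dwell_seconds: float  # time on the result before returning to the results page


# Assumed cutoff: returning within 10 seconds counts as a "click back".
BOUNCE_THRESHOLD_SECONDS = 10.0


def clicks_per_query(clicks):
    """Raw engagement: average number of clicks per distinct query."""
    queries = {c.query_id for c in clicks}
    return len(clicks) / len(queries) if queries else 0.0


def satisfied_click_fraction(clicks, threshold=BOUNCE_THRESHOLD_SECONDS):
    """Share of clicks where the user stayed at least `threshold` seconds."""
    if not clicks:
        return 0.0
    satisfied = sum(1 for c in clicks if c.dwell_seconds >= threshold)
    return satisfied / len(clicks)


# Made-up logs: the good ranker answers each query with the first result.
good_ranker = [Click("q1", 1, 120.0), Click("q2", 1, 45.0)]

# The bad ranker forces users to click back and try other links.
bad_ranker = [
    Click("q1", 1, 3.0), Click("q1", 2, 4.0), Click("q1", 3, 90.0),
    Click("q2", 1, 2.0), Click("q2", 2, 60.0),
]

for name, log in [("good", good_ranker), ("bad", bad_ranker)]:
    print(name, clicks_per_query(log), satisfied_click_fraction(log))
```

On this toy data the bad ranker wins on clicks per query and loses on satisfied-click fraction, which is the kind of mismatch to look for before settling on an optimization target.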