Decision Forests: Taking Our Machine Learning to the Next Level

Alex Paino, July 9, 2015

We’re adding random decision forests to our machine learning solution, so get ready for an 18% improvement in Sift Score accuracy!

This week, we launched an entirely new machine learning model called random decision forests, which will work alongside our existing models. Why? For an additional layer of prediction power, of course. With Sift Science’s decision forests in place, we expect that, on average, our customers will see a significant increase in fraud detection accuracy. This added model makes our online and large-scale learning capabilities even more robust! 

Anyone working on real-world machine learning systems can tell you that building an accurate model requires much more work than just slapping an off-the-shelf classifier on top of your data and calling it a day. Here at Sift Science, we’re constantly working to improve the accuracy of our machine learning models. This task involves experimenting with everything from new feature extraction techniques to new modeling strategies, and sometimes even trying new online learning strategies.

Every few weeks, we take everything we’ve learned from offline experiments and roll it into brand-new models for all of our customers. Of course, this is in addition to our online learning, which adjusts in real time with every new piece of data we receive. Every once in a while, we make a change so large that we feel our customers should know about it.

With an update of this magnitude, it’s possible that our customers will see changes to their score distributions. In the coming weeks, we encourage you to reassess your Sift Score thresholds. If you currently use Sift Science, you should already be evaluating your score thresholds regularly, as small score shifts are common even outside of core modeling updates. That’s where the power of online and large-scale learning comes into play. We’ve been very careful to keep your score distributions close to the ones you’re used to, but it’s possible that some may behave differently with live data.
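If you'd like a starting point for that reassessment, here is a minimal sketch of one way to compare candidate thresholds. It is an illustration only, not a Sift Science tool: the function name `threshold_report`, the sample data, and the rule that an order is flagged when its score is at or above the threshold are all assumptions. It takes recent scores alongside confirmed outcomes and reports, for each candidate threshold, how many flagged orders were actually fraud (precision) and how much of the fraud was caught (recall).

```python
from typing import List, Tuple


def threshold_report(
    scores: List[float],
    is_fraud: List[bool],
    thresholds: List[float],
) -> List[Tuple[float, float, float]]:
    """Return (threshold, precision, recall) for each candidate threshold.

    Assumption: an order is flagged when its score is >= the threshold.
    """
    total_fraud = sum(is_fraud)
    report = []
    for t in thresholds:
        # Outcomes of every order that this threshold would flag.
        flagged = [fraud for score, fraud in zip(scores, is_fraud) if score >= t]
        true_pos = sum(flagged)
        precision = true_pos / len(flagged) if flagged else 0.0
        recall = true_pos / total_fraud if total_fraud else 0.0
        report.append((t, precision, recall))
    return report


# Hypothetical sample: recent scores paired with confirmed outcomes.
scores = [12, 35, 48, 62, 71, 80, 88, 93, 97, 55]
is_fraud = [False, False, False, False, True, False, True, True, True, True]

for t, p, r in threshold_report(scores, is_fraud, [50, 70, 90]):
    print(f"threshold={t:>3}  precision={p:.2f}  recall={r:.2f}")
```

Running the same report on scores from before and after a model update shows whether your current threshold still gives the trade-off you want. A higher threshold usually raises precision at the cost of recall, so the right choice depends on how costly a missed fraud case is compared with a blocked good customer.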

As always, if you see any issues or have questions about how best to use Sift Science, please tell us by filling out a support form.

Thanks, and happy fraud fighting!

 
