In the first two posts of this three-part series we discussed how we ensure correctness when conducting and analyzing machine […]
Author Archives for Alex Paino
In the first post on ML Experiments at Sift Science, we described how we minimize bias in our offline experimentation […]
At Sift Science we use machine learning to prevent various forms of abuse on the internet. To do this, we […]
At Sift Science, we use a variety of popular machine learning models to detect fraud for our customers. However, until recently we relied exclusively on a combination of linear models and sophisticated feature engineering. As we were reaching the limits of this setup, we began experimenting with our first non-linear model: random decision forests. Several months and over 100 experiments later, we were thrilled to announce the addition of random decision forests to our ensemble of models used to fight fraud. Along the way we learned quite a few things about designing a random decision forest classifier for the fraud detection use case. Here we detail several of these learnings, including how we handled sparse and missing features, useful model visualization techniques, heuristics we used to improve class separation, specialized feature engineering, and how we combined our random decision forest with our existing models. All told, these learnings resulted in an 18% reduction in error for our customers.
We're adding random decision forests to our machine learning solution, so get ready for an 18% improvement in Sift Score accuracy! This week, we launched an entirely new machine learning model called random decision forests, which will work alongside our existing models. Why? For an additional layer of prediction power, of course. With Sift Science’s decision forests in place, we expect that, on average, our customers will see a significant increase in fraud detection accuracy. This added model makes our online and large-scale learning capabilities even more robust!