Now that predictive analytics are hotter than ever in higher education, it's probably a good idea to review a few fundamentals for those who are just starting to sip from the cup of possibilities.
WHY PREDICTIVES? Predictive analytics have been used in a wide variety of settings, including higher education, to manage finances, inventory, operations, assets and resources. Increasingly, higher ed institutions are turning to business intelligence for enrollment management and student recruitment. The next great wave for predictive analytics adoption in higher education is focusing on institutional performance outcomes and individualized student success.
WHY COLLEGES NEED TO CARE: The more that colleges and universities are held accountable for achieving specific performance outcomes (e.g., improved student retention, better completion rates), the more predictive analytics will be touted as a solution for anticipating the risks likely to be encountered while trying to achieve those outcomes.
SIMPLY KNOWING WHO IS AT RISK ISN'T ENOUGH. Knowing how to mitigate risks and how different students can be better served with targeted interventions and support makes predictions actionable. Predictions without action don't really matter very much to anyone.
BEWARE THE BRIGHT AND SHINY: It's going to be a little bit like the early days of the first dot-com boom out there, with lots of solutions available and even more promises about all the great things that those interventions and solutions can do (!!!). How will you know who to believe? How will you know what's right for you? You owe it to yourself to be informed.
THERE IS NO SINGLE "RIGHT WAY" TO CONDUCT PREDICTIVE ANALYSES. There are many techniques used to conduct predictive analyses. Typically one chooses the technique or techniques likely to yield results for the kinds of predictions one wants to make.
For example, if you are interested in predicting a discrete attribute, you might use techniques such as Logistic Regression, Decision Trees (CHAID, CART, Random Forest), Naïve Bayes, Support Vector Machine, Survival Analysis or Neural Networks. If you are more interested in predicting a continuous attribute, you'd look toward techniques such as Multiple Linear Regression, Time Series or Decision Trees (CHAID, CART, Random Forest). If you are most interested in finding common groups, you might consider Hierarchical Clustering, k-nearest neighbors, Neural Networks or Decision Trees (CHAID, CART, Random Forest).
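To make that mapping concrete, here is a minimal Python sketch of choosing a technique family from the kind of thing you want to predict. It is an illustration only, not a recommended tool: the names `TECHNIQUES`, `target_kind` and `suggest_techniques` are hypothetical, the type-based check is a rough rule of thumb, and reading "Hierarchical k-nearest neighbors" as two separate items is an assumption.

```python
from typing import List, Optional, Sequence

# Technique families as named in the text. The dictionary keys are
# hypothetical labels chosen for this sketch.
TECHNIQUES = {
    "discrete": [
        "Logistic Regression", "Decision Trees", "Naive Bayes",
        "Support Vector Machine", "Survival Analysis", "Neural Networks",
    ],
    "continuous": [
        "Multiple Linear Regression", "Time Series", "Decision Trees",
    ],
    # Assumption: the text's "Hierarchical k-nearest neighbors" is read
    # here as two items.
    "groups": [
        "Hierarchical Clustering", "k-nearest neighbors",
        "Neural Networks", "Decision Trees",
    ],
}


def target_kind(values: Optional[Sequence[object]]) -> str:
    """Rough heuristic for the kind of prediction being asked for.

    No target column at all  -> looking for common groups.
    Every value is a float   -> continuous attribute (e.g. a GPA).
    Anything else            -> discrete attribute (e.g. retained or not).
    """
    if values is None:
        return "groups"
    if len(values) == 0:
        raise ValueError("need at least one example value")
    if all(isinstance(v, float) for v in values):
        return "continuous"
    return "discrete"


def suggest_techniques(values: Optional[Sequence[object]] = None) -> List[str]:
    """Return the candidate techniques the text lists for this kind of target."""
    return TECHNIQUES[target_kind(values)]


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    print(suggest_techniques([True, False, True]))  # retained / not retained
    print(suggest_techniques([3.2, 2.8, 3.9]))      # a grade point average
    print(suggest_techniques())                     # no target: find groups
```

The lookup only narrows the field. Which technique actually fits still depends on the question you are asking and the data you have.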
Selecting the right techniques for conducting predictive analyses has a lot to do with knowing the questions that the predictions will help answer, or the performance problems that are likely to be solved. Speaking of which...
START WITH A PROBLEM TO SOLVE: Sorry to burst your bubble, but predictive analytics don't work quite like a Magic 8-Ball. Not even Hadoop technologies do that. Answers simply do not emerge fully formed from the mists of your analytic techniques. It's easier to focus on solving a problem (e.g., What causes students to drop out? Are these causes common in all settings?) or finding a new opportunity (e.g., What motivates students to complete courses faster?). Your problem statements and queries will help you focus on finding data sources and selecting techniques for analyses that are likely to reveal the patterns you seek.