High school graduation and continued education are some of the top factors in determining adult success. Corona Norco Unified School District in Southern California was facing challenges familiar to school districts across the country. How could the district successfully educate and prepare children for post-graduation life while also intervening and providing extra guidance to those at risk of dropping out? Dropping out of high school brings an increased risk of poverty, unemployment, and even incarceration. Despite the remarkable work of educators, as a nation, we still have about 16% of students slipping through the cracks and not receiving a high school diploma. It’s not always easy for educators to identify every child who is at risk. Fortunately, by partnering with Corona Norco, we were able to leverage the predictive abilities of machine learning to build digital, low-cost tools that improved visibility into both student and school performance.
We recently wrapped up a project, completed in 6 weeks, with Corona Norco Unified School District in Southern California.
The Corona Norco Unified School District wanted to update its data governance and integration strategy to better support reporting. While the district had a wealth of student data from years of record keeping, it was tied up in disjointed SQL databases, making it difficult to produce unified reports. Administrators couldn’t make data-driven decisions in the long or short term, and they had poor visibility into their records. With our expertise in data warehousing, the Logic20/20 team helped them clean up the data and create a new data warehouse that was a consistent, unified source of truth. The data warehouse was organized and optimized to support the primary use cases, the reports the district already used, and the reports it anticipated using in the future.
Once the data warehouse was in place, the school district asked us to brainstorm how their newly optimized data could help solve one of their biggest concerns: preventing high school dropouts. Was there a way their data could pinpoint which students were at risk before they entered high school? As technologists, these are the types of complex challenges we live to solve, and we knew the right solution: predictive analytics based on a trained machine learning model. Machine learning is the method of training a machine using algorithms and existing data to create a model, and then using that model to predict future results. Corona Norco had all of the right ingredients for machine learning to be successful: large amounts of clean data and a partner with expertise in predictive modeling and business intelligence (BI) tools (that’s us!).
The primary goal was for teachers and other members of the student intervention teams to have insight into at-risk students, so they could intervene with each child early and continue to give them extra support throughout their high school experience. The first step was to identify the risk factors for high school dropout and determine which data sets would be used. Logic20/20 led this process by researching determining factors, and Corona Norco contributed ideas about other contributing factors unique to their region. For this project, we settled on these data sets: achievement, attendance, activity, demographic, and behavior. Once the data sets were identified, we trained the machine learning algorithm on the historical data and created the model. Within minutes, the model had processed the data and started producing preliminary results. During the 1.5-week training and testing process, the model was generated and refined over and over until its results reached 98% accuracy. Once the model was tuned for accuracy using the historical data, it was ready to process current data and predict which students were at risk of dropping out.
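To make the train-and-test loop described above more concrete, here is a minimal sketch in plain Python. It is not the model built for the district: the logistic regression, the synthetic student records, the rule that generates the labels, and every function name are assumptions chosen purely for illustration. Only the five feature groups come from the project.

```python
import math
import random

# Feature groups named in the article; one made-up score per group.
FEATURES = ["achievement", "attendance", "activity", "demographic", "behavior"]


def make_synthetic_students(n, seed=0):
    """Generate made-up (features, dropped_out) rows. NOT district data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = [rng.random() for _ in FEATURES]  # each score scaled to [0, 1]
        # Invented rule: low achievement and attendance raise dropout odds.
        logit = 3.0 - 4.0 * x[0] - 4.0 * x[1] + rng.gauss(0, 0.5)
        rows.append((x, 1 if logit > 0 else 0))
    return rows


def predict_proba(weights, bias, x):
    """Logistic model: estimated probability that a student drops out."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, epochs=200, lr=0.5):
    """Fit weights with plain stochastic gradient descent."""
    weights, bias = [0.0] * len(FEATURES), 0.0
    for _ in range(epochs):
        for x, y in rows:
            err = predict_proba(weights, bias, x) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def accuracy(weights, bias, rows):
    """Share of rows where the 0.5-threshold prediction matches the label."""
    hits = sum((predict_proba(weights, bias, x) >= 0.5) == bool(y)
               for x, y in rows)
    return hits / len(rows)


# Train on "historical" rows, then check accuracy on rows held out of training.
students = make_synthetic_students(500)
split = int(0.8 * len(students))
train_rows, test_rows = students[:split], students[split:]
weights, bias = train(train_rows)
test_accuracy = accuracy(weights, bias, test_rows)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

Measuring accuracy on rows the model never saw during training is what makes the figure meaningful; refining the model then means repeating this loop with different features or settings.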
The next step was creating a set of dashboards that displayed the results of the predictive analysis and highlighted the at-risk students and program success. The dashboard shows an assessment of a student’s overall risk and explains why. It provides a component score for each of the risk factors, showing the trend of each factor for that student, and provides details such as attendance rate, detentions, and GPA. Our dashboards allow educators to quickly direct their limited financial and staffing resources to the students who need them most, and to choose the types of interventions that will be most impactful for each student. When designing and developing the dashboards, we worked closely with district administration and staff to make sure the visualizations were user-friendly, requirements-focused, interactive, and filterable. The dashboards provide analytics to both predict student risk and assess school performance.
The goal of the student dashboard was for educators to drill down and see specific student data and predictions about each student’s risk level. It allows users to filter by school, grade, and several different risk factors. Users can see how many students are at each of the five risk levels, students ranked by estimated drop-out risk, and individual student risk summaries. This dashboard tells teachers, principals, and counselors which specific students to target for extra support. The predictive model does the tedious work of combing through the data, so that the educators can focus on what they do best: working one-on-one with students.
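The two views mentioned above, counts per risk level and students ranked by estimated risk, can be sketched in a few lines of Python. The level names, the equal-width cut points, the student IDs, and the scores below are all hypothetical; the article only states that there are five risk levels.

```python
from collections import Counter

# Hypothetical labels; the article only says there are five levels.
RISK_LEVELS = ["very low", "low", "moderate", "high", "very high"]


def risk_level(probability):
    """Map a dropout probability in [0, 1] to one of five equal-width bands."""
    index = min(int(probability * len(RISK_LEVELS)), len(RISK_LEVELS) - 1)
    return RISK_LEVELS[index]


# Made-up example scores keyed by made-up student IDs.
scores = {"S001": 0.07, "S002": 0.91, "S003": 0.45, "S004": 0.62, "S005": 0.88}

# Highest estimated risk first, as a counselor might want to read the list.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)

# How many students fall into each level.
counts = Counter(risk_level(p) for p in scores.values())

for student_id, p in ranked:
    print(f"{student_id}  {p:.2f}  {risk_level(p)}")
```

Equal-width bands are the simplest choice; a district could just as well set the cut points from its own intervention capacity.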
The student dashboard also contains visualizations of the model’s performance. These model diagnostics show the effectiveness of the model and reveal its structure, risk levels, and contributing factors. They also show how non-drop-outs and drop-outs were scored, along with an average drop-out trajectory. Given this information, the school district may choose to fine-tune the model over time.
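One simple form such a diagnostic could take is a comparison of how the model scored students who did and did not drop out. The sketch below is only an assumption about what might sit behind a view like this: the scored history, the 0.5 flagging threshold, and the choice of precision and recall as summary numbers are all illustrative.

```python
from statistics import mean

# Made-up (predicted_risk, actually_dropped_out) pairs for illustration only.
history = [(0.92, True), (0.81, True), (0.35, True), (0.40, False),
           (0.12, False), (0.08, False), (0.66, False), (0.05, False)]

THRESHOLD = 0.5  # hypothetical cut-off for flagging a student as at risk

dropout_scores = [p for p, dropped in history if dropped]
graduate_scores = [p for p, dropped in history if not dropped]

# Confusion counts: flagged vs. not flagged, within each actual outcome.
true_pos = sum(p >= THRESHOLD for p in dropout_scores)
false_neg = len(dropout_scores) - true_pos
false_pos = sum(p >= THRESHOLD for p in graduate_scores)
true_neg = len(graduate_scores) - false_pos

recall = true_pos / (true_pos + false_neg)     # share of drop-outs flagged
precision = true_pos / (true_pos + false_pos)  # share of flags that were right

print(f"mean score, drop-outs:     {mean(dropout_scores):.2f}")
print(f"mean score, non-drop-outs: {mean(graduate_scores):.2f}")
print(f"recall {recall:.2f}, precision {precision:.2f}")
```

Because drop-outs are a minority of students, looking at the two groups separately is more informative than a single overall accuracy figure.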
The high school preparedness analysis dashboard was developed to give insight into program performance of intermediate schools and to determine whether they are meeting their goal of preparing students for high school. This dashboard provides views of GPA comparisons between schools, per-subject performance, intra-district comparison, drop-out rate trending, and estimation of GPA impact (using a mixed-effects model). This dashboard gives educators clear metrics on the effectiveness of the schools and their programs, so that they can make improvements when needed.
Creating and implementing this solution for the Corona Norco school district took six weeks in total. In that short amount of time, Logic20/20 built and tuned the predictive model and created a set of dashboards. Advances in technique and technology allowed us to create a superior solution in one-tenth of the time invested by other districts. In addition, this solution was designed to scale and to integrate easily. Specifically, the model generates revised predictions for each student daily, so it is always up to date. The model itself can be retrained on current data at the touch of a button, and previous models are archived for reference. The Corona Norco school district was pleased with the solution we built. “We are extremely happy with our decision to work with Logic 20/20! Their team is highly knowledgeable in the data warehouse life cycle, experienced in working with K-12 districts, always fun to work with and extremely efficient in meeting deliverables,” said Jay O’Neill, Director at Corona Norco Unified School District.
Logic20/20 appreciated the chance to work on a project like this where we could partner with education experts and leverage our expertise to make a positive impact in the community. The ROI in these types of projects can’t be measured in dollars; the return is giving a child a chance to grow into an educated and productive adult, because an educator was able to provide them with the tools and guidance they needed to succeed.