Vision

To create an interactive machine learning tutorial for novices that provides hands-on learning through a practical application of image difference classification.

Guiding Questions

In our project, we were guided by a few questions that we wanted to answer:

  1. How do we help novices approach the problem more like experts?

  2. How can we scaffold novices so that they can build machine learning applications without technical expertise?

Goals

To achieve our vision and answer our guiding questions, we came up with a few actionable goals for our project:

Sources

In our tutorial mode, we provided users with a sample dataset on deforestation. Deforestation suits this application because detecting it is inherently a matter of image differences, which is what our tutorial and application focus on. By comparing satellite images of the same region from two separate years, you can tell whether deforestation happened within that time period. For instance, the images below (from 2000 and 2015) show evidence of deforestation in an area in Indonesia:

[Images: the same area in Indonesia in 2000 and in 2015]

To collect this historical satellite imagery, we used Google Earth. We took screenshots of 100-square-mile regions in Indonesia, a country notorious for the deforestation of its rainforests. Specifically, we took two screenshots of each area, one from the year 2000 and another from 2015. We did this for 25 separate areas, giving us 25 pairs of before and after images.
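As an illustration of how such before and after pairs might be organized for later processing, here is a minimal sketch. It assumes a hypothetical file naming scheme of the form "<region>_<year>.png" (for example "region07_2000.png"); the write-up does not say how the screenshots were actually named or stored, and the function name pair_screenshots is likewise invented for this example.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) naming scheme: "<region>_<year>.png".
PATTERN = re.compile(r"^(?P<region>.+)_(?P<year>\d{4})\.png$")


def pair_screenshots(filenames, before_year=2000, after_year=2015):
    """Group screenshots by region and return (region, before, after) tuples.

    Regions that lack either the before or the after image are skipped.
    """
    by_region = defaultdict(dict)
    for name in filenames:
        match = PATTERN.match(name)
        if match is None:
            continue  # ignore files that do not follow the assumed scheme
        by_region[match["region"]][int(match["year"])] = name

    pairs = []
    for region in sorted(by_region):
        years = by_region[region]
        if before_year in years and after_year in years:
            pairs.append((region, years[before_year], years[after_year]))
    return pairs
```

Calling pair_screenshots on a list of file names returns one tuple per complete pair, so a full dataset like the one described above would yield 25 tuples.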

Since we collected this data ourselves, concerns about data quality were mitigated:

Approach

Broadly, we approached this project in a few stages:

  1. Contextual Inquiry (Novices): We interviewed novices and learned how they approached an image difference classification problem on deforestation. We figured out what they did and did not know about machine learning. From this, we identified gaps in knowledge that we could focus our instruction on.

  2. Contextual Inquiry (Experts): We interviewed two experienced machine learning developers on our team to learn how they tackled the same problem. We compared and contrasted their approach with that of the novices.

  3. Instructional Design: Using our insights from the contextual inquiries, we drafted a narrative to explain the machine learning process to non-technical, novice users. This narrative would go in our "Tutorial" section.

  4. User Interface Design: We designed an intuitive UI, filled with instructional GIFs and visualizations, to explain what was happening in each stage of the machine learning process. We did this for both the "Tutorial" and "Build a Classifier" sections.

  5. Back-end: We built a back-end to do classification on a generic image difference problem. We hosted a web server on Heroku to perform feature generation, training, and testing using a user-provided dataset.

  6. Front-end: We developed a front-end tutorial and image classification platform (using only the deforestation data) that interfaced with our back-end on Heroku.
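To make the feature generation, training, and testing stages from step 5 more concrete, here is a minimal sketch in plain Python. It is not the project's actual back-end, whose features and model are not described here. It assumes images are given as equal-sized grids of grayscale values (0 to 255), uses two hand-picked difference features, and swaps in a simple nearest-centroid classifier. All function names and the threshold value are illustrative assumptions.

```python
def difference_features(before, after, threshold=30):
    """Feature generation: summarize how much a pair of images differs.

    Returns [normalized mean absolute difference, fraction of pixels whose
    difference exceeds `threshold`]. Both values lie in [0, 1].
    """
    diffs = [
        abs(a - b)
        for row_before, row_after in zip(before, after)
        for b, a in zip(row_before, row_after)
    ]
    mean_diff = sum(diffs) / len(diffs) / 255.0
    changed = sum(1 for d in diffs if d > threshold) / len(diffs)
    return [mean_diff, changed]


def train_centroids(features, labels):
    """Training: average the feature vectors of each class."""
    sums, counts = {}, {}
    for vec, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, vec):
    """Assign the label whose centroid is closest (squared Euclidean distance)."""
    def dist2(centroid):
        return sum((x - y) ** 2 for x, y in zip(vec, centroid))
    return min(centroids, key=lambda label: dist2(centroids[label]))


def accuracy(centroids, features, labels):
    """Testing: fraction of held-out pairs that are classified correctly."""
    correct = sum(predict(centroids, f) == y for f, y in zip(features, labels))
    return correct / len(labels)
```

In use, each before and after pair is first turned into a feature vector with difference_features; some of the labeled vectors are passed to train_centroids, and the rest are held out for accuracy. With only 25 pairs, as in the deforestation dataset, any accuracy estimate from a held-out subset would be rough.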

The slides below detail our approach in planning and implementing this project: