Course Overview


Course Content by year:

[2022] [2023] [2025]

This course seeks to empower ecologists to accurately and efficiently analyze large image, audio, or video datasets using computer vision methods.

It is a three-week full-immersion course at Caltech that combines classroom training with hands-on projects and one-on-one mentorship. Students learn the rudiments of computer vision and how to train and evaluate computer vision models on their own data to help answer specific ecological research questions. They leave with a working tool and a grasp of the underlying concepts, empowered to tackle diverse ecological problems with computer vision. Participants also develop a network of computer vision researchers with whom they can engage and collaborate.

Learning Outcomes

  • Frame a scientific or ecological research question as a computer vision problem.
    • What is the data?
    • What is the computer vision task? (classification, detection, tracking, etc.)
    • How will solving the computer vision task lead to an answer to my research question? What additional steps will be needed?
  • Review relevant computer vision literature.
  • Curate a representative dataset to prototype a solution to your computer vision problem, and make well-informed choices about how to spend resources: what data to annotate, when to use weak labels, and how to make the most of your time and money.
  • Determine how to split your data for training and evaluation based on your required output and target use case.
  • Use existing, well-maintained open-source codebases to train baseline computer vision models, adapting data loaders and model architectures to fit your data.
  • Evaluate your models in a representative fashion, choose evaluation metrics, and curate evaluation datasets that will tell you how well the method will work for your target outcome.
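
To make the outcomes on data splitting and evaluation concrete, here is a minimal sketch in plain Python. It assumes a camera-trap style dataset in which each image record carries a `location` field, and that the target use case is deployment at new locations, so whole locations are held out for testing. The record layout, the field names, and the helpers `split_by_group` and `per_class_recall` are hypothetical illustrations, not part of the course materials.

```python
import random
from collections import defaultdict


def split_by_group(records, group_key, test_fraction=0.2, seed=0):
    """Split records so every group lands entirely in train or entirely in test."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_key]].append(rec)

    # Shuffle the group ids (not the individual records) reproducibly.
    group_ids = sorted(groups)
    random.Random(seed).shuffle(group_ids)

    # Hold out at least one whole group for testing.
    n_test = max(1, round(len(group_ids) * test_fraction))
    test_ids = set(group_ids[:n_test])

    train = [r for g in group_ids if g not in test_ids for r in groups[g]]
    test = [r for g in group_ids if g in test_ids for r in groups[g]]
    return train, test


def per_class_recall(y_true, y_pred):
    """Return {class: fraction of that class's examples predicted correctly}."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return {cls: correct[cls] / total[cls] for cls in total}


# Hypothetical toy dataset: 5 camera locations, 4 images each.
records = [
    {"image": f"loc{loc}_img{i}.jpg", "location": f"loc{loc}", "label": "deer"}
    for loc in range(5)
    for i in range(4)
]
train, test = split_by_group(records, group_key="location")

# Toy predictions: overall accuracy and per-class recall can disagree
# when classes are imbalanced, which is why the metric choice matters.
recall = per_class_recall(["deer", "deer", "fox"], ["deer", "fox", "fox"])
macro_recall = sum(recall.values()) / len(recall)
```

Splitting by location rather than by image keeps near-duplicate frames from the same camera out of both sets at once, which would otherwise inflate test scores. Whether location is the right grouping depends on the research question; a study at fixed sites might instead hold out later time periods.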