Class Projects

Astro 497, Week 7, Day 3

TableOfContents()

Logistics

  • Exam Results

  • Lessons Learned

  • Mid-Semester Survey

Project Overview

Students will synthesize lessons learned in the class by building an exoplanet dashboard that ingests data related to detecting and/or characterizing exoplanets, performs basic data manipulations, fits a model to the data, assesses the quality of the model for the given observations, and effectively visualizes the results.

What is a Dashboard?

Purpose

  • Efficiently communicate what can be learned from data

How

  • Automating common tasks

    • Incorporating (new) data into the decision-making process

    • Data wrangling (e.g., cleaning, transforming, analyzing)

    • Applying simple models

    • Evaluating models

    • Providing common visualizations

  • Facilitating communication & learning

    • Visualizing data

    • Visualizing model predictions

    • Providing common model assessment metrics

    • Automating easy decisions

    • Making it easier to find the information needed for hard decisions

Dashboard Elements

  • Ingest data

  • Data Wrangling

  • Model Fitting

  • Model Assessment

  • Visualization

  • Warning Messages
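As a rough illustration of how these elements fit together, here is a minimal sketch in plain Julia. Everything in it is an assumption made for illustration: `ingest` returns a tiny hard-coded dataset rather than reading a real file, the model is just a constant, and the warning thresholds (10 observations, reduced chi-squared of 2) are arbitrary. Visualization is omitted; in a real dashboard each result would feed a plot.

```julia
# Ingest data: hypothetical stand-in for reading a file or querying an archive.
ingest() = (t = [0.0, 1.0, 2.0, 3.0, 4.0],
            y = [1.0, 1.1, NaN, 0.9, 1.0],
            σ = fill(0.1, 5))

# Data wrangling: drop observations whose measured value is missing (NaN).
function wrangle(d)
    keep = .!isnan.(d.y)
    return (t = d.t[keep], y = d.y[keep], σ = d.σ[keep])
end

# Model fitting: constant model, estimated by the inverse-variance weighted mean.
fit_constant(d) = sum(d.y ./ d.σ .^ 2) / sum(1 ./ d.σ .^ 2)

# Model assessment: reduced chi-squared for a model with one fitted parameter.
reduced_chi2(d, μ) = sum(((d.y .- μ) ./ d.σ) .^ 2) / (length(d.y) - 1)

# Warning messages: plain-English notes about things a user should not overlook.
# The thresholds below are arbitrary choices for this sketch.
function warnings(d, χ2)
    msgs = String[]
    length(d.y) < 10 && push!(msgs, "Only $(length(d.y)) usable observations; results may be unreliable.")
    χ2 > 2 && push!(msgs, "The constant model fits poorly (reduced chi-squared = $(round(χ2, digits = 2))).")
    return msgs
end

data = wrangle(ingest())
μ = fit_constant(data)
χ2 = reduced_chi2(data, μ)
msgs = warnings(data, χ2)
```

Keeping each element in its own small function makes it easier to swap in a real data source or a better model later without touching the rest.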

Project Plan

Purpose

What will be the purpose of the dashboard?

Obtaining Data

What data set(s) will your dashboard use for its analysis?

  • What observatories and instruments could provide the data to be analyzed?

  • How many different objects (or time periods) are publicly available?

  • Where will you/your dashboard download the data from?

  • Is the dataset small enough to download in its entirety once? Or is it large enough that the dashboard will query a database to retrieve the data for each object (or time period) separately?

  • What format will the data be in?
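If you choose to download data once and reuse it, one possible pattern is a small caching helper like the sketch below. The names `cached_fetch`, `downloader` and `fake_downloader` are hypothetical, and the stand-in downloader just writes a two-line CSV so that the example runs without network access.

```julia
# Fetch a file only if no local copy exists yet.
# `downloader` is any function that writes the data to the path it is given.
function cached_fetch(downloader::Function, path::AbstractString)
    if !isfile(path)
        mkpath(dirname(path))   # make sure the cache directory exists
        downloader(path)
    end
    return path
end

# Stand-in downloader (an assumption for this demo) that counts how often it is called.
ncalls = Ref(0)
fake_downloader(path) = (ncalls[] += 1; write(path, "time,flux\n0.0,1.0\n"))

cache_file = joinpath(mktempdir(), "data", "target.csv")
cached_fetch(fake_downloader, cache_file)   # first call writes the file
cached_fetch(fake_downloader, cache_file)   # second call reuses the cached copy
```

In a real dashboard the downloader might be something like `p -> Downloads.download(url, p)` using Julia's `Downloads` standard library, with `url` pointing at whichever archive you choose.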

Data Wrangling

  • What data wrangling tasks (e.g., cleaning, transforming) do you anticipate needing to perform?

  • Will the data for each object (or time period) arrive in a single table? Or will you need to perform joins across multiple tables?
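To make the idea of a join concrete, here is a small sketch using only base Julia. The two tables, their column names and their values are made up for illustration. In practice you would more likely load the tables into DataFrames.jl and use its `innerjoin` function.

```julia
# Two hypothetical tables, each stored as a vector of NamedTuples (one per row).
rvs   = [(star = "A", rv = 1.2), (star = "B", rv = -0.4), (star = "C", rv = 0.7)]
stars = [(star = "A", teff = 5700), (star = "B", teff = 4900)]

# Inner join: keep only rows of `left` whose key also appears in `right`,
# and merge the columns of the matching rows.
# Assumes each key appears at most once in `right`.
function inner_join(left, right, key::Symbol)
    lookup = Dict(getproperty(r, key) => r for r in right)
    return [merge(l, lookup[getproperty(l, key)])
            for l in left if haskey(lookup, getproperty(l, key))]
end

joined = inner_join(rvs, stars, :star)
```

Star "C" has no entry in the second table, so it is dropped from the result. Whether dropping such rows is acceptable (versus keeping them with missing values) is one of the wrangling decisions to plan for.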

Modeling

  • What models will your dashboard fit to the data?

  • What will serve as the robust baseline model?

  • What will serve as the more sophisticated model?

  • What will the models predict?

  • How will you assess your models?
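As one possible way to think about these questions, the sketch below fits a constant (baseline) model and a sinusoid with an assumed known period (a "more sophisticated" model) to synthetic data, then compares them. All of the specifics are assumptions for illustration: the data are synthetic and noise-free so the results are deterministic, the period `P` is treated as known so that both fits are linear least-squares problems, and the comparison statistic is the BIC for Gaussian errors with known uncertainties (chi-squared plus k·ln(n), up to a constant).

```julia
using LinearAlgebra

# Weighted linear least squares: find θ minimizing sum(((y - A*θ) / σ)^2).
function fit_linear(A, y, σ)
    θ = (A ./ σ) \ (y ./ σ)                  # scale each row by 1/σ, then solve
    χ2 = sum(((y .- A * θ) ./ σ) .^ 2)
    return θ, χ2
end

# Bayesian Information Criterion (up to an additive constant); lower is better.
bic(χ2, k, n) = χ2 + k * log(n)

# Synthetic, noise-free "observations" (an assumption for this demo).
P = 10.0
t = collect(0.0:1.0:29.0)
σ = fill(1.0, length(t))
y = 3.0 .+ 5.0 .* sin.(2π .* t ./ P)

# Design matrices: baseline (constant only) and sinusoid at the assumed period.
A_const = ones(length(t), 1)
A_sin = hcat(ones(length(t)), sin.(2π .* t ./ P), cos.(2π .* t ./ P))

θ0, χ2_0 = fit_linear(A_const, y, σ)
θ1, χ2_1 = fit_linear(A_sin, y, σ)

prefer_sinusoid = bic(χ2_1, 3, length(t)) < bic(χ2_0, 1, length(t))
```

With real data you would also want to look at the residuals of each model, not just a single summary statistic.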

Visualize/Communicate Results

Describe the plots that will be displayed on your dashboard. For each plot:

  • What data will be shown?

  • Will it be plotted with a curve, points, contours, histogram, etc.?

  • What will be the axes?

  • Is there additional information that could be conveyed through other attributes like size or color of points?

  • Would it be helpful to include multiple panels (e.g., to show data on different x or y scales, or to show predictions of different models)?

  • Will the figures that you have already described be sufficient for the dashboard to achieve its purpose? Or do you anticipate needing additional experimentation to convey the results of the analysis effectively? If you have some early ideas, then provide enough information that you can get constructive feedback on them.

Project schedule

  • What tasks do you (or each member of your team) plan to accomplish each week? Make sure to account for scheduling constraints such as exams or big assignments in other classes, holidays, and travel. Be sure to allow some contingency in the schedule for tasks that take longer than expected or other unexpected delays.

  • If you're working as part of a team, then make a plan for how your team will work. Will you work together on each task simultaneously? Will each person be responsible for writing code to do specific tasks separately? It's particularly important to make a plan that doesn't create problematic dependencies (e.g., one person needs to wait for working code from someone else and the team can only meet the deadline if everything goes perfectly).

  • List any hard scheduling constraints that would prevent you or your team from presenting during class on Dec 2, 5, 7 or 9. You may also indicate any additional scheduling preferences.

Teamwork

Questions

question(md"""
Is it necessary to do the final project in Julia? 
Can we do it in a language like R or Python instead?
""")
Question

Is it necessary to do the final project in Julia? Can we do it in a language like R or Python instead?

By far the easiest way to meet class requirements:

  • Pluto notebook & Julia

  • For nice UI can use PlutoUI.jl

A little extra hassle, but very possibly worth it

  • Pluto notebook with Julia, plus calls to Python or R

    • PyCall.jl: Justin & I have tested on Roar for you.

    • PythonCall.jl: Probably nicer in long term, but I'm not sure it's ready yet.

    • RCall.jl: For R users

  • Examples of when this would make sense:

    • Reading data in obscure file formats using astropy.io

    • Downloading data using astroquery or an archive-specific package (e.g., lightkurve or pyneid)

In theory, there are environments that could work

But it will probably take significantly more time.

  • Dash (Python-specific)

  • Reactor (R-specific)

  • Shiny (R-specific)

  • Potentially more, but I'm worried that they may be less mature, reliable, polished, well documented, etc.

warning_box(md"""
If you try something other than Pluto, be prepared to spend a significant amount of time figuring it out yourself, rewriting code to do tasks that I've already provided examples for, making the dashboard work reliably, and automating the setup process.
""")
Warning:

If you try something other than Pluto, be prepared to spend a significant amount of time figuring it out yourself, rewriting code to do tasks that I've already provided examples for, making the dashboard work reliably, and automating the setup process.

warning_box(md"""
It is to be a dashboard, not a notebook or a project report:
- It should work on "new" data that you won't have been able to test it on
- It cannot require users to rerun cells in a specific order after selecting a dataset (e.g., target or date range) or changing a parameter.
- It should be *extremely easy* for users to use.
""")
Warning:

It is to be a dashboard, not a notebook or a project report:

  • It should work on "new" data that you won't have been able to test it on

  • It cannot require users to rerun cells in a specific order after selecting a dataset (e.g., target or date range) or changing a parameter.

  • It should be extremely easy for users to use.

warning_box(md"""
If using another language, make absolutely sure that your dashboard works reliably for other users and on other systems.  
- Exactly reproduces all package versions
- Any dependencies need to be automatically installed (likely in user space)
- Works on Linux (ideally also MacOS, Windows, etc., but I won't test that) 
- Automatically deals with file paths, system libraries, etc.
- These details are often annoying, but: (1) the Julia & Pluto developers have taken care of the first two, and (2) Justin and I have already set up Roar to solve the remaining details, including using PyCall with astropy, astroquery and lightkurve.
""")
Warning:

If using another language, make absolutely sure that your dashboard works reliably for other users and on other systems.

  • Exactly reproduces all package versions

  • Any dependencies need to be automatically installed (likely in user space)

  • Works on Linux (ideally also MacOS, Windows, etc., but I won't test that)

  • Automatically deals with file paths, system libraries, etc.

  • These details are often annoying, but: (1) the Julia & Pluto developers have taken care of the first two, and (2) Justin and I have already set up Roar to solve the remaining details, including using PyCall with astropy, astroquery and lightkurve.

warning_box(md"""
- I recommend that students who want to engage in original research become fluent in at least one high-level language (e.g., Julia, Python, R, IDL, MATLAB, Mathematica, ...) and one compiled and strongly-typed language (e.g., Julia, C/C++, Fortran, ...).  
- If you are only fluent in high-level language(s), then there will come a time when you are severely limited in what you can do.  This is particularly a concern for people likely to work with large datasets, large models and/or computationally expensive models.  

→ If you are only fluent in high-level language(s), then I suggest using this opportunity to expand your skillset.  
""")
Warning:
  • I recommend that students who want to engage in original research become fluent in at least one high-level language (e.g., Julia, Python, R, IDL, MATLAB, Mathematica, ...) and one compiled and strongly-typed language (e.g., Julia, C/C++, Fortran, ...).

  • If you are only fluent in high-level language(s), then there will come a time when you are severely limited in what you can do. This is particularly a concern for people likely to work with large datasets, large models and/or computationally expensive models.

→ If you are only fluent in high-level language(s), then I suggest using this opportunity to expand your skillset.

question(md"""Will we have access to any public databases or specific ones chosen for us?""")
Question

Will we have access to any public databases or specific ones chosen for us?

Your choice. (If you have a data source in mind and would like suggestions for how to access it, let me know.)

Potential Data Sources

  • Transit light curves:

    • Kepler/K2

    • TESS

  • Transit Timing Variations:

    • Table of transit times from Holczer et al. (2016)

  • Radial Velocities:

    • California Legacy Survey RVs

    • NEID standard star observations

    • NEID solar observations

  • Host star properties

    • California Legacy Survey spectra

    • NEID standard star spectra

    • Gaia

question(md"What are some good external resources that provide in-depth explanations on methods for exoplanet data analysis?")
Question

What are some good external resources that provide in-depth explanations on methods for exoplanet data analysis?

It depends on the method. For most problems, the details are sufficiently technical that you need to go to the original journal articles. Usually, understanding the state of the art requires reading a set of papers, each of which describes one or two steps in detail but cites other papers for the details of the other steps. (I can help find those for a particular method.)

question(md"Would it be possible to use available data sets to discover planets that have not been found yet by anyone else?")
Question

Would it be possible to use available data sets to discover planets that have not been found yet by anyone else?

Yes. It's possible. That said, the easiest-to-find planets are the least likely to have been overlooked. I'd encourage you to set goals that don't depend on what others have done in the past (e.g., build the capability to detect a planet, but call it a success even if all your detections have been discovered previously).

Dashboard Checklist

  1. Dashboard successfully reads in data for user-selected objects and/or time periods. (1 point)

  2. Dashboard performs whatever data wrangling is necessary to provide high-quality results in subsequent analysis. (1 point)

  3. Dashboard provides effective visualizations of input data (with relatively little preprocessing and, if applicable, after any potentially significant preprocessing). (1 point)

  4. Dashboard successfully fits baseline model to user-selected data. (1 point)

  5. Dashboard effectively visualizes the predictions of the baseline model and the deviations of the predictions from observations. (1 point)

  6. Dashboard provides accurate and useful assessment of quality of results from baseline model. (1 point)

  7. Dashboard successfully fits at least one more sophisticated model to user-selected data. (2 points)

  8. Dashboard effectively visualizes the predictions of at least one more sophisticated model and the deviations of the predictions from observations. (1 point)

  9. Dashboard provides accurate and useful assessment of quality of results from at least one more sophisticated model. (2 points)

  10. Dashboard provides additional visualizations and prominent warning messages that communicate the results of the analysis effectively and clearly. (3 points)

  11. Dashboard successfully runs to completion (and any error messages are in plain English) on Roar without manual setup steps. (2 points)

  12. Does the dashboard provide a simple and effective user interface for selecting data to be analyzed, setting any user-specified parameters and/or interacting with visualizations? (2 points)

  13. Does the dashboard effectively achieve its stated goals? (2 points)
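For item 11, one way to keep error messages in plain English is to wrap each risky step so that failures come back as a readable message instead of a stack trace. The sketch below is only one possible approach; the helper name `try_step` and the wording of the message are hypothetical.

```julia
# Run `f()`; on failure, return a plain-English message instead of throwing.
# `description` should complete the sentence "Could not ...".
function try_step(f, description::AbstractString)
    try
        return (ok = true, value = f(), message = "")
    catch err
        return (ok = false, value = nothing,
                message = "Could not $description. Details: $(sprint(showerror, err))")
    end
end

# A step that fails: the text cannot be parsed as a number.
result = try_step(() -> parse(Float64, "not-a-number"), "read the orbital period")

# A step that succeeds.
good = try_step(() -> parse(Float64, "3.52"), "read the orbital period")
```

The dashboard can then display `result.message` in a warning box when `result.ok` is false, and continue with `good.value` otherwise.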

Setup

ChooseDisplayMode()
     
using PlutoUI, PlutoTeachingTools
question(str; invite="Question") = Markdown.MD(Markdown.Admonition("tip", invite, [str]))
question (generic function with 1 method)

Built with Julia 1.8.2 and

PlutoTeachingTools 0.2.3
PlutoUI 0.7.43

To run this tutorial locally, download this file and open it with Pluto.jl.
