MATH/COSC 3570 Project Challenge

Modified

May 2, 2024

Show your fun project! Let’s go!! 😎

Timeline and Things to Do

  • Team up! You will be working as a group of 4 (one team with 5). One of you, please email me

    1. your team member list
    2. your team name (Last year we had team names “ggsquard”, “Data Alpha”, “Analytics and Beyond”, “YU”, “Metadata”)

    by Monday, 4/1 11:59 PM.

  • Proposal. Please send me a one-page PDF describing what you are going to do for your project (no word limit) with your project title by Friday, 4/26 11:59 PM.

  • Meeting. Schedule a group meeting with Dr. Yu to discuss your project. Please book a time slot in the Excel form. [Starts April 9, ends May 2.]

  • Presentation. You will be presenting your project on Monday, 5/6 10:30 AM - 12:30 PM.

  • Materials. Please share your entire work (slides, code, data, etc) by Monday, 5/6 11:59 PM.

Policy

Team up!

  • Each one of you loses 3 points of your project grade if you don’t meet the requirement or miss the deadline.

  • You will be randomly assigned to a group if you do not belong to any group before the deadline.

Proposal

  • Each one of you loses 3 points of your project grade if you don’t meet the requirement or miss the deadline.

  • Your proposal (in PDF) should include three parts:

    • Project title
    • The goal of your project. For example, what is the research question you’d like to answer? What packages/tools would you like to introduce?
    • The description of the data set you use in your project. For example, where is the data set from, how large is the data, the variables you use for your project, etc.
  • Although it is risky, you can change your project topic after you submit your proposal if you decide to do something else.

Meeting

  • You lose 3 points of your project grade if you don’t meet the requirement or don’t meet with Dr. Yu at least once.

  • Please choose a meeting time in the Excel form.

  • You must let Dr. Yu know in advance if you need to change your meeting time.

  • You can change your meeting time once.

  • Every team member needs to show up at the meeting.

  • Be prepared to talk briefly about your project.

Presentation

  • Every student has to participate (in-person) in the final presentation in order to pass the course.

  • Each group presentation should be 14 to 15 minutes long, followed by a 1- to 2-minute Q&A. If your presentation is too short or too long, every one of you loses 3 points of your project grade.

  • Every group member has to present some part of the group work. Anyone who does not present receives no points.

  • Questions are encouraged during Q&A. Everyone is welcome to ask any questions about the projects. It helps everyone evaluate every group’s project and presentation performance. See Section 4 for grading policy.

  • Each group is required to ask at least one question.

    • The \(k\)-th group should ask at least one question to the \((k-1)\)-th group in Q&A, \(k = 2, \dots, 7\).
    • The 1st group will ask the last (7th) group questions about their project.
  • If you, as a group, don’t ask a question when you should, every one of you loses 3 points of your project grade.
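The question rotation above can be sketched in a few lines of Python. This is only an illustration of the rule as stated, assuming 7 groups numbered 1 to 7; the function name `group_to_question` is made up for this example.

```python
def group_to_question(k: int, n_groups: int = 7) -> int:
    """Return the number of the group that group k must question.

    Group k questions group k - 1; group 1 wraps around to the last group.
    """
    if not 1 <= k <= n_groups:
        raise ValueError(f"group number must be between 1 and {n_groups}")
    return n_groups if k == 1 else k - 1


# Who questions whom, for all 7 groups.
assignments = {k: group_to_question(k) for k in range(1, 8)}
print(assignments)
# {1: 7, 2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 6}
```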

Materials

  • Each one of you loses 3 points of your project grade if you don’t meet the requirement or miss the deadline.

  • You need to share your entire work, including slides, code, and data if applicable.

  • Your code should be able to reproduce all the numerical results, outputs, tables, and figures shown in the slides, including the source of the raw data (where you find and load the data) if the project is about data analysis.
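One possible way to meet the reproducibility requirement is sketched below in Python. It is not a required template. The URL, the seed value, and the tiny inline data set are all placeholders invented for this illustration; in a real project you would record where the raw data actually comes from and load it from there.

```python
import csv
import io
import random
import statistics

# Hypothetical placeholder: record where the raw data was found.
DATA_SOURCE = "https://example.com/raw_data.csv"

# Fixing the seed makes any random step give the same result on every run.
SEED = 3570

# Made-up stand-in for the raw file, so this sketch runs without a download.
RAW_CSV = """height,weight
160,55
172,68
181,80
168,62
"""


def load_data(text: str) -> list[dict[str, float]]:
    """Parse CSV text into a list of rows with numeric values."""
    reader = csv.DictReader(io.StringIO(text))
    return [{name: float(value) for name, value in row.items()} for row in reader]


def bootstrap_mean(values: list[float], n_resamples: int, seed: int) -> float:
    """Average of bootstrap sample means; deterministic for a fixed seed."""
    rng = random.Random(seed)
    means = [
        statistics.mean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    ]
    return statistics.mean(means)


rows = load_data(RAW_CSV)
heights = [row["height"] for row in rows]
print("Data source:", DATA_SOURCE)
print("Bootstrap mean height:", bootstrap_mean(heights, 200, SEED))
```

Running the script twice prints the same number, which is the property the slides and the shared code should have in common.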

Project Content

Your project can be in either of the following categories:

  1. Data analysis including
    • data visualization
    • estimation/prediction using statistical or machine learning models
  2. Introduce one R or Python package not covered in class, or introduce R or Python functions from packages we have learned that were never discussed in class.

Data Analysis

For your data analysis project,

  • You need to show that you are good at asking meaningful questions and answering them with results of data analysis.

  • Your presentation should include data visualization. Your graphics should be informative and help you

    • explore relationships between variables in your data
    • decide which statistical model to use, so that your research questions can be properly answered.
  • You should discuss how and why statistical methods/machine learning algorithms are chosen for analyzing your data set.

    • The methods we learn in class may not be appropriate for your data or for answering your research questions. If this happens, critique your own methods and provide suggestions for improving your analysis. Any issues with your data and the appropriateness of the statistical analysis should be discussed.
  • You can choose a data set that is publicly available, or you may collect your own data using a survey or by conducting an experiment. The data set you choose cannot be one used in class, including homework assignments and lab exercises.

  • Below is a list of data repositories you can start with, but you are encouraged to explore more and find your favorite one, for example COVID-19 data if you are interested.

R/Python Packages

For your R/Python package project,

  • you need to

    • show how and why the package greatly helps us do data science.
    • explain how to use the functions in the packages by providing data science examples with some real data set. Please don’t use the toy examples in the package documentation.
  • If the functions of the package return any results or outputs, please explain them, teaching your audience how to appropriately read the outputs.

  • If the functions return graphics, explain why the visualizations are informative and useful for understanding data and analysis results.

  • You can choose a package that helps us do what we cannot do with the packages and tools learned in class.

    • For example, we only learned packages that import data files into R, and we don’t know how to extract data from a website. A web-scraping package helps us scrape data from web pages.
  • If you choose a package that provides the same functionality as the packages we learned, please show that the package you choose is better.

    • For example, its code is shorter, it runs faster, its output is clearer, its plots are prettier, etc. For instance, the data.table package (https://rdatatable.gitlab.io/data.table/) provides a high-performance version of base R’s data.frame with syntax and feature enhancements for ease of use, convenience and programming speed.
  • Below is a list of popular R packages that you can start with.

  • Below is a list of popular Python packages that you can start with.

Project Evaluation and Grading

Get the Gold medal! 🥇

  • Your project performance grade is determined by your classmates and Dr. Yu.

  • Table 1 shows your possible performance grade. For example, if your group finishes in second place (Silver) and you are elected the best contributor, your grade is 98. If your group finishes in third place (Bronze) and you are not the best contributor, you get 91.

  • Your project grade will be

Table 1: Performance Grade Sheet

Place            Best contributor 🎖️    Other members
Gold 🥇 1st      100                     97
Silver 🥈 2nd    98                      94
Bronze 🥉 3rd    95                      91
Other teams      90                      87
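Reading the grade sheet amounts to a simple lookup, sketched here in Python for illustration. The dictionary keys and the function name `performance_grade` are made up for this example; the numbers are the ones in the table. The sketch covers only the performance grade, not any point deductions from the Policy section.

```python
# (best contributor grade, other members' grade) for each placement.
PERFORMANCE_GRADES = {
    "gold": (100, 97),
    "silver": (98, 94),
    "bronze": (95, 91),
    "other": (90, 87),
}


def performance_grade(place: str, best_contributor: bool) -> int:
    """Look up the performance grade for a placement and contributor status."""
    best, others = PERFORMANCE_GRADES[place]
    return best if best_contributor else others


print(performance_grade("silver", best_contributor=True))   # 98
print(performance_grade("bronze", best_contributor=False))  # 91
```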

Group Performance Evaluation

  • You will need to evaluate all group projects except the one you work on.

  • You evaluate group performance based on the rubric attached. Four evaluation criteria are considered:

    • Project Content and Organization (8 pts)
    • Presentation Material (Slides) Quality (4 pts)
    • Oral Presentation Skill and Delivery (4 pts)
    • Interactions and Q&A (4 pts)
  • A project presentation is worth 20 points in total.

  • Evaluation sheets will be provided on the presentation day.

  • How do you get the full points for each category? Check the requirements below. Note that for Content and Organization, data analysis and package projects have different requirements.

  • Content and Organization (Data Analysis)

    • Beautiful visualizations help reveal relationships between variables and guide model specification
    • All questions are answered accurately by the models
    • Discuss how and why the models are chosen
    • Apply sophisticated models and detailed analysis
    • All ideas are presented in logical order
  • Content and Organization (Packages)

    • Show how and why the package greatly helps us do data science
    • Explain how to use the functions in the package by providing concrete real data science applications and examples with data sets
    • Teach audience with understandable examples of how to appropriately read the outputs and/or why the visualizations are informative and useful for understanding data and analysis results
    • Show that the package is better in some sense: its code is shorter, it runs faster, its output is clearer, its plots are prettier, etc.
    • All ideas are presented in logical order
  • Presentation Material Quality

    • Presentation material shows code and output beautifully
    • Presentation material clearly aids the speaker in telling a coherent story
    • All tables and graphics are informative and related to the topic and make it easier to understand
    • Attractive design, layout, and neatness.
  • Oral Presentation Skill

    • Good volume and energy
    • Proper pace and diction
    • Avoidance of distracting gestures
  • Interactions and Q&A

    • Good eye contact with audience
    • Excellent listening skills
    • Answers audience questions with authority and accuracy
  • After you evaluate 6 group project presentations, you rank them from 1st to 6th based on their earned points.

  • No two groups receive the same ranking. If you give two or more groups the same points, you still need to give them different rankings, deciding which teams deserve a higher ranking according to your preference.
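The scoring and ranking steps can be sketched in Python as follows. This is only an illustration: the group labels, the example scores, and the names `total_points` and `rank_groups` are invented here, and the evaluator's preference order is modeled as a plain list used solely to break ties.

```python
# Maximum points per rubric criterion (8 + 4 + 4 + 4 = 20).
RUBRIC_MAX = {"content": 8, "slides": 4, "delivery": 4, "qa": 4}


def total_points(scores: dict[str, int]) -> int:
    """Sum one group's rubric scores after checking each against its maximum."""
    for criterion, points in scores.items():
        if not 0 <= points <= RUBRIC_MAX[criterion]:
            raise ValueError(f"{criterion} must be between 0 and {RUBRIC_MAX[criterion]}")
    return sum(scores.values())


def rank_groups(totals: dict[str, int], preference: list[str]) -> dict[str, int]:
    """Rank groups by points (highest first); ties go to the preferred group."""
    order = sorted(totals, key=lambda g: (-totals[g], preference.index(g)))
    return {group: rank for rank, group in enumerate(order, start=1)}


# Hypothetical totals for the 6 groups you evaluate.
totals = {"A": 18, "B": 16, "C": 18, "D": 12, "E": 15, "F": 16}
# Your own preference order, consulted only when points are tied.
preference = ["C", "A", "F", "B", "E", "D"]

print(rank_groups(totals, preference))
# {'C': 1, 'A': 2, 'F': 3, 'B': 4, 'E': 5, 'D': 6}
```

Groups A and C tie at 18 points, and B and F tie at 16, so the preference list decides their order while every group still gets a distinct rank.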

Individual Performance Evaluation

  • You choose one single person who you think contributes the most to your group project.

  • You cannot vote for yourself, and you can only vote for one of your teammates.

  • If you don’t vote, you can’t be the best contributor even if you obtain the most votes. In that case, the person with the second-most votes wins the best contributor award.

  • If no single person gets the most votes, every team member receives the same grade. For example, if your group finishes in third place (Bronze) and there is no best contributor, all members receive 91 (see Table 1).
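One reading of these voting rules is sketched below in Python. It is an interpretation, not the official tally procedure: it assumes that members who did not vote are removed from consideration first, and that a winner is declared only if exactly one remaining member has the most votes. The member names and the function name `best_contributor` are hypothetical.

```python
from collections import Counter
from typing import Optional


def best_contributor(votes: dict[str, Optional[str]]) -> Optional[str]:
    """Return the best contributor, or None if there is no single winner.

    `votes` maps each team member to the teammate they voted for,
    or to None if they did not vote.
    """
    team = set(votes)
    for voter, choice in votes.items():
        if choice is None:
            continue
        # You cannot vote for yourself or for someone outside your team.
        if choice == voter or choice not in team:
            raise ValueError(f"invalid vote by {voter}: {choice}")

    tally = Counter(choice for choice in votes.values() if choice is not None)
    # Assumption: members who did not vote cannot win.
    eligible = {m: n for m, n in tally.items() if votes[m] is not None}
    if not eligible:
        return None

    top = max(eligible.values())
    winners = [m for m, n in eligible.items() if n == top]
    return winners[0] if len(winners) == 1 else None


# Ben gets the most votes but did not vote, so Ana wins.
print(best_contributor({"Ana": "Ben", "Ben": None, "Cal": "Ben", "Dee": "Ana"}))  # Ana
# Everyone gets one vote: no single winner.
print(best_contributor({"Ana": "Ben", "Ben": "Ana", "Cal": "Dee", "Dee": "Cal"}))  # None
```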