Northeastern University Data Analysis Technique Assignment
(1) Email-essay scenario: The data science team you work on (at Salesforce, Netflix, or a company of your own imagining) is interested in using a data analysis technique that is new to the team or company. You have been tasked with gaining a general overview of the technique and writing a medium-length email (~3 paragraphs) summarizing your findings. Include:
- a description of the technique for a technical audience: an introduction to the method that serves as a jumping-off point for learning more detail and discussing the method
- the value of the method
- the types of data it can be used for
- the method’s limitations
Choosing a technique: Choose a data analysis technique you are interested in learning more about, based on your analytics skill level. For example, if you’ve never done a regression, pick regression, correlation, or something simpler. If you have more analytics experience, use this as an opportunity to learn more about a method you’re interested in or to dig deeper into a method from another class. You are welcome to describe how this analysis method could be applied to the data or business question from your case study.
Some ideas: regression, cross-validation, filtering (signal processing), deep learning, principal components analysis (PCA), random forests, randomization techniques
Requirements:
- Use in-line citations where appropriate
- Include a reference list/bibliography
- Minimum 300 words
(4) Short Answer
Prompt:
Answer the following questions based on this week’s reading:
- What is the purpose of statistical diagnostics? (refer to Bartlett Chapter 8)
- What considerations need to be made when choosing training data for machine learning algorithms? (refer to machine learning reading)
- What is the purpose of cross-validation? (refer to cross-validation videos)
Requirements:
- Use in-line citations where appropriate
- Include a reference list/bibliography
- Minimum 300 words (total)
Required reading: https://www.technologyreview.com/s/608248/biased-algorithms-are-everywhere-and-no-one-seems-to-care/