This course provides students with a foundation in data preparation and preliminary analytics using R, applicable to research, quality improvement, and large-scale industry data analytics projects. The course covers the following skills: data analysis with publicly available data sets; cleansing and imputing data; descriptive statistics; and data visualization.
After successfully completing this course, students will be able to
Please note that all times in the syllabus and in Blackboard refer to Eastern Time. The discussion board and assignment links for each week will open at the start of the week for submissions.
Discussion Board Posts: Each week there will be a discussion board that addresses a topic within the current module. These assignments will assess your ability to clearly and accurately apply concepts from your readings and from your own experiences. Each week you are expected to submit an initial post and comment on at least 2 other students’ posts. You need to follow APA guidelines for citing any sources you may reference in either your initial post or your response to others. Refer to the Discussion Rubric and weekly discussion question for submission guidelines. Please be sure to follow the individual directions provided with each Discussion Board Prompt, as the requirements may vary from Discussion Board to Discussion Board.
Homework Assignments (Weeks 1-7): As part of this course, we will be running an interactive program to help you learn the ins-and-outs of R. The program is called swirl (https://swirlstats.com).
Initial post: You should submit your initial post by 11:59 p.m. Sunday. Your initial post should be approximately 500 words.
Response to others: You should comment on at least 2 other students’ posts by 11:59 p.m. Wednesday. Your comments to others should be thorough, thoughtful, and they should offer some new content. Do not merely respond with “I agree” or “I disagree.” Engage directly with the ideas of your classmates and briefly mention which part of the post you are responding to.
Week 1 Assignment: Think of a research question on a topic that you would like to study. Identify the setting where you will collect the data and the variables that will be needed to answer the research question, and then use Microsoft Excel to build a prototype of the data visualization. Refer to the Week 1 Assignment Rubric and assignment instructions for submission guidelines.
Week 2 Assignment: In this assignment, you will run an R script on a comma delimited file containing patient data. You will calculate the mean length of stay for patients, create a box plot, and run one additional calculation. Refer to the Week 2 Assignment Rubric and assignment instructions for submission guidelines.
Week 3 Assignment: In this assignment, you will select an R package from CRAN to use when answering a question about where urgent care centers should be located. Refer to the Week 3 Assignment Rubric and assignment instructions for submission guidelines.
Week 4 Assignment: This week, you will prepare (clean) a data set to meet certain specifications before analysis. Refer to the Week 4 Assignment Rubric and assignment instructions for submission guidelines.
Week 5 Assignment: Using the data set you prepared in Week 4, you will run commands in R Studio to consider whether payments from non-Medicare sources differ in interesting and meaningful ways by DRG, state, or both. Refer to the Week 5 Assignment Rubric and assignment instructions for submission guidelines.
Week 6 Assignment: This week, you will continue using your modified dataset on Maine and Alabama to determine mean discharge days, mean payments, and a five number summary. Refer to the Week 6 Assignment Rubric and assignment instructions for submission guidelines.
Week 7 Assignment: For your final assignment, you will prepare (clean) an EHR incentive file and run R scripts to obtain specific outputs. Refer to the Week 7 Assignment Rubric and assignment instructions for submission guidelines.
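Several of the assignments above ask for a mean and a five-number summary. The course work itself is done in R; purely to illustrate what those statistics are, here is a minimal Python sketch using made-up length-of-stay values. The data and the name `five_number_summary` are hypothetical, and quartile conventions differ between tools, so R's output for the same data may differ slightly.

```python
from statistics import mean, quantiles


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of at least two numbers."""
    ordered = sorted(values)
    # method="inclusive" is one of several quartile conventions;
    # choosing it here is an assumption, not a course requirement.
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    return (ordered[0], q1, median, q3, ordered[-1])


# Hypothetical lengths of stay in days (not course data).
length_of_stay = [2, 3, 4, 5, 7, 8, 10]

print("mean:", round(mean(length_of_stay), 2))
print("five-number summary:", five_number_summary(length_of_stay))
```

The mean alone hides the spread of the values; the five-number summary shows the range and the middle half of the data, which is also what a box plot draws.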
Your grade in this course will be determined by the following criteria:
Assignment | Points |
---|---|
Discussions (8*3 points each) | 24 |
Swirl Homework (6*2 points each) | 12 |
Week 1 Quiz | 5 |
Week 1 Assignment | 8 |
Week 2 Assignment | 10 |
Week 3 Assignment | 7 |
Week 4 Assignment | 10 |
Week 5 Assignment | 7 |
Week 6 Assignment | 10 |
Week 7 Assignment | 7 |
Total | 100 |
Grade | Points | Grade Point Average (GPA) |
---|---|---|
A | 94 – 100% | 4.00 |
A- | 90 – 93% | 3.75 |
B+ | 87 – 89% | 3.50 |
B | 84 – 86% | 3.00 |
B- | 80 – 83% | 2.75 |
C+ | 77 – 79% | 2.50 |
C | 74 – 76% | 2.00 |
C- | 70 – 73% | 1.75 |
D | 64 – 69% | 1.00 |
F | 0 – 63% | 0.00 |
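As a rough illustration of how the point totals and the grading scale above fit together, the following Python sketch sums the listed points and looks up a letter grade. The names `POINTS`, `GRADE_SCALE`, and `letter_grade` are hypothetical, and the handling of fractional percentages (no rounding, compared against each band's lower bound) is an assumption; the syllabus does not state a rounding rule.

```python
# Point values copied from the grading criteria table.
POINTS = {
    "Discussions": 8 * 3,
    "Swirl Homework": 6 * 2,
    "Week 1 Quiz": 5,
    "Week 1 Assignment": 8,
    "Week 2 Assignment": 10,
    "Week 3 Assignment": 7,
    "Week 4 Assignment": 10,
    "Week 5 Assignment": 7,
    "Week 6 Assignment": 10,
    "Week 7 Assignment": 7,
}

# (lower bound in percent, letter grade, GPA) copied from the grading scale.
GRADE_SCALE = [
    (94, "A", 4.00), (90, "A-", 3.75), (87, "B+", 3.50), (84, "B", 3.00),
    (80, "B-", 2.75), (77, "C+", 2.50), (74, "C", 2.00), (70, "C-", 1.75),
    (64, "D", 1.00), (0, "F", 0.00),
]


def letter_grade(percent):
    """Map a course percentage (0 to 100) to a (letter, GPA) pair."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    for lower_bound, letter, gpa in GRADE_SCALE:
        # Assumption: no rounding; the first band whose lower bound is met wins.
        if percent >= lower_bound:
            return letter, gpa


print("total points:", sum(POINTS.values()))
print("grade for 88 points:", letter_grade(88))
```

Because the course totals 100 points, points earned and percentage are the same number.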
Course learning modules are divided into weeks. Each week starts on Wednesday at 12:00 am Eastern Time (ET) and closes on Wednesday at 11:59 pm ET, with the exception of Week 8, which ends on Sunday. All assignments must be submitted by 11:59 pm ET on the due date.
Learning Modules | Topics | Assignments and Due Dates |
---|---|---|
Week 1: Jan 8 – Jan 15 | Introduction to Data Analytics and R: Tools, Techniques and Data | Week 1 Discussion: Initial post due Sunday. Responses due by Wednesday. Quiz: Data Analytics Process: Due by Wednesday. You will not be able to take the quiz after this date. Week 1 Swirl Homework (optional). Week 1 Assignment: Due by Wednesday. |
Week 2: Jan 15 – Jan 22 | Common R Functions and Language | Week 2 Discussion: Initial post due Sunday. Responses due by Wednesday. Week 2 Swirl Homework: Due by Wednesday. Week 2 Assignment: Due by Wednesday. |
Week 3: Jan 22 – Jan 29 | CRAN and R Packages | Week 3 Discussion: Initial post due Sunday. Responses due by Wednesday. Week 3 Swirl Homework: Due by Wednesday. Week 3 Assignment: Part 1 due by Sunday; Part 2 due by Wednesday. |
Week 4: Jan 29 – Feb 5 | Data Preparation (cleansing and data normalization) | Week 4 Discussion: Initial post due Sunday. Responses due by Wednesday. Week 4 Swirl Homework: Due by Wednesday. Week 4 Assignment: Due by Wednesday. |
Week 5: Feb 5 – Feb 12 | R Scripts (functions) and Data Analysis – Part 1 | Week 5 Discussion: Initial post due Sunday. Responses due by Wednesday. Week 5 Swirl Homework: Due by Wednesday. Week 5 Assignment: Due by Wednesday. |
Week 6: Feb 12 – Feb 19 | R Scripts (functions) and Data Analysis – Part 2 | Week 6 Discussion: Initial post due Sunday. Responses due by Wednesday. Week 6 Swirl Homework: Due by Wednesday. Week 6 Assignment: Due by Wednesday. |
Week 7: Feb 19 – Feb 26 | R Scripts (functions) and Data Analysis – Part 3 | Week 7 Discussion: Initial post due Sunday. Responses due by Wednesday. Week 7 Swirl Homework: Due by Wednesday. Week 7 Assignment: Due by Wednesday. |
Week 8 (short week): Feb 26 – Mar 1 | Data Visualization (Conclusion) | Week 8 Discussion: Initial post due Friday. Responses due by Sunday. |
Read:
Marc, D. & Sandefer, R. (2016). Data analytics in healthcare research: tools and strategies. Chicago, Illinois: AHIMA Press.
Kuo, Y., Goodwin, J. S., Chen, N., Lwin, K. K., Baillargeon, J., & Raji, M. A. (2015). Diabetes mellitus care provided by nurse practitioners vs primary care physicians. Journal of the American Geriatrics Society, 63(10), 1980-1988. doi:10.1111/jgs.13662
Swirl Guide
Week 1 Course Survey
Please also review the following resources:
Read the following article:
Kuo, Y., Goodwin, J. S., Chen, N., Lwin, K. K., Baillargeon, J., & Raji, M. A. (2015). Diabetes mellitus care provided by nurse practitioners vs primary care physicians. Journal of the American Geriatrics Society, 63(10), 1980-1988. doi:10.1111/jgs.13662
Do you think that a table is an adequate representation of the data? If not, suggest other visualizations. Support your preference with examples.
Take the following steps to complete your assignment:
Begin to think like a researcher
Read:
Marc, D. & Sandefer, R. (2016). Data analytics in healthcare research: tools and strategies. Chicago, Illinois: AHIMA Press.
R Language Definition: https://cran.r-project.org/doc/manuals/r-release/R-lang.pdf
An Introduction to R: https://cran.r-project.org/doc/manuals/r-release/R-intro.pdf
Identify an area in healthcare that could benefit from health informatics. What are the challenges of obtaining data? Can you always trust the analysis?
Read:
Marc, D. & Sandefer, R. (2016). Data analytics in healthcare research: tools and strategies. Chicago, Illinois: AHIMA Press.
CRAN and R Packages: https://cran.r-project.org/ Hint: Look for the packages link on the left side of the page.
Gutiérrez-Sacristán, A., Bravo, À., Giannoula, A., Mayer, M. A., Sanz, F., & Furlong, L. I. (2018). comoRbidity: An R package for the systematic analysis of disease comorbidities. Bioinformatics (Oxford, England), 34(18), 3228-3230. doi:10.1093/bioinformatics/bty315
For this week’s discussion read the following article and answer the questions below:
“comoRbidity: An R package for the systematic analysis of disease comorbidities” and the supplementary file information.
How could this type of R package improve healthcare quality and enhance decision making? As you respond to this question, also consider: Why was comoRbidity created? What types of data analysis questions can the R package comoRbidity solve?
Scenario: You are employed at a local hospital as a health informatics analyst. The hospital has experienced a 200% increase in emergency room (ER) visits over the past 12 months. This has caused ER wait times to increase from 25 minutes to 90 minutes.
Initial analysis of the data showed that many of the ER visits were due to routine medical care or minor illness/injury. The hospital wants to investigate the possibility of opening urgent care centers in the city to reduce the number of non-emergent ER visits. The current question at hand is: where should these clinics be located, and are there certain locations within the city where large populations are being seen in the ER for non-emergent conditions?
Instructions: Knowing the type of data that is collected when you present at the ER, select an R package from the list on cran.r-project.org that could assist you in answering the aforementioned questions and can be used to show your analysis to hospital administration.
Install the R package that you selected and capture a screenshot of RStudio once the package has been installed. Discuss how you would use the selected R package to answer the questions above and what types of data you would need (submit your work as a Microsoft Word document).
Read:
CMS Public Data website: https://data.cms.gov/ This website is one source for publicly available health data.
Data.CMS.gov: Inpatient Prospective Payment System (IPPS) Provider Summary for the Top 100 Diagnosis-Related Groups (DRG) – FY2013 – https://data.cms.gov/Medicare-Inpatient/Inpatient-Prospective-Payment-System-IPPS-Provider/kd35-nmmt
Medical big data: promise and challenges: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5331970/
Data Cleaning: Detecting, Diagnosing, and editing data abnormalities: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1198040/
For this week’s discussion, read the two scholarly articles that address data cleansing and data normalization, and then discuss the following questions.
Scenario: For this assignment you are employed by CMS (Centers for Medicare and Medicaid Services) as a senior data analyst and you are being asked to compare and contrast chronic disease data from Maine and Alabama.
Instructions: To start, download (as a CSV file) the FY 2013 Inpatient Prospective Payment System (IPPS) Provider Summary for the Top 100 Diagnosis-Related Groups (https://data.cms.gov/Medicare-Inpatient/Inpatient-Prospective-Payment-System-IPPS-Provider/kd35-nmmt).
Before you can analyze the data, you need to prepare (clean) the data to meet the specifications of the chronic disease data report. These specifications include:
There are many ways to achieve the final dataset. Submit your dataset as a Microsoft Excel spreadsheet. There should be 513 rows in the file you submit. Include a brief paragraph (Word document) describing the process you used to select the data. Note that some DRGs may not be present in both states.
Read:
Marc, D. & Sandefer, R. (2016). Data analytics in healthcare research: tools and strategies. Chicago, Illinois: AHIMA Press.
Perform a web search for “Anscombe’s Quartet,” a collection of four very small sets of data with identical (or near-identical) numerical statistics, but very different visual characteristics when graphed as scatterplots. (See Anscombe (1973) for the original paper.) Why is it important to visualize data before any statistical analyses are performed?
For this week’s assignment, import your CSV file (from week 4 — Alabama and Maine) into R Studio.
Read:
Introduction to dplyr: https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html
Watch
SuperDataScience. (Aug 17, 2017). R PROGRAMMING dplyr BASICS – summarize, group_by, select, mutate, filter, arrange. Retrieved from https://www.youtube.com/watch?v=BaFkbNOaof8.
With your dataset in mind, what can descriptive statistics tell us about the data? What kind of inferences can we draw from the data? What are the limits to descriptive data within your dataset?
Using RStudio and the R package dplyr, answer the following questions based on the modified data set you created last week. Show your work through screenshots.
Read:
Marc, D. & Sandefer, R. (2016). Data analytics in healthcare research: tools and strategies. Chicago, Illinois: AHIMA Press.
Download:
You will use the file below in your assignment for this week.
Week 7 – Chapter 7.csv
Why is it important to identify erroneous values in the data set, harmonize measures, and recode data fields prior to analysis? As part of your response, provide examples of potential problems with the analysis if these issues were not corrected (think of any type of healthcare data set: admission data, cancer registry, etc.).
Note: The instructions in your book may not perfectly match the version of Excel you are using. Some icons or language may differ slightly, but the functionality is the same.
For this week’s assignment you will use the EHR incentive Excel file.
Marc, D. & Sandefer, R. (2016). Data analytics in healthcare research: tools and strategies. Chicago, Illinois: AHIMA Press.
Imagine you have recently been hired as a senior healthcare data analyst at the Maine Department of Public Health. You have noticed that most of their reports and data analysis are generated and published using traditional text generation. You are scheduled to meet with the chief operating officer (COO) to discuss how improvements to these reports could benefit decision making. Most of the reports are longitudinal in nature, where health data is generated and compared over months or years.
How would you convince the COO that data visualization could enhance data interpretation and lead to more informed decision making? Discuss the benefits of data visualization and provide examples of which types of data visualization would work best given the data that is tracked by the department.
Your Student Support Specialist is a resource for you. Please don't hesitate to contact them for assistance, including, but not limited to course planning, current problems or issues in a course, technology concerns, or personal emergencies.
Questions? Visit the Student Support Health Informatics page
The Student Academic Success Center (SASC) offers a range of services to support your academic achievement, including tutoring, writing support, test prep and studying strategies, learning style consultations, and many online resources. To make an appointment for tutoring, writing support, or a learning specialist consultation, go to une.tutortrac.com. To access our online resources, including links, guides, and video tutorials, please visit:
Any student who would like to request, or ask any questions regarding, academic adjustments or accommodations must contact the Student Access Center at (207) 221-4438 or pcstudentaccess@une.edu. Student Access Center staff will evaluate the student's documentation and determine eligibility of accommodation(s) through the Student Access Center registration procedure.
Togetherall is a 24/7 communication and emotional support platform monitored by trained clinicians. It’s a safe place online to get things off your chest, have conversations, express yourself creatively, and learn how to manage your mental health. If sharing isn’t your thing, Togetherall has other tools and courses to help you look after yourself with plenty of resources to explore. Whether you’re struggling to cope, feeling low, or just need a place to talk, Togetherall can help you explore your feelings in a safe supportive environment. You can join Togetherall using your UNE email address.
Students should notify their Student Support Specialist and instructor in the event of a problem relating to a course. This notification should occur promptly and proactively to support timely resolution.
ITS Contact: Toll-Free Help Desk 24 hours/7 days per week at 1-877-518-4673.
The College of Professional Studies supports its online students and alumni in their career journey!
The Career Ready Program provides tools and resources to help students explore and home in on their career goals, search for jobs, create and improve professional documents, build a professional network, learn interview skills, grow as a professional, and more. Come back often, at any time, as you move through your journey from career readiness as a student to career growth, satisfaction, and success as alumni.
Please review the technical requirements for UNE Online Graduate Programs: Technical Requirements
The College of Professional Studies uses Turnitin to help deter plagiarism and to foster the proper attribution of sources. Turnitin provides comparative reports for submitted assignments that reflect similarities in other written works. This can include, but is not limited to, previously submitted assignments, internet articles, research journals, and academic databases.
Make sure to cite your sources appropriately as well as use your own words in synthesizing information from published literature. Webinars and workshops, included early in your coursework, will help guide best practices in APA citation and academic writing.
You can learn more about Turnitin in the guide on how to navigate your Similarity Report.
Course surveys are one of the most important tools that the University of New England uses for evaluating the quality of your education, and for providing meaningful feedback to instructors on their teaching. In order to ensure that the feedback is both comprehensive and precise, we need to receive it from each student for each course. Evaluation access is distributed via UNE email at the beginning of the last week of the course.
Assignments: Late assignments will be accepted up to 3 days late; however, there is a 10% grade reduction (from the total points) for the late submission. After three days the assignment will not be accepted.
Discussion posts: If the initial post is submitted late, but still within the discussion board week, there will be a 10% grade reduction from the total discussion grade (e.g., a 3 point discussion will be reduced by 0.3 points). Any posts submitted after the end of the Discussion Board week will not be graded.
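The late-work arithmetic above can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and treating a submission that is not accepted or not graded as a score of zero is an assumption rather than something the policy states.

```python
def late_assignment_score(earned, total_points, days_late):
    """Apply the late policy for assignments to an earned score."""
    if days_late <= 0:
        return float(earned)
    if days_late > 3:
        # Assumption: an assignment that is not accepted earns no credit.
        return 0.0
    # The 10% reduction is taken from the assignment's total points.
    return round(max(earned - 0.10 * total_points, 0.0), 2)


def late_discussion_score(earned, total_points, within_week):
    """Apply the late policy for a late initial discussion post."""
    if not within_week:
        # Assumption: a post that is not graded earns no credit.
        return 0.0
    return round(max(earned - 0.10 * total_points, 0.0), 2)


# A 10-point assignment scored 9 and submitted 2 days late.
print(late_assignment_score(9, 10, 2))
# A 3-point discussion scored 3 with a late initial post inside the week.
print(late_discussion_score(3, 3, True))
```

Note that the reduction is based on the total points available, not on the score earned, so a 10-point assignment always loses 1 point when late.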
Please make every effort ahead of time to contact your instructor and your student support specialist if you are not able to meet an assignment deadline. Arrangements for extenuating circumstances may be considered by faculty.
8 week: Students taking online graduate courses through the College of Professional Studies will be administratively dropped for non-participation if a graded assignment/discussion post is not submitted before Sunday at 11:59 pm ET of the first week of the term. Reinstatement is at the purview of the Dean's Office.
16 week: Students taking online graduate courses through the College of Professional Studies will be administratively dropped for non-participation if a graded assignment/discussion post is not submitted before Friday at 11:59 pm ET of the second week of the term. Reinstatement is at the purview of the Dean's Office.
The policies contained within this document apply to all students in the College of Professional Studies. It is each student's responsibility to know the contents of this handbook.
Please contact your student support specialist if you are considering dropping or withdrawing from a course. The last day to drop for 100% tuition refund is the 2nd day of the course. Financial Aid charges may still apply. Students using Financial Aid should contact the Financial Aid Office prior to withdrawing from a course.
The University of New England values academic integrity in all aspects of the educational experience. Academic dishonesty in any form undermines this standard and devalues the original contributions of others. It is the responsibility of all members of the University community to actively uphold the integrity of the academy; failure to act, for any reason, is not acceptable. For information about plagiarism and academic misconduct, please visit UNE Plagiarism Policies.
Academic dishonesty includes, but is not limited to the following:
Charges of academic dishonesty will be reviewed by the Program Director. Penalties for students found responsible for violations may depend upon the seriousness and circumstances of the violation, the degree of premeditation involved, and/or the student’s previous record of violations. Appeal of a decision may be made to the Dean whose decision will be final. Student appeals will take place through the grievance process outlined in the student handbook.