Updated at: 28 August 2024 (Javier Garcia-Bernardo and Laura Boeschoten)
Course manual
Introduction to digital trace data provides students with foundational knowledge in digital behavioral data, focusing on collecting, analyzing, and interpreting such data. The course emphasizes various data types and methodologies, and the role that data and algorithmic biases play in reinforcing inequalities.
Over eight weeks, students will delve into the critical analysis of digital behavioral data. Weekly lectures cover theoretical and methodological background, while practical sessions involve hands-on data collection and analysis. Topics include social media data, web scraping, APIs, data donation, survey data, data quality frameworks, and ethical considerations in data collection and analysis.
Prerequisites: Participants should be familiar with data analysis software (preferably R, or another programming language or statistical analysis package, e.g., Python, JASP, Stata, SPSS, SAS).
The course (7.5EC) runs for eight weeks and is structured as follows:
- 1 lecture per week. Before the lecture, students are expected to read the assigned readings.
- 1 practical per week. The practicals are hands-on sessions where students will work on collecting, analyzing, and interpreting digital trace data. Make sure to bring a laptop to the practicals. After each practical, students are required to hand in their answers. Questions about a practical can be discussed in the next practical.
- 1 group project (60% of the grade) with two assignments (30% each). For more information, read the project guidelines.
- 1 final exam (40% of the grade) on Remindo, composed of multiple choice and open questions.
Attendance is mandatory. If you are unable to attend a lecture or practical session, please inform the course coordinator in advance.
If you miss a session (e.g., due to sickness), you should catch up in the regular way: do the assigned readings, go through the lecture slides, do the practicals, ask your peers if you have questions, and (after the above) ask the lab teacher for further explanation.
To pass the course you need to:
- Participate in the group project. We will create the groups for the group project during the first practical. If you miss that practical, you will not be able to participate in the group project and will fail the course.
- Attend all four feedback sessions of the group project (see the project guidelines).
- Obtain a grade above 5.5 in all components (the two project assignments and the final exam).
- Obtain a final grade of 5.5 or higher. The final grade is based on the group project and the final exam: the group project consists of two assignments, each worth 30% of the final grade, and the final exam is worth 40%.
Resit: If the final grade is between 4.0 and 5.4, students may resit the exam. The resit grade will replace the grade from the exam. To be eligible for the resit, you need to attend over 80% of the practicals (i.e., at least 6) and hand in your answers to each practical (as a PDF file) through this link before 17:00 on the Monday after that practical. Make sure to name your file “labX_lastname.pdf”.
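As an illustration of how the weights above combine, the following is a minimal Python sketch, not an official grade calculator. The names `final_grade`, `passes_on_grades`, and `in_resit_band` are hypothetical, as are the example grades. The sketch assumes that "above 5.5" is a strict inequality and that "between 4.0 and 5.4" means a final grade of at least 4.0 and below 5.5. Attendance, feedback sessions, and practical hand-ins are not modelled.

```python
# Weights are taken from the course manual; all names here are illustrative.
WEIGHTS = {"assignment_1": 0.30, "assignment_2": 0.30, "exam": 0.40}
PASS_MARK = 5.5
RESIT_MIN = 4.0


def final_grade(grades):
    """Return the weighted final grade from the three component grades."""
    return sum(weight * grades[name] for name, weight in WEIGHTS.items())


def passes_on_grades(grades):
    """Check only the two grade-based pass conditions.

    Assumption: 'above 5.5 in all components' is read as a strict inequality.
    """
    components_ok = all(grades[name] > PASS_MARK for name in WEIGHTS)
    return components_ok and final_grade(grades) >= PASS_MARK


def in_resit_band(grades):
    """Assumption: 'between 4.0 and 5.4' means 4.0 <= final grade < 5.5."""
    return RESIT_MIN <= final_grade(grades) < PASS_MARK


# Hypothetical student: both assignments graded 6.0, exam graded 4.0.
example = {"assignment_1": 6.0, "assignment_2": 6.0, "exam": 4.0}
print(round(final_grade(example), 1))  # 0.3*6.0 + 0.3*6.0 + 0.4*4.0, about 5.2
print(passes_on_grades(example))       # False: exam and final grade are too low
print(in_resit_band(example))          # True: the final grade is in the resit band
```

In this example the student does not pass on grades alone, and the final grade falls in the range where a resit of the exam is possible, provided the attendance and hand-in requirements are also met.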
Fraud, plagiarism and use of generative AI
Plagiarism and fraud are serious academic offenses. Plagiarism is defined as the use of another person’s work without proper acknowledgment. This includes copying and pasting text from the internet, from books, or from other students. If you use text from another source, you must put it in quotation marks and provide a citation. If you do not, you are committing plagiarism. Fraud is defined as the use of dishonest methods to gain an unfair advantage. This includes copying another student’s work, submitting work that is not your own, or submitting the same work for two different courses. If you commit fraud or plagiarism, you will fail the course. If you are not sure what constitutes plagiarism or fraud, please see the [UU Fraud and plagiarism policy](https://students.uu.nl/en/practical-information/policies-and-procedures/fraud-and-plagiarism).
The use of generative AI (e.g., ChatGPT) in the group assignment is allowed only in the following cases:
- Creating code to analyze data, or explaining code
- Labeling data
- Copy-editing text (i.e., making the text more readable, but not changing the content)
Any use of generative AI must be clearly indicated in the assignment, stating the specific model used and the purpose it was used for. The use of generative AI for other purposes is not allowed.
Who to ask what
- General questions about the course: Email course coordinator (Javier)
- Questions about the lectures: Email lecturer (Laura or Javier)
- Questions about the practicals or group project (including grading): Email Thijs
Course objectives and learning outcomes:
The course aims to provide students with foundational knowledge in digital behavioral data, focusing on collecting, analyzing, and interpreting such data.
At the end of the course:
- Students develop fundamental knowledge and understanding of digital data collection and analysis
- Students apply their knowledge in a multi-disciplinary context to contemporary problems
- Students are able to judge how data and algorithmic biases can affect study results
- Students are able to determine the most effective research method(s) to address a research problem
- Students are capable of autonomous and responsible scholarly self-development
- Students are able to identify and critically evaluate ethical dilemmas
- Students are able to present research findings and insights to specialist and non-specialist audiences clearly and unambiguously in English
Required readings (not complete)
During the course, we will use the following readings:
Books:
Articles:
- Ohme, J., Araujo, T., Boeschoten, L., Freelon, D., Ram, N., Reeves, B. B., & Robinson, T. N. (2024). Digital trace data collection for social media effects research: APIs, data donation, and (screen) tracking. Communication Methods and Measures, 18(2), 124-141. https://doi.org/10.1080/19312458.2023.2181319
- Boeschoten, L., Ausloos, J., Möller, J. E., Araujo, T., & Oberski, D. L. (2022). A framework for privacy preserving digital trace data collection through data donation. Computational Communication Research, 4(2), 388-423. https://doi.org/10.5117/CCR2022.2.002.BOES (pp. 388-394)
- Keymolen, E., & Taylor, L. (2023). Data ethics and data science: An uneasy marriage? In: Liebregts, W., van den Heuvel, W.-J., & van den Born, A. (eds), Data Science for Entrepreneurship. Classroom Companion: Business. Springer, Cham. https://doi.org/10.1007/978-3-031-19554-9_20
- Eckman, S., Plank, B., & Kreuter, F. (2024). Position: Insights from survey methodology can improve training data. arXiv:2403.01208v2
- Davidson, B. I., Wischerath, D., Racek, D., Parry, D. A., Godwin, E., Hinds, J., van der Linden, D., Roscoe, J. F., Ayravainen, L., & Cork, A. G. (2023). Platform-controlled social media APIs threaten open science. Nature Human Behaviour, 7(12), 2054-2057.
- Freelon, D. (2018). Computational research in the post-API age. Political Communication, 35(4), 665-668.
- A review of O'Neil, C. (2017). Weapons of math destruction: How big data increases inequality and threatens democracy. Crown.
- Business Insider (2024). Prosecutors used an AI tool to send a man to prison for life. Now the person who created it is under investigation.