Programme Curriculum

Attend an 18-hour compulsory summer boot camp

Before the commencement of the programme, all candidates must attend a mandatory 18-hour social data analytics boot camp. The boot camp training programme is designed to provide candidates with beginner-level skills in social data analytics so that they can follow the more advanced curriculum for the degree of Master of Social Sciences in the field of Social Data Analytics. The boot camp will be taught over three to four days, organised around the following topics:

  1. Welcome to the Boot Camp in Social Data Analytics: pedagogical approach and how to get started; an introduction to the open-source software R and RStudio, and how to customize your interface
  2. Introductory/refresher course in algebra, probability and statistics
  3. Using objects (e.g. vectors and data frames) and different classes of objects (e.g. numeric vs. string) in RStudio
  4. How to clean, reshape, and restructure data with the tidyverse
  5. Basic Programming: write functions, loops, and learn how to debug your code
  6. Communicating and Collaborating with R Markdown, Rpres, Shiny and GitHub


Students are required to complete 5 compulsory courses, 3 elective courses and 1 capstone project throughout their studies.

Before the commencement of the programme, all students must attend a mandatory 18-hour social data analytics summer boot camp. The boot camp training programme is designed to provide students with beginner-level skills in social data analytics so that they can follow the more advanced curriculum for the MSocSc(SDA) programme.

Compulsory Courses
MSDA7001 Introduction to social data analytics
MSDA7002 Statistical foundations
MSDA7003 Machine learning
MSDA7004 Research design and inference in the social sciences
MSDA7005 Programming for social scientists

Elective Courses
At least two elective courses from the following list*
MSDA7101 Big data solutions to social problems
MSDA7102 Simulating human behaviours with agent-based models
MSDA7103 Text as data: Natural language processing and social research
MSDA7104 Social network analysis
MSDA7105 Media data analysis

One additional elective course can also be taken from other programmes*
GEOG7308 Machine learning for geospatial data
GEOG7310 Cloud computing for geospatial data analytics
GEOG7311 Web GIS
MSBH7005 Scientific inquiry and research methods in behavioral health
SOCI7006 Research methods in media, culture and creative cities
SOWK6185 Qualitative research methods

Capstone project (Compulsory)
MSDA8001 Capstone project

Please click here for the regulations, syllabus and course descriptions.

*Offering schedule and quota are subject to availability


Compulsory Courses

MSDA7001 Introduction to social data analytics

This course aims to help students make sense of “big data.” It guides students to ask and answer the following questions: What are big data? With big data, what questions can policymakers and researchers ask and answer? How can big data be collected and analyzed? This course introduces students to basic data science techniques and demonstrates how they can be applied to various formats of social data. Upon completion, students are expected to have mastered basic concepts of data science and acquired hands-on experience with social data analytics.

6 credits
Assessment: 100% coursework

MSDA7002 Statistical foundations

It is quite common for social scientists to use mathematical and statistical techniques to describe and analyse social phenomena. Mathematics and statistics provide the foundations for empirical propositions about relationships between social variables. Social scientists typically transform raw data from the real world into numerical generalisations using statistics. The role of mathematics in social science, though, is not restricted to the domain of statistical technique. Many social scientists also construct mathematical representations of social institutions to understand how they work. Building these formal models entails picking out the most important aspects of a situation and then trying to express them mathematically. Over the years, though, the mathematical demands of modern social science have scaled up considerably.

6 credits
Assessment: 100% coursework

MSDA7003 Machine learning

This course explores machine learning as the algorithmic approach to learning from data. The course also covers key aspects of data mining, which is understood as the application of machine learning tools to obtain insight from data. Algorithms are placed in the context of their theoretical foundations in order to understand their derivation and correct application. Topics include linear models for regression and classification, local methods (nearest neighbor), neural networks, tree learning, kernel machines, unsupervised learning, ensemble learning, computational and statistical learning theory, and Bayesian learning. To expand and extend the development of theory and algorithms presented in lectures, practical applications will be given in tutorials and programming tasks during the project.

6 credits
Assessment: 100% coursework

MSDA7004 Research design and inference in the social sciences

This course introduces key methodological concepts and practices in the social sciences. It is especially useful to students without any background in the social sciences, but will also enable students with a background in the social sciences to develop their methodological practice and skills. The key aim of the course is to get students to develop a social science research proposal and to plan the research project successfully. The course is based around three broad topics: (i) philosophy of social science; (ii) research methodology, ethics and practical research strategies; and (iii) research design, with an emphasis on comparative and longitudinal research, and causal inference.

6 credits
Assessment: 100% coursework

MSDA7005 Programming for social scientists

The course provides an introduction to the basic computational tools, skills, and methods used in Computational Social Science using Python. Python is one of the most popular programming languages for data science, used widely in both academia and industry. Students will learn to use common workflow and collaboration tools; design, write, and debug simple computer programs; and manage, summarize, and visualize data with common Python libraries. The course will employ interactive tutorials and hands-on exercises using real social data. Participants will work independently and in groups with guidance and support from the lecturers. The practical exercises are designed to demand more autonomy and initiative as the course progresses, culminating in an open-ended group project.

This is an introductory course and no prior experience with programming is required. A basic understanding of statistics and some scripting experience (e.g., from building web pages or using statistical analysis programs such as R or Stata) will be helpful but is not required.

6 credits
Assessment: 100% coursework

Elective Courses
At least two elective courses from the following list*

MSDA7101 Big data solutions to social problems

Do Google and Facebook understand us better than we do ourselves? Are we becoming lab rats every time we go online? Is the impartially designed algorithm for predicting the probability of recidivism truly fair for sentencing individuals? What are the ethical issues underpinning big data science? When big data analytics are routinely applied in our daily lives, the ability to audit the adopted algorithms becomes crucial. This course aims to build students’ big data literacy through three major areas of focus: (1) Defining what big data is; (2) Providing an overview of existing big data analytical techniques; and (3) Discussing opportunities and challenges of big data analytics in tackling social problems.

The course will focus on elaborating the core principles of a variety of techniques adopted when predicting future phenomena through the lens of big data. We will use a case study approach to provide an in-depth understanding of various big data analytics, with the goal of inspiring students to think creatively and critically about how big data analytics can be used to make scientific discoveries and do social good.

6 credits
Assessment: 100% coursework

MSDA7102 Simulating human behaviours with agent-based models

Despite their contributions to scientific development, traditional positivist, quantitative approaches (e.g., traditional variable-based statistical equations) have often been criticised for over-simplifying and decontextualising real-world phenomena. In contrast, systems science aims to understand complex relationships and adaptive interactions among various elements within varying environments and systems. Systems science has been instrumental in breaking new scientific ground in diverse fields, including but not limited to engineering, decision analysis, transportation, public health, and urban sciences.

This course will build a solid understanding of systems science by exploring the latest advances in agent-based modelling (ABM) and related analysis methods. ABM, a class of systems science methods, is an in-silico modelling approach that examines and predicts ‘what-if’ conditions by simulating social behaviours and interactions among individual entities embedded in social structures.

This course is designed to introduce students to the basic tools of theory building and data analysis in ABM, and to apply those tools to better understand social problems in human populations. Students will learn to build agent-based models with standard (free) software, paying attention to feedback processes, multilevel interactions, and the phenomenon of emergence. Students will enrich their understanding of the problems people face when they share and cooperate, and examine essential models that can support them in their future careers in the social sciences and beyond.

This course is designed for anyone interested in understanding human behaviours, especially when sharing and cooperation are involved. It will be particularly useful for professionals dealing with challenges related to public goods, common resources, and cooperation. If you are studying social sciences and are curious about how a computational approach works, this course will be particularly helpful.

6 credits
Assessment: 100% coursework

MSDA7103 Text as data: Natural language processing and social research

From historical archives to social media discourse, text data are among the most widely available formats of social data. Natural Language Processing (NLP) tools help social analysts use large volumes of text to understand social phenomena. This course gives an overview of NLP methods from a social science perspective. It discusses how to use NLP tools to discover interesting patterns, create reliable measurements, and make robust inferences. It also introduces state-of-the-art generative language models and discusses their promises, limitations, and threats for social data analytics.

6 credits
Assessment: 100% coursework

MSDA7104 Social network analysis

The basic premise of this course is that the social world is relational. We cannot ignore that we are influenced by people we know, have met and respect; ideas and allegiances are formed and maintained in social settings and organisations; not all people have equal opportunities when it comes to finding a job; we communicate over networks, be they online or offline; etc. In this course we aim to produce a detailed understanding of the web of social contacts that structure our daily life and society. We will consider the network both as an object that is interesting in its own right and as something that creates co-dependencies between social units in terms of outcomes and properties of these social units themselves.

The overarching goal of the course is to provide us with tools that bridge theories on the one hand, and what we can actually observe in observational and archival empirics on the other. Put another way, we aim to avail ourselves of approaches that permit us to test whether our theoretical ideas about social interaction are supported by what people, organisations and countries actually do. The course is structured around a collection of themes based on theoretical concepts such as cohesion, embeddedness, homophily, transitivity, the Matthew effect, structural holes, influence, and selection. We will examine these both from the perspective of how they structure the network and how these network effects structure behaviour, opinions and beliefs.

To gain some practical understanding of the approaches presented, we will also explore analytic methods using block models, stochastic actor-oriented models, exponential random graph models, network autocorrelation and network effects models. Students are not expected to become expert users of any of these methods, but to appreciate the common goal across these models, namely to model and take into account the interdependencies. Data will mostly be handled in R, but an orientation to other analysis packages will be given.

6 credits
Assessment: 100% coursework

MSDA7105 Media data analysis

This course is designed to familiarise students with a set of essential techniques for media data and social media analytics. It covers a variety of tools that help the learner carry out a range of applications independently, including web scraping, API programming, natural language processing, sentiment analysis, network analysis, digital mapping, web app development, and data visualization. The course is designed and taught in a problem-based or project-driven mode, which aims to facilitate real-life applications in a variety of social data analysis scenarios.

6 credits
Assessment: 100% coursework

Capstone project (Compulsory)

MSDA8001 Capstone project

This course aims to teach students how to integrate and apply the knowledge and skills they acquired through the programme. Students will conduct a research project in close collaboration with supervisors from the programme. Students will articulate their research objectives, conduct a relevant literature review and develop an indicative methodology. The course provides students with the opportunity to undertake a major piece of supported independent research. It is an opportunity to apply skills and techniques learned during the taught component of this programme to a substantive original research or industry-focused problem of interest to the student. Projects will be supervised by academic staff affiliated with the Social Data Analytics programme.

Individual projects and research questions are chosen and formulated by students, and supported during the research process by one-to-one or small group meetings with a nominated member of academic staff, and by student-led group meetings to seek peer support. The project may address a methodological or practical issue using desk-based research and secondary data sources, or may involve primary data collection. It may also be carried out in conjunction with an external organisation (such as local government, a charitable organisation or a commercial organisation) in order to address a relevant research or practical issue of interest to them, making use of their data or other input. Regardless of the nature of the project itself, all projects must have a clearly defined aim and set of specific objectives that are novel or original and which relate to this programme of study. All projects should be written up as an academic piece of work, using the guidance provided during the module.

12 credits
Assessment: 100% coursework

*Offering schedule and quota are subject to availability