Spatial Data Science Pathway

The Department of Geography’s Spatial Data Science ‘pathway’ offers undergraduates outside of Computer Science a unique opportunity to complete a suite of modules focused on the fundamentals of data science in Python. We believe that our spatial data science modules not only offer you a valuable set of tools for undertaking research at undergraduate and graduate levels, but also give you a competitive advantage in today’s job market by helping you to stand out from the crowd.

We’ve designed our modules to provide the critical skills that we think you need to get started down the path to being either a practising (spatial) data scientist or a valuable bridge between ‘the quants’ and the rest of an organisation, whether it’s an NGO, or a private or public sector body. To deliver this we cover a lot of ground, and we expect you to work hard to understand how to take what you are learning in programming and statistics and apply it to interesting problems. You should think of this as an intensive language class (learning to code is like learning a new language) in which regular practice in a ‘lab’ is essential.

This is an investment that pays off; here’s what one recent graduate had to say about their experience with the ‘Geocomputation’ pathway that we’ve now adapted into ‘Spatial Data Science’:

The Geocomputation skill-set is without a doubt the most important learning I have taken away from King’s. The course was taught at a perfect pace. It was super challenging but there was plenty of support along the way. The Jupyter notebooks were fantastic and I regularly find myself referring back to them in my own work to find solutions.

I think what has surprised me the most is how much employers and recruiters in general value the skills. I’ve been pretty much employed in ‘data science’ since I handed in my dissertation. As things currently stand I am receiving offers left, right and centre, all of which give me way more impact and pay much more than the typical city-style graduate roles I’d previously been considering. I am not sure we sell the Geocomputation employment benefits enough. I also feel way more ‘future-proofed’ than my peers.

Geography Graduate (2019).

The pathway is composed of a suite of taught and online/self-directed modules:

  1. Code Camp (an online ‘boot camp’)
  2. Foundations of Spatial Data Science
  3. Principles of Spatial Data Science
  4. Applications of Spatial Data Science
  5. Directed Readings (currently for Geography students only)

The modules are designed to build, one on top of the other, so that it is possible to complete the pathway despite having little or no prior experience of computer programming.

Modules

Code Camp

Code Camp is essential for students who have enrolled on Foundations with no prior experience of programming or the Python programming language: it provides you with a series of introductory online ‘Jupyter notebooks’ in which you will learn the basics of coding, including variables, assignment, operators, loops, lists, dictionaries, and functions.
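
To give a flavour of these basics, here is a minimal sketch in plain Python. It is not taken from the Code Camp notebooks themselves: the rainfall readings are made-up illustrative values, and the names `rainfall_mm`, `mean`, and `summary` are our own choices for this example.

```python
# Variables and assignment: a list holds several values in order.
# These readings are made-up illustrative values (millimetres per day).
rainfall_mm = [0.0, 2.5, 10.0, 0.0, 7.5]


def mean(values):
    """Return the arithmetic mean of a list of numbers."""
    total = 0.0
    for value in values:       # a loop visits each item in turn
        total = total + value  # operators (+) and assignment (=)
    return total / len(values)


# Count the days with any rain using a loop and a comparison operator.
wet_days = 0
for reading in rainfall_mm:
    if reading > 0:
        wet_days = wet_days + 1

# A dictionary maps descriptive keys to values.
summary = {
    "days": len(rainfall_mm),
    "wet_days": wet_days,
    "mean_mm": mean(rainfall_mm),
}

print(summary)
```

Each idea here (a list, a loop, a function, a dictionary) is small on its own; the point of regular practice is learning to combine them.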

Code Camp content should be completed by students over the summer between their 1st and 2nd years of study. The pace is up to you, but we strongly recommend that you give yourself time to cover this content fully, and even to go back and revise shortly before the start of term. Covering the material in this way allows you to ‘hit the ground running’ when Foundations starts in Term 1.

More information about Code Camp can be found here.

Processing LIDAR Data from Drones to Detect Vegetation Loss

Foundations of Spatial Data Science

Foundations is designed, unsurprisingly, to provide a foundation in the concepts and methods of ‘data science’, with a particular focus on real-world data that includes spatial attributes. The module will be of particular interest to those who wish to develop their programming and analytic skills in an application-focussed context, whether for reasons of employability, curiosity, or broader skills development, but it will also appeal to students who wish to understand ‘data science’ as an approach to investigating the world around us.

We seek to develop an understanding of, and basic skills in, the use of computational techniques for accessing, exploring, visualising and performing reproducible analysis on real-world ‘big data’. You will work with data, applying the core Python ‘data science stack’ (pandas/geopandas, scipy, and seaborn) to produce maps and graphs, summary tables, and other useful outputs related to EDA/ESDA (Exploratory (Spatial) Data Analysis).

Consequently, the aims of the module include:

  1. To provide students with an understanding of the foundational concepts and methods used by data scientists through interaction with large, real-world data sets.
  2. To provide students with experience of the decision-making process involved in selecting and employing data science methods and tools.
  3. To provide students with experience working with large, secondary data sets containing a variety of data types, and with uncertainties due to measurement and collection issues.
  4. To empower students with the ability to manipulate and analyse data in a reproducible fashion using code.

Learning Outcomes

By the end of the module, students should be able to:

  1. Understand techniques for working with different types of data and be able to employ them appropriately.
  2. Understand the limitations and assumptions inherent in the choice of different data presentation and analysis methods.
  3. Understand the importance of, and be able to use, reproducible data manipulation and analysis techniques.
  4. Apply all of the above in a practical context.

Prerequisites

None, but students are strongly encouraged to complete the optional Code Camp Jupyter notebooks before the start of term since a basic familiarity with variables, operators, assignment, lists/arrays, dictionaries/hashes, and functions/subroutines will be assumed.

Changes in Singapore Commuting Behaviour via Big Data

Principles of Spatial Data Science

This module focusses on the development of different types of models for analytic and predictive purposes; as such it focusses on the core concepts and techniques that underlie ‘data science’ and its spatial extensions. We seek to develop a practical understanding of how to employ different types of models appropriately in an analytical context that does not conform to ‘tidy’ expectations around distributions and independence of observations. Attention will therefore be given to the understanding of errors, false positives/negatives, and overfitting.

You will work with data to develop models of real-world behaviours of interest using the Python ‘data science stack’ for analysis (pandas/geopandas, scipy, scikit-learn, PySAL and seaborn) to produce and test models, maps, graphs, summary tables, and other useful outputs such as measures of accuracy (e.g. precision and recall).
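
As an illustration of what measures such as precision and recall capture, here is a minimal sketch in plain Python. It is not drawn from the module materials: the labels are made-up illustrative values, and `precision_recall` is a hypothetical helper written for this example rather than a function from any of the libraries above. Precision is the share of predicted positives that were correct; recall is the share of actual positives that the model found.

```python
def precision_recall(actual, predicted):
    """Return (precision, recall) for two equal-length lists of 0/1 labels."""
    true_pos = false_pos = false_neg = 0
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            true_pos += 1    # predicted positive, and it was
        elif p == 1 and a == 0:
            false_pos += 1   # predicted positive, but it was not
        elif p == 0 and a == 1:
            false_neg += 1   # missed a real positive

    # Guard against division by zero when there are no positives at all.
    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    return precision, recall


# Made-up illustrative labels: 1 = event occurred, 0 = it did not.
actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 1, 0, 0, 0]

precision, recall = precision_recall(actual, predicted)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

In this example the model makes three positive predictions, two of them correct, but finds only two of the four real positives, so it is more precise than it is complete.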

Consequently, the aims of the module include:

  1. To provide students with the means of investigating issues of scale, interaction and uncertainty in relationships between data.
  2. To enable students to develop, employ and interpret different types of models in order to understand these relationships.
  3. To enable students to identify and analyse patterns in data with a particular focus on autocorrelation in model-building.
  4. To provide students with practical skills and understanding to conduct such analyses with computational tools.
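
To make the idea of autocorrelation in the aims above more concrete, here is a minimal sketch of Moran's I, a widely used measure of spatial autocorrelation, written in plain Python. This is an assumption about the kind of statistic involved rather than a description of the module's content, and it deliberately avoids any library API: `morans_i` and `line_weights` are hypothetical helpers, and the values are made up for illustration.

```python
def line_weights(n):
    """Binary weights for n cells in a row: each cell neighbours the next."""
    weights = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        weights[i][i + 1] = 1
        weights[i + 1][i] = 1
    return weights


def morans_i(values, weights):
    """Moran's I: (n / W) * sum_ij w_ij * d_i * d_j / sum_i d_i ** 2,
    where d_i is the deviation of value i from the mean and W is the
    sum of all weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    sum_sq = sum(d * d for d in dev)
    return (n / w_total) * (cross / sum_sq)


w = line_weights(4)
smooth = morans_i([1, 2, 3, 4], w)       # neighbours are similar
alternating = morans_i([1, 0, 1, 0], w)  # neighbours are dissimilar
print(smooth, alternating)
```

A positive value indicates that neighbouring observations tend to be alike, and a negative value that they tend to differ. Either case means the observations are not independent, which is why autocorrelation matters when building models.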

Learning Outcomes

By the end of the module, students should be able to:

  1. Understand commonly-used methods to identify and quantify patterns in data, and be able to employ them appropriately.
  2. Build models to explore the importance of relationships in a variety of types of data.
  3. Understand the limitations and assumptions inherent in choosing between different data analysis methods.
  4. Use computational tools to manipulate and model non-spatial and spatial data.

Prerequisites

No formal prerequisites, but a basic working knowledge of Python is assumed. In particular, students who have not taken Foundations of Spatial Data Science are strongly encouraged to complete the Jupyter notebooks associated with that module prior to the start of term since a reasonable familiarity with pandas, seaborn, geopandas, and statsmodels will be assumed, together with basic knowledge of matplotlib, numpy and scipy.stats. The module convenor reserves the right to review, on a case-by-case basis, applications to join this module by students who have not completed Foundations of Spatial Data Science.

Mapping of Karonga as part of Missing Maps Map-a-Thon

Applications of Spatial Data Science

This module builds on the Foundations and Principles modules in Year 2, but is designed to have a stronger focus on the applications of data science than on coding methods. Although substantial consideration is still given to the teaching of spatial data science analysis and modelling techniques, the expectation is that students will use this module as a platform through which to extend their understanding of such approaches into more advanced topics.

The module revolves around in-class seminars (and workshops) for which students will be expected to read, prepare, and lead seminar-style discussions with their colleagues. These discussions will be followed by short lab sessions which will be devoted to allowing students to deepen their understanding of a selected number of topics through applications to their own data. External speakers will also form a key part of this module, providing insight into how the techniques covered in this and previous modules are employed in the ‘real world’.

Examples of applications that may be covered in this module include:

  • Geodemographics and targeted marketing
  • Crime hotspot and insurable event analysis
  • Spatial Interaction Models and retail location planning

Consequently, the aims of the module include:

  1. To provide students with a deeper understanding of computational concepts and techniques suitable for data analysis in an applied spatial data science context.
  2. To provide students with experience of the decision-making process involved in selecting and employing a range of analysis methods and tools for applied spatial data science problems.
  3. To enhance students’ ability to manipulate and analyse data using computational techniques for applied questions and problems.

Learning Outcomes

By the end of the module students should be able to:

  1. Understand how contemporary computational techniques and methods for identifying and analysing patterns and dynamics in spatial systems are applied in real-world contexts.
  2. Understand the limitations and assumptions of these different techniques and methods for application to real-world data analysis and problem solving.
  3. Apply computational tools for applied spatial data science and problem solving.

Directed Readings

For students in the Geography Department who wish to take their work further still, it is possible to set up a Directed Readings module with one of our staff. We would strongly suggest that students on Applications consider doing this as a small group on a topic of mutual interest rather than each student doing something independently.
