In-Class Exercise 3

In this hands-on exercise, we learnt how to create and plot a choropleth map, and how to customise it with different code options.

Sarah Chin linkedin.com/in/sarahchin99/
08-30-2021

Boxplot

Before learning about the choropleth map and how to code it, we started off with the boxplot.

ggplot(data=mpszpop2020, 
       aes(x = "", 
           y = AGED)) +
  geom_boxplot()

Percentile Map

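In this exercise, we also learnt what a percentile map is: a choropleth map whose classes are set at fixed percentiles (here 0%, 1%, 10%, 50%, 90%, 99% and 100%), so that the extreme values at both ends stand out.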

Pre-Processing steps

Before using the data to create maps or boxplots for analysis, we need to clean the data by following these steps.

Step 1: Remove NA data using the following code

mpszpop2020a <- mpszpop2020 %>%
  drop_na()
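
As a rough illustration of what this step does, the sketch below uses a small made-up data frame (not the real subzone data) and base R's complete.cases(). It is assumed here as a stand-in for drop_na(): called without arguments, both keep only the rows that have no missing value in any column.

```r
# Toy data frame with hypothetical values, not the real subzone data
toy <- data.frame(
  SUBZONE = c("A", "B", "C", "D"),
  AGED = c(120, NA, 300, 80),
  DEPENDENCY = c(0.7, 0.5, NA, 0.9),
  stringsAsFactors = FALSE
)

# Keep only the rows with no NA in any column
toy_clean <- toy[complete.cases(toy), ]

print(nrow(toy))          # rows before cleaning
print(toy_clean$SUBZONE)  # subzones that survive the cleaning
```

Rows B and C each contain one NA, so only A and D remain. This is also why some subzones disappear from the cleaned layer.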

Step 2: Create a function to pull out specific data from the dataset

Creating a function avoids excessive copying and pasting in R Markdown.

The following code sets the percentile cut-offs and computes the corresponding quantiles of DEPENDENCY.

percent <- c(0,.01,.1,.5,.9,.99,1)
var <- mpszpop2020a["DEPENDENCY"] %>%
  st_set_geometry(NULL)
quantile(var[,1], percent)
Result:
##         0%         1%        10%        50%        90%        99%       100% 
##  0.0000000  0.1377778  0.5686120  0.7024793  0.8474114  1.2100000 19.0000000
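
To see how these break points turn into the six percentile classes, here is a minimal sketch on made-up data (the integers 1 to 1000) using base R's cut(). tmap does its own binning when it is given breaks and may treat values that fall exactly on a boundary differently, so this is only an illustration of the idea.

```r
# Made-up values standing in for a variable such as DEPENDENCY
x <- 1:1000

percent <- c(0, .01, .1, .5, .9, .99, 1)
breaks <- quantile(x, percent)

# Assign every value to one of the six percentile classes
classes <- cut(x,
               breaks = breaks,
               labels = c("< 1%", "1% - 10%", "10% - 50%",
                          "50% - 90%", "90% - 99%", "> 99%"),
               include.lowest = TRUE)

# Count how many values land in each class
counts <- table(classes)
print(counts)
```

The two outermost classes each hold only about 1% of the values, which is why a percentile map is useful for spotting extremes.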

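Wrapping this extraction in a function has clear advantages: the code is written once, can be reused for any variable, and only needs to be fixed in one place if it changes.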

The following code allows you to create the get.var function, which extracts a variable as a vector out of an sf data frame:

get.var <- function(vname,df) {
  v <- df[vname] %>% 
    st_set_geometry(NULL)
  v <- unname(v[,1])
  return(v)
}

This step is vital while creating percentile maps.
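
The reason get.var takes the first column with v[,1] can be seen on an ordinary data frame. The sketch below uses a made-up data frame rather than an sf object, so no st_set_geometry() call is needed. The point is only that single-bracket selection returns a data frame, while quantile() expects a plain vector.

```r
# Toy data frame with hypothetical values
toy <- data.frame(DEPENDENCY = c(0.5, 0.7, 0.9, 1.2),
                  AGED = c(10, 20, 30, 40))

# Single-bracket selection keeps the data frame structure
sub <- toy["DEPENDENCY"]
print(class(sub))

# Taking the first column gives a plain numeric vector
v <- unname(sub[, 1])
print(class(v))

# The vector can now be passed straight to quantile()
print(quantile(v, c(0, .5, 1)))
```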

Step 3: Plotting the percentile map using tmap functions

In this step, we can customise how our percentile map looks. Here is an example of what a percentile map can look like:

percent <- c(0,.01,.1,.5,.9,.99,1)
var <- get.var("DEPENDENCY", mpszpop2020a)
bperc <- quantile(var,percent)
tm_shape(mpszpop2020) +
  tm_polygons() +
tm_shape(mpszpop2020a) +
  tm_fill("DEPENDENCY",
          title="DEPENDENCY",
          breaks=bperc,
          palette="Blues",
          labels=c("< 1%", "1% - 10%",
                   "10% - 50%", 
                   "50% - 90%",
                   "90% - 99%", 
                   "> 99%"))  +
  tm_borders() +
  tm_layout(title = "Percentile Map", 
            title.position = c("right",
                               "bottom"))
Result:

We can customise our percentile maps in the following ways:

  1. Colour

Refer to this link for a range of colour palettes that are available in R: https://www.nceas.ucsb.edu/sites/default/files/2020-04/colorPaletteCheatsheet.pdf

Let’s say you want to change the colour to red! You can use this code:

percent <- c(0,.01,.1,.5,.9,.99,1)
var <- get.var("DEPENDENCY", mpszpop2020a)
bperc <- quantile(var,percent)
tm_shape(mpszpop2020) +
  tm_polygons() +
tm_shape(mpszpop2020a) +
  tm_fill("DEPENDENCY",
          title="DEPENDENCY",
          breaks=bperc,
          palette="Reds", # change the code in this line! 
          labels=c("< 1%", "1% - 10%",
                   "10% - 50%", 
                   "50% - 90%",
                   "90% - 99%", 
                   "> 99%"))  +
  tm_borders() +
  tm_layout(title = "Percentile Map", 
            title.position = c("right",
                               "bottom"))

Result:

  2. Borders

You can also remove the borders in the percentile map by removing the following line:

tm_borders() +

Note that this only affects the top layer. If you want to remove the borders from the base layer underneath as well, you will have to customise that layer too.

percent <- c(0,.01,.1,.5,.9,.99,1)
var <- get.var("DEPENDENCY", mpszpop2020a)
bperc <- quantile(var,percent)
tm_shape(mpszpop2020) + # to remove the borders of this base layer too,
  tm_polygons() + # replace tm_polygons() with tm_fill(), which draws no borders
tm_shape(mpszpop2020a) +
  tm_fill("DEPENDENCY",
          title="DEPENDENCY",
          breaks=bperc,
          palette="Reds",
          labels=c("< 1%", "1% - 10%",
                   "10% - 50%", 
                   "50% - 90%",
                   "90% - 99%", 
                   "> 99%"))  +
  # tm_borders() +  # removed: this is the line that draws the borders
  tm_layout(title = "Percentile Map", 
            title.position = c("right",
                               "bottom"))

  3. Legend

You can also customise the title and legend settings in the tm_layout() call marked in the code below:

percent <- c(0,.01,.1,.5,.9,.99,1)
var <- get.var("DEPENDENCY", mpszpop2020a)
bperc <- quantile(var,percent)
tm_shape(mpszpop2020) +
  tm_polygons() + 
tm_shape(mpszpop2020a) +
  tm_fill("DEPENDENCY",
          title="DEPENDENCY",
          breaks=bperc,
          palette="Reds", 
          labels=c("< 1%", "1% - 10%",
                   "10% - 50%", 
                   "50% - 90%",
                   "90% - 99%", 
                   "> 99%"))  +
  tm_borders() +
  tm_layout(title = "Percentile Map", # change these settings as you see fit!
            title.position = c("right",
                               "bottom"))

Box Map

To create a box map, we need a custom breaks specification. The break points for the box map vary depending on whether lower or upper outliers are present.

Firstly, we need to create the boxbreaks function.

boxbreaks <- function(v,mult=1.5) {
  qv <- unname(quantile(v))
  iqr <- qv[4] - qv[2]
  upfence <- qv[4] + mult * iqr
  lofence <- qv[2] - mult * iqr
  # initialize break points vector
  bb <- vector(mode="numeric",length=7)
  # logic for lower and upper fences
  if (lofence < qv[1]) {  # no lower outliers
    bb[1] <- lofence
    bb[2] <- floor(qv[1])
  } else {
    bb[2] <- lofence
    bb[1] <- qv[1]
  }
  if (upfence > qv[5]) { # no upper outliers
    bb[7] <- upfence
    bb[6] <- ceiling(qv[5])
  } else {
    bb[6] <- upfence
    bb[7] <- qv[5]
  }
  bb[3:5] <- qv[2:4]
  return(bb)
}

Secondly, we remove any rows where AGED is NA, then apply the newly created function to AGED.

mpszpop2020a <- mpszpop2020 %>%
  filter(AGED>=0)
var <- get.var("AGED", mpszpop2020a)
boxbreaks(var)

Here is what the result should look like:

[1] -4330 0 515 2080 3745 8590 20240
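
The output above can be checked by hand. The quartiles are 515 and 3745, so the interquartile range is 3230 and the fences sit at 515 - 1.5 × 3230 = -4330 and 3745 + 1.5 × 3230 = 8590. The lower fence falls below the minimum, so there are no lower outliers, while the maximum (20240) lies beyond the upper fence, so there are upper outliers. The sketch below repeats the same logic on a small made-up vector. It re-declares boxbreaks as defined in this exercise so that it runs on its own.

```r
# Same boxbreaks logic as in this exercise, repeated so the example runs standalone
boxbreaks <- function(v, mult = 1.5) {
  qv <- unname(quantile(v))
  iqr <- qv[4] - qv[2]
  upfence <- qv[4] + mult * iqr
  lofence <- qv[2] - mult * iqr
  bb <- vector(mode = "numeric", length = 7)
  if (lofence < qv[1]) {   # no lower outliers
    bb[1] <- lofence
    bb[2] <- floor(qv[1])
  } else {
    bb[2] <- lofence
    bb[1] <- qv[1]
  }
  if (upfence > qv[5]) {   # no upper outliers
    bb[7] <- upfence
    bb[6] <- ceiling(qv[5])
  } else {
    bb[6] <- upfence
    bb[7] <- qv[5]
  }
  bb[3:5] <- qv[2:4]
  bb
}

# Made-up values with one obvious upper outlier (100)
v <- c(1, 2, 3, 4, 5, 6, 7, 8, 100)
bb <- boxbreaks(v)
print(bb)

# Cross-check with base R: boxplot.stats() flags points beyond 1.5 * IQR
print(boxplot.stats(v)$out)
```

Here the quartiles are 3 and 7, giving fences at -3 and 13, so only the value 100 is flagged. Note that boxplot.stats() works with hinges rather than quantile() quartiles, so on other data the two approaches can differ slightly.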

Lastly, we create the boxmap function.

boxmap <- function(vnam, df, 
                   legtitle=NA,
                   mtitle="Box Map",
                   mult=1.5){
  var <- get.var(vnam,df)
  bb <- boxbreaks(var, mult) # pass the multiplier through to boxbreaks
  tm_shape(df) +
    tm_fill(vnam,title=legtitle,
            breaks=bb,
            palette="Blues",
            labels = c("lower outlier", 
                       "< 25%", 
                       "25% - 50%", 
                       "50% - 75%",
                       "> 75%", 
                       "upper outlier"))  +
    tm_borders() +
    tm_layout(title = mtitle, 
              title.position = c("right",
                                 "bottom"))
}

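The function above takes the following arguments: vnam (the variable name, as a quoted string), df (the sf polygon layer), legtitle (the legend title), mtitle (the map title) and mult (the multiplier for the interquartile range).

It returns a tmap element, which plots the map.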
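
The mult value is the fence multiplier that boxbreaks applies to the interquartile range (1.5 by default). As a rough sketch of what changing it does, the helper upper_fence() below is a hypothetical name introduced only for this illustration, and the values are made up.

```r
# Hypothetical helper: upper fence = Q3 + mult * IQR
upper_fence <- function(v, mult = 1.5) {
  qv <- unname(quantile(v))
  qv[4] + mult * (qv[4] - qv[2])
}

# Made-up values with one moderately large value (15)
v <- c(1, 2, 3, 4, 5, 6, 7, 8, 15)

# Number of values beyond the upper fence for two multipliers
n_out_default <- sum(v > upper_fence(v))
n_out_wide <- sum(v > upper_fence(v, mult = 3))
print(c(n_out_default, n_out_wide))
```

With the default multiplier the value 15 lies beyond the fence (13) and counts as an upper outlier. With mult = 3 the fence moves out to 19 and nothing is flagged.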

Use this line to call our newly created function:

boxmap("ECONOMY ACTIVE", mpszpop2020a)

Results:

However, we notice that there were some subzones that have been removed from the original map of Singapore. To put them back, we add the original map as a base layer inside the boxmap function.

boxmap <- function(vnam, df, 
                   legtitle=NA,
                   mtitle="Box Map",
                   mult=1.5){
  var <- get.var(vnam,df)
  bb <- boxbreaks(var, mult) # pass the multiplier through to boxbreaks
  tm_shape(mpszpop2020) + # add this portion
    tm_polygons() + # add this portion
  tm_shape(df) +
    tm_fill(vnam,title=legtitle,
            breaks=bb,
            palette="Blues",
            labels = c("lower outlier", 
                       "< 25%", 
                       "25% - 50%", 
                       "50% - 75%",
                       "> 75%", 
                       "upper outlier"))  +
    tm_borders() +
    tm_layout(title = mtitle, 
              title.position = c("right",
                                 "bottom"))
}

Results: