R Leaflet Tutorial

Your own web map with R Leaflet


Overview

We’ve just created a whole bunch of data and so far it is just a bunch of shapefiles (or geojson files – depending on what you exported) and .csv files. We haven’t even had the chance to look at what we’ve made! While there are a dozen tools we could use to visualize and interact with our data (e.g. Tilemill, Mapbox Studio, CartoDB, Tableau, etc), we are going to stick with R for now and let it “gently” introduce us to the fundamentals of a web map.

Web Map Fundamentals

A number of great people out there have created an overview of web map fundamentals, let’s take a look and learn how it all works:

Alan McConchie & Beth Schechter! – anatomy of a web map

Maptime ATX – web map fundamentals

Joey’s Hello Web Maps Intro

Enter: R Leaflet

So if we know that to make a web map generally is composed of:

  1. map tiles (most of the time)leaflet_tiles
  1. geodatageodata
  2. html/css/javascript (your webstack)htmlcss.png

And we also know that:

  1. we know little or nothing about html/css/javascript
  2. we haven’t made any map tiles

THEN HOW THE HELL ARE WE SUPPOSED TO MAKE OUR OWN WEB MAP???

Well, turns out that others have also been in your same predicament and have developed a library to bring web mapping to R programmers.

Wait a second… so we can make a web map, without coding any html, css, or javascript?

YES! sort of. Some clever people got together and wrote a library in R that takes a very famous and awesome javascript library (yes there are libraries in javascript, and every other language out there!) called “Leaflet.js” and allows you to write R code and export a fully functional web map, with tiles and geodata drawn right on top!

Basically what happens is:

  1. you load up the “leaflet” library for R
  2. you write R code that then gets translated into html/css/javascript
  3. your translated code is written into an “.html” file which includes:
    • your html skeleton
    • the leaflet.js library and leaflet css style
    • your javascript which was translated from your R code

Our First Example

With 5 lines of code, we’re going to make our first interactive web map!

leaflet_test.png

Run these lines of code and create your first interactive map with R!

# install the leaflet library
install.packages("leaflet")

# add the leaflet library to your script
library(leaflet)
    
# initiate the leaflet instance and store it to a variable
m = leaflet()
    
# we want to add map tiles so we use the addTiles() function - the default is openstreetmap
m = addTiles(m)
    
# we can add markers by using the addMarkers() function
m = addMarkers(m, lng=-123.256168, lat=49.266063, popup="T")
    
# we can "run"/compile the map, by running the printing it
m

Now if we export the map and save as webpage…:

leaflet_webpage.png

The convention for naming .html files is “index”, therefore, let’s name our file: index.html

leaflet_html

CRAZY – with just 4 lines of code, you added a pin to a map that now works on the interwebz! Let’s scale this up and use our dataset and see what we can come up with!


Working with our 3-1-1 Data


hex_van

So let’s shift gears and work with our 3-1-1 data. We now we have a bunch of points for the month of January and hexagonal grids with the call densities. How do we begin to interact with the data?

What are the main interactions we are going to work with?:

  1. pop-up details (aka tool tips) on mouse click
  • reveal specific case type
    • reveal counts
  1. toggle layers on and off

** Let’s get started with a rather verbose first example**

Version 1: Working with Point Data

point_van.png

# reading in a csv file - our data_filter.csv from the 311 Tutorial
# this string tells my R where to find data_filter.csv on my laptop.
# you will need to change it to tell your R where to find your own
# file! 
filename = "/Users/andrasszeitz/Desktop/GEOB_472/data_filter.csv"
data_filter = read.csv(filename, header = TRUE)

# if you are not able to read in your own data, use this as a backup
# NOTE - you need to have the `curl` package loaded for this to work
data_filter = read.csv(curl('https://raw.githubusercontent.com/joeyklee/aloha-r/master/data/calls_2014/201401CaseLocationsDetails-geo-filtered.csv'), header=TRUE)

We’re going to use the subset() function to get store our data into separate variables

# subset out your data_filter:
df_graffiti_noise = subset(data_filter, cid == 1)
df_street_traffic_maint = subset(data_filter, cid == 2)
df_garbage_recycling_organics = subset(data_filter, cid == 3)
df_water = subset(data_filter, cid == 4)
df_animal_vegetation = subset(data_filter, cid == 5)
df_other = subset(data_filter, cid == 0)

Great, now each of our data categories lives in its own variable name.

Initiate Base Map Tiles

Now we’re going to initiate our leaflet map:

# initiate leaflet
m = leaflet()

# add openstreetmap tiles (default)
m = addTiles(m)

# you can now see that our maptiles are rendered
m

Define Symbology for Point Layers

Now we are going to define some colors using leaflet’s special colorFactor() function:

colorFactors = colorFactor(c('red', 'orange', 'purple', 'blue', 'pink', 'brown'),
                           domain = data_filter$cid)

What the colorFactor function does is take our list of colors and “maps” them to the domain that we defined. For example, when we apply the colorFactor() function to our data, it will color a point “red”, if the “cid” in the data is equal to “0”, “orange” if the “cid” is equal to 1, etc.

The next step is to add our points to the map. We can do so using the addCircleMarkers() function:

m = addCircleMarkers(m, 
                     lng = df_graffiti_noise$lon_offset, # we feed the longitude coordinates 
                     lat = df_graffiti_noise$lat_offset,
                     popup = df_graffiti_noise$Case_Type, 
                     radius = 2, 
                     stroke = FALSE, 
                     fillOpacity = 0.75, 
                     color = colorFactors(df_graffiti_noise$cid),
                     group = "1 - graffiti & noise"
                     )

Here’s what each argument is saying:

  • lng: we add our markers to our m
  • lat:we feed the latitude coordinates
  • popup:each feature will show the Case_Type on click
  • radius: we set the circle radius size
  • stroke: we set stroke to false
  • fillOpacity: we set a fill opacity
  • color: the colorFactors() function is applying the color based on the value of the “cid”
  • group:we name the group

Now repeat this function across the other layers:

m = addCircleMarkers(m, 
                     lng = df_street_traffic_maint$lon_offset, lat=df_street_traffic_maint$lat_offset, 
                     popup = df_street_traffic_maint$Case_Type, 
                     radius = 2, 
                     stroke = FALSE, fillOpacity = 0.75,
                     color = colorFactors(df_street_traffic_maint$cid),
                     group = "2 - street & traffic & maintenance")
m = addCircleMarkers(m, 
                     lng = df_garbage_recycling_organics$lon_offset, lat=df_garbage_recycling_organics$lat_offset, 
                     popup = df_garbage_recycling_organics$Case_Type, 
                     radius = 2, 
                     stroke = FALSE, fillOpacity = 0.75,
                     color = colorFactors(df_garbage_recycling_organics$cid),
                     group = "3 - garbage related")
m = addCircleMarkers(m, 
                     lng = df_water$lon_offset, lat=df_water$lat_offset, 
                     popup = df_water$Case_Type, 
                     radius = 2, 
                     stroke = FALSE, fillOpacity = 0.75,
                     color = colorFactors(df_water$cid),
                     group = "4 - water related")
m = addCircleMarkers(m, 
                     lng = df_animal_vegetation$lon_offset, lat=df_animal_vegetation$lat_offset, 
                     popup = df_animal_vegetation$Case_Type, 
                     radius = 2,
                     stroke = FALSE, fillOpacity = 0.75,
                     color = colorFactors(df_animal_vegetation$cid),
                     group = "5 - animals & vegetation")
m = addCircleMarkers(m, 
                     lng = df_other$lon_offset, lat=df_other$lat_offset, 
                     popup = df_other$Case_Type, 
                     radius = 2,
                     stroke = FALSE, fillOpacity = 0.75,
                     color = colorFactors(df_other$cid),
                     group = "0 - other")

Other Options for Base Map Tiles

Now we’re going to add some additional map tiles by using the addTiles() and the addProviderTiles() functions:

m = addTiles(m, group = "OSM (default)") 
m = addProviderTiles(m,"Stamen.Toner", group = "Toner")
m = addProviderTiles(m, "Stamen.TonerLite", group = "Toner Lite")

We can now add in our layer toggle control using the addLayersControl() function – notice baseGroups are our tileLayers and the overlayGroups are our data layers:

m = addLayersControl(m,
                     baseGroups = c("Toner Lite","Toner"),
                     overlayGroups = c("1 - graffiti & noise", "2 - street & traffic & maintenance",
                                       "3 - garbage related","4 - water related", "5 - animals & vegetation",
                                       "0 - other")
)

# make the map
m

Version 2: Optimizing version 1

point_van2.png

We already subset our data from before:

# subset out your data_filter:
df_graffiti_noise = subset(data_filter, cid == 1)
df_street_traffic_maint = subset(data_filter, cid == 2)
df_garbage_recycling_organics = subset(data_filter, cid == 3)
df_water = subset(data_filter, cid == 4)
df_animal_vegetation = subset(data_filter, cid == 5)
df_other = subset(data_filter, cid == 0)

Here’ we are:

  1. adding those variables to a list so we can loop through them
  2. creating a list of our layers that we want the toggle name to be
  3. looping though the list
# Now we're going to put those variables into a **list** so we can loop through them:
data_filterlist = list(df_graffiti_noise = subset(data_filter, cid == 1),
                df_street_traffic_maint = subset(data_filter, cid == 2),
                df_garbage_recycling_organics = subset(data_filter, cid == 3),
                df_water = subset(data_filter, cid == 4),
                df_animal_vegetation = subset(data_filter, cid == 5),
                df_other = subset(data_filter, cid == 0))

# Remember we also had these groups associated with each variable? Let's put them in a list too:
layerlist = c("1 - graffiti & noise", "2 - street & traffic & maintenance",
              "3 - garbage related","4 - water related", "5 - animals & vegetation",
              "0 - other")

# We can keep that same color variable:
colorFactors = colorFactor(c('red', 'orange', 'purple', 'blue', 'pink', 'brown'), domain=data_filter$cid)

# Now we have our loop - each time through the loop, it is adding our markers to the map object:
for (i in 1:length(data_filterlist)){
  m = addCircleMarkers(m, 
                        lng=data_filterlist[[i]]$lon_offset,
                        lat=data_filterlist[[i]]$lat_offset, 
                        popup=data_filterlist[[i]]$Case_Type, 
                        radius=2,
                        stroke = FALSE, 
                        fillOpacity = 0.75,
                        color = colorFactors(data_filterlist[[i]]$cid),
                        group = layerlist[i]
              )
}

Now this time we flip “overlayGroups” and “baseGroups” so that we can get the radiobutton functionality – that way only 1 category of calls are shown:

m = addTiles(m, "Stamen.TonerLite", group = "Toner Lite") 
m = addLayersControl(m,
                      overlayGroups = c("Toner Lite"),
                      baseGroups = layerlist
                      )
m

Version 3: Working with Polygons

hex_van2.png

Remember we creates that hexagonal grid to aggregate our data to? Well, let’s work with it to get a general impression of the call density across the city:

If you haven’t already, you should have the rgdal and the GISTools libraries loaded:

library(rgdal)
library(GISTools)

Now we need to read in our hexgrid we aggregted our data to:

# define the filepath
hex_1401_fn = 'https://raw.githubusercontent.com/joeyklee/aloha-r/master/data/calls_2014/geo/hgrid_250m_1401_counts.geojson'

# read in the geojson
hex_1401 = readOGR(hex_1401_fn, "OGRGeoJSON")
## OGR data source with driver: GeoJSON 
## Source: "https://raw.githubusercontent.com/joeyklee/aloha-r/master/data/calls_2014/geo/hgrid_250m_1401_counts.geojson", layer: "OGRGeoJSON"
## with 1418 features
## It has 5 fields

Now initiate a new map object but this time with the Stamen Toner lite style:

m = leaflet()
m = addProviderTiles(m, "Stamen.TonerLite", group = "Toner Lite")

Like any choropleth map, we need to set a color scale. We can do so by using the colorNumeric() function which is part of the R leaflet package.

We use the “Greens” color and set the “domain” to the column called “data” in our geojson file.

Here’s the exciting stuff. Let’s add our polygon to the map:

# Create a continuous palette function
pal = colorNumeric(
  palette = "Greens",
  domain = hex_1401$data
)

# add the polygons to the map
m = addPolygons(m, 
                data = hex_1401,
                stroke = FALSE, 
                smoothFactor = 0.2, 
                fillOpacity = 1,
                color = ~pal(hex_1401$data),
                popup = paste("Number of calls: ", hex_1401$data, sep="")
                )

let’s break this down:

  • m: this is our map object
  • data: this is our spatialPolygonsDataframe
  • smoothFactor: adding some smooth operator
  • fillOpacity: changing how transparent object fill is
  • color: we let our colorNumeric() function handle the color
  • popup: we set popup on click to our call density

And what is a choropleth map without a legend? Let’s add one using the addLegend() function:

m = addLegend(m, "bottomright", pal = pal, values = hex_1401$data,
              title = "3-1-1 call density",
              labFormat = labelFormat(prefix = " "),
              opacity = 0.75
)

m

Version 3: Synthesis

hex_van3.png

Now that we’ve got the two pieces of our map – toggleable & clickable point layers and hex grid – let’s put them together.

# initiate leaflet map layer
m = leaflet()
m = addProviderTiles(m, "Stamen.TonerLite", group = "Toner Lite") 

# --- hex grid --- #
# store the file name of the geojson hex grid
hex_1401_fn = 'https://raw.githubusercontent.com/joeyklee/aloha-r/master/data/calls_2014/geo/hgrid_250m_1401_counts.geojson'

# read in the data
hex_1401 = readOGR(hex_1401_fn, "OGRGeoJSON")
## OGR data source with driver: GeoJSON 
## Source: "https://raw.githubusercontent.com/joeyklee/aloha-r/master/data/calls_2014/geo/hgrid_250m_1401_counts.geojson", layer: "OGRGeoJSON"
## with 1418 features
## It has 5 fields
# Create a continuous palette function
pal = colorNumeric(
  palette = "Greens",
  domain = hex_1401$data
)

# add hex grid
m = addPolygons(m, 
                data = hex_1401,
                stroke = FALSE, 
                smoothFactor = 0.2, 
                fillOpacity = 1,
                color = ~pal(hex_1401$data),
                popup= paste("Number of calls: ",hex_1401$data, sep=""),
                group = "hex"
)

# add legend
m = addLegend(m, "bottomright", pal = pal, values = hex_1401$data,
              title = "3-1-1 call density",
              labFormat = labelFormat(prefix = " "),
              opacity = 0.75
)

# --- points data --- #

# subset out your data:
df_graffiti_noise = subset(data_filter, cid == 1)
df_street_traffic_maint = subset(data_filter, cid == 2)
df_garbage_recycling_organics = subset(data_filter, cid == 3)
df_water = subset(data_filter, cid == 4)
df_animal_vegetation = subset(data_filter, cid == 5)
df_other = subset(data_filter, cid == 0)

data_filterlist = list(df_graffiti_noise = subset(data_filter, cid == 1),
                df_street_traffic_maint = subset(data_filter, cid == 2),
                df_garbage_recycling_organics = subset(data_filter, cid == 3),
                df_water = subset(data_filter, cid == 4),
                df_animal_vegetation = subset(data_filter, cid == 5),
                df_other = subset(data_filter, cid == 0))
layerlist = c("1 - graffiti & noise", "2 - street & traffic & maintenance",
              "3 - garbage related","4 - water related", "5 - animals & vegetation",
              "0 - other")


colorFactors = colorFactor(c('red', 'orange', 'purple', 'blue', 'pink', 'brown'), domain=data_filter$cid)
for (i in 1:length(data_filterlist)){
  m = addCircleMarkers(m, 
                       lng=data_filterlist[[i]]$lon_offset, lat=data_filterlist[[i]]$lat_offset, 
                       popup=data_filterlist[[i]]$Case_Type, 
                       radius=2,
                       stroke = FALSE, fillOpacity = 0.75,
                       color = colorFactors(data_filterlist[[i]]$cid),
                       group = layerlist[i]
  )

}


m = addLayersControl(m,
                     overlayGroups = c("Toner Lite", "hex"),
                     baseGroups = layerlist
)

# show map
m

Congratulations you just made your first super awesome and fully functional web map using R!

Now, play with your maps and try to identify weaknesses and strengths of your map. Are the colors appropriate? could they be better? what’s missing? REFINE YOUR STUFF


Review:


In this tutorial we covered a lot of ground. We’re neither masters of R nor data viz experts, however we got the chance to go through the entire data viz pipeline:

  • acquire: pulled 3-1-1 data from github
  • parse: manipulated table data and geocoded the data
  • filter: removed points that fell outside of the bounding box of Vancouver
  • mine: classified the data & added jitter
  • represent: web mapping
  • refine: web mapping
  • interact: web mapping
Advertisements

R – Vancouver 311 Tutorial

Created by: Joey K Lee

Edited by: Andras Szeitz


Project: What are Vancouverites Complaining About?

Brief:

The City of Vancouver releases a dataset of the 3-1-1 phone calls – the general hotline regarding maintenance issues in the city. Currently there is no tool to visualize and access the data. How can citizens engage the city for these matters if there’s no way to work with the data? In the name of civic “hacking”, the project brief is to develop a project that:

  1. can parse and handle the BIG data being delivered by the city.
  2. shows the 3-1-1 data and potentially highlights insights derived from the data in an accessible web application.

Overview:

We will use this project to go through Ben Fry’s Data Visualization Pipeline. The final script generated by going through the steps of the data viz pipeline will be a good foundation to use for the upcoming R Assignment.


Process: Exploring the 3-1-1 data

Data viz is hard and in the end comes down to a lot of experimentation and exploration. This script attempts to showcase how the data viz pipeline is done in practice and how it is far from a linear process, but rather a very interactive and dynamic process.

setup:

Let’s being our script with a nice commented header:

######################################################
# Vancouver 3-1-1: Data Processing Script
# Date:
# By: 
# Desc: 
######################################################

We noticed how libraries can help us to read in geographic data and even help us make new scales. Since this is a bigger project, we’re going to need the help of some more libraries:


# ------------------------------------------------------------------ #
# ---------------------- Install Libraries ------------------------- #
# ------------------------------------------------------------------ #
install.packages("GISTools")
install.packages("RJSONIO")
install.packages("rgdal")
install.packages("RCurl")
install.packages("curl")
  
# Unused Libraries:
# install.packages("ggmap")
# library(ggmap)

You will notice we have some unused libraries – these are some that I started out using in the beginning, but decided to not use. I kept them here just for future reference. NOTE: ggmap was previously used for it’s geocode() function – but with google’s 2500 api call limit, it wasn’t enough for the 10,000+ geocoding events we would need for our project.

Now that you’ve installed the libraries now Let’s load up our libraries to make all those new functions available to us:

# ------------------------------------------------------------------ #
# ----------------------- Load Libararies -------------------------- #
# ------------------------------------------------------------------ #
library(GISTools)
library(RJSONIO)
library(rgdal)
library(RCurl)
library(curl)

Acquire

We use the curl() function to make http/https requests from the web to get data and used our read.csv() function to read our table in to R.

# ------------------------------------------------------------------ #
# ---------------------------- Acquire ----------------------------- #
# ------------------------------------------------------------------ #
# access from the interwebz using "curl"
fname = curl('https://raw.githubusercontent.com/joeyklee/aloha-r/master/data/calls_2014/201401CaseLocationsDetails.csv')

# Read data as csv
data = read.csv(fname, header=T)

# inspect your data
print(head(data))
##   Year Month Day Hour Minute           Department
## 1 2014     1   1    7     21       CSG - Licenses
## 2 2014     1   1    7     45    ENG - Solid Waste
## 3 2014     1   1    9     27    ENG - Solid Waste
## 4 2014     1   1   10      6    ENG - Solid Waste
## 5 2014     1   1   10     15 PRB - Administration
## 6 2014     1   1   10     17        ENG - Streets
##                            Division
## 1                    Animal Control
## 2                        Sanitation
## 3                        Sanitation
## 4                        Sanitation
## 5    General Park Board Information
## 6 Traffic and Electrical Operations
##                                          Case_Type Hundred_Block
## 1                          Dead Animal Pickup Case  Intersection
## 2                            Missed Garbage Pickup          22##
## 3 Abandoned Garbage Pickup - City Property & Parks          10##
## 4                            Missed Garbage Pickup          27##
## 5                               PRB_Park Ranger SR          11##
## 6                                    Sign - Repair          8600
##                      Street_Name    Local_Area
## 1 E 64TH AV and PRINCE ALBERT ST        Sunset
## 2                SW MARINE DRIVE    Kerrisdale
## 3                      W 12TH AV      Fairview
## 4                      W 16TH AV Arbutus Ridge
## 5                   W CORDOVA ST      Downtown
## 6            - 8699 GRANVILLE ST       Marpole

Parse

Upon inspecting our data, we notice we have the addresses, but the city has put int “#’s” to help with the anonymity of the callers. Furthermore, we notice that we don’t have any lat/lon coordinates to work with to turn our 3-1-1 calls into a something spatial. How can we develop a solution for this?

First let’s sub out the “#” with “0”:

# ------------------------------------------------- #
# -------------- Parse: Geocoder ------------------- #
# ------------------------------------------------- #
# change intersection to 00's
data$h_block = gsub("#", "0", data$Hundred_Block)
print(head(data$h_block))
## [1] "Intersection" "2200"         "1000"         "2700"        
## [5] "1100"         "8600"

Next let’s concatenate the newly created “h_block” column with the Street_Name column, and a string that specifices that all of the calls are from Vancouver, BC:

# Join the strings from each column together & add "Vancouver, BC":
data$full_address = paste(data$h_block, 
                          paste(data$Street_Name,
                                "Vancouver, BC",
                                sep=", "),
                          sep=" ")

We also notice that the city has put in the word “intersection” for those calls that refer to an intersection. Let’s take those out so as to make our geocoding parsing potentially easier:

# removing "Intersection " from the full_address entries
data$full_address = gsub("Intersection ", "", data$full_address)
print(head(data$full_address))
## [1] "E 64TH AV and PRINCE ALBERT ST, Vancouver, BC"
## [2] "2200 SW MARINE DRIVE, Vancouver, BC"           
## [3] "1000 W 12TH AV, Vancouver, BC"                 
## [4] "2700 W 16TH AV, Vancouver, BC"                 
## [5] "1100 W CORDOVA ST, Vancouver, BC"              
## [6] "8600  - 8699 GRANVILLE ST, Vancouver, BC"

Now, we need to convert the street addresses into latitude & longitude coordinates so that we have a spatial reference to where these calls occurred. We will use the BC Government’s geocoding API, which will take our street addresses and return lat/long coordinates. But, the whole dataset has just over 10,000 rows (each one is an individual call to 311) and geocoding the entire dataset takes about 1 hour.  We will reduce the size of the dataset so that we can actually geocode the addresses in a realistic amount of time during the tutorial. But for your assignment, you will have to geocode the entire dataset!

# creating a random sequence of 1000 numbers between 1 and the length
# of the dataset. These numbers will be the rows we retain from the
# full dataset in order to more quickly geocode during the tutorial
set.seed(1)
keepers = sample(seq(from = 1, to = 10195, by = 1), size = 1000, replace = TRUE)

# creating a new variable to hold the full data set, then trimming the 
# original dataset to have only the 1000 random rows
full_data = data
data = data[keepers,] 

But before we can geocode the street addresses, we need to create a function to do the task! The BC government’s geocoding API works by taking a URL that has a street address as a search term, processing it, and returning a web page with the latitude and longitude of the street address (plus a bunch of other information). The code block below defines a function in R, and after we define the function, we can use it and R to process all of our street addresses in one go!

# a function taking a full address string, formatting it, and making
# a call to the BC government's geocoding API
bc_geocode = function(search){
  # return a warning message if input is not a character string
  if(!is.character(search)){stop("'search' must be a character string")}
  
  # formatting characters that need to be escaped in the URL, ie:
  # substituting spaces ' ' for '%20'.
  search = RCurl::curlEscape(search)
  
  # first portion of the API call URL
  base_url = "http://apps.gov.bc.ca/pub/geocoder/addresses.json?addressString="

  # constant end of the API call URL
  url_tail = "&locationDescriptor=any&maxResults=1&interpolation=adaptive&echo=true&setBack=0&outputSRS=4326&minScore=1&provinceCode=BC"
  
  # combining the URL segments into one string
  final_url = paste0(base_url, search, url_tail)
  
  # making the call to the geocoding API by getting the response from the URL
  response = RCurl::getURL(final_url)
  
  # parsing the JSON response into an R list
  response_parsed = RJSONIO::fromJSON(response)
  
  # if there are coordinates in the response, assign them to `geocoords`
  if(length(response_parsed$features[[1]]$geometry[[3]]) > 0){
    geocoords = list(lon = response_parsed$features[[1]]$geometry[[3]][1],
                      lat = response_parsed$features[[1]]$geometry[[3]][2])
  }else{
    geocoords = NA
  }
  
  # returns the `geocoords` object
  return(geocoords)
}

Now the function is defined. Let’s use it to geocode the 1000 rows in the dataset that we retained.

# Geocode the events - we use the BC Government's geocoding API
# Create an empty vector for lat and lon coordinates
lat = c() 
lon = c()

# loop through the addresses
for(i in 1:length(data$full_address)){
  # store the address at index "i" as a character
  address = data$full_address[i]
  # append the latitude of the geocoded address to the lat vector
  lat = c(lat, bc_geocode(address)$lat)
  # append the longitude of the geocoded address to the lon vector
  lon = c(lon, bc_geocode(address)$lon)
  # at each iteration through the loop, print the coordinates - takes about 20 min.
  print(paste("#", i, ", ", lat[i], lon[i], sep = ","))
}

After a major computational task (e.g. geocoding) you probably want to write your file out so that you keep a copy of your data at this step. In side the quotes for the variable ofile, put in the path to where you want your data stored.

# add the lat lon coordinates to the dataframe
data$lat = lat
data$lon = lon

# after geocoding, it's a good idea to write your file out!
# you will need to modify the directory where you want R to write
# your data!
# joey's computer (mac): '/Users/Jozo/Projects/Github-local/Workshop/aloha-r/data/calls_2014/201401CaseLocationsDetails-geo.csv'
# sally's computer (windows): 'c:\\Sally\\Documents\\van311-project\\201401CaseLocationsDetails-geo.csv'
ofile = "/Users/andrasszeitz/Desktop/GEOB_472/geocoded2.csv" 
write.csv(data, ofile)

Mine

Creating Categories or Classification

We won’t be doing any heavy analysis on the data, but we will try to tease out some categories to make the data more intuitive for exploring. The first thing we’ll do is to see what are the types of unique cases that are being called in:

# ------------------------------------------------- #
# --------------------- Mine ---------------------- #
# ------------------------------------------------- #
# --- Examine the unique cases --- #

# examine how the cases are grouped - are these intuitive?
unique(data$Department)
##  [1] CSG - Licenses                          
##  [2] ENG - Solid Waste                       
##  [3] PRB - Administration                    
##  [4] ENG - Streets                           
##  [5] ENG - Transportation                    
##  [6] ENG - Water & Sewer                     
##  [7] PRB - Planning and Operations           
##  [8] Business Planning & Services            
##  [9] Public Safety - Fire                    
## [10] CSG - Inspections                       
## [11] CSG - Licenses & Inspections            
## [12] Interdepartmental Initiatives           
## [13] ENG - Office of the Deputy City Engineer
## [14] City Administration                     
## 14 Levels: Business Planning & Services ... Public Safety - Fire
unique(data$Division)
##  [1] Animal Control                                    
##  [2] Sanitation                                        
##  [3] General Park Board Information                    
##  [4] Traffic and Electrical Operations                 
##  [5] Neighbourhood Parking and Transportation          
##  [6] Traffic and Data Management                       
##  [7] Street Activities - Integrated Graffiti Management
##  [8] Street Activities - Streets Furniture             
##  [9] Water Operations                                  
## [10] Streets Design - P & D - Accessibility            
## [11] Street Trees                                      
## [12] 311 Contact Centre                                
## [13] Sewers Operations                                 
## [14] Prevention                                        
## [15] Electrical                                        
## [16] Property Use                                      
## [17] Administration Branch                             
## [18] License Office                                    
## [19] Streets Operations                                
## [20] Solid Waste Management                            
## [21] Parking Ops & Enforcement                         
## [22] Vegetation                                        
## [23] Building                                          
## [24] Snow Angel Program                                
## [25] Plumbing and Gas                                  
## [26] Kent Construction Supplies and Services           
## [27] Elections                                         
## [28] Communications                                    
## [29] Street Activities - Streets Horticulture          
## [30] Sewers and Drainage Design                        
## [31] Sewer Separation                                  
## [32] Water Design                                      
## [33] Transfer and Landfill Operations                  
## [34] Active Transportation                             
## 34 Levels: 311 Contact Centre ... Water Operations
# examine the types of cases - can we make new groups that are more useful?
# Print each unique case on a new line for easier inspection
for (i in 1:length(unique(data$Case_Type))){
  print(unique(data$Case_Type)[i], max.levels=0)
}
## [1] Dead Animal Pickup Case
## [1] Missed Garbage Pickup
## [1] Abandoned Garbage Pickup - City Property & Parks
## [1] PRB_Park Ranger SR
## [1] Sign - Repair
## [1] Animal Control General Inquiry Case
## [1] Lost Pets Case
## [1] Residential Parking Requests
## [1] Missed Yard Trimmings and Food Scraps Pickup
## [1] Traffic & Pedestrian Signal - New
## [1] Graffiti Removal - City Property
## [1] Poster/Sign Removal Request
## [1] Street Furniture Repair and Maintenance Request
## [1] Water Leaks/Breaks
## [1] Street Cleaning & Debris Pickup
## [1] Wheelchair Curb/Ramp Request
## [1] Recycling Bag Request
## [1] Animal Complaint - Non-Emergency Case
## [1] Street Light - Out
## [1] Missed Recycling Pickup
## [1] Missed Apartment Recycling Pickup
## [1] Street Tree Work Request SR
## [1] Citizen Feedback
## [1] Cart - Green (Yard Trimmings and Food Scraps)
## [1] Water Service Turn On/Off Request
## [1] Sewer Pipe Inquiries
## [1] Fire Reinspection Request for Firehall
## [1] Street Litter Can Request
## [1] Electrical Inspection Cancellation Case
## [1] Cart - Garbage
## [1] Fire Reinspection Request for Inspector
## [1] Collection Calendar Mail-Out Request
## [1] PUI Noise Complaint Case
## [1] Building Plans Information Request
## [1] Licence Payment Request Case
## [1] Streets - General Issues
## [1] Street Light - Pole Repair
## [1] Water Service Locate Request
## [1] Recycling Box Request
## [1] Gone Out of Business Case
## [1] Street - Surface Water Flooding
## [1] Blue Box and Leaf Removal Guide Mail-Out Request
## [1] Illegal Dumping/Abandoned Garbage Pickup
## [1] Sidewalk - Repair
## [1] Abandoned Vehicle Request
## [1] Cart - Apartment Recycling
## [1] Water Work Site Complaint
## [1] Parks Litter Can or Cart Request
## [1] Water Hydrant Issue
## [1] Secondary Suite Information Request
## [1] Vegetation Maintenance SR
## [1] Street - Repair
## [1] Building Inspection Cancellation Case
## [1] PUI Noise General Inquiry Case
## [1] Snow Angel Program - Individual Volunteer
## [1] Street Light - New/Relocation
## [1] FPB_General Inquiry Case
## [1] PUI General Inquiry Case
## [1] Pothole - Repair
## [1] Trees and Vegetation Encroachment - City Property
## [1] Holding Stray Case
## [1] Green Bin Program - Feedback and General Inquiry
## [1] Traffic Calming Request
## [1] Traffic & Pedestrian Signal - Modify
## [1] Catch Basin Issues
## [1] Snow & Ice Removal - City Property
## [1] Plumbing and Gas Inspection Cancellation Case
## [1] Street Light - Flat Glass Fixture Request
## [1] General Information Request SR
## [1] Traffic Sign - Modify
## [1] Sewer Manhole Issues
## [1] Snow and Ice Removal - Sidewalk Bylaw Violation
## [1] Water General Inquiry
## [1] Sewer General Inquiries
## [1] Curbside Sign - New
## [1] Occupancy Permit Information Request
## [1] Graffiti Removal - External Organization
## [1] Water General Work Request
## [1] Pavement Marking - Repair
## [1] Election General Concerns
## [1] Home Safety Check Request Case
## [1] Apartment Recycling - Registration Request
## [1] Street and Traffic Light - Utility Damage
## [1] Sewer Odour Complaints
## [1] Horticulture Inquiry on Right-of-Way
## [1] Traffic Sign - New
## [1] Sewer Design General Inquiries
## [1] Homelessness/Transient Issue
## [1] Banner Request
## [1] Sewer Separation Inspection Cancellation Case
## [1] Boulevard Maintenance Issues
## [1] Parking Meter Requests
## [1] Bridges & Structures - Repair
## [1] Street Sign - New
## [1] Animal Cremation Case
## [1] Dead Skunk Pickup
## [1] Water Pressure or No Water Issue
## [1] Water Conservation Violation
## [1] Truck Violation
## [1] Water Meter Issue
## [1] Flag Request
## [1] Pavement Markings Request - New/Modify
## [1] Snow and Ice Removal - Sidewalk Bylaw Inquiry
## [1] Transfer Station & Recycling - General Inquiries
## [1] Curbside Sign - Modify
## [1] Crosswalk Marking - New
## [1] Chafer Beetle Feedback
## [1] Water Damage To City Water System
## [1] Traffic Count Request
## [1] Sewer Utility Damage
## [1] Bicycle Route Map Request

Whoa! That’s a heck of a lot of cases. Our role of data visualization people is to take this mess and try to make sense of it so that we can represent it in a manner that is more intuitive and possibly delightful. There are a bunch of ways that we can organize these cases into specific classes, so I invite you to think about how we might best organize these. For now, I’ve come up with 6 classes that bin these cases together. Does this make sense? Should we change it?:

# --- graffiti and noise --- #
Graffiti Removal - City Property
Graffiti Removal - External Organization
PUI Noise Complaint Case
PUI Noise General Inquiry Case

# --- street surface & maintenance --- #
Street Furniture Repair and Maintenance Request
Street Cleaning & Debris Pickup
Street Light - Out
Street Tree Work Request SR
Street Litter Can Request
Streets - General Issues
Street Light - Pole Repair
Street - Surface Water Flooding
Street - Repair
Street Light - New/Relocation
Street Light - Flat Glass Fixture Request
Street and Traffic Light - Utility Damage
Street Sign - New
Crosswalk Marking - New
Boulevard Maintenance Issues
Bicycle Route Map Request
Sidewalk - Repair
Pothole - Repair
Pavement Markings Request - New/Modify
Pavement Marking - Repair
Sewer Pipe Inquiries
Sewer Manhole Issues
Sewer General Inquiries
Sewer Design General Inquiries
Sewer Separation Inspection Cancellation Case
Sewer Utility Damage
Sewer Odour Complaints
Plumbing and Gas Inspection Cancellation Case
Snow Angel Program - Individual Volunteer
Snow & Ice Removal - City Property
Snow and Ice Removal - Sidewalk Bylaw Violation
Snow and Ice Removal - Sidewalk Bylaw Inquiry
Traffic & Pedestrian Signal - New
Traffic Calming Request
Traffic & Pedestrian Signal - Modify
Traffic Sign - Modify
Street and Traffic Light - Utility Damage
Traffic Sign - New
Traffic Count Request
Truck Violation
Residential Parking Requests
Parking Meter Requests
Abandoned Vehicle Request

# --- garbage, recycling & organics --- #
Missed Garbage Pickup
Abandoned Garbage Pickup - City Property & Parks
Cart - Garbage
Illegal Dumping/Abandoned Garbage Pickup
Parks Litter Can or Cart Request
Recycling Bag Request
Missed Recycling Pickup
Missed Apartment Recycling Pickup
Recycling Box Request
Cart - Apartment Recycling
Apartment Recycling - Registration Request
Transfer Station & Recycling - General Inquiries
Blue Box and Leaf Removal Guide Mail-Out Request
Missed Yard Trimmings and Food Scraps Pickup
Cart - Green (Yard Trimmings and Food Scraps)
Green Bin Program - Feedback and General Inquiry
Collection Calendar Mail-Out Request

# --- water related issues --- #
Water Leaks/Breaks
Water Service Turn On/Off Request
Water Service Locate Request
Street - Surface Water Flooding
Water Work Site Complaint
Water Hydrant Issue
Water General Inquiry
Water General Work Request
Water Pressure or No Water Issue
Water Conservation Violation
Water Meter Issue
Water Damage To City Water System
Catch Basin Issues

# --- animal and vegetation related --- #
Dead Animal Pickup Case
Animal Control General Inquiry Case
Animal Complaint - Non-Emergency Case
Animal Cremation Case
Dead Skunk Pickup
Lost Pets Case
Holding Stray Case
Chafer Beetle Feedback
Vegetation Maintenance SR
Trees and Vegetation Encroachment - City Property
Horticulture Inquiry on Right-of-Way

# --- other --- #
Poster/Sign Removal Request
Sign - Repair
Curbside Sign - New
Curbside Sign - Modify
Banner Request
Fire Reinspection Request for Firehall
Fire Reinspection Request for Inspector
Citizen Feedback
Wheelchair Curb/Ramp Request
Wheelchair
PRB_Park Ranger SR
Building Plans Information Request
Building Inspection Cancellation Case
Licence Payment Request Case
Gone Out of Business Case
FPB_General Inquiry Case
PUI General Inquiry Case
Electrical Inspection Cancellation Case
Bridges & Structures - Repair
Secondary Suite Information Request
General Information Request SR
Election General Concerns
Occupancy Permit Information Request
Home Safety Check Request Case
Flag Request
Homelessness/Transient Issue

In the end our 6 classes are:

  1. graffiti and noise
  2. street surface & maintenance
  3. animal and vegetation related
  4. water related
  5. garbage, recycling & organics related
  6. other

Now that we have our classes, we can put them into vectors:

# graffiti and noise
graffiti_noise = c('Graffiti Removal - City Property',
                    'Graffiti Removal - External Organization',
                    'PUI Noise Complaint Case',
                    'PUI Noise General Inquiry Case')

# street surface & maintenance
street_traffic_maint = c('Street Furniture Repair and Maintenance Request',
                          'Street Cleaning & Debris Pickup',
                          'Street Light - Out',
                          'Street Tree Work Request SR',
                          'Street Litter Can Request',
                          'Streets - General Issues',
                          'Street Light - Pole Repair',
                          'Street - Surface Water Flooding',
                          'Street - Repair', 
                          'Street Light - New/Relocation',
                          'Street Light - Flat Glass Fixture Request',
                          'Street and Traffic Light - Utility Damage',
                          'Street Sign - New',
                          'Crosswalk Marking - New',
                          'Boulevard Maintenance Issues',
                          'Bicycle Route Map Request',
                          'Sidewalk - Repair',
                          'Pothole - Repair',
                          'Pavement Markings Request - New/Modify',
                          'Pavement Marking - Repair',
                          'Sewer Pipe Inquiries',
                          'Sewer Manhole Issues',
                          'Sewer General Inquiries',
                          'Sewer Design General Inquiries',
                          'Sewer Separation Inspection Cancellation Case',
                          'Sewer Utility Damage',
                          'Sewer Odour Complaints',
                          'Plumbing and Gas Inspection Cancellation Case',
                          'Snow Angel Program - Individual Volunteer',
                          'Snow & Ice Removal - City Property',
                          'Snow and Ice Removal - Sidewalk Bylaw Violation',
                          'Snow and Ice Removal - Sidewalk Bylaw Inquiry',
                          'Traffic & Pedestrian Signal - New',
                          'Traffic Calming Request',
                          'Traffic & Pedestrian Signal - Modify',
                          'Traffic Sign - Modify',
                          'Street and Traffic Light - Utility Damage',
                          'Traffic Sign - New',
                          'Traffic Count Request','Truck Violation',
                          'Residential Parking Requests',
                          'Parking Meter Requests',
                          'Abandoned Vehicle Request')

# garbage, recycling & organics related
garbage_recycling_organics = c('Missed Garbage Pickup',
                                'Abandoned Garbage Pickup - City Property & Parks',
                                'Cart - Garbage',
                                'Illegal Dumping/Abandoned Garbage Pickup',
                                'Parks Litter Can or Cart Request',
                                'Recycling Bag Request',
                                'Missed Recycling Pickup',
                                'Missed Apartment Recycling Pickup',
                                'Recycling Box Request',
                                'Cart - Apartment Recycling',
                                'Apartment Recycling - Registration Request',
                                'Transfer Station & Recycling - General Inquiries',
                                'Blue Box and Leaf Removal Guide Mail-Out Request',
                                'Missed Yard Trimmings and Food Scraps Pickup',
                                'Cart - Green (Yard Trimmings and Food Scraps)',
                                'Green Bin Program - Feedback and General Inquiry',
                                'Collection Calendar Mail-Out Request')

# water related
water = c('Water Leaks/Breaks',
           'Water Service Turn On/Off Request',
           'Water Service Locate Request',
           'Street - Surface Water Flooding',
           'Water Work Site Complaint',
           'Water Hydrant Issue', 'Water General Inquiry',
           'Water General Work Request',
           'Water Pressure or No Water Issue',
           'Water Conservation Violation',
           'Water Meter Issue',
           'Water Damage To City Water System',
           'Catch Basin Issues')

# animal and vegetation related
animal_vegetation = c('Dead Animal Pickup Case',
                       'Animal Control General Inquiry Case',
                       'Animal Complaint - Non-Emergency Case',
                       'Animal Cremation Case',
                       'Dead Skunk Pickup','Lost Pets Case',
                       'Holding Stray Case',
                       'Chafer Beetle Feedback',
                       'Vegetation Maintenance SR',
                       'Trees and Vegetation Encroachment - City Property',
                       'Horticulture Inquiry on Right-of-Way')

# other
other = c('Poster/Sign Removal Request',
           'Sign - Repair',
           'Curbside Sign - New',
           'Curbside Sign - Modify',
           'Banner Request',
           'Fire Reinspection Request for Firehall',
           'Fire Reinspection Request for Inspector',
           'Citizen Feedback',
           'Wheelchair Curb/Ramp Request','Wheelchair',
           'PRB_Park Ranger SR',
           'Building Plans Information Request',
           'Building Inspection Cancellation Case',
           'Licence Payment Request Case',
           'Gone Out of Business Case',
           'FPB_General Inquiry Case',
           'PUI General Inquiry Case',
           'Electrical Inspection Cancellation Case',
           'Bridges & Structures - Repair',
           'Secondary Suite Information Request',
           'General Information Request SR',
           'Election General Concerns',
           'Occupancy Permit Information Request',
           'Home Safety Check Request Case',
           'Flag Request',
           'Homelessness/Transient Issue')

Assigning Categories a Unique Id (CID)

We can then run through the lists and give each of the classes and cid so that it is easier to identify and represent them later:

# give class id numbers:
data$cid = 9999
for(i in 1:length(data$Case_Type)){
  if(data$Case_Type[i] %in% graffiti_noise){
    data$cid[i] = 1    
  }else if(data$Case_Type[i] %in% street_traffic_maint){
    data$cid[i] = 2   
  }else if(data$Case_Type[i] %in% garbage_recycling_organics){
    data$cid[i] = 3   
  }else if(data$Case_Type[i] %in% water){
    data$cid[i] = 4   
  }else if(data$Case_Type[i] %in% animal_vegetation){
    data$cid[i] = 5   
  }else{
    data$cid[i] = 0   
  }
}

Great, so we now have our data binned into specific groups in a way that seems to make sense. However, if do a little poking around at our data, we notice that since our addresses are aggregated a lot of times to the same address point, our lat/lon coordinates come out the same. How can we deal with this spatial overlap?

Jitter: Offsetting Overlapping Points

One way to do this is to add in some “jitter” to each point if it happens to have the same coordinates.

# --- handle overlapping points --- #
# Set offset for points in same location:
data$lat_offset = data$lat
data$lon_offset = data$lon

# Run loop - if value overlaps, offset it by a random number
for(i in 1:length(data$lat)){
  if ( (data$lat_offset[i] %in% data$lat_offset) && (data$lon_offset[i] %in% data$lon_offset)){
    data$lat_offset[i] = data$lat_offset[i] + runif(1, 0.0001, 0.0005)
    data$lon_offset[i] = data$lon_offset[i] + runif(1, 0.0001, 0.0005)
  } 
}

To derive some more insight into the data, how about looking at the top calls:

# --- what are the top calls? --- #
# get a frequency distribution of the calls:
top_calls = data.frame(table(data$Case_Type))
top_calls = top_calls[order(top_calls$Freq), ]
print(top_calls)
##                                                  Var1 Freq
## 7                                      Banner Request    1
## 18                             Chafer Beetle Feedback    1
## 30                                       Flag Request    1
## 38                     Home Safety Check Request Case    1
## 67                             Sewer Odour Complaints    1
## 70                               Sewer Utility Damage    1
## 74      Snow and Ice Removal - Sidewalk Bylaw Inquiry    1
## 76          Snow Angel Program - Individual Volunteer    1
## 98                                    Truck Violation    1
## 100                      Water Conservation Violation    1
## 8                           Bicycle Route Map Request    2
## 11                      Bridges & Structures - Repair    2
## 21                            Crosswalk Marking - New    2
## 69      Sewer Separation Inspection Cancellation Case    2
## 95                                 Traffic Sign - New    2
## 101                 Water Damage To City Water System    2
## 26                          Election General Concerns    3
## 48               Occupancy Permit Information Request    3
## 52             Pavement Markings Request - New/Modify    3
## 63                Secondary Suite Information Request    3
## 64                     Sewer Design General Inquiries    3
## 96   Transfer Station & Recycling - General Inquiries    3
## 49                             Parking Meter Requests    4
## 75    Snow and Ice Removal - Sidewalk Bylaw Violation    4
## 82          Street Light - Flat Glass Fixture Request    4
## 93                              Traffic Count Request    4
## 106                                 Water Meter Issue    4
## 5                               Animal Cremation Case    5
## 17                                 Catch Basin Issues    5
## 22                             Curbside Sign - Modify    5
## 29            Fire Reinspection Request for Inspector    5
## 51                          Pavement Marking - Repair    5
## 87                                  Street Sign - New    5
## 25                                  Dead Skunk Pickup    6
## 111                      Wheelchair Curb/Ramp Request    6
## 59                     PUI Noise General Inquiry Case    7
## 94                              Traffic Sign - Modify    7
## 40               Horticulture Inquiry on Right-of-Way    8
## 107                  Water Pressure or No Water Issue    8
## 6          Apartment Recycling - Registration Request    9
## 79          Street and Traffic Light - Utility Damage    9
## 102                             Water General Inquiry   10
## 66                               Sewer Manhole Issues   12
## 110                         Water Work Site Complaint   12
## 10                       Boulevard Maintenance Issues   13
## 23                                Curbside Sign - New   13
## 32                     General Information Request SR   13
## 83                      Street Light - New/Relocation   13
## 9    Blue Box and Leaf Removal Guide Mail-Out Request   14
## 36   Green Bin Program - Feedback and General Inquiry   14
## 99                          Vegetation Maintenance SR   14
## 39                       Homelessness/Transient Issue   15
## 108                      Water Service Locate Request   17
## 53      Plumbing and Gas Inspection Cancellation Case   18
## 27            Electrical Inspection Cancellation Case   19
## 103                        Water General Work Request   19
## 33                          Gone Out of Business Case   24
## 92                            Traffic Calming Request   25
## 97  Trees and Vegetation Encroachment - City Property   28
## 65                            Sewer General Inquiries   29
## 31                           FPB_General Inquiry Case   30
## 54                        Poster/Sign Removal Request   30
## 90               Traffic & Pedestrian Signal - Modify   30
## 37                                 Holding Stray Case   31
## 12              Building Inspection Cancellation Case   32
## 73                 Snow & Ice Removal - City Property   32
## 57                           PUI General Inquiry Case   33
## 91                  Traffic & Pedestrian Signal - New   33
## 104                               Water Hydrant Issue   39
## 81    Street Furniture Repair and Maintenance Request   40
## 86                          Street Litter Can Request   44
## 85                         Street Light - Pole Repair   45
## 68                               Sewer Pipe Inquiries   50
## 35           Graffiti Removal - External Organization   55
## 109                 Water Service Turn On/Off Request   55
## 43                                     Lost Pets Case   56
## 56                                 PRB_Park Ranger SR   64
## 44                  Missed Apartment Recycling Pickup   65
## 13                 Building Plans Information Request   66
## 20               Collection Calendar Mail-Out Request   67
## 50                   Parks Litter Can or Cart Request   70
## 71                                  Sidewalk - Repair   71
## 72                                      Sign - Repair   77
## 62                       Residential Parking Requests   80
## 24                            Dead Animal Pickup Case   82
## 4                 Animal Control General Inquiry Case   87
## 77                                    Street - Repair   87
## 58                           PUI Noise Complaint Case  102
## 89                           Streets - General Issues  105
## 28             Fire Reinspection Request for Firehall  119
## 34                   Graffiti Removal - City Property  140
## 55                                   Pothole - Repair  142
## 105                                Water Leaks/Breaks  149
## 78                    Street - Surface Water Flooding  164
## 61                              Recycling Box Request  165
## 3               Animal Complaint - Non-Emergency Case  168
## 2                           Abandoned Vehicle Request  175
## 14                         Cart - Apartment Recycling  181
## 41           Illegal Dumping/Abandoned Garbage Pickup  190
## 42                       Licence Payment Request Case  201
## 80                    Street Cleaning & Debris Pickup  266
## 88                        Street Tree Work Request SR  267
## 60                              Recycling Bag Request  325
## 46                            Missed Recycling Pickup  366
## 16      Cart - Green (Yard Trimmings and Food Scraps)  371
## 19                                   Citizen Feedback  513
## 15                                     Cart - Garbage  689
## 1    Abandoned Garbage Pickup - City Property & Parks  721
## 84                                 Street Light - Out  761
## 45                              Missed Garbage Pickup  763
## 47       Missed Yard Trimmings and Food Scraps Pickup 1229

Filter

Removing Outliers

So we’ve added in some new coordinates and fiddled with them a bit to help with the representation. At this point, it is a good idea to filter out any extraneous points or points that fall outside of our area of interest – in this case Vancouver. Geocoders aren’t perfect and so we should be wary to include any false positives.

First, let’s subset out data that is either an NA or falls outside of Vancouver’s bounds:

# ------------------------------------------------------------------ #
# ---------------------------- Filter ------------------------------ #
# ------------------------------------------------------------------ #
# Subset only the data if the coordinates are within our bounds or if 
# it is not a NA
data_filter = subset(data, (lat <= 49.313162) & (lat >= 49.199554) & 
                    (lon <= -123.019028) & (lon >= -123.271371) & is.na(lon) == FALSE )
# plot the data
plot(data_filter$lon, data_filter$lat)
# If everthing looks good, we might also write out our file:
# Andras' laptop (Mac): '/Users/andrasszeitz/Desktop/GEOB_472/data/311-geo-filtered.csv'
# Andras' computer (Windows): 'C:/Users/aszeitz/Desktop/GEOB_472/data/311-geo-filtered.csv'
ofile_filtered = '/Users/andrasszeitz/Desktop/GEOB_472/Data/311-geo-filtered.csv' 
write.csv(data_filter, ofile_filtered)

NOTE: we write another file out here so we can save our “final dataset”

With a filtered dataset, we can now write our file out to a shapefile and a geojson (if you want to open it later in another GIS).

I highly recommend that you look into learning more about all the different geographic data types like shapefiles and geojson. The last thing you want to do is show up to a job interview and not understand how they work, what the differences are, what are their advantages and disadvantages, etc. For more check out this link, or this link, or this link

Saving Dataset

# --- Convert Data to Shapefile --- #
# store coordinates in dataframe
coords_311 = data.frame(data_filter$lon_offset, data_filter$lat_offset)

# create spatialPointsDataFrame
data_shp = SpatialPointsDataFrame(coords = coords_311, data = data_filter)

# set the projection to wgs84
projection_wgs84 = CRS("+proj=longlat +datum=WGS84")
proj4string(data_shp) = projection_wgs84

# set an output folder for our geojson files:
geofolder = "/Users/andrasszeitz/Desktop/GEOB_472/"

# Join the folderpath to each file name:
opoints_shp = paste(geofolder, "calls_", "1401", ".shp", sep = "")
print(opoints_shp) # print this out to see where your file will go & what it will be called
## [1] "/Users/andrasszeitz/Desktop/GEOB_472/data/geo/calls_1401.shp"
opoints_geojson = paste(geofolder, "calls_", "1401", ".geojson", sep = "") 
print(opoints_geojson) # print this out to see where your file will go & what it will be called
## [1] "/Users/andrasszeitz/Desktop/GEOB_472/data/geo/calls_1401.geojson"
# write the file to a shp
writeOGR(data_shp, opoints_shp, layer = "data_shp", driver = "ESRI Shapefile",
         check_exists = FALSE)
## Warning in writeOGR(check_exists = FALSE, data_shp, opoints_shp, layer =
## "data_shp", : Field names abbreviated for ESRI Shapefile driver
# write file to geojson
writeOGR(data_shp, opoints_geojson, layer = "data_shp", driver = "GeoJSON",
         check_exists = FALSE)

We have all the raw data, but remember that saying “overview first, details on demand”? Our brains simply can’t understand the sheer number of points on the map. How about using some way of aggregating the points to a grid? Turns our hexagonal grids are quite good for conveying density of data points. I’ve created a hexagonal grid in QGIS at a resolution of about 250m x 280m:

Creating Hexagon Grid

hex_van

First let’s read in the grid and reproject it from UTM zone 10 north to WGS84:

# --- aggregate to a grid --- #
# ref: http://www.inside-r.org/packages/cran/GISTools/docs/poly.counts
# set the file name - combine the shpfolder with the name of the grid
grid_fn = 'https://raw.githubusercontent.com/joeyklee/aloha-r/master/data/calls_2014/geo/hgrid_250m.geojson'

# read in the hex grid setting the projection to utm10n
hexgrid = readOGR(grid_fn, 'OGRGeoJSON')
## OGR data source with driver: GeoJSON 
## Source: "https://raw.githubusercontent.com/joeyklee/aloha-r/master/data/calls_2014/geo/hgrid_250m.geojson", layer: "OGRGeoJSON"
## with 5104 features
## It has 4 fields
# transform the projection to wgs84 to match the point file and store
#  it to a new variable (see variable: projection_wgs84)
hexgrid_wgs84 = spTransform(hexgrid, projection_wgs84)

Assigning Values to Hexagons

Next, let’s use the poly.counts() function to count the number of points in each grid cell:

# Use the poly.counts() function to count the number of occurrences of calls per grid cell
grid_cnt = poly.counts(data_shp, hexgrid_wgs84)

Finally, write out the data to a shapefile and geojson – we will use the geojson later, but just for good practice:

# define the output names:
ohex_shp = paste(geofolder, "hexgrid_250m_", "1401", "_cnts",
                  ".shp", sep = "")
print(ohex_shp)
## [1] "/Users/andrasszeitz/Desktop/GEOB_472/data/geo/hexgrid_250m_1401_cnts.sh
ohex_geojson = paste(geofolder, "hexgrid_250m_","1401","_cnts",".geojson") 
print(ohex_geojson)
## [1] "/Users/andrasszeitz/Desktop/GEOB_472/data/geo/hexgrid_250m_1401_cnts.geojson"
# write the file to a shp
writeOGR(check_exists = FALSE, hexgrid_wgs84, ohex_shp, 
         layer = "hexgrid_wgs84", driver = "ESRI Shapefile")

# write file to geojson
writeOGR(check_exists = FALSE, hexgrid_wgs84, ohex_geojson, 
         layer = "hexgrid_wgs84", driver = "GeoJSON")

Technically, we have enough data now to start putting together our web app which visualizes the 3-1-1 data. The next step will be to work with the representation and using the data we produced to filter the visuals with user interactions.

Full Script:

If all is well, this should run in it’s entirety – remember you need to change the folder paths!!!:

#####################################################################
# Vancouver 3-1-1: Data Processing Script
# Date:
# By: 
# Desc: 
#####################################################################

# ------------------------------------------------------------------ #
# ---------------------- Install Libraries ------------------------- #
# ------------------------------------------------------------------ #
# uncomment these lines if these libraries have not yet been installed
# install.packages("GISTools")
# install.packages("RJSONIO")
# install.packages("rgdal")
# install.packages("RCurl")
# install.packages("curl")

# ------------------------------------------------------------------ #
# ----------------------- Load Libararies -------------------------- #
# ------------------------------------------------------------------ #
library(GISTools)
library(RJSONIO)
library(rgdal)
library(RCurl)
library(curl)

# ------------------------------------------------------------------ #
# ---------------------------- Acquire ----------------------------- #
# ------------------------------------------------------------------ #
# access from the interwebz using "curl"
fname = curl('https://raw.githubusercontent.com/joeyklee/aloha-r/master/data/calls_2014/201401CaseLocationsDetails.csv')

# Read data as csv
data = read.csv(fname, header=T)

# inspect your data
print(head(data))

# ------------------------------------------------------------------ #
# ------------------------ Geocoding Function ---------------------- #
# ------------------------------------------------------------------ #

# a function taking a full address string, formatting it, and making
# a call to the BC government's geocoding API
bc_geocode = function(search){
  # return a warning message if input is not a character string
  if(!is.character(search)){stop("'search' must be a character string")}
  
  # formatting characters that need to be escaped in the URL, ie:
  # substituting spaces ' ' for '%20'.
  search = RCurl::curlEscape(search)
  
  # first portion of the API call URL
  base_url = "http://apps.gov.bc.ca/pub/geocoder/addresses.json?addressString="

  # constant end of the API call URL
  url_tail = "&locationDescriptor=any&maxResults=1&interpolation=adaptive&echo=true&setBack=0&outputSRS=4326&minScore=1&provinceCode=BC"
  
  # combining the URL segments into one string
  final_url = paste0(base_url, search, url_tail)
  
  # making the call to the geocoding API by getting the response from the URL
  response = RCurl::getURL(final_url)
  
  # parsing the JSON response into an R list
  response_parsed = RJSONIO::fromJSON(response)
  
  # if there are coordinates in the response, assign them to `geocoords`
  if(length(response_parsed$features[[1]]$geometry[[3]]) > 0){
    geocoords = list(lon = response_parsed$features[[1]]$geometry[[3]][1],
                      lat = response_parsed$features[[1]]$geometry[[3]][2])
  }else{
    geocoords = NA
  }
  
  # returns the `geocoords` object
  return(geocoords)
}

# ------------------------------------------------------------------ #
# ---------------------- Parse: Geocoder --------------------------- #
# ------------------------------------------------------------------ #
# change intersection to 00's
data$h_block = gsub("#", "0", data$Hundred_Block)
print(head(data$h_block))

# Join the strings from each column together & add "Vancouver, BC":
data$full_address = paste(data$h_block, 
                          paste(data$Street_Name,
                                "Vancouver, BC",
                                sep=", "),
                          sep=" ")

# removing "Intersection " from the full_address entries
data$full_address = gsub("Intersection ", "", data$full_address)
print(head(data$full_address))

# creating a random sequence of 1000 numbers between 1 and the length
# of the dataset. These numbers will be the rows we retain from the
# full dataset in order to more quickly geocode during the tutorial
set.seed(1)
keepers = sample(seq(from = 1, to = 10195, by = 1), size = 1000, replace = TRUE)

# creating a new variable to hold the full data set, then trimming the 
# original dataset to have only the 1000 random rows
full_data = data
data = data[keepers, ]

# Geocode the events - we use the BC Government's geocoding API
# Create an empty vector for lat and lon coordinates
lat = c() 
lon = c()
# loop through the addresses
for(i in 1:length(data$full_address)){
  # store the address at index "i" as a character
  address = data$full_address[i]
  # append the latitude of the geocoded address to the lat vector
  lat = c(lat, bc_geocode(address)$lat)
  # append the longitude of the geocoded address to the lon vector
  lon = c(lon, bc_geocode(address)$lon)
  # at each iteration through the loop, print the coordinates - takes about 20 min.
  print(paste("#", i, ", ", lat[i], lon[i], sep = ","))
}

# add the lat lon coordinates to the dataframe
data$lat = lat
data$lon = lon

# after geocoding, it's a good idea to write your file out!
# joey's computer (mac): '/Users/Jozo/Projects/Github-local/Workshop/aloha-r/data/calls_2014/201401CaseLocationsDetails-geo.csv'
# sally's computer (windows): 'c:\\Sally\\Documents\\van311-project\\201401CaseLocationsDetails-geo.csv'
ofile = "/Users/andrasszeitz/Desktop/GEOB_472/geocoded2.csv" 
write.csv(data, ofile)

# ------------------------------------------------------------------ #
# ----------------------------- Mine ------------------------------- #
# ------------------------------------------------------------------ #
# --- Examine the unique cases --- #

# examine how the cases are grouped - are these intuitive?
unique(data$Department)
unique(data$Division)

# examine the types of cases - can we make new groups that are more useful?
# Print each unique case on a new line for easier inspection
for (i in 1:length(unique(data$Case_Type))){
  print(unique(data$Case_Type)[i], max.levels=0)
}

# Determine classes to group case types:

# graffiti and noise
graffiti_noise = c('Graffiti Removal - City Property',
                    'Graffiti Removal - External Organization',
                    'PUI Noise Complaint Case',
                    'PUI Noise General Inquiry Case')

# street surface & maintenance
street_traffic_maint = c('Street Furniture Repair and Maintenance Request',
                          'Street Cleaning & Debris Pickup',
                          'Street Light - Out',
                          'Street Tree Work Request SR',
                          'Street Litter Can Request',
                          'Streets - General Issues',
                          'Street Light - Pole Repair',
                          'Street - Surface Water Flooding',
                          'Street - Repair', 
                          'Street Light - New/Relocation',
                          'Street Light - Flat Glass Fixture Request',
                          'Street and Traffic Light - Utility Damage',
                          'Street Sign - New',
                          'Crosswalk Marking - New',
                          'Boulevard Maintenance Issues',
                          'Bicycle Route Map Request',
                          'Sidewalk - Repair',
                          'Pothole - Repair',
                          'Pavement Markings Request - New/Modify',
                          'Pavement Marking - Repair',
                          'Sewer Pipe Inquiries',
                          'Sewer Manhole Issues',
                          'Sewer General Inquiries',
                          'Sewer Design General Inquiries',
                          'Sewer Separation Inspection Cancellation Case',
                          'Sewer Utility Damage',
                          'Sewer Odour Complaints',
                          'Plumbing and Gas Inspection Cancellation Case',
                          'Snow Angel Program - Individual Volunteer',
                          'Snow & Ice Removal - City Property',
                          'Snow and Ice Removal - Sidewalk Bylaw Violation',
                          'Snow and Ice Removal - Sidewalk Bylaw Inquiry',
                          'Traffic & Pedestrian Signal - New',
                          'Traffic Calming Request',
                          'Traffic & Pedestrian Signal - Modify',
                          'Traffic Sign - Modify',
                          'Street and Traffic Light - Utility Damage',
                          'Traffic Sign - New',
                          'Traffic Count Request','Truck Violation',
                          'Residential Parking Requests',
                          'Parking Meter Requests',
                          'Abandoned Vehicle Request')

# garbage, recycling & organics related
garbage_recycling_organics = c('Missed Garbage Pickup',
                                'Abandoned Garbage Pickup - City Property & Parks',
                                'Cart - Garbage',
                                'Illegal Dumping/Abandoned Garbage Pickup',
                                'Parks Litter Can or Cart Request',
                                'Recycling Bag Request',
                                'Missed Recycling Pickup',
                                'Missed Apartment Recycling Pickup',
                                'Recycling Box Request',
                                'Cart - Apartment Recycling',
                                'Apartment Recycling - Registration Request',
                                'Transfer Station & Recycling - General Inquiries',
                                'Blue Box and Leaf Removal Guide Mail-Out Request',
                                'Missed Yard Trimmings and Food Scraps Pickup',
                                'Cart - Green (Yard Trimmings and Food Scraps)',
                                'Green Bin Program - Feedback and General Inquiry',
                                'Collection Calendar Mail-Out Request')

# water related
water = c('Water Leaks/Breaks',
           'Water Service Turn On/Off Request',
           'Water Service Locate Request',
           'Street - Surface Water Flooding',
           'Water Work Site Complaint',
           'Water Hydrant Issue', 'Water General Inquiry',
           'Water General Work Request',
           'Water Pressure or No Water Issue',
           'Water Conservation Violation',
           'Water Meter Issue',
           'Water Damage To City Water System',
           'Catch Basin Issues')

# animal and vegetation related
animal_vegetation = c('Dead Animal Pickup Case',
                       'Animal Control General Inquiry Case',
                       'Animal Complaint - Non-Emergency Case',
                       'Animal Cremation Case',
                       'Dead Skunk Pickup','Lost Pets Case',
                       'Holding Stray Case',
                       'Chafer Beetle Feedback',
                       'Vegetation Maintenance SR',
                       'Trees and Vegetation Encroachment - City Property',
                       'Horticulture Inquiry on Right-of-Way')

# other
other = c('Poster/Sign Removal Request',
           'Sign - Repair',
           'Curbside Sign - New',
           'Curbside Sign - Modify',
           'Banner Request',
           'Fire Reinspection Request for Firehall',
           'Fire Reinspection Request for Inspector',
           'Citizen Feedback',
           'Wheelchair Curb/Ramp Request','Wheelchair',
           'PRB_Park Ranger SR',
           'Building Plans Information Request',
           'Building Inspection Cancellation Case',
           'Licence Payment Request Case',
           'Gone Out of Business Case',
           'FPB_General Inquiry Case',
           'PUI General Inquiry Case',
           'Electrical Inspection Cancellation Case',
           'Bridges & Structures - Repair',
           'Secondary Suite Information Request',
           'General Information Request SR',
           'Election General Concerns',
           'Occupancy Permit Information Request',
           'Home Safety Check Request Case',
           'Flag Request',
           'Homelessness/Transient Issue')

# give class id numbers:
data$cid = 9999
for(i in 1:length(data$Case_Type)){
  if(data$Case_Type[i] %in% graffiti_noise){
    data$cid[i] = 1    
  }else if(data$Case_Type[i] %in% street_traffic_maint){
    data$cid[i] = 2   
  }else if(data$Case_Type[i] %in% garbage_recycling_organics){
    data$cid[i] = 3   
  }else if(data$Case_Type[i] %in% water){
    data$cid[i] = 4   
  }else if(data$Case_Type[i] %in% animal_vegetation){
    data$cid[i] = 5   
  }else{
    data$cid[i] = 0   
  }
}

# --- handle overlapping points --- #
# Set offset for points in same location:
data$lat_offset = data$lat
data$lon_offset = data$lon

# Run loop - if value overlaps, offset it by a random number
for(i in 1:length(data$lat)){
  if ( (data$lat_offset[i] %in% data$lat_offset) && (data$lon_offset[i] %in% data$lon_offset)){
    data$lat_offset[i] = data$lat_offset[i] + runif(1, 0.0001, 0.0005)
    data$lon_offset[i] = data$lon_offset[i] + runif(1, 0.0001, 0.0005)
  } 
}

# --- what are the top calls? --- #
# get a frequency distribution of the calls:
top_calls = data.frame(table(data$Case_Type))
top_calls = top_calls[order(top_calls$Freq), ]
print(top_calls)

# ------------------------------------------------------------------ #
# ---------------------------- Filter ------------------------------ #
# ------------------------------------------------------------------ #
# Subset only the data if the coordinates are within our bounds or if it is not a NA
data_filter = subset(data, (lat <= 49.313162) & (lat >= 49.199554) & 
                    (lon <= -123.019028) & (lon >= -123.271371) & is.na(lon) == FALSE )

# plot the data
plot(data_filter$lon, data_filter$lat)

# If everthing looks good, we might also write out our file:
ofile_filtered = '/Users/andrasszeitz/Desktop/GEOB_472/filtered.csv'
write.csv(data_filter, ofile_filtered)

# --- Convert Data to Shapefile --- #
# store coordinates in dataframe
coords_311 = data.frame(data_filter$lon_offset, data_filter$lat_offset)

# create spatialPointsDataFrame
data_shp = SpatialPointsDataFrame(coords = coords_311, data = data_filter)

# set the projection to wgs84
projection_wgs84 = CRS("+proj=longlat +datum=WGS84")
proj4string(data_shp) = projection_wgs84

# set an output folder for our geojson files:
geofolder = "/Users/andrasszeitz/Desktop/GEOB_472/"

# Join the folderpath to each file name:
opoints_shp = paste(geofolder, "calls_", "1401", ".shp", sep = "")
print(opoints_shp) # print this out to see where your file will go & what it will be called
opoints_geojson = paste(geofolder, "calls_", "1401", ".geojson", sep = "") 
print(opoints_geojson) # print this out to see where your file will go & what it will be called

# write the file to a shp
writeOGR(data_shp, opoints_shp, layer = "data_shp", driver = "ESRI Shapefile",
         check_exists = FALSE)

# write file to geojson
writeOGR(data_shp, opoints_geojson, layer = "data_shp", driver = "GeoJSON",
         check_exists = FALSE)

# --- aggregate to a grid --- #
# ref: http://www.inside-r.org/packages/cran/GISTools/docs/poly.counts
# set the file name - combine the shpfolder with the name of the grid
# grid_fn = paste(shpfolder,'hgrid_100m.shp', sep="")
grid_fn = 'https://raw.githubusercontent.com/joeyklee/aloha-r/master/data/calls_2014/geo/hgrid_250m.geojson'

# read in the hex grid setting the projection to utm10n
hexgrid = readOGR(grid_fn, 'OGRGeoJSON')

# transform the projection to wgs84 to match the point file and store
# it to a new variable (see variable: projection_wgs84)
hexgrid_wgs84 = spTransform(hexgrid, projection_wgs84)

# Use the poly.counts() function to count the number of occurences of calls per grid cell
grid_cnt = poly.counts(data_shp, hexgrid_wgs84)

# create a data frame of the counts
grid_total_counts = data.frame(grid_cnt)

# set grid_total_counts dataframe to the hexgrid data
hexgrid_wgs84$data = grid_total_counts$grid_cnt

# remove all the grids without any calls
hexgrid_wgs84 = subset(hexgrid_wgs84, grid_cnt > 0)

# define the output names:
ohex_shp = paste(geofolder, "hexgrid_250m_", "1401", "_cnts",
                  ".shp", sep = "")
print(ohex_shp)

ohex_geojson = paste(geofolder, "hexgrid_250m_", "1401", 
                      "_cnts", ".geojson", sep = "") 
print(ohex_geojson)

# write the file to a shp
writeOGR(check_exists = FALSE, hexgrid_wgs84, ohex_shp, 
         layer = "hexgrid_wgs84", driver = "ESRI Shapefile")

# write file to geojson
writeOGR(check_exists = FALSE, hexgrid_wgs84, ohex_geojson, 
         layer = "hexgrid_wgs84", driver = "GeoJSON")

Creating a full dataset:

We completed the exploration for 1 month, but what about the remaining 11 months? What is the value added of having the other temporal data? For the future, you can explore the full year’s worth of data by running this script on each of the datasets. Don’t forget to change the filenames for each year – if you don’t you’ll write over your data and have to run all the code again 😉

R Tutorial

Created by: Joey K Lee

Edited by: Andras Szeitz

About

This is an intro to the R programming language which we will use throughout the rest of the course and hopefully you’ll use when you leave UBC and start your lives in research, industry, and beyond!

Requirements:

  • First: download and install R – make sure you download the right one for your system.
    • For Windows users, you can choose between 32-bit and 64-bit versions of R, you need to choose which one is right for your computer. If you’re really unsure, then choose the 32-bit, BUT if you don’t mind google searching “32-bit or 64-bit windows for “
    • SUPER IMPORTANT for Windows: We’ve noticed some issues with R and the packages if you install R and R studio to to the “C:\Program Files” which is the default option.
      • Instead try to install R and RStudio on your “C:\” drive. If you have already installed R to your “C:\Program Files” and have been noticing problems, uninstall R and Rstudio and reinstall them in your “C:\” drive.
  • Second: download and install R studio (make sure to install R first!)

Outline

Why are we learning R?

Why are we using R after spending all that time learning Processing? Well, first I would go back to the fact that programming languages are like materials, each with a distinct qualities, benefits, and limitations. In this way, it doesn’t make sense to say one language is better or worse than another, rather that they are just different. So why learn R? Here’s why I chose to introduce R in this course:

  • High Level Programming Language: R is what we consider a high level programming language. With a few lines of code, you can do a lot of stuff which is super nice when we’re learning.
  • Easy to setup and has a nice Integrated Development Environment (IDE): R is nice because it sort of lives within itself – it is easy to download, install, and setup, it is cross platform (it can be used in Windows, Mac OS, etc), and has a nice IDE called RStudio which we will use to help us write our code.
  • Large user community: Like Processing, R has a HUGE user community that has grown to include domains of all sorts. This means also if you have a specific question that most likely someone has already found the answer and posted it online. The New York Times Data team is using R (among other things), for example!
  • Super handy packages: Because of the large user community people have developed packages for doing all sorts of cool things. Want to make an interactive map? Bam! There’s an R package for that. Want to make a hillshade of a DEM in R? There’s a package for that too! You’ll start to see that packages will be your best friend, so use your favorite search engine and find the right packages for you!
  • Nice plots: R kicks ass for plotting – and this is a data visualization class after all. With one line of code you can already start looking at your data, add a few more and you already have something pretty to look at. As a way of exploring and displaying your data, R is a nice way to go.
  • Local expertise: Its always handy to have people around to help you if you get stuck. There are plenty of people teaching R at UBC and also some expert grad students and faculty in Geography.
  • Open source mapping: When we dive further into R we’ll start to see that we can start to do some really fancy geoprocessing in R – Basically your own, free, GIS 🙂 When you leave UBC and no longer have access to proprietary software, you’ll be happy to know that you have the tools to continue mapping away.

Getting Started

Throughout this workshop, try your best to make connections between what you learned in the intro to Processing to what R has to offer. There are some fundamental differences you will begin to notice, but you’ll really quickly begin to see the advantages (and disadvantages) of using R.


R Studio IDE:

This is the R studio IDE. We will use R studio as our environment for running our code, making our scripts, and viewing our plots. It’s sort of like our Processing PDE, with the major difference being that R is an interpreted language meaning we can run our code line-by-line versus Processing which is compiled which means Processing runs all of the code at once. You’ll see why this is advantages for working with and exploring data. Anyways, because of this key difference, R studio is setup to help us explore our data line-by-line.

rstudio

Your typical R studio IDE will look like the above and includes:

  • Script window: A place to put your script (your entire code)
  • R console: every line of code your run from your script will be run, evaluated and printed here.
  • Global Environment: This will show all the variables you declare and assign. The fancy thing about R studio is that it will tell you what kind of variable it is and even let your preview your data that you read in!
  • Plot, Packages, Help, & more: Your plots will be shown here. Also you will be able to find packages and access R’s help documentation.

Keeping Track:

  • Print everything – check that things are working and see how you change and manipulate data with functions and operations.
  • Plot everything – again, check that things are working and see how you change and manipulate data with functions and operations.
  • Comment your code – it will make it easier for you to remember what you were trying to do, and make it easier for others to read and understand your code

Quick Syntax review:

Mathematical operators in R

While most of the mathematical operators are the same as in Processing, there’s a few differences. Let’s check out those mathematical operators in R again.

Operator        Description
+               addition
-               subtraction
*               multiplication
/               division
^ or **         exponentiation
x %% y          modulus (x mod y) 5%%2 is 1
x %/% y         integer division 5%/%2 is 2

Logical Operators in R

Operator        Description
<               less than
<=              less than or equal to   >               greater than
>=              greater than or equal to
==              exactly equal to
!=              not equal to
!x              Not x
x | y           x OR y
x & y           x AND y
isTRUE(x)       test if 'x' is TRUE

Assignment Operators and Subsetting in R

Operator        Description
<-              assigns a value to a name
=               assigns a value to a name
[]              extracts a part of an object
$               extracts a part of a vector

Gettin’ down and di-R-ty:

After learning about the fundamental building blocks of programming (and the data visualization pipeline), the goal now is to apply our understanding by taking code and data and producing visual output.

Instead of learning how to code R from the bottom up (which we did with Processing), we will go from the top down, meaning: we are going to learn by running examples and in the process train ourselves to deconstruct the patterns and functions we pick up along the way.

But before we get into some examples, let’s do a quick tour of running code in R, some R syntax and data types!


Your First R commands: “Hello World”

Let’s start by just running a few commands in the R Console. To run a command in the console window, simply:

  • type your statement and
  • press [enter].

In the console window, let’s write 3 statements and run them. Here’s the result of running the 3 examples below.

r-intro

  • Addition
    5 + 3
    
  • Logical Operation
    100 < 200
    
  • String
    "Oops I did it again"
    

Running code from the script window

Now that you’ve seen how R reads, evaluates, and prints the statements you feed it line-by-line, .

When running code line-by-line in the script window you:

  1. Highlight the lines you want to run:

highlight.png

  1. then “run the code” by pressing at the same time:
    • On Mac: [cmd] + [enter]
    • On Windows: [ctrl] + [enter]

The results will then be “printed” to the console:

script-print.png

Now, to familiarize yourself with running code line-by-line from the script window, let’s write another 3 statements, but this time insert a comment between each of the statements to neatly organize your code.

NOTE: In R, comments are made by using the hash symbol (#)

Try writing 3 examples of your own in the script window and seeing what is printed to the console:

  • Commenting:
    # --- Gettin' Down and di-R-ty: an intro to R --- #
    # remember anything after the comment symbol is not evaluated
    
  • Multiplication
    # Multiplication
    6 * 5
    
  • Exponentiation
    # Exponents
    5**2
    
  • String
    # Strings
    "Hello World"
    

FYI, we can also use the print() function to print statements for us like this:

# --- heres another way to print in R --- #
print("Hello World!")
print(5*10)
print(10-3)

Run this and see that it return the same thing! Pretty neat stuff.

NOTICE SOMETHING? WE DON’T NEED TO USE SEMI-COLONS (;) IN R! This means that R doesn’t use semi-colons to signify the end of a statement in contrast to Processing. Rather R uses line breaks to interpret each statement. Excited? Sad? Indifferent? In any case, just remember you don’t need semi-colons to end statements in R. Are you beginning to see how programming languages all have their own flavor?

Now that you see how to run code line-by-line, let’s dive further into some more exciting stuff.

Help! Information!

If you are trying to use a function and you aren’t getting the output you are expecting, or maybe it’s straight-up not working, RStudio has very useful help documentation for all the functions and packages you might use! To get the information about what parameters a function is expecting, and in what format, just type ?’function’ in the console. ie:

# Getting help on mean:
?mean

untitled


Variables, Data, & Plotting

Before we take a leap into a real-world example, it’s worth noting a few fundamental things about R variables, data, and plotting.

R Variables:

Unlike in Processing which is strongly typed, R is loosely typed meaning that we don’t need to specify the type of data that we are assigning to each variable – notice we don’t have to specify “int” or “float” or “String” – R knows that type of data we are assigning!:

# I'm an integer variable:
x <- 12
print(x)

# I'm a float variable:
y <- 16.5
print(y)

# I'm a string variable:
z <- "Hello"
print(z)

Fundamental R Data Types:

Here are a number of fundamental R data objects that over time you will come to fully understand what they are, what makes them special, and when and where you would use them. For the 4 major data objects below, all we need to now is that:

Vectors (pretty much just a 1-D Array):

A vector in R is basically a collection of elements of the same data type (e.g numeric, character, boolean.) – a 1-D array. We create a vector in R by using the c() function. The c() is short for “concatenate” which means “paste together or chain together in a series”.

Here’s 2 examples of a vectors:

# Here's some data of months of the year and the precipitation in Vancouver
months <- c(1, 2, 3, 4, 5, 6, 7, 8, 9 , 10, 11, 12)
precipitation <- c(186, 94, 118, 85, 60, 59, 38, 39, 48, 126, 183, 177)

Lists:

An R list is a collection of elements (similar to a vector) BUT the key difference is that a list can contain different data types. Lists can be created by using the list() function and filling in values to be stored in the list.

Here’s an example of a list of all the same values:

# Hey I'm a list:
my_list <- list(3, 6, 9, 12,15)

Here’s a list of lists of mixed data types:

# ---- a list ---- #
my_fav_things <- list(pi, 42, "Vancouver")

As you can see, depending on your data, this could be useful way of organizing and structuring your data. Looking kind of like Javascript Object Notation (JSON) with those keys and values, eh?

Dataframes:

A dataframe is essentially data in a tabular form (think an excel spreadsheet) – it is composed of rows and columns of equal length. Each column is a vector ( aka 1-D array), meaning it contains all of the same data type (except for the header row) and is recursive, meaning that we can apply functions and mathematical operations on each column. Holy guacamole, does this mean R is really great at processing tabular data like excel sheets, csv files, and maybe even shapefiles? Yes! That is exactly right!

Very generally, dataframes are composed of vectors, which are like each column. OH! That means we can take our month and precipitation data and create a data frame?! yes! All we have to do is use the data.frame() function:

A dataframe of the rain data:

# Here's some data of months of the year and the precipitation in Vancouver
months <- c(1, 2, 3, 4, 5, 6, 7, 8, 9 , 10, 11, 12)
precipitation <- c(186, 94, 118, 85, 60, 59, 38, 39, 48, 126, 183, 177)

# rain dataframe
rain_data <- data.frame(months, precipitation)
print(rain_data)

This gives us a nice tabular dataset 🙂

precip-df

We use R’s selector operator ($) to select each column:

# select the precpitation column and calculate the annual average
mean_precip <- mean( rain_data$precipitation )
print(mean_precip)

We see that the mean precipitation in Vancouver is 101.0833 mm.

Plotting!

Ok, so now we are familiar with some of the basic data objects.  But just looking at a vector of numbers or a dataframe doesn’t make it easy to see what information the data holds. The great thing about R is how easily we can make plots to visualize our data! Here, we’ll plot the monthly precipitation data as a scatterplot:

# A simple scatterplot:
plot(x = months, y = precipitation)

precip-p.png

Or how about a line plot:

# a line plot:
plot(x = months, y = precipitation, type = "l")

precip-l.png

Or how about both lines and points:

# scatterplot with lines - "b" stands for both:
plot(x = months, y = precipitation, type = "b")

precip-b.png
Or how about a barplot:

# plot it as a bar plot
barplot(height = precipitation, names.arg = months)

precip-bar

Say, we wanted to plot the annual average on top of the monthly values, we could do something like this:

# create the plot:
plot(x = rain_data$month, y = rain_data$precipitation, type = "b" )

# add the mean line:
abline(h = mean_precip)

precip-mean.png

How easy and great is that? Sure sure, you’re probably thinking “This is lame, I could do this in Excel” – but what if you had to make these graphs for every year in Vancouver since 1965? You could write a loop to automate that in R instead of clicking over and over again in Excel… AND if you needed to change a color or an axis label, you could just run your script again… just sayin’.

*NOTE: We usually won’t be making our own dataframes from scratch, but rather reading them in the form of delimited text files like .csv, .xls, .tsv, .dat, etc. *

and finally, Matrices:

Think of matrices as a raster or bitmap image. While the x and y dimensions can be different (e.g. like a photograph) all the columns must be the same data type.

As a quick example, let’s use one of R’s preloaded datasets – a matrix (e.g. raster) of Auckland’s Maunga Whau Volcano:

# store the volcano data to a variable:
cano <- volcano
print(cano)

Whoa! we get a bunch of print outs that essentially show this structure/numbers which represent the topography of this volcano:
cano-ex

Now let’s plot it, but instead of using plot() let’s use the image() function:

# image() fuction to plot the cano
image(cano)

Wow! Such colors!

cano-img.png

What other goodies can we plot? How about some contours?

# contour lines of matrix
contour(cano)

cano-contours.png

And believe it or not, enter the 3rd dimension:

# Get some perspective
persp(cano, expand = 0.3, phi = 35, theta = 10)

cano-persp.png


Reading Dataframes from files:

We just saw how to make a dataframe using the data.frame() function using a set of vectors. Now let’s read in data from a file as a dataframe using R’s read.csv() function:

# --- reading in data from a csv file --- #
# store the filename & path to a variable
fileName <- 'https://raw.githubusercontent.com/joeyklee/aloha-r/master/data/rain2014.csv'

# pass the filename variable to the "read.csv()" function
# use "header = TRUE" if there's a header
rain_csv <- read.csv(fileName, header = TRUE)

If we look at the data again, we see that it looks exactly like the dataframe we created earlier:

precip-df

Now, we can do exactly the same steps as before EXCEPT we need to change the data name from rain_data to rain_csv:

# select the precpitation column and calculate the annual average
mean_precip <- mean(rain_data$precipitation)
print(mean_precip)

rain-csv-plot.png


R Packages (aka libraries)

R packages (aka libraries) may be one of the most useful things about R. Libraries are basically bundled up scripts that people (software developers, researchers, designers, artists, etc) have written to help take complicated tasks or computations, wrapped up in simple(r) to use functions to make programming easier, more fun, and easy to read.

In this way, you can say that Processing is a language and also a “library” because it makes functions available (e.g. ellipse() and rect()) which would take 10 to 20 lines of code in Java and puts them into an easy to use function.

In R, we have countless numbers of libraries to help us to tasks. Remember earlier I mentioned that there are R packages that can help you do geoprocessing and even make your own interactive web map? Well, now we’re going to introduce a package that allows us to read in shapefiles into R so that we can use it as a basemap for our crime plot.

NOTE: It may seem like R libraries are just doing magic behind the scenes, but they are simply making more functions available to you to use that aren’t already included in the base R library – think of it as an actually library, you go there to get books that you don’t have at home 😉

SO to start, we need to install the package we want to use. After doing a few google searches, I found an R package called “GISTools” that allows us to read shapefiles into R, make choropleth maps, do geoprocessing, etc – essentially your own, free and open source GIS.

installing and adding a library

We install packages in R by using the install.packages() function:

# ---------- Using our first R package to display a shapefile! ---------- #
# install the maptools library
install.packages("GISTools")
install.packages('scales')

NOTE: If you get an error like “cannot write to lib” or something on windows — you need to change the read/write permissions on that folder. You can do so by navigating to that folder > right click > properties > check the box that says “read/write”.

After we install our R package, we need to import it to our script. We do so using the library() function:

# import the GISTools functions by calling the library() function
library(GISTools)
library(scales)

Now that we have our library imported, we have can read in some shapefiles that are conveniently sitting in our data folder. Similar to how we read in our .csv file, we just have to include the path name to each of our .geojson files.

# --- read in shps --- #
# Building shp
fname_buildings <- "https://raw.githubusercontent.com/joeyklee/aloha-r/master/data/example/buildings.geojson"
buildings <- readOGR(fname_buildings, 'OGRGeoJSON')
plot(buildings) # plot to inspect

# Roads shp
fname_roads <- "https://raw.githubusercontent.com/joeyklee/aloha-r/master/data/example/roads.geojson"
roads <- readOGR(fname_roads, 'OGRGeoJSON')
plot(roads) # plot to inspect

# Co2 points
fname_co2 <- "https://raw.githubusercontent.com/joeyklee/aloha-r/master/data/example/co2.geojson"
co2 <- readOGR(fname_co2, 'OGRGeoJSON')
plot(co2)

By plotting the data, we can see what we have! Now, let’s combine the plots together to make a very basic map:
ex1.png

# Plot the buildings
plot(buildings, 
     col = "#808080", 
     border = F,
     bg = "#FFFFFF",
     main = "CO2 Mixing Ratios Downtown Vancouver")

ex2

# Add the roads to the plot with "add = TRUE"
plot(roads,
     col = "#000000",
     lwd = 1,
     add = TRUE)

ex3.png

# Add the CO2 observation points to the plot with "add = TRUE"
vals = rescale(co2@data$co2, c(0.5, 5))
plot(co2,
     col = "#FF6600",
     pch = 20,
     alpha = 0.5,
     cex = vals,
     add = TRUE)

We can even add a north arrow in R (or later in Illustrator):

# We can even add a north arrow:
north.arrow(xb = -123.144035, yb = 49.273001, len = 0.0005)

And a scale bar:

# scale bar:
map.scale(xc = -123.151295, 49.271729, len = 0.01,
          ndivs = 2, subdiv = 3, units = "test")

Now you’re more than primed to construct and deconstruct our first real-world R project!

Processing Tutorial – Abridged


Before we dive in:

What You’ll Need

  • Download Processing for your specific operating system.
  • The confidence, curiosity and enthusiasm to learn and the drive to teach yourself and the patience to help those around you.

What you need to do:

  • Do the One hour of code with Dan Shiffman to get familiar with Processing and the concepts we’ll dive further into today.

Processing Workshop

In this workshop we will be introduced to Processing – a programming language and environment developed “to promote literacy within the visual arts and visual literacy within technology”. Processing is used across a variety of communities but has had particular success within the art and design community (and data visualization community) for its strength in generating visual and interactive output. The purpose of this workshop is to:

  1. Understand the basic concepts of programming visual outputs with code.
  2. Become familiar with writing simple code to draw shapes, lines, and colours in Processing.
  3. Explore the difference between static and dynamic displays as created by our code.
  4. Become acquainted with the sources of information and help available to ‘part-time’ coders.

We will use the concepts learned in this workshop throughout the rest of the course to learn how to systematically approach data visualization problems and explore other tools and methods of producing visuals with data.


Learning Outcome / Assignment:

At the end of this workshop, you will be equipped with the fundamental tools to draw with code. With your new skills you will be able to produce all sorts of visual output using Processing’s primitive shapes and programming concepts such as variables, conditionals, and loops. For your assignment choose one these three Proun drawings by El Lissitzsky.

proun1proun2proun3

You will develop 2 versions:

  1. The first will be a version as close to the image as possible.
  2. The second will be a remix of the image – playing with the composition through the use of alternative colors, strokeWeight, object positioning, etc. – save your code in a separate file!

Be sure to try and use variables – ones that you define and in-built variables like mouseX and mouseY – to affect the composition.

NOTE: You are encouraged to work together on the initial composition, but your remixes should all look different.

Turning in the assignment:

  • create a blog (that you create from wordpress, tublr, cargo collective, blogspot, etc)
  • you will post your two images AND your code to your blog.
  • send the link to your blog to Andras and Sally.
  • Watch the video below to successfully copy your Processing code to your blog such that it is formatted with syntax highlighting.

https://player.vimeo.com/video/140105501

how-to-post-code-wordpress from Joey Lee on Vimeo.


Index

  • S1. The Processing Environment
    • The Text Editor & Display Window
  • S2: Fundamentals
    • S2.1: Structure
      • S2.1.1: Comments
      • S2.1.2: Functions
      • S2.1.3: Expressions & Statements
  • S3.1: Shape
    • S3.1.1: coordinates
    • S3.1.2: primitives
    • S3.1.3: properties
    • Basic Color properties
    • Basic Stroke Properties
    • S3.1.4: custom shapes
  • S4: A tiny bit more Fundamentals
    • Variables and Data Types
  • S5. Dynamic Sketches
    • A dynamic sketch
    • Dynamic Sketch Remix
    • Saving your work
  • Recap & Future Directions:
    • What we learned today
  • Future Directions
    • Endless learning
  • Web Searching
  • Resources
  • References
  • Appendix


S1. The Processing Environment

The Text Editor & Display Window

The Processing Development Environment (IDE) looks like this: pde

  • Text editor: a nice place to put your code
  • Play Button: Run your “sketch”
  • Stop Button: Stop your “sketch”
  • Message Area: One-line messages are printed here – usually error messages.
  • Console: The console will print out feedback – usually text that you instruct it to print out while you are building your sketch.
  • Display Window: “A computer screen is a grid of light elements called pixels. Each pixel has a position within the grid defined by coordinates.” This is where your sketches will be shown.

S2: Fundamentals

S2.1: Structure

Voltaire once said, “If you wish to converse with me, define your terms”. In this section, we’re going to define a number of terms so we can all speak the same language when referring to the different elements of a program.

What you should learn and remember from this section are:

  1. comments
  2. functions can have a number of parameters that are passed arguments
    • e.g. size() is a function that has two parameters: width and height. We can pass 2 numbers in like 700 and 500 which are the arguments.
  3. expressions & statements

S2.1.1: Comments

In code, comments are things you write that the computer ignores and does not evaluate.   In Processing, comments are indicated by beginning a line with //. For example:

// ellipse(1, 1, 1, 1);  This is a commented line because of the '//'
ellipse(250, 100, 50, 50);

These are useful when programming to help you (and others who read your code) keep track of what’s happening in your program and allows you to test different functions and statements.

We can use two methods of comments:

For single line comments: //

// fill(255, 0, 0);
// ellipse(250, 100, 50, 50);   

OR

Block comments for multiple lines of code:

/* 
fill(255, 0, 0);
ellipse(250, 100, 50, 50);
*/

TIP: We can use comments to help us structure our logical steps when writing code. For example:

// first set the canvas size

// next set the canvas color

// next draw a circle in the center of the canvas. Since the canvas is 700 pixels wide and 500 pixels tall, the center must be at (350, 250). 

// next draw a triangle about 1/3 the width of the canvas

// ... 

S2.1.2: Functions

Functions allow you to draw shapes, set colors, calculate numbers, and to execute many other types of actions. A function’s name is usually a lowercase word followed by parentheses. Functions have parameters in which you fill in with arguments.

The comma-separated elements between the parentheses are the parameters that you fill with arguments, and they affect the way the function works. Some functions have no parameters and others have many. This program demonstrates the size() and background() functions. cont.

// The size function has two parameters. 
// The first sets the width of the display window and the second sets the height
// the arguments here are 200, 200
size(200, 200);


// This version of the background function has one parameter.
// It sets the gray value for the background of the display window in the
// range of 0 (black) to 255 (white). The argument here is 102.
background(102);

S2.1.3: Expressions & Statements:

Using an analogy to human languages, a software expression is like a phrase. Software expressions are often combinations of operators such as +, *, and / that operate on the values to their left and right. A software expression can be as basic as a single number or can be a long combination of elements. An expression always has a value, determined by evaluating its contents.

// expression       // value
5 + 5               10

// expression       // value
10 < 50             true

A statement – composed of a set of expressions – is like a sentence that gets translated into machine readable code that instructs the computer to do something. For example we see that the collection of these expressions makes a statement (below) that tell the computer to do a particular task.

size(200, 200);     // Runs the size() function to set the canvas size
int x;              // Declares a new variable x as an integer type
x = 102;            // Assigns the value 102 to the variable x
background(x);      // Runs the background() function

S3.1: Shape

Let’s dive into how we can draw shapes with Processing.


S3.1.1: coordinates

The coordinate space of a Processing canvas is set using the size() function. The parameters for width and height sets the number of pixels that will be in the x-coordinate space and y-coordinate space.

The Processing canvas starts at (0,0) at the top-left corner of the canvas.

coordinate-space1

Credits: “A Programming Handbook for Visual Artists and Designers” – Reas & Fry

* for your reference: Processing also comes with a number of other renderers including a 3D renderer, but we won’t get into this now – just something to think about!


S3.1.2: primitives

Processing’s primitive shapes are the bread and butter of making visual output with code. As we saw in our first program in which we made a circle, triangle, and square with one function, there are other primitive shapes that we can use. These primitives are listed below:


point(x,y)

point1

Experiment: draw 5 points in a row on your screen


line(x1, y1, x2, y2)

line1

Experiment: draw 3 “x’s” on your screen using lines

triangle(x1, y1, x2, y2, x3, y3)

triangle1

Experiment: draw 3 triangles if different sizes slightly overlapping each other

quad(x1, y1, x2, y2, x3, y3, x4, y4)

quad1

Experiment: draw 1 funky quadrilateral

rect(x, y, width, height)

rect1

Experiment: draw 2 rectangles with exactly the same arguments – use the rectMode(CENTER) function for 1 of the rectangles and see what it does.

ellipse(x, y, width, height)

ellipse1

Experiment: draw a “bullseye” target with 6 ellipses. Play will fill colors to change the color of each ellipse.

bezier(x1, y1, cx1, cy1, cx2, cy2, x2, y2)

bezier1

Experiment: draw 1 bezier curve to see how it works.

S3.1.3: properties

In Processing we can change the properties of the shapes to affect their fill color, stroke color, and drawing mode. Let’s go over a few of them here:

Basic Color properties

If we use RGB color space:

* value1 = red          (between 0 - 255)
* value2 = green        (between 0 - 255)
* value3 = blue         (between 0 - 255)
* alpha = transparency  (between 0 - 100)

Processing comes with a handy Color tool to select colors

colortool1

NOTE: if you want to save a color to a variable, you must use the color type:

color pink = color(255, 8, 152);
background(pink);

fill()

fill(value1, value2, value3)
fill(value1, value2, value3, alpha)

noFill()

Use the noFill() function if you don’t want a fill color

size(700, 500);

// with fill
fill(242, 172, 20);
ellipse(width*0.25, height * 0.5, 150, 150);

// no fill
noFill();
ellipse(width*0.75, height * 0.5, 150, 150);
nofill1

Experiment: make your own example using noFill() with another set of shapes like a triangle and rectangle.


Basic Stroke Properties

stroke()

stroke(value1, value2, value3)
stroke(value1, value2, value3, alpha)

noStroke()

Use the noStroke() function if you don’t want a stroke color

size(700, 500);

// no stroke
noStroke();
fill(242, 172, 20);
ellipse(width*0.25, height * 0.5, 150, 150);

// black stroke
stroke(0);
fill(242, 172, 20);
ellipse(width*0.75, height * 0.5, 150, 150);
nostroke1

Experiment: make your own example using stroke() and noStroke() with another set of shapes like a triangle and rectangle.


strokeWeight()

You can use the strokeWeight() function to change the width of the stroke.

size(700, 500);

strokeWeight(1);
line(width*0.25, height * 0.25, width*0.75, height * 0.25); 

strokeWeight(3);
line(width*0.25, height * 0.35, width*0.75, height * 0.35); 

strokeWeight(5);
line(width*0.25, height * 0.45, width*0.75, height * 0.45); 

strokeWeight(7);
line(width*0.25, height * 0.55, width*0.75, height * 0.55);

strokeWeight(9);
line(width*0.25, height * 0.65, width*0.75, height * 0.65);

strokeWeight(10);
line(width*0.25, height * 0.75, width*0.75, height * 0.75);
strokeweight1

Experiment: make your own example using strokeWeight() AND noFill() with a set of slightly overlapping rectangles.


S3.1.4: custom shapes

In Processing we can create our own shapes from a series of vertices. We can do by sandwiching a series of vertex(x,y) functions in between the beginShape() and endShape() functions.

For example – let’s make a funny fish:

size(700, 500);
strokeWeight(4);

// right eye
fill(255);
ellipse(width*0.6, height*0.2, 50, 100);
fill(0);
ellipse(width*0.6, height*0.15, 25, 50);

// body
fill(255);
stroke(0);
beginShape();
vertex(width*0.25, height*0.75);
vertex(width*0.35, height*0.15);
vertex(width*0.45, height*0.25);
vertex(width*0.65, height*0.25);
vertex(width*0.55, height*0.85);
endShape(CLOSE);

// left eye
fill(255);
ellipse(width*0.5, height*0.35, 50, 100);
fill(0);
ellipse(width*0.5, height*0.3, 25, 50);

processing_beginshape-011There are a number of shape mode parameters that can be passed into the beginShape() function in order to have more control over the custom shapes being produced. These include:

beginShape(POINTS)

// Draws a point at each vertex 
beginShape(POINTS);
vertex(30, 20);
vertex(85, 20);
vertex(85, 75);
vertex(30, 75);
endShape();

beginShape(LINES)

// Draws a line between each pair of vertices 
beginShape(LINES);
vertex(30, 20);
vertex(85, 20);
vertex(85, 75);
vertex(30, 75);
endShape();

beginShape(TRIANGLES)

// Connects each grouping of three vertices as a triangle 
beginShape(TRIANGLES);
vertex(75, 30);
vertex(10, 20);
vertex(75, 50);
vertex(20, 60);
vertex(90, 70);
vertex(35, 85);
endShape();

beginShape(TRIANGLE_STRIP)

// Starting with the third vertex, connects each subsequent
// vertex to the previous two 
beginShape(TRIANGLE_STRIP);
vertex(75, 30);
vertex(10, 20);
vertex(75, 50);
vertex(20, 60);
vertex(90, 70);
vertex(35, 85);
endShape();

beginShape(TRIANGLE_FAN)

beginShape(TRIANGLE_FAN);
vertex(10, 20);
vertex(75, 30);
vertex(75, 50);
vertex(90, 70);
vertex(10, 20);
endShape();

beginShape(QUADS)

beginShape(QUADS);
vertex(30, 25);
vertex(85, 30);
vertex(85, 50);
vertex(30, 45);
vertex(30, 60);
vertex(85, 65);
vertex(85, 85);
vertex(30, 80);
endShape();

beginShape(QUAD_STRIP)

beginShape(QUAD_STRIP);
vertex(30, 25);
vertex(85, 30);
vertex(30, 45);
vertex(85, 50);
vertex(30, 60);
vertex(85, 65);
vertex(30, 80);
vertex(85, 85);
endShape();

All of these shape modes are super exciting and interesting, but we won’t go into much more detail here. For more info, check out the Processing Documentation and/or the “Programming Handbook for Visual Artists and Designers” by Reas & Fry.

S4. A tiny bit more Fundamentals

Variables and Data Types

In the earlier example, we will see the use of variables like mouseX, mouseY , and mousePressed. What we know about these variables is that they store data like the location of the x- and y- coordinate of the mouse or if the mouse is pressed.

In programming languages, we can define our own variables to store data and in turn work with that data running calculations, statistics and/or visualizing it.

Before we work through an example, here are the fundamental data types in Processing, how to declare them, and assign them a value:


  • integer numbers

    • Integers are whole numbers such as 12, -120, 8, and 934.
    • Processing represents integer data with the “int” data type. With variables, we can:
    • Assign single values to a variable:
      // x is an integer equal to 10
      int x = 10;
      • This reads: x is an integer equal to 10
    • Assign an expression to a variable:
      // y is an integer equal to 5 times 5
      int y = 5 * 5;
      • This reads: y is an integer equal to the product of 5 and 5
    • Assign an expression involving other variables to a variable:
      // z is an integer equal to the sum of x and y
      int z = x - y ;
      • This reads: z is an integer equal to the sum of variable x and variable y
  • floating numbers

    • Floating-point numbers have a decimal point for creating fractions of whole numbers such as 12.8, -120.75, 8.125, and 934.82736.
    • Processing represents floating-point data with the “float” data type:
    • Assign single values to a variable:
      // i is an floating number equal to 95.7
      float i = 10.5
    • Assign an expression to a variable:
      // j is an float equal to 5.0 times 5.0
      float j = 5.0 * 5.0;
    • Assign an expression involving other variables to a variable:
      // k is a normalized difference between variables i and j
      float k = i - j / i + j ;
      • This reads: k is an float equal to the difference of variable i and variable j divided by the sum of variable i and varible j
  • strings

    • A string is a sequence of characters.
    • Processing represents string data with the “String” data type:
    • Assign single values to a variable:
      // exclamation is a string containing the sentence "hello world!"
      String exclamation = "My Gosh! ";
    • Assign an expression to a variable:
      // task is a String expression "Hey yo"
      String task = "I'm" + " " + "programming.";
      • you can combine strings using the “+” operator.
    • Assign an expression involving other variables to a variable:
      // realization is a string concatenating the variable exclamation and task
      String realization = exclamation + task;
      
      // use the text() function to render the text to the canvas
      text(realization, width/2, height/2);
  • boolean

    • A boolean is a true or false value. (we will see in the mousePressed function’s value)
    • Processing represents boolean data with the “boolean” data type:
    • Assign single values to a variable:
      // trueLove is a a boolean type set to true
      boolean trueLove = true;
    • Assign an expression to a variable:
      // falseExpression is a boolean type showing false for the expression 10 is less than 5
      boolean falseExpression = 10 < 5;
      • 10 is NOT less than 5, therefore the value of falseExpression is false.
    • Assign an expression involving other variables to a variable:
      // trueExpression is a boolean type that is equal to the expression that trueLove does not equal falseExpression
      boolean trueExpression = trueLove != falseExpression;
  • arrays

    • An array is a list of data. It is possible to have an array of any type of data. BUT they must all be the same data type.
    • Each piece of data in an array is identified by an index number representing its position in the array. The first element in the array is [0], the second element is [1], and so on (because Processing is “zero indexed”).
    • There are a few ways to create an array.
    • Assign a list values to a variable:
      // an array of integers:
      int[] d1 = {1, 2, 3, 4, 5, 6};
      
      // calculate the maximum value of the array
      int d1_max = max(d1);
      print(d1_max);
      
      // use the printArray() function to print it
      printArray(d1);
      • this reads: d1 is an array of integers (int[]) with the values – 1,2,3,4,5,6. The max value printed to the console is 6.
      • We use the printArray() function to print the numbers in the array.
    • Create an empty array, then append values to it:
      // an empty array:
      int[] d2 = {};
      
      // add numbers to it
      d2 = append(d2, 20);
      d2 = append(d2, 40);
      d2 = append(d2, 60);
      • Keep this in mind when starting to work with data and you need to iteratively add values to an array.
    • Create an array of a known size, then append values to it:
      int[] numbers = new int[3];
      numbers[0] = 90;  // Assign value to first element in the array
      numbers[1] = 150; // Assign value to second element in the array
      numbers[2] = 30;  // Assign value to third element in the array
      int a = numbers[0] + numbers[1]; // Sets variable 'a' to 240
      int b = numbers[1] + numbers[2]; // Sets variable 'b' to 180 

Don’t worry if I just lost you with arrays – they can be a little intimidating, but they are awesome, and they will hold onto our data so we can make stuff like this:

population-011

S5. Dynamic Sketches

A dynamic sketch

A dynamic sketch makes use of Processing’s setup() and draw() functions to structure a logic for interaction and animations.

Together they look something like this:

setupdraw1

credits: “Learning Processing” – Shiffman

A Responsive Ellipse-drawing tool dynamic-sketch1

Let’s break down this code line by line:

void setup() {
    size(700, 500);
    background(255);
}

void draw() {
  if (mousePressed) {
    background(255);
  }
  ellipse(mouseX, mouseY, 40, 40);
}

setup() Set up your sketch by calling all statements that only need to occur once during your drawing – this might be things like the size() of your canvas.

draw() Continuously repeat the statements within this function – continually evaluate the statements in a loop, over and over and over, forever.

Notice how within the draw() function there is something new; an if-statement.  If statements generally look like this:

// this is not 'real' code, it's just trying to describe an if-statement!
if  ('this is true') {
  then do this
}

So, what the computer is doing when it reaches the if-statement line is asking: “Is the mousePressed?  If it is, run background(255), which will clear the canvas.  If the mouse is not being pressed, the computer will not run the code within the curly brackets {}, and will skip to drawing the ellipses.  Because this code is within draw(), the computer will constantly run and re-run, and re-re-run this code.  The result on the canvas should make sense when you think of it like this!

NOTE: mousePressed is equal to TRUE whenever your mouse is pressed and FALSE otherwise. Therefore it is a variable which stores a boolean of either TRUE or FALSE in response to a mouse event.

What the above example shows is:

  • our basic understanding of conditional statement using the if statement.
  • the use of variables – in this case the in-built variables called mouseX, mouseY, and mousePressed.

Expanding ‘if’ to ‘if/else’

But let’s say we want more control over our program… what can we do? The if/else statement allows you to say:

*“if a condition is met, then execute the statements within the brackets of the”if” statement, HOWEVER if those conditions are NOT met, then run the statements within the brackets of the “else” statement.“*

// this is not 'real' code, it's just trying to describe an if/else-statement!
if ('this is true') {
  then do this
} else {
  do this
}
ifelse-11

Taking our code from above, we can write an if/else statement that draws ellipses with orange stroke when the mouse is pressed and rectangles with purple stroke if it is not pressed.

void setup() {
    size(700, 500);
    background(255);
}

void draw() {
  if (mousePressed) {
    stroke(255, 202, 26, 50);
    noFill();
    ellipse(mouseX, mouseY, 40, 40);
  } else {
    stroke(250, 15, 247, 50);
    noFill();
    rect(mouseX, mouseY, 40, 40);
}

Another dynamic example!

Here’s a sample of a sketch that tracks the speed of your mouse in the X and Y direction and increases the size of the drawn ellipse with faster mouse speed:

mouse-speed

// sets up values that do not need to change in the canvas 
void setup() {
  // setting the size of the canvas  
  size(700, 500);
  
  // setting the background colour
    background(255);
}

// the code that will be run continuously
void draw() {
  // if the mouse is pressed, set the background to 255,
  // otherwise draw an ellipse
  if (mousePressed) {
    background(255);
  }
  
  // create a variable that calculates the speed of the mouse
  // mouseX is the current location of the mouse
  // pmouseX is the previous location of the mouse
  int ellipseRX = abs(mouseX - pmouseX);
  
  // set the stroke colour
  stroke(30, 160, 200);
  
  // draw an ellipse where the X and Y radii are the value
  // of the mouse speed
  ellipse(mouseX, mouseY, ellipseRX, ellipseRX);
}

Dynamic Sketch Remix

Take a few minutes now and remix the code from one of the dynamic sketches above:

  • add 1 more new shape that responds to your mouseX and mouseY and
  • Use the conditional to affect the color of the shapes by the mousePressed.

Saving your work

When you’re programming, saving your work is super important. Let’s save our first program.

  • From the PDE, go to “save…”
  • Navigate to a folder called “Sketches” (or create one if there isn’t already)
  • Save your work with a name that doesn’t include spaces. The convention is to use “camel case”, or to replace spaces with underscores, like:
    • “myFirstSketch”
    • “my_first_sketch”

You’ll notice when you save your “sketch” the file type will be added to the end of whatever you called your program. So if you named your project “my_first_sketch”, you will see in your folder “my_first_sketch.pde”. The “.pde” is the processing file type.

How do you feel? Empowered? Confused? Excited?

Let’s learn more! *** Now that we’ve reviewed the One hour of code, let’s learn more about the fundamental concepts of programming. These fundamentals will help you to navigate Processing as well as other languages you may encounter in the future, like Javascript, python, or R (which you will encounter in a few weeks).***


Recap & Future Directions:

What we learned today

  • The fundamentals of programming in Processing – and programming in general. We covered:
    • Processing’s primitive shapes & how to make some of our own.
    • Fundamental Data types
    • Mathematical operations in Processing
    • Variable declarations and assignments
    • Conditionals (if / else if / else) and controlling program flow using logical operators
  • We learned:
    • That we can program!
    • How to begin thinking through code – how to break down problems into steps.
    • We need to practice this stuff often to get better.
  • What we didn’t learn:
    • We learned a lot of things today, but it certainly is only the tip of iceberg. There are tons of more functions and programmatic control structures we didn’t get into today. It will be up to you to learn more about these things. There are a ton of great resources out there which are listed in the references section bewow.
    • Here’s a list of some of the major things I think we didn’t cover today:
      • Different renderers (e.g. P3D, OPENGL, etc)
      • For loops
      • While loops
      • reading in tables, dictionaries, JSON
      • Reading and writing files
      • exporting to PDF or saving frames as PNG
      • sharing your sketch online
      • and the list goes on and on… 🙂

Future Directions

Endless learning

There’s so much more to learn and so much cool stuff we haven’t yet done with data! Throughout the course I hope you can reference back to this workshop as you learn more about Processing and as a way to help you understand some of the other tools we will be using like R or Mapbox Studio, and others.


Web Searching

“Programming is just about how good you are at google searching.”


Resources

Learning Processing With Dan Shiffman – this dude is handsdown the best guy ever.

“Programming Handbook for Visual Artists and Designers” by Casey Reas and Ben Fry

P5.js tutorials

“Getting Started with Processing” – Ben Fry

“Visualzing Data” – Ben Fry

Processing.org


References

“Processing, A Programming Handbook for Visual Artists and Designers” by Casey Reas and Ben Fry

“Getting Started with Processing” – Ben Fry

“Learning Processing” – Dan Shiffman


Appendix

Here’s a bunch of reference material from other versions of this workshop that just didn’t make it in. Feel free to go through some of these examples to help you understand some of the concepts we brushed over.

A1: Printing to the console:

The most basic way to get feedback from a computer is by “printing to the console.” We can do this by using the print() or println() functions. The difference between print() and println() is that print() does not start a new line break after each function call. If this seems like a bunch of hogwash, try out the examples below 🙂

print()

print("Hello World");
print("My name is Joey");
print("It is raining outside");

println()

println("Hello World");
println("My name is Joey");
println("It is raining outside");

A2: Data & Variables & Computation

What is data? Data could be anything from the time of day, the location of a bus stop, the color of your hair, and even the location of your mouse on the screen and the keys you pressed to login to your email. In general, data often consists of measurements of physical characteristics whether it be a digital photo of your dog or the precipitation in Vancouver.

Processing can store, modify, and manipulate many different kinds of data, including numbers, letters, words, colors, images, fonts, and boolean values (true, false).

One of the key features of programming is the ability to store values and/or data to variables. Below are the main data types in Processing – and programming in general – and examples of how to store them to a variable in Processing. NOTE: Processing (the java version) is strongly typed and thus we must declare what the data type will be when assigning data to a variable. If this sounds crazy, have a look at the examples below 🙂

In the following sections we are going to quickly go through:

  1. Arithmetic and mathematical functions
  2. Relational expressions (remember, what does the crocodile want to eat?)
  3. Variables & data types

Arithmetic & Functions

Here’s a list of the main arithmetic functions in Processing. There are others including functions for making exponents, doing square roots, making log transformations, etc – for a full list check the Processing Documentation.

Just give them a good look for now, we will use a number of these in the examples below. Most of these you’ve seen before.

+ (add)
- (subtract) 
* (multiply)
/ (divide)
% (modulus)
() (parentheses)
++ (increment)
-- (decrement)
+= (add assign)
-= (subtract assign)
*= (multiply assign)
/= (divide assign)
- (negation)
ceil()
floor()
round()
min()
max()

Relational Expressions

A relational expression is made up of two values that are compared with a relational operator. In Processing, two values can be compared with relational operators as follows:

    Expression          Evaluation

    3> 5                false

    3< 5                true

    5< 3                false
    
    5> 3                true

In Processing, we have these relational operators at our fingertips in order to evaluate truthy or falsy values:

    Operator            Meaning
    >                   greater than

    <                   less than

    >=                  greater than or equal to <= less than or equal to

    ==                  equivalent to
    
    !=                  not equivalent to

We will clear the air once we go through some control structures like conditionals and looping.


A3: Control structures

A3.1: Conditionals: if/else/else if statements

Conditionals allow a program to make decisions about which lines of code run and which do not. They let actions take place only when a specific condition is met. Conditionals allow a program to behave differently depending on the values of their variables.

if/else statements are ways to control the behavior of your program. This allows us to make decisions about what happens in our code. if/else statements might allow us to filter out data, change the size of an ellipse or rectangle depending on the value of a variable, or react to a mouse or key input (oooh fancy!). *** ### A simple if statement So this is the basic structure of an if statement. It basically says: “if a condition is met, execute the statements within the brackets.”

// psuedo code - this wont run in Processing
if (test) {
        statements
}

In our first sketch example (below), we played with conditional statements by using the if/else statement in conjunction with the mousePressed variable to tell the computer:

“if the mouse is pressed, then change the background color and clear the ellipses from the canvas.”

void setup() {
    size(700, 500);
    background(255);
}

void draw() {
  if (mousePressed) {
    background(255);
  }
  ellipse(mouseX,mouseY,40,40);
}

A simple if/else if/else statment

We’ve now made an if statement, and if/else statement, and now we want to add 1 more layer of control to our decision making. Here we introduce the “else if” conditional.

The else if statement allows us to say “if a condition is met, then execute the statements within the brackets of the if statement, HOWEVER if those conditions are NOT met, then evaluate whether or not the condition is met for the”else if” statement. If the condition is met in the “else if statement”, then run the statements within those brackets, if not, then check if the conditions are met in the next else if statement and so on and so forth. And if none of those are met, then run the else statement.”

// psuedo code - this wont run in Processing
if (test) {
        statements
} else if (test) {
    statements
} else if {
    statements
} 

... (as many else if statements as you'd like)...

else{
    statements
}

Challenge 2: Deconstructing an if/else if/else statement

ifelseifelse-11 Here’s some kind of an 80’s pattern you might have seen on a fanny pack.

Get into groups or talk to your neighbor and break this sketch down. Notice we introduce what looks like a new in-built variable we haven’t seen yet called frameCount and have a modulo (%) that seems to be doing something.

void setup() {
  size(700, 500);
  background(255);
  frameRate(10);
  noStroke();
}

void draw() {
  if (mousePressed) {
    if (frameCount % 4 == 0) {
      fill(#FFAB0D);
      ellipse(mouseX, mouseY, 60, 60);
    } else if (frameCount % 4 == 1) {
      fill(#0DDDFF);
      rect(mouseX, mouseY, 60, 60);
    } else if (frameCount % 4 == 2) {
      fill(#650DFF);
      triangle(mouseX, mouseY, mouseX+ 60, mouseY+60, mouseX-60, mouseY+60);
    } else {
      fill(#FF0D66);
      quad(mouseX, mouseY, mouseX, mouseY+60, mouseX-75, mouseY+60, mouseX+75, mouseY);
    }
  }
}

Wasn’t that fun?! This may seem trivial, but essentially you can now write code that is responding to data and producing visuals based on the control structures you’ve put in place.


A3.2:Logical Operators

We can have multiple “tests” conditions in an if or else if statement. Using the logical operators (such as the ones you see below) allow you to combine test conditions in an if/else if/ else statement makes it possible to do this.

Operator            Meaning

&&                  AND 

||                  OR 

!                   NOT 


Expression          Evaluation

true && true        true
true && false       false
false && false      false
true || true        true
true || false       true
false || false      false
!true               false
!false              true

Challenge 3: A simple drawing and erasing tool

drawerase1

Run this code and see what happens when you:

  1. press the mouse
  2. Press any key on your keyboard
  3. press both the mouse and keys on your keyboardvoid setup() { size(700, 500); background(0); rectMode(CENTER); noStroke(); }void draw() { if (mousePressed && keyPressed) { background(0);} else if (mousePressed){ fill(255); rect(mouseX, mouseY, 60, 60); } else if (keyPressed) { fill(0); rect(mouseX, mouseY, 60, 60); }}

Add comments to the code at each conditional statement in the draw() function telling us what will happen if that stement is TRUE.


A3.3: repetition

Computers are excellent at executing repetitive tasks accurately and quickly. Modern computers are also logic machines. Building on the work of the logicians Leibniz and Boole, modern computers use logical operations such as AND, OR, and NOT to determine which lines of code are run and which are not.

A3.3.1: The for loop

The for loop is probably the coolest thing since ever. At the most basic level for loops allow your to repeat a statement or statements over and over again until a condition is met.

Why are for loops so awesome? Let’s look at an example of how a loop can be used to take a program which takes 14 lines of code to make 14 lines…

// original hard-coded program
size(200, 200);
line(20, 20, 20, 180);
line(30, 20, 30, 180);
line(40, 20, 40, 180);
line(50, 20, 50, 180);
line(60, 20, 60, 180);
line(70, 20, 70, 180);
line(80, 20, 80, 180);
line(90, 20, 90, 180);
line(100, 20, 100, 180);
line(110, 20, 110, 180);
line(120, 20, 120, 180);
line(130, 20, 130, 180);
line(140, 20, 140, 180);

… and make the program 4 lines of code:

size(200, 200);
for (int i = 20; i < 150; i += 10) {
  line(i, 20, i, 180);
}

The general structure if described in the image below.

forloops1 Credits: “A Programming Handbook for Visual Artists and Designers” – Reas & Fry

Let’s look closer at the specific “for” structure:

for (int i = 20; i < 80; i += 5) {
        line(20, i, 80, i+15);
}

If you were to read this out loud it would sound like: “for the integer i equals 20, as long as i is less than 80, increment i by 5, drawing a line each time with new y coordinates based on i.”


Let’s get a little crazy and make a nested loop. A nested loop? You can do that?! Sure!

size(400, 400);
noStroke();

for (int i = 20; i < width - 20; i += 20) {
  for (int j = 20; j < height - 20; j += 20) {
        fill(random(255), random(255), random(255));
        ellipse(i, j, 10, 10);
    }
}
processing_nestedloop1

How does a nested loop work? First, the outer loop starts. Then the inner loop runs until the condition is met at which point the outer loop increments, repeating the inner loop again until the condition is met at which point the outer loop increments, and over and over until the outer loop condition is met. This means that columns of ellipses are drawn from left to right with each complete loop.

*NOTE: the random() function generates a random number from 0 to the value of the input parameter. In this case, for each iteration in the loop, a random number is generated between 0 and 255 for each parameter in the fill() function.


A4: Synthesis:

As our last examples, we will work with arrays to make some simple animations. We will use the following concepts we’ve picked up in this workshop:

  • variables:
    • mousePressed, keyPressed
  • conditionals
    • if statements
  • increment
    • i++
  • shape properties
    • e.g. stroke, strokeWeight
  • shapes
    • point, line
  • random()
  • looping:
    • for loops!
    • draw function

A4.1 Synthesis 1: Animated data line

aloha_drawarray11
// Declare the arrays - globally
int[] y = {};
int[] x = {};

// declare a counter variable i to increment through the arrays later
int i = 0;

// set the size and background color
void setup() {
  size(700, 500);
  background(255);
}

// continuously evaluate the draw function
void draw() {
  if (mousePressed){
    // append the x & y coordinates of the mouse to array x and y
    x = append(x, mouseX);
    y = append(y, mouseY);
    // draw a point with a strokeWeight 2 where the mouse is
    strokeWeight(2);
    stroke(150, 150, 150, 80);
    point(mouseX, mouseY);
  }
    // when you press a key and 
    // as long as the index is less than the length of the array 
    // so we don't run out of data points to draw
  if (keyPressed && i+1 < y.length) {
    // increment i by 1 
    i++;
    // choose a random stroke color
    stroke(random(255), random(255), random(255), 60);
    // use the modulo to select a strokeWeight from 0 to 8 
    // as the numbers increment
    strokeWeight(i % 8);
    // draw a line using the data from the array by 
    // selecting the index positions 
    line(x[i-1], y[i-1], x[i], y[i]);
  }
}
  • up until this point, we haven’t seen the difference between global and local variable declaration. This is what we call variable scope. While global variables can be accessed across your entire program, local variables are limited to the functions that contain them.
  • in this case, if we declared our array x and y in the setup fuction, we wouldnt be able to access that data in the draw function because it would be “stuck” inside the setup function and therefore not accessible in the draw function.

A4.2 Synthesis 2: Animated taxi trail

Let’s make an animation of a taxi cab in Vancouver.

*** before you continue: 1. save your sketch as “taxi_trail”, 2. copy the “assets/data” folder into your “taxi_trail” folder you just created.***

taxi-anim1
// To read in an image: 
// 1. instantiate the PImage class & 
// 2. declare the image object 
PImage backgroundmap;

// An array of arrays of coordinates of a taxi in Vancouver
int[][] coords = {{274,487},{274,487},{318,441},{328,428},{341,419},{353,410},{362,399},{365,392},{407,426},{447,446},{462,455},{506,473},{517,478},{526,484},{547,493},{551,497},{556,503},{559,507},{559,507},{566,512},{566,512},{573,518},{573,518},{581,524},{586,533},{598,542},{598,547},{584,560},{578,568},{567,578},{556,586},{549,597},{529,613},{523,619},{523,619},{521,622},{548,646},{562,658},{566,667},{556,681},{551,685},{534,668},{517,657},{508,648},{501,642},{491,656},{468,680},{457,671},{445,653},{430,636},{414,622},{398,607},{382,595},{350,571},{334,556},{346,544},{359,533},{339,504},{329,493},{319,483},{326,476},{335,468},{342,460},{336,448},{327,443},{315,451},{304,459},{293,466},{288,471},{278,479},{271,488},{267,476},{264,466},{257,456},{247,448},{237,442},{237,442},{231,436},{231,436},{224,430},{201,416},{200,415},{197,396},{200,381},{200,369},{200,369},{198,355},{189,345},{184,339},{201,335},{214,328},{228,324},{248,321},{261,316},{278,308},{314,302},{324,310},{326,313},{336,322},{339,316},{354,310},{361,311},{386,304},{394,298},{420,307},{422,310},{430,321},{437,321},{445,307},{454,296},{476,291},{481,293},{498,294},{507,288},{514,286},{533,284},{548,287},{556,281},{561,277},{561,270},{561,248},{552,254},{547,259},{532,262},{529,261},{511,254},{505,252},{476,244},{463,235},{452,223},{449,222},{419,214},{411,212},{403,207},{390,192},{383,176},{374,161},{353,148},{346,139},{341,128},{338,102},{330,88},{317,76},{306,64},{295,51},{288,84},{285,109},{263,108},{254,97},{260,85},{265,77},{268,63},{262,54},{260,42},{254,36},{237,39},{222,53},{206,62},{186,72},{178,90},{166,99},{153,105},{135,125},{130,150},{140,182},{140,205},{131,222},{120,230},{101,236},{89,256},{86,282},{93,293},{113,302},{133,303},{156,317},{170,328},{178,332},{196,333},{233,324},{264,322},{274,316},{289,301},{306,298},{320,303},{329,323},{331,345},{344,367},{355,378},{372,392},{390,409},{396,420},{414,438},{431,450},{439,460},{452,467},{460,476},{461,478},{448,502},{433,498},{424,490},{409,471},{401,463},{395,459},{393,467},{383,472},{375,467},{369,462},{363,467},{360,471},{360,476},{374,484},{386,495},{388,498},{392,504},{401,512},{402,513},{415,522},{422,532},{423,539},{413,544},{405,550},{396,556},{394,562},{402,567},{416,578},{421,582},{417,598},{397,616},{388,624},{373,632},{369,639},{366,647},{366,652},{376,661},{383,666},{377,676},{365,679},{348,688},{340,693},{333,697},{307,709},{291,717},{269,728},{256,735},{236,743},{228,757},{229,771},{228,782},{241,783},{265,783},{272,783},{275,784},{276,785},{281,800},{284,815},{299,826},{302,824},{310,805},{321,799},{345,784},{355,772},{357,770},{384,744},{387,736},{391,730},{399,725},{404,724},{417,721},{423,716},{432,706},{435,705},{446,698},{452,692},{478,665},{485,658},{487,655},{501,647},{513,632},{525,621},{538,608},{549,598},{556,591},{576,573},{589,551},{595,544},{609,552},{620,573},{632,582},{653,593},{681,599},{701,604},{737,606},{764,603},{765,613},{765,632},{763,650},{764,657},{783,658},{803,664},{804,679},{785,671},{752,663},{727,669},{709,671},{680,671},{661,666},{651,662},{638,649},{619,639},{601,624},{596,620},{589,614},{578,603},{571,594},{564,586}};

// global counter variable i to increment the animation
int i = 0;

// set the size and background color
void setup() {
  size(960, 840);
  background(255);
  // read in the map image
  // Images must be in the "data" directory to load correctly
  backgroundmap = loadImage("backgroundmap_960x840.png");
  
  // reduce the frame rate to slow the animation
  frameRate(20);
}


// the draw function
void draw(){
  // load the image using the image() function 
  //so the map is drawn every frame of the animation
  image(backgroundmap, 0, 0);
  
  // add annotations
  //  fill(255);
  //  stroke(255);
  //  rect(width*0.60, 35, 500, 50);
  //  
  //  fill(255, 0, 0);
  //  textSize(32);
  //  text("click to show the route", width*0.62, 40, width*0.75, 200);

  // animate the taxi as long as there data data available
  // by incrementing i by 1 each frame
  if(i+1 < coords.length){
    i++;
    fill(255, 214, 46, 75);
    strokeWeight(3);
    stroke(255, 214, 46);
    ellipse(coords[i][0], coords[i][1], 20, 20);
  }
  
  // if the mouse is pressed, draw the entire route of the taxi in red
  if(mousePressed){
    // using a for loop, we iterate through all the data points
    // to draw the whole taxi route 
    for(int j = 0; j < coords.length - 1; j++){
      strokeWeight(4);
      stroke(255, 0, 0, 85);
      // draw a line from position x1, y1 to x2, y2
      line(coords[j][0], coords[j][1], coords[j+1][0], coords[j+1][1]);
    }
  }

}
  • we add some new elements, one of which is the PImage class. Don’t worry too much about this now as it relates to object oriented programming. The important thing to know is that the PImage class allows us to read in images and work with them in our sketches.

Whew! That’s a lot of stuff. Take some time to play around with the code, decompress, have some coffee and a snack. If you made it this far you are amazing. I don’t know about you but my mind is blown!

S8: More on math

Math can be an important aspect of programming, but it’s not necessary to be good at math to understand or enjoy programming.

A5.1: arithmetic & functions – more examples


A5.1.1: Add

 int x;
 int y;
 int z;
 
 x = 10;
 y = 5;
 z = x + y;
 
 println(z);  // z will equal 15
 
 // let's take z and set the background color
 background(z);
 
 // then create an ellipse and set the fill(z, z+100, z+20)
 fill(z, z+100, z+20);
 ellipse(width/2, height/2, 50, 50);

Experiment: Using addition and variable declarations, come up with your own exmple of how you might use a the combination to parameterize your drawing *** ###S8.1.2:subtract

int x;
int y;
int z;
 
x = 10;
y = 5;
z = x - y;
 
println(z);  // z will equal 5

// let's make a yellow ellipse with a strokeWeight(z)
strokeWeight(z);
fill(#FFF700);
ellipse(width/2, height/2, 50, 50);

Experiment: Change the numbers in the sketch to see how changing the value of the varibles affects your shapes. *** ###S8.1.3:multiply int x; int y; int z;

x = 10;
y = 5;
z = x * y;
 
println(z);  // z will equal 50

// let's make a rectangle that's z pixels wide and z*2 pixels high
size(400, 400);
rect(width/2, height/2, z, z*2);

Experiment: Change the numbers in the sketch to see how changing the value of the varibles affects your shapes.


A5.1.2: divide

int x;
int y;
int z;
 
x = 10;
y = 5;
z = x / y;
 
println(z);  // z will equal 2

// let's make 3 ellipses adjusting their width and height by the z value, in this case just dividing by 2
size(400, 400);

fill(z+240, z+15, z*4);
ellipse(width/2, height/2, 100/z, 100/z);

fill(z, z*60, z+60);
ellipse(width/2, height*0.25, 50/z, 50/z);

fill(z, z*35, z+200);
ellipse(width/2, height*0.75, 50/z, 50/z);

A5.1.3: modulo

The modulo returns the remainder of the division between two numbers

int x; 
int y;
int z;
int j;
int i;

x = 10;
y = 5;
j = 3;

z = x % y;
i = x % j;

println(z); // z will equal 0 since the remainder of 10/5 is 0
println(i); // i will return 1 since the remainder of 10/3 is 1

// Here's an if/else statement - we're going to learn about these in section: Control!
if (z == 1){
    fill(#00F9FF); // if z == 1, then color the ellipse turqoise
} else {
    fill(#FE00FF); // if z does not equal 1, then color the ellipse purple
}

ellipse(width/2, height/2, 50, 50); // we'll see a purple ellipse! 

A5.1.4: parentheses

The parentheses can be used to set up your mathematical order of operations (remember PEMDAS?). Be sure to keep the order of operations in mind when writing your statements.

int x = 3 + 4 * 5; // Assign 23 to x 
int y = (3 + 4) * 5; // Assign 35 to y

println(x);
println(y);

A5.1.5: increment

The increment is a shortcut for adding 1 value to an existing variable.

// incrementing the long way;
int x = 0;
x = x + 1;
println(x); // x will print as 1;

// incrementing the short way:
int y = 0;
y++;
println(y); // y will print as 1;

Experiment: use the increment to change the stroke size of 3 lines


A5.1.6: decrement

// decrementing the long way;
int x = 10;
x = x - 1;
println(x); // x will print as 9;

// incrementing the short way:
int y = 10;
y--;
println(y); // y will print as 9;

Experiment: use the increment to change the stroke size of 3 lines


A5.1.7: += (add assign), -= (subtract assign), *= (multiply assign), /= (divide assign)

The add assign, subtract assign, multiply assign, and divide assign allow you to add, subtract, multiply, or divide a variable by a specified value:

// add assign
float a = 10.0;
a += 10;
println(a);     // a will print as 20.0;

// subtract assign
float b = 35.6;
b -= 0.6;
println(b);     // b will print as 35.0

// multiply assign
int c = 100;
c *= 3;
println(c);     // c will print as 300

// divide assign
int d = 36;
d /= 4;
println(d);     // d will print as 8

A5.1.8: ceil()

The ceil() function rounds whatever parameter up to the nearest whole number

    // round 32.4 up to 33.0
    float y = ceil(32.4);
    println(y); 
    
    // round 67.7 up to 68
    float x = ceil(67.7);
    println(x);
    

A5.1.9: floor()

The floor() function rounds whatever parameter down to the nearest whole number

    // round 32.4 down to 32.0
    float y = floor(32.4);
    println(y); 
    
    // round 67.7 up to 67.0
    float x = floor(67.7);
    println(x);
    

A5.1.10: round()

The round() function returns the closest value.

    // round 32.4 to 32.0
    float y = round(32.4);
    println(y); 
    
    // round 67.7 up to 68.0
    float x = round(67.7);
    println(x);

A5.1.11: min(), max()

The min() function returns the minimum value from a sequence of at least 2, but maximum 3 parameter. You can also apply the min() function on an array to return the minimum value in an array of numbers.

    int u = min(5, 9); // Assign 5 to u
    
int v = min(-4, -12, -9); // Assign -12 to v 

float w = min(12.3, 230.24); // Assign 12.3 to w 

The max() function returns the maximum value from a sequence of at least 2, but maximum 3 parameters. You can also apply the max() function on an array to return the maximum value in an array of numbers.

int x = max(5, 9); // Assign 9 to x

int y = max(-4, -12, -9); // Assign -4 to y 

float z = max(12.3, 230.24); // Assign 230.24 to z

A6: More on Control

The programs we’ve seen so far run each line of code in sequence. They run the first line, then the second, then the third, etc. The program stops when the last line is run. It’s often beneficial to change this order-sometimes skipping lines or repeating lines many times to perform a repetitive action. Although the lines of code that comprise a program are always positioned in an order from top to bottom on the page, this doesn’t necessarily define the order in which each line is run. This order is called the flow of the program. Flow can be changed by adding elements of code called control structures.

A6.1 conditionals & decision making

A6.1.1:Relational Expressions

A few examples of these:

>

// 3 is greater than 5 is FALSE
println(3 > 5); // Prints "false"

<

// 3 is less than 5 is TRUE
println(3 < 5); // Prints "true"

<=

// 3 is less than or equal to 5 is TRUE
println(3 <= 5); // Prints "true"

>=

// 3 is greater than or equal to 5 is FALSE
println(3 >= 5); // Prints "false"

==

// 3 is equal 5 is FALSE
println(3 == 5); // Prints "false"

!= // 3 does not equal 5 is TRUE println(3 != 5); // Prints “true”

 

 

A7: Processing for websites!

P5.js

I have to mention P5.js which is the javascript library inspired by Processing not only because it is super cool, but because I, along with many others, believe the age of the web is here! P5.js is a javascript library that takes the soul of Processing and implements it in javascript meaning that you can sketch with code using web technologies (html, css, & javascript). I would argue that the syntax is friendlier and is more flexible in terms of sharing your work. Besides the fact that the browser can crash if you send it a ton of data, P5.js may be an interesting way to move out of your desktop and onto the web.

*NOTE: I decided not to teach P5.js since I thought introducing html, css, and javascript would be maybe too much for the first class. Please let me know if you have other thoughts!

P5.js website:

http://p5js.org/

P5.js tutorials:

http://p5js.org/tutorials/