---
title: "Cartograflow"
subtitle: "Filtering Origin-Destination Matrix for Thematic Flow Mapping"
author: "Françoise Bahoken, Sylvain Blondeau"
date: "`r Sys.Date()`"
output: html_vignette
vignette: >
\usepackage[ps2pdf,
bookmarks=true,
dvipsone=pdftex,
backref,
]{hyperref}
%\VignetteIndexEntry{CartogRaflow}
%\VignetteEncoding{UTF-8}
%\SweaveUTF8
%\VignetteEngine{knitr::rmarkdown}
#editor_options:
#chunk_output_type: inline
---
`Cartograflow` is designed to filter origin-destination (OD) flow matrices for thematic mapping purposes.
## Description of functions
**1. Preparing flow data sets:**
**1.1 General functions**
Flow datasets can be in long ("L") or square matrix ("M", [n*n]) format.
-- `flowtabmat()` transforms the "L" format into the "M" format and vice versa, and can also build an empty square matrix from spatial codes.
-- `flowcarre()` squares a matrix.
-- `flowjointure()` performs a spatial join between a flow dataset and a spatial features layer or an external matrix.
-- `flowstructmat()` fixes an unexpected shift of codes in a flow dataset in "M" format. If necessary, use it together with `flowjointure()` and `flowtabmat()`.
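As an illustration of the two formats, here is a minimal base R sketch that converts a toy long table into a square matrix and back. It does not call `flowtabmat()`; the object names (`tab`, `mat`, `long`) and the values are made up for the example.

```r
# Toy flow table in long ("L") format: origin i, destination j, flow Fij
tab <- data.frame(
  i   = c("A", "A", "B", "C"),
  j   = c("B", "C", "A", "A"),
  Fij = c(10, 5, 3, 8),
  stringsAsFactors = FALSE
)

# Long -> square matrix ("M"): rows and columns share one set of codes,
# so the matrix stays square even if a place never sends or never receives
codes <- sort(unique(c(tab$i, tab$j)))
mat <- matrix(0, nrow = length(codes), ncol = length(codes),
              dimnames = list(codes, codes))
mat[cbind(tab$i, tab$j)] <- tab$Fij

# Square matrix -> long format, one row per (i, j) cell (column-major order)
long <- data.frame(
  i   = rep(rownames(mat), times = ncol(mat)),
  j   = rep(colnames(mat), each = nrow(mat)),
  Fij = as.vector(mat),
  stringsAsFactors = FALSE
)
```

The long table produced from the matrix has n*n rows, including the empty cells, which is why the matrix format is heavier for sparse flow data.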
**1.2. Flow computation:**
-- `flowtype()` computes several types of flow from an asymmetric matrix:
x= `flux` for keeping the initial flow (Fij)
x= `transpose` for the reverse flow value (Fji)
x= `bivolum` for the bilateral volume, i.e. the gross flow (FSij)
x= `bibal` for the bilateral balance, i.e. the net flow (FBij)
x= `biasym` for the asymmetry of the bilateral flow (FAij)
x= `bimin` for the minimum of the bilateral flow (minFij)
x= `bimax` for the maximum of the bilateral flow (maxFij)
x= `birange` for the bilateral flow range (rangeFij)
x= `bidisym` for the bilateral dissymmetry (FDij)
x= `alltypes` for computing all the above types
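The indicators listed above can be reproduced on a toy matrix with base R. This is only a sketch of the arithmetic, not the code of `flowtype()`. The asymmetry is assumed here to be the ratio of net to gross flow, and the dissymmetry is left out because its formula is not stated in this vignette.

```r
# Toy asymmetric flow matrix: rows are origins (i), columns are destinations (j)
Fij <- matrix(c(0, 10, 5,
                3,  0, 0,
                8,  2, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

Fji      <- t(Fij)           # reverse flow
FSij     <- Fij + Fji        # bilateral volume (gross flow)
FBij     <- Fij - Fji        # bilateral balance (net flow)
minFij   <- pmin(Fij, Fji)   # minimum of the two directions
maxFij   <- pmax(Fij, Fji)   # maximum of the two directions
rangeFij <- maxFij - minFij  # bilateral range

# Asymmetry: ASSUMED to be net / gross, set to 0 where there is no flow at all
FAij <- ifelse(FSij > 0, FBij / FSij, 0)
```

The volume, minimum, maximum and range matrices are symmetric, whereas the balance is antisymmetric (FBij = -FBji).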
-- `flowplaces()` computes several place-oriented flow indicators from an asymmetric matrix,
i.e. a dataframe that describes the flows from the origin or destination point of view:
x= `ini` for the number of incoming links (in-degree)
x= `outi` for the number of outgoing links (out-degree)
x= `degi` for the total number of links (in- and out-degrees)
x= `intra` for the total intra-zonal interaction (if the main diagonal is not empty)
x= `Dj` for the total flow received by place (j)
x= `voli` for the total volume of flow per place
x= `bali` for the net balance of flow per place
x= `asyi` for the asymmetry of flow per place
x= `allflowplaces` for computing all the above indicators
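A rough base R equivalent of these place indicators is sketched below on a toy matrix. It is not the implementation of `flowplaces()`. The column `Oi` (total flow emitted), the sign of the balance (received minus emitted) and the asymmetry as balance over volume are assumptions made for the example.

```r
# Toy asymmetric flow matrix with an empty main diagonal
Fij <- matrix(c(0, 10, 5,
                3,  0, 0,
                8,  2, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

links <- Fij > 0
diag(links) <- FALSE        # intra-zonal cells are not counted as links here

ini   <- colSums(links)     # number of incoming links (in-degree)
outi  <- rowSums(links)     # number of outgoing links (out-degree)
degi  <- ini + outi         # total number of links
intra <- diag(Fij)          # intra-zonal interaction
Oi    <- rowSums(Fij)       # total flow emitted (hypothetical column name)
Dj    <- colSums(Fij)       # total flow received
voli  <- Oi + Dj            # total volume per place
bali  <- Dj - Oi            # net balance per place (ASSUMED sign convention)
asyi  <- ifelse(voli > 0, bali / voli, 0)  # ASSUMED: balance / volume

places <- data.frame(code = rownames(Fij), ini, outi, degi, intra,
                     Oi, Dj, voli, bali, asyi, stringsAsFactors = FALSE)
```

Each row of `places` describes one place, so the table can be joined to the spatial layer by its code.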
**1.3. Flow reduction:**
-- `flowlowup()` extracts the upper or the lower triangular part of a matrix, preferably a symmetric one.
x= `up` for the part above the main diagonal
x= `low` for the part below the main diagonal
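For a symmetric matrix such as a bilateral volume, keeping one triangle avoids drawing each pair twice. The following base R sketch shows the idea with `upper.tri()` and `lower.tri()`; it does not call `flowlowup()`, and the matrix values are invented.

```r
# Toy symmetric matrix (e.g. bilateral volume FSij)
FSij <- matrix(c( 0, 13, 13,
                 13,  0,  2,
                 13,  2,  0),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# Part above the main diagonal: everything else is set to 0
up <- FSij
up[!upper.tri(up)] <- 0

# Part below the main diagonal
low <- FSij
low[!lower.tri(low)] <- 0
```

Both triangles carry the same information when the matrix is symmetric, so either one is enough for mapping.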
-- `flowreduct()` reduces the flow dataset with respect to another matrix, e.g. distances travelled.
`metric` is the metric of the distance matrix:
- metric= `continuous` (e.g. for kilometers)
- metric= `ordinal` (e.g. for `k` contiguity)
If the metric is `continuous` (e.g. for filtering flows by kilometric distance travelled), use:
`d.criteria` to select the minimum or the maximum distance criterion
- d.criteria= `dmin` for keeping only flows longer than a _dmin_ criterion in km
- d.criteria= `dmax` for keeping only flows shorter than a _dmax_ criterion in km
`d` is the value of the selected `dmin` or `dmax` criterion.
Note that these arguments can be used as a filter criterion in `flowmap()`.
See Cartograflow_distance and Cartograflow_ordinal_distance Vignettes for examples.
URL: https://github.com/fbahoken/cartogRaflow/tree/master/vignettes
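The logic of a continuous distance filter can be sketched in a few lines of base R. The helper `filter_by_distance()` below is hypothetical and is not part of the package. It assumes that `dmin` keeps flows strictly longer than `d` and that `dmax` keeps flows strictly shorter than `d`.

```r
# Toy long table with a distance column already joined to the flows (in km)
tab <- data.frame(
  i        = c("A", "A", "B", "C"),
  j        = c("B", "C", "C", "A"),
  Fij      = c(10, 5, 3, 8),
  distance = c(2, 8, 15, 30),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep flows on one side of the distance criterion d
filter_by_distance <- function(tab, d.criteria = c("dmin", "dmax"), d) {
  d.criteria <- match.arg(d.criteria)
  keep <- if (d.criteria == "dmin") tab$distance > d else tab$distance < d
  tab[keep, , drop = FALSE]
}

short_flows <- filter_by_distance(tab, "dmax", d = 10)  # flows under 10 km
long_flows  <- filter_by_distance(tab, "dmin", d = 10)  # flows over 10 km
```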
**2. Flows filtering:**
**2.1. Filtering from flow concentration analysis**
**Flow concentration analysis:**
-- `flowgini()` performs a Gini concentration analysis of the flow features by computing the _Gini coefficient_ and plotting an interactive _Lorenz curve_.
To be used before `flowanalysis()`.
See the Cartograflow_concentration vignette for an example.
URL: https://github.com/fbahoken/cartogRaflow/tree/master/vignettes
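To make the concentration analysis concrete, the sketch below computes a Lorenz curve and a Gini coefficient for a toy vector of flows in base R. The function `gini_coef()` is a hypothetical helper using the trapezoid rule; the estimator used by `flowgini()` may differ.

```r
# Hypothetical helper: Gini coefficient from the area under the Lorenz curve
gini_coef <- function(flows) {
  x <- sort(flows)                      # ascending order
  n <- length(x)
  lorenz <- c(0, cumsum(x) / sum(x))    # cumulative share of flow, from 0 to 1
  # Trapezoid rule with n equal steps of width 1 / n on the x axis
  area <- sum((lorenz[-1] + lorenz[-(n + 1)]) / 2) / n
  1 - 2 * area
}

flows <- c(1, 2, 3, 4, 10)
gini  <- gini_coef(flows)

# Coordinates of the Lorenz curve, ready to be plotted
lorenz_links <- c(0, seq_along(flows) / length(flows))    # share of links
lorenz_flows <- c(0, cumsum(sort(flows)) / sum(flows))    # share of flow
# plot(lorenz_links, lorenz_flows, type = "l"); abline(0, 1, lty = 2)
```

A coefficient close to 0 means that the flow is spread evenly over the links; a value close to 1 means that a few links carry most of it, which is when filtering is most effective.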
**Flow filtering according to a concentration criterion:**
-- `flowanalysis()` computes filter criteria based on:
- argument `critflow` filters the flows according to their significance (% of the total flow information);
- argument `critlink` filters the flows according to their density (% of the total features).
These arguments can be used as the `filter` criterion in `flowmap()`.
See the Cartograflow_concentration vignette for an example.
URL: https://github.com/fbahoken/cartogRaflow/tree/master/vignettes
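The two criteria can be read as two ways of turning a percentage into a flow threshold. The base R sketch below illustrates this reading with a hypothetical helper, `flow_threshold()`; it is an interpretation of the description above, not the code of `flowanalysis()`.

```r
# Hypothetical helper: smallest flow value to keep so that the selected flows
# reach a share p of the total flow (by = "flow") or of the links (by = "link")
flow_threshold <- function(flows, p, by = c("flow", "link")) {
  by <- match.arg(by)
  x <- sort(flows, decreasing = TRUE)   # strongest flows first
  share <- if (by == "flow") cumsum(x) / sum(x) else seq_along(x) / length(x)
  x[which(share >= p)[1]]
}

flows <- c(50, 20, 15, 10, 5)

# Keep the strongest flows that together carry at least 80% of the total flow
t_flow <- flow_threshold(flows, 0.8, by = "flow")

# Keep at least the strongest 30% of the links
t_link <- flow_threshold(flows, 0.3, by = "link")
```

Either value can then be passed as a threshold when mapping, so that only flows at or above it are drawn.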
**2.2. Spatial / territorial filtering of flows**
**Flow filtering based on a continuous distance criterion**
-- `flowdist()` computes a _continuous distance_ matrix from spatial features (areas or points). The result is a matrix of the distances travelled between ODs, with flows filtered or not.
See the Cartograflow_distance vignette for an example.
URL: https://github.com/fbahoken/cartogRaflow/tree/master/vignettes
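As a minimal illustration of a continuous distance matrix, the sketch below uses base R `dist()` on toy point coordinates and attaches the distance to each flow. It assumes Euclidean distances in the units of the coordinates and does not call `flowdist()`; all names and values are invented.

```r
# Toy places with planar coordinates (e.g. centroids of areas)
pts <- data.frame(code = c("A", "B", "C"),
                  X = c(0, 3, 0),
                  Y = c(0, 4, 10),
                  stringsAsFactors = FALSE)

# Euclidean distance matrix between all pairs of places
dmat <- as.matrix(dist(pts[, c("X", "Y")]))
dimnames(dmat) <- list(pts$code, pts$code)

# Attach the distance travelled to each flow of a long table
tab <- data.frame(i = c("A", "A", "C"),
                  j = c("B", "C", "B"),
                  Fij = c(10, 5, 2),
                  stringsAsFactors = FALSE)
tab$distance <- dmat[cbind(tab$i, tab$j)]
```

With real data the coordinates should be in a projected reference system, otherwise Euclidean distances are not meaningful.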
**Flow filtering based on an ordinal distance / neighbourhood criterion**:
-- `flowcontig()` computes an _ordinal distance_ matrix from spatial features (areas). The result is a matrix of adjacency or k-contiguity of the ODs.
- `background` is the areal spatial features layer;
- `code` is the spatial features code;
- `k` is the order (k = 1, 2, ..., k) of the contiguity matrix to be constructed: if k=1, the OD places are adjacent and the flow crosses only one boundary; otherwise (k=k), the OD places are k borders apart;
- `algo` is the algorithm used to compute the ordinal distance (default is "automatic", for Dijkstra's).
Note that the function automatically returns the maximum (k) number of the spatial layer.
See the Cartograflow_ordinal_distance vignette for an example.
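The notion of k-contiguity can be illustrated without any spatial layer: given an adjacency matrix, the ordinal distance between two places is the smallest number of borders to cross. The sketch below uses a breadth-first search in base R. The function `ordinal_distance()` is hypothetical, and `flowcontig()` itself is described above as relying on Dijkstra's algorithm instead.

```r
# Toy adjacency matrix for four areas arranged in a chain: A - B - C - D
adj <- matrix(0, 4, 4, dimnames = list(LETTERS[1:4], LETTERS[1:4]))
adj[cbind(c("A", "B", "C"), c("B", "C", "D"))] <- 1
adj <- adj + t(adj)   # adjacency is symmetric

# Hypothetical helper: number of borders between every pair of areas
ordinal_distance <- function(adj) {
  n <- nrow(adj)
  k <- matrix(Inf, n, n, dimnames = dimnames(adj))
  diag(k) <- 0
  frontier <- diag(n)   # pairs reached in exactly (step - 1) crossings
  step <- 0
  repeat {
    step <- step + 1
    # Pairs reachable with one more crossing and not yet assigned a distance
    new <- ((frontier %*% adj) > 0) & is.infinite(k)
    if (!any(new)) break
    k[new] <- step
    frontier <- new * 1
  }
  k   # pairs that cannot be connected keep the value Inf
}

k_mat <- ordinal_distance(adj)
k_max <- max(k_mat[is.finite(k_mat)])   # maximum k of the layer
```

Flows between adjacent areas are then those with `k_mat == 1`, and `k_max` gives the largest order that makes sense for the layer.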
**3. Flow mapping**
-- `flowmap()` plots flows as segments or arrows, controlled by the following arguments:
- `filter` sets whether or not to filter the flow information or features
- `threshold` sets the filtering level of the flows when `filter=TRUE`
- `taille` is the width of the flow features
- `a.head` is the arrow head parameter (in, out, in and out)
- `a.length` is the length of the edges of the arrow head (in inches)
- `a.angle` is the angle from the shaft of the arrow to the edge of the arrow head
- `a.col` is the arrow's color
- `plota` adds spatial features as a map background to the flow plot
- `add` allows the flow features to be overlaid on an external spatial features background
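The arrow arguments read like those of base R `arrows()` (`length`, `angle`, `code`, `col`, `lwd`). Assuming that correspondence, the sketch below draws thresholded toy flows with base graphics only. It is not how `flowmap()` is implemented; the node table, the strict `>` comparison and the width scaling are assumptions made for the example.

```r
# Toy nodes (coordinates) and flows
nodes <- data.frame(code = c("A", "B", "C"), X = c(0, 4, 2), Y = c(0, 0, 3),
                    stringsAsFactors = FALSE)
tab <- data.frame(i = c("A", "B", "C", "A"), j = c("B", "C", "A", "C"),
                  Fij = c(100, 40, 10, 70), stringsAsFactors = FALSE)

# Filtering: keep only the above-average flows
threshold <- mean(tab$Fij)
sel <- tab[tab$Fij > threshold, ]

# Line width proportional to the flow, with a maximum width of 8
lwd <- 8 * sel$Fij / max(sel$Fij)

# Positions of the origin and destination nodes
o <- match(sel$i, nodes$code)
d <- match(sel$j, nodes$code)

pdf(NULL)   # off-screen device; drop this line and dev.off() when interactive
plot(nodes$X, nodes$Y, pch = 19, asp = 1, axes = FALSE, xlab = "", ylab = "")
arrows(nodes$X[o], nodes$Y[o], nodes$X[d], nodes$Y[d],
       length = 0.11,  # length of the arrow head edges, in inches
       angle = 30,     # angle between the shaft and the head edges
       code = 2,       # head at the destination end
       lwd = lwd, col = "#138913")
invisible(dev.off())
```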
## Examples of applications
-- **Useful packages**
Recommended external R packages:
{dplyr} {sf} {igraph} {rlang} {cartography}
```{r include=FALSE, message=FALSE}
rm(list=ls())
library(sf)
library(dplyr)
library(cartograflow)
library(cartography)
knitr::opts_chunk$set(fig.width=6, fig.height=6)
```
**1. Load datasets**
--------------------
**Flow dataset**
```{r flowdata_preprocess, warning=FALSE, echo=TRUE}
# Load Statistical information
tabflow<-read.csv2("./data/MOBPRO_ETP.csv", header=TRUE, sep=";",stringsAsFactors=FALSE,
encoding="UTF-8", dec=".",check.names=FALSE)
```
```{r var_typing, echo=FALSE, warning=FALSE}
# Variable typing
tabflow$i<-as.character(tabflow$i)
tabflow$j<-as.character(tabflow$j)
tabflow$Fij<-as.numeric(tabflow$Fij)
tabflow$count<-as.numeric(tabflow$count)
str(tabflow)
```
**Select variables and change the matrix format**
```{r flowdata_reverse, echo=TRUE, message=FALSE, warning=FALSE}
# Selecting useful variables for changing format
tabflow<-tabflow %>% select(i,j,Fij)
# From list (L) to matrix (M) format
matflow <-flowtabmat(tabflow,matlist="M")
head(matflow[1:4,1:4])
dim(matflow)
```
```{r flowdata_reverseM, message=FALSE, warning=FALSE, include=FALSE}
# From matrix (M) to list (L) format
tabflow<-flowtabmat(tab=matflow,
matlist="L")
colnames(tabflow)<-c("i","j","Fij")
head(tabflow)
```
**Geographical dataset**
```{r data_preprocess, message=FALSE, warning=FALSE, include=FALSE}
# Load a list of geo codes
ID_CODE<-read.csv2("./data/COD_GEO_EPT.csv",
header=TRUE,sep=";",stringsAsFactors=FALSE,encoding="UTF-8", dec=".", check.names=FALSE)
#head(ID_CODE)
CODE<-ID_CODE%>% dplyr::select(COD_GEO_EPT)
colnames(CODE)<-c("CODGEO")
#head(CODE)
```
**2. Flow types computing**
--------------------
```{r vara_typing2, message=FALSE, warning=FALSE, include=FALSE}
# Variable typing
tabflow$i<-as.character(tabflow$i)
tabflow$j<-as.character(tabflow$j)
tabflow$Fij<-as.numeric(tabflow$Fij)
tabflow<-as.data.frame(tabflow)
```
*Compute bilateral flow types, e.g. volume, balance, bilateral maximum and all types*
```{r data_computing, echo=TRUE, message=FALSE, warning=FALSE}
# Bilateral volum (gross) FSij:
tabflow_vol<-flowtype(tabflow, format="L", origin="i", destination="j", fij="Fij", x= "bivolum" )
# Matrix format (M): matflow_vol<-flowtype(matflow, format="M", "bivolum")
# Bilateral balance (net ) FBij:
tabflow_net<-flowtype(tabflow, format="L", origin="i", destination="j", fij="Fij", x="bibal")
# Bilateral maximum (maxFij):
tabflow_max<-flowtype(tabflow, format="L", origin="i", destination="j", fij="Fij", x="bimax")
# Compute all types of bilateral flows, in one 11-column table
tabflow_all<-flowtype(tabflow,format="L", origin="i", destination="j", fij="Fij", x="alltypes")
head(tabflow_all)
```
**3. Direct flow mapping**
---------------------------
**3.1. Plot all origin-destination without any filtering criterion**
The result reveals the graphic complexity of an unfiltered flow map (the "spaghetti effect").
Plot links
```{r maps_links, echo=TRUE, fig.show='hold', fig.width=6, message=FALSE, warning=FALSE}
library(sf)
map<-st_read("./data/MGP_TER.shp")
# Add and overlay spatial background
par(bg = "NA")
# Graphic parameters
par(mar=c(0,0,1,0))
extent <- c(2800000, 1340000, 6400000, 4800000)
resolution<-150
plot(st_geometry(map), col = NA, border=NA, bg="#dfe6e1")
plot(st_geometry(map), col = "light grey", add=TRUE)
# Flowmapping of all links
flowmap(tab=tabflow,
fij="Fij",
origin.f = "i",
destination.f = "j",
bkg = map,
code="EPT_NUM",
nodes.X="X",
nodes.Y = "Y",
filter=FALSE,
add=TRUE
)
library(cartography)
# Map cosmetics
layoutLayer(title = "All origin-destination for commuting in Greater Paris, 2017",
coltitle ="black",
author = "Cartograflow, 2020",
sources = "Data : INSEE, 2017 ; Basemap : APUR, RIATE, 2018.",
scale = 2,
tabtitle = FALSE,
frame = TRUE,
col = "grey"
)
# North arrow
north("topright")
```
**3.2. Plot the above-average flows**
```{r maps_flowmean, echo=TRUE, fig.show='hold', fig.width=6, message=FALSE, warning=FALSE}
library(sf)
map<-st_read("./data/MGP_TER.shp")
# Add and overlay spatial background
par(bg = "NA")
# Graphic parameters
par(mar=c(0,0,1,0))
extent <- c(2800000, 1340000, 6400000, 4800000)
resolution<-150
plot(st_geometry(map), col = NA, border=NA, bg="#dfe6e1")
plot(st_geometry(map), col = "light grey", add=TRUE)
# Flow mapping above-average flows
flowmap(tab=tabflow,
fij="Fij",
origin.f = "i",
destination.f = "j",
bkg = map,
code="EPT_NUM",
nodes.X="X",
nodes.Y = "Y",
filter=TRUE,
threshold =(mean(tabflow$Fij)), #mean value is the level of threshold
taille=20,
a.head = 1,
a.length = 0.11,
a.angle = 30,
a.col="#138913",
add=TRUE)
# Map Legend
legendPropLines(pos="topleft",
title.txt="Commuters > 13220 ",
title.cex=0.8,
cex=0.5,
values.cex= 0.7,
var=c(mean(tabflow$Fij),max(tabflow$Fij)),
lwd=5,
frame = FALSE,
col="#138913",
values.rnd = 0
)
#Map cosmetic
layoutLayer(title = "Above-average commuter flows in Greater Paris",
coltitle ="black",
author = "Cartograflow, 2020",
sources = "Data : INSEE, 2017 ; Basemap : APUR, RIATE, 2018.",
scale = 2,
tabtitle = FALSE,
frame = TRUE,
col = "grey"
)
# North arrow
north("topright")
```
**3.3. Plot the net flows of bilateral flows**
```{r maps_flownet, echo=TRUE, fig.show='hold', fig.width=6, message=FALSE, warning=FALSE}
#library(sf)
map<-st_read("./data/MGP_TER.shp")
# Net matrix reduction
tabflow_net <- tabflow_net %>% filter(.data$FBij>=0)
# Net matrix thresholding
Q80<-quantile(tabflow_net$FBij,0.95)
# Add and overlay spatial background
par(bg = "NA")
# Graphic parameters
par(mar=c(0,0,1,0))
extent <- c(2800000, 1340000, 6400000, 4800000)
resolution<-150
plot(st_geometry(map), col = NA, border=NA, bg="#dfe6e1")
plot(st_geometry(map), col = "light grey", add=TRUE)
# Flow mapping of the strongest net flows
flowmap(tab=tabflow_net,
fij="FBij",
origin.f = "i",
destination.f = "j",
bkg = map,
code="EPT_NUM",
nodes.X="X",
nodes.Y = "Y",
filter=TRUE,
threshold = Q80,
taille=12,
a.head = 1,
a.length = 0.11,
a.angle = 30,
a.col="#4e8ef5",
add=TRUE)
# Map Legend
legendPropLines(pos="topleft",
title.txt="Commuters > 5722 ",
title.cex=0.8,
cex=0.5,
values.cex= 0.7,
var=c(Q80,max(tabflow_net$FBij)),
lwd=12,
frame = FALSE,
col="#4e8ef5",
values.rnd = 0
)
#Map cosmetic
layoutLayer(title = "Net commuters in Greater Paris (5% strongest)",
coltitle ="black",
author = "Cartograflow, 2020",
sources = "Data : INSEE, 2017 ; Basemap : APUR, RIATE, 2018.",
scale = 2,
tabtitle = FALSE,
frame = TRUE,
col = "grey"
)
# North arrow
north("topright")
```
## Sample datasets
-- _Statistical dataset_ :
- INSEE - Base flux de mobilité (2015)
- URL : https://www.insee.fr/fr/statistiques/fichier/3566008/rp2015_mobpro_txt.zip
-- _Geographical dataset_ :
- municipalities : IGN, GEOFLA 2015 v2.1
- Greater Paris : APUR, UMS 2414 RIATE, 2018.
## See also
https://github.com/fbahoken/cartogRaflow/tree/master/vignettes
-- cartograflow_general.html
-- cartograflow_concentration.html
-- cartograflow_distance.html
-- cartograflow_ordinal_distance.html
## Reference
-- Bahoken Françoise (2016), Programmes pour R/RStudio annexés, in : _Contribution à la cartographie d'une matrice de flux_, Thèse de doctorat, Université Paris 7, pp. 480-520.