Learning about APIs and R

APIs in R

Amanda from RStudio recently hosted a very helpful webinar on APIs in R. After watching it, I wanted to write down some of the key points and put them into practice.

Basics of API

Ask the right question

An API (Application Programming Interface) is a way for two programs to communicate. To do so, they need to speak the same language. If you ask an API a question it understands, it will respond. Otherwise, it will probably give you an error message.

What’s the language?

Web APIs typically work through five “verbs” of HTTP (Hypertext Transfer Protocol):

1. GET - Equivalent to “reading” data.
2. PUT - Equivalent to “update or modify” the data.
3. POST - Used to “create” data.
4. PATCH - Used to partially update data.
5. DELETE - As you might guess, used to delete data.

This helps you phrase your request correctly, but you’ll still need more information! Whilst there may be common themes, every API is structured differently. To know what you can ask an API, and how to do so, you’ll need to refer to its documentation. This will describe the “endpoints” that you can call. Think of an endpoint as a certain question or action you can request.
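To make the idea of an endpoint a little more concrete, here is a minimal sketch that splits a request URL into its parts with httr's parse_url(). The URL is only an illustration, borrowed from the Star Wars API used in the examples further down.

```r
library(httr)

# Illustrative request URL (assumption: same API as the examples later on)
url <- "https://swapi.co/api/people/?search=chewbacca"

parts <- parse_url(url)

parts$scheme    # the protocol
parts$hostname  # the server that hosts the API
parts$path      # the endpoint being called
parts$query     # a named list of query parameters
```

The path identifies the endpoint, while the query carries the details of the question you are asking it.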

API Resources in R

When I first came across this, I thought to myself, this is great, but how do I actually make a request?

Well, the most basic way would be to use curl (“see URL”, get it?), a command-line tool and library for transferring data. But in R, there is a better way.

httr is a package created for working with HTTP. It’s structured around the HTTP verbs mentioned above, allowing you to easily make calls to URLs and then parse the responses. For an in-depth resource, see the httr website.

At the end of this post, there are a few examples of using the package.

HTTP Verbs

As well as functions for the HTTP verbs (the five listed above, plus HEAD), httr has some additional utilities to make your life easier.
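As a rough sketch of how the verbs map onto httr functions, the hypothetical helper below sends one request per verb to httpbin, a public request-echo service. The helper name, the payload and the choice of httpbin are my own assumptions for illustration. Nothing is sent until the function is actually called.

```r
library(httr)

# Hypothetical helper: one request per HTTP verb against httpbin.
# Defining it does not touch the network; calling it does.
demo_verbs <- function(base = "https://httpbin.org") {
  payload <- list(name = "Chewbacca")  # made-up request body
  list(
    get    = GET(paste0(base, "/get")),
    post   = POST(paste0(base, "/post"), body = payload, encode = "json"),
    put    = PUT(paste0(base, "/put"), body = payload, encode = "json"),
    patch  = PATCH(paste0(base, "/patch"), body = payload, encode = "json"),
    delete = DELETE(paste0(base, "/delete")),
    head   = HEAD(paste0(base, "/get"))
  )
}

# Usage (needs an internet connection):
# responses <- demo_verbs()
# sapply(responses, status_code)
```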

Responses

It can automatically parse responses (usually JSON or XML), though you may also wish to use the jsonlite or xml2 package for more control when parsing.
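Here is a small sketch of what that extra control can look like with jsonlite. The JSON text is made up for illustration. With a real response you could get the equivalent text from httr with content(response, as = "text").

```r
library(jsonlite)

# Made-up JSON text, shaped loosely like an API response
json_text <- '{
  "count": 2,
  "results": [
    {"title": "A New Hope", "episode_id": 4},
    {"title": "The Empire Strikes Back", "episode_id": 5}
  ]
}'

# Default behaviour: an array of records is simplified into a data frame
simplified <- fromJSON(json_text)
simplified$results$title

# simplifyVector = FALSE keeps everything as nested lists instead
nested <- fromJSON(json_text, simplifyVector = FALSE)
nested$results[[1]]$title
```

The data frame form is convenient for analysis, while the nested list form stays closer to the structure of the original JSON.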

There are also utilities to check the response status of calls, such as stop_for_status().
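A minimal sketch of how the status helpers might be used. The first few lines work on bare status codes, so they run without any request. The get_parsed() helper is a hypothetical wrapper of my own, not something httr provides.

```r
library(httr)

# Describe a status code without making a request
http_status(404L)$message

http_error(404L)  # TRUE: codes of 400 and above count as errors
http_error(200L)  # FALSE

# Hypothetical helper: fail early if the request failed, then parse
get_parsed <- function(url) {
  response <- GET(url)
  stop_for_status(response)  # turns a failed request into an R error
  content(response)
}
```

Checking the status before parsing means a failed call stops with a clear message, rather than handing an error page to the rest of your code.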

Authentication

It also tries to make OAuth authentication, which I found to be the most off-putting part whilst learning, as smooth as possible, with multiple demos for OAuth 1.0 and 2.0. Use it via the oauth_ family of functions.
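As a rough outline of the OAuth 2.0 flow in httr, assuming GitHub as the provider (my choice for illustration) and placeholder credentials that you would replace with those of your own registered application:

```r
library(httr)

# Placeholder credentials (assumption): use the values from your own app
app <- oauth_app(
  "my-app",
  key = "YOUR_CLIENT_ID",
  secret = "YOUR_CLIENT_SECRET"
)

# httr ships endpoint definitions for several well-known providers
endpoint <- oauth_endpoints("github")

# The token step opens a browser to ask for consent, so it is commented out:
# token <- oauth2.0_token(endpoint, app)
# response <- GET("https://api.github.com/user", config(token = token))
```

The app describes who is asking, the endpoint describes where to ask, and the token is what you attach to later requests.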

  • jsonlite and xml2 - See the vignettes for both

Example APIs to practice on

  • Star Wars API - This is a well-structured API to practice on, allowing you to access information about characters, planets, vehicles and films from the Star Wars films.
  • Open Movie Database - Similar to IMDb, this contains information about movies.
  • httpbin - A simple service for testing HTTP requests and inspecting what was sent.

Examples

For a brief example, we will use the Star Wars API. As a starting point, check out its documentation.

It has a “Root” resource which we can call. The response of this provides information on the other resources available via the API.

library(httr)
## Warning: package 'httr' was built under R version 3.3.3
library(dplyr)
## Warning: package 'dplyr' was built under R version 3.3.3
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
request <- GET("https://swapi.co/api")

swcontent <- request$content

httr::content(request)
## $people
## [1] "https://swapi.co/api/people/"
## 
## $planets
## [1] "https://swapi.co/api/planets/"
## 
## $films
## [1] "https://swapi.co/api/films/"
## 
## $species
## [1] "https://swapi.co/api/species/"
## 
## $vehicles
## [1] "https://swapi.co/api/vehicles/"
## 
## $starships
## [1] "https://swapi.co/api/starships/"
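The request$content element saved above holds the raw bytes of the response body, whereas httr::content() parses them for you. The sketch below walks through the same steps by hand, using a hand-made raw vector as a stand-in for a real response body.

```r
# Stand-in for request$content: the bytes of a small JSON body
raw_body <- charToRaw('{"films": "https://swapi.co/api/films/"}')

class(raw_body)  # "raw"

# bytes -> character string -> R list
body_text <- rawToChar(raw_body)
parsed <- jsonlite::fromJSON(body_text)

parsed$films
```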

Say we needed to check the names of all of the films, and we could only do it via this API.

swfilms <- GET("https://swapi.co/api/films")

filmscontent <- httr::content(swfilms)

str(filmscontent, max.level = 2)
## List of 4
##  $ count   : int 7
##  $ next    : NULL
##  $ previous: NULL
##  $ results :List of 7
##   ..$ :List of 14
##   ..$ :List of 14
##   ..$ :List of 14
##   ..$ :List of 14
##   ..$ :List of 14
##   ..$ :List of 14
##   ..$ :List of 14

I’ll save you from filling up an entire screen with the response. As we called the films endpoint without specifying a particular film, it has returned information on all of them!

With the handy map function from purrr, we can easily pull out just the film names.

library(purrr)
## Warning: package 'purrr' was built under R version 3.3.3
map_chr(filmscontent$results, "title")
## [1] "A New Hope"              "Attack of the Clones"   
## [3] "The Phantom Menace"      "Revenge of the Sith"    
## [5] "Return of the Jedi"      "The Empire Strikes Back"
## [7] "The Force Awakens"
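The same idea extends to several fields at once. This sketch uses a hand-written stand-in for filmscontent$results with just two fields per film (the field name episode_id is an assumption about the API's response), and collects them into a data frame.

```r
library(purrr)

# Hand-written stand-in for filmscontent$results (two fields only)
films <- list(
  list(title = "A New Hope", episode_id = 4L),
  list(title = "The Empire Strikes Back", episode_id = 5L)
)

# One typed column per field
film_table <- data.frame(
  title   = map_chr(films, "title"),
  episode = map_int(films, "episode_id"),
  stringsAsFactors = FALSE
)

# Order by episode number
film_table[order(film_table$episode), ]

# Base R equivalent of map_chr(), for when purrr is not available
vapply(films, function(f) f$title, character(1))
```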

A Wookiee thing

I like Wookiees. Chewbacca is the highlight of Star Wars for me. Let’s find out more information about him. Checking the documentation (or the root resource), we can see there is a people endpoint. Let’s use it to find the Wookiees.

people <- GET("https://swapi.co/api/people")

people_info <- content(people)

str(people_info, max.level = 2)
## List of 4
##  $ count   : int 87
##  $ next    : chr "https://swapi.co/api/people/?page=2"
##  $ previous: NULL
##  $ results :List of 10
##   ..$ :List of 16
##   ..$ :List of 16
##   ..$ :List of 16
##   ..$ :List of 16
##   ..$ :List of 16
##   ..$ :List of 16
##   ..$ :List of 16
##   ..$ :List of 16
##   ..$ :List of 16
##   ..$ :List of 16
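The next field in the output above points at the following page of results. If you did want every record, one approach is to keep following that link until it comes back NULL. Below is a sketch of that loop. The helper name and the injectable fetch argument are my own additions, so that the logic can be tried on fake pages without a network call.

```r
library(httr)

# Hypothetical helper: keeps requesting the `next` URL until it is NULL.
# `fetch` can be swapped for a stub to try the loop offline.
fetch_all_pages <- function(url, fetch = function(u) content(GET(u))) {
  results <- list()
  while (!is.null(url)) {
    page <- fetch(url)
    results <- c(results, page$results)
    # `next` is a reserved word in R, so [[ ]] is the safe way to access it
    url <- page[["next"]]
  }
  results
}

# Offline check with two fake pages
fake_pages <- list(
  page1 = list(`next` = "page2",
               results = list(list(name = "A"), list(name = "B"))),
  page2 = list(`next` = NULL,
               results = list(list(name = "C")))
)

everyone <- fetch_all_pages("page1", fetch = function(u) fake_pages[[u]])
length(everyone)  # 3

# Live usage: fetch_all_pages("https://swapi.co/api/people/")
```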

There are 87 characters listed in the API, and the responses come in pages of 10. That’s a lot to request and parse. Instead, we can search for Chewbacca by name!

chewwie <- GET("https://swapi.co/api/people?search=chewbacca") %>% httr::content()


# How many results?
chewwie$count
## [1] 1
# Is it Chewwie?

chewwie$results[[1]]$name
## [1] "Chewbacca"
# Yes it's Chewwie!
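Instead of typing the ?search= part into the URL by hand, the query can also be given as a named list, which lets httr take care of the encoding. A small sketch using modify_url(), which builds the URL without sending anything. The second search term is made up to show the encoding of a space.

```r
library(httr)

# Build the search URL from its parts
search_url <- modify_url(
  "https://swapi.co/api/people/",
  query = list(search = "chewbacca")
)
search_url

# Values are URL-encoded, so the space here does not survive as-is
encoded_url <- modify_url(
  "https://swapi.co/api/people/",
  query = list(search = "obi wan")
)
encoded_url

# The same list can be passed straight to GET() (needs a connection):
# GET("https://swapi.co/api/people/", query = list(search = "chewbacca"))
```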

But what if we wanted to find more Wookiees? Part of the API response contains Chewie’s species. This comes in the format of a URL, https://swapi.co/api/species/3/. At first this may seem a tad unhelpful. But it’s in fact very powerful, as it allows us to do the following.

wookie_species <- chewwie$results[[1]]$species %>% as.character() %>% GET() %>% content()

str(wookie_species)
## List of 15
##  $ name            : chr "Wookiee"
##  $ classification  : chr "mammal"
##  $ designation     : chr "sentient"
##  $ average_height  : chr "210"
##  $ skin_colors     : chr "gray"
##  $ hair_colors     : chr "black, brown"
##  $ eye_colors      : chr "blue, green, yellow, brown, golden, red"
##  $ average_lifespan: chr "400"
##  $ homeworld       : chr "https://swapi.co/api/planets/14/"
##  $ language        : chr "Shyriiwook"
##  $ people          :List of 2
##   ..$ : chr "https://swapi.co/api/people/13/"
##   ..$ : chr "https://swapi.co/api/people/80/"
##  $ films           :List of 5
##   ..$ : chr "https://swapi.co/api/films/2/"
##   ..$ : chr "https://swapi.co/api/films/7/"
##   ..$ : chr "https://swapi.co/api/films/6/"
##   ..$ : chr "https://swapi.co/api/films/3/"
##   ..$ : chr "https://swapi.co/api/films/1/"
##  $ created         : chr "2014-12-10T16:44:31.486000Z"
##  $ edited          : chr "2015-01-30T21:23:03.074598Z"
##  $ url             : chr "https://swapi.co/api/species/3/"

There are two Wookiees listed in the API. One must be Chewbacca, but who is the other? Let’s request their ID from the people endpoint to find out.

mystery_wookie <- GET("https://swapi.co/api/people/80/") %>% content()

mystery_wookie$name
## [1] "Tarfful"
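Rather than hard-coding an ID, the list of URLs in the species response could be followed one by one. This is a sketch of that idea. The lookup_field() helper and its fetch argument are hypothetical, and the stub below stands in for the API using the names from the outputs above.

```r
library(purrr)

# Hypothetical helper: fetch each URL and pull out one field.
# `fetch` defaults to a live request but can be replaced by a stub.
lookup_field <- function(urls, field,
                         fetch = function(u) httr::content(httr::GET(u))) {
  map_chr(urls, function(u) fetch(u)[[field]])
}

# Offline stub standing in for the API
stub <- list(
  "https://swapi.co/api/people/13/" = list(name = "Chewbacca"),
  "https://swapi.co/api/people/80/" = list(name = "Tarfful")
)

wookiee_names <- lookup_field(names(stub), "name", fetch = function(u) stub[[u]])
wookiee_names

# Live usage: lookup_field(wookie_species$people, "name")
```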

Easter Egg

As a valuable example of reading the documentation, we can see that there are two response formats: JSON and Wookiee. JSON is the default response type, but we can request Wookiee by adding the format parameter to our request. Let’s see what we get.

GET("https://swapi.co/api/planets/1/?format=wookiee") %>% content()
## $whrascwo
## [1] "Traaoooooahwhwo"
## 
## $rcooaoraaoahoowh_akworcahoowa
## [1] "23"
## 
## $oorcrhahaoraan_akworcahoowa
## [1] "304"
## 
## $waahrascwoaoworc
## [1] "10465"
## 
## $oaanahscraaowo
## [1] "rarcahwa"
## 
## $rrrcrahoahaoro
## [1] "1 caorawhwararcwa"
## 
## $aoworcrcraahwh
## [1] "wawocworcao"
## 
## $churcwwraoawo_ohraaoworc
## [1] "1"
## 
## $akooakhuanraaoahoowh
## [1] "200000"
## 
## $rcwocahwawowhaoc
## $rcwocahwawowhaoc[[1]]
## [1] "acaoaoakc://cohraakah.oaoo/raakah/akwoooakanwo/1/"
## 
## $rcwocahwawowhaoc[[2]]
## [1] "acaoaoakc://cohraakah.oaoo/raakah/akwoooakanwo/2/"
## 
## $rcwocahwawowhaoc[[3]]
## [1] "acaoaoakc://cohraakah.oaoo/raakah/akwoooakanwo/4/"
## 
## $rcwocahwawowhaoc[[4]]
## [1] "acaoaoakc://cohraakah.oaoo/raakah/akwoooakanwo/6/"
## 
## $rcwocahwawowhaoc[[5]]
## [1] "acaoaoakc://cohraakah.oaoo/raakah/akwoooakanwo/7/"
## 
## $rcwocahwawowhaoc[[6]]
## [1] "acaoaoakc://cohraakah.oaoo/raakah/akwoooakanwo/8/"
## 
## $rcwocahwawowhaoc[[7]]
## [1] "acaoaoakc://cohraakah.oaoo/raakah/akwoooakanwo/9/"
## 
## $rcwocahwawowhaoc[[8]]
## [1] "acaoaoakc://cohraakah.oaoo/raakah/akwoooakanwo/11/"
## 
## $rcwocahwawowhaoc[[9]]
## [1] "acaoaoakc://cohraakah.oaoo/raakah/akwoooakanwo/43/"
## 
## $rcwocahwawowhaoc[[10]]
## [1] "acaoaoakc://cohraakah.oaoo/raakah/akwoooakanwo/62/"
## 
## 
## $wwahanscc
## $wwahanscc[[1]]
## [1] "acaoaoakc://cohraakah.oaoo/raakah/wwahanscc/5/"
## 
## $wwahanscc[[2]]
## [1] "acaoaoakc://cohraakah.oaoo/raakah/wwahanscc/4/"
## 
## $wwahanscc[[3]]
## [1] "acaoaoakc://cohraakah.oaoo/raakah/wwahanscc/6/"
## 
## $wwahanscc[[4]]
## [1] "acaoaoakc://cohraakah.oaoo/raakah/wwahanscc/3/"
## 
## $wwahanscc[[5]]
## [1] "acaoaoakc://cohraakah.oaoo/raakah/wwahanscc/1/"
## 
## 
## $oarcworaaowowa
## [1] "2014-12-09T13:50:49.641000Z"
## 
## $wowaahaowowa
## [1] "2014-12-21T20:48:04.175778Z"
## 
## $hurcan
## [1] "acaoaoakc://cohraakah.oaoo/raakah/akanrawhwoaoc/1/"