Parse JSON with R

Handling JSON is an important task in the era of web services, where the result of an API call is often wrapped in a JSON object.

You can use the RJSONIO package or the jsonlite package to parse JSON in R. Both libraries work very well for me. However, there are some differences between them. In this post, I will parse some JSON objects with both packages and compare the results.

Parse a simple Object

We start with the simplest JSON object. Here I have the JSON object {"a": 1}.

library(RJSONIO)
x <- fromJSON('{"a": 1}')
x
## a 
## 1
class(x)
## [1] "numeric"
str(x)
##  Named num 1
##  - attr(*, "names")= chr "a"
x + 2
## a 
## 3
library(jsonlite)
x <- fromJSON('{"a": 1}')
x
## $a
## [1] 1
class(x)
## [1] "list"
str(x)
## List of 1
##  $ a: int 1
x$a + 2
## [1] 3

We see a difference here: RJSONIO parses the object into a named numeric vector, while jsonlite returns a list. That is why we read the value differently: x + 2 works directly on the RJSONIO result, whereas the jsonlite result needs x$a + 2.
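Both packages export a function named fromJSON, so in the listings above the package attached last masks the other one. A minimal sketch, assuming both packages are installed, that calls each parser explicitly with `::` (the variable names are only illustrative):

```r
# Assumes both RJSONIO and jsonlite are installed.
json <- '{"a": 1}'

# Namespace-qualified calls make it explicit which parser runs.
x_rjsonio  <- RJSONIO::fromJSON(json)
x_jsonlite <- jsonlite::fromJSON(json)

is.list(x_rjsonio)    # the RJSONIO result is a named numeric vector
is.list(x_jsonlite)   # the jsonlite result is a list

# `[[` extracts a single element by name from either result.
x_rjsonio[["a"]] + 2
x_jsonlite[["a"]] + 2
```

Using `[["a"]]` is one way to write access code that does not depend on which of the two parsers produced the value.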

Parse a simple Array

Next we will parse the simplest array, [1, 2, 3, 4].

library(RJSONIO)
x <- fromJSON("[1, 2, 3, 4]")
x
## [1] 1 2 3 4
class(x)
## [1] "numeric"
str(x)
##  num [1:4] 1 2 3 4
x[1] + 2
## [1] 3
library(jsonlite)
x <- fromJSON("[1, 2, 3, 4]")
x
## [1] 1 2 3 4
class(x)
## [1] "integer"
str(x)
##  int [1:4] 1 2 3 4
x[1] + 2
## [1] 3

After parsing the array, RJSONIO returns a numeric vector while jsonlite returns an integer vector. Accessing the first element works the same way in both cases.
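The integer result from jsonlite depends on the numbers in the array. A small sketch, assuming jsonlite is installed (the second array and the variable names are made up for illustration), showing that a decimal value changes the result type and that arithmetic works the same either way:

```r
# Assumes jsonlite is installed.
ints  <- jsonlite::fromJSON("[1, 2, 3, 4]")   # whole numbers only
mixed <- jsonlite::fromJSON("[1, 2.5]")       # contains a decimal value

is.integer(ints)    # integer vector, as in the listing above
is.double(mixed)    # double vector once a decimal is present

# Adding a double literal promotes the integer element to double.
ints[1] + 2
```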

Parse a complex Array

Now we take a look at a more complex JSON array, an array of objects, and see how to access its elements.

library(RJSONIO)
x <- fromJSON('[{"age": 15, "name": "Peter"}, {"age": 23, "name": "John"}]')
x
## [[1]]
## [[1]]$age
## [1] 15
## 
## [[1]]$name
## [1] "Peter"
## 
## 
## [[2]]
## [[2]]$age
## [1] 23
## 
## [[2]]$name
## [1] "John"
class(x)
## [1] "list"
str(x)
## List of 2
##  $ :List of 2
##   ..$ age : num 15
##   ..$ name: chr "Peter"
##  $ :List of 2
##   ..$ age : num 23
##   ..$ name: chr "John"
x[[1]]$age
## [1] 15
library(jsonlite)
x <- fromJSON('[{"age": 15, "name": "Peter"}, {"age": 23, "name": "John"}]')
x
##   age  name
## 1  15 Peter
## 2  23  John
class(x)
## [1] "data.frame"
str(x)
## 'data.frame':    2 obs. of  2 variables:
##  $ age : int  15 23
##  $ name: chr  "Peter" "John"
x[1, 'age']
## [1] 15

The difference is clear here. When parsing an array of objects, RJSONIO returns a list of lists, while jsonlite returns a data frame.
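If the list-of-records shape is preferred, jsonlite can be asked not to build a data frame through its simplifyDataFrame argument. The sketch below assumes jsonlite is installed; it also shows one possible way to turn such a list back into a data frame with base R (the variable names are illustrative):

```r
# Assumes jsonlite is installed.
json <- '[{"age": 15, "name": "Peter"}, {"age": 23, "name": "John"}]'

# Default behaviour: an array of objects becomes a data frame.
df <- jsonlite::fromJSON(json)

# Keep one list per record instead of a data frame.
records <- jsonlite::fromJSON(json, simplifyDataFrame = FALSE)
records[[1]]$age   # same access pattern as with the RJSONIO result

# One way to rebuild a data frame from a list of records:
# convert each record to a one-row data frame, then stack the rows.
rebuilt <- do.call(
  rbind,
  lapply(records, as.data.frame, stringsAsFactors = FALSE)
)
rebuilt[1, "age"]
```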

Real world JSON parsing

Now I use the Google geocoding API to get geographic information about the Eiffel Tower. The goal is to get the latitude and longitude of the Eiffel Tower.

Let's see how the two packages do this job.

library(RJSONIO)
library(RCurl)
json <- getURL("http://maps.google.com/maps/api/geocode/json?sensor=false&address=Eiffel%20Tower")
data <- fromJSON(json)
data
## $results
## $results[[1]]
## $results[[1]]$address_components
## $results[[1]]$address_components[[1]]
## $results[[1]]$address_components[[1]]$long_name
## [1] "Eiffel Tower"
## 
## $results[[1]]$address_components[[1]]$short_name
## [1] "Eiffel Tower"
## 
## $results[[1]]$address_components[[1]]$types
## [1] "point_of_interest" "establishment"    
## 
## 
## $results[[1]]$address_components[[2]]
## $results[[1]]$address_components[[2]]$long_name
## [1] "Champ de Mars"
## 
## $results[[1]]$address_components[[2]]$short_name
## [1] "Champ de Mars"
## 
## $results[[1]]$address_components[[2]]$types
## [1] "premise"
##
## ...
## 
## $results[[1]]$formatted_address
## [1] "Champ de Mars, Eiffel Tower, 5 Avenue Anatole France, 75007 Paris, France"
## 
## $results[[1]]$geometry
## $results[[1]]$geometry$location
##       lat       lng 
## 48.858370  2.294481 
## 
## $results[[1]]$geometry$location_type
## [1] "APPROXIMATE"
## 
## $results[[1]]$geometry$viewport
## $results[[1]]$geometry$viewport$northeast
##      lat      lng 
## 48.85972  2.29583 
## 
## $results[[1]]$geometry$viewport$southwest
##       lat       lng 
## 48.857021  2.293132 
## 
## 
## $results[[1]]$place_id
## [1] "ChIJLU7jZClu5kcR4PcOOO6p3I0"
## 
## $results[[1]]$types
## [1] "point_of_interest" "establishment"    
## 
## 
## 
## $status
## [1] "OK"
class(data)
## [1] "list"
str(data)
## List of 2
##  $ results:List of 1
##   ..$ :List of 5
##   .. ..$ address_components:List of 9
##   .. .. ..$ :List of 3
##   .. .. .. ..$ long_name : chr "Eiffel Tower"
##   .. .. .. ..$ short_name: chr "Eiffel Tower"
##   .. .. .. ..$ types     : chr [1:2] "point_of_interest" "establishment"
##   .. .. ..$ :List of 3
##   .. .. .. ..$ long_name : chr "Champ de Mars"
##   .. .. .. ..$ short_name: chr "Champ de Mars"
##   .. .. .. ..$ types     : chr "premise"
##   .. .. ..$ :List of 3
##   .. .. .. ..$ long_name : chr "5"
##   .. .. .. ..$ short_name: chr "5"
##   .. .. .. ..$ types     : chr "street_number"
##   .. .. ..$ :List of 3
##   .. .. .. ..$ long_name : chr "Avenue Anatole France"
##   .. .. .. ..$ short_name: chr "Avenue Anatole France"
##   .. .. .. ..$ types     : chr "route"
##   .. .. ..$ :List of 3
##   .. .. .. ..$ long_name : chr "Paris"
##   .. .. .. ..$ short_name: chr "Paris"
##   .. .. .. ..$ types     : chr [1:2] "locality" "political"
##   .. .. ..$ :List of 3
##   .. .. .. ..$ long_name : chr "Paris"
##   .. .. .. ..$ short_name: chr "75"
##   .. .. .. ..$ types     : chr [1:2] "administrative_area_level_2" "political"
##   .. .. ..$ :List of 3
##   .. .. .. ..$ long_name : chr "Île-de-France"
##   .. .. .. ..$ short_name: chr "IDF"
##   .. .. .. ..$ types     : chr [1:2] "administrative_area_level_1" "political"
##   .. .. ..$ :List of 3
##   .. .. .. ..$ long_name : chr "France"
##   .. .. .. ..$ short_name: chr "FR"
##   .. .. .. ..$ types     : chr [1:2] "country" "political"
##   .. .. ..$ :List of 3
##   .. .. .. ..$ long_name : chr "75007"
##   .. .. .. ..$ short_name: chr "75007"
##   .. .. .. ..$ types     : chr "postal_code"
##   .. ..$ formatted_address : chr "Champ de Mars, Eiffel Tower, 5 Avenue Anatole France, 75007 Paris, France"
##   .. ..$ geometry          :List of 3
##   .. .. ..$ location     : Named num [1:2] 48.86 2.29
##   .. .. .. ..- attr(*, "names")= chr [1:2] "lat" "lng"
##   .. .. ..$ location_type: chr "APPROXIMATE"
##   .. .. ..$ viewport     :List of 2
##   .. .. .. ..$ northeast: Named num [1:2] 48.9 2.3
##   .. .. .. .. ..- attr(*, "names")= chr [1:2] "lat" "lng"
##   .. .. .. ..$ southwest: Named num [1:2] 48.86 2.29
##   .. .. .. .. ..- attr(*, "names")= chr [1:2] "lat" "lng"
##   .. ..$ place_id          : chr "ChIJLU7jZClu5kcR4PcOOO6p3I0"
##   .. ..$ types             : chr [1:2] "point_of_interest" "establishment"
##  $ status : chr "OK"
library(jsonlite)
data <- fromJSON(txt="http://maps.google.com/maps/api/geocode/json?sensor=false&address=Eiffel%20Tower")
data
## $results
##                                                                                                                                                                                                                                                                                                                                                                           address_components
## 1 Eiffel Tower, Champ de Mars, 5, Avenue Anatole France, Paris, Paris, Île-de-France, France, 75007, Eiffel Tower, Champ de Mars, 5, Avenue Anatole France, Paris, 75, IDF, FR, 75007, point_of_interest, establishment, premise, street_number, route, locality, political, administrative_area_level_2, political, administrative_area_level_1, political, country, political, postal_code
##                                                           formatted_address
## 1 Champ de Mars, Eiffel Tower, 5 Avenue Anatole France, 75007 Paris, France
##   geometry.location.lat geometry.location.lng geometry.location_type
## 1              48.85837              2.294481            APPROXIMATE
##   geometry.viewport.northeast.lat geometry.viewport.northeast.lng
## 1                        48.85972                         2.29583
##   geometry.viewport.southwest.lat geometry.viewport.southwest.lng
## 1                        48.85702                        2.293132
##                      place_id                            types
## 1 ChIJLU7jZClu5kcR4PcOOO6p3I0 point_of_interest, establishment
## 
## $status
## [1] "OK"
class(data)
## [1] "list"
str(data)
## List of 2
##  $ results:'data.frame': 1 obs. of  5 variables:
##   ..$ address_components:List of 1
##   .. ..$ :'data.frame':  9 obs. of  3 variables:
##   .. .. ..$ long_name : chr [1:9] "Eiffel Tower" "Champ de Mars" "5" "Avenue Anatole France" ...
##   .. .. ..$ short_name: chr [1:9] "Eiffel Tower" "Champ de Mars" "5" "Avenue Anatole France" ...
##   .. .. ..$ types     :List of 9
##   .. .. .. ..$ : chr [1:2] "point_of_interest" "establishment"
##   .. .. .. ..$ : chr "premise"
##   .. .. .. ..$ : chr "street_number"
##   .. .. .. ..$ : chr "route"
##   .. .. .. ..$ : chr [1:2] "locality" "political"
##   .. .. .. ..$ : chr [1:2] "administrative_area_level_2" "political"
##   .. .. .. ..$ : chr [1:2] "administrative_area_level_1" "political"
##   .. .. .. ..$ : chr [1:2] "country" "political"
##   .. .. .. ..$ : chr "postal_code"
##   ..$ formatted_address : chr "Champ de Mars, Eiffel Tower, 5 Avenue Anatole France, 75007 Paris, France"
##   ..$ geometry          :'data.frame':   1 obs. of  3 variables:
##   .. ..$ location     :'data.frame': 1 obs. of  2 variables:
##   .. .. ..$ lat: num 48.9
##   .. .. ..$ lng: num 2.29
##   .. ..$ location_type: chr "APPROXIMATE"
##   .. ..$ viewport     :'data.frame': 1 obs. of  2 variables:
##   .. .. ..$ northeast:'data.frame':  1 obs. of  2 variables:
##   .. .. .. ..$ lat: num 48.9
##   .. .. .. ..$ lng: num 2.3
##   .. .. ..$ southwest:'data.frame':  1 obs. of  2 variables:
##   .. .. .. ..$ lat: num 48.9
##   .. .. .. ..$ lng: num 2.29
##   ..$ place_id          : chr "ChIJLU7jZClu5kcR4PcOOO6p3I0"
##   ..$ types             :List of 1
##   .. ..$ : chr [1:2] "point_of_interest" "establishment"
##  $ status : chr "OK"

As you can see, RJSONIO parses the JSON into a list of lists, while jsonlite parses it into a list containing data frames.
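To reach the stated goal, the latitude and longitude still have to be pulled out of each structure. The sketch below assumes both packages are installed and uses a trimmed, hand-written stand-in for the response printed above rather than a live request, since the endpoint may not answer the same way today (it may, for instance, require an API key; that is an assumption). The variable names are illustrative:

```r
# Trimmed stand-in for the geocoding response shown above (not a live call).
json <- '{
  "results": [{
    "formatted_address": "Champ de Mars, Eiffel Tower, 5 Avenue Anatole France, 75007 Paris, France",
    "geometry": {
      "location": {"lat": 48.85837, "lng": 2.294481},
      "location_type": "APPROXIMATE"
    }
  }],
  "status": "OK"
}'

# RJSONIO: walk the nested lists; location is a named numeric vector.
r <- RJSONIO::fromJSON(json)
loc <- r$results[[1]]$geometry$location
lat_rjsonio <- loc[["lat"]]
lng_rjsonio <- loc[["lng"]]

# jsonlite: results is a data frame with nested data frame columns.
j <- jsonlite::fromJSON(json)
lat_jsonlite <- j$results$geometry$location$lat
lng_jsonlite <- j$results$geometry$location$lng

# Optionally flatten the nested data frames into dotted column names.
flat <- jsonlite::flatten(j$results)
flat[["geometry.location.lat"]]
```

With RJSONIO the path mixes `[[1]]` and `$` because results stays a list of records, while with jsonlite the same path is a chain of `$` on nested data frames.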
