R and GPX: How to Read and Visualize GPX Files in R
PLOTTING GPX FILES IS EASY WITH R: LEARN HOW TO VISUALIZE GPX FILES WITH R IN MINUTES

Geospatial data is everywhere around us, and it’s essential for data professionals to know how to work with it. One common way to store this type of data is in GPX files. Today you’ll learn the essentials, from theory and common questions to parsing GPX files in R.

We’ll start simple, with a bit of theory and commonly asked questions. This background gives you a deeper understanding of how geospatial data is stored. If you’re already familiar with the topic, feel free to skip the first section.

> _New to geomapping in R? Follow this guide to make stunning geomaps in R with Leaflet._

INTRODUCTION TO R AND GPX

Online route mapping services
such as Strava and Komoot store routes in the GPX file format. It’s an easy and convenient way to analyze, visualize, and display different types of geospatial data, such as geolocation (latitude, longitude), elevation, and more.

For example, take a look at the following image. It represents a Strava cycling route in Croatia I plan to embark on later this summer. It’s the highest paved road in the country, and I expect the views to be breathtaking:

[Image: Strava cycling route in Croatia]

Why is this relevant? Because Strava allows you to export any route or workout in the GPX file format. But what is GPX anyway?

WHAT IS A GPX FILE?

Put simply, GPX stands for _GPS eXchange Format_, and it’s a simple text file with geographical information, such as latitude, longitude, elevation, time, and so on. If you plot these points on a map, you’ll know exactly where you need to go and what sort of terrain to expect, at least according to the elevation.

The Strava route we’ll analyze today is just a plain route and has 1855 latitude, longitude, and elevation data points. If I were to complete this route and export the file from my workouts, it would also include timestamps.

These data points are easy to load into R. You don’t need a dedicated GPX package; an XML parser does all the work. More on that
in a bit.

WHAT IS THE DIFFERENCE BETWEEN GPS AND GPX?

This is a common question beginners have. GPS stands for _Global Positioning System_, which provides users with positioning, navigation, and timing services. GPX, on the other hand, is a file format used to exchange GPS data by storing geographical information at given intervals. These data include waypoints, tracks, elevation, and routes.

If you’re working on GPS programs or plan to build navigation applications, GPX is a map data format you’ll commonly encounter. It’s an open standard in the geospatial world that has been around for two decades, so it’s important to know how to work with it.

WHAT PROGRAM OPENS A GPX FILE?

Because GPX is plain text, any text editor can display its raw contents, but to view the route itself you need dedicated software or a programming language. Downloadable software includes Google Earth Pro and Garmin BaseCamp, just to name a few. If you’re into coding, you should know that any major programming language can load and parse GPX files, R and Python included.

HOW TO LOAD AND PARSE GPX FILES IN R

Now you’ll learn how to combine R and GPX. First things first, we’ll load a GPX file into R. To do so, we’ll have
to install a library for parsing XML files, because GPX is an XML-based format:

    install.packages("XML")

We can now use the XML::htmlTreeParse() function to read a GPX file. Make sure you know where your file is saved beforehand:

    library(XML)

    gpx_parsed <- htmlTreeParse(file = "croatia_bike.gpx", useInternalNodes = TRUE)
    gpx_parsed

The gpx_parsed variable contains the following:

[Image: printed contents of gpx_parsed]

If you think that looks like a mess, you are not wrong. The file is pretty much unreadable in this form, but you can spot a structure if you focus for long enough. The trkpt element contains latitude and longitude information for every point, and there’s also an ele tag which contains the elevation.

Use the following R code to extract them and store them in a more readable data structure, a data.frame:

    # Each trkpt node carries lat and lon as attributes; the result is a character matrix
    coords <- xpathSApply(doc = gpx_parsed, path = "//trkpt", fun = xmlAttrs)
    # The elevation is the text content of the nested ele node
    elevation <- xpathSApply(doc = gpx_parsed, path = "//trkpt/ele", fun = xmlValue)

    df <- data.frame(
      lat = as.numeric(coords["lat", ]),
      lon = as.numeric(coords["lon", ]),
      elevation = as.numeric(elevation)
    )

    head(df, 10)
    tail(df, 10)

The route represents a roundtrip, so the starting and ending data points will be almost identical. The fun part happens in the middle, but we
can’t know that for sure before inspecting the data further. The best way to do so is graphically, so next, we’ll go over a couple of options for visualizing GPX data in R.

HOW TO VISUALIZE GPX FILES IN R

When it comes to data visualization and GPX files, you have options. You can go as simple as the built-in plot() function, or you can pay for custom solutions. The best approach would be to use the ggmap package, but it relies on a Google Cloud Platform (GCP) API subscription, which isn’t free. We won’t cover it in this article, but we’ll go over the next best thing.

For starters, let’s explore the most basic option. It boils down to plotting a line chart that connects all individual data points:

    plot(x = df$lon, y = df$lat, type = "l", col = "black", lwd = 3, xlab = "Longitude", ylab = "Latitude")

The shape of the route looks right, but the visualization isn’t very useful. There’s no
underlying map below it, so we have no idea where this route takes place.

The other, significantly better alternative is the leaflet package. It’s designed for visualizing geospatial data, so it won’t have any trouble working with our data frame:

    library(leaflet)

    leaflet() %>%
      addTiles() %>%
      addPolylines(data = df, lat = ~lat, lng = ~lon, color = "#000000", opacity = 0.8, weight = 3)

Now we’re getting somewhere! The route looks almost identical to the one shown earlier on Strava, but we don’t have to stop here. You can
invest hours into producing a perfect geospatial visualization, but for the purpose of this article, we’ll display one additional thing: elevation.

Leaflet doesn’t ship with an easy way of using numeric elevation data for coloring, so we have to be somewhat creative. The get_color() function returns one of four colors, depending on the elevation group. Then, the data points for each group are added to the map manually inside a for loop:

    # rowwise(), mutate(), and lag() come from dplyr
    library(dplyr)

    get_color <- function(elevation) {
      if (elevation < 500) {
        return("green")
      }
      if (elevation < 1000) {
        return("yellow")
      }
      if (elevation < 1500) {
        return("orange")
      }
      return("red")
    }

    # New dataset with the new variable for color
    df_color <- df %>%
      rowwise() %>%
      mutate(color = get_color(elevation))

    # Color of the previous point, so neighboring color groups connect
    df_color$last_color <- dplyr::lag(df_color$color)

    # Map: one polyline layer per color group
    map <- leaflet() %>% addTiles()
    for (color in levels(as.factor(df_color$color))) {
      map <- addPolylines(map, lat = ~lat, lng = ~lon, data = df_color[df_color$color == color | df_color$last_color == color, ], color = ~color)
    }
    map
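The get_color() function above checks one elevation value at a time, which is why the code applies it row by row. As a minimal sketch of an alternative, base R's cut() can assign all four colors in a single vectorized call. The elevation values below are hypothetical and only illustrate the bin edges; they are not taken from the route.

```r
# Hypothetical elevations in meters, chosen to sit on and around the bin edges
elevation <- c(120, 499, 500, 999, 1000, 1499, 1500, 1780)

# cut() bins every value at once. right = FALSE makes each interval closed on
# the left, which matches the "<" comparisons used in get_color()
elevation_color <- as.character(cut(
  elevation,
  breaks = c(-Inf, 500, 1000, 1500, Inf),
  labels = c("green", "yellow", "orange", "red"),
  right = FALSE
))

elevation_color
```

Applied to the article's data frame, the same call on df$elevation would produce the color column without going row by row. Both approaches assign the same colors, so the map drawn above stays the same.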
The map isn’t perfect, but it shows which route segments have a higher elevation than the others.

SUMMARY OF R AND GPX

And that’s the basics of R and GPX! You’ve learned the basic theory behind this file format and how to work with it in the R programming language. We’ve only scratched the surface, as there’s plenty more you can do. For example, plotting the elevation profile or making the polyline interactive would be an excellent next step.

Now it’s time for the homework assignment. We encourage you to play around with any GPX file you can find and use R to visualize it. Feel free to explore other visualization libraries and make something truly amazing. When done, please share your results with us on Twitter (@appsilon). We’d love to see what you can come up with.

> _Want to build interactive maps with R and R Shiny? Try Leaflet and Tmap._

_Originally published at https://appsilon.com._
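
The summary mentions the elevation profile as a possible next step. The following is a minimal base R sketch of one way to approach it, not the article's own method: it computes the cumulative distance along the track with the haversine formula and plots elevation against it. The small df below is a hypothetical stand-in for the data frame built earlier, and haversine_km() and its radius_km argument are names introduced here for illustration.

```r
# Hypothetical stand-in for the df built earlier (lat, lon, elevation in meters)
df <- data.frame(
  lat = c(45.00, 45.01, 45.02, 45.03),
  lon = c(15.00, 15.00, 15.01, 15.02),
  elevation = c(300, 420, 610, 580)
)

# Great-circle distance in km between point pairs (haversine formula),
# assuming a spherical Earth with a mean radius of 6371 km
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  d_lat <- (lat2 - lat1) * to_rad
  d_lon <- (lon2 - lon1) * to_rad
  a <- sin(d_lat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(d_lon / 2)^2
  2 * radius_km * asin(sqrt(pmin(1, a)))
}

# Distance of each step between consecutive points, then the running total
n <- nrow(df)
step_km <- haversine_km(df$lat[-n], df$lon[-n], df$lat[-1], df$lon[-1])
df$distance_km <- c(0, cumsum(step_km))

# Elevation profile: elevation as a function of distance travelled
plot(x = df$distance_km, y = df$elevation, type = "l", lwd = 3,
     xlab = "Distance (km)", ylab = "Elevation (m)")
```

With the real route, the stand-in data frame would be dropped and the rest run on the df parsed from the GPX file. The sketch ignores elevation change when measuring distance, which is usually a small effect but is still a simplification.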