{"id":46734,"date":"2022-10-07T00:00:00","date_gmt":"2022-10-07T07:00:00","guid":{"rendered":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/"},"modified":"2025-11-13T12:56:20","modified_gmt":"2025-11-13T20:56:20","slug":"analyzing-nba-play-by-play-data-using-r-and-griddb","status":"publish","type":"post","link":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/","title":{"rendered":"Analyzing NBA Play-by-Play Data using R and GridDB"},"content":{"rendered":"<p>The R programming language is a favorite of data scientists for conducting statistical analysis of datasets. Generally, the analysis of large datasets benefits greatly from having a fast database backing the data, and that&#8217;s where GridDB comes in.<\/p>\n<p>For this article, we will ingest a large dataset via R and then, with the data in place, run a variety of SQL queries to see what kind of information we can glean from it. 
Lastly, because the R programming language excels at graphing data, we will plot our results with ggplot2.<\/p>\n<h2>Table of Contents<\/h2>\n<ol>\n<li>Getting Started<\/li>\n<li>Picking a Dataset<\/li>\n<li>Ingest Play-by-Play Data via hoopR<\/li>\n<li>Analyzing Play by Play Data<\/li>\n<li>Other Sorts of Ideas for Analysis<\/li>\n<li>Conclusion<\/li>\n<\/ol>\n<h2>Getting Started<\/h2>\n<p>To follow along, you can clone the repo with the following:<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-sh\">$ git clone -b r_analysis git@github.com:griddbnet\/Project.git<\/code><\/pre>\n<\/div>\n<h3>Prerequisites<\/h3>\n<p>You will need:<\/p>\n<ul>\n<li>GridDB<\/li>\n<li>R<\/li>\n<li>An R IDE like <a href=\"https:\/\/www.rstudio.com\/\">RStudio<\/a><\/li>\n<\/ul>\n<h3>Sequence of Operations<\/h3>\n<p>To run this you will need to accomplish the following:<\/p>\n<ol>\n<li>Clone the repo<\/li>\n<li>\n<p>Install the necessary libraries:<\/p>\n<ul>\n<li><code>install.packages(\"RJDBC\", dep=TRUE)<\/code><\/li>\n<li><code>install.packages(\"rJava\")<\/code><\/li>\n<li><code>install.packages(\"hoopR\")<\/code><\/li>\n<li><code>install.packages(\"nflreadr\")<\/code><\/li>\n<li><code>install.packages(\"devtools\", dep=TRUE)<\/code><\/li>\n<li><code>devtools::install_github(\"abresler\/nbastatR\")<\/code><\/li>\n<li><code>install.packages('stringr')<\/code><\/li>\n<li><code>install.packages('dplyr')<\/code><\/li>\n<li><code>install.packages('ggplot2')<\/code><\/li>\n<li><code>install.packages('lubridate')<\/code><\/li>\n<li><code>install.packages('ggalt')<\/code><\/li>\n<\/ul>\n<\/li>\n<li>\n<p>Run the ingest code (<code>ingest.R<\/code>)<\/p>\n<\/li>\n<li>\n<p>Run the querying code (<code>query.R<\/code>)<\/p>\n<\/li>\n<\/ol>\n<h2>Picking a Dataset<\/h2>\n<p>Picking an extremely large dataset can lead us down many paths &#8212; we are, after all, in the era of big data. For this article, we have opted to go in a slightly off-kilter direction: sports. 
Using the <a href=\"https:\/\/hoopr.sportsdataverse.org\/index.html\">hoopR<\/a> library, we can ingest play-by-play data from all NBA seasons starting from 2002 until the most recent season. In this case, ingesting all of the seasons did not seem necessary, so we opted to simply ingest the latest season and conduct our analysis from there.<\/p>\n<h2>Ingest Play-by-Play Data via hoopR<\/h2>\n<p>To ingest our dataset, we first need to connect to our running GridDB server.<\/p>\n<h3>Connecting to GridDB via JDBC<\/h3>\n<p>We will use JDBC to connect to our server. Luckily, there is an R package called <a href=\"https:\/\/www.rforge.net\/RJDBC\/\">RJDBC<\/a> that allows R to connect directly via JDBC. Using this package, we can simply enter our JDBC credentials and create a connection with GridDB. Once that connection is made, we can use the DBI connection to make SQL queries to our GridDB instance.<\/p>\n<p>To make the connection, we must of course import the appropriate library and then enter our credentials, including the path to the <a href=\"https:\/\/github.com\/griddb\/jdbc\">GridDB JDBC file<\/a>.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-R\">library(RJDBC)\n\ndrv &lt;- JDBC(\"com.toshiba.mwcloud.gs.sql.Driver\",\n            \"\/usr\/share\/java\/gridstore-jdbc-5.0.0.jar\")\n             #identifier.quote = \"`\")\n\nconn &lt;- dbConnect(drv, \"jdbc:gs:\/\/127.0.0.1:20001\/myCluster\/public\", \"admin\", \"admin\")<\/code><\/pre>\n<\/div>\n<p>If all of your details are correct, the <code>conn<\/code> variable will now be a DBI connection to GridDB. With this done, we can move on to ingesting our dataset.<\/p>\n<h3>Ingesting Data via JDBC<\/h3>\n<p>To ingest the play-by-play data, we will look to hoopR&#8217;s built-in functions, which attempt to do all the work for us. 
The library&#8217;s function <code>load_nba_pbp<\/code> looks to be exactly what is needed: it accepts a DBI connection as one of its parameters and will load a specific year of data into our DB connection. From looking at the source code, we can see that the data is available to us in <code>.csv<\/code> and <code>.rds<\/code> file format directly from one of hoopR&#8217;s publicly available GitHub repositories.<\/p>\n<p>So to ingest, we will load in the file directly from GitHub and ingest the data, row by row, until it is finished. Using the RJDBC API allows us to simply call <code>dbWriteTable<\/code> and it will handle creating our SQL statements for us, including creating the table.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-R\"># rds_from_url, progressively and rbindlist_with_attrs are helpers found in hoopR's source\nloader &lt;- rds_from_url\nurls &lt;- paste0(\"https:\/\/raw.githubusercontent.com\/sportsdataverse\/hoopR-data\/main\/nba\/pbp\/rds\/play_by_play_2022.rds\") \n\np &lt;- NULL\n\nout &lt;- lapply(urls, progressively(loader, p))\nout &lt;- rbindlist_with_attrs(out)\n\nout$type_abbreviation &lt;- NULL\n\n# Write one row at a time; the first call also creates the table\nfor (i in seq_len(nrow(out))) {\n  RJDBC::dbWriteTable(conn, \"nba_pbp_2022\", out[i, ], append = TRUE )\n}<\/code><\/pre>\n<\/div>\n<p>That&#8217;s the entirety of our ingest script; it simply reads in the file directly from GitHub and then goes row by row, ingesting until finished, roughly 600,000 rows of data. Once it is done, any aspect of the 2022 NBA season can be analyzed with some queries.<\/p>\n<h2>Analyzing Play by Play Data<\/h2>\n<p>To analyze the data, we can simply form SQL queries to return data we want to look at. 
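<\/p>\n<p>Before writing more specific queries, it can help to confirm from R that the ingested table is reachable. The following is a minimal, illustrative sketch rather than code taken from the repo; it assumes the <code>conn<\/code> object created earlier is still open and that the ingest script wrote to <code>nba_pbp_2022<\/code>.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-R\">library(RJDBC)\n\n# Assumes `conn` is the open connection created with dbConnect() earlier\n\n# Count the ingested rows\nrow_count &lt;- dbGetQuery(conn, \"SELECT COUNT(*) FROM nba_pbp_2022\")\nprint(row_count)\n\n# Pull a handful of rows and inspect the columns as R sees them\nsample_rows &lt;- dbGetQuery(conn, \"SELECT * FROM nba_pbp_2022 LIMIT 5\")\nstr(sample_rows)<\/code><\/pre>\n<\/div>\n<p>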
First, let&#8217;s take a look at all of the columns included in the dataset:<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-sh\">\nColumns:\nNo  Name                  Type            CSTR  RowKey\n------------------------------------------------------------------------------\n 0  shooting_play         STRING\n 1  sequence_number       STRING\n 2  period_display_value  STRING\n 3  period_number         INTEGER\n 4  home_score            INTEGER\n 5  coordinate_x          INTEGER\n 6  coordinate_y          INTEGER\n 7  scoring_play          STRING\n 8  clock_display_value   STRING\n 9  team_id               STRING\n10  type_id               STRING\n11  type_text             STRING\n12  away_score            INTEGER\n13  id                    DOUBLE\n14  text                  STRING\n15  score_value           INTEGER\n16  participants_0_athlete_id  STRING\n17  participants_1_athlete_id  STRING\n18  participants_2_athlete_id  STRING\n19  season                INTEGER\n20  season_type           INTEGER\n21  game_id               INTEGER\n22  away_team_id          INTEGER\n23  away_team_name        STRING\n24  away_team_mascot      STRING\n25  away_team_abbrev      STRING\n26  away_team_name_alt    STRING\n27  home_team_id          INTEGER\n28  home_team_name        STRING\n29  home_team_mascot      STRING\n30  home_team_abbrev      STRING\n31  home_team_name_alt    STRING\n32  home_team_spread      DOUBLE\n33  game_spread           DOUBLE\n34  home_favorite         STRING\n35  game_spread_available  STRING\n36  qtr                   INTEGER\n37  time                  STRING\n38  clock_minutes         INTEGER\n39  clock_seconds         DOUBLE\n40  half                  STRING\n41  game_half             STRING\n42  lag_qtr               DOUBLE\n43  lead_qtr              DOUBLE\n44  lag_game_half         STRING\n45  lead_game_half        STRING\n46  start_quarter_seconds_remaining  INTEGER\n47  start_half_seconds_remaining  INTEGER\n48  start_game_seconds_remaining  INTEGER\n49  game_play_number      INTEGER\n50  end_quarter_seconds_remaining  DOUBLE\n51  end_half_seconds_remaining  DOUBLE\n52  end_game_seconds_remaining  DOUBLE\n53  period                INTEGER<\/code><\/pre>\n<\/div>\n<p>To show this information we used the <a href=\"https:\/\/github.com\/griddb\/cli\">GridDB CLI&#8217;s<\/a> <code>showcontainer<\/code> command: <code>showcontainer nba_pbp_2022<\/code>.<\/p>\n<h3>Choosing Relevant Datapoints<\/h3>\n<p>Though there are of course many directions in which we can take our analysis, one of the most visually pleasing things to chart is shot makes and misses. It would be even better if we could somehow plot those datapoints onto something resembling an NBA court for proper context. Luckily for us, we can see some columns which can help us with this endeavor, namely: <code>coordinate_x<\/code>, <code>coordinate_y<\/code>, <code>score_value<\/code>, <code>shooting_play<\/code>, <code>participants_0_athlete_id<\/code>, and <code>type_id<\/code>.<\/p>\n<p>Using those columns we can grab the coordinates of a variety of different plays from specific players, from specific games, or from specific teams. In particular, <code>type_id<\/code> can correspond to many different event types. For example, we can target <code>Step Back Jump Shot<\/code> by searching for a <code>type_id<\/code> of 132. As of right now, <a href=\"https:\/\/www.espn.com\/nba\/player\/_\/id\/3945274\/luka-doncic\">Luka Doncic<\/a> comes to mind as the step back jumpshot leader, especially with <a href=\"https:\/\/www.espn.com\/nba\/player\/_\/id\/3992\/james-harden\">James Harden<\/a> being hampered by injuries as of late.<\/p>\n<p>To check if this is true, we can simply run a SQL query. 
Let&#8217;s first run this query in our shell and then, once we are happy with the results, we can move on to plotting the data.<\/p>\n<h3>Querying the Dataset<\/h3>\n<p>To start, let&#8217;s try to get the count of step back attempts by each player. We formulate our SQL query: <code>SELECT COUNT(*) FROM nba_pbp_2022 WHERE shooting_play = 'TRUE' AND participants_0_athlete_id = '3945274' AND type_id = '132'<\/code>. This query finds all instances where Luka Doncic attempted a step back jump shot, including both makes and misses. Running this query shows a blistering 450 attempts from Luka Doncic on step back jumpshots. Please note, though, that this dataset also includes the entire postseason, which of course adds some volume to this metric as he made it to round 3 of the playoffs.<\/p>\n<p>If we make the same query for James Harden, I suspect we will see a much smaller total, even though Harden is the player who popularized the move. Let&#8217;s run this query: <code>SELECT COUNT(*) FROM nba_pbp_2022  WHERE shooting_play = 'TRUE' AND participants_0_athlete_id = '3992' AND type_id = '132'<\/code>. And sure enough, we get a count of 349, with the last attempt coming in a losing effort to the Miami Heat in game 6 of the Eastern Conference Semifinals.<\/p>\n<p>But what if instead of looking at total attempts, we wanted to know who <strong>made<\/strong> more shots of this type? To do so, we can simply require a <code>score_value<\/code> of 2 or greater in our query: <code>SELECT COUNT(*) FROM nba_pbp_2022  WHERE shooting_play = 'TRUE' AND participants_0_athlete_id = '3945274' AND type_id = '132' AND score_value &gt;= 2<\/code>. 
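<\/p>\n<p>The attempt and make counts above can also be gathered from R in one pass. The following is a minimal sketch that assumes the <code>conn<\/code> object created earlier is still open; the helper name <code>count_step_backs<\/code> is hypothetical and used only for illustration.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-R\">library(RJDBC)\n\n# Hypothetical helper: count step back jump shots (type_id 132) for one\n# athlete id, optionally restricted to makes (score_value &gt;= 2)\ncount_step_backs &lt;- function(conn, athlete_id, made_only = FALSE) {\n  query &lt;- paste0(\n    \"SELECT COUNT(*) FROM nba_pbp_2022 \",\n    \"WHERE shooting_play = 'TRUE' \",\n    \"AND participants_0_athlete_id = '\", athlete_id, \"' \",\n    \"AND type_id = '132'\",\n    if (made_only) \" AND score_value &gt;= 2\" else \"\"\n  )\n  # COUNT(*) comes back as a one-row, one-column data frame\n  as.numeric(dbGetQuery(conn, query)[1, 1])\n}\n\n# Athlete ids taken from the queries above\nplayers &lt;- c(Doncic = \"3945274\", Harden = \"3992\")\n\nattempts &lt;- sapply(players, function(id) count_step_backs(conn, id))\nmakes &lt;- sapply(players, function(id) count_step_backs(conn, id, made_only = TRUE))\n\n# Shooting percentage on step backs per player\nprint(round(100 * makes \/ attempts, 1))<\/code><\/pre>\n<\/div>\n<p>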
Once we run this query for both players, we see that Harden shot 122\/349 on step back shots, while Luka shot 175\/450. Put another way, Harden shot ~35% on step back shots compared to Luka&#8217;s ~39% on higher volume; perhaps Luka is the new step back king!<\/p>\n<p>Here&#8217;s what Luka&#8217;s step back attempts look like when charted:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/raw.githubusercontent.com\/griddbnet\/Project\/r_analysis\/images\/luka_stepbacks.png\" alt=\"\" \/><\/p>\n<h3>Visualizing The Dataset<\/h3>\n<p>So now we know that Luka shot better, and on more attempts, on step back jumpshots. Wouldn&#8217;t it be cool to be able to visualize where on the court Luka was attempting and making these shots? As stated before, the play by play data includes coordinates for where a specific event occurred, so we can directly query to extract the coordinates for all step back jump shots. The next step is to plot these values onto something resembling an NBA court.<\/p>\n<p>To accomplish this feat, we can borrow the court made by the <a href=\"https:\/\/github.com\/toddwschneider\/ballr\">ballr library<\/a>. 
Once we are able to draw the court with <a href=\"https:\/\/ggplot2.tidyverse.org\/reference\/ggplot.html\">ggplot<\/a>, we scale our coordinates to match what the court expects and plot all of the precise locations of the events onto an NBA half court visualization.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-R\">library(dplyr)   # provides %>% and mutate_if\nlibrary(ggplot2) # provides ggplot and the geoms used below\n\n# score_value is selected as well because the plot colors points by it\nqueryString &lt;- \"select coordinate_x, coordinate_y, score_value from nba_pbp_2022 WHERE shooting_play = 'TRUE' AND participants_0_athlete_id = '3945274' AND type_id = '132'\"\nrs &lt;- dbGetQuery(conn, queryString )\n\nsource(\"https:\/\/raw.githubusercontent.com\/toddwschneider\/ballr\/master\/plot_court.R\")\nsource(\"https:\/\/raw.githubusercontent.com\/toddwschneider\/ballr\/master\/court_themes.R\")\nplot_court() # created the court_points object we need\ncourt_points &lt;- court_points %>% mutate_if(is.numeric,~.*10)\n\nrs &lt;- rs  %>%  mutate_if(is.numeric,~.*10)<\/code><\/pre>\n<\/div>\n<p>And then of course, the final step will be to plot all of our data points directly onto our <code>court_points<\/code>.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-R\">DBcourt &lt;- \n  ggplot(rs, aes(x=coordinate_x-250, y=coordinate_y+45)) + \n  scale_fill_manual(values = c(\"#00529b\",\"#cc4b4b\"),guide='none')+\n  geom_path(data = court_points,\n            aes(x = x, y = y, group = desc),\n            color = \"black\")+\n  coord_equal()+\n  geom_point(aes(fill=\"TRUE\",color=score_value\/10),size=1) +\n  xlim(-260, 260)+\n  labs(title=\"Shot location\",x=\"\",\n       y=\"\",\n       caption = \"with GridDB\")\n\nprint(DBcourt)<\/code><\/pre>\n<\/div>\n<p>With this snippet of code, we are plotting our coordinates for all of Luka Doncic&#8217;s step back jumpers. The makes will be in a brighter shade of blue, and the misses will be nearly black. 
Because it is 450 data points plotted onto a small court, it is a tad messy but you can get a feel of how he did throughout the season.<\/p>\n<h2>Other Sorts of Ideas for Analysis<\/h2>\n<p>Of course, analyzing just step back jumpers is only a small sliver of what you can do with this much data at your disposal. You can, for example, also look at made shots in the 4th quarter to try to extrapolate &#8220;clutchness&#8221;. Really, with this much data, the possibilities are limitless.<\/p>\n<p>Here are some examples of simple shot charts you could make for different players. First, let&#8217;s take a look at all of Steph Curry&#8217;s made baskets for the season and then chart it:<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-R\">queryString &lt;- \"select coordinate_x, coordinate_y from nba_pbp_2022 WHERE shooting_play = 'TRUE' AND participants_0_athlete_id = '3975' AND score_value >= 2\"<\/code><\/pre>\n<\/div>\n<p><img decoding=\"async\" src=\"https:\/\/raw.githubusercontent.com\/griddbnet\/Project\/r_analysis\/images\/curry-made-shots.png\" alt=\"\" \/><\/p>\n<p>Or let&#8217;s try it for Giannis Antetokounmpo:<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-R\">queryString &lt;- \"select coordinate_x, coordinate_y from nba_pbp_2022 WHERE shooting_play = 'TRUE' AND participants_0_athlete_id = '3032977' AND score_value >= 2\"<\/code><\/pre>\n<\/div>\n<p><img decoding=\"async\" src=\"https:\/\/raw.githubusercontent.com\/griddbnet\/Project\/r_analysis\/images\/Giannis.png\" alt=\"\" \/><\/p>\n<p>Klay Thompson:<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-R\">queryString &lt;- \"select coordinate_x, coordinate_y from nba_pbp_2022 WHERE shooting_play = 'TRUE' AND participants_0_athlete_id = '6475' AND score_value >= 2\"<\/code><\/pre>\n<\/div>\n<p><img decoding=\"async\" src=\"https:\/\/raw.githubusercontent.com\/griddbnet\/Project\/r_analysis\/images\/Klay.png\" alt=\"\" \/><\/p>\n<p>LeBron James:<\/p>\n<div 
class=\"clipboard\">\n<pre><code class=\"language-R\">queryString &lt;- \"select coordinate_x, coordinate_y from nba_pbp_2022 WHERE shooting_play = 'TRUE' AND participants_0_athlete_id = '1966' AND score_value >= 2\"<\/code><\/pre>\n<\/div>\n<p><img decoding=\"async\" src=\"https:\/\/raw.githubusercontent.com\/griddbnet\/Project\/r_analysis\/images\/lebron.png\" alt=\"\" \/><\/p>\n<h2>Conclusion<\/h2>\n<p>And with that, we have seen how to use GridDB with R and how to query extremely large datasets and to visualize said dataset.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The R programming language is a favorite of data scientists for conducting statistical analysis of datasets. Generally an analysis of large datasets benefit greatly from having a fast database backing the data &#8212; that&#8217;s where GridDB comes in. For this article, we will be looking to ingest a large dataset via R, and then with [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":28839,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[121],"tags":[],"class_list":["post-46734","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blog"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.1.1 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Analyzing NBA Play-by-Play Data using R and GridDB | GridDB: Open Source Time Series Database for IoT<\/title>\n<meta name=\"description\" content=\"The R programming language is a favorite of data scientists for conducting statistical analysis of datasets. 
Generally an analysis of large datasets\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Analyzing NBA Play-by-Play Data using R and GridDB | GridDB: Open Source Time Series Database for IoT\" \/>\n<meta property=\"og:description\" content=\"The R programming language is a favorite of data scientists for conducting statistical analysis of datasets. Generally an analysis of large datasets\" \/>\n<meta property=\"og:url\" content=\"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/\" \/>\n<meta property=\"og:site_name\" content=\"GridDB: Open Source Time Series Database for IoT\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/griddbcommunity\/\" \/>\n<meta property=\"article:published_time\" content=\"2022-10-07T07:00:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-11-13T20:56:20+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/wp-content\/uploads\/2022\/10\/luka-stepbacks.png\" \/>\n\t<meta property=\"og:image:width\" content=\"986\" \/>\n\t<meta property=\"og:image:height\" content=\"638\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Israel\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@GridDBCommunity\" \/>\n<meta name=\"twitter:site\" content=\"@GridDBCommunity\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Israel\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"10 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/\"},\"author\":{\"name\":\"Israel\",\"@id\":\"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/#\/schema\/person\/c8a430e7156a9e10af73b1fbb46c2740\"},\"headline\":\"Analyzing NBA Play-by-Play Data using R and GridDB\",\"datePublished\":\"2022-10-07T07:00:00+00:00\",\"dateModified\":\"2025-11-13T20:56:20+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/\"},\"wordCount\":1470,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/#organization\"},\"image\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/#primaryimage\"},\"thumbnailUrl\":\"\/wp-content\/uploads\/2022\/10\/luka-stepbacks.png\",\"articleSection\":[\"Blog\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/\",\"url\":\"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/\",\"name\":\"Analyzing NBA Play-by-Play Data using R and GridDB | GridDB: Open Source Time Series Database for 
IoT\",\"isPartOf\":{\"@id\":\"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/#primaryimage\"},\"thumbnailUrl\":\"\/wp-content\/uploads\/2022\/10\/luka-stepbacks.png\",\"datePublished\":\"2022-10-07T07:00:00+00:00\",\"dateModified\":\"2025-11-13T20:56:20+00:00\",\"description\":\"The R programming language is a favorite of data scientists for conducting statistical analysis of datasets. Generally an analysis of large datasets\",\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/#primaryimage\",\"url\":\"\/wp-content\/uploads\/2022\/10\/luka-stepbacks.png\",\"contentUrl\":\"\/wp-content\/uploads\/2022\/10\/luka-stepbacks.png\",\"width\":986,\"height\":638},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/#website\",\"url\":\"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/\",\"name\":\"GridDB: Open Source Time Series Database for IoT\",\"description\":\"GridDB is an open source time-series database with the performance of NoSQL and convenience of 
SQL\",\"publisher\":{\"@id\":\"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/#organization\",\"name\":\"Fixstars\",\"url\":\"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png\",\"contentUrl\":\"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png\",\"width\":200,\"height\":83,\"caption\":\"Fixstars\"},\"image\":{\"@id\":\"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/griddbcommunity\/\",\"https:\/\/x.com\/GridDBCommunity\",\"https:\/\/www.linkedin.com\/company\/griddb-by-toshiba\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/#\/schema\/person\/c8a430e7156a9e10af73b1fbb46c2740\",\"name\":\"Israel\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/4df8cfc155402a2928d11f80b0220037b8bd26c4f1b19c4598d826e0306e6307?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/4df8cfc155402a2928d11f80b0220037b8bd26c4f1b19c4598
d826e0306e6307?s=96&d=mm&r=g\",\"caption\":\"Israel\"},\"url\":\"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/author\/israel\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Analyzing NBA Play-by-Play Data using R and GridDB | GridDB: Open Source Time Series Database for IoT","description":"The R programming language is a favorite of data scientists for conducting statistical analysis of datasets. Generally an analysis of large datasets","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/","og_locale":"en_US","og_type":"article","og_title":"Analyzing NBA Play-by-Play Data using R and GridDB | GridDB: Open Source Time Series Database for IoT","og_description":"The R programming language is a favorite of data scientists for conducting statistical analysis of datasets. Generally an analysis of large datasets","og_url":"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/","og_site_name":"GridDB: Open Source Time Series Database for IoT","article_publisher":"https:\/\/www.facebook.com\/griddbcommunity\/","article_published_time":"2022-10-07T07:00:00+00:00","article_modified_time":"2025-11-13T20:56:20+00:00","og_image":[{"width":986,"height":638,"url":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/wp-content\/uploads\/2022\/10\/luka-stepbacks.png","type":"image\/png"}],"author":"Israel","twitter_card":"summary_large_image","twitter_creator":"@GridDBCommunity","twitter_site":"@GridDBCommunity","twitter_misc":{"Written by":"Israel","Est. 
reading time":"10 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/#article","isPartOf":{"@id":"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/"},"author":{"name":"Israel","@id":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/#\/schema\/person\/c8a430e7156a9e10af73b1fbb46c2740"},"headline":"Analyzing NBA Play-by-Play Data using R and GridDB","datePublished":"2022-10-07T07:00:00+00:00","dateModified":"2025-11-13T20:56:20+00:00","mainEntityOfPage":{"@id":"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/"},"wordCount":1470,"commentCount":0,"publisher":{"@id":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/#organization"},"image":{"@id":"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/#primaryimage"},"thumbnailUrl":"\/wp-content\/uploads\/2022\/10\/luka-stepbacks.png","articleSection":["Blog"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/","url":"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/","name":"Analyzing NBA Play-by-Play Data using R and GridDB | GridDB: Open Source Time Series Database for 
IoT","isPartOf":{"@id":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/#website"},"primaryImageOfPage":{"@id":"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/#primaryimage"},"image":{"@id":"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/#primaryimage"},"thumbnailUrl":"\/wp-content\/uploads\/2022\/10\/luka-stepbacks.png","datePublished":"2022-10-07T07:00:00+00:00","dateModified":"2025-11-13T20:56:20+00:00","description":"The R programming language is a favorite of data scientists for conducting statistical analysis of datasets. Generally an analysis of large datasets","inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/griddb.net\/en\/blog\/analyzing-nba-play-by-play-data-using-r-and-griddb\/#primaryimage","url":"\/wp-content\/uploads\/2022\/10\/luka-stepbacks.png","contentUrl":"\/wp-content\/uploads\/2022\/10\/luka-stepbacks.png","width":986,"height":638},{"@type":"WebSite","@id":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/#website","url":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/","name":"GridDB: Open Source Time Series Database for IoT","description":"GridDB is an open source time-series database with the performance of NoSQL and convenience of 
SQL","publisher":{"@id":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/#organization","name":"Fixstars","url":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/#\/schema\/logo\/image\/","url":"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png","contentUrl":"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png","width":200,"height":83,"caption":"Fixstars"},"image":{"@id":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/griddbcommunity\/","https:\/\/x.com\/GridDBCommunity","https:\/\/www.linkedin.com\/company\/griddb-by-toshiba"]},{"@type":"Person","@id":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/#\/schema\/person\/c8a430e7156a9e10af73b1fbb46c2740","name":"Israel","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/4df8cfc155402a2928d11f80b0220037b8bd26c4f1b19c4598d826e0306e6307?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/4df8cfc155402a2928d11f80b0220037b8bd26c4f1b19c4598d826e0306e6307?s=96&d=mm&r=g","caption":"Israel"},"url":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/au
thor\/israel\/"}]}},"_links":{"self":[{"href":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/wp-json\/wp\/v2\/posts\/46734","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/wp-json\/wp\/v2\/comments?post=46734"}],"version-history":[{"count":1,"href":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/wp-json\/wp\/v2\/posts\/46734\/revisions"}],"predecessor-version":[{"id":51405,"href":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/wp-json\/wp\/v2\/posts\/46734\/revisions\/51405"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/wp-json\/wp\/v2\/media\/28839"}],"wp:attachment":[{"href":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/wp-json\/wp\/v2\/media?parent=46734"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/wp-json\/wp\/v2\/categories?post=46734"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/wp-json\/wp\/v2\/tags?post=46734"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}