The Blue Alliance (TBA) stores its data in a database, such as Google Cloud Datastore. However, databases are typically heavily secured so that bad actors can't edit them directly, corrupt the data, and break the site. So instead, TBA exposes special URLs that return raw data, just like how other URLs return a webpage. The format this data is returned in is called JSON - a widely standardized format that many programming languages can understand.
You can view an example of a JSON response here: https://www.thebluealliance.com/api/v3/status
However, you'll see an error:
{"Error": "X-TBA-Auth-Key is a required header or URL param. Please get an access key at http://www.thebluealliance.com/account."}
This is because our request to the server didn't have all the information that the server wanted - specifically, we are missing a header that tells the server who we are. TBA (like many other sites that use API authentication) asks for this so that it can track which users are requesting the most data. While TBA doesn't really do anything with this information, more popular APIs - such as the official League of Legends API or the Google Maps API - use it to limit how much data each user can request, which prevents their servers from being overwhelmed.
Don't worry about the error right now - we'll fix it in our upcoming code.
Let's stick to the basics here. We'll use Python, since it's very easy to get started with. If you already have a Python environment set up, you're free to skip ahead.
We'll use the PyCharm educational IDE, since it's a great way for new Python programmers to learn. There is both an educational and a regular version of PyCharm, so feel free to grab whichever one you would prefer. The educational version has some simpler UIs and easier-to-navigate options, as well as coming with a Python version in the installer.
When installing, you'll come across this menu screen:
Note that my options for Choose Python version are greyed out, since I already have both of those Python versions installed. You should choose to install Python 3.8 if you have not installed it already (Python 2.7 is no longer officially supported). If you don't install any Python version, you won't be able to run the code!
Make sure you select Learner here.
Click New Project.
Let's take a moment to examine what all this means:
- Location is simply the folder on your hard drive where your project will be saved.
- You'll see 3 options for what to use for a "new environment" - for this, we'll simply use Virtualenv, or "virtual environment" - this will create a copy of the Python installation inside the folder you specified in Location. The reason for this is a bit out of scope for this tutorial, but effectively it lets each Python project on your hard drive keep its own external libraries separate from the others.
- The Base interpreter is simply which Python version will be copied to the virtual environment. This matters when you have multiple versions of Python installed, such as Python 3.8 and Python 2.7.
- Make sure you select the last option to create a welcome script.
You'll see a window like this:
The first thing to do to make your life a little easier is go to View -> Appearance -> Toolbar:
Now, we can click the magical green play button to run the script!
And you should see a console output the message:
Hi, PyCharm
As much as I would love to teach you the basics of Python syntax, it's out of scope of what this guide is intended for. There are tons and tons of resources for learning the basics of Python, and it's a very easy language to learn. If you aren't familiar with Python at all, you're free to continue, but I won't be explaining every aspect of the code (such as iterating through dictionaries and lists).
While it's possible to get data from a web API with the Python standard library, it's a lot easier with third-party libraries. My personal favorite and recommended library is tbapy. In order to install it, follow these steps:
- Go to File -> Settings.
- Navigate to Project: myProject -> Python Interpreter.
- You'll see that we have two default libraries installed - pip and setuptools. pip is used to download other libraries, and setuptools is used to install and distribute libraries.
- Click the + on the right side.
- Search for tbapy:
- Click Install Package.
- When it's installed successfully, you can close these windows. You may note that you have several more libraries installed, such as CacheControl, certifi, chardet, and others - these are all libraries used by tbapy, so they had to be installed too.
You'll want to get a Read API Key from here: https://www.thebluealliance.com/account
Copy the value under X-TBA-Auth-Key - that's your API key (and what the error earlier was complaining about you not having!).
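For the curious, here is a minimal sketch of what attaching that key to a request looks like with only Python's standard library. The key is a placeholder, and build_status_request and fetch_status are hypothetical helper names made up for this illustration. Nothing is actually sent unless you call fetch_status with a real key.

```python
import json
import urllib.request

STATUS_URL = "https://www.thebluealliance.com/api/v3/status"


def build_status_request(api_key):
    """Build (but do not send) a request that carries the auth header."""
    return urllib.request.Request(STATUS_URL, headers={"X-TBA-Auth-Key": api_key})


def fetch_status(api_key):
    """Send the request and decode the JSON body (needs a real key and network access)."""
    with urllib.request.urlopen(build_status_request(api_key)) as response:
        return json.loads(response.read().decode("utf-8"))


# "ThisIsMyAPIKey" is a placeholder, so we only inspect the request here.
request = build_status_request("ThisIsMyAPIKey")
print(request.full_url)
print(request.header_items())
```

This is the same header the error message asked for. The library we installed takes care of this step for us.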
Back to PyCharm. The first thing we have to do is import tbapy so that our script knows what we are talking about:
import tbapy
After that, we have to tell tbapy what our API key is so that it can go and get data for us:
tba = tbapy.TBA("ThisIsMyAPIKey")
This creates an instance of the TBA class and assigns it to the tba variable. The TBA class has lots of helper methods to help us get data, such as status:
print(tba.status())
After running, your IDE should look like this:
You'll see that the console successfully output some information regarding the status of the TBA API! This probably isn't information you actually care about, so let's try getting information about a team...
import tbapy
tba = tbapy.TBA("ThisIsMyAPIKey")
print(tba.team(2791))
Feel free to swap out 2791 for your own team number. You should get an output like this:
Team({'address': None, 'city': 'Latham', 'country': 'USA', 'gmaps_place_id': None, 'gmaps_url': None, 'home_championship': {'2020': 'Detroit'}, 'key': 'frc2791', 'lat': None, 'lng': None, 'location_name': None, 'motto': None, 'name': 'GE Energy (Power and Water)/gcom Software/Google/PVA/The Colden Company Inc/CAPCOM Federal Credit Union/NYSUT/Crisafulli Brothers/Atlas Copco/Price Chopper/Market 32&Shaker High School', 'nickname': 'Shaker Robotics', 'postal_code': '12110', 'rookie_year': 2009, 'school_name': 'Shaker High School', 'state_prov': 'New York', 'team_number': 2791, 'website': 'http://www.team2791.org'})
Which looks great! You'll notice that it looks pretty similar to JSON, except it's wrapped in a Team( ... ) block - this is just because it's actually printing an instance of a Team class that tbapy has implemented, which simply stores the JSON representation of a team internally. You can interact with this Team instance as if it were a plain JSON object.
One minor nitpick is that it's all printed on one line, so it can be a little hard for the human eye to parse. Luckily, Python has a helper for that! We'll use pprint (short for "pretty print"):
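As a minimal sketch of the idea, the snippet below pretty-prints a plain dictionary that stands in for the team data. It holds a few fields copied from the output above, so it runs without an API key. In your own script you would pass in the data you fetched instead.

```python
from pprint import pformat, pprint

# Stand-in for the team data fetched earlier: a few fields copied from the output above.
team = {
    "city": "Latham",
    "country": "USA",
    "key": "frc2791",
    "nickname": "Shaker Robotics",
    "rookie_year": 2009,
    "school_name": "Shaker High School",
    "state_prov": "New York",
    "team_number": 2791,
    "website": "http://www.team2791.org",
}

# The dictionary is too long for one line, so pprint puts each key on its own line.
pprint(team)

# Individual fields are read with ordinary dictionary access.
print(team["nickname"])
```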
Now we can much more easily see the JSON blob representing team 2791.
There is a lot more you can do with the API! You can see the full API documentation here: https://www.thebluealliance.com/apidocs/v3
And the tbapy documentation here: https://github.com/frc1418/tbapy#retrieval-functions
So, let's look at the tbapy API docs and explain what those function arguments mean.
For example:
Here, tba.team is exactly what we did in the last script we ran! 2791 (or whichever team you put) is the team parameter - you'll note that the tbapy developers noted that you can provide either the team number (2791) or the team key, which would be the string "frc2791". Team keys are pretty common within the TBA API and the official FRC APIs.
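The relationship between the two forms is simple enough to sketch. The helper below is hypothetical (it is not part of tbapy's documented API) and only illustrates how a team number maps to a team key:

```python
def to_team_key(team):
    """Return an 'frc####' team key for an int team number or an existing key."""
    if isinstance(team, int):
        return f"frc{team}"
    return team


print(to_team_key(2791))       # frc2791
print(to_team_key("frc2791"))  # frc2791
```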
Next, you'll notice [simple] - the brackets simply mean it's an optional parameter. Since the official TBA API has multiple team API endpoints, tbapy simply condensed them into one method. You can see the endpoints on the official TBA docs page:
Clicking on each of those tells you what type of data they each return - in this case, the first method returns a Team object, while the second returns a TeamSimple object. To find out what these mean, scroll all the way down to the bottom of the TBA page to find these:
You'll see that TeamSimple just returns less data, in case you don't need all of it from the full Team model.
So let's print the simple version of our team. To provide the optional simple parameter, simply replace tba.team(2791) with tba.team(2791, simple=True):
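The change is a single keyword argument. As a sketch that can be read (and tested) without a network connection, the hypothetical helper below takes the tba client as a parameter. fetch_simple_team is a name made up for this illustration, while team and simple=True come from the tbapy docs discussed above.

```python
def fetch_simple_team(tba, team_number):
    """Ask a tbapy-style client for the reduced TeamSimple model."""
    return tba.team(team_number, simple=True)


# With a real client, this would be used roughly as:
#   tba = tbapy.TBA("ThisIsMyAPIKey")
#   print(fetch_simple_team(tba, 2791))
```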
Let's say we want to find the average number of district points our team has earned across our 2019 events. First, we need to find all our 2019 events:
import tbapy
from pprint import pprint
tba = tbapy.TBA("ThisIsMyAPIKey")
all_events = tba.team_events(2168, year=2019)
Here, we are getting all events that team 2168 attended in 2019. An example event model looks like:
{'address': '100 Institute Rd, Worcester, MA 01609, USA',
'city': 'Worcester',
'country': 'USA',
'district': {'abbreviation': 'ne',
'display_name': 'New England',
'key': '2019ne',
'year': 2019},
'division_keys': [],
'end_date': '2019-04-13',
'event_code': 'necmp',
'event_type': 2,
'event_type_string': 'District Championship',
'first_event_code': 'NECMP',
'first_event_id': None,
'gmaps_place_id': 'ChIJdzY3EFkG5IkRrW3cc4Yhw8Y',
'gmaps_url': 'https://maps.google.com/?cid=14322328101321469357',
'key': '2019necmp',
'lat': 42.2745754,
'lng': -71.8062724,
'location_name': 'Worcester Polytechnic Institute',
'name': 'New England District Championship',
'parent_event_key': None,
'playoff_type': 0,
'playoff_type_string': 'Elimination Bracket (8 Alliances)',
'postal_code': '01609',
'short_name': 'New England',
'start_date': '2019-04-10',
'state_prov': 'MA',
'timezone': 'America/New_York',
'webcasts': [{'channel': 'nefirst_red', 'type': 'twitch'},
{'channel': 'nefirst_blue', 'type': 'twitch'}],
'website': 'http://www.nefirst.org/',
'week': 6,
'year': 2019}
There is some important info here - notably, it tells us that this was a district championship, that it was in the 2019ne district, and various other details. However, team_events gets us all events - including offseasons, which we don't want district points for.
By comparison, an offseason event blob may look like this:
{'address': '7802 Hague Rd, Indianapolis, IN 46256, USA',
'city': 'Indianapolis',
'country': 'USA',
'district': None,
'division_keys': [],
'end_date': '2019-07-13',
'event_code': 'iri',
'event_type': 99,
'event_type_string': 'Offseason',
'first_event_code': 'IRI',
'first_event_id': None,
'gmaps_place_id': 'ChIJt-fTNsJMa4gRk20afFmPaQU',
'gmaps_url': 'https://maps.google.com/?cid=390000457241226643',
'key': '2019iri',
'lat': 39.8961663,
'lng': -86.0349536,
'location_name': 'Lawrence North High School',
'name': 'Indiana Robotics Invitational',
'parent_event_key': None,
'playoff_type': 0,
'playoff_type_string': 'Elimination Bracket (8 Alliances)',
'postal_code': '46256',
'short_name': 'Indiana Robotics Invitational',
'start_date': '2019-07-12',
'state_prov': 'IN',
'timezone': 'America/Indiana/Indianapolis',
'webcasts': [{'channel': 'firstinspires', 'type': 'twitch'}],
'website': 'http://indianaroboticsinvitational.org/',
'week': None,
'year': 2019}
You'll note that 'district' is None - which is Python's equivalent of null in Java or nullptr in C++.
So let's iterate over all the events and print which ones are district events:
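Here is a minimal sketch of that loop. To keep it runnable without an API key, all_events is replaced by two stand-in dictionaries shaped like the two event blobs shown above, trimmed to the fields the loop needs. The district_event_names list is an extra added for this illustration.

```python
# Stand-ins for entries of all_events, trimmed to the fields used below.
all_events = [
    {
        "name": "New England District Championship",
        "district": {
            "abbreviation": "ne",
            "display_name": "New England",
            "key": "2019ne",
            "year": 2019,
        },
    },
    {"name": "Indiana Robotics Invitational", "district": None},
]

district_event_names = []
for event in all_events:
    # Offseason events don't belong to a district, so their value is None.
    if event["district"] is not None:
        event_name = event["name"]
        district_event_names.append(event_name)
        print(f'{event_name} is a district event')
```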
You'll note that we can iterate over all events with for event in all_events. We can access a given event's district value with event["district"]. We check that it's not None - which means it's an event that belongs to a district. We can then print a string with the name of the event embedded in it by using "f-strings". f'{event_name}' simply embeds the value of the event_name variable into the string.
But unfortunately our given event blobs don't include district points information. However, that information is available elsewhere in the API, so we need to make another call to the API. You'll note that the district points data is structured as such:
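As a rough sketch, assuming the shape described in the TBA API docs for an event's district points (and assuming tbapy exposes that endpoint as tba.event_district_points(event_key)), the response maps each team key to a breakdown of its points. The numbers below are placeholders for illustration only, not real 2019 results.

```python
# Placeholder values for illustration only; these are NOT real results.
district_points = {
    "points": {
        "frc2168": {
            "alliance_points": 10,
            "award_points": 5,
            "elim_points": 20,
            "qual_points": 15,
            "total": 50,
        },
        # ...one entry per team at the event
    },
    "tiebreakers": {
        "frc2168": {"highest_qual_scores": [60, 55, 50], "qual_wins": 0},
    },
}

# Look up one team's breakdown by its team key, then read the 'total' field.
team_points = district_points["points"]["frc2168"]
print(team_points["total"])
```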
We're nearly there! Now we just need to calculate an average of the 'total' field...
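The averaging step itself can be sketched as below. average_total is a hypothetical helper name, and the sample breakdowns are placeholders rather than real results. In the full script, the list would hold our team's breakdown from each event we looped over.

```python
def average_total(points_per_event):
    """Average the 'total' field over per-event breakdowns (needs at least one)."""
    totals = [points["total"] for points in points_per_event]
    return sum(totals) / len(totals)


# Placeholder breakdowns for illustration only; NOT real results.
sample = [{"total": 40}, {"total": 50}, {"total": 60}]
print(average_total(sample))  # 50.0
```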
Now here's the issue - we're including out-of-district events! 2168 is a New England team that participated in the 2019 Springside Chestnut Hill FMA event, where they didn't earn any district points. As a challenge to the reader, try calculating the average district points that 2168 earned at each New England (NE) event in 2019. Your answer should be 116.67.
Hint: Check the if statement on line 11. :)