Space Launches Dataset Exploration

The number of space launches and satellites in orbit has increased drastically in the last few years, and we would like to see how marked that trend is compared to previous decades.

We’ll be exploring launch data from the Gunter Space Page, that Gunter kindly authorized me for using. In the Chronology section, we have every space launch for the year, with a list of payloads the launch carried. After a quick analysis of the columns in the dataset and some cleaning on the columns with useful information, we’ll create some visualizations and answer some questions.

Each of our rows of our dataset will have a “payload” (something that was put in orbit, hopefully, or was lost if the launch failed), in a launch. Each launch has a unique ID. Let’s see what other columns we have. (this dataset was partially cleaned for simplicity beforehand)

import numpy as np 
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import re

df = pd.read_csv("../input/spacelaunches/clean_launch_data.csv")
df.head()

	nation	type / application	operator	contractors	equipment	configuration	propulsion	power	lifetime	mass	orbit	date	id	vehicle	site	failed	type, application
0	ussr	technology	NaN	npo energia	2 transmitters	pressurized sphere with four antennas	none	batteries	21 days	84 kg	228 km × 947 km, 65.0�	1957-10-04T03:00:00.000Z	1957 α (001)	Sputnik (1)	Ba LC-1/5	False	NaN
1	ussr	biological resaerch	NaN	NaN	NaN	r-7 core with added payload	none (after burnout)	batteries	6 days	508 kg (payload)	212 km × 1660 km, 65.3�	1957-11-03T03:00:00.000Z	1957 β (002)	Sputnik (1)	Ba LC-1/5	False	NaN
2	usa	science	nasa	naval research laboratory (nrl)	NaN	NaN	none	solar cells, batteries	NaN	1.5 kg	654 km × 3969 km, 34.25�	1957-12-06T03:00:00.000Z	1957-F01	Vanguard	CC LC-18A	True	NaN
3	usa	research, magnetosphere, micro meteorites	nasa	jet propulsion laboratory (jpl)	cosmic-ray detection package, temperature sens...	aerodynimcally shaped satellite body attached ...	none	batteries	NaN	14 kg (#1-3), 17 kg (#4-5)	356 km × 2548 km, 33.24� (#1); 186 km × 2799 k...	1958-02-01T03:00:00.000Z	1958 α (001)	Juno-1	CC LC-26A	False	NaN
4	usa	science	nasa	naval research laboratory (nrl)	NaN	NaN	none	solar cells, batteries	NaN	1.5 kg	654 km × 3969 km, 34.25�	1958-02-05T03:00:00.000Z	1958-F01	Vanguard	CC LC-18A	True	NaN

For each payload, the row indicates what country it belongs to (“nation”). But that’s not necessarily the country from where the payload was launched. That information is part of the “site” column. More information here: the vehicle that was used in the launch, and if the launch was successful.

Besides that, we know the type of the satellite, contractor, configuration, mass, the equipment the satellite carries, the propulsion method it uses, how it is powered, its expected lifetime, and mass. Many of the columns need some cleaning so we won’t concern ourselves a lot with them. Let’s see which of the columns will be most useful

df.describe()

	nation	type / application	operator	contractors	equipment	configuration	propulsion	power	lifetime	mass	orbit	date	id	vehicle	site	failed	type, application
count	9304	9165	6994	8189	5217	6339	6891	9088	4237	7140	6768	9359	9359	9359	9359	9359	137
unique	176	506	1067	1072	1139	686	349	125	392	1383	1612	5247	5956	497	207	2	6
top	usa	communication	gru	tsskb	?	cubesat (3u)	?	solar cells, batteries	~ 1 year (iss orbit); 2-3 years (sso)	~260 kg	geo	2017-02-15T03:00:00.000Z	2017-008	Soyuz-U	CC SLC-40	False	earth observation, traffic monitoring
freq	3753	1821	870	750	961	709	1729	2372	383	478	1219	104	104	796	575	8566	111

As we gathered from peeking at the first rows, most of the columns need more cleaning in order to be useful: there are many unique values (power, contractor, type of the satellite), so we won’t be able to make much sense of them without more work: there seem to be 349 different types of propulsion and power configurations, the “type / application column” seems to be indicated as “type , application” in some cases and as “type / application” in others.

Looking at the equipment column, it’s easy to tell it needs much more work. It won’t used for now. Instead, we’ll focus on columns regarding the launch to get a sense of long term trends in the access countries have been gaining to space, and leave the information about individual payloads for some other time.

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 9359 entries, 0 to 9358
Data columns (total 17 columns):
 #   Column              Non-Null Count  Dtype 
---  ------              --------------  ----- 
 nation              9304 non-null   object
 type / application  9165 non-null   object
 operator            6994 non-null   object
 contractors         8189 non-null   object
 equipment           5217 non-null   object
 configuration       6339 non-null   object
 propulsion          6891 non-null   object
 power               9088 non-null   object
 lifetime            4237 non-null   object
 mass                7140 non-null   object
orbit               6768 non-null   object
date                9359 non-null   object
id                  9359 non-null   object
vehicle             9359 non-null   object
site                9359 non-null   object
failed              9359 non-null   bool  
type, application   137 non-null    object
dtypes: bool(1), object(16)
memory usage: 1.2+ MB

The nation column has some missing values. Not a lot (some 0.5% of all data), but that column will be useful later. There is no reasonable way of inferring the nation that owns the payload in rows that don’t have that information available, so we’ll complete with “Unknown”.

df["nation"] = df["nation"].fillna("unknown")

df["equipment"].dropna().value_counts().head(30)
#Best stay away from this for now!

?                                                                                                         961
ku/ka-band payload, optical inter-satellite link                                                          478
ps0, ps1 or ps2 imaging payload                                                                           383
ku-band payload                                                                                           129
see above                                                                                                 117
stratos (gps radio occultation payload), sense (ais-receiver), airsafe (asd-b payload, #78 and later)     111
package                                                                                                    96
ftor-2 camera package, kust-12m elint package                                                              81
ftor-4 camera package                                                                                      76
l-band payload, ka-band cross-links, ka-band downlinks, ads-b payload, ais-receiver (on 58 satellites)     75
ftor-6 camera package                                                                                      61
priroda-4 imaging system (2 × kfa-1000, 3 × kate-200)                                                      53
j-1 camera, index camera                                                                                   52
radar                                                                                                      37
tv, ir, ac                                                                                                 36
6 c-band transponders, 1 ku-band transponder                                                               35
3 delta-1 c-band transponders, 3 delta-2 c-band transponders                                               35
docking system                                                                                             30
zhemchug-4 camera                                                                                          30
24 c-band transponders                                                                                     30
1 (+1) uhf 200 w transponder, c-band uplink                                                                27
none                                                                                                       27
priroda-3                                                                                                  27
mural-camera, index camera                                                                                 26
ais-receiver                                                                                               25
camera                                                                                                     23
tv, ir, sm, rmk-2                                                                                          21
32 ku-band transponders                                                                                    21
j-3 camera, disic camera                                                                                   17
12 c-band transponders                                                                                     17
Name: equipment, dtype: int64

Let us start working with the date and the failed column. First, take a look at how many launches we had every year. Launches have been more frequent these last few years, and we’d expect to see that reflected in the data. We’ll be using the “failed” column to see how many launches were successful; the “site” column, that denotes the launch site for the launches (and indirectly, the country); and the nation column: what country the payload being put in orbit belongs to.

First, let’s extract the year and the month of each launch, and calculate the launch success rate each year. After some quick visualizations, both to get a sense of the data and to find some trends that seem evident, we’ll be in position to answer a few questions.

#Fixing the type / application column
df["type"] = df["type / application"].fillna("").astype(str)  +df["type, application"].fillna("").astype(str) 
df = df.drop(["type / application","type, application"], axis = 1)
df.head()

df["date"] = pd.to_datetime(df["date"])
df["year"] = df["date"].dt.year
df["month"] = df["date"].dt.month

First analysis - number of launches, and success rate

We’ll plot how many launches took place each year, regardless of how many payloads it was carrying, to see if the data makes sense. For that, since the dataset has a row for each payload, we’ll first grab the year and id column, and determine the number of separate space launches each year using groupby.

#on "id" we have the id of the launch. We'll only keep those two columns, and drop the duplicates to get the unique launches each year.
per_year_df = df[ ["year","id"] ].drop_duplicates()
#Now, we group by year, get the size, and with that plot the number of launches.
per_year_df.groupby("year").size().plot(kind="line",figsize=(16,4))

<matplotlib.axes._subplots.AxesSubplot at 0x7f8b66091810>

png

Interestingly enough, launches have indeed become more frequent lately, but are still far from the peak in the Cold War. If we plot the total number of launches each decade, the 70s and 80s are the most active. That information seems to make sense. We also see that for 2020 we have a drop in the number of launches, since the year is not done yet.

per_year_df.hist(bins=[1960,1970,1980,1990,2000,2010,2020])

array([[<matplotlib.axes._subplots.AxesSubplot object at 0x7f8b65e92990>]],
      dtype=object)

png

Let’s take a closer look now at the success and the failures of the launches. For that, we’ll grab the id and failed column of our dataset, and add the number of items launched into space, using size and groupby.

#Failure Rate, another plot in the same showing failure trend
per_year_df = df[ ["year","id","failed"] ].drop_duplicates()
per_year_df.head()

	year	id	failed
0	1957	1957 α (001)	False
1	1957	1957 β (002)	False
2	1957	1957-F01	True
3	1958	1958 α (001)	False
4	1958	1958-F01	True

Now, plot them to see the percentage of successful launches each year. For that it will be necessary to re-arrange our data set, and add a success rate column.

failures_per_year = per_year_df.groupby(["year","failed"]).size().unstack().reset_index();
failures_per_year["success"] = (failures_per_year[False])
failures_per_year["failures"] = (failures_per_year[True])
failures_per_year["total_launches"] = (failures_per_year[True]+failures_per_year[False])
failures_per_year["success_rate"] = 100*failures_per_year[False]/failures_per_year["total_launches"]
failures_per_year = failures_per_year[ ["year","success","failures","total_launches","success_rate"] ]

#failures_per_year.tail(10)

success_year= pd.melt(failures_per_year,id_vars=["year"],value_vars=["total_launches","failures","success_rate"],var_name="variable")

#success_year.tail(10)

“Melt” is used to unpivot our data, turning “success”,”failures”, and “success_rate” columns to values of a single column, named variable, in order to use a grid and analyze each of these variables separately.

facets = sns.FacetGrid(success_year, col="variable",sharex=False,sharey=False)
facets.map(sns.lineplot,"year","value")

<seaborn.axisgrid.FacetGrid at 0x7f8b65e4b990>

png

As one would expect, success rate has increased a lot since the first few years, and is around 90-95% in these last few years. With our first global view done, let’s try and turn our attention to this same information, but looking at the countries launching these satellites, and taking a closer look at the payload being put into orbit.

Launch Sites - Cleaning, and joining

Let’s now take a look at our site information. It indicates the country where the launch site is located, it has a code, and a “details” column. “suo” indicates it’s a Suborbital Launch Site, and orb that it is an orbital launch site. The “t” at the end indicates the launch site (https://space.skyrocket.de/directories/launchsites.htm) The rest is probable left over html from web scrapping.

sites = pd.read_json("../input/spacelaunches/sites.json")
sites.head(25)

	country	raw	details	code	name
0	Algeria	Ha= Centre interarmées d’essais d’engins spéc...	suo	Ha	Centre interarmées d’essais d’engins spéciau...
1	Algeria	Reg= Reggane, Adrar Province	suo	Reg	Reggane, Adrar Province
2	Antarctica	DdU= Dumont d'Urville Station, Terre Adelie †	suot	DdU	Dumont d'Urville Station, Terre Adelie †
3	Antarctica	Mat= Base Aerea Teniente Benjamin Matienzo, N...	suot	Mat	Base Aerea Teniente Benjamin Matienzo, Nunat...
4	Antarctica	McM= McMurdo Station, Ross Ice Shelf, New Zea...	suot	McM	McMurdo Station, Ross Ice Shelf, New Zealand...
5	Antarctica	Mol= Molodezhnaya Station, Enderby Land, Aust...	suo	Mol	Molodezhnaya Station, Enderby Land, Australi...
6	Antarctica	Rot= Rothera Research Station, British Antarc...	suot	Rot	Rothera Research Station, British Antarctic ...
7	Antarctica	Sip= Siple Station, Ellsworth Land †	suot	Sip	Siple Station, Ellsworth Land †
8	Antarctica	Sho= Showa Station (Syowa Station), East Ongu...	suot	Sho	Showa Station (Syowa Station), East Ongul Is...
9	Antarctica	VcM= Vicecomodoro Marambio Station, Seymour I...	suot	VcM	Vicecomodoro Marambio Station, Seymour Islan...
10	Antigua and Barbuda	Ant= Antigua Air Force Station, Antigua †	suot	Ant	Antigua Air Force Station, Antigua †
11	Argentina	PB= Base Naval Puerto Belgrano, Punta Alta, C...	orbp option	PB	Base Naval Puerto Belgrano, Punta Alta, Coro...
12	Argentina	Cha= Centro de Ensayo y Lanzamiento de Proyec...	suo	Cha	Centro de Ensayo y Lanzamiento de Proyectile...
13	Argentina	LPa= Las Palmas, Chaco †	suot	LPa	Las Palmas, Chaco †
14	Argentina	MCh= CELPA Atlantica (CELAT), Mar Chiquita, B...	suo	MCh	CELPA Atlantica (CELAT), Mar Chiquita, Bueno...
15	Argentina	PdI= Puente del Inca, Mendoza †	suot	PdI	Puente del Inca, Mendoza †
16	Argentina	PIn= Polo Espacial Punta Indio, Capetina, Bue...	suot	PIn	Polo Espacial Punta Indio, Capetina, Buenos ...
17	Argentina	Ser= Poligono de Salinas Grandes, Serrezuela,...	suot	Ser	Poligono de Salinas Grandes, Serrezuela, Cor...
18	Argentina	Tar= Tartagal, Salta †	suot	Tar	Tartagal, Salta †
19	Argentina	VR= Villa Reynolds Air Base, Mercedes, San Lu...	suot	VR	Villa Reynolds Air Base, Mercedes, San Luis ...
20	Australia	AnP= Anna Plains, Western Australia †	suot	AnP	Anna Plains, Western Australia †
21	Australia	Car= Carnarvon Quobba Site, Western Australia †	suot	Car	Carnarvon Quobba Site, Western Australia †
22	Australia	CaN= Carnarvon North Site, Western Australia †	suot	CaN	Carnarvon North Site, Western Australia †
23	Australia	Lan= Lancelin, Perth, Western Australia †	suot	Lan	Lancelin, Perth, Western Australia †
24	Australia	Wo= Woomera Instrumented Range, Woomera, Sout...	orb	Wo	Woomera Instrumented Range, Woomera, South A...

The only useful information right now is in the country and the code, in order to join them to our launch dataset. We’ll keep those columns, and build a new dataframe with the launch data. Tweaking the site column in our main dataframe will be necessary since it includes the launch pad (Ba LC 1/5, for instance)

df["site_code"] = df["site"].apply(lambda x: x.split(" ")[0]).astype("string")
print("Has null values? ", len(df["site_code"])!=len(df["site_code"].dropna()))
sites.set_index("code",inplace=True)

Has null values?  False

In order to properly join both datasets, we need to see if the site launch codes specified in our launch data matches the site code.

#Check different values for the site codes.
launch_site_codes = set(df["site_code"].unique())
site_codes = set(sites.index.unique())
unmatched_codes = list(launch_site_codes-site_codes)
unmatched_codes

['',
 'WI,',
 'In,',
 '@BaS',
 'Va,',
 'SLC-36A',
 'CC,',
 'Kw,',
 'Mo,',
 'BaS',
 'Ed,',
 'CCK',
 'SLC-36B',
 'Ga,',
 '@YS']

Some are empty, some have commas, others seem to have capitalization issues, other include a @. Let’s define a function to replace our simple “split” operation then, since the cleaning operation won’t be as simple as we thought. We’ll apply that function to the site column, and check again for unmatched values.

def clean_site_code(raw_site):
    return re.sub("@|,|-.*","",raw_site.split(" ")[0]).strip()

def unmatched_codes(launch_site_codes,site_codes):
    unmatched_code_list = list(set(launch_site_codes)-set(site_codes))
    return unmatched_code_list

df["site_code"] = df["site"].apply(clean_site_code).astype("string")
unmatched = unmatched_codes(df["site_code"].unique(),sites.index.unique())
df[df["site_code"].isin(unmatched)]["site_code"].value_counts()

CCK    473
YS       7
BaS      5
SLC      4
         1
Name: site_code, dtype: Int64

Almost done, but more information is needed about that CCK site, since it accounts for almost all of our missing values. Also, it’s necessary to check for the rest of the missing values by visiting the page of the launch year.

#What payloads have incorrect launch sites (besides CCK)? With the year of the launch site, we'll revisit our source of the data.

df[ (df["site_code"].isin(unmatched)) & ( df["site_code"] != "CCK")][["year","site_code","site"]]

	year	site_code	site
4897	1998	BaS	BaS
4898	1998	BaS	BaS
4899	1998	BaS	BaS
5310	2002		LC-1/5
5480	2004	SLC	SLC-36A
5484	2004	SLC	SLC-36B
5495	2004	SLC	SLC-36B
5517	2004	SLC	SLC-36A
5573	2005	BaS	@BaS
5648	2006	BaS	@BaS K-84
8493	2019	YS	@YS (Tai Rui)
8494	2019	YS	@YS (Tai Rui)
8495	2019	YS	@YS (Tai Rui)
8496	2019	YS	@YS (Tai Rui)
8497	2019	YS	@YS (Tai Rui)
8498	2019	YS	@YS (Tai Rui)
8499	2019	YS	@YS (Tai Rui)

I went back to explore the source of the data, and checked the years with problematic launch sites. (https://space.skyrocket.de/doc_chr/lau2019.htm, https://space.skyrocket.de/doc_chr/lau1998.htm, etc) . CCK seems to correspond to Cape Canaveral. Since the “CCK” is listed as a Launch Site different from “CC”, I’ll manually add it to our sites list. There are only a few sites, it’s reasonable to go over them one by one and review the other problematic launch codes. Remember the goal is to determine which country they belong to. “LC-1/5” and “SLC-x”, appear to be space lauch complexes in Cape Canaveral. BaS seems to be Ba, and YS is China, not completely sure where. We’ll add an entry for that site as well.

new_sites = [
    ["USA","NASA John F. Kennedy Space Center, Cape Canaveral, Florida, USA","orb","NASA John F. Kennedy Space Center, Cape Canaveral, Florida, USA"],
    ["China","Unknown, China","orb","Unknown, China"]
]
cck_ys = pd.DataFrame(new_sites
                   ,columns=["country","raw","details","name"],
                   index=["CCK","YS"])

sites = sites.append(cck_ys)

unmatched = unmatched_codes(df["site_code"].unique(),sites.index.unique())
df[df["site_code"].isin(unmatched)][["year","site_code"]]

#df.loc[df["site"]=="LC-1/5", "site_code" ] = "CC"
df.loc[df["site_code"]=="", "site_code" ] = "CC"
df.loc[df["site_code"]=="SLC", "site_code" ] = "CC"
df.loc[df["site_code"]=="BaS", "site_code" ] = "Ba"
#df[df["site_code"]=="SLC"]

print("Number of unmatched site codes: ", len(df[df["site_code"].isin(unmatched)][["year","site_code","site"]]))

Number of unmatched site codes:  0

0 unmatched site codes. Everything is in order to join both datasets.

joined = df.join(sites, on="site_code", lsuffix="site__")
joined.head()

	nation	operator	contractors	equipment	configuration	propulsion	power	lifetime	mass	orbit	...	site	failed	type	year	month	site_code	country	raw	details	name
0	ussr	NaN	npo energia	2 transmitters	pressurized sphere with four antennas	none	batteries	21 days	84 kg	228 km × 947 km, 65.0�	...	Ba LC-1/5	False	technology	1957	10	Ba	Kazakhstan	Ba= Baikonur (Tyuratam, NIIP-5, GIK-5), Tyuratam	orb	Baikonur (Tyuratam, NIIP-5, GIK-5), Tyuratam
0	ussr	NaN	npo energia	2 transmitters	pressurized sphere with four antennas	none	batteries	21 days	84 kg	228 km × 947 km, 65.0�	...	Ba LC-1/5	False	technology	1957	10	Ba	USSR / Russia	Ba= Baikonur (Tyuratam, NIIP-5, GIK-5), Tyuratam	orb	Baikonur (Tyuratam, NIIP-5, GIK-5), Tyuratam
1	ussr	NaN	NaN	NaN	r-7 core with added payload	none (after burnout)	batteries	6 days	508 kg (payload)	212 km × 1660 km, 65.3�	...	Ba LC-1/5	False	biological resaerch	1957	11	Ba	Kazakhstan	Ba= Baikonur (Tyuratam, NIIP-5, GIK-5), Tyuratam	orb	Baikonur (Tyuratam, NIIP-5, GIK-5), Tyuratam
1	ussr	NaN	NaN	NaN	r-7 core with added payload	none (after burnout)	batteries	6 days	508 kg (payload)	212 km × 1660 km, 65.3�	...	Ba LC-1/5	False	biological resaerch	1957	11	Ba	USSR / Russia	Ba= Baikonur (Tyuratam, NIIP-5, GIK-5), Tyuratam	orb	Baikonur (Tyuratam, NIIP-5, GIK-5), Tyuratam
2	usa	nasa	naval research laboratory (nrl)	NaN	NaN	none	solar cells, batteries	NaN	1.5 kg	654 km × 3969 km, 34.25�	...	CC LC-18A	True	science	1957	12	CC	USA	CC= Cape Canaveral Air Force Station, Eastern...	orb	Cape Canaveral Air Force Station, Eastern Te...

5 rows × 23 columns

There’s a bit of a situation here. Launchpads that are in what was the Soviet Union are listed twice apparently. Ideally, we’d like to consider the country as “USSR” if the launch was before it collapsed. We’ll need to determine when our data first registers “Russia” or another ex-USSR country, establish that as the “collapse_date” of the Soviet Union, and modify our sites dataset accordingly.

ussr_df = df[(df["nation"] == "russia") | (df["nation"] == "ussr")][ ["year","nation"] ]
grid = sns.FacetGrid(ussr_df,col="nation")
grid.map(plt.hist,"year")

<seaborn.axisgrid.FacetGrid at 0x7f8b6718ed50>

png

Apparently that won’t do. Some satellites are considered to be from “Russia” in the the 80s, and some are considered soviet well into the 90s. Remember, the data isn’t perfect. So we’ll say the launch sites belong to the USSR for dates before the dissolution of the Soviet Union. Another thing to do is finding other ambiguous site codes and fix them.

sites.reset_index().head()
#sites[sites["country"].str.contains("USSR") | sites["country"].str.contains("Uzb")]
sites.reset_index().groupby("index").filter(lambda x: len(x) > 1)

	index	country	raw	details	name
85	Ba	Kazakhstan	Ba= Baikonur (Tyuratam, NIIP-5, GIK-5), Tyuratam	orb	Baikonur (Tyuratam, NIIP-5, GIK-5), Tyuratam
121	Sin	North Korea	Sin= Sinpo Shipyard, South Hamgyong Province	mis	Sinpo Shipyard, South Hamgyong Province
123	Sin	North Korea	Sin= Sinpo Shipyard, South Hamgyong Province	mist	Sinpo Shipyard, South Hamgyong Province
194	WI	USA	WI= Wallops Flight Facility, Wallops Island, ...	orb	Wallops Flight Facility, Wallops Island, Vir...
195	WI	USA	WI= Mid-Atlantic Regional Spaceport (MARS), W...	orb	Mid-Atlantic Regional Spaceport (MARS), Wall...
198	Ba	USSR / Russia	Ba= Baikonur (Tyuratam, NIIP-5, GIK-5), Tyuratam	orb	Baikonur (Tyuratam, NIIP-5, GIK-5), Tyuratam
204	Kat	USSR / Russia	Kat= Kattakurgan missile base, Samarkand regi...	mis	Kattakurgan missile base, Samarkand region, ...
228	Kat	Uzbekistan	Kat= Kattakurgan missile base, Samarkand regi...	mis	Kattakurgan missile base, Samarkand region, ...

We found some other issues as well: There seems to be a duplicate value for North Korea (a temporary site, that became permanent later?), and a misscategorized one in the USA (Wallops Flight Facility).

sites.reset_index(inplace=True)
sites.head()

	index	country	raw	details	name
0	Ha	Algeria	Ha= Centre interarmées d’essais d’engins spéc...	suo	Centre interarmées d’essais d’engins spéciau...
1	Reg	Algeria	Reg= Reggane, Adrar Province	suo	Reggane, Adrar Province
2	DdU	Antarctica	DdU= Dumont d'Urville Station, Terre Adelie †	suot	Dumont d'Urville Station, Terre Adelie †
3	Mat	Antarctica	Mat= Base Aerea Teniente Benjamin Matienzo, N...	suot	Base Aerea Teniente Benjamin Matienzo, Nunat...
4	McM	Antarctica	McM= McMurdo Station, Ross Ice Shelf, New Zea...	suot	McMurdo Station, Ross Ice Shelf, New Zealand...

sites.at[123,"index"] = "Sin-2"
sites.at[195,"index"] = "W-MARS"


sites.groupby("index").filter(lambda x: len(x) > 1)
sites.set_index("index",inplace=True)

sites.head()

ambiguous_sites = sites[ sites["country"]=="USSR / Russia" ]

df.loc[ (df["year"]<1992) & (df["site_code"].isin(ambiguous_sites.index) ), "site_code" ] = df["site_code"] + "-USSR"

ambiguous_sites.index = ambiguous_sites.index + "-USSR"
sites = sites.append(ambiguous_sites)

sites.tail()

	country	raw	details	name
index
Vla-USSR	USSR / Russia	Vla= Vladimirovka Poligon	suo	Vladimirovka Poligon
Vo-USSR	USSR / Russia	Vo= Vostochniy, Amurskaya Oblast'	orb	Vostochniy, Amurskaya Oblast'
Vor-USSR	USSR / Russia	Vor= Vorkuta Sovetskiy, Vorkuta, Komi Republic	mis	Vorkuta Sovetskiy, Vorkuta, Komi Republic
Ya-USSR	USSR / Russia	Ya= Yasnaya missile base, Chita Oblast	mis	Yasnaya missile base, Chita Oblast
Yur-USSR	USSR / Russia	Yur= Yurya missile site	mis	Yurya missile site

Let’s now join again, and plot the launches for each year, by country.

joined = df.join(sites, on="site_code")
#print("Joined dataframe length: ",len(joined))
joined.head(10)

	nation	operator	contractors	equipment	configuration	propulsion	power	lifetime	mass	orbit	...	site	failed	type	year	month	site_code	country	raw	details	name
0	ussr	NaN	npo energia	2 transmitters	pressurized sphere with four antennas	none	batteries	21 days	84 kg	228 km × 947 km, 65.0�	...	Ba LC-1/5	False	technology	1957	10	Ba-USSR	USSR / Russia	Ba= Baikonur (Tyuratam, NIIP-5, GIK-5), Tyuratam	orb	Baikonur (Tyuratam, NIIP-5, GIK-5), Tyuratam
1	ussr	NaN	NaN	NaN	r-7 core with added payload	none (after burnout)	batteries	6 days	508 kg (payload)	212 km × 1660 km, 65.3�	...	Ba LC-1/5	False	biological resaerch	1957	11	Ba-USSR	USSR / Russia	Ba= Baikonur (Tyuratam, NIIP-5, GIK-5), Tyuratam	orb	Baikonur (Tyuratam, NIIP-5, GIK-5), Tyuratam
2	usa	nasa	naval research laboratory (nrl)	NaN	NaN	none	solar cells, batteries	NaN	1.5 kg	654 km × 3969 km, 34.25�	...	CC LC-18A	True	science	1957	12	CC	USA	CC= Cape Canaveral Air Force Station, Eastern...	orb	Cape Canaveral Air Force Station, Eastern Te...
3	usa	nasa	jet propulsion laboratory (jpl)	cosmic-ray detection package, temperature sens...	aerodynimcally shaped satellite body attached ...	none	batteries	NaN	14 kg (#1-3), 17 kg (#4-5)	356 km × 2548 km, 33.24� (#1); 186 km × 2799 k...	...	CC LC-26A	False	research, magnetosphere, micro meteorites	1958	2	CC	USA	CC= Cape Canaveral Air Force Station, Eastern...	orb	Cape Canaveral Air Force Station, Eastern Te...
4	usa	nasa	naval research laboratory (nrl)	NaN	NaN	none	solar cells, batteries	NaN	1.5 kg	654 km × 3969 km, 34.25�	...	CC LC-18A	True	science	1958	2	CC	USA	CC= Cape Canaveral Air Force Station, Eastern...	orb	Cape Canaveral Air Force Station, Eastern Te...
5	usa	nasa	jet propulsion laboratory (jpl)	cosmic-ray detection package, temperature sens...	aerodynimcally shaped satellite body attached ...	none	batteries	NaN	14 kg (#1-3), 17 kg (#4-5)	356 km × 2548 km, 33.24� (#1); 186 km × 2799 k...	...	CC LC-26A	True	research, magnetosphere, micro meteorites	1958	3	CC	USA	CC= Cape Canaveral Air Force Station, Eastern...	orb	Cape Canaveral Air Force Station, Eastern Te...
6	usa	nasa	naval research laboratory (nrl)	NaN	NaN	none	solar cells, batteries	NaN	1.5 kg	654 km × 3969 km, 34.25�	...	CC LC-18A	False	science	1958	3	CC	USA	CC= Cape Canaveral Air Force Station, Eastern...	orb	Cape Canaveral Air Force Station, Eastern Te...
7	usa	nasa	jet propulsion laboratory (jpl)	cosmic-ray detection package, temperature sens...	aerodynimcally shaped satellite body attached ...	none	batteries	NaN	14 kg (#1-3), 17 kg (#4-5)	356 km × 2548 km, 33.24� (#1); 186 km × 2799 k...	...	CC LC-5	False	research, magnetosphere, micro meteorites	1958	3	CC	USA	CC= Cape Canaveral Air Force Station, Eastern...	orb	Cape Canaveral Air Force Station, Eastern Te...
8	ussr	NaN	NaN	see above	NaN	none	batteries, solar cells	~ 1 month	1327 kg	217 km × 1864 km, 65.2�	...	Ba LC-1/5	True	scientific	1958	4	Ba-USSR	USSR / Russia	Ba= Baikonur (Tyuratam, NIIP-5, GIK-5), Tyuratam	orb	Baikonur (Tyuratam, NIIP-5, GIK-5), Tyuratam
9	usa	nasa	naval research laboratory (nrl)	NaN	NaN	none	batteries	NaN	10 kg	NaN	...	CC LC-18A	True	science	1958	4	CC	USA	CC= Cape Canaveral Air Force Station, Eastern...	orb	Cape Canaveral Air Force Station, Eastern Te...

10 rows × 23 columns

Exploring the country data

We’ll aggregate the data we have with the agg function, which is more flexible than .count(), or .sum(), and allows us to specify what to do with each column: Nation has the country the individual payload belongs to.

With that we can begin answering some questions, such as how many nations put payloads in space each year, how many countries have launch capabilities, and how many payloads are placed into space with each launch.

unique_launches = joined.groupby("id",as_index=False).agg({'nation': 'nunique', 
                                                           'type': 'nunique', 'year': 'first', 'month': 'first', 
                                                           'country': 'first',"site":"first", "power":"count","vehicle":"first","date":"first","failed":"first","equipment":"count"})

unique_launches.loc[ (unique_launches["year"]>=1992) & (unique_launches["country"]=="USSR / Russia" ), "country" ] = "Russia"

unique_launches = unique_launches.rename(columns={"power":"launches","equipment":"payload_total"})

agg_map = {"country":"nunique","type":"nunique","launches":"count","nation":"sum","vehicle":"nunique","failed":"sum","payload_total":"sum"}
countries_year = unique_launches.groupby("year",as_index=False).agg(agg_map)
successful_countries_year = unique_launches[unique_launches["failed"]==False].groupby("year",as_index=False).agg(agg_map)

countries_year["success_rate"] = 100 - 100*(countries_year["failed"] / countries_year["launches"])
countries_year["countries_per_launch"] = 100*(countries_year["country"] / countries_year["launches"])
countries_year["nations_per_launch"] = 100*(countries_year["nation"] / countries_year["launches"])

countries_year.head()

	year	country	type	launches	nation	vehicle	failed	payload_total	success_rate	countries_per_launch	nations_per_launch
0	1957	2	1	3	3	2	1.0	1	66.666667	66.666667	100.0
1	1958	2	1	28	28	9	21.0	20	25.000000	7.142857	100.0
2	1959	2	1	25	25	11	12.0	16	52.000000	8.000000	100.0
3	1960	2	2	38	38	18	19.0	26	50.000000	5.263158	100.0
4	1961	2	3	50	50	15	21.0	37	58.000000	4.000000	100.0

Furthermore, we create a table with nation information summarized by year and month.

nations_year_month = joined.groupby(["year","month"],as_index=False).agg({'nation': 'nunique'})

Visualizations

Now that most of the data has been processed and is in a usable form, it is time to start visualizing some of it, using Seaborn. We are mainly concerned with the data summarized by year.

#Add color map.
from matplotlib.colors import ListedColormap
cmap = ListedColormap(sns.color_palette("RdYlGn",10))

Success rate

#Let's analize the error/success rate each year
fig = plt.figure(figsize=(20,6))

ax = sns.barplot(x="year",y="launches",data=countries_year,color="green")
sns.barplot(x="year",y="failed",data=countries_year,color="red",ax=ax)

#ax.legend(countries_year,["A","B"])
ax.set_xticklabels([year if year%5==0 else "" for year in countries_year["year"]])
ax.set_title("Number of space launches each year (successful and unsuccessful)", fontsize=20)

Text(0.5, 1.0, 'Number of space launches each year (successful and unsuccessful)')

png

Access to space

Now, to visualize how access to space has grown lately, let’s go back to our original table, group by month and year and count the different countries that launched to space in that month. After that, we’ll build a pivot table, and plot the different countries launching from their soil in a heatmap, and the number of nations puting satellites in orbit.

Let’s first take a look the nation column to see the different nations launching to space.

print(df["nation"].value_counts())
print(df["nation"].value_counts().tail(40))

usa                     3753
ussr                    1847
ussr / russia            904
china                    540
russia                   419
                        ... 
guatemala                  1
usa, uk, netherlands       1
japan, usa                 1
china → indonesia          1
costa rica                 1
Name: nation, Length: 177, dtype: int64
portugal                                 1
bangladesh                               1
israel, france                           1
usa, spain                               1
uk, netherlands                          1
russia, australia                        1
romania                                  1
usa, argentina                           1
ethiopia                                 1
international (eutelsat), qatar          1
usa / international                      1
france, italy                            1
russia → china                           1
greece, saudi arabia                     1
germany, uk, usa                         1
usa, europe                              1
rwanda                                   1
international → afghanistan              1
india, russia                            1
france, usa                              1
canada, usa                              1
germany, luxembourg                      1
netherlands, germany, belgium, israel    1
italy, iraq                              1
europe, russia, usa                      1
turkmenistan                             1
france, europe                           1
turkmenistan, monaco                     1
azerbaijan                               1
italy, france                            1
indonesia, germany                       1
azerbaijan, international                1
usa, canada                              1
usa → canada                             1
algeria, uk                              1
guatemala                                1
usa, uk, netherlands                     1
japan, usa                               1
china → indonesia                        1
costa rica                               1
Name: nation, dtype: int64

Ownership of satellites does not seem so simple. Some satellites belong to several countries, and after some research some seem to be transfered while in orbit (denoted by the arrow -> in the field). We’ll clean this column before ploting. Let’s see how many nations own a single satellite, and how many have been transfered.

#df[ df["nation"].fillna("unknown").str.contains("→") ].count()
multi_ownership = df[ df["nation"].str.count(",") > 0 ]
transferred = df[ df["nation"].str.count("→") > 0 ]

print("Satellites with multiple owners: ", len(multi_ownership) )
print("Transferred satellites: ", len(transferred) )


df["country_count"] = 1 + df["nation"].str.count(",")

fix, axes = plt.subplots(3,1,figsize=(10,10))

axes[0].set_title("Country count")
axes[1].set_title("Payloads with multi-ownership")
axes[2].set_title("Payloads with multi-ownership")

df.hist("country_count",ax=axes[0])
df[ df["country_count"] > 1 ].hist("country_count",ax=axes[1])
df[ df["country_count"] > 2 ].hist("country_count",ax=axes[2])

Satellites with multiple owners:  171
Transferred satellites:  12

array([<matplotlib.axes._subplots.AxesSubplot object at 0x7f8b61437310>],
      dtype=object)

png

Edit: An earlier version didn’t take into account the different nations in this column.

Our goal is to get a sense of how many nations have reached space with a satellite, Many satellites are apparently owned by two nations, but after that. We’ll deal with this adding a “second_nation” column, leave the “nation” column for the first nation, and disregard the few cases with more than three owners. As for satellites that have been transfered, we’ll only consider the first nation at launch as the owner nation.

def clean_country(x):
    return x.strip();

def build_clean_map():
    words_to_remove = ["\d+","\(.*?\)","\(","\)","→.*"]

    clean_map = {word:"" for word in words_to_remove}
    return clean_map

clean_map = build_clean_map();

#joined["second_nation"] = joined["nation"].apply(lambda x: x.split(",")[1] if (x.count(",") > 0) else None)
#joined["first_nation"] = joined["nation"].apply(lambda x: x.split(",")[0]);

joined["second_nation"] = joined["nation"].apply(lambda x: x.strip()).replace(clean_map,regex=True).apply(lambda x: x.strip());
joined["first_nation"] = joined["second_nation"].apply(lambda x: x.split(",")[0])
joined["second_nation"] = joined["second_nation"].apply(lambda x: x.split(",")[1] if (x.count(",") > 0) else None)

nation_payloads_year = joined.groupby("year")["first_nation","second_nation"].apply(lambda x: len(pd.unique(x.values.ravel()).tolist()))

#joined["first_nation"].concat(joined["second_nation"]).unique()

#clean_country("monaco")

/opt/conda/lib/python3.7/site-packages/ipykernel_launcher.py:19: FutureWarning: Indexing with multiple keys (implicitly converted to a tuple of keys) will be deprecated, use a list instead.

launches_per_month = unique_launches.groupby( ["year","month"] ).agg({"country":"count"}).reset_index()

countries_per_month = unique_launches.groupby( ["year","month"] ).agg({"country":"nunique"}).reset_index()
countries_per_month.head()
country_month_pivot_table = countries_per_month.pivot("month","year","country").fillna(0)

launches_per_month = unique_launches.groupby( ["year","month"] ).agg({"country":"count"}).reset_index()
launches_month_pivot_table = countries_per_month.pivot("month","year","country").fillna(0)

#We transform both columns, first nation and second nation into a single list, and count the unique values with len. pd.unique(x.values.ravel()) 
nation_payloads_year = joined.groupby("year")["first_nation","second_nation"].apply(lambda x: len(pd.unique(x.values.ravel()).tolist())).reset_index().rename({0:"nation"},axis=1)
#nation_payloads_year = joined.groupby(["year"],as_index=False).agg({'first_nation': 'nunique'})

fig,(ax,ax2) = plt.subplots(2,1,figsize=(16,8))
sns.heatmap(country_month_pivot_table,ax=ax, square=True,yticklabels=False)
sns.lineplot(x="year",y="nation",data=nation_payloads_year[ nation_payloads_year["year"]<2020 ],ax=ax2,legend="full")

ax.set_xticklabels([year if year%5==0 else "" for year in countries_year["year"]])
ax.set_title("Number of different countries launching to space each month",fontsize=20)
ax.set_yticklabels([0, 1,2,3,4,5,6,7,8])
ax.yaxis.set_ticks([ 0, 1,2,3,4,5,6,7,8])

ax2.set_title("Different nations putting payloads in orbit each year",fontsize=20)

/opt/conda/lib/python3.7/site-packages/ipykernel_launcher.py:11: FutureWarning: Indexing with multiple keys (implicitly converted to a tuple of keys) will be deprecated, use a list instead.
  # This is added back by InteractiveShellApp.init_path()

Text(0.5, 1.0, 'Different nations putting payloads in orbit each year')

png

Let’s see how many different countries have launched from their territory, and how often:

country_launches = unique_launches.groupby("country").agg({"year":["min", "max"],"type":"count"}).reset_index()
country_launches.columns = ['-'.join(col) for col in country_launches.columns.values]
country_launches = country_launches.rename({"type-count":"launches","country-":"country","year-min":"first_launch_year","year-max":"last_launch_year"},axis=1)
country_launches["yearly_rate_since_first"] = country_launches["launches"] / (2020-country_launches["first_launch_year"])
country_launches.sort_values("first_launch_year",ascending=False)

	country	first_launch_year	last_launch_year	launches	yearly_rate_since_first
13	New Zealand	2017	2020	12	4.000000
16	South Korea	2009	2013	3	0.272727
7	Iran	2008	2020	13	1.083333
12	Marshall Islands	2000	2012	9	0.450000
6	International	1999	2014	36	1.714286
14	North Korea	1998	2016	5	0.227273
2	Brazil	1997	2003	3	0.130435
17	Spain	1997	1997	1	0.043478
11	Kazakhstan	1992	2020	539	19.250000
15	Russia	1992	2020	277	9.892857
8	Israel	1988	2016	10	0.312500
5	India	1979	2019	74	1.804878
4	France	1970	2020	299	5.980000
3	China	1970	2020	361	7.220000
1	Australia	1967	1971	6	0.113208
9	Italy	1967	1988	9	0.169811
10	Japan	1966	2020	122	2.259259
0	Algeria	1965	1967	4	0.072727
18	USA	1957	2020	1720	27.301587
19	USSR / Russia	1957	1991	2453	38.936508

Most countries don’t seem to stick to their programs, or develop their acquired capabilities, since many countries in this list only have a handful of launches . Algeria stopped being a French colony in 1962, but maybe the launches in ‘65 were from a French site? That requires some research. What about the last time a country launched to space?

country_launches.sort_values("last_launch_year",ascending=False)

	country	first_launch_year	last_launch_year	launches	yearly_rate_since_first
10	Japan	1966	2020	122	2.259259
3	China	1970	2020	361	7.220000
4	France	1970	2020	299	5.980000
7	Iran	2008	2020	13	1.083333
18	USA	1957	2020	1720	27.301587
11	Kazakhstan	1992	2020	539	19.250000
13	New Zealand	2017	2020	12	4.000000
15	Russia	1992	2020	277	9.892857
5	India	1979	2019	74	1.804878
8	Israel	1988	2016	10	0.312500
14	North Korea	1998	2016	5	0.227273
6	International	1999	2014	36	1.714286
16	South Korea	2009	2013	3	0.272727
12	Marshall Islands	2000	2012	9	0.450000
2	Brazil	1997	2003	3	0.130435
17	Spain	1997	1997	1	0.043478
19	USSR / Russia	1957	1991	2453	38.936508
9	Italy	1967	1988	9	0.169811
1	Australia	1967	1971	6	0.113208
0	Algeria	1965	1967	4	0.072727

#Pace of launches for selected countries
#
#country_and_month = unique_launches.groupby( ["country","year","month"] ).size().reset_index()
#
#countries = []
#
#country = country_and_month[ country_and_month["country"]=="China" ].pivot("year","month",0).fillna(0)
#
#sns.heatmap(country,cmap="coolwarm")



argentina = joined[ joined["nation"]=="argentina" ].groupby(["year","month"] ,as_index=False).agg({'nation': 'count'})

argentina.pivot("year","month","nation").fillna(0)

month	1	2	4	5	6	8	9	10	11	12
year
1990	1.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0
1996	0.0	0.0	0.0	0.0	0.0	1.0	0.0	0.0	1.0	0.0
1997	1.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0
1998	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	1.0
2000	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	1.0	0.0
2007	1.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0
2013	0.0	0.0	1.0	0.0	0.0	0.0	0.0	0.0	1.0	0.0
2014	0.0	0.0	0.0	0.0	1.0	0.0	0.0	1.0	0.0	0.0
2015	0.0	0.0	0.0	0.0	0.0	0.0	1.0	0.0	0.0	0.0
2016	0.0	0.0	0.0	2.0	0.0	0.0	0.0	0.0	0.0	0.0
2017	0.0	0.0	0.0	0.0	1.0	0.0	0.0	0.0	0.0	0.0
2018	0.0	2.0	0.0	0.0	0.0	0.0	0.0	1.0	0.0	0.0
2020	2.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0

Time to look at the data but grouping by country: let’s see what countries have launched to space in the different decades. We’ll draw several heatmaps, one for each quarter century, and see how the information has changed over time.

decade_tables = []
years = [1950,1975,2000]
fig, axes = plt.subplots(len(years),1,figsize=(20,20))

for i,year in enumerate(years):
    decade_group = unique_launches[ (unique_launches["year"]>=year) & (unique_launches["year"]<year+25) ].groupby(["country","year"])
    decade_pivot_table = decade_group.agg({"month":"count"}).reset_index().pivot("country","year","month")
    decade_tables.append( decade_pivot_table.fillna(0 ) )
    sns.heatmap(decade_pivot_table.fillna(0),ax=axes[i], cmap="coolwarm")                                                                     

png

We can see the trend we saw earlier reflected here as well: before the collapse of the Soviet Union, a great number of launches that died down. The number of launches from the US stays steady, and China really picks up the pace late this past decade. If the countries are grouped more conveniently, it’s possible to plot this in a line chart to see the countries with the most launches more clearly.

countries = unique_launches.groupby( ["year","country"] ).size().reset_index().rename({0:"launches"},axis=1)
top_countries = countries.groupby("country").mean().sort_values("launches", ascending=False).head(8).index

countries["is_top"] =  countries["country"].isin(top_countries)

other_countries = countries[ countries["is_top"]==False ].groupby("year").sum().reset_index()[ ["year","launches"] ]
other_countries["country"] = "Other"

all_countries = pd.concat( [ countries[ countries["country"].isin(top_countries) ][ ["year","launches","country"] ],other_countries] )

fig, axes = plt.subplots(2,1,figsize=(20,20))

axes[0].set_title("Number of launches by country",fontsize=30)
axes[1].set_title("Number of launches by country, 2010 onwards",fontsize=30)
sns.lineplot(x="year",y="launches",hue="country",data=all_countries,ax=axes[0], lw=3)
sns.lineplot(x="year",y="launches",hue="country",data=all_countries[ (all_countries["year"]>2010) & (all_countries["year"]<2020)] ,ax=axes[1], lw=3)

<matplotlib.axes._subplots.AxesSubplot at 0x7f8b60380a90>

png

Launch vehicles, number of satellites per launch

One last look at our data: what about the number of satellites being put to space? Let’s try visualizing how many satellites get to space each year. We’ll plot the number of launches, the year, and the total number of satellites each year, and also visualize the number of different space vehicles employed each year. That will give us some insight about how the private sector is beginning to form the space industry

fig, (ax) = plt.subplots(1,1,figsize=(20,6))
#sns.scatterplot(x="year",y="country", size="country", hue="nation", data=countries_year)
ax.set_title("Number of launches per year, with total number of payloads",fontsize=30)
sns.scatterplot(x="year",y="launches", size="payload_total", color="green", palette="coolwarm", data=countries_year,ax=ax, sizes=(1,600), marker="o")
fig.show()

png

A perhaps better way to visualize this is to plot the payload divided by the number of launches.

#Let's start plotting the number of launches we had each year, and how many different countries launched into space.
#We'll also plot how many different vehicles were used each year.
fig, (ax,ax2) = plt.subplots(2,1,figsize=(20,6))
#sns.scatterplot(x="year",y="country", size="country", hue="nation", data=countries_year)
ax.set_title("Number of satellites/payload put in orbit per year")
successful_countries_year["payload_per_launch_avg"] = successful_countries_year["payload_total"] / successful_countries_year["launches"]

chart = sns.barplot(x="year",y="payload_total", data=successful_countries_year,ax=ax,color="#8080DA")
ax.set_xticklabels([year if year%5==0 else "" for year in countries_year["year"]])

ax2.set_title("Average number of satellites/payloads per launch")
chart = sns.barplot(x="year",y="payload_per_launch_avg", data=successful_countries_year,ax=ax2,color="#8080DA")
ax2.set_xticklabels([year if year%5==0 else "" for year in countries_year["year"]])

fig.show()

png

There was some increase before the introduction of Starlink by SpaceX, but so far in 2020, the difference has been staggering.

What we’ve learned, and what’s next

After putting our data in order, we extracted a few interesting insights:

We learned that peak of launches happened in the Cold War, driven mainly by the then USSR.
Launches have increased in pace this past decade, with China specially active.
Not many other countries have acquired launch capabilities. Only Iran, South Korea, and New Zealand, since the year 2000. (The Marshall islands launch site belongs to the United States).
Many countries are unable to sustain the pace of launches once they get to orbit. However, more research and domain knowledge is needed there: after further research, the 1997 launch from Spain consisted in launching a rocket from an aircraft, that took off from Gran Canaria. There might be some other cases similar to that one.
However, the number of different countries placing payloads in orbit increased markedly, specially in these past 15 years. The last few years saw 50 different countries consistently putting at least a satellite in orbit.

What would we like to find out next? There are many interesting questions that can answer with this data, if we properly handle the columns that were disregarded. Furthermore, after our initial dive, some other questions arise:

What’s the use of the satellites currently in space? (military, techonology)
What’s the average mass of the payloads in orbit? Satellites supposedly got smaller these last few years.
What about private companies? Can we tell when they started launching into space?
What’s the most reliable launch vehicle?
It would be nice to visualize some of this information in a map.
Do we have any information about humans in space in our data?
What about the launch vehicles employed? What are the different methods of launching payloads to space?