Satellite Imagery Analysis in Python Part II: GOES-16 Land Surface Temperature (LST) Manipulation
This is the second entry in the satellite imagery analysis in Python series. For part I, click here.
For part II, the focus shifts from the introduction of file formats and libraries to the geospatial analysis of satellite images. Python will again be used, along with many of its libraries. Land surface temperature (LST) will again serve as the data source, along with shapefiles for setting geometric boundaries and information about buildings and land cover produced by local governments, all of which are used in meteorological and weather research and analyses.
A shapefile is a file format used to identify geographic boundaries and attributes of given geometries on earth [read more here]. Shapefiles are often used to quantify statistically significant information associated with rural, suburban, and urban areas. As an introduction to statistical analysis of satellite imagery, I will use a publicly-available shapefile of New York City (NYC). The shapefile can be downloaded at the following link:
https://data.cityofnewyork.us/api/geospatial/tqmj-j8zm?method=export&format=Shapefile
After downloading the file, unzip it and place it in the same folder as the Python script. The NYC shapefile can be plotted using Basemap methods similar to those introduced in the first entry in this satellite series. A code example is shown below, along with the sample output plot:
We will use this shapefile to quantify variation of land surface temperature over each borough of New York City. The shapefile contains information about each borough, which we can analyze using the following method:
If we print out the first polygon in the shapefile it gives the following:
{'boro_code': 2.0, 'boro_name': 'Bronx', 'shape_area': 1186612476.97, 'shape_leng': 462958.186921, 'RINGNUM': 1, 'SHAPENUM': 1}
Here we can see the borough code and borough name, along with the area and length of the shape. We can use the borough codes to color each borough of the city:
Since we are only interested in a small region for analysis (NYC), we can use the shapefile of the city to clip the boundaries of the data and minimize the amount of processing needed. The code below is a continuation of the code outlined above, with the addition of corner calculations that find the indices of the lower-left, upper-left, lower-right, and upper-right corners of the LST file. This will allow us to clip the full 1500x2500 data file down to a much smaller size for more efficient processing.
The section that clips the data finds the regions in the GOES-16 latitude/longitude points that most closely align with the shapefile bounds. The resulting satellite image is plotted below atop the NYC shapefile:
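The clipping idea can be sketched in a few lines. This is a simplified illustration, not the original script: instead of locating the four corners separately, it finds the smallest row/column window that contains every grid point inside a bounding box. It assumes the latitude and longitude grids are 2-D nested lists (in practice they would typically be NumPy arrays read from the netCDF file), and the names `clip_indices` and `clip`, the toy grid, and the bounding box are all hypothetical.

```python
def clip_indices(lats, lons, bbox):
    """Return (row_min, row_max, col_min, col_max) of the smallest index
    window containing every grid point inside bbox.

    lats, lons: 2-D nested lists of the same shape (hypothetical inputs).
    bbox: (lon_min, lat_min, lon_max, lat_max), e.g. the shapefile bounds.
    """
    lon_min, lat_min, lon_max, lat_max = bbox
    rows, cols = [], []
    for i, (lat_row, lon_row) in enumerate(zip(lats, lons)):
        for j, (lat, lon) in enumerate(zip(lat_row, lon_row)):
            if lat_min <= lat <= lat_max and lon_min <= lon <= lon_max:
                rows.append(i)
                cols.append(j)
    if not rows:
        raise ValueError("no grid points fall inside the bounding box")
    return min(rows), max(rows), min(cols), max(cols)


def clip(grid, window):
    """Slice a 2-D nested list down to the index window (inclusive)."""
    r0, r1, c0, c1 = window
    return [row[c0:c1 + 1] for row in grid[r0:r1 + 1]]


# Toy 4x5 grid: latitude decreases with row, longitude increases with column
lats = [[41.0 - 0.2 * i for _ in range(5)] for i in range(4)]
lons = [[-74.4 + 0.2 * j for j in range(5)] for _ in range(4)]
lst = [[280.0 + i + j for j in range(5)] for i in range(4)]

# Made-up bounding box roughly around NYC: (lon_min, lat_min, lon_max, lat_max)
window = clip_indices(lats, lons, (-74.3, 40.5, -73.7, 40.9))
lst_clipped = clip(lst, window)
```

The same window can be applied to the latitude and longitude grids so that all three arrays stay aligned after clipping.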
In the figure above, we can see that around 1600 UTC (11 a.m. New York time), there is quite a bit of warming. Upon checking the historical weather for that day, the air temperature around that time (not the same as LST) was roughly 280 K. LST is typically warmer than air temperature during the daytime due to solar heating of the surface, so the LST range of 280 K to 290 K seems reasonable. In the next section, we will explore the average temperature of each borough by grouping the GOES-16 satellite LST pixels into their respective boroughs and calculating a few LST statistics for each borough.
A useful calculation can be carried out using the borough boundaries to approximate the mean land surface temperature (LST) for each borough of New York City. One method for determining whether a data point is within a shapefile geometry is to loop through each point and use the Geospatial Data Abstraction Library (GDAL) spatial filter. This filter method is shown below:
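The GDAL spatial filter needs the GDAL/OGR bindings and the shapefile on disk. To illustrate the underlying question (is this point inside this boundary?), the sketch below swaps in a plain-Python ray-casting test. It is not GDAL's implementation, and the function name `point_in_polygon` and the square test polygon are hypothetical.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside the polygon.

    polygon: list of (lon, lat) vertices; the ring is closed implicitly.
    Points exactly on an edge may be classified either way.
    """
    inside = False
    n = len(polygon)
    for k in range(n):
        x1, y1 = polygon[k]
        x2, y2 = polygon[(k + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that horizontal line
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Made-up square "borough" for illustration only
square = [(-74.0, 40.6), (-73.8, 40.6), (-73.8, 40.8), (-74.0, 40.8)]
print(point_in_polygon(-73.9, 40.7, square))  # point inside the square
print(point_in_polygon(-73.7, 40.7, square))  # point east of the square
```

A test like this treats longitude and latitude as planar coordinates, which is a reasonable approximation over an area as small as a city.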
This generalized code can be added to the code in the previous section. It loops through the clipped latitude, longitude, and data file to compile statistics for each borough. Of course, this can be done for any geometry that has defined boundaries. In our case, we use the five boroughs of NYC to quantify the average LST for each borough. We can visualize the mean LST values for each borough by using a colormap that colors in each borough with a specific color that mirrors a given temperature. A plot of this is shown below:
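The grouping step can be sketched as follows. This is only an illustration under simplifying assumptions: each clipped pixel is a `(lon, lat, value)` tuple, and each region is an axis-aligned box rather than a real borough polygon, so `in_box` stands in for a proper geometry test such as the GDAL spatial filter. All names, regions, and values here are made up.

```python
from statistics import mean


def in_box(lon, lat, box):
    """Stand-in membership test: box is (lon_min, lat_min, lon_max, lat_max)."""
    lon_min, lat_min, lon_max, lat_max = box
    return lon_min <= lon <= lon_max and lat_min <= lat <= lat_max


def region_means(pixels, regions):
    """Group (lon, lat, value) pixels by region and average each group.

    Pixels outside every region are ignored; a pixel is assigned to the
    first region that contains it.
    """
    grouped = {name: [] for name in regions}
    for lon, lat, value in pixels:
        for name, box in regions.items():
            if in_box(lon, lat, box):
                grouped[name].append(value)
                break
    # Regions with no pixels are reported as None rather than raising
    return {name: (mean(vals) if vals else None) for name, vals in grouped.items()}


# Made-up regions and pixel values (kelvin), for illustration only
regions = {
    "West": (-74.2, 40.5, -74.0, 40.9),
    "East": (-73.99, 40.5, -73.7, 40.9),
}
pixels = [
    (-74.1, 40.6, 284.0),
    (-74.05, 40.8, 286.0),
    (-73.9, 40.7, 288.0),
    (-73.5, 40.7, 290.0),  # outside both regions, ignored
]
means = region_means(pixels, regions)
```

Other statistics, such as the minimum, maximum, or standard deviation, can be computed from the same grouped lists.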
For the given day and hour, it appears that the boroughs of the Bronx and Queens were warmer than Staten Island, Manhattan, and Brooklyn. Staten Island is the coolest borough during this peak hour. The full code to replicate this is shown below:
I decided to investigate five days from the summer of 2018 (June 29 - July 3) during a heat wave (read about heat waves here). We can use the heat wave days to explore each borough’s daily LST profile. We do this by looping through each file, taking each borough’s averaged LST, and then stringing the multi-day profile into a plot. This type of calculation is lengthy and involved, so I have listed the general process flow below to clarify the methods used in the code:
Read the city shapefile and establish bounds
Loop through shapefile to count boroughs
Clip latitude/longitude vectors down to shapefile bounds
Loop through each file
For each file (hour), use borough boundaries to calculate the mean LST for each borough
After the loop, rearrange times to plot a mean 24-hour diurnal profile during the heat wave
The resulting code should produce a plot with five lines, one for each borough of NYC, representing the mean LST trend during the heat wave.
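The last step of the process flow, rearranging the hourly values into a mean 24-hour profile, can be sketched as below. It assumes each file has already been reduced to a `(UTC datetime, mean LST)` pair for one borough; the function name `diurnal_profile` and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def diurnal_profile(samples):
    """Average (timestamp, value) samples by hour of day.

    Returns a list of (hour, mean_value) sorted by hour; hours with no
    samples are omitted.
    """
    by_hour = defaultdict(list)
    for stamp, value in samples:
        by_hour[stamp.hour].append(value)
    return [(hour, mean(vals)) for hour, vals in sorted(by_hour.items())]


# Made-up hourly means for one borough over two days (UTC, kelvin)
samples = [
    (datetime(2018, 6, 29, 16), 310.0),
    (datetime(2018, 6, 29, 4), 296.0),
    (datetime(2018, 6, 30, 16), 312.0),
    (datetime(2018, 6, 30, 4), 298.0),
]
profile = diurnal_profile(samples)
```

Running this once per borough gives one profile per borough, each of which becomes a line on the plot.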
From the heat wave data taken from June 29th - July 3rd, we see the general daily trend in land surface temperature. If we translate the data from UTC to local time, we can conclude that peak heating occurs between 12 p.m. and 3 p.m. If we take the mean of each diurnal profile, we can quantitatively cite the mean daily temperature for each borough during the heat wave:
Manhattan : 302.2 K
Bronx : 302.2 K
Brooklyn : 301.7 K
Queens : 301.6 K
Staten Island : 301.6 K
We can conclude that the mean temperatures for each borough are about the same during the heat wave, though we can observe certain behaviors for each borough. Manhattan appears to hold on to heat more than the other boroughs, and Staten Island appears to be the coolest at any given time. I imagine the particularities of each borough would emerge if more data were added to the analysis.
Furthermore, if we focus on one borough, we can observe the temperature’s standard deviation for each hour during the heat wave. This will give us an idea of the spread of the temperature around its mean for each hour. The plot is shown below for Manhattan:
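The translation from UTC to local time and the daily mean can be sketched as follows. The sketch assumes a fixed offset of UTC-4 (Eastern Daylight Time, which applies to NYC in late June and early July) instead of a time-zone database lookup, and the profile values are made up rather than taken from the analysis above.

```python
from statistics import mean

EDT_OFFSET_HOURS = -4  # assumption: NYC on daylight time (UTC-4) in summer


def to_local_profile(utc_profile, offset_hours=EDT_OFFSET_HOURS):
    """Shift {utc_hour: value} to {local_hour: value}, wrapping at midnight."""
    return {(hour + offset_hours) % 24: value for hour, value in utc_profile.items()}


def peak_hour(profile):
    """Hour of day with the highest value."""
    return max(profile, key=profile.get)


# Made-up UTC profile (kelvin) with only a few hours, for illustration
utc_profile = {2: 299.0, 10: 295.0, 14: 303.0, 18: 311.0, 22: 304.0}
local_profile = to_local_profile(utc_profile)

print(peak_hour(local_profile))                # local hour of peak heating
print(round(mean(local_profile.values()), 1))  # mean over the available hours
```

Note that a mean taken over an incomplete set of hours is biased toward whichever part of the day is better sampled, so a full 24-hour profile is preferable when citing a daily mean.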
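The hourly spread can be computed along these lines. This is a minimal sketch that assumes the borough's LST values have already been grouped by local hour, with one value per day; the name `hourly_spread` and the numbers are hypothetical, and the sample standard deviation is an assumed choice (the population form would use `statistics.pstdev`).

```python
from statistics import mean, stdev


def hourly_spread(values_by_hour):
    """Return {hour: (mean, sample standard deviation)}.

    Hours with fewer than two samples get a standard deviation of None,
    since the sample standard deviation is undefined for them.
    """
    out = {}
    for hour, vals in sorted(values_by_hour.items()):
        sd = stdev(vals) if len(vals) > 1 else None
        out[hour] = (mean(vals), sd)
    return out


# Made-up LST values (kelvin): one per day at each local hour
values_by_hour = {
    6: [295.0, 297.0, 296.0],
    14: [309.0, 313.0, 311.0],
    23: [300.0],
}
spread = hourly_spread(values_by_hour)
```

The mean and standard deviation for each hour can then be drawn as a line with error bars or a shaded band.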
This concludes the second entry into the Python satellite imagery analysis tutorial series. In the first entry, I explored netCDF files and Python’s Basemap toolkit for reading and plotting geographic data and satellite imagery. In this tutorial, I introduced shapefiles as tools for clipping satellite data to city boundaries. I also demonstrated some powerful Python techniques for calculating mean parameters of particular geographic regions, using shapefiles and LST data. We were able to investigate the mean LST during a heat wave for New York City, while also visualizing the standard deviation of the diurnal LST profile. In the upcoming entry, I will use the National Land Cover Database (NLCD) and digital elevation maps to understand the distribution of LST under varying surface parameters.