Improving ReforesTree: Correcting Drone Imagery and Satellite-Based AGB Estimation
By Autumn Nguyen, Sulagna Saha, David Dao, Gyri Reiersen, and Björn Lütjens
Introduction
ReforesTree is a valuable new dataset for machine learning practitioners and data scientists who want to apply their skills to help the environment, especially through forestry applications. The dataset comprises field data for over 4,600 individual trees in six agroforestry sites, together with high-resolution RGB drone images of these sites. The information about each tree in the field data is matched with the corresponding point in the drone image. Using these data, we can train models to estimate the aboveground biomass (AGB), and hence the carbon stock, of a forest area from aerial imagery, and we can also benchmark existing AGB estimation models. There have been studies in this area, but mainly for temperate forests. For tropical agroforestry areas, ReforesTree is the first publicly available dataset with both aerial imagery and ground truth field measurements.
There are shortcomings in the original pipeline and dataset of ReforesTree, however. One of the core problems, discovered by Barenne et al., was that the areas captured by the drone images were bigger than the areas on which the ground truth field measurements were taken. Because the ReforesTree authors benchmarked the AGB estimates of satellite maps against their drone images, the conclusion in the original ReforesTree paper that satellite-based maps overestimated forest AGB could not be confirmed. To know whether those maps overestimated AGB or not, we would have to crop the drone images so that they capture exactly the field-measured area, and then recalculate the AGB estimates for comparison. Barenne et al. did not do this, so we set out to do it.
The learning curve for us when working on this project was very steep. There were few tutorials, and most of them were incomplete; the available open-source code had few helpful comments but many errors; and no one at our college had domain knowledge about this topic to help us. We wrote this report hoping to make it less difficult for future students and researchers to get started with ReforesTree in particular, and with using computer science for forestry in general. We chose not to go into depth on the dataset structure and pipeline in this blog post. If you are interested, you can read the original paper!
Understanding the Problems

In the original paper, the whole area of the drone images was assumed to match the area covered by the field measurements. As we can see in the pictures, that is far from the truth. There may also be errors in the GPS locations and in the number of trees from the field measurements, as described in Barenne et al. Our aim is to clarify, fix, and work around these mistakes, so that the ReforesTree dataset can still help the machine learning community learn and get accurate results.
Tackling Problem 1: Area captured in drone images is bigger than the field-measured area
The main ReforesTree pipeline relies on the incorrect assumption that the field data boundaries and the drone image boundaries are exactly the same. To fix that, we corrected the drone imagery for the six agroforestry sites. Here are the steps we took:
i) Get the field data in a GeoDataFrame
First, we created GeoDataFrames (from the geopandas library) using the longitude and latitude of each field data point. This turns the numeric data into point geometries that are easy to work with and to visualize. The field data points are the red dots in Figure 3.
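To make the idea of "point geometries from numeric data" concrete, here is a minimal, dependency-free sketch. It is not our actual code: it uses only the standard library and builds GeoJSON-style point features instead of a GeoDataFrame, and the column names and coordinates are made up for illustration. With geopandas, the same step is usually done by passing the output of `geopandas.points_from_xy` as the geometry of a GeoDataFrame.

```python
import csv
import io

# Hypothetical field-data excerpt; column names and values are assumptions.
FIELD_CSV = """tree_id,lon,lat
1,-79.50010,-1.10020
2,-79.50005,-1.10015
3,-79.49998,-1.10030
"""


def rows_to_point_features(csv_text):
    """Turn each row with numeric lon/lat into a GeoJSON-style point feature."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lon, lat = float(row["lon"]), float(row["lat"])
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as (x, y) = (longitude, latitude).
            "geometry": {"type": "Point", "coordinates": (lon, lat)},
            "properties": {"tree_id": int(row["tree_id"])},
        })
    return features


features = rows_to_point_features(FIELD_CSV)
```

Note the (longitude, latitude) order: swapping the two is an easy mistake that places every tree in the wrong part of the world.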
ii) Get the boundary of the collected field data points using alphashape
We used the alphashape library [1] to find a hull that encloses the set of field data points. An alpha shape generalizes the convex hull [2]: the bigger the alpha value, the more border points the hull fits around, resulting in tighter and more complex (concave) hulls. We chose an alpha value of 15000, a reasonable value that was also used by Barenne et al. (2021).
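As a point of reference for what alpha controls, the sketch below computes the plain convex hull, which, as far as we understand the alphashape library, is the limiting case alpha = 0. It is a standard-library implementation of Andrew's monotone chain algorithm, not the alphashape code, and the tree positions are hypothetical.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Drop points that would make a clockwise (or straight) turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


# Hypothetical tree positions: four corners plus one tree set back from the edge.
trees = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 1)]
hull = convex_hull(trees)
```

The tree at (2, 1) is not part of the convex hull, so any empty ground between it and the bottom edge stays inside the boundary. A larger alpha lets the boundary bend inward toward such points, which is why it traces the measured area more tightly.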
iii) Overlay the alpha shape on the tif image and crop it
We plotted the field data points in the coordinate system of the tif images. We then used a mask from the rasterio library to overlay the alpha shape on the tif images, and cropped out the unnecessary parts outside the boundary by replacing them with white pixels. The transition between the third and fourth pictures in Figure 4 visualizes this step.
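The masking idea can be sketched without any geospatial library. The example below is an illustration rather than the rasterio call we used: it assumes the boundary is already expressed in pixel coordinates (column, row), tests each pixel centre with a ray-casting point-in-polygon check, and paints everything outside the boundary white. The image and boundary are made up.

```python
WHITE = (255, 255, 255)
GREEN = (30, 120, 40)


def point_in_polygon(x, y, polygon):
    """Ray casting: count how many edges a ray going right from (x, y) crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def mask_outside(image, polygon):
    """Copy the image, setting pixels whose centre lies outside polygon to white."""
    return [
        [px if point_in_polygon(c + 0.5, r + 0.5, polygon) else WHITE
         for c, px in enumerate(row)]
        for r, row in enumerate(image)
    ]


# Hypothetical 6x6 drone image and a boundary in pixel coordinates (col, row).
image = [[GREEN] * 6 for _ in range(6)]
boundary = [(1, 1), (5, 1), (5, 4), (1, 4)]
masked = mask_outside(image, boundary)
```

In a real georeferenced image the boundary is given in map coordinates, so it first has to be converted to pixel coordinates through the raster's transform. A library such as rasterio handles that conversion for you.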
iv) Fix the white pixels
After cropping the tif images, we found the bounds of the non-white pixels and made sure each image fit correctly within a square shape, so that it could be used with the AGBench library later. The transition between the fourth and last pictures in Figure 4 visualizes this step.
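Here is a small sketch of this clean-up under simplifying assumptions: the image is a nested list of RGB tuples, the bounding box of non-white pixels is found by scanning rows and columns, and the result is padded with white on the right and bottom to make it square. The padding strategy is our illustrative choice here, not necessarily what AGBench expects.

```python
WHITE = (255, 255, 255)
GREEN = (30, 120, 40)


def non_white_bounds(image):
    """Return (top, bottom, left, right) of the non-white area, ends exclusive."""
    rows = [r for r, row in enumerate(image) if any(px != WHITE for px in row)]
    cols = [c for c in range(len(image[0]))
            if any(row[c] != WHITE for row in image)]
    if not rows:
        raise ValueError("image contains only white pixels")
    return rows[0], rows[-1] + 1, cols[0], cols[-1] + 1


def crop_to_square(image):
    """Crop to the non-white bounding box, then pad with white to a square."""
    top, bottom, left, right = non_white_bounds(image)
    cropped = [row[left:right] for row in image[top:bottom]]
    height, width = bottom - top, right - left
    side = max(height, width)
    # Pad missing columns on the right, then missing rows at the bottom.
    padded = [row + [WHITE] * (side - width) for row in cropped]
    padded += [[WHITE] * side for _ in range(side - height)]
    return padded


# Hypothetical masked image: a 2x4 patch of forest on a white 5x6 canvas.
image = [[WHITE] * 6 for _ in range(5)]
for r in range(1, 3):
    for c in range(2, 6):
        image[r][c] = GREEN

square = crop_to_square(image)
```

One caveat of using white as the fill value is that genuinely white pixels inside the site are indistinguishable from masked ones when they sit on the edge of the bounding box.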
Tackling Problem 2: Getting inaccurate AGB estimation from satellite-based maps
AGBench is a Python library that benchmarks satellite-based AGB maps by filtering and overlapping them with ReforesTree's drone imagery and comparing them with the AGB estimates from ReforesTree's field data. We followed the AGBench tutorial [3], making changes to solve issues we encountered along the way, to benchmark the satellite-based maps again using the correctly cropped drone images. The details are explained below:
First, we obtained the satellite-based maps of the 6 sites for which we had field measurements. These were raster files containing AGB information, such as AGB density, for each pixel.
Then, we retrieved the drone images that we had correctly cropped, as explained under Problem 1 above.
We overlaid the corresponding drone image onto the satellite map, and cropped out the same portion of the satellite map.
To be able to crop out the same portion, we needed to interpolate pixel values, because the drone images had much finer resolution than satellite maps.
Before interpolating, we cropped out just a small square area of the satellite map that contained our site area, so that we wouldn’t be doing interpolation on the whole satellite-based map, which would be too computationally expensive.
We obtained the AGB density of the pixels in those portions, and calculated the average for the whole site.
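The steps above can be sketched in plain Python on a toy grid. This is an illustration under loud assumptions, not the AGBench implementation: the AGB values are invented, the resolution ratio is a tiny integer factor, and nearest-neighbour repetition stands in for whatever interpolation the library performs.

```python
def crop_window(grid, row0, row1, col0, col1):
    """Cut a small rectangular window out of the coarse satellite map."""
    return [row[col0:col1] for row in grid[row0:row1]]


def upsample_nearest(grid, factor):
    """Repeat each coarse pixel factor x factor times (nearest-neighbour)."""
    return [
        [value for value in row for _ in range(factor)]
        for row in grid
        for _ in range(factor)
    ]


def masked_mean(values, mask):
    """Average the values whose mask entry is True."""
    selected = [v for vrow, mrow in zip(values, mask)
                for v, m in zip(vrow, mrow) if m]
    return sum(selected) / len(selected)


# Hypothetical satellite AGB density map in Mg/ha (4x4 coarse pixels).
satellite = [
    [10.0, 20.0, 30.0, 40.0],
    [50.0, 60.0, 70.0, 80.0],
    [90.0, 100.0, 110.0, 120.0],
    [130.0, 140.0, 150.0, 160.0],
]

# 1) Crop a small window around the site before doing anything expensive.
window = crop_window(satellite, 1, 3, 1, 3)

# 2) Bring the window to the (finer) resolution of the drone image.
fine = upsample_nearest(window, 2)

# 3) Average over the pixels covered by the cropped drone image.
#    True marks pixels inside the hypothetical site footprint.
site_mask = [
    [True, True, True, False],
    [True, True, True, False],
    [True, True, False, False],
    [False, False, False, False],
]
site_density = masked_mean(fine, site_mask)
```

Cropping first matters because upsampling multiplies the pixel count by the square of the resolution ratio, and drone imagery is orders of magnitude finer than a 30 m satellite grid.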
Results
Cropped Drone Images
Instead of cropping the drone images based on the field boundaries and then cropping the satellite-based AGB maps based on the drone images, we could have cropped the satellite AGB maps directly based on the field boundaries. However, we chose the first approach because the correctly cropped drone images are valuable in their own right. For example, they can be used to train models to estimate AGB, detect tree species, or count tree crowns, which may not be possible with satellite images because of their much lower resolution.
Corrected Satellite-based AGB

For all sites except site 4, GFW still overestimated AGB density. Compared to the previous overestimation factors, the new factors increased for sites 1, 2, and 3, and decreased for sites 4, 5, and 6. Interestingly, after cropping, the GFW estimate for site 4 exactly matched the ground truth field AGB value. Overall, the satellite-based map still overestimated AGB compared to the field measurements.
Ambiguities
Below are a few ambiguities in the original work, as well as the assumptions we had to make in order to proceed with our analysis.
Ambiguity 1:
The unit used for AGB was inconsistent and not clearly explained across the tables in the original paper and code: for total AGB, sometimes it was tons, sometimes kg; for AGB density, sometimes it was ton/ha, sometimes Mg/ha, sometimes kg/pix. We assumed that total AGB should always be in metric tons, and AGB density should always be in Mg/ha.
Solution: We determined that the AGB density unit should be Mg/ha and changed the caption accordingly.
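Since the units are easy to mix up, here is a short sketch of the conversions involved. The conversion factors are standard (1 Mg = 1 metric ton = 1000 kg, 1 ha = 10,000 m²); the function names and the example values are ours and purely illustrative.

```python
KG_PER_MG = 1000.0    # 1 Mg (megagram) = 1 metric ton = 1000 kg
M2_PER_HA = 10_000.0  # 1 hectare = 10,000 square metres


def kg_per_pixel_to_mg_per_ha(kg_per_pixel, pixel_size_m):
    """Convert biomass per square pixel (kg) to an AGB density in Mg/ha."""
    pixel_area_ha = pixel_size_m ** 2 / M2_PER_HA
    return (kg_per_pixel / KG_PER_MG) / pixel_area_ha


def total_agb_tons(density_mg_per_ha, area_ha):
    """Total AGB in metric tons for a site of the given area."""
    return density_mg_per_ha * area_ha


# Hypothetical example: 900 kg of biomass on one 30 m x 30 m pixel.
density = kg_per_pixel_to_mg_per_ha(900.0, 30.0)  # about 10 Mg/ha
total = total_agb_tons(density, 0.5)              # about 5 tons on 0.5 ha
```

Because Mg/ha and ton/ha are the same unit, only the kg-based values need an actual conversion.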
Ambiguity 2:
The year of the GFW dataset used was not clear. In the published paper, the authors wrote “The Global Forest Watch’s Above-Ground Woody Biomass dataset is a global map of AGB and carbon density at 30m×30m resolution for the year 2000” (Reiersen et al., 2022, p. 12123). However, the caption of their Table 2 says GFW 2019. If they were indeed comparing the GFW estimate for the year 2000 with field data collected nearly 20 years later, that time gap might be one reason for the discrepancy in AGB estimation.
Solution: We noted that we are comparing the GFW 2000 estimate (published in 2022) with field data collected in 2019-2020.
Ambiguity 3:
There were inconsistencies regarding site areas across the tables in the paper.

Solution: Because the site areas given in the paper and in the codebase disagree, we cannot be confident that the published total AGB values are correct. For example, the site with 846 trees is site 6 with an area of 0.48 ha in the paper, but site 2 with an area of 0.53 ha in the code. This leads to inconsistencies in the total AGB calculation in Table 2 of the paper. We matched the sites across the two sources by their tree counts and, following the authors, took the AGBench numbering as the correct one.
Conclusion
After correcting the pipeline and re-benchmarking using the ReforesTree dataset, our findings confirm that the main hypothesis of the original paper still holds: satellite-based AGB maps do indeed overestimate aboveground biomass in tropical agroforestry areas. By aligning drone imagery precisely with field-measured boundaries, we were able to validate the initial observation that satellite maps generally provide higher AGB estimates than what is observed on the ground.
Our corrections and results will contribute to an updated version of the original ReforesTree paper, incorporating these refined insights and improving the dataset’s utility for future research. This updated version will offer more reliable benchmarking for environmental researchers, strengthening the dataset's value in carbon stock estimation and forest conservation efforts.
[1] Inspired by https://github.com/TimEngelmann/ai4est/blob/main/exploration/data_ground_truth.ipynb
[2] A convex hull can be thought of as a rubber band stretched around the outermost points of a set.
[3] https://github.com/gyrrei/AGBench/blob/master/test/AGBench_tutorial.ipynb