"Wishlist" Datasets

Comprehensive Data on Wind Farm Production By Site

Our dream dataset: open data on the actual realized performance of all wind farms in the US, and/or any data the DOE holds on actual measured wind resources at different sites, in any format the DOE chooses.


We believe this data could be captured from Treasury/DOE records on production tax credit reporting, which has been required of all US wind farms over the past few years.


With this data, the wind industry in general, and Cardinal Wind (the MIT/Harvard spin-out I co-founded) in particular, could drive faster, more accurate prospecting for wind farms, as well as lower-risk, lower-cost wind project development across the US.


To identify and evaluate sites for wind farms, the wind industry currently relies on long-term reference data from patchy public datasets, and site-specific data captured by installing towers or mobile LIDAR sensor units at a particular site.


The public datasets, such as ASOS, provide useful but limited estimates of how wind speeds and resources might vary over time. These datasets are limited in the geographies and time spans they cover, and by measurement error caused by faulty or aging equipment.


Because of the limitations of public datasets, at least 12 months of site-specific data from towers and sensor units is needed to gain enough confidence in a site to develop and invest in it. Collecting this information is costly, both because it can cost up to $10,000 per month for the land and equipment needed, and because it can delay the development process by months or years.


After acquiring the wind rights to a site, a prospector will set up a met tower and collect data on wind speed and direction for twelve to twenty-four months. The prospector then correlates this dataset to a nearby weather station using simple regressions. This is a predictive process, which generates a back-cast against which forward-looking predictions can be made.
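
As a rough illustration of the correlation step described above, here is a minimal Python sketch. It assumes a plain least-squares line between concurrent station and met tower readings, which is one reading of "simple regressions" and not necessarily what any given prospector does. The function names (fit_line, backcast) and all of the wind speed numbers are made up for the example.

```python
from statistics import mean


def fit_line(reference, site):
    """Least-squares fit of site ≈ slope * reference + intercept."""
    if len(reference) != len(site) or len(reference) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_ref, mean_site = mean(reference), mean(site)
    sxx = sum((r - mean_ref) ** 2 for r in reference)
    if sxx == 0:
        raise ValueError("reference series has no variation")
    sxy = sum((r - mean_ref) * (s - mean_site) for r, s in zip(reference, site))
    slope = sxy / sxx
    return slope, mean_site - slope * mean_ref


def backcast(reference_history, slope, intercept):
    """Apply the fitted line to the station's longer record."""
    return [slope * r + intercept for r in reference_history]


# Hypothetical monthly mean wind speeds in m/s (illustrative values only).
# Concurrent period: the met tower and the weather station both recording.
station_concurrent = [4.0, 5.0, 6.0, 5.5, 4.5, 6.5]
site_concurrent = [5.1, 6.2, 7.4, 6.9, 5.6, 8.0]

slope, intercept = fit_line(station_concurrent, site_concurrent)

# Earlier station record, from before the met tower existed.
station_long_term = [4.2, 5.8, 6.1, 4.9, 5.3]
site_backcast = backcast(station_long_term, slope, intercept)

# The mean of the back-cast serves as the long-term estimate for the site.
long_term_estimate = mean(site_backcast)
print(f"slope={slope:.3f} intercept={intercept:.3f} "
      f"long-term site estimate={long_term_estimate:.2f} m/s")
```

The sketch works on wind speed only and ignores direction, seasonality, and sensor error, so it should be read as a picture of the idea and not as a usable resource assessment.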


The prospector needs years of data and, at the end of the process, ends up with a wind resource estimate that has, on average, overstated the resource by seven to nine percent.


Because of this, bankers invest with extreme caution at low valuations, and developers face heightened uncertainty and higher capital costs. The lack of complete data on wind resources and wind farm performance inflates costs, risks, and project development timelines, and hampers the rollout of one of our most proven renewable energy technologies.


By using sophisticated mathematical tools developed at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), Cardinal Wind can generate highly accurate, site-specific wind resource predictions from limited and noisy data. We'd take this dream dataset, learn from it, combine it with existing data, and deliver better prospecting tools and drive wind farm development.



Idea No. 73