ChatGPT Prompt Proficiency: Be Persistent!

Written by: Troy Kelley

US military recruit training programs are known for rapidly transforming individuals into servicemembers who are physically and mentally fit for their duties. These programs are persistent and relentless in pushing recruits to new levels of maturity and performance. Giving any less than 100% is not an option.

The appearance of U.S. Department of Defense (DoD) visual information does not imply or constitute DoD endorsement

https://nara.getarchive.net/media/hurst-texas-native-a-marine-corps-drill-instructor-on-parris-island-e176b2

ChatGPT is not an intelligent being, but it’s enjoyable to treat it as such since it does quite well using a conversational style. It can also be lazy (theories on why later). The point of this post is to demonstrate how it’s possible to push ChatGPT beyond its early, sometimes meager efforts. This can be a valuable prompt tactic for maximizing your productivity with this tool.

NOTE: ChatGPT-4 was used for this post. A future post may address whether GPT-3 is sufficient for work like this.

Since aviation is a passion of mine, I decided to have a session with ChatGPT to get some help decoding Terminal Aerodrome Forecast (TAF) reports. These reports help pilots understand the weather for flight planning. They are heavily coded because they date back to a time when transmitting data was expensive. While pilots are required to be able to read the coded messages, it’s a great help to have them decoded into plain English.

Another useful tactic is not to assume that ChatGPT can’t get hold of the specification on its own. I start by testing what it already knows.

Are you aware of the Terminal Aerodrome Forecast and its related symbols?

Yes, I can certainly help with understanding Terminal Aerodrome Forecasts (TAFs) and their related symbols.

TAFs are weather forecast reports provided for aviation purposes. They are normally issued four times a day, every six hours, and typically have a forecast period of 24 hours (some major airports may have a forecast period of 30 hours).

They use a concise coding system which represents different weather conditions. Here are some common symbols and their meanings:

Wind – It’s denoted by the direction (in tens of degrees from true north) and speed (in knots). For example, "27015G25KT" means a wind coming from 270 degrees at 15 knots, gusting up to 25 knots.

Visibility – Given in statute miles or meters. For example, "9999" means a visibility of 10 kilometers or more.

Weather Phenomena – These are denoted by a combination of different letters:

TS – Thunderstorm
SH – Showers
RA – Rain
SN – Snow
DZ – Drizzle
FG – Fog
BR – Mist
HZ – Haze
FU – Smoke
VA – Volcanic ash
DU – Widespread dust
SA – Sand
FZ – Freezing

Cloud Cover – This is described with a combination of cloud type and altitude (in hundreds of feet). For example, "BKN020" means ‘broken’ clouds at 2000 feet.

FEW – Few clouds
SCT – Scattered clouds
BKN – Broken clouds
OVC – Overcast clouds

Changes in Forecast – These are denoted by words like "BECMG" (becoming) or "TEMPO" (temporary).

Cavok – It means Ceiling and Visibility OK, implying no significant weather to report and visibility is greater than 10 km, and no cloud below 5000 feet or the highest minimum sector altitude.

These are just a few examples. A comprehensive list can be obtained from aviation regulatory bodies or aviation weather services like the FAA or the UK’s Met Office.

In reading these forecasts, one must understand that the terminology and coding system is designed for brevity and worldwide use, thus it’s a bit cryptic if you’re not familiar with it. However, once you get the hang of it, it becomes a very efficient way to understand complex weather conditions quickly.

Let’s see if we can take the leap from knowledge to some code.

Can you write me a parser for TAFs in Java?

Certainly! A simple Java parser for TAFs could look something like this:

The parser in this example will simply take a single line of a TAF as input and print the parsed information. I’ll use a switch statement to handle different weather codes.

Note: This code is a simple example and doesn’t handle all TAF codes.

This parser reads through each piece of a Terminal Aerodrome Forecast (TAF) and prints out the meaning of each symbol. The parseTAF method accepts a string, splits it into individual codes, and processes each one.

The parseCode method looks at each code and determines what it is. If it’s a recognized weather phenomenon, it prints out its meaning. If it’s a cloud coverage, wind speed, or visibility code, it parses these further using specific methods.

The parseCloudCoverage, parseWind, and parseVisibility methods interpret cloud coverage, wind speed, and visibility codes respectively, returning a string that describes the condition in plain language.

This parser doesn’t handle all possible TAF codes, and the string interpretation could be more sophisticated, but it should give you a good starting point.
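
The code from this response is not shown here. As a rough illustration only, a parser matching the description above might look like the following sketch. It is not ChatGPT's actual output: the class name SimpleTafParser is hypothetical, the method names follow the ones mentioned in the response, and parseCode returns its description rather than printing it (an assumption made so the result is easy to check). Only a handful of codes are covered.

```java
// Illustrative sketch only; not the code ChatGPT generated in this session.
// SimpleTafParser is a hypothetical name; method names follow the response text.
final class SimpleTafParser {

    // Splits a single TAF line on whitespace and prints the meaning of each code.
    static void parseTAF(String taf) {
        for (String code : taf.trim().split("\\s+")) {
            System.out.println(code + ": " + parseCode(code));
        }
    }

    // Identifies the kind of code and returns a plain-English description.
    static String parseCode(String code) {
        switch (code) {
            case "TS": return "Thunderstorm";
            case "SH": return "Showers";
            case "RA": return "Rain";
            case "SN": return "Snow";
            case "DZ": return "Drizzle";
            case "FG": return "Fog";
            case "BR": return "Mist";
            case "HZ": return "Haze";
            default: break;
        }
        if (code.matches("(FEW|SCT|BKN|OVC)\\d{3}")) return parseCloudCoverage(code);
        if (code.matches("\\d{5}(G\\d{2})?KT")) return parseWind(code);
        if (code.matches("\\d{4}|P?\\d+SM")) return parseVisibility(code);
        return "Unrecognized code";
    }

    // Example: BKN020 -> "Broken clouds at 2000 feet" (height is in hundreds of feet).
    static String parseCloudCoverage(String code) {
        String amount;
        switch (code.substring(0, 3)) {
            case "FEW": amount = "Few"; break;
            case "SCT": amount = "Scattered"; break;
            case "BKN": amount = "Broken"; break;
            default: amount = "Overcast"; break;
        }
        int feet = Integer.parseInt(code.substring(3)) * 100;
        return amount + " clouds at " + feet + " feet";
    }

    // Handles only the dddffKT and dddffGggKT forms (no VRB, no three-digit speeds).
    static String parseWind(String code) {
        String text = "Wind from " + code.substring(0, 3) + " degrees at "
                + Integer.parseInt(code.substring(3, 5)) + " knots";
        if (code.charAt(5) == 'G') {
            text += ", gusting to " + Integer.parseInt(code.substring(6, 8)) + " knots";
        }
        return text;
    }

    // Handles four-digit meters (9999 means 10 km or more) and whole statute miles.
    static String parseVisibility(String code) {
        if (code.equals("9999")) return "Visibility 10 kilometers or more";
        if (code.endsWith("SM")) {
            boolean more = code.startsWith("P");
            String miles = code.substring(more ? 1 : 0, code.length() - 2);
            return "Visibility " + (more ? "greater than " : "") + miles + " statute miles";
        }
        return "Visibility " + Integer.parseInt(code) + " meters";
    }

    public static void main(String[] args) {
        // Synthetic sample line for illustration, not real airport data.
        parseTAF("27015G25KT 9999 RA BKN020");
    }
}
```

Anything the sketch does not recognize, such as fractional visibilities or time groups, falls through to "Unrecognized code", which is exactly the kind of gap the next prompt pushes on.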

You might be wondering if GPT is being lazy here: “This parser doesn’t handle all possible TAF codes…” If I’m using this as a productivity tool (and paying for it), I’d like to go beyond a good starting point if possible. Let’s push a bit:

Are you able to generate code that’s more comprehensive and handles more TAF codes?

Sure, creating a more comprehensive parser would involve mapping a larger list of weather phenomenon codes and handling more variations of wind, visibility, and cloud codes. I’ll also add support for codes that include significant weather changes over time (FM, BECMG, and TEMPO).

However, please note that even a more comprehensive parser as such would still not cover all possible codes. TAFs can include very complex and nuanced weather descriptions that might be challenging to fully parse with a simple method. If you need a parser that can handle any TAF, you may need to consider using a more sophisticated approach or a dedicated library.

Here’s an expanded version of the previous parser:

This parser uses regular expressions to match wind, visibility, and cloud cover patterns. The parseElement method has been expanded to handle a wider range of weather codes. It also includes support for the "FM" (from) and "TEMPO" (temporary) codes, which indicate changes in weather conditions over time. It still does not handle all possible TAF codes, but it’s a step towards a more comprehensive parser.
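
Again, the generated code is not shown here. A minimal sketch of the regular-expression approach described above could look like this. The class name RegexTafParser, the pattern constants, and the wording of the output strings are all assumptions for illustration; only the parseElement name comes from the response. The FM pattern assumes the six-digit day/hour/minute form (FMDDHHmm).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; not the code ChatGPT generated in this session.
final class RegexTafParser {
    // Direction (or VRB), speed, optional gust, in knots.
    private static final Pattern WIND =
            Pattern.compile("(\\d{3}|VRB)(\\d{2,3})(?:G(\\d{2,3}))?KT");
    // Whole statute miles, optionally prefixed with P ("more than").
    private static final Pattern VISIBILITY_SM = Pattern.compile("(P)?(\\d+)SM");
    // Cloud amount and height in hundreds of feet.
    private static final Pattern CLOUD = Pattern.compile("(FEW|SCT|BKN|OVC)(\\d{3})");
    // FM followed by day, hour, and minute (assumed FMDDHHmm form).
    private static final Pattern FROM = Pattern.compile("FM(\\d{2})(\\d{2})(\\d{2})");

    static String parseElement(String element) {
        if (element.equals("TEMPO")) return "Temporary conditions follow";
        if (element.equals("BECMG")) return "Conditions becoming";

        Matcher m = WIND.matcher(element);
        if (m.matches()) {
            String text = m.group(1).equals("VRB")
                    ? "Variable wind"
                    : "Wind from " + m.group(1) + " degrees";
            text += " at " + Integer.parseInt(m.group(2)) + " knots";
            if (m.group(3) != null) {
                text += ", gusting to " + Integer.parseInt(m.group(3)) + " knots";
            }
            return text;
        }
        m = VISIBILITY_SM.matcher(element);
        if (m.matches()) {
            return "Visibility " + (m.group(1) != null ? "greater than " : "")
                    + m.group(2) + " statute miles";
        }
        m = CLOUD.matcher(element);
        if (m.matches()) {
            return cloudAmount(m.group(1)) + " clouds at "
                    + Integer.parseInt(m.group(2)) * 100 + " feet";
        }
        m = FROM.matcher(element);
        if (m.matches()) {
            return "From day " + m.group(1) + " at " + m.group(2) + ":" + m.group(3) + " UTC";
        }
        return "Unrecognized element: " + element;
    }

    private static String cloudAmount(String code) {
        switch (code) {
            case "FEW": return "Few";
            case "SCT": return "Scattered";
            case "BKN": return "Broken";
            default: return "Overcast";
        }
    }

    public static void main(String[] args) {
        // Synthetic sample line for illustration, not real airport data.
        for (String element : "FM061800 18010G20KT P6SM SCT040 TEMPO".split("\\s+")) {
            System.out.println(element + ": " + parseElement(element));
        }
    }
}
```

Capture groups do the work that substring arithmetic did in a simpler parser, which makes variations such as VRB winds or optional gusts a one-line change to the pattern.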

OK, better, but is it really the best we can hope to get? This latest update seems to have been entirely possible from the start. Could it be that GPT is conserving processing power by avoiding the generation of more comprehensive solutions? Time to push some more.

Can you list some codes that you are aware of that the most recent code does not handle?

Certainly, here are some codes that the latest code does not handle:
Runway Visual Range (RVR): This is given by an ‘R’ followed by the runway number, then a slash, and the visibility in feet. For example, "R17L/1200FT" means the runway visual range for runway 17 left is 1200 feet.
Vertical Visibility (VV): Used when the sky is obscured (like in fog), represented by ‘VV’ followed by the vertical visibility in hundreds of feet. For example, "VV002" means a vertical visibility of 200 feet.
Variation in Wind Direction: Represented by ‘V’ followed by the minimum and maximum directions between which the wind is varying. For example, "180V240" means the wind direction is varying between 180 and 240 degrees.
Recent Weather Phenomena: Denoted by ‘RE’ followed by the weather phenomenon codes mentioned earlier. For example, "RETSRA" means recent thunderstorm with rain.
Probabilistic forecasts: Indicated by ‘PROB30’ or ‘PROB40’, followed by a time period and the conditions. This is used to indicate the 30% or 40% probability of the following conditions for the given time period. For example, "PROB40 0607/0610 4SM TSRA" means there’s a 40% chance of having a visibility of 4 statute miles with thunderstorms and rain between the time period 0607 and 0610.
Coded Trend Forecast: Coded indicators for forecast changes in the TAF period are given by ‘BECMG’ and ‘TEMPO’. The codes handle ‘BECMG’ and ‘TEMPO’, but don’t actually parse what follows these indicators, which include time, weather phenomena, wind speed, and other factors. For instance, in "TEMPO 0607/0610 4SM TSRA", the parser doesn’t actually parse "0607/0610 4SM TSRA".
Wind Shear: This is given by ‘WS’ followed by the level (surface or specific altitude) and the wind speed and direction at that level. For example, "WS020/18045KT" means wind shear at 2000 feet with wind from 180 degrees at 45 knots.
Significant changes: ‘NSW’ stands for No Significant Weather, and ‘NOSIG’ is for No Significant Change. These are not currently parsed.
Cavok: ‘CAVOK’ stands for Ceiling And Visibility OK, implying no significant weather to report, visibility is greater than 10 km, and no cloud below 5000 feet or the highest minimum sector altitude. This is not currently parsed.
To create a comprehensive parser that can handle all TAF codes, you’d need to expand the code to handle these and possibly other codes. Creating such a parser can be quite complex due to the numerous possible codes and combinations in TAFs.

Well, could you create an even more advanced parser to handle some of the codes you just mentioned?

Sure, below is an extended version of the previous code to parse Runway Visual Range (RVR), Vertical Visibility (VV), Variation in Wind Direction, Recent Weather Phenomena, and Probabilistic forecasts. This code remains a simplification and doesn’t cover all edge cases or complex scenarios, but it should give you a good starting point.

With this code, we add new pattern matchers for the new codes we want to parse. We then include these matchers in the parseElement function, calling new parsing methods when these patterns are matched. As before, each new parsing method extracts the relevant data from the pattern matcher and returns a string describing the condition in plain language.
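
The extended code itself is not shown here. As a hedged sketch of what the added pattern matchers might look like, the example below follows the code formats exactly as ChatGPT described them in its list (for example "R17L/1200FT" and "RETSRA"). The class name ExtendedTafCodes, the pattern constants, and the output wording are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; not the code ChatGPT generated in this session.
// Formats follow the examples given in the response, not an official specification.
final class ExtendedTafCodes {
    private static final Pattern RVR = Pattern.compile("R(\\d{2}[LRC]?)/(\\d{4})FT");
    private static final Pattern VERTICAL_VIS = Pattern.compile("VV(\\d{3})");
    private static final Pattern VARIABLE_WIND = Pattern.compile("(\\d{3})V(\\d{3})");
    private static final Pattern RECENT = Pattern.compile("RE((?:[A-Z]{2})+)");
    private static final Pattern PROB = Pattern.compile("PROB(30|40)");

    static String parseElement(String element) {
        Matcher m = RVR.matcher(element);
        if (m.matches()) {
            return "Runway " + m.group(1) + " visual range "
                    + Integer.parseInt(m.group(2)) + " feet";
        }
        m = VERTICAL_VIS.matcher(element);
        if (m.matches()) {
            // Vertical visibility is coded in hundreds of feet.
            return "Vertical visibility " + Integer.parseInt(m.group(1)) * 100 + " feet";
        }
        m = VARIABLE_WIND.matcher(element);
        if (m.matches()) {
            return "Wind direction varying between " + m.group(1)
                    + " and " + m.group(2) + " degrees";
        }
        m = RECENT.matcher(element);
        if (m.matches()) {
            return parseRecent(m.group(1));
        }
        m = PROB.matcher(element);
        if (m.matches()) {
            return m.group(1) + "% probability of the following conditions";
        }
        return "Unrecognized element: " + element;
    }

    // Decodes the two-letter weather codes that follow the RE prefix.
    private static String parseRecent(String codes) {
        StringBuilder text = new StringBuilder("Recent weather: ");
        for (int i = 0; i < codes.length(); i += 2) {
            if (i > 0) text.append(", ");
            text.append(weatherName(codes.substring(i, i + 2)));
        }
        return text.toString();
    }

    private static String weatherName(String code) {
        switch (code) {
            case "TS": return "thunderstorm";
            case "SH": return "showers";
            case "RA": return "rain";
            case "SN": return "snow";
            case "DZ": return "drizzle";
            case "FG": return "fog";
            default: return code; // unknown codes are passed through unchanged
        }
    }

    public static void main(String[] args) {
        // Example codes taken from the list in the response above.
        for (String element : "R17L/1200FT VV002 180V240 RETSRA PROB40".split(" ")) {
            System.out.println(element + ": " + parseElement(element));
        }
    }
}
```

Note that the sketch only labels the PROB group; the time period and conditions that follow it would still need their own handling.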

By now you get the idea. We could probably push for more, but let’s switch gears and get some tests.

Would you be able to generate some junit tests that include real TAF data from US airports?

Certainly! In the following example, we’re going to create a JUnit test for our AdvancedTAFParser. We’re using TAF data from four U.S. airports: KJFK (John F. Kennedy International Airport), KLAX (Los Angeles International Airport), KORD (O’Hare International Airport), and KMIA (Miami International Airport). 

We will be comparing the expected parsed output with the actual parsed output. 

Firstly, let’s modify our AdvancedTAFParser to return a List of String for each element parsed from the TAF instead of printing them to the console. This will allow us to easily compare the parsed elements in our JUnit test:

Now we can create our JUnit test:

This test case will check the parser’s ability to correctly parse wind and cloud information from TAFs. It also tests the parser’s handling of gusting wind information. As our parser does not support Wind Shear (WS) yet, we do not expect it to parse that information.

Keep in mind that in practice, it might be beneficial to have individual test cases for each piece of functionality you’re testing, to make it clear what exactly is failing if a test does fail.

Please note that you need to add the JUnit library to your classpath to run this test. If you’re using a build system like Maven or Gradle, you can add JUnit as a dependency in your build file.
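
Neither the modified parser nor the JUnit test is shown here. The sketch below illustrates the idea of returning a list of parsed elements and checking it by position. To keep it runnable without extra dependencies, a small hypothetical check helper stands in for JUnit's assertEquals; the class name ListTafParser and the sample line are also assumptions, and the sample is synthetic rather than real airport data.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; ListTafParser is a hypothetical stand-in for the
// list-returning parser described in the response.
final class ListTafParser {

    // Returns one description per recognized element instead of printing it.
    static List<String> parse(String taf) {
        List<String> parsed = new ArrayList<>();
        for (String element : taf.trim().split("\\s+")) {
            if (element.matches("\\d{5}(G\\d{2})?KT")) {
                parsed.add(describeWind(element));
            } else if (element.matches("(FEW|SCT|BKN|OVC)\\d{3}")) {
                parsed.add(describeCloud(element));
            }
            // Anything else (visibility, wind shear, ...) is skipped in this sketch.
        }
        return parsed;
    }

    private static String describeWind(String code) {
        String text = "Wind from " + code.substring(0, 3) + " degrees at "
                + Integer.parseInt(code.substring(3, 5)) + " knots";
        if (code.charAt(5) == 'G') {
            text += ", gusting to " + Integer.parseInt(code.substring(6, 8)) + " knots";
        }
        return text;
    }

    private static String describeCloud(String code) {
        String amount = code.startsWith("FEW") ? "Few"
                : code.startsWith("SCT") ? "Scattered"
                : code.startsWith("BKN") ? "Broken" : "Overcast";
        return amount + " clouds at " + Integer.parseInt(code.substring(3)) * 100 + " feet";
    }

    // Plain-Java stand-in for a JUnit assertion.
    private static void check(boolean condition, String message) {
        if (!condition) throw new AssertionError(message);
    }

    public static void main(String[] args) {
        // Synthetic line for illustration, not real airport data.
        List<String> parsed = parse("27015G25KT P6SM BKN020 WS020/18045KT");
        check(parsed.size() == 2, "only wind and cloud should be parsed");
        check(parsed.get(0).equals("Wind from 270 degrees at 15 knots, gusting to 25 knots"),
                "gusting wind");
        check(parsed.get(1).equals("Broken clouds at 2000 feet"), "cloud layer");
        System.out.println("All checks passed");
    }
}
```

Returning values instead of printing is what makes these checks possible at all, since there is now something concrete to compare against.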

GPT leaves out some tests for the most recently generated code. Knowing that this would be pointed out if we attempted to submit a PR for this code, we must have more comprehensive tests! Also note that the act of generating tests has prompted GPT to update the parser implementation with return values that can be tested – and that are much better for actually using the code beyond console print statements.

Those tests only test for wind and clouds, but we parse other data. Can we make the tests more comprehensive?

Sure! We can extend the test case to also include Runway Visual Range (RVR), Vertical Visibility (VV), and Probabilistic forecasts.

In this updated test, we’ve added checks for the RVR, VV, and PROB codes. For the PROB check in the KMIA TAF, we’re not testing for a specific position in the list of parsed elements, since the placement can depend on the order the codes appear in the TAF. Instead, we’re using assertTrue to check that the probability string appears somewhere in the list of parsed elements.

Remember that this test case is designed to work with the modified AdvancedTAFParser code that returns a list of parsed elements rather than printing them to the console. Please also note that the code and tests here are still a simplification and may not cover all edge cases or complex scenarios. They serve as a starting point for building a more robust TAF parser and its tests.
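
The updated test is not shown here either. The position-independent check that the response describes can be sketched as follows. The class name ContainsCheckExample and the sample lines are hypothetical (the lines are synthetic, not real airport data), and a plain contains check stands in for what would be assertTrue(parsed.contains(expected)) in JUnit.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; shows a membership check that does not depend on
// where in the TAF the PROB group appears.
final class ContainsCheckExample {

    // Minimal list-returning parser covering just PROB and VV for this example.
    static List<String> parse(String taf) {
        List<String> parsed = new ArrayList<>();
        for (String element : taf.trim().split("\\s+")) {
            if (element.matches("PROB(30|40)")) {
                parsed.add(element.substring(4) + "% probability of the following conditions");
            } else if (element.matches("VV\\d{3}")) {
                parsed.add("Vertical visibility "
                        + Integer.parseInt(element.substring(2)) * 100 + " feet");
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        String expected = "40% probability of the following conditions";

        // The same groups in two different orders (synthetic lines).
        List<String> first = parse("VV002 PROB40 0607/0610 4SM TSRA");
        List<String> second = parse("PROB40 0607/0610 4SM TSRA VV002");

        // An index-based check would need a different index for each line;
        // a membership check works for both.
        if (!first.contains(expected) || !second.contains(expected)) {
            throw new AssertionError("PROB40 description missing");
        }
        System.out.println("PROB group found in both orderings");
    }
}
```

The trade-off is that a membership check no longer verifies ordering, so it suits groups whose position legitimately varies and is less useful for elements that must appear in a fixed place.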

In summary, don’t settle for a starting point. This interaction shows that GPT often needs a push before it becomes a great tool in your development toolbox. We know that GPT requires a lot of computing power, and one possible explanation for the less comprehensive first responses is that they help balance the demand on those resources, although that is speculation on my part. Either way, persistence with a few extra prompts can save a lot of time.