I don't see anything wrong with its reasoning. UM16 isn't explicitly mentioned in the data sheet, but the UM prefix is listed in the 'Device marking code' column. The model hedges its response accordingly ("If the marking is UM16 on an SMA/DO-214AC package...") and reads the graph in Fig. 1 correctly.
Of course, it took 18 minutes of crunching to get the answer, which seems a tad excessive.
However, it is less true with info missing from the training data - e.g. "I have a Diode marked UM16, what is the maximum current at 125C?"