Tuesday, March 16, 2010

What's Wrong With This Picture?

Crazy Data From NASA Or Bugs In My Programs?

While I'm waiting to hear from NASA, I've started doing a QA check of both my code and the Aqua Satellite AMSU data. The above picture shows an example of some of the things I've come across. The picture shows NASA's AMSU channel 5 data for every day in January 2008 and in January 2009 through January 2010 for all 30 footprints (14 months total). The view along the X axis is by footprint, with an average at the end. The data was generated by my AMSUSummary program and displayed in Apple Numbers.
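The per-footprint averaging described above could be sketched roughly as follows. This is only an illustration of the idea, not the actual AMSUSummary code (which isn't shown in this post); the class and method names here are hypothetical.

```java
// Hypothetical sketch of per-footprint averaging, in the spirit of what a
// program like AMSUSummary might do. All names are illustrative only.
public class FootprintAverage {
    static final int FOOTPRINTS = 30;

    /**
     * scans[day][footprint] holds one reading per footprint per day.
     * Returns the average reading for each of the 30 footprints across
     * all days, which is what the X-axis-by-footprint view plots.
     */
    static double[] averageByFootprint(double[][] scans) {
        double[] sums = new double[FOOTPRINTS];
        for (double[] day : scans) {
            for (int f = 0; f < FOOTPRINTS; f++) {
                sums[f] += day[f];
            }
        }
        for (int f = 0; f < FOOTPRINTS; f++) {
            sums[f] /= scans.length;
        }
        return sums;
    }
}
```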

You can see the obvious problems with the data. Channels 17 and 18 are just whacked. Channels 25 through 30 go in the wrong direction, and channels 1 through 6 have too high a value. Compare this to the more standard picture of what such a footprint snapshot should look like, shown below.

What 30 Footprints Should Look Like.

Below is the same data again, this time the X axis represents time.

And here's daily data from December 31st, 2009. This data has been through my QA process, and none of the scans for channels 24 or 25 passed QA. The QA check verifies that no reading deviates by more than +/- 50% from the expected readings as defined in the literature.
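The +/- 50% check described above boils down to a simple range test. Here's a minimal sketch of that test; the names are hypothetical and the real QA code may be structured differently.

```java
// Hypothetical sketch of the +/- 50% QA range check described above.
// Not the actual AMSUSummary QA code; names are illustrative only.
public class RangeCheck {
    /**
     * Returns true if the reading is within +/- 50% of the expected
     * value taken from the literature.
     */
    static boolean passesQA(double reading, double expected) {
        double tolerance = 0.5 * Math.abs(expected);
        return Math.abs(reading - expected) <= tolerance;
    }
}
```

For example, with an expected brightness temperature of 250 K, a reading of 260 K passes (within 125 K of expected) while a reading of 400 K fails.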

So the question is: is this strange data due to bugs in my code, or is this the way the data actually looks? I've already done QA checks on my code, but for this next week, I'll be going over it again to see where the problem lies.

So for the rest of the week expect QA posts from me. In addition to checking out the code and data, these posts will give good instructions on how to use the programs I've written so far.


  1. Just amazing what some QA can uncover :-)

    It might help to demonstrate the scale of the problem by also showing "the other half of the orange" for the last graph... blank out the good data and just show the data that is more than +/- 50% of the expected readings...

    Keep up the good work... thank you...

  2. Thank you, Malaga.

    And a very good idea about graphing the bad data. Hopefully I can get that up tonight.

  3. "At this point I realized it was hopeless to try and salvage the concept of running a limb validation in the extract."
This seems to be a valid conclusion based upon your results...

    But it does leave me wondering:

    What "data" are we really dealing with here?

But we can't really answer that question at the moment; there are still too many unknowns regarding the data, since the jagged footprint profile you have graphed is so very far removed from the ideal "smooth semicircle" we were told to expect.

    1) Does this mean that putting GARBAGE IN will only generate GARBAGE OUT results?

    2) Does this mean that the channel calibrations are wrong and that the data could be "recalibrated" in software to form a "smooth semicircle"?

    3) Does this mean that we have some "bad" sensors that should be ignored?

    4) Does this mean the theory behind the "smooth semicircle" is wrong?

    Therefore, I would be very interested to hear the Magic Java take / perspective on: What "data" are we really dealing with here?

The "Magic Java" take is that it's too soon to tell. Much of the strangeness in the graphs may be due to bugs in my code. Even the limb readings being out of range may not be enough to throw off the monthly or even daily averages. We'll have to wait and see.

The main reason I took out the limb validation is that I was concerned that leaving it in would "shape" the data to the expected semicircle. Any data that didn't agree would get tossed out. That's really not what I want to do.

    So it's gone. But learning the shape of the data after this change will have to wait until I've finished QAing AMSUSummary.