Data Standardization Question


pogrmman
Posts: 603
Joined: Wed Jun 29, 2016 10:53 pm UTC
Location: Probably outside

Data Standardization Question

Postby pogrmman » Mon Apr 01, 2019 11:19 pm UTC

Hi everyone.
I’m doing a project on the urban heat island effect. To that end, I’ve got daily temperature data at different climate sites for the 100 largest metro areas from 1966-2015. I’m using census data to get the populations of each metro area during each decade.

The thing is, there’s a very noticeable effect from global warming: after standardizing the temperature data for each climate site against that site’s average over the whole 1966-2015 period, I’m getting an R-squared of nearly 30% just from regressing standardized temps on decade. Some of that is due to the urban heat island growing along with population, but the trend shows up pretty strongly even at the most rural climate sites I’ve got data from (35-50km from the metro center), where the urban heat island is lessened.
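For anyone following along, here is a minimal sketch of what that per-site standardization and decade regression might look like. It is not the actual pipeline from the project: the function names, the `(site, decade, temp)` record layout, and the toy values are all assumptions made up for illustration, and it uses only the Python standard library.

```python
from collections import defaultdict
from statistics import mean, pstdev

def standardize_by_site(records):
    """Convert temps to z-scores using each site's own mean and std dev.

    records: list of (site, decade, temp) tuples (hypothetical layout).
    Assumes every site has at least two distinct temperatures, so the
    standard deviation is nonzero.
    """
    by_site = defaultdict(list)
    for site, _decade, temp in records:
        by_site[site].append(temp)
    stats = {s: (mean(t), pstdev(t)) for s, t in by_site.items()}
    return [(site, decade, (temp - stats[site][0]) / stats[site][1])
            for site, decade, temp in records]

def r_squared(xs, ys):
    """R-squared of a simple least-squares line y ~ x.

    For one predictor this equals the squared Pearson correlation.
    """
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)

# Made-up toy values: two sites that both warm steadily.
records = [
    ("A", 1970, 10.0), ("A", 1980, 10.5), ("A", 1990, 11.0),
    ("B", 1970, 20.0), ("B", 1980, 21.0), ("B", 1990, 22.0),
]
z = standardize_by_site(records)
r2 = r_squared([d for _, d, _ in z], [v for _, _, v in z])
```

In this toy case both sites warm perfectly linearly, so decade explains essentially all of the variance in the standardized temps. With real daily data the figure would be far lower, and on its own it can't say how much of the trend is global warming versus population growth.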

Sure, some metro areas are big enough that a site that far out might still be within a dense area, but some of the ones showing this trend certainly aren’t. For instance, I’m pretty sure the station in Claremore, OK, outside of Tulsa, is in a pretty darn rural area.

Is there a good way to separate the global warming effect from the population effect, or is that unrealistic?

Sizik
Posts: 1243
Joined: Wed Aug 27, 2008 3:48 am UTC

Re: Data Standardization Question

Postby Sizik » Tue Apr 02, 2019 2:32 pm UTC

Have a control group of climate sites in rural areas that shouldn't be affected by the effect you're testing for?

pogrmman
Posts: 603
Joined: Wed Jun 29, 2016 10:53 pm UTC
Location: Probably outside

Re: Data Standardization Question

Postby pogrmman » Wed Apr 03, 2019 6:35 am UTC

Sizik wrote:Have a control group of climate sites in rural areas that shouldn't be affected by the effect you're testing for?

The biggest reason I haven't done that is I have no need or want for additional data! The whole, clean dataset is already quite large (~2GB), and it's already made certain things sluggish to calculate... I'm wondering whether this would be a valid technique: take the most rural 10 or 15% of the stations in my dataset, remove the effect on temperature predicted by population and distance from the metro area, then average those stations by decade to create a baseline for each decade.
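A rough sketch of that baseline idea, to make the steps concrete. Whether it is statistically valid is exactly the open question, so treat this as an illustration only. The function names, the `distance_km` mapping, and the toy numbers are hypothetical, and the step of removing the population/distance effect from the rural stations is left out for brevity.

```python
from collections import defaultdict
from statistics import mean

def rural_baseline(rows, distance_km, rural_fraction=0.15):
    """Average the standardized temps of the most rural stations per decade.

    rows: list of (site, decade, z) tuples (hypothetical layout).
    distance_km: dict mapping site -> distance from its metro center.
    "Most rural" is approximated here as "farthest from the metro center".
    """
    ranked = sorted(distance_km, key=distance_km.get, reverse=True)
    n_rural = max(1, round(len(ranked) * rural_fraction))
    rural = set(ranked[:n_rural])
    by_decade = defaultdict(list)
    for site, decade, z in rows:
        if site in rural:
            by_decade[decade].append(z)
    return {d: mean(v) for d, v in by_decade.items()}, rural

def subtract_baseline(rows, baseline):
    """Remove the per-decade rural baseline from every station's value."""
    return [(s, d, z - baseline[d]) for s, d, z in rows]

# Made-up toy values, not real station data.
distance_km = {"urban": 5, "suburb": 20, "rural": 45}
rows = [
    ("urban", 1970, -0.5), ("urban", 1990, 1.0),
    ("suburb", 1970, -0.5), ("suburb", 1990, 0.75),
    ("rural", 1970, -0.5), ("rural", 1990, 0.5),
]
baseline, rural_sites = rural_baseline(rows, distance_km)
adjusted = subtract_baseline(rows, baseline)
```

After subtraction, whatever trend remains at the urban and suburban sites is the part not shared with the rural stations. That only isolates the population effect if the rural stations really are free of it, which is the assumption being questioned in this thread.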

gmalivuk
GNU Terry Pratchett
Posts: 26725
Joined: Wed Feb 28, 2007 6:02 pm UTC
Location: Here and There

Re: Data Standardization Question

Postby gmalivuk » Wed Apr 03, 2019 10:40 pm UTC

pogrmman wrote:
Sizik wrote:Have a control group of climate sites in rural areas that shouldn't be affected by the effect you're testing for?

The biggest reason I haven't done that is I have no need or want for additional data!

I mean, it sounds a lot like you *do* need additional data because otherwise you have no good way to remove the biggest confounder from your analysis.

