Correlation tools?


#1

Hi folks!

I have a bunch of data about myself, and I’d love to see if there are any real correlations between them.

I haven’t found any great tools for it. It appears that the fancy pants folks who “learned a bunch of statistics” so they can “know what they’re doing” and not “completely abuse the tools” :slight_smile: use things like R and SPSS. Other folks use things like Exist.io which would be perfect if I could enter my own data into it…

I ran into a post by Nick Winter, http://blog.nickwinter.net/cold-showers-tested, where he uses Statwing, and it looks like a cheat code that would let me be somewhat fancy without investing a lot of time.

Statwing appears to have been acquired and may not actually be a real solution anymore either, so I turn to you.

  1. I have a few semesters of stats under my belt, and I try to use “stats thinking” frequently, but I would have a very hard time doing hypothesis testing, for example, because it has been so long since I did anything like that. Will a quick tool for this stuff leave me with fake correlations because of caveats I’m forgetting?

  2. If you are a fancypantser, if I have the raw data, how long would it take to get up to speed to be able to do things like this? https://www.instagram.com/p/Bj_VTVGhKT6/?taken-by=anomalily
    I’m proficient in Excel and Python and Linux and CLI wrangling.

  3. Is there a new slick tool like Statwing I should be looking at?

  4. Is there a better forum for this question?

Thanks!


#2

So, uh, as a professional fancypantser (can I put that on my resume?) who uses Python and Pandas to do this for my job, I actually just use Google Sheets for my QS-y sorts of things. I get Pearson correlations (=CORREL), standard deviations, t-tests (=TTEST) and confidence intervals (=CONFIDENCE), and make plots with trendlines. Their help docs are pretty easy to follow, too.
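
For anyone who would rather stay in Python than a spreadsheet, here is a minimal sketch of what a Pearson correlation like =CORREL computes, using only the standard library. The `pearson` helper and the `sleep_hours` / `mood_score` numbers are made up for illustration, not real data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2 points)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a series is constant")
    return cov / (spread_x * spread_y)

# Hypothetical daily log: hours slept and a 1-10 mood rating.
sleep_hours = [6.5, 7.0, 8.0, 5.5, 7.5, 6.0, 8.5]
mood_score = [5, 6, 8, 4, 7, 5, 8]
print(f"r = {pearson(sleep_hours, mood_score):.2f}")
```

With pandas installed, `DataFrame.corr()` gives the same number for every pair of columns at once, which is handy when there are a lot of tracked variables.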

Is it perfect? God no! Will it give you false correlations? Hell yes, because you’re not going to be doing FDR corrections and so on because this is a fun thing. Does it matter too much? Nope, because you can pretty trivially repeat the experiment (of living your life) on things that appear to correlate, and figure out pretty fast if they’re legit or just errors.
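
If you ever do want the FDR correction, it is not much code. Below is a rough sketch of the Benjamini-Hochberg procedure in plain Python, assuming you already have one p-value per correlation you tested. The function name and the example p-values are invented for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where a result survives
    Benjamini-Hochberg FDR control at level alpha."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Everything at or below that rank counts as a discovery.
    keep = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            keep[idx] = True
    return keep

# Hypothetical p-values from ten different "does X correlate with Y" checks.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p_values))
```

In this made-up example, five of the ten p-values are under 0.05 on their own, but only the two smallest survive the correction. That is the kind of false correlation I mean.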


#3

Wizard is a (somewhat costly) desktop application that does something similar to Statwing. I don’t use it myself, due to being an R-using fancy-pants type. Buster Benson had an interesting post a few years ago about how he uses Wizard for self-experimentation.


#4

About custom data and Exist.io: their apps support “custom tags” (https://exist.io/blog/custom-tracking/), which are a way to record whether or not something happened on a given day. So you can enter a “read books” tag, but you can’t associate a value with it.
They also have an API (http://developer.exist.io/), so someone could potentially do a Beeminder integration (hint hint nudge nudge @dreev).


#5

Ah, that didn’t exist the last time I looked at it!

What sort of integrations would people want?


#6

I would guess the idea is to be able to send Beeminder data to Exist and then let Exist do its correlation magic. But it seems they still have a fixed set of “attributes” (http://developer.exist.io/#list-of-attributes), i.e. the types of data they store. So, for example, tracking body fat in kg rather than % would not be possible.