<![CDATA[dumbmatter.com]]>http://dumbmatter.com/metalsmith-feedMon, 11 Mar 2019 22:31:34 GMT<![CDATA[The double Dunning-Kruger effect]]>Wikipedia defines the Dunning-Kruger effect as:

a cognitive bias in which people of low ability have illusory superiority and mistakenly assess their cognitive ability as greater than it is

Look it up on Google Images and you'll find various summary images, like these:

The basic idea is that incompetent people so irrationally overinflate their abilities that they think they are just as good as, or even better than, experts.

And some of these figures are quite condescending. "Mt. Stupid", haha, imagine all the stupid people who are so stupid they don't even realize they are on Mt. Stupid! Those people must be really stupid, right?

But before being too condescending, we should take a look at the evidence behind the Dunning-Kruger effect. In the original paper as well as follow-up publications, the data looks like this:

That's Figure 1 from the original Dunning-Kruger paper, and others are similar.

It's quite a bit different than the stylized results from Google Images, isn't it?

Most notably, Mt. Stupid is gone. People with low skill do overestimate their abilities, but not to the point where they exceed high skill people, and there is no dropoff in confidence as ability increases.

Additionally, the Wikipedia article includes discussion of a skeptical viewpoint, which suggests that even the original Dunning-Kruger result may largely be a mathematical artifact rather than a genuine effect.
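To get a feel for how an artifact like that could arise, here is a minimal sketch of the most extreme case: actual skill and self-assessment are both pure random noise with no relationship at all. This is my own toy setup, not a reproduction of the method in any of the papers cited below, and the function names are made up for illustration.

```python
import random
import statistics


def percentile_ranks(values):
    """Return each value's percentile rank (0-100) within the list."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for position, index in enumerate(order):
        ranks[index] = 100.0 * position / (len(values) - 1)
    return ranks


def simulate_quartiles(n=10000, seed=0):
    """Group people by actual-score quartile and return a list of
    (mean actual percentile, mean self-assessed percentile) pairs."""
    rng = random.Random(seed)
    # Assumption: actual and self-assessed scores are independent noise.
    actual = percentile_ranks([rng.random() for _ in range(n)])
    perceived = percentile_ranks([rng.random() for _ in range(n)])
    results = []
    for q in range(4):
        lo, hi = 25 * q, 25 * (q + 1)
        members = [
            i for i in range(n)
            if lo <= actual[i] < hi or (q == 3 and actual[i] == 100)
        ]
        results.append((
            statistics.mean(actual[i] for i in members),
            statistics.mean(perceived[i] for i in members),
        ))
    return results


quartiles = simulate_quartiles()
for q, (mean_actual, mean_perceived) in enumerate(quartiles, start=1):
    print(f"Quartile {q}: actual {mean_actual:.1f}, self-assessed {mean_perceived:.1f}")
```

Even with zero real signal, the bottom quartile "overestimates" itself and the top quartile "underestimates" itself, simply because every group's self-assessment averages out near the middle. Real data shows some correlation between skill and self-assessment, so this is only the limiting case, but it shows why the shape of the plot alone is weak evidence.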

This is all quite hilarious, because from what I've seen, the main reason the Dunning-Kruger effect is referenced is to say something like "look at all those idiots who are too stupid to realize they are on Mt. Stupid". When in reality Mt. Stupid doesn't exist. Oh, the beautiful irony.

I hope I'm not making some elementary mistake here, but if I am, please write a blog post about how doubly ironic that is.

References

Kruger, Justin; Dunning, David (1999). "Unskilled and Unaware of It: How Difficulties in Recognizing One's Own Incompetence Lead to Inflated Self-Assessments". Journal of Personality and Social Psychology. 77(6): 1121-1134. doi:10.1037/0022-3514.77.6.1121

Burson, Katherine A.; Larrick, Richard P.; Klayman, Joshua (2006). "Skilled or unskilled, but still unaware of it: How perceptions of difficulty drive miscalibration in relative comparisons". Journal of Personality and Social Psychology. 90(1): 60-77. doi:10.1037/0022-3514.90.1.60

Nuhfer, Edward; Cogan, Christopher; Fleischer, Steven; Gaze, Eric; Wirth, Karl (2016). "Random Number Simulations Reveal How Random Noise Affects the Measurements and Graphical Portrayals of Self-Assessed Competency". Numeracy. 9(1). doi:10.5038/1936-4660.9.1.4

]]>
http://dumbmatter.com/posts/dunning-kruger.mdhttp://dumbmatter.com/posts/dunning-kruger.mdFri, 22 Feb 2019 00:00:00 GMT
<![CDATA[Age gaps and death gaps]]>

As you may have noticed, relationships often have age gaps. This is fine, but it does imply that the younger partner is likely to outlive the older partner. Further compounding this problem is that the older partner usually is the man, and men tend to die younger than women to begin with.

So here is a calculator to show the probability distribution of the number of years between the deaths of two partners, based on age and gender. Of course there are many other factors at play besides those two, but this should give you a rough estimate.

I know this is kind of stupid because you can just take the actual age gap and add/subtract a couple years to account for the gender difference. But it's fun to see the probability distributions.
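For the curious, the underlying calculation can be sketched roughly like this. To keep it self-contained I'm assuming a simple Gompertz mortality curve with made-up parameters instead of the real Social Security Administration tables, and I'm assuming the two deaths are independent, so treat the output as illustrative only.

```python
import math
from collections import defaultdict


def death_year_pmf(current_age, a, b, max_age=120):
    """P(dying k years from now), k = 0, 1, ..., under a Gompertz hazard
    a * exp(b * age), approximated as constant within each year of age."""
    pmf = []
    alive = 1.0
    for age in range(current_age, max_age):
        # Probability of dying during this year, given alive at its start.
        q = 1.0 - math.exp(-a * math.exp(b * age))
        pmf.append(alive * q)
        alive *= 1.0 - q
    pmf.append(alive)  # Lump any remaining survivors into the final year.
    return pmf


def death_gap_pmf(pmf1, pmf2):
    """Distribution of (years until partner 2 dies) - (years until partner 1 dies),
    assuming the two deaths are independent."""
    gap = defaultdict(float)
    for k1, p1 in enumerate(pmf1):
        for k2, p2 in enumerate(pmf2):
            gap[k2 - k1] += p1 * p2
    return dict(gap)


# Hypothetical Gompertz parameters, NOT the SSA mortality tables.
MALE = (5e-5, 0.095)
FEMALE = (3e-5, 0.095)

# Example couple: a 40 year old man and a 35 year old woman.
gap = death_gap_pmf(death_year_pmf(40, *MALE), death_year_pmf(35, *FEMALE))
average_gap = sum(years * p for years, p in gap.items())
print(f"Average gap: {average_gap:.1f} years")
```

A positive gap means partner 2 outlives partner 1. Swapping in a real life table just means replacing the per-year death probability `q` with the table's value for that age and gender.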

Partner 1

Partner 2

Average gap: ?? years

Data on mortality rates comes from the Social Security Administration. Charts are generated with Google Charts.

]]>
http://dumbmatter.com/posts/age-gaps-death-gaps.mdhttp://dumbmatter.com/posts/age-gaps-death-gaps.mdFri, 12 Oct 2018 00:00:00 GMT
<![CDATA[CSV SQL Live]]>I made another mini project for no apparent reason. You can load one or many CSV files into a SQL database within your web browser (client side) and run SQL queries on them. In most situations it's not that useful (most databases already support importing CSVs without too much trouble), but it might save somebody some time in a pinch. And it's cool that things like this are even possible, purely client side.
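The general idea is easy to sketch outside the browser too. Below is a rough Python analogue using the standard library's `sqlite3` and `csv` modules. It is not the code behind CSV SQL Live (which runs client side), and the sample data and the `load_csv` helper are made up for illustration.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_text):
    """Create `table` from CSV text (first row = column names) and insert all rows.
    Identifiers are quoted naively, so this is only suitable for trusted input."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)


# Made-up sample data standing in for uploaded files.
players_csv = "name,team\nAlice,BOS\nBob,LAL\nCarol,BOS\n"
teams_csv = "abbrev,city\nBOS,Boston\nLAL,Los Angeles\n"

conn = sqlite3.connect(":memory:")
load_csv(conn, "players", players_csv)
load_csv(conn, "teams", teams_csv)

query = """
    SELECT teams.city, COUNT(*) AS num_players
    FROM players JOIN teams ON players.team = teams.abbrev
    GROUP BY teams.city
    ORDER BY teams.city
"""
result = conn.execute(query).fetchall()
print(result)
```

Loading each file into its own table is what makes the "one or many CSV files" part interesting, since you can then join across files with ordinary SQL.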

Code is on GitHub.

]]>
http://dumbmatter.com/posts/csv-sql-live.mdhttp://dumbmatter.com/posts/csv-sql-live.mdFri, 10 Aug 2018 00:00:00 GMT
<![CDATA[My greatest technical accomplishment]]>Since this is my blog, it makes sense to talk about the greatest technical accomplishment of my life.

...except I already wrote a long blog post about it. Just not here, over on the Basketball GM blog.

TLDR: I got an order of magnitude performance improvement out of a mature piece of software, with no negative tradeoffs from an end user perspective.

]]>
http://dumbmatter.com/posts/my-greatest-technical-accomplishment.mdhttp://dumbmatter.com/posts/my-greatest-technical-accomplishment.mdSat, 09 Jun 2018 00:00:00 GMT
<![CDATA[Upgrading Prebid.js from 0.34 to 1.x when using the Index Exchange adapter]]>Over at Basketball GM, when I upgraded Prebid.js from 0.34 to 1.x, most of the bid adapters could just be easily copied over. Index Exchange was the one exception. For a while it wasn't available at all on 1.x, but that changed with version 1.10.0, released a few weeks ago. However, it requires some changes to your configuration to upgrade from 0.34 to 1.10+, and I found the documentation to be a bit lacking, so I wrote a blog post!

First, find your old bidder configuration:

{
  bidder: "indexExchange",
  params: {
    siteID: "12345",
    id: "1"
  }
}

Make three changes. Change indexExchange to ix because they renamed the adapter for some reason. Change siteID to siteId because fuck you, I guess - that one took me a while to debug! And instead of id, it needs a size field containing the size of the ad unit. I'm not really sure why that is necessary because there is size configuration elsewhere in the Prebid.js config, but oh well. The end result looks like this:

{
  bidder: "ix",
  params: {
    siteId: "12345",
    size: [728, 90]
  }
}

However, there is one more catch! If you don't explicitly specify that the ad unit is a banner, the ix adapter will ignore it. This is not documented anywhere and is probably a bug, since the Prebid.js documentation says it assumes the ad unit is a banner by default.

Bug or not, we can work around it. Your ad unit configuration needs to look something like this:

{
  code: "whatever",
  mediaTypes: {
    banner: { sizes: [[728, 90]] }
  },
  sizes: [[728, 90]],
  bids: [
    {
      bidder: "ix",
      params: {
        siteId: "12345",
        size: [728, 90]
      }
    }
  ]
}

So there you have it. The ix adapter working in Prebid.js 1.x. And it only requires you to specify the size three times:

  1. sizes is needed because it's a required parameter, as per the Prebid.js documentation
  2. mediaTypes is needed despite the documentation saying it is optional, otherwise ix won't work
  3. The ix configuration needs its own size parameter too, for some reason.
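If you have a lot of ad units to migrate, the changes above are mechanical enough to script. Here is a minimal sketch in Python that treats the config as plain dictionaries (the way it would look as JSON). The helper names `migrate_index_exchange_bid` and `build_ad_unit` are hypothetical and not part of Prebid.js, and the sketch only covers the fields shown in this post.

```python
import json


def migrate_index_exchange_bid(old_bid, size):
    """Convert a 0.34-style indexExchange bid into the 1.10+ ix form."""
    if old_bid["bidder"] != "indexExchange":
        return old_bid  # Other adapters are left untouched.
    return {
        "bidder": "ix",  # indexExchange -> ix
        "params": {
            "siteId": old_bid["params"]["siteID"],  # siteID -> siteId
            "size": list(size),  # replaces the old id field
        },
    }


def build_ad_unit(code, size, bids):
    """Build a banner ad unit that repeats the size in all three places."""
    return {
        "code": code,
        "mediaTypes": {"banner": {"sizes": [list(size)]}},
        "sizes": [list(size)],
        "bids": [migrate_index_exchange_bid(bid, size) for bid in bids],
    }


old_bids = [{"bidder": "indexExchange", "params": {"siteID": "12345", "id": "1"}}]
ad_unit = build_ad_unit("whatever", (728, 90), old_bids)
print(json.dumps(ad_unit, indent=2))
```

Passing the size in once and letting the helper copy it everywhere at least keeps the three copies from drifting apart.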

I hope this saves you from banging your head against your desk as much as I did while figuring out how to get it working!

]]>
http://dumbmatter.com/posts/upgrading-prebid-1.x-index-exchange.mdhttp://dumbmatter.com/posts/upgrading-prebid-1.x-index-exchange.mdTue, 29 May 2018 00:00:00 GMT
<![CDATA[How I almost got a job with the Sixers, and a Twitter analysis of the 2015 NBA draft class]]>As you may know, I wrote this cool basketball video game and work as a data scientist. A couple years ago, back in the Hinkie era when the Sixers were a forward-thinking organization, someone in the Sixers front office noticed those two things about me and asked me to apply for a job. It was a somewhat long and unclear process. They didn't seem to know what they were hiring for or what their criteria were (or maybe they did and I was just a bad fit). Regardless, eventually I didn't get the job, but as part of the application I did a mini analysis project, and 2 years out, I figure I might as well share that.

The project was: tell us some useful quantitative information about the 2015 draft class, but don't use any basketball stats or physical measurements.

That's really fucking hard! Not only are stats and physical measurements super important, they are also easy to access in structured format. Most other data requires a lot more effort before you're even at the point of beginning to analyze it. So immediately my mind went to Twitter, since at least there I have a semi-structured dataset: a list of Tweets for each player. Check out my silly analysis here. As best I could tell, the Sixers thought it was pretty cool!

Anyway, as I said, I didn't get the job. Then shortly after, Hinkie was fired, so maybe it was all for the best.

]]>
http://dumbmatter.com/posts/how-i-almost-got-a-job-with-the-sixers-and-a-twitter-analysis-of-the-2015-nba-draft-class.mdhttp://dumbmatter.com/posts/how-i-almost-got-a-job-with-the-sixers-and-a-twitter-analysis-of-the-2015-nba-draft-class.mdTue, 24 Jan 2017 00:00:00 GMT
<![CDATA[I switched to a static site generator, and you don't care]]>...except possibly to the extent of making fun of me for being a hypocrite. Beyond that, I will spare you the typical post about all the amazing reasons I switched to a static site generator. Check out the code if you're actually interested.

]]>
http://dumbmatter.com/posts/i-switched-to-a-static-site-generator-and-you-dont-care.mdhttp://dumbmatter.com/posts/i-switched-to-a-static-site-generator-and-you-dont-care.mdMon, 23 Jan 2017 00:00:00 GMT
<![CDATA[Side projects galore]]>God damn, I've been neglecting this blog. It's sad really, because I do have a lot to say. I guess I've just been talking to myself instead of blogging lately, which maybe says something about my mental stability, but whatever. I'm blogging now, and I'm going to blog the fuck out of this blog.

It's 2016. My job is pretty cool in some ways, but in some other ways it bothers me a great deal. It's probably not in my best interest to go into that in great detail here (as if anyone is reading this, right?) so I will leave the rest unsaid, and just get to the broader point. I am someone who cares a great deal about science, engineering, creating cool things, doing things the right way, etc. And when I say "cares a great deal" I mean probably more than I can adequately articulate with my pedestrian writing skills. It's almost like a spiritual thing. So for someone like me, what do you do when your day job is preventing you from attaining your desired level of spiritual satisfaction?

Side projects!

Do some shitty work 9-5? Come home and do some good work on your own projects! What could be more fun than that? Okay, okay, I can hear what you're saying. "Friends!" "Sports!" "Games!" Well guess what, I have time for all that stuff too, not an issue. "Family!" "Kids!" There's your big timesinks! Sadly/fortunately, my idea of a relationship right now comes from Tinder, and that doesn't really take much time or effort (praise the God of Genetics for making me a 6'3" white guy). In total, that leaves plenty of time for...

Side projects!

So I'm going to write about a few things I've been working on lately.

Project #1: Basketball GM

Ah yes, first and foremost, my most successful current project. If you're not aware, check it out. There's about a 99% chance you'll think it's really stupid. But if you're in that 1%, you'll thank me.

I still put a lot of time into Basketball GM. Much effort has gone into modernizing the codebase. When I started writing the JavaScript version, callbacks were king, RequireJS provided modularity, and the strict rules of JSLint kept me from shooting myself in my foot. Now I'm using all kinds of crazy shit like ES2015 (and newer!), Babel, and Browserify. It's a lot of churn, but it keeps me on the cutting edge and it staves off code rot (BTW see side project #2 for some more code rot discussion). And it's just plain satisfying when you go from the old way to the new way and the new way is just objectively better. My dick got hard when I got async/await compiled to Bluebird working. (And for those who are still reading through this paragraph of technical jargon, I am still using Bluebird because I can't rely on native promises using microtasks "correctly", and "correctly" is in quotes because as best I can tell the spec is ambiguous, yay!)

I also spend an inordinate amount of time geeking out on relatively minor details of Basketball GM. Like player names. Imagine you're writing a game, and you need to generate hundreds of fake names. "Well make a list of names and pick randomly from it, problem solved." Fuck you! That is not an acceptable solution. I am compelled to do better. And as I'm typing this I really want to go off on a long discussion on this topic... except I already did that in a blog post on the Basketball GM blog.

Another cool thing that happened lately: logos! Basketball GM uses fake teams because I tragically do not have a license from the NBA. The existence of these fake team names fits into the "geeking out" category as well, but that was years ago at this point. But the logos, that's new. I found a great artist to work for a reasonable price, but as is the case in many aspects of life, there are tradeoffs, and in this case the tradeoff was time. These logos were in the works for a long fucking time until they recently were completed, but that's okay because they are really good. Check it out. For many of them, you wouldn't think twice if you saw the logo on a real uniform. So fucking cool.

Another cool thing about the logos was the greatest comment in the history of my fledgling Basketball GM subreddit.

And one other cool thing... for a few months I had a Donald Trump easter egg. Very rarely, Donald Trump would appear, dominate your league for a year, and then retire. What was his best skill? Perimeter defense, of course! I crack myself up.

I hope to get back into more of the core AI/simulation aspects of the game soon, because there is a lot that could be improved there. But I don't know if I'll actually follow through or if I'll be perpetually distracted by shiny things on the periphery.

Project #2: Screw

I'm a big hip hop fan. I really like chopped and screwed songs. You know, the ones where the music is slowed down and the pitch drops and it's just awesome. Often I ask myself, "Why isn't more music chopped and screwed? Heck, why isn't all music chopped and screwed?"

Well the chopping part is hard, that requires a lot of skill and manual effort. But the screwing part is easy. Just slow the music down and/or decrease the pitch. Absolutely trivial. Yet does your music player have a "slow this song down by 20%" button? No? Why the fuck not! As I mentioned before, it's 2016. We have the technology.
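To put a number on it: in the simplest case, where you just play the samples back slower (like slowing down a record), tempo and pitch move together. Here's a quick sketch of that arithmetic. The assumption is plain resampling. A tool that changes tempo and pitch independently needs fancier processing than this.

```python
import math


def resample_effect(speed):
    """Effect of playing audio at `speed` times the original rate
    (0.8 = 20% slower), assuming plain resampling with no pitch correction."""
    semitones = 12 * math.log2(speed)  # 12 semitones per doubling of frequency
    duration_factor = 1 / speed        # how much longer the song gets
    return semitones, duration_factor


semitones, duration_factor = resample_effect(0.8)
print(f"Pitch shift: {semitones:.2f} semitones, duration x{duration_factor:.2f}")
```

So slowing a song by 20% this way drops the pitch by a bit less than four semitones and makes a four minute song run five minutes.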

But fuck 2016, we had the technology way before that. How do I know? Because I solved this problem in 2013. I created a plugin for the Banshee music player (my music player of choice) which gave you a button to press to alter the tempo and pitch of playback. So I could just put my whole hip hop library on shuffle and listen to screwed versions of every song in my collection. Paradise!

Sadly, paradise then burned to the ground when the next version of Ubuntu was released and bit rot set in. See, Banshee is written in C#. C# is a fairly nice programming language created by Microsoft with the goal of ruthlessly subjugating the world. That made it somewhat controversial in the Linux world - some people liked it because it was fairly nice, others were not on board with subjugating the world. In the end, the latter group of people won and C# on Linux has been dead in the water for years. I was not able to overcome the extent of the bit rot, so my Banshee plugin only runs on 2013-era Linux distros. No fun.

In 2016, we have a better [citation needed] platform than C#: the web! Write once, run anywhere, but for real this time (except for cross-browser compatibility)! Joking aside, I really do fucking love the web, its overarching goals, and the results you can achieve with it. So I decided to port my Banshee plugin to client-side JavaScript. That's a perfectly sensible and normal thing to do, right? Right??

I probably would have failed if I didn't eventually find a 4 year old undocumented library that helped me figure out the proper incantations to make the Web Audio API do what I want. I'm not sure if that is awesome or terrifying. But the end result is Screw. Change the tempo and pitch of an audio file, all in your web browser using client-side JS.

It's still not as good as my old Banshee plugin because it doesn't integrate with my music library and it only plays one song at a time. But it changes the fucking tempo and pitch of an audio file in client-fucking-side JS. That's good enough for me.

When I was building and testing it, I listened to Peace of Paper/Cup of Jayzus by Lupe Fiasco about 5000 times, and I'm still not tired of it. It sounds great slowed down and pitched down about 20%.

Project #3: SAS7BDAT Web Viewer

This one is even more esoteric.

SAS is some ridiculously expensive statistics software that I've never even used, because I prefer open source and I don't like ridiculously expensive things. But it's super popular in my industry (pharma) so people often send me SAS files and expect me to do things with them. But my company won't even give me a SAS license because it's so fucking expensive. They do give me a Stat/Transfer license which I can use to convert SAS files to CSV, but wouldn't it be nicer if there was some open source library we could simply integrate into our data flow pipeline?

Well, there are a few. But none in JavaScript, and actually it would be slightly more convenient for me if there was a native Node.js one. And obviously this is not a great use of my time, so I did it on my own time, not company time. I took this Python library and converted it to JavaScript, which is already pretty insane. But wait, we're not done.

In the process of porting that library, I wrote a ton of tests to compare my output to Stat/Transfer. I tested on all the files I could find on Google, plus every file I had on my workstation. Turned out that, even after squashing all the bugs introduced in porting, there were still like 2% of files that would get parsed wrong. SAS7BDAT is just a very flexible format, and people apparently get very creative with it. It'd probably take a ton of work to solve that 2%, which is probably why Stat/Transfer exists as a company. So sadly, after all this work, I was left with something that I can't actually use on real data because I can't just ignore that 2%.

So I decided to try something different, to salvage some meaningfulness out of this endeavor. I got my library to run in client-side JS and built a nifty little UI for it. So there you have it. SAS7BDAT to CSV conversion in client-side JavaScript. The future is now.

But is it actually useful? Turns out, yes! I was sitting in my cubicle at work a few days ago and I overheard a conversation in the cubicle next to mine: "hey, do you know how to open SAS files?" Naturally I had to butt in and force him to use my app, which actually worked! Score one for the SAS7BDAT Web Viewer side project.

]]>
http://dumbmatter.com/posts/side-projects-galore.mdhttp://dumbmatter.com/posts/side-projects-galore.mdSat, 18 Jun 2016 00:00:00 GMT
<![CDATA[The Bishop Sankey Diagram]]>Or, my best pun ever:

]]>
http://dumbmatter.com/posts/the-bishop-sankey-diagram.mdhttp://dumbmatter.com/posts/the-bishop-sankey-diagram.mdWed, 07 Jan 2015 00:00:00 GMT
<![CDATA[Basic income vs. basic job]]>Chris Stucchio wrote an article about the differences between basic income and basic job policies, based on relatively straightforward math. Briefly, basic income says give everyone money with no strings attached and get rid of other forms of welfare. Basic job is the same, except anyone who can work is mandated to work, either in a normal job like today or in a New Deal-style government works program.

Chris's main conclusion was that basic job came out looking way better than basic income. Additionally, a major purpose of his post was to encourage other people to play around with the math as well rather than just bloviating. Since I'm a big basic income proponent and have some quibbles with how he came to conclude that basic income doesn't look too good, I will follow his lead and play around with the math.

I don't know Chris Stucchio and I don't know if he was inherently biased for basic income or basic job, but I'm definitely inherently biased for basic income, so take this whole post with a grain of salt. However, to give it some semblance of fairness, I'm going to write this whole thing without doing any math. I'm going to make what I think are reasonable changes to Chris's assumptions and see what that tells me. Maybe it will say basic income sucks, and then I will be sad, but I will still publish those results. You'll just have to trust that I'm telling the truth, I suppose.

In Chris's model, basic income is paid to everyone. It is also possible to have a system like progressive income tax, where it gradually phases out; in fact, fellow Rutgers alumnus Milton Friedman proposed to implement basic income through a negative income tax. So let's imagine some system like that and reduce the costs by 50% right off the bat.

    direct_costs = num_adults * basic_income / 2
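To see how much that factor of 2 matters, here is a small sketch that varies it. The input values are my assumptions for illustration (roughly 227 million adults, and a basic income equal to a year of full-time minimum wage work), so check them against the original model before trusting the totals.

```python
# Assumed inputs for illustration; neither value is stated in this post.
NUM_ADULTS = 227e6             # rough number of US adults
BASIC_INCOME = 7.25 * 40 * 50  # dollars per year: minimum wage, 40 hours, 50 weeks


def direct_costs(clawback_factor):
    """Annual direct cost when only `clawback_factor` of the full payout
    ends up as net spending (1.0 means no phase-out at all)."""
    return NUM_ADULTS * BASIC_INCOME * clawback_factor


for factor in (0.5, 0.75, 1.0):
    print(f"factor {factor:.2f}: ${direct_costs(factor):.3e} per year")
```

Under these assumptions, moving the factor from 1/2 to 3/4 adds a bit under a trillion dollars per year, so the choice is not cosmetic, but it shifts the whole cost curve by a constant amount and leaves its shape alone.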

Chris correctly noted that there are incentives for more work and less work in basic income. He thinks it's more likely that the negative incentive will be more prominent. I think it's more fair to just call it a wash, since it's very unclear. So I deleted that part of his model. I doubt this has a big impact on anything anyway.

At this point, I want to add an effect that has been neglected. Chris treated the number of disabled adults as a constant, but that is likely not true. So let's conservatively say 2 million people currently on disability would start working if they got a basic income, likely at some not-so-great wage.

    undisabled = 2e6
    undisabled_hourly_wage = uniform(0, 10).rvs()
    undisabled_cost_benefit = -1 * undisabled * (40*52*undisabled_hourly_wage)

Chris included the "JK Rowling effect", the odds that someone not forced to work a shitty job could create a great achievement that would have a significant positive economic impact, like JK Rowling writing Harry Potter while on welfare. I think there should be an additional effect for less spectacular events. With a basic income, many people would be free to pursue new career paths and start small businesses (or even bring existing careers and businesses out from under the table, as people on welfare often cannot work without facing penalties). How big is this effect? Fuck if I know. But I want to include something. Fuck, let's just say that basic income improves average productivity by something between 0 and 20%. The average hourly wage in the US is about $25/hr and I don't know if the average wage for increased productivity should be higher or lower, so let's pick it from between $10 and $30.

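    # Assuming scipy.stats.uniform(loc, scale), which samples from [loc, loc + scale]
    avg_hourly_wage = uniform(10, 20).rvs()  # $10 to $30
    productivity_multiplier = uniform(0.0, 0.2).rvs()  # 0% to 20%
    productivity_cost_benefit = (-1 * labor_force * (40*52*avg_hourly_wage) *
                                 productivity_multiplier)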
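To get a sense of the scale of this term on its own, here is a standalone sketch that samples it many times using only the standard library. Note that `random.uniform(low, high)` takes the two endpoints directly. The labor force size is my assumption (about 154 million), since it is defined elsewhere in the model and not in this post.

```python
import random
import statistics

LABOR_FORCE = 154e6       # assumed size of the US labor force, not given in this post
HOURS_PER_YEAR = 40 * 52  # full-time hours, as in the model above


def sample_productivity_cost_benefit(rng):
    """One draw of the productivity term (negative = reduces net cost)."""
    avg_hourly_wage = rng.uniform(10, 30)            # dollars per hour
    productivity_multiplier = rng.uniform(0.0, 0.2)  # 0% to 20% improvement
    return -1 * LABOR_FORCE * HOURS_PER_YEAR * avg_hourly_wage * productivity_multiplier


rng = random.Random(42)
samples = [sample_productivity_cost_benefit(rng) for _ in range(100_000)]
mean_benefit = statistics.mean(samples)
print(f"Mean productivity term: {mean_benefit:.3e} dollars per year")
```

With those assumptions the term averages out to roughly $640 billion per year in cost reduction, with a wide spread, which makes it one of the larger knobs in the whole comparison.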

Now let's move to basic job. Most of Chris's assumptions seem good enough. I'll make one change - the value of work from people who currently aren't working. Chris says it's worth somewhere between $0/hr and $7.25/hr, as otherwise they'd probably be working a minimum wage or higher job. Sounds reasonable enough, but there are also people who bring negative value to the table. These people would be forced to work, likely in some boring job they hate. So I'm doing this:

    # Assuming scipy.stats.uniform(loc, scale): loc=-7.25, scale=14.5 gives [-7.25, 7.25]
    basic_job_hourly_productivity = uniform(-7.25, 14.5).rvs()

I could definitely quibble more, but somebody could quibble with my changes too, so I don't want to go too crazy. The above changes seem reasonable enough to me. So here's my modified code. Now I'm going to try to run it. This will be interesting not only to see the results, but to see if I could make these changes without introducing a syntax error!

Lower is better on these plots, so it looks like basic income wins! At least, if you agree with my completely unbiased assessment...

Update: Chris posted a follow-up article that I basically entirely agree with.

4 archived comments

  1. direct_costs = num_adults * basic_income / 2

    Milton Friedman's proposal looks similar to this, but he proposed the cutoff would be at twice subsistence level, assuming a 50% claw-back. So "basic income" in that formula has to be twice as high as "existential/socio-cultural minimum".

    Comment by Raoul — July 29, 2014 @ 12:58 pm

  2. Where's the edit button? I phrased that quite poorly.

    Oh well. All I mean is, a NIT model proposes a cutoff point above the subsistence level, in correlation to the clawback rate. So the assumption remains, that anyone at any point in time can claim a check of at least subsistence level, and if they earn money on the market, the check diminishes at clawback rate.

    So I don't quite understand how to interpret

    direct_costs = num_adults * basic_income / 2

    Comment by Raoul — July 29, 2014 @ 1:03 pm

  3. An alternative thing to keep in mind for keeping cost in check though, is to stop excluding existential minimum from taxation.

    I only know the German numbers but there I know for a fact, that we're already giving the poor and middle class here tax exemptions that alone could nearly cover the supposed financing gap of basic income models.

    I mean over 8000 euro per year per adult and nearly 8000 euro per year per child, tax free, plus income taxation starting in the low 20%s past that. plus lower rates if you have a partner for life, for one of the partners. It's a triple digit billion deal in Germany.

    Comment by Raoul — July 29, 2014 @ 1:10 pm

  4. Raoul, it's just a rough estimate. The upper bound for no clawback would be num_adults*basic_income. Add clawback and you get something less than that. I just arbitrarily picked 1/2 as a factor to represent "there is some clawback going on". Change the 1/2 to 3/4 and it doesn't substantially change the conclusions, it just shifts the Basic Income cost curve to the right by about 1e12.

    Comment by Jeremy Scheff — July 29, 2014 @ 6:31 pm

]]>
http://dumbmatter.com/posts/basic-income-vs-basic-job.mdhttp://dumbmatter.com/posts/basic-income-vs-basic-job.mdWed, 13 Nov 2013 00:00:00 GMT