Bias Hunter

Ethical Algorithms

27/12/2016

In a wonderful and very interesting turn of events, ethical algorithms are suddenly all the rage. Cathy O’Neil wrote a book called Weapons of Math Destruction, in which she goes through a couple of interesting case examples of how algorithms can work in an unethical and destructive fashion. Her examples come from the US, but the phenomenon isn’t limited to the other side of the pond.

In fact, just a month ago, the Economist reported on the rise of credit cards in China. Consumption habits in China are coming to resemble Western ones, including the use of credit cards. And where you have credit cards, you also have credit checks. But how do you show your creditworthiness if you haven’t had credit?

Enter Sesame Credit, a rating firm. According to the Economist, they rely on “users’ online-shopping habits to calculate their credit scores. Li Yingyun, a director, told Caixin, a magazine, that someone playing video games for ten hours a day might be rated a bad risk; a frequent buyer of nappies would be thought more responsible.” Another firm called China Rapid Finance relies on looking at users’ social connections and payments. My guess would be that their model predicts your behavior based on the behavior of your contacts. So if you happen to be connected to a lot of careless spend-a-holics, too bad for you.
​
Without even getting to the privacy aspects of such models, one concerning aspect – and this is the main thrust of O’Neil’s book – is that these kinds of models can discriminate heavily based purely on aggregate behavior. For example, if CRF’s model sees your friends spending and not paying their bills, it might classify you as a credit risk and not give you a credit card. And if there is little individual data about you, this kind of aggregate data can form the justification for the whole decision. Needless to say, it’s quite unfair that you can be denied credit – even when you’re doing everything right – just because of your friends’ behavior.
[Picture: Four credit ratings, coming down hard.]
Now, O’Neil’s book is full of similar cases. To be honest, the idea is quite straightforward. An unethical model (in O’Neil’s terms, a Weapon of Math Destruction) typically shows two signs: 1) it has little to no feedback to learn from, and 2) it makes decisions based on aggregate data. The second one was already mentioned, but the first one seems even more damning.

A good example of the first kind is generously provided by the US education system. Now, in the US, rankings of schools are all the rage. Such rankings are defined with a complicated equation that takes into account how well outgoing students do. And of course, the rankings drive the better students to the better schools. However, the model never actually learns any of the variables and their importance from data – these are all pulled from the administrators’, programmers’, and politicians’ collective hats. What could go wrong? What happens with systems like these is that the ranking becomes a self-fulfilling prophecy, and changing how the ranking is calculated becomes impossible, because the schools that do well are obviously up in arms about any changes.

This whole topic of discrimination in algorithms is actually gaining some good traction. In fact, people at Google are taking notice. In a paper that was recently presented at NIPS, the authors argue that what is needed is a concept of equality of opportunity in supervised learning. The idea is simple: if you have two groups (like two races, or rich and poor, etc.), the true positive rate should be the same in both groups. In the context of loans, for example, this means that of all those who could pay back a loan, the same percentage in each group is given one. So if groups A and B have 800 and 100 people who could pay the loan back, and your budget allows loans to 100 people, then 88 people in group A and 11 in group B would get the loan offer (both groups having an 11% loan offer rate, rounded down from 100/900).
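The arithmetic in the loan example can be checked with a few lines of Python. This is only a sketch of the split described above, not the method from the NIPS paper: the function name is made up for illustration, and rounding the rate down to a whole percent is an assumption chosen to match the numbers in the text.

```python
def equal_opportunity_offers(qualified_by_group, budget):
    """Split a loan budget so every group gets the same true positive rate.

    qualified_by_group: group name -> number of people who could repay a loan.
    budget: total number of loans available.
    The rate is rounded down to a whole percent (an assumption, to match the text).
    """
    total_qualified = sum(qualified_by_group.values())
    rate_percent = min(100, budget * 100 // total_qualified)
    offers = {
        group: count * rate_percent // 100
        for group, count in qualified_by_group.items()
    }
    return rate_percent, offers


rate, offers = equal_opportunity_offers({"A": 800, "B": 100}, budget=100)
print(rate, offers)  # 11 {'A': 88, 'B': 11}
```

Because of the rounding, one of the 100 loans in the budget stays unused.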
​
Mind you, this isn’t the only possible or useful concept for reducing discrimination. Other useful ones are group-unaware and demographic parity. A group-unaware algorithm discards the group variable, and uses the same threshold for both groups. But for loans, depending on the group distributions, this might lead to one group getting fewer loan offers. A demographic parity algorithm, on the other hand, looks at how many loans each group gets. In the case of loans, this would be quite silly, but the concept might be more useful when allocating representatives for groups, because you might want each group to have the same number of representatives, for example.
Anyway, there’s a really neat interactive graphic about these, and I recommend checking it out. You can find it here.
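To make the difference between the group-unaware and demographic parity ideas concrete, here is a small Python sketch. The scores, the threshold of 60 and the 50% target rate are all invented for illustration, and demographic parity is read here in its usual sense of giving each group the same share of offers.

```python
# Hypothetical credit scores (0-100) for two made-up groups; not real data.
scores = {
    "A": [35, 50, 62, 70, 81, 90],
    "B": [30, 42, 48, 55, 66, 73],
}


def offer_rate(group_scores, threshold):
    """Share of a group whose score is at or above the threshold."""
    return sum(s >= threshold for s in group_scores) / len(group_scores)


def threshold_for_rate(group_scores, target_rate):
    """Score cut-off that gives offers to the top target_rate share of a group.

    Assumes all scores within the group are distinct.
    """
    k = int(target_rate * len(group_scores))  # number of offers to make
    ranked = sorted(group_scores, reverse=True)
    return ranked[k - 1] if k > 0 else max(group_scores) + 1


# Group-unaware: one threshold for everyone, the group label is ignored.
unaware = {g: offer_rate(s, 60) for g, s in scores.items()}

# Demographic parity: per-group thresholds, so both groups get the same offer rate.
parity_thresholds = {g: threshold_for_rate(s, 0.5) for g, s in scores.items()}
parity = {g: offer_rate(s, parity_thresholds[g]) for g, s in scores.items()}

print(unaware)            # group A gets offers at twice the rate of group B
print(parity_thresholds)  # {'A': 70, 'B': 55}
print(parity)             # {'A': 0.5, 'B': 0.5}
```

With these made-up numbers, the single threshold gives four offers to group A and two to group B, while parity needs a lower cut-off for group B.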

The Nonlinear Life as a Random Walk

27/11/2015

The past two months, I’ve been completing University of Michigan’s fantastic Model Thinking course, available for free on Coursera. There’s so much to love about the modern world: you can learn interesting things through quality teaching, no matter where you are (well, you need wifi), no matter when. And it doesn’t cost a cent!

Anyway, the course had a section about Random Walks, and it got me thinking. A while back I wrote about how the nonlinear life and our linear emotions aren’t exactly optimally suited to each other. Your brain craves signs of progress, so that it can reward you with a burst of feel-good chemicals. Unfortunately, the nonlinear life doesn’t work like that. Often, you can spend days or weeks slaving away at the office/studio/whatever, not really moving forward – or even taking two steps back for each step forward. Despite the hours that you put in, the article/thesis/design never seems to be finished, making you question whether you’re really cut out for this kind of job. Perhaps you’d do the world a favor by setting your sights lower and working as a sales clerk instead.

Now, while watching one of the course lectures, I suddenly realized that the creative nonlinear work is exactly a random walk! I don’t claim this to be a unique insight or anything – I’m sure many of you have realized this before. But for the fun of it, it might be a nice exercise to show with a random walk model how the nonlinear life functions. At least in my own case, models often help to see the bigger picture, and forget about the noise in the short term. And who knows, maybe this will help to quell those linear emotions, too.

So, a random walk is very simple. In this case, let’s assume that we have a project that has a goal we’re trying to reach. Arbitrarily, let’s say that the completion means we reach a threshold of 100 points. Of course, these numbers are completely make-believe and I pulled them from my magical hat. Now, further, let’s assume that each unit of time – say 1 unit equals 1 day – means we have three possibilities: make progress, stay where we are, or take steps backward. In my personal experience, this is an ok model for work: sometimes you’re actually making progress, and things move smoothly. Sometimes, though, you’re actually hurting your project, for example by programming bugs into the software, which need to be fixed later on (just happened to me two weeks ago). Most often, though, you’re trying your best, but nothing seems to work. Maybe you’re stuck in a dead end with your idea, and need to change tack. Maybe you’re burdened with silly tasks that have nothing to do with the project. Well, I’m sure we all have these kinds of days.
So let’s again use my magical hat and pull out some probabilities for these options. Let’s say you have a 5% chance of making a great jump forwards (10 points), a 25% chance of making 3 points of progress, a 55% chance of getting stuck (0 points), a 10% chance of making a mistake (-2 points), and a 5% chance of doing serious damage (-6 points). On average, that works out to 0.75 points of progress per day. Now we just simulate these over a number of periods and get a graph that shows your cumulative progress towards the goal (yes I'm doing this in Excel):
[Picture: graph of simulated cumulative progress towards the goal]
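For those who prefer code to spreadsheets, the same kind of simulation can be sketched in Python. The step sizes and probabilities are the ones from the text; the function name, the seed and the choice of 150 days are assumptions for illustration, and a different seed gives a different path, so this will not reproduce the exact graph above.

```python
import random

# Daily outcomes (points) and their probabilities, as given in the text.
STEPS = [10, 3, 0, -2, -6]
PROBS = [0.05, 0.25, 0.55, 0.10, 0.05]


def simulate_project(days=150, seed=None):
    """Return (daily_steps, cumulative_progress) for one simulated project."""
    rng = random.Random(seed)
    steps = rng.choices(STEPS, weights=PROBS, k=days)
    progress, total = [], 0
    for step in steps:
        total += step
        progress.append(total)
    return steps, progress


# Average progress per day: about 0.75 points.
expected_step = sum(s * p for s, p in zip(STEPS, PROBS))

steps, progress = simulate_project(days=150, seed=1)
print("expected points per day:", round(expected_step, 2))
print("progress after 150 days:", progress[-1])
```

At 0.75 points per day on average, reaching 100 points takes about 133 days in expectation, but single runs vary a lot around that.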
​So, in the graph there are several periods when it’s just going downhill, or plateauing for several time periods. Even though the numbers are really made up, I feel the above graph is actually a pretty decent example of how the nonlinear work often feels. However, there’s still the additional complication: the emotions.

Suppose that our emotions work as follows. If you’re making progress, you feel good, and this is mostly irrespective of how much progress you’re making. Suppose the same holds for setbacks: it hurts, but it hurts almost as much whether you look for a bug for two hours or for the full day. Finally, I’ll assume that if you’re not moving anywhere, you inherit the feeling from the day before. Now, I realize this is probably not how emotions really work (we’re often annoyed by our administrative duties, for example). But on the other hand, when I have spent a day at a dull seminar, I seem to find myself looking back a bit to evaluate the progress. The “inherit from t-1” rule tries to describe this: I feel good if the recent past has been good, and I feel annoyed if it wasn’t successful. Why just t-1 and not the actual level? Well, I’ve also found that it’s really hard to evaluate how far along the project actually is, which makes that option unrealistic. And when looking back, our memories of the immediate past are much stronger than those of the long-gone part. In short, I’m modeling short-sightedness here. The actual progress-emotions payoff table looks like this:
[Picture: progress-emotions payoff table]
So with these assumptions, we get the following graph portraying emotions:
[Picture: graph of the emotional state over time]
Now this is pretty interesting! You can see how 1) there are a lot of fluctuations back and forth, and 2) there are still “runs”, i.e. the same emotional state tends to linger for a while. If you run the numbers, with this particular string of successes and failures you get 99 positive time periods and 51 negative ones, out of the total 150 periods I ran the simulation for. I think the above graph is quite a good summary of how the nonlinear life often feels: you love your job, but you’re not above hating it when things are not going well.
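The emotion rule itself is simple enough to write down as code. This is a sketch under the assumptions stated above (good on any progress, bad on any setback, inherit yesterday’s feeling on a stuck day); the mood before the first day is not specified anywhere, so the `initial_mood` parameter is a made-up default, and the example steps are invented rather than taken from the simulation in the graph.

```python
def emotions(steps, initial_mood=1):
    """Map daily progress steps to moods: +1 (feeling good) or -1 (feeling bad).

    Any progress feels good and any setback feels bad, regardless of size.
    On a day with no movement, the mood is inherited from the day before.
    initial_mood is an assumption used only if the first day is a stuck day.
    """
    moods, mood = [], initial_mood
    for step in steps:
        if step > 0:
            mood = 1
        elif step < 0:
            mood = -1
        moods.append(mood)
    return moods


# A short, made-up string of days using the step sizes from the model.
example_steps = [3, 0, 0, -2, 0, 10, 0, -6]
moods = emotions(example_steps)
print(moods)  # [1, 1, 1, -1, -1, 1, 1, -1]
print("positive days:", moods.count(1), "negative days:", moods.count(-1))
```

Since over half of the days are stuck days, the inherit rule is what produces the runs: one good or bad day colors all the stuck days that follow it.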

A final word of warning: this was of course just one simulated outcome. With the exact same parameters, you can get project outcomes that never finish, that run into negative progress, that finish in less than 30 periods, etc. They are not very nice for presentation purposes, but they also capture the great amount of uncertainty in a nonlinear project. Sometimes it just falls apart, and after 50 periods you’re back to exactly where you started. Or a project you thought would take 6 weeks takes 16 weeks instead. Well, I’m sure everyone has had these experiences.
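One way to get a feel for this spread is to run the same model many times and look at how long each project takes. The sketch below does that in Python with the probabilities from the text; the number of runs, the 1000-day cap and the seed are arbitrary assumptions, and the exact figures printed depend on them.

```python
import random

# Daily outcomes (points) and their probabilities, as given in the text.
STEPS = [10, 3, 0, -2, -6]
PROBS = [0.05, 0.25, 0.55, 0.10, 0.05]


def days_to_finish(goal=100, max_days=1000, rng=random):
    """Return the day the goal is first reached, or None if it never is."""
    total = 0
    for day in range(1, max_days + 1):
        total += rng.choices(STEPS, weights=PROBS)[0]
        if total >= goal:
            return day
    return None


rng = random.Random(42)  # arbitrary seed, only for repeatability
results = [days_to_finish(rng=rng) for _ in range(1000)]
finished = sorted(d for d in results if d is not None)

print("fastest project:", finished[0], "days")
print("median project:", finished[len(finished) // 2], "days")
print("slowest project:", finished[-1], "days")
print("unfinished within the cap:", results.count(None))
```

The gap between the fastest and slowest runs is the point here: the parameters are identical, and only luck differs.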