Equality of Utility II

Some time ago we investigated the equality of benefits. Roughly speaking, let us reduce real-world actions to a discrete set of selectable actions a \in A. Consider an individual x who has observable features f(x) and a protected feature p(x), and suppose a company has to choose one action a \in A to take toward x. What is a workable definition of fairness or equality in such a decision-making effort with respect to the protected property p?

Let god bestow upon us, a neutral third party, a utility functor u whose evaluation on an individual, u(x), is itself a function: u(x)(a) is the utility to individual x of the company taking action a, and u(x)(b) is the utility to individual x of the company taking action b.
Let g be the company’s decision process, so that g(f(x)) is the action the company chooses for individual x. Then the right thing to do is
g(f(x)) = argmax_{a\in A}(u(x)(a)) = g(f(x), p(x))
Simple: we do as god says and act as if we had the knowledge of an oracle, even when we know some discriminating information that we then choose to ignore.
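The decision rule above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the utility function, the two actions and the two individuals are all made up, and in reality nobody hands us the oracle u.

```python
from typing import Callable, Dict, Sequence

Individual = Dict[str, object]
Action = str


def oracle_utility(x: Individual) -> Callable[[Action], float]:
    """Hypothetical u(x): returns the function a -> u(x)(a).

    The numbers are invented for illustration. Note that this toy
    utility never reads x["protected"].
    """
    def utility_of(action: Action) -> float:
        if action == "show_ad":
            return 1.0 if x["interested"] else -0.5
        return 0.0  # "no_ad" is neutral in this toy model
    return utility_of


def decide(x: Individual, actions: Sequence[Action]) -> Action:
    """argmax over a in A of u(x)(a)."""
    return max(actions, key=oracle_utility(x))


ACTIONS = ["show_ad", "no_ad"]
# Two made-up individuals who differ only in the protected feature p(x).
alice = {"interested": True, "protected": "class_1"}
bob = {"interested": True, "protected": "class_2"}

print(decide(alice, ACTIONS), decide(bob, ACTIONS))
```

Because the toy utility does not depend on the protected feature, the two individuals get the same action, which is the g(f(x)) = g(f(x), p(x)) part of the formula.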

This is not as easy as it looks in a formula. Think of one person wearing a clown nose and another without one: your behavior toward the two will likely be very different, even if you have decided that a clown nose has absolutely nothing to do with the task at hand.

Additionally, our own imperfection dictates that the systems we build are imperfect. What if we cannot achieve God’s will? What if we fail to do the virtuous thing even when we know what the right thing to do is?

What could a neutral third party reasonably demand of a faulty company? One suggested approach is to establish probabilistic equality among protected classes. Suppose there is some number of classes m \in M, corresponding to the values of p(x), whose utility we must protect equally. (For example, M could be the Cartesian product of age, sex, race, birthplace, religion and political party.)

E(u(x)(g(f(x))) | p(x) = m) = c\ \forall m\in M

That is, the expected customer utility for each class is the same value c. This is a simplification, as there are other notions of equivalence between random variables.
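As a rough sketch of how one might check this condition on data, the snippet below averages the realized utility u(x)(g(f(x))) within each protected class and compares the averages. Everything here is an assumption made for illustration: the four-person population, the feature "clicked_before", the decision rule g and the utility numbers are invented, and a sample mean only estimates the expectation.

```python
from collections import defaultdict
from statistics import mean


def p(x):
    """Protected class of x (hypothetical field name)."""
    return x["protected"]


def f(x):
    """Observable features of x; here a single made-up feature."""
    return {"clicked_before": x["clicked_before"]}


def g(features):
    """A faulty company's decision rule: it only sees f(x)."""
    return "show_ad" if features["clicked_before"] else "no_ad"


def u(x):
    """Hypothetical oracle utility u(x), returned as a function of the action."""
    def utility_of(action):
        if action == "show_ad":
            return 1.0 if x["interested"] else -0.5
        return 0.0
    return utility_of


def expected_utility_by_class(population):
    """Sample estimate of E(u(x)(g(f(x))) | p(x) = m) for every class m."""
    realized = defaultdict(list)
    for x in population:
        realized[p(x)].append(u(x)(g(f(x))))
    return {m: mean(values) for m, values in realized.items()}


def is_probabilistically_equal(per_class, tol=1e-9):
    """True when every class has the same expected utility c, up to tol."""
    values = list(per_class.values())
    return max(values) - min(values) <= tol


population = [
    {"protected": "class_1", "clicked_before": True, "interested": True},
    {"protected": "class_1", "clicked_before": False, "interested": False},
    {"protected": "class_2", "clicked_before": True, "interested": False},
    {"protected": "class_2", "clicked_before": False, "interested": True},
]

per_class = expected_utility_by_class(population)
print(per_class, is_probabilistically_equal(per_class))
```

In this toy population the rule g serves class_1 well and class_2 badly, so the check fails even though g never looks at p(x).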

Note that this framework has a slight benefit over the traditional machine learning approach of evaluating equality on the confusion matrix of the classifier g, because it looks at utility rather than at whether a prediction was correct. Here are the most instructive examples that I have suffered from:

Situation 1: I noticed that my coworker was getting Tesla car advertisements while I did not receive any. Even though my loss of utility from not receiving the advertisement was negligible, because I cannot afford a Tesla, I still felt angry. I might even be tempted to find a protected attribute of mine and claim that Tesla discriminated against me in its advertising campaign: What! They think a middle-aged Asian man can’t have a midlife crisis or can’t afford to splurge on a Tesla? In this case the decision was a true negative for predicting response/conversion to a Tesla ad, yet it was offensive enough to cause problems. In retrospect the ad would have had positive utility for me: when I reached out to Tesla I learned more about how the car would work for me. But the decision seems to have produced a negative sentiment in its subject. (Since I drafted this blog entry, the company has sent me repeated invitations to test drive the S, perhaps due to a recent but small increase in my disposable cash. I may take up the offer to test drive at a suitable time. This is just an example.)

Situation 2: I am offended when I do receive an advertisement for STD testing, and in particular for the hepatitis family of diseases. For god’s sake, there’s an Asian Liver Center at Stanford whose purpose is to check me for hepatitis and other liver problems present in Asian livers. In this case, god bless me, I am free of hepatitis and of liver problems of any kind, so this is a false positive in advertising. One may argue that in reality the benefit of this advertisement to me, an increased chance of early detection, is positive: E(u(huan)(g(f(huan)))) > 0. I still feel offended. This case is a false positive for advertisement conversion. It has positive utility to have shown it to me. And yet it produced negative sentiment.

Situation 3: I just received a piece of snail mail from a Redwood City mortuary advertising its services to Mr. and Mrs. Chang. I am terrified. I feel this is a death threat of some form, putting the idea of me dying in Redwood City in my head. The letter came in a hand-addressed envelope. This is a false positive for advertising relevance (I did not die, not yet anyway, and I am not planning on dying). It has zero utility for me, and I am definitely feeling very negative sentiment.

These are but a few of the many possible situations where the company could do the right thing in front of God and in front of the board, yet still err and thereby produce very negative sentiment. At the risk of running out of numbers to enumerate them all, I have not numbered all of the types, starting at 1.

To summarize, several factors ultimately enter into a company’s decision-making process. Non-exclusively, they are:

  • the expected utility E(u(x)(g(f(x)))) for x, and whether it is defensible in front of an oracle, God, or a court of law;
  • how any action will make the subject individual feel, that is, the sentiment it produces irrespective of objective utility;
  • whether the utility function is universally accepted;
  • and finally the company’s bottom line.

With these considerations in mind, we can now continue with our exploration of fairness.

In a world… where there are nebulous words!

It occurs to me to write these down on a special occasion… Initially I was considering an optimization problem in a situation where localized optimization is essentially playing a zero sum game with an opponent who is much more powerful.

In this situation, even though we have big data, and even though we have deep learning, there is still bigger data and yet more sophistication elsewhere. One of the challenges of the nascent AI industry enabled by big data and deep learning is problem selection.

There are people who are trying to cure cancer and save lives. And there are people trying to trade stocks, win political campaigns, or engage in armed conflict (not that these are the same things). The people who continue to admonish against AI are those who fear the latter. I would imagine there are very few who would oppose the former.

That! That is the underlying restriction on the technology: what it can do for the former cause is practically restricted by what it does for the latter. The same applies to all technologies, of course. We’ve had the internet and social media; a typical Californian would probably take a few minutes to recognize there being anything exceedingly unusual about the potential downside of yet another meme…

Also, consider aliens of the interstellar variety: one should always be mindful of our real competition. There is likely a far greater intelligence out there. Let us not doubt, and let us certainly not delay, the development of our own Big Intelligence as a matter of due course in our kind’s progress.

Speed bumps and Scheduling

Hmm, so I’m reading this blog of my own and wondering if I made a big mistake mocking the established schedule. Train 207 takes any passenger from the first half of the track to a middle point, where a timed transfer lets the slow rider move to a second slow train that reaches every station on the second half of the track. This scheme enables travelers from any minor station on the southern half of the track to reach major stops on the northern half, and allows them to reach minor destinations via a timed transfer.


2013-05-11 pic 1                                    2013-05-11 pic 2

My proposal was to have 207 be fast on its first leg and 211 fast on the second half so that people needing to reach minor destinations arrive faster. The main issue is that we need to look one layer deeper at the underlying data: the commuters. The way we work is that we congregate during the day where people are densely packed in small cubes so we can work and communicate more effectively. This means there are a few hotspots of arrival in the morning and the same few hotspots of departure in the afternoon. People live in suburbs so that they can have larger living quarters that provide distance and privacy between people. Therefore the living quarters are spread over large areas and there are few hotspots. Admittedly the Bay Area is different from other metropolitan areas in that there are several major hotspots along the entire length of the track. The southernmost and northernmost end points are themselves very large commuter hotspots: San Jose and San Francisco. In any other metro, hub and spoke is probably more appropriate, with more stops near the center than on the edge.

Let me disclaim briefly that I am not a zealous fan of an overly quantified and optimized life, at least not the kind where it is forced upon us. One case in point is the San Francisco time-of-use parking meter rates. There is a spreadsheet with the rate schedule of every meter, and it changes regularly based on parking patterns. AND you have to turn your wheels towards or away from the curb or else there is a huge penalty. Futuristic worlds come with so many more dynamic aspects for our brains to think about. Pretty soon we will need a smart car to figure out if we can afford to park somewhere. Pretty soon, we’ll have a time-of-use tax system that changes the tax you pay depending on when you work, down to the minute, and where you work, down to 4 decimal digits of degrees of latitude and longitude. Pretty soon, we’ll have carbon emission credits for human beings, and that includes what you exhale; we’ll pay for sewage by the gram, with overage on volume as well. None of this is imminent, but it’s a little bit scary.

Meanwhile, back at the lab, the fact that there are major hotspots along the northern half of the tracks means train 207 provides important service from all of the southern suburbs to all of the northern hotspots. The fact that 211 exists seems to mean that there are some major residential hotspots where people get on the train as well, but they need only reach as far as Menlo Park and Redwood City.

Under my proposal, the slow train would depart early and make every single stop. A faster bullet train would depart later and pick up the slow train’s passengers halfway. (The trains here can pass each other as well, so the slow train will drop off passengers and pause at a station as the fast train passes.) This way, all passengers on the southern half of the track get a way to arrive at hotspots on the northern half faster than under the current scheme, and every passenger from the southern tracks can reach the northern tracks faster than under the current schedule.

To see this, take a look at the 6-minute layover highlighted in green. The passengers on my slow train will not wait 6 minutes to get on another slow train; they can just ride normally and arrive at their northern destinations 6 minutes sooner, and never leave their seats.

What about the people at the residential hotspots on the southern tracks? The practical question is whether a lot of people travel from major residential hotspots to every station on the southern track. My feeling is that there are not. If there are few, they suffer under my scheme, whereby they must take the slow train all the way to their slow destinations. And if they account for a large proportion of the population, we should run two fast trains, one arriving at the midpoint before the slow train and one arriving at the midpoint after the slow train: the complete opposite of the current schedule. Here is an artist’s rendition of the proposed schedule; orange is fast, brown is slow, green are timed transfers and black are stops:

2013-05-11 pic 3

We should also talk about the speed of the train. It would appear at first glance that there is a lot more fast running than before. If the train uses more fuel at high speeds then this is a problem. But most engines (and the entire train system) operate better when driven at long, steady speeds. Running at high speed means the brakes are used less frequently, which reduces wear and tear on the joints of the trains. The average speed on any segment of the track traveled by this triplet is the same as under the existing schedule: two fasts and one slow. Similar analysis shows that the number of stops made by the trains and the average deceleration and acceleration they experience are the same as under the existing scheme. The count of stops and starts to and from fast segments is also the same. In all, the effort made by the trains to carry the passengers is the same as before in the three-train scheme.
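The "two fasts and one slow" bookkeeping can be sketched as a small tally. This is a toy model under assumptions: each train is reduced to its pace on the southern and northern halves of the track, the train labels are made up, and I assume that in the existing schedule one mixed train is slow on the southern half and the other is slow on the northern half.

```python
from collections import Counter

# Each train is described by its pace on the (southern, northern) half.
# Existing triplet: two mixed fast-slow trains before and after a fast train.
existing = {
    "mixed train A": ("slow", "fast"),
    "fast train": ("fast", "fast"),
    "mixed train B": ("fast", "slow"),
}

# Proposed triplet: two fast trains before and after a slow train.
proposed = {
    "fast train 1": ("fast", "fast"),
    "slow train": ("slow", "slow"),
    "fast train 2": ("fast", "fast"),
}


def pace_mix_per_half(schedule):
    """Count how many fast and slow runs cover each half of the track."""
    south = Counter(south_pace for south_pace, _ in schedule.values())
    north = Counter(north_pace for _, north_pace in schedule.values())
    return south, north


same_mix = pace_mix_per_half(existing) == pace_mix_per_half(proposed)
print(pace_mix_per_half(proposed), same_mix)
```

Under these assumptions both triplets put two fast runs and one slow run on each half, which is the sense in which the effort is unchanged. The tally says nothing about actual fuel use or timings.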

To summarize, we have proposed two separate schedules: a couplet schedule and a triplet schedule. The couplet schedule requires two trains and serves the most common office hotspots by providing a timed transfer from residential non-hotspots to a fast train arriving at office hotspots. The couplet schedule slows down the rides of commuters from residential non-hotspots to office non-hotspots while providing the same level of service to all other passengers and using one less train. The triplet schedule runs two fast trains before and after a slow train, instead of two mixed fast-slow trains before and after a fast train. Timed transfers enable commuters from residential hotspots to travel directly to office hotspots without getting off the train, using one of two possible trains. Passengers commuting from residential hotspots to office non-hotspots, or from residential non-hotspots to office hotspots, now have the choice of making a timed transfer for a faster trip. This schedule is better than the existing schedule because every possible trip is faster and it requires fewer timed transfers than before, using the same number of trains and the same fuel and wear and tear.