In a world… where there are nebulous words!

It occurs to me to write these down on a special occasion… Initially I was considering an optimization problem in a situation where localized optimization is essentially playing a zero-sum game against an opponent who is much more powerful.

In this situation, even though we have big data, and even though we have deep learning, it still remains that there is bigger data and yet more sophistication elsewhere. One of the challenges of the nascent AI industry, enabled by big data and deep learning, is problem selection.

There are people who are trying to cure cancer and save lives. And there are people trying to trade stocks, win political campaigns, or engage in armed conflict (not that these are the same things). The people who keep admonishing against AI are the ones who fear the latter. I would imagine there may be very few who would oppose the former.

That! That is the underlying restriction to the technology: what it can do for the former cause is practically restricted by what it does for the latter. The same applies to all technologies of course. We’ve had the internet and social media; a typical Californian would probably take a few minutes to recognize there being anything exceedingly unusual about the potential downside of yet another meme…

Also, consider aliens, of the interstellar variety. One should always be mindful of our real competition. There is likely a far greater intelligence out there. Let us not doubt, and let us certainly not delay the development of our own Big Intelligence as a matter of due course in our kind’s progress.

Wikipedia Dependency

I find that I can spend a lot of money on a book on a subject but wiki still makes the subject most clear in far far shorter time.
This is becoming a problem. I am less and less able to read longer expositions. Less patience and probably reduced mental capacity to hold longer strands of thought. As a species, wiki-style knowledge transfer improves our knowledge sharing; as a person, this drastically reduces my own distinctiveness and competitiveness. I may, in fact, be organizing my thoughts as wiki articles. I can say everything I know in a few minutes, and it is all incredibly clear and right.
I am frustrated with everything else: people speak in imprecise and unedited ways. I can’t stand it, I need to ask for clarification of everything! Books do not have an introductory paragraph that actually introduces the ensuing content of discussion. What will I be spending the next few hours on? Idk! TV will conveniently cut away when vital information should naturally be revealed, and there should be an infographic explaining the relationship between all the characters!!! I can’t stand not knowing the definition of everything, all of which is only available through a link on wiki.
In the future, where we actually do depend on wiki for knowledge, how should it be maintained? Admittedly the current management has done well, but when all of humanity shifts to depending on wiki for up to 50% or even 80% of the facts they depend on, there should probably be more thought on how it should be maintained.
And that is without even worrying about the malicious or political edits the website can suffer, or about the psychological and evolutionary impacts when everyone has access to high quality information, or about the possible problems associated with monopolistic situations such as Wikipedia’s.
If it becomes a public utility, should it not be regulated as a public utility? Granted, the foundation is an American incorporated organization, so it already comes with a lot of American values: non-discrimination, nonprofit, apolitical, etc. It is already regulated.
But that regulation is not sufficient for a public utility that a large proportion of the population depends on, very much like how they depend on roads, electricity, water, the weather report, etc. Some guarantee of universality must be made to ensure every human has access to knowledge, along with some higher level of backup and a guarantee of reliable availability in times of crisis. Stated another way, more financial resources and more social procedures are needed to safeguard the utility (usefulness, universality and availability) and righteousness (adherence to American values) of Wikipedia and related internet establishments. I’d love for a portion of my taxes to pay for its upkeep, if there comes a time when government regulation is so strong that it becomes part of government operations (e.g. USPS, military, intelligence, education, roads, etc).
In the same breath, we should say that human knowledge loves freedom. If there is any person in the world who knows of freedom, and who values freedom, and who insists on freedom, that person is with high likelihood a knowledgeable one. Knowledge will resist restriction to the extent of self-destruction. If we do impose any additional restriction not yet ingrown organically, it may be ruined. 
Must tread carefully.

Must think more on the matter.

You have got to be Kidding me

So, Mandiant’s discovery of a Chinese hacker has led to the “discovery” of one of their blogs.

You have got to be fucking kidding me.

I mean, the obvious parallel one would draw is Mark Zuckerberg, who used his hacking skills to hack db’s and get pretty girls’ headshots and has now been accepted by society as a very successful and very good person… By that I mean, Mark is very rich and not that many people hate him the way they hate other billionaires.

His Chinese counterpart may be a lowly employee who finally joined his company, or maybe, he committed suicide after he was too embarrassed for not being able to find a wife or … actually more likely provide for a wife in the Chinese social/economic order.

But really, I am having trouble suspending disbelief and continuing that thought. Really? Would the Chinese censor allow this kind of stuff to be posted from a Chinese military installation? You have got to be kidding, right?

Hey, also, what’s with this thing where the US spy agencies are given access to US citizens’ financial information? Don’t they already have it and mine the shit out of it? Why the fuck would the CIA and NSA not already have access to this data? Seems really odd.

Anyway, I guess it’s nice that the Obama Administration decided to make the populace aware of this fact. Those who have anything to hide probably already know, and those who don’t know should be informed.

The other problem with monitoring and surveillance is that I really don’t trust my private information to a stranger. I don’t trust the information I keep private to anybody and that’s why I keep it private. These law enforcement people, they all have, to a large or small extent, a perverse interest in power. The cool gadgets that enable them to snoop, to record, to change things, to have control over other people’s lives. The elitist feeling: I’m more important, I have higher authority because I am doing something more important than you.

Fundamentally, these are the factors that drive society. But since law enforcement is to prevent the problems caused by these factors, they cannot be motivated by these same factors. And if YOU tell ME that YOU are a law enforcement officer and that YOU do NOT find a deep attraction to your WEAPON, your VEHICLE, your COMPUTER, your CODE, your TOOLS, your BADGE, your next COMMAND, your next SUSPECT/VICTIM and that you dream about them and that you some times cum to the thoughts of them, then I DO NOT BELIEVE YOU.

And if I do believe you then you are driven by the same forces that drive criminals to do the illegal things (much less bad things), which makes you no more trustworthy than them.

I do not want you to jerk off while looking at my bank accounts or my personal photos or my children’s personal photos. But what guarantees do I have that there is not a law enforcement officer doing that every day? It causes me no material harm, but I just don’t want that to happen. How do I explain this? On what grounds can I justify my distrust and disgust? Is this a human right? Is privacy a human right? It feels like it oughta be. It ought to be even more important for me to be able to keep my papers private than my right of speech regarding these papers.

I wish President Obama had an answer to this… I’m sure he does… I mean he signed up to be the commander in chief of all of these perverts. Anyway, all this fussing on my personal blog is probably not going to do society any good… sigh, for a brief moment, some bits in some computer on some planet in some galaxy… these patterns formed and then vanished…

Another one Bites the Dust!!

This is funny news… Chinese solar panel company SunTech defaults on $541 million bond.

Man, the environment is down for the count, and human politics is getting in the way of widespread use of solar power. Why? Because there appears to be anti-dumping legislation in the works in the EU and US to prevent the Chinese from dumping cheap solar panels in the markets.

Btw, if you haven’t noticed, this blog is set to auto-publish. I might have won the lottery since I wrote this piece and gone away, and these posts will appear automatically. It’s kind of funny, because there’s another news item about some Taiwanese singer/actor posting instructions from the Chinese government for him to bash Apple. The article is carried on the Fortune magazine site, originally from something called the TLN.

Like a typical American I read the news and thought, wow! that’s funny, wonder how much they paid the 820 party and how I can get in on it.

But wait, so… how do we know if it’s the Chinese government paying him or if it’s another company that competes with Apple? Lenovo comes to mind as they make both PC’s, and Huawei is a big manufacturer of cellphones. What about Sony? Samsung? Nokia? All these competitors… why the fuck would the Chinese government risk exposure and use state-sponsored media in an attack?

 

JESUS!

 

Save us!

 

Give us some very basic ability to think critically!

 

 

Should it be legal?

Time for another episode of “should it be legal ?”

 

Think of it… we’re in Philadelphia, no, the movie, not the city. And Tom Hanks discovers that the corporate email server is very slow… too slow in fact to receive the document he is trying to email to his assistant before the statute of limitations expires the next day. Would this count as illegal discriminatory behavior based on race, age, sexual preference or country of origin?

 

Actually a more important question to ask is: does anybody even care about fairness in the workplace? Are there any amongst you who would agree to racial discrimination just to receive some shares of stock or to feed your family? In this time of terrible economic crisis, I think most people in America do not have the liberty to act on concerns of unfairness.

 

Why have there been more frequent economic crises? I think I finally know why. It is not because corporate America cannot keep accounts straight or evaluate risk on mortgage loans! The crisis, for all practical purposes, legalizes discrimination. Everybody is holding their own mouths shut for fear of being seen as against the company.

 

Is it legal in America to restrict employee work-place internet connections and bandwidth based primarily on race, and place of origin?

 

Personally, having no law degree, I feel that it is race-based preferential treatment and unfairly biased against a certain group based on racial characteristics and place of origin.

 

Oh, I mean, I know it can’t be traced to the company… just like that fax was lost and recovered inexplicably in Philadelphia. But the mere fact of this capability should be announced publicly, like when police decide to arrest people they have to say out loud what they are doing and why. When the company inspects the employee’s connections from a workplace computer and delays or disrupts them, it must be done in an unbiased way.

 

Am I, like, the only one?

Dude, am I like the only one under the sun who doesn’t know who is “unsending” emails, or how?

 

The symptom is this: I type the email, hit send, it goes away. Next day (or several days later), I become aware that recipient did not receive the email. I look for the email and it is stored as an unsent “DRAFT” in gmail.

 

I did a quick search on google and didn’t see anybody else talk about this. But my emails (gmail) often become unsent after I hit the send button. I doubt it is a bug on google’s side. I also doubt it is very widespread, since I have neither seen nor heard anybody mention this problem.

 

But it does happen often when the content of email is undesirable for the recipient. This happens both in google’s free accounts and in a paid enterprise version of gmail. It happens both in work email and in personal email.

 

I mean, I guess I should admit, now that I’m at it, that I also have occasional ED… Because for a computer guy, not having this crucial skill is probably about as embarrassing as ED is to a man’s sexual ability: something that should occur naturally but is failing. Oh, and!?, btw!? I also have urinary incontinence. Experiencing all three, I can tell you that they don’t kill you, but all are very inconvenient and can be very very embarrassing.

 

Let’s see, what have I tried:

 

* Tried google’s 2-step verification.

* Tried paying google for the gmail account.

* HTTPS always, so it cannot be a man-in-the-middle due to an invisible corporate proxy. And it happens at home too.

* And failing that, using a mobile device that goes through an entirely physically separate cellular network.

* Use chrome, which supposedly is more secure than other browsers.

* Bcc myself on all mail.

* porn, sex, not drinking water, and diapers.

 

Still, emails become unsent the next day. The problem with this is that if it is not a bug, then the people who cause this to happen are seriously detracting from my ability to work and live. I mean, I have thought about how it might be my boss who just wants to delay a few projects so that he doesn’t have to give me a bonus, or my coworker who wants to make me look bad so that he can get a bonus, or the HR/legal of the company who want to reduce liability of the company by making it look like I didn’t communicate vital but damaging information.

 

But those are just suspicions of a really insane person. I mean, seriously, what are the chances that the silly secretary or office manager has more access to information and more control over my communications than I do? I mean, c’mon, I actually work and produce things that the company sells for money, it cannot possibly be that there is a person who sits there and reads every single email and evaluates them and selectively unsends them.

 

I don’t have trouble believing that shrewd corporate competitors and businessmen and an occasional hacker have the means to do this, but the unsending of email happens at several companies, in several accounts under management by different people. It happens enough to make me think that every company officially has the capability of unsending emails hosted by google?

 

Is this an attack by Microsoft? Part of the scroogle campaign? Some coworkers do come from the M$ family… Corporate conspiracy to defame google?

 

Despite these occasional intrusions, I have not been motivated to seek out a new email service provider (ESP) for my personal account, and certainly have no better alternative to recommend to work place.

 

Also, it could be that I just suffer from some kind of interruption in consciousness and somehow I have clicked on “INBOX” instead of “Send” on those occasions. But this is very unlikely as many of these emails contain important information. Also, there are occasions when I’ve checked that the email is in the “SENT” box before leaving work and then seen the email in the “DRAFT” folder several days later.

 

I know I won’t be the first or last guy to complain about ED… But how come there aren’t awareness campaigns and support groups for people whose emails get unsent?

 

 

p.s.

Btw, if you ever get raging hemorrhoids that stay for months and months or an anal fissure that reappears daily, try to use some baby diaper cream in addition to the fiber that the doctor prescribes. The cream helps you heal just as much as it helps a baby. fyi I guess… At least I have found some solutions regarding this embarrassing matter.

Code.org Advertisement and no-WFH

Recently code.org publicized a promotional video featuring ppl like Mark Zuckerberg of Facebook and Bill Gates of Micro$oft saying American schools should teach programming more.

 

I don’t like it.

 

I don’t think programming is for everyone, or that more programming is for social good or scientific advancement. It lowers the cost of labor for all those people in the Advertisement, but it isn’t as good as it sounds.

 

As a person who completed a CS degree, I feel that computer languages can be made much better, so that there won’t be any “computer programming.”

 

The day that I tried to teach my dad to program a for-loop in C, and he turned around and teased me about forgetting the closed form expression for an arithmetic series, was the first time that I thought about how stupid this stuff I do is. It was the expression on my dad’s face… I remember it vividly… For it was then that I realized that I had not comprehended the sheer vulgarity of

for(int x=0;x<100;++x);

so primitive, so stupid.

The next time was when I read about Map-Reduce: sooo freaking cool. I think tomorrow I will find another way to think, another way to say, and another way to program.

 

I want to make a better programming language. A better computer. That would be better than community colleges teaching Fortran IMHO.

 

Oh, and p.s.

I think Yahoo!’s new no-WFH policy is nice. I think it is real progress for the protection of civil liberty in America. Technology companies insist on ownership and monitoring of their employees while working, and are admittedly justified in doing so. Therefore when Marissa Mayer decided to cancel all WFH, she made a call that will end monitoring of employees’ home networks, because if you don’t work from home, the company will have no cause to instrument any kind of monitoring of your home network.

I think this is a really forward-thinking technology leader who cares about her employees. I am buying myself some Yahoo! stock in support of this bold move.

With Higher Knowledge Comes Higher Responsibility

The other day, at work (and by now you know I work for a Japanese Automotive Electronics company), we talked about autonomous cars for consumers. Since everyone is either a technology freak or a car freak, the discussion was pretty intense.

 

I explained to everyone the ethical issue surrounding autonomous cars that may not be completely resolved or resolvable by technology.

 

The matter is this: an autonomous car will with absolute certainty be faced with a situation where it has to choose between two actions, each of which will kill a different person. Suppose two people suddenly dash in front of the car, one to the left and one to the right, and suppose that the car is moving too fast to stop. It can veer to avoid one person with certainty. But which will it choose?

 

Another scenario: the car can brake very hard and avoid killing a pedestrian, but in the process it will have killed the passenger, because the car is mechanically able to endure much higher deceleration than its occupants.

 

There is also the legal problem: if I configure the car, or if some car company configures the car, to always protect its owner (rational), am I, the owner, or the designer, or the manufacturer, then liable to be sued for killing people?

 

“But your honor, the car swerved!! I had nothing to do with it”

 

Okay, so the people who want autonomous cars (myself partially included) will say that with better equipment, high-speed video/audio recording and black-boxes, there might be far fewer arguments about who was responsible for accidents. But there are some things in our current law that are absolute. If a car hits a person inside the crosswalk, the car is always responsible. If the car is rear-ended, the car behind will be responsible. What will happen to these absolute laws that are in many circumstances unreasonable but serve to protect the safety of the population?

 

And finally, even if autonomous vehicles reduce deaths to 1% or less of today’s vehicle related death rate, and I believe they will, what about that 1% where two people dash in front of the car and the car has to choose? What then? Why is this so hard?

 

One of the big problems is that an informed decision is hard. Given today’s technology (machine learning for object detection, vision algorithms, radar, laser range scanners, eeg/ekg, EMR technologies), the car can pretty reliably detect, with plenty of time to choose which one to save, that there are two people dashing in front of the car, one to the left and one to the right, along with their velocity, estimated trajectory and mass, the certainty of these estimates, and the margins of error (where else could each person likely be by the time we collide, etc.).

The reason humans get away with killing in this situation is that we do not have the speed and ability. It is beyond our control, until we program a computer to do it, and then we are suddenly faced with a choice that we never had to make before: kill left, kill right or maim both? Or risk killing both? Or kill myself to completely avoid their injury?

Hmm, let’s see, what would Confucius allow? What would Jesus insist on? Well, I don’t want to be killed, so don’t kill other people. I would want other people to save me, so I would want to brake and save both crazy people. Hmm, I guess it really depends on the person’s desire. One would say a more moral person may not wish for another moral entity to suffer for his own sake, let alone to exchange another’s life for his own. But by and large most people would ask the car to save them no matter which position they are in.

The moral problem arises in that we are not in any of those three situations. We are in the autonomous car’s designer’s shoes. We are in Asimov’s shoes. What should we write as the laws of autonomous vehicles? We know that at some point, the car will know almost certainly that it must kill/damage/disrupt someone/thing, and will know exactly which wire to send an electric signal down to choose which person. What should we tell the car to do?

Because soon the car will be looking at that scenario in slow-mo… with 10ms to decide and then 250ms to turn the steering wheel left or right and apply the brakes.

So, as you can see, the mere knowledge of morality and capability to choose encumber us with the responsibility of behaving morally. Because I know it’s wrong, I must not do wrong. Another person may think that the root of this evil is the fact that I know of this moral dilemma and that I have gained the speed to travel fast or gained the speed to determine people’s fate.

I wonder if they are right that those things are works of the devil and that the absolute best moral thing to do is just to stay away from them? I should consider this carefully. What if I find that it is wrong for me to live? Or wrong for me to blog about morality? What if it is found that the internet is not moral? Or god forbid that it is immoral to have stereo audio in cars? Because I already have the ability to terminate any one of them, at least for myself.

 

*shiver*

 

p.s.

I can accept an argument that placing one’s self into a situation where there is no moral choice is immoral. The autonomous car makers will insist that the car drive carefully so that it will never be faced with 2 people in said situation. But somehow, science, technology, human inquiry, may find a way to inform us that that is just delusional, that it is provably impossible to avoid crazy humans. 😉 back to square-one I suppose.

 

 

IG and the Quantification of Privacy

A while back, I talked about computing IG (information gain) by clandestine methods via an otherwise secret (personal) email. I will point to some other prior blog entries about what we can reasonably consider private and some reasons why I think it’s bad (because it removes competition).

The basic challenge is this: If your competitor can spy on what you do (unilaterally) then they will never be motivated to innovate. Their key strength will be their ability to hack your secrets and they will work hard on that, but not on how to build a better product or cure a disease or solve a new problem. If you can both spy on each other with perfect information then there is no need to innovate, just calculate the equilibrium and aim for that. If you can disinform your opponent then all your effort will go into disinformation instead of innovation. Basically it is much easier to do something sneaky and cheat than to do the right thing and innovate. This is why the government, a non-competing body whose interest is to make sure everyone competes (at least in American government this is the case), should provide for information security.

I realize in retrospect that IG may not make sense to most people based on the formulation I laid out. Let’s review. IG is the change in entropy from a state without additional knowledge to a state with knowledge:

IG = H(secret) – H(secret | private email)

This measurement seems to be of a quite abstract concept of entropy, a unitless measurement. Why would I think this useful for any reason other than that it is called “Information Gain?” Well, truth be told, what I had in mind was more the IG from the machine learning literature: class purity after conditioning on some private information. It is actually used more as a measurement of the correctness of predicting a discrete output than as an abstract change in the entropy of a distribution after conditioning. I will refer the reader to these excellent introductory books regarding “classification” algorithms.

… Some days pass and the books will hopefully have arrived on your desks…

So the example is this: my secret is the probability that I will have Chinese food tonight. Let’s throw in several more classes, say Italian and Mexican, so that together they cover 99.9% of all possibilities. This probability may be internal to me. Or it may be an externalizable model, like I will toss a three-sided die and figure out what I will eat tonight.

Actually, this system forces us to think of a new class. I will call this new class the innovation class. It covers all cases where something new might happen, such as tonight when I went off on a tangent and forgot to eat dinner completely. Or I might be abducted by Aliens for demanding privacy, Japanese paramilitary for blogging, or God for thinking all these awful things. The fact is, I do not know what will happen, but what I do know is that things I don’t know will happen. So the class is called IC, Innovation Class–now we have a 4 sided die: Chinese, Mexican, Italian, IC; Let’s write naively that the probability for each class is:

Chinese  Mexican  Italian  IC
33%      33%      33%      1%

The formula for the entropy of these classes, with log taken base 2 so that the result is in bits, is written as:

-H(Dinner)= p(Chinese) * log(p(Chinese)) + p(Mexican) * log(p(Mexican)) + p(Italian) * log(p(Italian)) + p(IC)*log(p(IC))

the above evaluates to H(Dinner)= 1.6499060116098556, which is a little above log2(3) ≈ 1.585, the maximum possible entropy in a three-class situation, because of the small fourth class.
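A minimal sketch of that calculation in Python, assuming base-2 logs (that is the base that reproduces the number quoted above). The names `entropy_bits` and `dinner_prior` are made up for this illustration.

```python
import math


def entropy_bits(probs):
    """Shannon entropy in bits; classes with zero probability contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Naive prior over tonight's dinner, taken from the table in the text.
dinner_prior = {"Chinese": 0.33, "Mexican": 0.33, "Italian": 0.33, "IC": 0.01}

h_dinner = entropy_bits(dinner_prior.values())

print(f"H(Dinner)         = {h_dinner:.4f} bits")
print(f"3-class maximum   = {math.log2(3):.4f} bits")
print(f"4-class maximum   = {math.log2(4):.4f} bits")
```

The two reference values are there only to show where the prior sits: just above what three equally likely classes could give, and well below the two bits a uniform four-sided die would have.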

That’s it. That’s the formula for calculating entropy that we will use repeatedly. Now, suppose that you have read my email to my wife saying “oh man, look at this great deal on groupon, 50% off on Indian food right near our home.” What is the right thing to think about the distribution of my dinner?

P(IC)=99%

Indian food is not Chinese or Mexican or Italian, but we have thought of that and put in IC to account for it.

Chinese  Mexican  Italian  IC
0.33%    0.33%    0.33%    99%

-H(Dinner|private email to wife) = p(Chinese|private email to wife) * log(p(Chinese|private email to wife)) + p(Mexican|private email to wife) * log(p(Mexican|private email to wife)) + p(Italian|private email to wife) * log(p(Italian|private email to wife)) + p(IC|private email to wife)*log(p(IC|private email to wife))

gives us the conditional entropy of my dinner after reading my private email. This entropy is H(Dinner|private email to wife)=0.09596342477405478

IG(Dinner; private email to wife) = H(Dinner) – H(Dinner|private email to wife) = 1.6499060116098556-0.09596342477405478=1.5539425868358008. This corresponds to an IGR (here, IG divided by the entropy that remains after the email) of 1619.31%, that is, about 16X more information after you saw the email than before.
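Here is a rough Python sketch of the same arithmetic. It assumes base-2 logs and a posterior of P(IC)=99% with the other three classes at 0.33% each, since that is the distribution that reproduces the conditional entropy quoted above. It also assumes IGR means IG divided by the remaining entropy. All variable names are illustrative.

```python
import math


def entropy_bits(probs):
    """Shannon entropy in bits; classes with zero probability contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Order of classes: Chinese, Mexican, Italian, IC.
prior = [0.33, 0.33, 0.33, 0.01]

# Assumed posterior after reading the groupon email. It sums to 0.9999
# rather than exactly 1; that rounding is kept on purpose to match the text.
posterior = [0.0033, 0.0033, 0.0033, 0.99]

h_prior = entropy_bits(prior)
h_posterior = entropy_bits(posterior)

information_gain = h_prior - h_posterior
igr = information_gain / h_posterior  # assumed meaning of "IGR"

print(f"H(Dinner)         = {h_prior:.6f} bits")
print(f"H(Dinner | email) = {h_posterior:.6f} bits")
print(f"IG                = {information_gain:.6f} bits")
print(f"IGR               = {igr:.2%}")
```

Swapping in a different posterior, for example P(IC)=85%, is a one-line change, which makes it easy to see how much the privacy-loss number depends on whoever writes down the table.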

 

Great! So now we know how much information is gained by reading that one private email of mine. This number, I think, quantifies my loss of privacy.

 

Btw, this innocent example contains some hand waving. H(Dinner), for example, is something that we may or may not know. Most people have trouble writing down a distribution for dinner choices. Also, P(Dinner|private email to wife), here written as a table, contains assumed values. What if after reading my private email you feel that P(IC)=85%? Who is to say what the reality of this probability is? This is why I felt that this model will not make it into the mainstream legal system: the link between the private email and the actual secret itself is not so obvious. You might use naive Bayes as the definition of reality (refer to the chapter in the books or wiki), logistic regression, decision trees, or you might use something else… You may even use a distributions system like SVM or god forbid rule based systems…

If you understand this computation above, then it will be easy for you to understand the continuous version. Let dinner be a continuous variable, we can still write the same expression

IG(Dinner; private email to wife) = H(Dinner) – H(Dinner|private email to wife)

and it would have the same meaning. How far are we from the truth? This idea, btw, is indeed partially inspired by the name Information Gain, which also goes by Kullback-Leibler divergence when computed over distributions. It is exactly the above formulation, with the exception that “private email to wife” is a distribution; say, perhaps, my emails are generated randomly.

KL( Dinner|private email || Dinner )

But KL divergence does point us to some other interesting characterizations. A divergence is a distance without some of the properties of a distance. Namely, it is not a metric distance:

* Nonnegative dl(x,y)>=0: yes

* Identity of indiscernibles: dl(x,y)=0 iff x==y: yes

* Symmetric dl(x,y)==dl(y,x): NO

* Triangle inequality dl(x,y)+dl(y,z) >= dl(x,z): NO
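To make part of that checklist concrete, here is a small Python sketch that computes KL divergence in bits between the dinner prior and an assumed posterior, in both directions. The posterior values are illustrative and renormalized to sum to 1, and `kl_bits` is a made-up helper name. The triangle inequality is not exercised here.

```python
import math


def kl_bits(p, q):
    """KL(p || q) in bits for two discrete distributions over the same classes.

    Assumes q is nonzero wherever p is nonzero.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Order of classes: Chinese, Mexican, Italian, IC.
prior = [0.33, 0.33, 0.33, 0.01]
posterior = [0.0033, 0.0033, 0.0033, 0.9901]  # assumed, sums to 1

forward = kl_bits(posterior, prior)   # KL(after email || before email)
backward = kl_bits(prior, posterior)  # KL(before email || after email)

print(f"KL(posterior || prior) = {forward:.4f} bits")
print(f"KL(prior || posterior) = {backward:.4f} bits")
print(f"KL(prior || prior)     = {kl_bits(prior, prior):.4f} bits")
```

Both directions come out nonnegative, the self-divergence is zero, and the two directions do not agree, which is the symmetry failure in the list.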

This has some serious implications regarding this formulation of privacy. Some things that we naturally think should make sense do not.

Let’s say I have two emails, e1 and e2, and let’s say dinner is still the subject of intense TLA investigation:

KL(d;e1) + KL(d;e2) != KL(d;e1,e2)

All private information must be considered together, because considering the pieces separately would yield an inconsistent measurement of privacy loss.
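A toy case shows how badly the sum can miss. This Python sketch reads the KL(d;e) notation above as mutual information between secret and email, which is an assumption on my part, and uses a made-up model where each email is an independent fair coin flip and the dinner secret is their XOR. The helper `mutual_information_bits` is hypothetical.

```python
import math
from collections import defaultdict
from itertools import product


def mutual_information_bits(joint, a_idx, b_idx):
    """I(A;B) in bits. A and B are groups of positions in the outcome tuples."""
    pa, pb, pab = defaultdict(float), defaultdict(float), defaultdict(float)
    for outcome, p in joint.items():
        a = tuple(outcome[i] for i in a_idx)
        b = tuple(outcome[i] for i in b_idx)
        pa[a] += p
        pb[b] += p
        pab[(a, b)] += p
    return sum(
        p * math.log2(p / (pa[a] * pb[b])) for (a, b), p in pab.items() if p > 0
    )


# Toy joint distribution over (d, e1, e2): e1 and e2 are independent fair
# bits and the secret d is e1 XOR e2. Purely illustrative.
joint = {(e1 ^ e2, e1, e2): 0.25 for e1, e2 in product((0, 1), repeat=2)}

ig_e1 = mutual_information_bits(joint, [0], [1])
ig_e2 = mutual_information_bits(joint, [0], [2])
ig_both = mutual_information_bits(joint, [0], [1, 2])

print(f"IG(d; e1)     = {ig_e1:.3f} bits")
print(f"IG(d; e2)     = {ig_e2:.3f} bits")
print(f"IG(d; e1, e2) = {ig_both:.3f} bits")
```

Each email alone tells you nothing about dinner, yet the pair together gives the secret away completely, so adding per-email numbers would report zero privacy loss for a total leak.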

Let’s say there are two secrets: d1 is my dinner choice and d2 is my wife’s dinner choice

KL(d1;e1,e2) + KL(d2;e1,e2) != KL(d1,d2; e1,e2)

All secrets must be computed together, because computing IG separately and adding is not equal to the total information gain.
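The opposite failure, overcounting, shows up when the secrets overlap. In this Python sketch (same assumption as before that KL(d;e) stands for mutual information, with a hypothetical helper and a made-up model), my wife and I always eat the same thing and the emails reveal it outright, so the two secrets are really one.

```python
import math
from collections import defaultdict


def mutual_information_bits(joint, a_idx, b_idx):
    """I(A;B) in bits. A and B are groups of positions in the outcome tuples."""
    pa, pb, pab = defaultdict(float), defaultdict(float), defaultdict(float)
    for outcome, p in joint.items():
        a = tuple(outcome[i] for i in a_idx)
        b = tuple(outcome[i] for i in b_idx)
        pa[a] += p
        pb[b] += p
        pab[(a, b)] += p
    return sum(
        p * math.log2(p / (pa[a] * pb[b])) for (a, b), p in pab.items() if p > 0
    )


# Toy joint distribution over (d1, d2, e): a fair coin picks the dinner,
# both of us eat it, and the emails e state it plainly. Purely illustrative.
joint = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}

ig_d1 = mutual_information_bits(joint, [0], [2])
ig_d2 = mutual_information_bits(joint, [1], [2])
ig_joint = mutual_information_bits(joint, [0, 1], [2])

print(f"IG(d1; e)     = {ig_d1:.3f} bits")
print(f"IG(d2; e)     = {ig_d2:.3f} bits")
print(f"IG(d1, d2; e) = {ig_joint:.3f} bits")
```

Summing per-secret gains would claim two bits were lost when only one bit of secret ever existed.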

Let’s say we have an intermediate decision called Mode of Transportation (mt), and it is a secret just like my dinner choice.

KL(mt;e1,e2) + KL(d;mt) != KL(d;e1,e2)

The intermediate secret can be calculated, but again, it must be calculated carefully and not by additive increase of IG.

Bummer, but fascinating!! But we must make some choice about how to proceed. Knowledge about the nature of information (and especially electronic information), I believe, informs how we make choices in our privacy laws:

 

  • Should the whole data be analyzed all at once?
  • or should we only allow each individual’s data be processed all at once?
  • or should we only allow daily data of everyone to be processed together?
  • or should we only allow daily data of each individual to be processed separately?

Each of these choices (and many others) impacts the private information loss due to clandestine activities.