Friday, April 2, 2010

Use a sandbox to prevent runaway cars?

I read yet another news story about runaway Toyota cars today.  All these sad stories make me think seriously about software bugs.
As a software engineer, my daily job is to create bugs (as a by-product of writing code) and fix (hopefully a large percentage of) them.  When my code is shipped to millions of customers, inevitably and sadly, it still has bugs.  The cost could be customers losing time restarting their computers, or losing data.
However, for the software used in cars, the cost could be people's lives!  Finding and fixing all bugs in complex software is extremely hard, if not impossible.  I hope one day we will be able to prove the correctness of any software (see the seL4 kernel, a formally verified L4 microkernel).  But before that day comes, we still have to deal with software bugs.
There are news reports that cars recalled and repaired by Toyota can still run away.  This could imply that Toyota didn't really fix the problem.  Here are the possible causes I can think of:
  1. Driver error.  If the driver pushes the gas pedal to the floor, the car will run away.  This is the most common cause of runaway cars, but it is not the case in several of the Toyota accidents.
  2. Hardware problem.  I find it hard to believe that a floor mat could trap the gas pedal: the mat grips the floor firmly and is unlikely to slide forward, and it is not stiff enough to hold the gas pedal all the way down.  Is it possible that the gas pedal sticks by itself?  Since I never heard stories about sticking gas pedals before software became widely used in car control, this does not sound very likely either.
  3. Software problem.  I think this is the most likely cause.
I don't know what the control software in cars looks like.  But it should be easy to add a check before sending an acceleration command: if the brake is pressed, cancel the command.  However, if a memory corruption bug corrupts the code that performs this check, the check is useless.
So I think a sandbox could be used as a layer of protection.  If the control software runs inside a sandbox, the sandbox can perform this check itself, reject unreasonable commands sent by the control software, and restart the control software to reset its state.  In this way, no matter what bugs occur inside the sandbox, an unreasonable combination of commands never reaches the car.
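To make the idea concrete, here is a minimal sketch in C of the check such a sandbox might run on every command.  Everything in it is hypothetical: the `drive_command` structure, the `verdict` values, and the function `sandbox_check` are names I made up for illustration, and real automotive control software surely looks different.  The one important assumption is that this code lives outside the control software's memory, so a corruption bug inside the sandbox cannot touch it.

```c
#include <stdbool.h>

/* Hypothetical command emitted by the sandboxed control software. */
typedef struct {
    bool brake_pressed;  /* brake pedal state, as read by the monitor itself */
    int  throttle_pct;   /* requested throttle opening, expected 0..100 */
} drive_command;

/* What the monitor decides to do with a command. */
typedef enum {
    CMD_ACCEPT,              /* forward the command to the actuators */
    CMD_REJECT_AND_RESTART   /* drop it and restart the control software */
} verdict;

/* Runs outside the sandbox and validates each command before it is applied. */
verdict sandbox_check(const drive_command *cmd)
{
    /* A throttle value outside the valid range is a sign of corruption. */
    if (cmd->throttle_pct < 0 || cmd->throttle_pct > 100)
        return CMD_REJECT_AND_RESTART;

    /* Brake and throttle at the same time: the brake always wins. */
    if (cmd->brake_pressed && cmd->throttle_pct > 0)
        return CMD_REJECT_AND_RESTART;

    return CMD_ACCEPT;
}
```

The check is deliberately tiny.  The less code the monitor contains, the more realistic it is to review it exhaustively, or even prove it correct, while the large control program behind it stays untrusted.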

Thursday, March 11, 2010

Are you smarter than a compiler?

I recently read a survey of C compilers by Felix von Leitner, presented at Linux Kongress 2009, and was surprised by how smart modern C compilers are.
For example, to calculate the absolute value of a 64-bit signed integer, you would write:
    x > 0 ? x : -x
The compiler optimizes this into branchless code equivalent to:
    long tmp = x >> 63;
    return (tmp ^ x) - tmp;
Since a mispredicted branch costs about 10 cycles, and the CPU can issue up to 4 instructions per cycle, the branchless variant can be a lot faster, but it is much less readable to a human.
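To convince yourself that the two forms agree, here is a small sketch that puts them side by side.  The function names `abs_branchy` and `abs_branchless` are mine, not from the survey.  The branchless version assumes that `>>` on a negative signed value is an arithmetic shift, which the C standard leaves implementation-defined but mainstream compilers such as GCC and Clang provide.  Both versions overflow for `INT64_MIN`, so that input is out of scope.

```c
#include <stdint.h>

/* The readable version: what you would naturally write. */
int64_t abs_branchy(int64_t x)
{
    return x > 0 ? x : -x;
}

/* The branchless version, written out by hand. */
int64_t abs_branchless(int64_t x)
{
    /* tmp is 0 when x >= 0 and -1 (all bits set) when x < 0,
       assuming an arithmetic right shift. */
    int64_t tmp = x >> 63;

    /* For x >= 0: (0 ^ x) - 0 == x.
       For x <  0: (~x) - (-1) == ~x + 1 == -x in two's complement. */
    return (tmp ^ x) - tmp;
}
```

For example, with x = -5, tmp is -1, tmp ^ x is 4, and 4 - (-1) gives 5.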
I still remember the days when I used "++i" instead of "i++", and "x>>2" instead of "x / 4".  But such "optimizations" no longer matter, thanks to advances in compiler technology.
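The shift trick is worth a closer look, because it is not even always equivalent.  The sketch below (function names are mine) shows that for unsigned values "x >> 2" and "x / 4" give the same result, but for negative signed values they can differ: division truncates toward zero, while an arithmetic shift rounds toward negative infinity.  The signed shift result is implementation-defined in C; the comment describes what GCC and Clang do.

```c
#include <stdint.h>

/* Unsigned: the two forms always agree, and the compiler knows it. */
uint32_t quarter_div(uint32_t x)   { return x / 4; }
uint32_t quarter_shift(uint32_t x) { return x >> 2; }

/* Signed: division truncates toward zero (C99 and later). */
int32_t signed_quarter_div(int32_t x)   { return x / 4; }

/* Signed: with an arithmetic shift (GCC, Clang) this rounds toward
   negative infinity, so it differs from x / 4 for some negative x. */
int32_t signed_quarter_shift(int32_t x) { return x >> 2; }
```

With x = -7, the division yields -1 while the shift yields -2 on those compilers.  So writing "x / 4" is not only more readable, it says what you mean, and the compiler picks the correct fast instruction sequence for you.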
In most cases, it is better for us to spend time making the code more readable than getting the last bit of performance from the hardware.
Remember, readable code may be as fast as "optimized" code, and more importantly, readable code is more likely to be correct!

Sunday, March 7, 2010

FAST 2010 Impressions: "Technology for Developing Regions"

The annual FAST conference is an excellent conference on file and storage technologies.  I attend it every year and really like it.  I want to share with you the first keynote of FAST 2010: "Technology for Developing Regions" by Professor Eric Brewer from UC Berkeley.

Professor Brewer and his team went to Africa and India and researched technologies to support local education and health care and to preserve culture.  Here are some interesting points from the talk.

The cell phone market is much bigger than the PC market, and Africa is the fastest-growing region for cell phones, although it has only 10% coverage.  Many women in Africa buy cell phones and rent the minutes out to local people, since there are no land lines.  This is a very profitable business for them.  Interestingly, much of the money that pays for cell phone use is sent home by Africans working outside of Africa.

There are 6000 languages in Africa, but sadly, most are dying, because there is no storage or technology to record them.  Local radio stations also do not record the programs they air, because they don't have enough storage.  Brewer's team provided storage technologies for them, as well as recorded educational materials for local schools.  They ship DVDs there and use SMS to apply small updates.  Shipping DVDs is still the cheapest way to transfer large amounts of data with good bandwidth.  The storage technology they use is TierStore, a mostly disconnected distributed file system.

Brewer's team was also helping rural India build up infrastructure for telemedicine.  They used WiFi as the long-range communication method, which was the cheapest technology they found.  As long as one endpoint has line of sight to the other, a WiFi connection with a throughput of several Mbps can be established.  They set a world record for WiFi transmission distance: 382 km!  One side of the link was on a mountain, because the earth is round and you cannot see far enough along its curved surface.  These WiFi networks were used to let hospitals reach more patients.  The doctors interviewed the patients over the network.  This worked quite well, and over 25,000 patients recovered their sight through the eye hospitals.

Without listening to his talk, I would never have imagined that one of the biggest challenges in building such an infrastructure was the power supply, a 99.999% reliable and taken-for-granted thing in developed countries!  Brewer's team found that the voltage on the power line could range from much lower than 220 V to spikes of 500-1000 V.  As a result, they lost over 50 power adapters and some other equipment to these spikes.  I wonder how such a high voltage could ever reach the equipment without damaging the equipment in between?!