Google awarded software patent covering the principle of distributed mapreduce

Posted on January 20th, 2010 by bile

The USPTO awarded search giant Google a software method patent that covers the principle of distributed MapReduce, a strategy for parallel processing that the company uses. If Google chooses to aggressively enforce the patent, it could have significant implications for some open source software projects that use the technique, including the Apache Software Foundation’s popular Hadoop software framework.

“Map” and “reduce” are functional programming primitives that have been used in software development for decades. A “map” operation allows you to apply a function to every item in a sequence, returning a sequence of equal size with the processed values. A “reduce” operation, also called “fold,” accumulates the contents of a sequence into a single return value by performing a function that combines each item in the sequence with the return value of the previous iteration.
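As a small illustration of the two primitives described above, here is a minimal sketch in Python. The list of numbers and the lambdas are made up for the example; only `map` and `functools.reduce` are standard.

```python
from functools import reduce

numbers = [1, 2, 3, 4]

# map: apply a function to every item; the result has the same length as the input
squares = list(map(lambda x: x * x, numbers))

# reduce (fold): combine each item with the running result of the previous step
total = reduce(lambda acc, x: acc + x, squares, 0)

print(squares)  # [1, 4, 9, 16]
print(total)    # 30
```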

Google’s MapReduce framework is roughly based on those concepts. A series of data elements is processed in a map operation, then combined at the end with a reduce operation to produce the finished output. The advantage of partitioning a workload this way is that it’s extremely conducive to parallelization. Each discrete unit of data in the series can be processed individually and combined at the end, making it possible to spread the workload across multiple processors or computers. It’s a fairly elegant approach to scalable concurrency, one that offers efficiency regardless of whether your environment is a single multicore processor or a massive grid in a data center.
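The partitioning idea can be sketched on a single machine with a word count. This is an illustrative toy, not Google's or Hadoop's implementation: the function names `count_words`, `merge_counts`, and `word_count` are hypothetical, and a thread pool stands in for the separate processors or computers a real deployment would use.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def count_words(chunk):
    """Map step: turn one chunk of text into partial word counts."""
    return Counter(chunk.lower().split())


def merge_counts(left, right):
    """Reduce step: combine two partial results into one."""
    return left + right


def word_count(chunks, workers=4):
    # Each chunk is independent, so the map step can be handed to any worker.
    # Threads are only a stand-in here for separate processors or machines.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_words, chunks))
    # The reduce step folds the partial results into the final output.
    return reduce(merge_counts, partials, Counter())


chunks = ["the quick brown fox", "the lazy dog", "the fox jumps"]
result = word_count(chunks)
print(result["the"], result["fox"])  # 3 2
```

Because no chunk depends on another, the number of workers can change without touching either the map or the reduce function, which is the property that makes the pattern easy to scale.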

Google published a paper in 2004 that described how it uses MapReduce. The paper attracted considerable interest and paved the way for the MapReduce pattern to become a common technique for parallelization. One of the best-known third-party implementations of MapReduce for distributed computing is Hadoop, an open source Apache project now used by Yahoo, Amazon, IBM, Facebook, Rackspace, Hulu, the New York Times, and a growing number of other companies.

Google’s patent on MapReduce could potentially pose a problem for those using third-party open source implementations. Patent #7,650,331, which was granted to Google on Tuesday, defines a system and method for efficient large-scale data processing:

A large-scale data processing system and method includes one or more application-independent map modules configured to read input data and to apply at least one application-specific map operation to the input data to produce intermediate data values, wherein the map operation is automatically parallelized across multiple processors in the parallel processing environment. A plurality of intermediate data structures are used to store the intermediate data values. One or more application-independent reduce modules are configured to retrieve the intermediate data values and to apply at least one application-specific reduce operation to the intermediate data values to provide output data.

I suspect Google applied for the patent defensively, as many software firms do, and I don’t expect them to go after anyone. If they hadn’t, someone else likely would have. It shows us another business distortion and waste of resources brought about by intellectual property. Nothing in this patent is all that impressive or new. Even if it were, that wouldn’t justify an artificial monopoly privilege enforced through aggression, and the threat thereof, by another monopoly.

This patent is important both because distributed MapReduce has become a popular way of processing data and because it personally affects the work I’m currently doing. I hope that in this regard Google sticks with their “don’t be evil” slogan and simply sits on this patent.

Why Chris Anderson has it wrong about scarcity and abundance

Posted on July 7th, 2009 by bile

 | Scarcity | Abundance
Rules | Everything is forbidden unless it is permitted. | Everything is permitted unless it is forbidden.
Social model | Paternalism (“We know what’s best”) | Egalitarianism (“You know what’s best”)
Profit plan | Business model | We’ll figure it out
Decision process | Top-down | Bottom-up
Organizational structure | Command and control | Out of control

  1. A reduction in the scarcity of electrical energy would be far more significant than a reduction in the scarcity of storage capacity or computing power. Increased computing power may help lead to new inventions, new ways to grow crops, etc., but it does not provide the means to do those things. Even with a major reduction in the cost of electrical energy there would still be lots of other components in life which would be scarce. All things being equal, even if my energy costs were zero I’d still have the costs of labor, rent, etc., and those are greater.
  2. Egalitarianism is “You know what’s best”? Since when? Every egalitarian philosophy and political theory is paternalistic in practice and often in theory. Egalitarianism is almost always collectivist and tends to remove responsibility from individuals for the supposed betterment of society.
  3. Ultimately, humans act because of scarcity. When combined with libertarian property rights you do have an “Everything is forbidden unless it is permitted” system, more or less. But that’s a good thing. History and analysis of human behavior show that such a system leads to less conflict. Communistic systems, “Everything is permitted” as in “everything is everyone’s,” almost always fail due to conflicts of interest and asymmetric desires, which ultimately lower productivity and therefore per capita wealth.
  4. Private property theory and the free market, which exist due to scarcity, are bottom-up. There can never be a post-scarcity scenario as described, only greatly reduced scarcity in particular areas of the economy. So if you have a free market the decision process will always be bottom-up and ordered chaos. Only statism is “top-down” and “command and control” in any significant manner.
  5. Craigslist and Wikipedia are not gift economies any more than this blog or any other not-for-profit is. Craigslist and Wikipedia actively request donations, which is contrary to the fundamentals of a gift economy. Besides, they aren’t closed loops and are therefore not isolated economies but part of the greater world economy, which is fascistic.
  6. Mr. Anderson makes several references to “waste.” That nature is wasteful. It isn’t. That YouTube is filled with “waste.” It isn’t. Waste is a subjective evaluation. Animals of simpler structure and lesser ability to protect offspring naturally play the statistics game. Those fit enough, i.e., those that produced more offspring and thereby had a greater chance of survival, have survived. There is no waste, just a different yet equally sufficient method of continuing the species. It is completely different for YouTube. There is no waste because it is what the customers desire. Only from another’s perspective can one’s property be considered ill used. He even makes mention of this fact yet still calls them “crap” and “waste.” It seems to me that he’s just being contradictory in his language and not his intent, but it’s frustrating nonetheless.
  7. He makes reference to the prime-time broadcast schedule being a scarce resource. That’s only because of government intervention. The FCC has held back radio technology since its inception. Rather than allowing the industry to run its natural course, much like the digital tech industry he likes to use in comparison, the FCC has regulated the life out of the industries using the radio spectrum. The cost of running a transmitter and even getting the basic equipment to produce a TV or radio show isn’t that high. (And it would have been lower if it weren’t regulated.) If individuals were free to transmit content as they saw fit and the government just enforced property rights regarding the homesteading of the radio frequencies, there would have been more content and far lower “real costs.”

Seems to me Chris Anderson needs a lesson in economic theory. Appears his book is available for free. Perhaps when I’m done with a few other audiobooks I’ll check out his.

The First 100 Days: 100 of Obama’s Lies, Blunders, Gaffes, and Abuses of Liberty

Posted on April 30th, 2009 by bile

  1. Promising to “publish all non-emergency legislation to the website for five days… before the President signs it,” then breaking that promise over and over again.
  2. Despite promising to keep lobbyists out of his administration, Obama has broken his word again and again (making 17 exceptions to this promise in his first two weeks).
  3. Obama promised to eliminate income taxation for seniors making less than $50,000 a year. He has broken this promise despite numerous opportunities to keep it, including the economic stimulus package and his administration’s first budget proposal.
  4. The President also boasted during his campaign that “During 2009 and 2010, existing businesses will receive a $3,000 refundable tax credit for each additional full-time employee hired,” and has failed to keep his word.
  5. Obama made it part of his agenda to “allow withdrawals of 15% up to $10,000 from retirement accounts without penalty (although subject to the normal taxes). This would apply to withdrawals in 2008 (including retroactively) and 2009,” but didn’t include this measure in the stimulus package or his budget proposal.
  6. Obama broke his promise to recognize the Armenian Genocide.
  7. Obama did a shameless 180-degree turn on earmarks by sharply criticizing them (and bragging that he would pass legislation without a single one) and then signing a spending bill with literally thousands of them.

I’m completely OK with 7. Better that the so-called representatives waste the money than the executive branch. The collection, allocation and spending of the money in the first place is the actual problem.

This list isn’t bad. It stretches to get 100 things but much of it is reasonable.