Saturday, September 02, 2006


Reading about the Blu-ray / HD DVD disaster today and thinking about HDMI.  I am really aggravated.  I am aggravated not just by DRM preventing all my existing projectors and plasmas from playing content at the resolutions they are capable of, but also by the need for yet another type of cable.  I have boxes full of different kinds of cables.  Everything I plug in to anything uses its own cable!  Cables drive me insane!  

Isn't it about time that we consumers start demanding that everyone use one type of cable?  Let me suggest optical cables: they can carry the highest bandwidth, can be run arbitrary distances without distortion, and can conceivably carry many signals simultaneously (there are many different ways to multiplex optical signals).

The world would be a much better place if I could grab a cable (any cable), knowing it can be used to connect any two devices.  

Of course, there is the issue of power.  What a mess that issue is.  How many power-bricks are connected to how many power-strips in your house?  What a disaster!  And why?  Maybe I'll rant about free energy and wireless power distribution another time.  The entire topic is stupefying.

Wednesday, May 25, 2005

Interesting article on XBox 360 and Procedural Synthesis

This article at Ars Technica goes into some detail about the new XBox 360's procedural synthesis scheme for scene generation. I found it very interesting, as it takes approaches that I referred to in my previous post. The interpolation and synthesis of 3D models is an approach that may be used in a computer visualization system such as the one I was talking about.

This sounds pretty cool; if it works like Ars is claiming, then scene programming can be abstracted into semantic representations. It probably isn't as far along as what I was imagining, but it sounds very promising. For Microsoft to be using this approach, it is likely that their R&D labs are anticipating much more complex uses of this technology.

Friday, May 06, 2005

3D Video Compression and Computer Vision & Imagination

For quite a while I have been tossing around in my mind the concept of 3D scenes as a means of both Video Compression and Computer Vision.

It seems reasonably clear to me, for instance, that when I have a memory of a physical experience (as opposed to a memory of a thought, music, emotion, concept, etc..), that the memory is not a picture at all. The memory, in fact, is that of some number of things (objects, people, backgrounds, etc..). When I visualize the memory, I often don't remember things that would be obvious in a photographic sense, such as the color of someone's shirt or whether there were clouds in the sky.

Thinking about this makes me reasonably sure that what I am actually doing when I am visualizing the past is that I am imagining it based on what I remember happening at the time. In other words, I am mentally reconstructing the imagery much like a computerized 3D modeling program would. I know what different objects should and do look like and I know what they look like when they are modified by obvious actions. So, for example, if I want to visualize an old, blue, Ford Thunderbird with a dent in the side from an accident, I can do it even if I have never seen such a vehicle. The act of imagining that automobile, I believe, is identical to the act of remembering something that happened in my real past. The exercise feels the same and the imagery in my mind has the same real quality to it.

Take a second to try this out yourself. Imagine an old, blue Ford Thunderbird with a dent in the door and front-right quarter panel from a car accident. Wait a second before continuing to read.

Your brain is an amazing thing. Like me, your imagination probably also furnished you with an appropriate scene for the car. It might have been parked in a parking-lot, or a garage or by the curb of your house. Most likely, it was placed in a comfortable and likely scene for you by the interpolation engine that your imagination uses to fill in the blanks.

Ah, that interpolation engine is an interesting thing though. That bugger is probably responsible for more mistakes in the history of mankind than any other artifact of our intelligence. You see, if I am right and our memories are imagined reconstructions of the important details that we do semantically and relationally encode, then we can very easily fall victim to ghost memories. Ghost memories are those memories that were imposed by the interpolation part of your imagination to fill in the gaps in the scenes that you create in your mind. In our example, it was the parking lot or garage. If we pay close attention to our memories while we reconstruct them, we should be able to avoid this problem, because I believe that the actual memories contain few, if any, real scenes. Imagery is stored, I believe, only when it is unique and interesting.

This brings me to how this technique could be used for Video Compression and Computer Vision. It probably is pretty obvious based on what I just wrote, but let me go through it.

First, we must discuss Computer Vision because that is required for 3D Compression to be possible. 3D Computer Vision is the act of composing 3D scenes from video. There are several interesting projects doing this today. I'll try to dig up links another time.

We probably don't have the computational power to do this right today, but what we want is a system that does the following:

(1) Using stereo cameras and/or temporal (time-based) recording (e.g. video), be able to compose first-approximation 3D scenes at greater than 30fps.
(2) Within the processing constraints required to process video at greater than 30fps, the ability to do model matching against a database of known objects. We want our system to be able to search a database of 3D models for objects that are similar to what is present in the current 3D scene. It should be noted that since we are processing a continuous stream, we should already have model data from previous time-states. So every new time-state or frame provides us with the opportunity to refine our recognition of objects and recognize new objects. If this sounds eerily like what you do when you walk into a room, I believe that is no coincidence.
(3) After separating out known models and refining the models that we did recognize (e.g. "that is a person" becomes "that is John"), we must be able to both update the database and add new objects to it. This gets complicated, but for simplicity's sake, consider the database to always be a work in progress. Both 3D models (e.g. polygons, splines, ...) and textures (such as unique 2D arrangements of color) should be encoded this way.
(4) Now, start building a semantic/relational model of the scene (continue building from the previous frame). This model contains references to objects, attributes of those objects and relationships between objects instead of polygons and pixels. Again, the goal is >30fps. We will be encoding movement of objects, movement of camera, appearance of new objects, recognition of objects unrecognized in previous frames (this may force us to walk backwards in the time-stream and adjust previous frames to account for new knowledge), etc..

At the end of this process, we have an updated database and a stream of frames that is purely a series of coded semantics: references, attributes, and relationships.
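To make that concrete, here is a minimal sketch (in Python, with every name hypothetical) of what one frame of such a semantic stream might look like: just references into the shared model database, attributes, and relationships, with no pixels or polygons in sight.

```python
from dataclasses import dataclass, field

@dataclass
class ObjectRef:
    model_id: str                                   # key into the shared 3D-model database
    attributes: dict = field(default_factory=dict)  # e.g. color, pose, action

@dataclass
class Relation:
    kind: str        # e.g. "passing", "touching", "moving_toward"
    subject: str     # local object ids within the frame
    target: str

@dataclass
class Frame:
    time: float
    objects: dict    # local id -> ObjectRef
    relations: list  # Relation entries

# One frame of "John walks past a dented blue Thunderbird":
frame = Frame(
    time=0.033,
    objects={
        "o1": ObjectRef("person/john", {"action": "walking"}),
        "o2": ObjectRef("car/thunderbird", {"color": "blue", "dented": True}),
    },
    relations=[Relation("passing", "o1", "o2")],
)

# The frame is just references and attributes -- no image data at all.
print(len(frame.objects), len(frame.relations))  # → 2 1
```

A whole second of video at 30fps would be thirty of these tiny records, which is where the compression claim below comes from.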

From a Computational Intelligence point of view, this stream of semantic knowledge is incredibly valuable. Processing on semantic knowledge is much, much simpler than on image data. The Computational Vision system already provided enough information to let a C.I. system know who is in the room and how things are moving around. This knowledge can now be run through another pass to push forward into new frontiers of awareness for the system.

If a C.I. system records its memories in this manner, it would take very little real memory. The database, of course, would hog a bit by encoding all the models of polygons and whatnot, as well as textures and attributes that are common to the models. But the actual scenes use virtually no memory, which means that encoding video produces minuscule resources to be stored away.

Reconstructing full imagery (imagining the scenes) from these semantic memories is reasonably easy. All the system must do is follow the semantic encoding, fetch the referenced objects, and apply the semantic attributes and relations (such as movement or contact, distortion, etc..), and a full 3D scene becomes available to the computer imagination system. That scene can be rendered from various angles (including the original angle from which the scene was recorded). It can also be analyzed both from an image point-of-view and a semantic point-of-view.
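Reconstruction can be sketched the same way. In this toy version (the model_db and its fields are invented for illustration, not a real format), the system follows each reference, fetches the base model from the database, and lets the remembered attributes override the model's defaults:

```python
# A hypothetical model database; "color" here is the model's default.
model_db = {
    "car/thunderbird": {"geometry": "thunderbird.mesh", "color": "red"},
    "person/john": {"geometry": "john.mesh"},
}

def reconstruct(frame_objects):
    """Turn semantic references back into renderable scene nodes."""
    scene = []
    for local_id, (model_id, attributes) in frame_objects.items():
        node = dict(model_db[model_id])   # fetch the base model
        node.update(attributes)           # remembered attributes win
        node["id"] = local_id
        scene.append(node)
    return scene

scene = reconstruct({"o2": ("car/thunderbird",
                            {"color": "blue", "dented": True})})
print(scene[0]["color"])  # → blue (the memory overrides the model default)
```

Note that anything the frame never recorded (the default red, for instance) is silently filled in by the database, which is exactly the interpolation effect discussed below.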

Remember that we were talking about interpolation earlier? Well, it happens here also, but mostly it happens because the database is not complete, and so imperfect models will be encoded in scenes. Since the models are always improving (one would hope), old scenes remembered through the filter of new models may yield imprecise results. It would be advisable for model versioning to be present. It would also be advisable for some feedback encoding to occur both at original encode time as well as in later re-visualizations (e.g. take another look and see if the visualizations make sense or can be factored into better models).

Using such a system for compression alone would have very interesting consequences. Every re-run of a movie would be different (depending on the interpolations and model refinements in the system). You could view a movie from multiple angles, you could alter the actors, you could change just about anything very very easily...

My predictions: (A) This technology will exist as soon as the computational horsepower to make it work is affordable and (B) It will become the de facto standard for computer vision and video compression.

But, what the heck do I know?

Friday, April 29, 2005

Hierarchical Encyclopedia - Recursive 10 percentile knowledge

I was thinking today about how disorganized information is, not just from the standpoint of finding information but more specifically, from the point of organizing information around its dependent information. At any rate, this got me wandering and eventually landed me on an interesting idea.

The idea, quite simply, is a website themed around 10 percentile knowledge, recursively... Let me explain: when you first hit the site, it would be much like Wikipedia, except all articles would have an organization based upon the community that submitted them. And that community would have self-organizing expertise that emerges from a process I call 10 percentile knowledge. In short:

Every article will have associated with it a set of questions and answers. Both the article and the questions and answers are modifiable (in wiki fashion) by anyone at 10 percentile expertise level (or higher) relative to the knowledge area that the article is contained within. In order to achieve the accreditation required to edit within a knowledge area, therefore, a person must be able to achieve at least a 90% average score on questions relevant to that knowledge area.

Taking this further, newly submitted articles may be forced into a more general (common) knowledge area if more than 90% of their questions are answerable within their current context, or may move to a more specific (expert) knowledge pool if the questions are more difficult.
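As a sketch, the two rules above are tiny. This toy version (the thresholds come from the 90% figures in the text; the function names are mine) shows the accreditation check and the article-placement rule:

```python
EDIT_THRESHOLD = 0.90   # the 90% average score required to edit an area
EASY_THRESHOLD = 0.90   # the ">90% of questions answerable" placement rule

def can_edit(scores):
    """A user may edit an area if their average question score is >= 90%."""
    return sum(scores) / len(scores) >= EDIT_THRESHOLD

def placement(fraction_answerable):
    """An article whose questions are too easy for its current area should
    move to a more general one; otherwise it stays put."""
    return "more_general" if fraction_answerable > EASY_THRESHOLD else "stay"

print(can_edit([0.95, 0.92, 0.88]))  # → True  (average is about 0.917)
print(placement(0.95))               # → more_general
```

The recursive part of the scheme would come from applying these same two rules at every level of the knowledge hierarchy as areas branch and merge.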

Similarly, as the system evolves, individuals may find their accreditations changing either higher or lower depending on how knowledge areas mature. And of course, knowledge categories may branch off or merge as the system evolves as well.

The primary focus of such a system is that all participants are both teachers and students and that all participants also actively engage in the shaping of the system itself (although this may be mostly invisible to the user).

Such a system would have remarkable characteristics. It would be self-organizing, and it would naturally develop checks and balances in its own organization, as users with more expertise would have more power over their subject areas than those with less expertise. This would prevent the raging knowledge battles found on sites such as Wikipedia, where university professors battle over territory with high-school students. The university professors would naturally have more advanced knowledge and would therefore have knowledge areas/realms so high up in the recursive 10 percentile expertise metrics that they were the only ones who could edit the material. And for lesser material, their editorial rights could be more profound.

Such a system would allow users to very quickly seek knowledge areas that matched their own personal levels of expertise, providing them with a direct path to learn from the point that they are currently at.

Such a system would allow extremely advanced knowledge to be imparted in a manner that it could be directly peer-reviewed, studied and critiqued by other experts in their field who are at their own level of expertise.

More on this later... I am not sure I laid out the organizational dynamics that I have in my mind very eloquently in this write-up, but there are some very simple game-theoryish dynamic rules that would allow this system to self organize in the way I am describing. I'll try to find time to lay those out in some form soon.

Wednesday, April 27, 2005

Bonjour - Such Potential!

Apple Computer's Bonjour technology (aka Rendezvous, aka Zero-Conf, aka Multicast DNS) is a very interesting way for things to magically happen. It gives us printer configuration without the mess; it gives us iChat with the people around us and websites that are on our network. But there is something more that it can provide that I have been speculating on (well, I have gone further, but that is mostly irrelevant).

One of the most dreadful things about the Internet is the distance it injects into our lives, between us and the people around us.  But does the Internet have to do that to us?  Must we become more isolated from our surroundings and community simply because we are connected?

The short answer today is yes. Today's Internet forces us to find and interact at extraordinary distances (in many senses). While I could pontificate on that at great length, I will cut to the chase and postulate that the primary problem is in how we discover and interact on the net.

Is there a solution? Is there a way to make it more interesting and accessible to interact with people and things close to us, or maybe close to our mindsets? I think there is, and Bonjour holds the promise of the key.

Here is the basic idea: Bonjour, in its untampered-with state, is only good for local networks. The problem is that Multicast DNS is not routable, and even if it were, people behind firewalls could not take advantage of each other's services. So Bonjour doesn't really give you the chance to discover and interact with your neighbor, or the local deli, or even with the people who compulsively read the same type of information on the web. But the raw Multicast DNS technology exhibits discovery characteristics that, with the right augmentation, should be capable of making that happen.

In order to get there from here, first we must imagine a way to tunnel both Bonjour discovery (multicast DNS) messages and connections to application services. This is necessary because we must imagine connecting applications on different networks together, such that a client could discover and interact with services on relevant non-local networks.

Two examples of relevant networks come immediately to mind. The first is the geographically close network and the second is the contextually close network.

* Bonjour tunnels to geographically close networks

* Bonjour tunnels to contextually close/relevant networks

Let's imagine the second case first. There is an extremely simple example that can illustrate the power of such an idea: tunneling Bonjour discovery services through a proxy that follows your focus in a web browser. Here is a summary of how that might work:

A proxy running on the client examines the URL that the web browser of focus (the one the user has in front) is pointing to. It then attempts to open a channel to that server to tunnel Bonjour services through. If the server is equipped with the appropriate software, the client can then publish and discover Bonjour services through that remote server.
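A very rough sketch of that loop might look like the following. Everything here is a stand-in: the stream of focused URLs and the Tunnel class are placeholders for real browser integration and a real mDNS relay, neither of which is shown.

```python
from urllib.parse import urlparse

class Tunnel:
    """Placeholder for a channel relaying Bonjour/mDNS traffic to a host."""
    def __init__(self, host):
        self.host = host

def follow_focus(focused_urls):
    """Re-point the tunnel whenever the focused page's host changes."""
    tunnel = None
    for url in focused_urls:          # stream of URLs the browser focuses on
        host = urlparse(url).hostname
        if tunnel is None or tunnel.host != host:
            tunnel = Tunnel(host)     # a real proxy would tear down / re-dial here
    return tunnel

t = follow_focus(["http://example.com/a",
                  "http://example.com/b",
                  "http://deli.local/menu"])
print(t.host)  # → deli.local
```

The interesting design question, glossed over here, is debouncing: you would not want to re-dial the tunnel on every momentary focus change.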

Immediately, several interesting things fall out of this. Your iChat would show, in the Rendezvous section, all of the users who are looking at the same webpage, as well as all of their bookmarks, their websites, and other services (as appropriate, considering publishing rules and whatnot). So you have an ad-hoc community based on the user's context of interest. This brings people together, and that is powerful.

Going from contextual to geographic closeness is somewhat more complicated, but potentially even more valuable. The way that may work could, perhaps, be something like the following:

First, someone must run a geographic Bonjour tunneling tracker. Such a piece of software would be responsible for registering clients that come online and offline at geographic locations all over the world. It is possible to conceive of such a system as being hierarchical and scalable, and perhaps even existing within the context of mesh networking, but details aside, let's continue the thought...

When a client comes online (like your computer), it would first send a message to a known server (the tracker from above) and receive the IP address of a local tracker. Essentially, the primary tracker would resolve the client's IP address to a geographic location and send back a list of other clients/servers near that location to contact. Let us assume that any client may also act as a server (I am thinking BitTorrent as I say this).

There is an iterative set of steps that would occur where the client bounces around attempting to resolve where it is and who its nearest neighbors are. This process comes out of graph theory, and the intended result is for the client to build a graph of some number of nearest neighbors (other clients that exist geographically close to the client in question).

Once such a graph has been resolved, it becomes possible to begin shared tunneling of Bonjour services with nearest neighbors. The number and type of services tunneled is a set of variables that must be algorithmically controlled. One can imagine an interface which specifies how geographically distant discovery services should look, as well as filters to home in on specific types of services and/or the subject matter those services deal with. Such specificity is likely to be interdependent: we do not want to wash out a client with zillions of services, so as distance increases, the focal context of interest must necessarily narrow, and vice versa.
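The neighbor-selection step can be sketched simply, ignoring real geolocation and the iterative refinement: given candidate peers from a tracker, keep the k nearest. The peer names and coordinates here are made up for illustration.

```python
import math

def nearest_neighbors(me, peers, k=3):
    """Return the names of the k peers closest to `me`. Coordinates are
    treated as a flat plane for illustration; a real system would use
    proper geographic distance."""
    def dist(item):
        x, y = item[1]
        return math.hypot(x - me[0], y - me[1])
    return [name for name, _ in sorted(peers.items(), key=dist)[:k]]

# Hypothetical peers returned by a tracker, with rough coordinates.
peers = {"a": (0.1, 0.1), "b": (5.0, 5.0), "c": (0.2, 0.0), "d": (9.0, 1.0)}
print(nearest_neighbors((0.0, 0.0), peers, k=2))  # → ['a', 'c']
```

In the real system, each round of this selection would feed back into the next query (ask your nearest known neighbors for *their* neighbors), which is the iterative graph-building described above.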

What can we do with it? We can find people near us to chat with. We can look at websites in our neighborhood. We can see what the dinner specials are at restaurants in our city. It brings us closer together.

And these things, I believe, are important.

So what can we do to make it possible? Well, I have written code to do Bonjour tunneling... It more or less works... But I am lazy. I guess, more on this later...

What is Holoradix?

Holo-radix is simply the combination of the Greek root holo (complete, whole, entire) and the Latin radix (root or source). The word literally means the full root or full history. Think of holoradix as the holistic construction of a thing, whether that thing is tangible or intangible (an object vs. a thought, for instance).

When one contemplates the holoradix of a thing1, one contemplates all things2 that combined to make thing1 exist and all things3 that combined to make things2 exist, ad infinitum.. While it is impossible to fully comprehend the holoradix of a thing, the act of attempting to do so can be very enlightening...

Okay, so an interesting word... A word that represents a perspective, my perspective... And from that perspective emerges this blog. It is not so much a decomposition of my reality as you might guess from the word; it is just my journal of thoughts, of ideas, of observations. Probably, it is just for me. But if you enjoy it, all the better...
