Tuesday, January 31, 2012

My question about GPS UI

My question here concerns the interface design of the Global Positioning System (GPS). As we may be aware, a GPS device usually provides three features: 1. audio instructions broadcast just before a direction change; 2. textual information showing the instructions; 3. a continuously updated map occupying a large portion of the display. Based on Baddeley’s (1986) working memory model, and as Wickens et al. (1983, 1984) argued before, the display format should fit the working-memory subsystems used to perform the task. This implication triggers my interest in two coarsely defined inquiries below:

  • Which feature is most helpful for people driving in an unfamiliar area?
  • If that feature is found, will its effect be improved or diminished by adding additional feature(s)?
What people hear from the audio will be encoded in the phonological store, and so will the textual information presented to drivers; at the same time, the spatial information from the map will go to the visuospatial sketchpad. The driving route may update at varied intervals depending on the characteristics of the area, so people have to receive new information from time to time while still following the old instructions correctly. Although the verbal information will eventually reach the visuospatial sketchpad, the central executive needs to transfer, synthesize, and analyze it. Those activities add cognitive load, which constantly competes for the available attentional resources. In that sense, from the perspective of reducing cognitive load while satisfying working memory’s needs, we may just need the map, or a combination of two features; perhaps the presence of a certain feature will even undermine the benefits of the other feature(s).


Another aspect of this problem pertains to human information processing, specifically attention. In a complex environment like driving, we are watching the traffic, listening to the radio or the GPS instructions, controlling the speed, and probably talking on the phone. Thus, we are already drawing on three perceptual channels of attention, namely sight, touch, and hearing. Furthermore, when we discuss the concept of selective attention, it is most likely related to visual perception, and it is basically the product of four factors: salience, effort, expectancy, and value (Wickens et al., 2003).


To explain each of them in order: first, by “salience” we mean that some unique feature may attract you right away from other, less salient objects, for instance, a horse running on the highway among cars; typically, an auditory signal is more attention-grabbing than a visual one, which is why we often take action upon hearing an alarm. If the sound were replaced by a flashing LED light, we can imagine how long it would take us to notice a potentially disastrous event. The second variable is expectancy, defined as knowledge of the probable time and location at which information becomes available. Imagine that you are driving in the mountains with lots of sharp turns: you should keep an eye on the curvy road more continuously when you are traveling fast; on the other hand, if you are just cruising around Chicago on Lake Shore Drive, most of the road is straight, so you can relax a little and attend to your music more than the road. The third factor is the value of the information.


Let us illustrate the concept with the same driving example: it makes sense for people to look forward most of the time, because you want to be quick to act whenever you see something happening. With the second and third factors introduced, there is actually an interaction between them: if we multiply expectancy by value, we obtain a better function of selective attention, which can be made concrete in this way: an experienced air traffic controller at a certain airport knows where to scan most, and when. Last but not least, effort. As we know, most people can only pay attention to one thing at a time; if you focus on the mirror trying to change lanes, you probably cannot share that effort with looking ahead, because it simply cannot be achieved by the majority of human beings. Hence, effort appears with a negative sign, as allocating attention to multiple places may degrade performance as a result.


To summarize the message about selective attention, researchers often refer to the following model:

P(A) = s·S − ef·EF + ex·EX × v·V (Wickens et al., 2003)
(S is salience, EF effort, EX expectancy, and V value; the lowercase prefixes s, ef, ex, and v are the weighting coefficients on each term)
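As a toy sketch of how this model could be applied to our GPS question, one could score competing features like this (note that the coefficients and factor scores below are made up purely for illustration, not taken from Wickens):

```python
def attention_score(S, EF, EX, V, s=1.0, ef=1.0, ex=1.0, v=1.0):
    """P(A) = s*S - ef*EF + (ex*EX) * (v*V): salience adds to the
    probability of attending, effort subtracts, and expectancy is
    multiplied by value, following the formula above."""
    return s * S - ef * EF + (ex * EX) * (v * V)

# Hypothetical scores on a 0-1 scale: audio is salient and low-effort,
# while on-screen text is less salient and costlier to read while driving.
audio = attention_score(S=0.9, EF=0.2, EX=0.5, V=0.8)
text  = attention_score(S=0.4, EF=0.6, EX=0.5, V=0.8)
print(audio, text)  # under these made-up numbers, audio scores higher
```

Of course, real factor scores would have to come from an empirical study; the point is only that the model turns a qualitative argument into something comparable.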

Applying this model to our question of interest: the GPS can produce audio instructions that are, in theory, more salient than visual ones; meanwhile, the effort of receiving auditory information is usually lower than that of reading information from the screen, even though the GPS is attached to the windshield. However, this effort advantage can reach its margin in a complex area such as a six-way intersection. In that scenario, visual feedback is better, because otherwise our central executive has to consume more attentional resources: the textual information needs to be rehearsed by the articulatory loop and then converted to visual data for further processing. As for value and expectancy, they vary with the route and are thus hard to predict. Collectively speaking, the problem domain is very interesting to look into, and we may even resort to AI to help us determine which method is more effective and safe.


To address those two questions, an experiment would be required. In addition, a set of independent and dependent variables should be defined, along with reasonable measurements. Without going into too much detail, I think the independent variable would be categorical (e.g., audio, text, and map), and the dependent variable could be “helpfulness,” gauged by “time to drive to the destination,” “level of satisfaction,” and “fuel consumed,” with each element carrying a different weight. Out of concern for safety, the study could also be conducted virtually in a lab, given that many driving simulators are available; additionally, we can control possible confounding factors relatively more easily under lab conditions. In terms of the analysis method, a one-way ANOVA would be appropriate, because there are three types of treatment and we aim to find out whether there are real differences among them (statistics such as the F-value need to be reported).
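To make the analysis step concrete, here is a minimal sketch of the one-way ANOVA F-statistic computed by hand in Python; the three groups of “time to destination” values are entirely fabricated for illustration:

```python
def one_way_anova_F(*groups):
    """F = between-group mean square / within-group mean square."""
    all_vals = [x for g in groups for x in g]
    grand = sum(all_vals) / len(all_vals)
    k, N = len(groups), len(all_vals)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (N - k))

# Hypothetical minutes-to-destination under the three interface conditions
audio = [12.1, 11.8, 13.0, 12.5]
text  = [14.2, 13.9, 14.8, 14.1]
map_  = [11.0, 11.5, 10.8, 11.2]
F = one_way_anova_F(audio, text, map_)
print(round(F, 2))  # a large F suggests real differences among treatments
```

In practice one would compare F against the critical value for (k−1, N−k) degrees of freedom, or simply report the p-value from a statistics package.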


Reference:

Baddeley, A. D. (1986). Working memory. New York: Oxford University Press.

Baddeley, A. D. (1992). Working memory. Science, 255, 556-559.

Wickens, C. D., and Carswell, C. M. (2006). Information Processing. In Salvendy, G. (Ed.), Handbook of Human Factors and Ergonomics, 3rd Edition, 111-149, Hoboken, NJ: Wiley.

Wickens, C. (1980). The structure of attentional resource. In R. S. Nickerson (ed.), Attention and Performance VIII, 239-257, Hillsdale, NJ: Lawrence Erlbaum.

Wickens, C. D., & Vidulich, M. (1982). S-C-R Compatibility and Dual Task Performance in Two Complex Information Processing Tasks: Threat Evaluation and Fault Diagnosis. Ft. Belvoir, VA: Defense Technical Information Center.

(© 2012 Miaoqi Zhu)

Sunday, January 29, 2012

From Quantum Computing to HCI

What is quantum computing?

At this point in time, we still rely on digital computing, which basically runs on one simple principle: “the bit is on, or the bit is off.” These bits “grow” into larger forms and structures such as words and phrases, which can in turn represent other types of information like sounds and animations. Although scientists and engineers are putting every possible effort into improving the computational density of a regular personal computer, it still cannot match a human brain. You may ask why.

The human brain has about 100 billion neurons. With an estimated average of 1000 connections between each neuron and its neighbors, we have nearly 100 trillion connections, each capable of a simultaneous calculation. That's massive parallel processing capability, which is one of the most critical strengths of human thinking.

On the other hand, the weakness is the slow pace of human neural circuitry: each connection computes only about 200 calculations per second. However, with 100 trillion connections each computing at 200 calculations per second, we get an unbelievable capacity. Moreover, we do not need to worry about running out of memory: as long as we are willing to spend time rehearsing information, it will move into long-term memory, since we have around 100 trillion connections to store it in.

Since neural-net emulations benefit from both strands of the acceleration of computational power, this capacity will double every twelve months. Thus by the year 2020 it will have doubled about twenty-three times, resulting in a speed of about 20 million billion neural-connection calculations per second, which is equal to the human brain’s. Besides the help from parallel processing, memory is important as well; nowadays, memory circuits are doubling their capacity every 18 months.
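The arithmetic behind these figures is easy to check in a few lines (this is a sketch of Kurzweil's back-of-the-envelope estimate, not a precise neuroscience claim):

```python
# ~100 billion neurons, each with ~1000 connections to its neighbors
connections = 100 * 10**9 * 1000       # 100 trillion connections
rate = 200                             # calculations per second per connection
brain = connections * rate
print(brain)                           # 2e16, i.e. "20 million billion" per second

# Doubling every twelve months, 23 doublings multiply machine capacity by:
print(2 ** 23)                         # 8388608, roughly an 8.4-million-fold increase
```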

Collectively speaking, as implied by the Law of Accelerating Returns, the exponential growth of computing suggests that shortly in the future, machinery’s thinking power will surpass ours. (I refer to a regular personal computer, not a supercomputer.)

Unlike digital computing, quantum computing has “qubits,” which are zero and one at the same time. The state of a particle stays “ambivalent” until a process of disambiguation forces the particle to choose which position to stand in. For example, suppose a stream of photons is about to hit a lake surface at 45 degrees. Each individual photon has to determine whether to bounce off the surface at the same angle or “dive” into the water. In other words, those photons stay “lazy” until the process pushes them to commit to a single path. The same principles apply to carbon-based neurons.

A series of qubits represents simultaneously every possible solution to the problem. For instance, a single qubit represents two possible solutions; two linked qubits represent four possible answers. A quantum computer with 1,000 qubits represents 2^1000 (approximately a decimal number consisting of a 1 followed by 301 zeroes) possible solutions at the same time.
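Python’s arbitrary-precision integers make it easy to check the size of 2^1000:

```python
n = 2 ** 1000          # exact big-integer arithmetic, no overflow
print(len(str(n)))     # 302 -- so 2**1000 is a 302-digit number,
                       # on the order of a 1 followed by 301 zeroes
```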

Yes, it is fast: it is like a genius who can handle a complex math problem by evaluating every possible combination at once. In that sense, it is a good tool for breaking encryption. Dr. Nicolas Gisin found that the “communication” speed between an entangled photon pair is far faster than the speed of light. But the photons are not communicating information; instead, they are sharing a sort of randomness. If we could convert that randomness into information, we might be able to disentangle the photon pair. Nevertheless, this requires a lot of observations of those photons’ decisions.


Do human beings have quantum computing inside brains?

Gödel's famous incompleteness theorem has been considered one of the most important theorems in mathematics. A corollary of Gödel's theorem is that there are mathematical propositions that cannot be decided by an algorithm. In essence, these Gödelian impossible problems would require an infinite number of steps to solve. Dr. Roger Penrose conjectures that machines cannot do what humans can do because machines can only follow an algorithm, and an algorithm cannot solve a Gödelian unsolvable problem, but humans can; hence, in this regard, humans are more advanced. Penrose also suggests that human consciousness arises from quantum computing and quantum decoherence.

Well, there is something inaccurate about Penrose’s conjecture: first, human beings can only estimate, yet estimation is not equivalent to actually solving the problem; second, if we do possess quantum computing capability, we do not yet have it outside ourselves, so how do we know a machine cannot solve Gödelian impossible problems? Plus, if a computing task requires an incredible amount of calculation, quantum computing may still fail under certain circumstances.

But if a human brain indeed displays quantum computing ability, that means the technique is feasible, so we need to dig out the mechanism that nurtures this special trait and hopefully “feed” it into another piece of hardware. That is why scientists are busy inventing high-resolution scanning technologies such as Magnetic Resonance Imaging.

Collectively speaking, we do not know much about our brain; not even a little. We carry such a huge potential that we hardly dare to imagine it; that being said, conducting “reverse engineering” on our brain is a possible means to understand those God-given abilities and apply them to machinery.


What is the implication for HCI?

Will quantum computing change the way we write code? Will it change the way we design interfaces? Will we still have browsers? …

On the other hand, what if a machine can have emotion and personality by then? It may have “a user experience” based on the feedback that humans give. Perhaps machines will be intelligent enough to figure out whether a design conveys good usability for humans, as they will have perception then… Lol, those are just some random thoughts.

The thing is that quantum computers may only be used to tackle certain kinds of intractable problems; the power they carry does not apply across the board. BUT what I am more interested in is this: during the process of studying our brain (partially driven by quantum computing), how will the outcomes affect the way we deploy more or less profound design mechanisms to take full advantage of human thinking capability? Perhaps by the time we get this thing running, our neuro-cognitive structure will have changed due to computational development rather than biological evolution. I mean, think about it: interacting with an agent that is far more intelligent than our brains, would we become better fitted by not thinking about hard questions from then on? I don’t know.


Reference:

Kurzweil, R. (1999). The age of spiritual machines: When computers exceed human intelligence. New York: Viking.