Wednesday, March 28, 2012

Wrapping up things related to working memory


The new definition of working memory
In their seminal work on individual differences in working memory capacity, Just and Carpenter (1992) redefine the concept by borrowing from a computational theory. They posit that both information storage and processing are supported by the very same property, named “activation”.

When we take in new information from computations, each element is associated with an activation level. An element can be a word, a sentence, or a physical object such as a cat.

During comprehension, only elements whose activation level exceeds a threshold value can enter working memory; in other words, not all candidates can be further processed by the brain, even if they are successfully retrieved from long-term memory.

However, people differ from each other with respect to their working memory capacity, which in this theory is seen as the total amount of activation that the system can sustain.

Therefore, if at any moment the sum is about to exceed the system's limit, certain chunks of old information must be de-allocated in order to accommodate incoming data and computations.
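To make the capacity idea concrete, here is a toy sketch in Python; the threshold, the capacity value, and the elements are my own invented numbers, not values from Just and Carpenter:

```python
# Toy sketch of the capacity idea (my own simplification): each element
# carries an activation level; only elements above a threshold enter
# working memory, and total activation is capped by capacity.

THRESHOLD = 0.5   # hypothetical minimum activation to enter working memory
CAPACITY = 3.0    # hypothetical total activation the system can sustain

def admit(elements, threshold=THRESHOLD, capacity=CAPACITY):
    """Admit (name, activation) pairs into working memory, de-allocating
    the oldest chunks when total activation would exceed capacity."""
    memory = []
    total = 0.0
    for name, activation in elements:
        if activation < threshold:
            continue              # never enters working memory
        memory.append((name, activation))
        total += activation
        while total > capacity:   # over the limit: drop the oldest chunk
            _, old_act = memory.pop(0)
            total -= old_act
    return memory

held = admit([("cat", 0.9), ("sentence", 0.3), ("word", 1.2),
              ("clause", 1.1), ("object", 0.8)])
print(held)   # [('clause', 1.1), ('object', 0.8)]
```

Here "sentence" never clears the threshold, and "cat" and "word" are de-allocated once later elements push the total over the cap, which mirrors the de-allocation described above.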

As one of the fundamental concepts in cognitive science, activation often appears in other literature as well.

Complementary learning systems
The hippocampal (HC) system is active in receiving information and temporarily storing it, which means that novel material cannot be truly learned immediately.

The neocortical systems incorporate the new information gradually, as any quick absorption could be detrimental to existing structures of knowledge.

The novel input needs to go through various consolidation stages to gain membership in the neocortical systems. Appropriate external events can serve as good opportunities and will help adjust neocortical connections.

Usually, the incorporation of unfamiliar material is slow, especially for arbitrary and idiosyncratic material. Thus, as memory traces degrade over time, it is possible to lose them before they can be built into the shared structures of the neocortical systems.

The complementary learning systems are more concerned with information storage and maintenance than with information processing.

However, the weight changes of neural connections may also be tied to “activation”: without appropriate activation, the connections are neither reconstructed nor established.

Although McClelland (McClelland, McNaughton, & O'Reilly, 1995) does not explicitly discuss how the incorporation rate relates to working memory capacity, the concept of “learning rate” may be a potential factor affected by the level of activation.
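The contrast in incorporation speed can be caricatured numerically; the learning rates below are my own illustrative assumptions, not parameters from the paper:

```python
# Toy contrast between a fast hippocampal learner and a slow neocortical
# learner, each nudging a connection weight toward a new target value.
# Rates and step counts are invented for illustration.

def learn(weight, target, rate, steps):
    """Move a connection weight toward `target` by fraction `rate` per step."""
    for _ in range(steps):
        weight += rate * (target - weight)
    return weight

hc_weight     = learn(0.0, 1.0, rate=0.5, steps=3)   # fast: close to target
cortex_weight = learn(0.0, 1.0, rate=0.05, steps=3)  # slow: barely moved
print(round(hc_weight, 3), round(cortex_weight, 3))  # 0.875 0.143
```

After the same three exposures, the fast learner has nearly absorbed the new material while the slow one has hardly moved, which is the asymmetry the complementary-systems story relies on.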

Metaphor
Metaphor is fairly common in linguistic practice: when a person wants to express a certain idea through an utterance, the hearer can often detect the intended meaning rather than the literal meaning of the sentence.

If “S is P” actually refers to “S is R”, the comparison theory attempts to address the question of how we can compute the potential values of R; the interaction theory, on the other hand, tries to look into how the range of R’s values is restricted by the relationship between S and P.

To make this communication possible, the speaker and hearer must have something in common, such as the principles for interpreting utterances.

Searle provides eight principles and suggests the methods by which the utterance of the P term calls to mind the meaning of the R term in ways pertaining to metaphor. For example, one principle concerns human sensibility, and it is applicable to people from multiple cultures, as it is naturally determined.

From the perspective of working memory capacity, the task of locating possible links between S and P, or of narrowing the scope of R values, relies heavily on the activation of each property.

I maintain that the task of interpreting metaphor is less likely to challenge us, in that there are normally three base elements to begin with, unless the range of R is quite large or the “overlapping” attributes of S and P are too many.

In fact, an interesting question here is: which principles consume more working memory resources? The seemingly more complex principle is not necessarily the one that requires a larger “brain”; it also depends on other constructs such as the hearer’s knowledge base in a specific domain.

Categorical Perception
The interplay between perceptual information and high-level knowledge of a particular object is another universal cognitive process. There are basically four rules that people use to group objects.

While prototypes allow us to bring in more attributes to formulate semi-stereotypical representations, exemplar models are stored stably in memory and appear in a highly abstracted form.

The processes of establishing prototypes and exemplars differ in light of McClelland’s theory of complementary learning systems. Specifically, building a new prototype may take longer than modeling an exemplar, as the former has to undergo a systematic procedure to fit within existing structures shaped by other surviving prototypes.

From the standpoint of working memory, capacity should predict the efficiency of categorizing a perceived object well.

Nevertheless, it may not directly determine how soon the new token will reach the neocortical systems.

In the context of metaphor, in addition to referencing currently stored categories for the purpose of producing the metaphorical meanings of an utterance, we can also use the outcomes to re-categorize an object.

Cognitive Breakdowns
Cognitive breakdowns appear in multiple forms: for example, loss of balance, difficulty focusing, or lack of reasoning capabilities.

Working memory is one major component affected; for instance, some patients need to constantly re-chunk information in order to continue searching for the answer to a simple inquiry.

If the brain faces too much cognitive load, certain parts of it will cease functioning. From that perspective, wise use of limited working memory resources is recommended for the subjects concerned.

The Representation of Personality in the Affective Reasoner
The situation is presented in frames, and frame matching demands sufficient resources from working memory.

The computations on situation frames are supposed to generate emotion. Working memory is crucial for instantiating this intensive process on left hemisphere (LHS).

An emotion becomes available only if it is over an intensity threshold, which is determined by bindings on the LHS. Working memory contributes to the binding process.

Some Implications for Design
First, designers should realize that users differ in their ability to store and process information; thus they should not be self-centered as the solution provider.

Second, designers must avoid solutions that are too novel; although such a solution may surprise users with its creativity, it may disrupt existing usage patterns that users hold for other tasks at that moment.

Third, providing a “buffer zone” for chunking and re-chunking is recommended, as is “recognition rather than recall”. Methods such as these reduce users’ cognitive load, thus decreasing the chances of overloading working memory.

Fourth, metaphor is equally important, in that it makes it relatively easier for users to figure out the underlying meaning of design elements such as a button or a gesture.

Furthermore, the call for “universal design” applies not only to people with physical disabilities but also to patients with brain damage at various levels.

Last but not least, if by definition the title of “User Experience” caters to users’ emotional responses, designers are expected to understand a little about the principles of emotion generation.
 
(© 2012 Miaoqi Zhu)

Tuesday, March 6, 2012

What if?

I just had this "crazy" thought:

What if each neuron represented a planet, and each neural connection a galaxy? We may just live in our brains...

One of my friends responded with his own understanding. He said, "we may live in other beings' brains." However, if I see "you", you are being perceived by me; in that sense, it is not implausible to say that you are just "living" in the world of my brain.

The question may need to be answered by philosophers, and it is difficult to prove or disprove anything by science, although it could be possible someday. 

(© 2012 Miaoqi Zhu)

Saturday, March 3, 2012

Complementary Learning Systems, Dreams


The motivation for this post stems from my recent reading of McClelland's complementary learning systems paper on the hippocampus (HC) and neocortex. From a certain perspective, this paper is also a kind of survey. In particular, the authors smartly put relevant theories together to demonstrate their own idea, which is: "the HC provides training trials, allowing the cortical system to select representations for itself through interleaved learning." The paper, of course, starts with two key questions of interest:
  1. Why is the HC system needed? If the neocortical system has more neurons, what do we need the HC for?
  2. Why do the changes in neocortical connections take so long? In other words, can new material be fully absorbed rapidly?
It looks like the HC system is responsible for storing recent memory, while the neocortex holds remote memory. To avoid interference with the knowledge stored in the neocortex, the HC helps accommodate the initial storage from a new learning event; meanwhile, so as not to disturb the existing structure of knowledge, changes within the neocortex should be made slowly. There are certainly “communications” between the HC and the neocortex; otherwise, no information would be transferred and no changes would be made.

Marr (1971) proposed that the HC system stores experience during the day and replays the memories in the HC back to the neocortex at night. Does it imply that brain activities at night, for example dreams, are caused by consolidation (or the opposite)?

A researcher in this domain indicates that dreams can be emotional, as we replay old memories and update them with novel information from recent events. Relating this to personal experience, I agree, but I have become more interested in the functionality of the verb “update”. Specifically, why does the brain need to update, and in which sub-systems does it happen?

So a ROUGH idea here: if dreams happen partially owing to the connection changes (learning) in the neocortical system driven by the HC’s input, let us contemplate some questions below:
  1. Is it because too many computations are requested by day-time tasks, so we had better carry this out at night?
  2. Why should it appear in the form of visual imagery? Does it facilitate the learning process?
  3. Why can't we dream every single night? If dreaming stops, does it mean dreams are not caused by learning in the HC and neocortical systems? Perhaps dreaming causes learning, not the opposite.
  4. The memory trace in the HC will decay; if we have an exceptional set of material, will it take higher priority to be incorporated into the existing knowledge structure during this precious period of dreaming?
  5. What is the implication for HCI? Could we design novel interfaces that “encourage” people to dream, so users learn them faster? This question may be addressed once we know exactly why people dream.

Reference:

Web article: http://www.livestrong.com/article/78256-parts-brain-produce-dreams/

Marr, D. (1971). Simple memory: A theory for archicortex. The Philosophical Transactions of the Royal Society of London, 262(Series B), 23–81.

McClelland, J. L., McNaughton, B. L., & O'Reilly, R. C. (January 01, 1995). Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychological Review, 102, 3, 419-57.

(© 2012 Miaoqi Zhu)

Monday, February 20, 2012

Activation and direct emotion generation


If by definition the title of “User Experience” indicates catering to users' emotional responses, “activation” can help in understanding the principles of emotion generation. In his ground-breaking work on affective reasoning, Elliott (1992) summarizes emotion generation based on his previous studies. Taking direct emotions as an illustration: when an agent perceives a potential relevance to a situation, frames including the goals, standards, and preferences (GSP) are matched against the eliciting situation frame. If a match is established, the situation officially becomes the agent’s concern in the form of construals. Working memory determines the matching success rate, in that the agent must retrieve information for the GSP. Bindings take place in the left hemisphere (LHS) when the situation frame slots marry the variables in the slots of the construal frame from the previous phase; then an emotion eliciting condition relation (EECR) is created for each construal of that situation. Because there may be many interpretations of a situation, there are multiple EECRs to be confirmed individually and gathered together by the involved agent. In the following step, compound-emotion EECRs are formed from the separation and recombination of the event-based construals and the attribution-based construals. The prospect results work with the domain-independent rules to generate an emotion instance such as hope. From my understanding, a basic underlying activity is activation: for instance, the matching process depends on working memory, and we covered the meaning of activation for working memory earlier. The same story may apply to binding as well, because there are intensive computations to analyze and integrate frame slots from two different sources.

For instance, if a scenario tells us that a group of users is likely to be frustrated under a certain circumstance, what can a designer do to make users feel more at ease emotionally? We can manipulate some variables to suppress activation in working memory. Imagine a busy mom standing in front of an airline kiosk with two playful little boys; she may have already developed a prospect-based emotion of fear, as she can construe the situation based on existing frame slots gained from recent experience in relation to her goals, standards, and preferences. In this case, designers can either employ a quick interface walk-through demo as a light tutorial while users are still in waiting areas, or adjust environmental variables such as lighting and private space, as long as those measures help diminish the chance of activation.



Figure 2: the mapping structure from situation to emotion (Elliot 1992)

I agree that the above theoretical framework is hard to validate, but it is still a good model to consider. As HCI practitioners, we do not have to be "addicted" to a particular theory, or over-criticize it. A well-shaped scientific mind is good for understanding design solutions as inputs to the human brain.

Reference:
Elliott, C. D. (1992). The affective reasoner: A process model of emotions in a multi-agent system. Evanston, IL: Northwestern University.

(© 2012 Miaoqi Zhu)

Friday, February 10, 2012

When the reality tells me the "feeling" is wrong


I was on the north-bound train, almost approaching my destination, when an idea suddenly came to my mind: why did I feel the train was heading in a different direction this time compared to when I boarded?

There is no doubt that the train was heading north; otherwise, I would never have gotten home. With the assumption in mind that the train was right, I began recalling the scenario in which I boarded at Adams/Wabash. This does not sound like a very difficult task; however, a few things make it more complicated than expected:
 
  1. The stop where I departed is underground, so there are few clues for me to get a sense of where the train is going. In other words, the sort of survey knowledge I established before cannot take effect here, at least for me;
  2. The entry to a particular station: specifically, for stations with a centered platform, there are typically two entries located on the two sides. If a passenger going north enters from the north gate, then, since there are two floors, he or she needs to turn around to catch the northbound train once reaching the platform, and vice versa;
  3. The seats face two directions: one aligned with the direction the train is traveling, the other opposite. That said, there could be another couple of mental rotations to carry out;
  4. The train's doors open on either side depending on the design of the station platform.

While the first factor is relatively independent, the other three in fact interact. Imagine you are heading north: you first enter the station from the south gate, which means you don't have to turn around; fortunately, when the train arrives, after boarding through the right-side door, you decide to use a seat on the right side that faces the train's direction of travel; in this case, you may find it less effortful to recognize the direction. Nevertheless, if you get into the train through the left door and decide to sit on the left side where the seats face backward, it might add extra work in terms of figuring out where you are going. If you start from an above-ground station, the puzzle could be even easier to resolve, because you have a number of visual clues to reference, such as landmarks.

Having listed the possible noises preventing me from feeling right about the direction, I continued recalling each scene, from entering the station to boarding the train. Since I am more of a visual thinker, I enjoy playing back every single memory frame. The first challenge was presented by the internal structure of the station; it is important in that the number of turns, along with their directions, co-determines my actual orientation in the world. I resolved this problem by putting myself in an imaginary blueprint of the station and re-experiencing the journey virtually in my head. Then the second challenge followed, because I had to map the spatial relationship between my seat and the door through which I got in. Be aware that each car has two doors on one side, which means that if I sat closer to the door where I boarded, it would be easier; but if I chose a seat further away, or if I didn't remember which door I entered, it would be another story.

Finally, I got the “feeling” corrected by reasoning about the spatial relationships among the objects I was able to recall. Further, playing back the scenario helped in structuring the space I was in before. A big question here is: why did I lose the feeling of direction in the first place? While the train was operating underground, the feeling was alright; yet once the train drove out of the tunnel, I became a little anxious, since my intuition told me that the feeling was incorrect. I was also surprised by how many chunks of information and how much attention I employed to get that feeling fixed, let alone the intensive computations occurring in my head. Every time I spared a little attention for other activities (e.g., talking to friends), I might need to restart the process, because a sub-process generates so much data that needs to be temporarily stored in working memory for the next thread of computation(s).


(© 2012 Miaoqi Zhu)

Sunday, February 5, 2012

Activation


When I was in the “flow” state of reading Anderson’s paper “ACT: A Simple Theory of Complex Cognition”, I could not help referring back to Just and Carpenter’s seminal work on the capacity theory of comprehension. I wonder whether there is a connection between the two theories, namely the “activation level”.

First of all, just a little recap of Anderson’s paper: he tries to understand the basic components and processing principles of our cognition. By studying how people write recursive programs, he claims that there are two kinds of elements: productions (procedural knowledge) and long-term chunks (declarative knowledge). He then further explains “Knowledge Acquisition” and “Knowledge Deployment”.

For “Knowledge Acquisition”, long-term chunks come from encoding the environment, and to transform those chunks, ACT-R looks for existing chunks to map onto. For “Knowledge Deployment”, the author answers the question of how humans select the most appropriate knowledge for a particular context. Based on a rational analysis that “knowledge is made available according to its odds of being used in a particular context, activation process implicitly performs a Bayesian inference in calculating these odds”, he derives a basic equation:


Activation_Level = Base_Level + Contextual_Priming

Anderson further illustrates this equation with three domains: memory, categorization, and problem solving. It looks to me that the way we interpret “contextual priming” differs slightly across these three domains. For instance, for memory, it is the association between chunk n and chunk m; for problem solving, it is the effect of the distance to the goal that participants set up.
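A minimal sketch of the equation in Python; the base levels and the association table here are invented for illustration, and none of these numbers come from Anderson:

```python
# Sketch of Activation_Level = Base_Level + Contextual_Priming.
# Base levels and association strengths are illustrative assumptions.

base_level = {"chunk_n": 1.0, "chunk_m": 0.2}

# Contextual priming: summed association strength from the currently
# attended context elements to the candidate chunk.
association = {
    ("context_a", "chunk_n"): 0.8,
    ("context_a", "chunk_m"): 0.1,
}

def activation(chunk, context):
    priming = sum(association.get((c, chunk), 0.0) for c in context)
    return base_level[chunk] + priming

# With "context_a" in focus, chunk_n is far more available than chunk_m.
print(activation("chunk_n", ["context_a"]))            # 1.8
print(round(activation("chunk_m", ["context_a"]), 2))  # 0.3
```

Swapping in a different context list changes only the priming term, which is the Bayesian-odds intuition the quote above describes: the base level reflects overall usefulness, and priming reflects the current context.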

Going back to Just and Carpenter’s paper: they “redefine” the concept of working memory by presenting a computational theory which suggests that both storage and processing are fueled by an identical property called “activation”. More specifically, each element (e.g., a word, a phrase, an object from the real world) carries an associated activation level. During a course of understanding, relevant chunks are activated either by a computation or from long-term memory; however, not all of them can enter working memory; only those that meet a certain minimum threshold gain permission. As long as the total amount of activation is within the system’s limit, we can process the information; but if the sum exceeds the limit, we need to de-allocate some old elements. It is not hard to get the idea if you picture a scenario in which you try to understand a difficult sentence with several embedded clauses. Some people with a large working memory capacity may understand it more quickly than those with a small capacity.

You may notice that activation level is a common thread in both works. If Anderson is right, can he help explain where the activation level comes from in Just and Carpenter’s theory? Just and Carpenter assess working memory capacity using the “reading span” task, while Anderson addresses his curiosity by studying people writing recursive programs, and his idea was published a few years later. I am just wondering: if we swapped their methodologies, would we still see this common thread?



Reference:

Anderson, J. R. (1996). ACT: A simple theory of complex cognition. American Psychologist, 51, 355-365.

Just, M. A., & Carpenter, P. A. (1992). A capacity theory of comprehension: Individual differences in working memory. Psychological Review, 99(1), 122-149.

(© 2012 Miaoqi Zhu)

Friday, February 3, 2012

Sing a song


I have kept asking myself this question: why does it feel like it takes just a little effort to remember an entire set of lyrics once I recall the first word or first few words?

While reading the book by Dr. Lawrence Barsalou, specifically Chapter 6 on “long-term memory” encoding, I found the above question partially addressed. I admit that a person may listen to a song many times, basically because she or he very much enjoys it for some reason. In that sense, the amount of processing of the lyrics is improved by presentation duration, chances to rehearse, and number of presentations. These three variables are found to be significant factors determining the quality of information processing.

What appears more interesting to me is the sort of elaboration that contributes to data encoding. First, incidental versus intentional learning. Human beings are in fact ready to encode information without "realizing" it. Although there is a benefit in trying to remember something, because it may increase the number of rehearsals, it is not fair to say that information cannot be learned well just because people are not induced to do so. When we are listening to a favorite song, of course, it is fine to try hard to remember the lyrics word by word; but think about it again: how many times have you found yourself singing the song perfectly without ever intending to memorize it?

Second, the depth of information processing. If you are given a stimulus such as a car engine part that you have never seen before, how can you store it? Ideally, you may develop a set of characteristics pertaining to that stimulus, and as those characteristics become more conceptual, the stimulus will be remembered better. In fact, when you try to recall that stimulus later, the previously generated conceptual information will be activated and retrieved as well. Applying this point to my question: each word in the lyrics relates to another, and together they form a meaning that the musicians want to convey. I can say I still easily recall a song from a decade ago, as it is my favorite one from my favorite show. When I retrieve the lyrics, I can still perceive their meaning for the show. In addition, I wonder whether the rhythm helps extend the depth; although it is a different type of information, the phonological loop could be affected by it, as we sing out individual words with their sounds.

Third, imagery: in other words, you can visualize the information, which may produce more conceptual information for elaboration. For example, if you are asked to remember “subway airport”, you can picture a scene in which “a CTA Blue Line subway is heading to Chicago O’Hare airport.” The decorations attached to the original material do not have to be that rich, yet we see an improvement in remembering. Again, going back to my question of interest: as I said before, I am able to play back a portion of the show (e.g., the main characters getting back together), as the lyrics correspond to the story quite well; meanwhile, some words can be mapped to real objects (e.g., a river, a horse) from the show.

The reason my question is only partially addressed at this point is this: are the lyrics stored as a whole in one chunk, or divided into several chunks with some special “linkage”? Because when we succeed in recalling or recognizing the first word or two, the rest usually flows out like a river. Perhaps people will say it depends on the individual's capacity and strategy for encoding information. :)


Reference:

Barsalou, L. W. (1992). Cognitive Psychology: An Overview for Cognitive Scientists. Hillsdale, NJ: Lawrence Erlbaum Associates. ISBN 0-89859-966-0.


(© 2012 Miaoqi Zhu)

A few words from many years ago


I was intrigued by a comment from my mom’s colleague when I was in China last month, about five years since I last saw her. She could still remember a joke I told over 10 years ago. Although she was trying to make fun of me by quoting that phrase, I started contemplating a question raised by the instance: why can a human being recall a tiny piece of information about a person after such a long time?

After reading the capacity paper (Just & Carpenter, 1992), I am aware of its theory of activation in information processing and storage; but when it comes to retrieving information from long-term memory, setting aside the working memory capacity differences between low-span and high-span people, does each element/item in long-term memory carry an activation level? If so, will that activation also mediate the process and thus affect the outcomes?

I have come across three models pertaining to information retrieval from long-term memory, and it seems that “matching” is a keyword for all of them. For example, imagine that the information-seeking process resembles searching for your bags at luggage claim: you examine each item for a match until the first piece is found, and then the entire process starts over. There is also a model called Resonance Retrieval Theory that treats information as a vector whose elements represent different conceptual subjects.

Let me propose this: people may tend to organize or categorize long-term chunks according to a specific object such as a pet, a friend, etc. Each item appears as a vector with an equal number of attributes, as long as the person puts them into the identical category.

Element_Category_Human_Male_001111 = { A[0], A[1], A[2], …, A[n-3], A[n-2], A[n-1] }

An attribute can be an individual element such as hair color, relationship status, education degree, etc. However, attributes are assigned different weights as the product of the external stimulus and the person’s strategy for encoding information. There is a pointer that always waits at the first attribute, A[0], while the first position is reserved for whichever element bears the most substantial weight. That being said, the weights are changeable, because any attribute can be strengthened or weakened by various events, and this process may be done by an unknown internal mechanism. Thus, when working memory calls for information activated by retrieving and decoding items/elements from long-term memory, the weight of the candidate may affect the efficiency of the process, and I would like to name that weight the activation level of each information unit in long-term memory.
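The proposed structure might look like this in code; the attribute names, weights, and category label below are entirely hypothetical, chosen only to make the idea concrete:

```python
# Hypothetical sketch of one long-term memory item as a weighted attribute
# vector, with the heaviest attribute kept at A[0] where the pointer waits.

def reorder(attributes):
    """Place the most heavily weighted attribute first (at A[0])."""
    return sorted(attributes, key=lambda a: a[1], reverse=True)

def strengthen(attributes, name, delta):
    """An event strengthens (or weakens) one attribute, then reorders."""
    updated = [(n, w + delta) if n == name else (n, w) for n, w in attributes]
    return reorder(updated)

# One item in the hypothetical category "Human_Male_001111"
item = reorder([("hair_color", 0.2), ("education", 0.5), ("funny_joke", 0.4)])
item = strengthen(item, "funny_joke", 0.3)   # the joke is retold years later
print(item[0][0])   # the pointer at A[0] now finds "funny_joke"
```

In this toy version, retelling the joke boosts that attribute's weight past the others, so the next retrieval reaches it first, which is one way to read the colleague-and-joke anecdote above.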

The above model/process is just a "random" thought of mine; I look forward to reading more literature to see whether it is "correct" or the opposite. :)

Reference:
Wickens, C. D. (1980). The structure of attentional resources. In R. Nickerson (Ed.), Attention and Performance VIII (pp. 239-257). Hillsdale, NJ: Lawrence Erlbaum.


(© 2012 Miaoqi Zhu)

Tuesday, January 31, 2012

My question about GPS UI

My question here deals with interface design for the Global Positioning System (GPS). As we may be aware, a GPS device usually provides three features: 1. an audio system broadcasting instantly when the direction changes; 2. textual information showing instructions; 3. a continuously updated map occupying a large portion of the LCD display. Based on Baddeley’s (1986) working memory model, and as Wickens et al. (1983, 1984) argued before, the display format should fit the working memory subsystems used to perform the task. This implication triggers my interest in the two coarsely defined inquiries below:

  • Which feature is most helpful for people driving in an unfamiliar area?
  • If that feature is found, will its effect be improved or diminished by adding additional feature(s)?
What people hear from the audio will be encoded in the phonological store, as will the textual data presented to drivers; at the same time, the spatial information from the map will go to the visuospatial sketchpad. The driving routes may update at varied intervals depending on the characteristics of the area, so people have to receive new information from time to time while correctly following the old instructions. Although the verbal information will eventually reach the visuospatial sketchpad, the central executive needs to transfer, synthesize, and analyze information. These activities add cognitive load that constantly competes for available attention resources. In that sense, from the perspective of reducing cognitive load while satisfying working memory’s needs, we may just need the map, or a combination of two features; perhaps the presence of a certain feature will diminish the benefits of other feature(s).


Another aspect of this problem pertains to human information processing, specifically attention. In a complex environment like driving, we are watching the traffic, listening to the radio or GPS instructions, controlling the speed, and probably talking on the phone. Thus, we are already using three attention resources, namely sight, touch, and hearing. Furthermore, discussions of selective attention mostly concern visual perception, where attention is basically the product of four factors: salience, effort, expectancy, and value (Wickens et al., 2003).


To explain each of them in order: first, salience means that a unique feature may attract you right away from other less salient objects, for instance, a horse running on the highway among cars. Typically, an auditory signal is more attention-grabbing than a visual one, which is why we often take action upon hearing an alarm; if the sound were replaced by a flashing LED light, we can imagine how long it would take us to notice a potentially miserable event. The second variable is expectancy, defined as the knowledge of the probable time and location at which information becomes available. Imagine that you are driving in the mountains with lots of sharp turns: you should keep an eye on the curvy road more continuously when you are traveling fast. On the other hand, if you are just cruising around Chicago on Lake Shore Drive, most of the road is straight, so you can relax a little and attend to your music more than to the road. The third factor is the value of the information.


Let us illustrate this concept with the same driving example: it makes sense for people to look forward most of the time, because you want to be quick to take action whenever you see something occurring. With the second and third factors introduced, there is actually an interaction between them: if we multiply expectancy by value, we obtain a better function of selective attention, which can be made concrete this way: an experienced air traffic controller at a certain airport knows where to scan most, and when. Last but not least, effort. As we know, most people can only pay attention to one thing at a time; if you focus on the mirror while trying to change lanes, you probably cannot share that effort with looking forward, because doing both simply cannot be achieved by the majority of human beings. Hence, effort enters as a negative term, since allocating attention to multiple places may degrade operating performance.


To summarize the message of selective attention, researchers often refer to the following model:

P(A) = sS − efEF + exEX · vV (Wickens et al., 2003)
(S is salience, EF is effort, EX is expectancy, and V is value; s, ef, ex, and v are their respective weights)
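As a rough sketch, the model can be written as a small function. The weights and factor scores below are hypothetical values I made up for the GPS scenario, not numbers from Wickens et al.; the formula follows the expression exactly as written above, with the expectancy and value terms multiplied.

```python
def attention_probability(S, EF, EX, V, s=1.0, ef=1.0, ex=1.0, v=1.0):
    """P(A) = s*S - ef*EF + (ex*EX) * (v*V), as in the model above.

    S, EF, EX, V are factor scores (salience, effort, expectancy, value);
    s, ef, ex, v are their weights. All values here are illustrative only.
    """
    return s * S - ef * EF + (ex * EX) * (v * V)

# Hypothetical scores: a GPS audio prompt is salient and low-effort,
# while on-screen text is less salient and costs more effort to read.
audio_prompt = attention_probability(S=0.9, EF=0.2, EX=0.6, V=0.8)
screen_text  = attention_probability(S=0.5, EF=0.7, EX=0.6, V=0.8)
```

Under these made-up scores, the audio prompt comes out ahead, which matches the intuition that audio is the more attention-grabbing channel while driving.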

Applying this model to our question of interest: in theory, GPS audio instructions are more salient than visual ones; meanwhile, receiving auditory information usually costs less effort than reading information from the screen, even though the GPS is attached to the windshield. However, this advantage may reach its margin in a complex area such as a six-way intersection. In that scenario, visual feedback is better, because otherwise our central executive has to consume more attention resources: the verbal information needs to be rehearsed by the articulatory loop and then converted to visual data for further processing. As for value and expectancy, they vary with the route and are thus hard to predict. Collectively speaking, the problem domain is very interesting to look into, and we may even resort to AI to help us determine which method is more effective and safe.


To address those two questions, an experiment may be required. In addition, a set of independent and dependent variables should be defined along with a reasonable measurement. Without going into too much detail, I think the independent variable would be categorical (e.g., audio, text, and map), and the dependent variable could be “helpfulness”, gauged by “time to drive to destination”, “level of satisfaction”, and “fuel consumed”, with each element carrying a different weight. Out of concern for safety, the study could also be conducted virtually in a lab, given that many driving training games are available; additionally, we can control possible confounding factors relatively more easily under lab conditions. In terms of the analysis method, ANOVA would be optimal, because three types of treatments are involved and we aim to find out whether there are real differences among them (statistics such as the F-value need to be reported).
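To make the analysis concrete, here is a minimal sketch of the one-way ANOVA F computation in pure Python. The times-to-destination (in minutes) for the three treatment groups are entirely made up for illustration; in practice one would use a statistics package rather than hand-rolling this.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of sample groups."""
    k = len(groups)                          # number of treatments
    n = sum(len(g) for g in groups)          # total observations
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-group sum of squares (treatment effect)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    # Within-group sum of squares (error)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g)
                    for g in groups)
    ms_between = ss_between / (k - 1)        # df_between = k - 1
    ms_within = ss_within / (n - k)          # df_within = n - k
    return ms_between / ms_within

# Hypothetical minutes to destination under each GPS feature condition:
audio = [22.1, 24.3, 23.8, 25.0, 22.9]
text  = [26.4, 27.9, 25.5, 28.2, 26.8]
gmap  = [21.0, 20.4, 22.3, 19.8, 21.5]
f_value = one_way_anova_f([audio, text, gmap])
```

With these invented data the F value far exceeds the critical value for (2, 12) degrees of freedom, so we would reject the hypothesis that the three features are equally helpful.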


Reference:

Baddeley, A. D. (1986). Working memory. New York: Oxford University Press.

Baddeley, A. D. (1992). Working memory. Science, 255, 556-559.

Wickens, C. D., and Carswell, C. M. (2006). Information Processing. In Salvendy, G. (Ed.), Handbook of Human Factors and Ergonomics, 3rd Edition, 111-149, Hoboken, NJ: Wiley.

Wickens, C. (1980). The structure of attentional resource. In R. S. Nickerson (ed.), Attention and Performance VIII, 239-257, Hillsdale, NJ: Lawrence Erlbaum.

Wickens, C. D., Vidulich, M., & ILLINOIS UNIV AT URBANA ELECTRO-PHYSICS LAB. (1982). S-C-R Compatibility and Dual Task Performance in Two Complex Information Processing Tasks: Threat Evaluation and Fault Diagnosis. Ft. Belvoir: Defense Technical Information Center.

(© 2012 Miaoqi Zhu)

Sunday, January 29, 2012

From Quantum Computing to HCI

What is quantum computing?

At this point in time, we still rely on digital computing, which basically runs on one simple principle: “the bit is on, or the bit is off”. These bits are grouped into larger structures such as bytes and words, which can represent other types of information like sounds and animations. Although scientists and engineers are making every possible effort to improve the computational density of a regular personal computer, it still cannot match a human brain. You may ask why.

The human brain has about 100 billion neurons. With an estimated average of 1,000 connections between each neuron and its neighbors, we have nearly 100 trillion connections, each capable of a simultaneous calculation. That is a massively parallel processing capability, which is one of the most critical strengths of human thinking.

On the other hand, the weakness is the slow pace of human neural circuitry: about 200 calculations per second. However, with 100 trillion connections, each computing at 200 calculations per second, we get an unbelievable capacity. Moreover, we do not need to worry about running out of memory: as long as you are willing to spend time rehearsing information, it will go into long-term memory, since we have around 100 trillion connections to store it in.

Since neural-net emulations benefit from both strands of the acceleration of computational power, this capacity will double every twelve months. Thus by the year 2020, it will have doubled about twenty-three times, resulting in a speed of about 20 million billion neural-connection calculations per second, roughly equal to the human brain. Besides the help from parallel processing, memory is important as well; nowadays, memory circuits are doubling their capacity every 18 months.
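These back-of-envelope numbers can be checked directly. They are Kurzweil-style estimates, not measurements, and the implied starting capacity is simply what twenty-three doublings back from the target would require:

```python
neurons = 100e9                        # ~100 billion neurons
connections = neurons * 1000           # ~100 trillion connections
calcs_per_second = connections * 200   # each connection at ~200 calc/s
# 2e16 calc/s = "20 million billion" neural-connection calculations per second

doublings = 23                         # capacity doubling once per year
baseline = calcs_per_second / 2 ** doublings  # implied late-1990s starting point
```

So the 100-trillion-connection, 200-calculation figure does land on 2×10^16 calculations per second, and twenty-three annual doublings from a baseline of roughly 2.4×10^9 calculations per second reach it.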

Collectively speaking, as implied by the Law of Accelerating Returns, the exponential growth of computing suggests that in the near future a machine’s thinking power will surpass ours. (I refer to a regular personal computer, not a supercomputer.)

Unlike digital computing, quantum computing has “qubits”, which are zero and one at the same time. The state of a particle stays “ambivalent” until a process of disambiguation forces it to choose which position to stand at. For example, suppose a stream of photons is about to hit a lake’s surface at a 45-degree angle. Each individual photon has to determine whether to bounce off the surface at the same angle or “dive” into the water. In other words, those photons stay “lazy” until the process pushes them to commit to only one path. The same principles, the argument goes, apply to carbon-based neurons.

A series of qubits represents every possible solution to a problem simultaneously. For instance, a single qubit represents two possible solutions; two linked qubits represent four possible answers. A quantum computer with 1,000 qubits represents 2^1000 (approximately a decimal number consisting of a 1 followed by 301 zeroes) possible solutions at the same time.
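We can verify the size of that state space directly, since Python’s arbitrary-precision integers handle 2^1000 exactly:

```python
n_qubits = 1000
states = 2 ** n_qubits        # number of basis states represented at once
digits = len(str(states))     # 302 digits: roughly a 1 followed by 301 zeroes
```

Two linked qubits give 2^2 = 4 possible answers, and each additional qubit doubles the count, which is why the number explodes so quickly.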

Yes, it is fast; it is like a genius who can handle a complex math problem by checking every possible combination. In that sense, it is a good tool for breaking encryption. Dr. Nicolas Gisin found that the “communication” speed between an entangled photon pair is far faster than the speed of light. But they are not communicating information; instead, they are passing each other a sort of randomness. If we could convert that randomness into information, we might be able to disentangle the photon pair; nevertheless, this would require a great many observations of those photons’ decisions.


Do human beings have quantum computing inside brains?

Gödel's famous incompleteness theorem has been considered one of the most important theorems in mathematics. A corollary of Gödel's theorem is that there are mathematical propositions that cannot be decided by any algorithm; in essence, these Gödelian impossible problems would require an infinite number of steps to solve. Dr. Roger Penrose conjectures that machines cannot do what humans can do because machines can only follow an algorithm, and an algorithm cannot solve a Gödelian unsolvable problem, but humans can; hence, in this regard, humans are more advanced. Penrose also suggests that human consciousness comes from quantum computing and quantum decoherence.

Well, there is something inaccurate in Penrose’s conjecture. First, humans can only estimate, and estimation is not equivalent to actually solving the problem. Second, if we do possess quantum computing capability, we do not yet have it outside ourselves, so how do we know that a machine cannot solve Gödelian impossible problems? Plus, if a computing task requires an incredible number of calculations, quantum computing may still fail under certain circumstances.

But if a human brain indeed displays quantum computing ability, that means the technique is feasible, so we need to dig out the mechanism that nurtures this special trait and hopefully “feed” it into another piece of hardware. That is why scientists are busy inventing high-resolution scanning technologies such as Magnetic Resonance Imaging.

Collectively speaking, we still know very little about our brain. We carry a potential so huge that we scarcely dare to imagine it; that being said, “reverse engineering” our brain is a possible means to understand those God-given abilities and apply them to machinery.


What is the implication for HCI?

Will quantum computing change the way we write code? Will it change the way we design interfaces? Will we still have browsers? …

On the other hand, what if a machine can have emotion and personality by then? It may have “a user experience” based on the feedback that humans give. Perhaps machines will be intelligent enough to figure out whether a design offers good usability for humans, as they will have perception by then… Lol, those are just some random thoughts.

The thing is that quantum computers may only be used to tackle certain kinds of problems that are otherwise intractable; the power they carry does not apply across the board. But what I am more interested in is this: during the process of studying our brain (partially driven by quantum computing), how will the outcomes affect the way we deploy more or less profound design mechanisms to take full advantage of human thinking capability? Perhaps by the time we get this thing running, our neuro-cognitive structure will have changed due to computational development rather than biological evolution. I mean, think about it: when interacting with an agent far more intelligent than our brains, would we become better adapted by no longer thinking about hard questions at all? I don’t know.


Reference:

Kurzweil, R. (1999). The age of spiritual machines: When computers exceed human intelligence. New York: Viking.