HCI and Research Design

Wednesday, March 28, 2012

Wrapping up things related to working memory

The new definition of working memory

In their seminal work on individual differences in working memory capacity, Just and Carpenter (1992) redefine the concept by borrowing a computational theory. They posit that both information storage and processing are supported by the very same property, named “Activation”.

When we take in new information from computations, each element is associated with an activation level. The element can be a word, a sentence, or a physical object such as a cat.

During comprehension, only an element whose activation level exceeds the threshold value can enter working memory; in other words, not all candidates can be further processed by the brain, even if they are successfully retrieved from long-term memory.

However, people differ from each other with respect to their working memory capacity, which in this case is seen as the total amount of activation that the system can sustain.

Therefore, if at any time the sum is about to go beyond the system's limit, certain chunks of old information must be de-allocated in order to accommodate more incoming data and computations.
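To make the threshold-and-capacity idea concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not Just and Carpenter's actual simulation: the class name `WorkingMemory`, the numeric values, and the rule of de-allocating the oldest element first are all hypothetical choices.

```python
from collections import OrderedDict


class WorkingMemory:
    """Toy capacity-limited store (illustrative assumption, not the published model)."""

    def __init__(self, capacity, threshold):
        self.capacity = capacity        # total activation the system can sustain
        self.threshold = threshold      # an element must exceed this to enter
        self.elements = OrderedDict()   # element -> activation, oldest first

    def total_activation(self):
        return sum(self.elements.values())

    def admit(self, element, activation):
        """Return (admitted, evicted), where evicted lists de-allocated elements."""
        evicted = []
        # Elements at or below the threshold never enter; an element that
        # alone exceeds the capacity cannot be held at all.
        if activation <= self.threshold or activation > self.capacity:
            return False, evicted
        # De-allocate the oldest elements until the newcomer fits
        # (oldest-first is an assumption made for this sketch).
        while self.total_activation() + activation > self.capacity:
            old_element, _ = self.elements.popitem(last=False)
            evicted.append(old_element)
        self.elements[element] = activation
        return True, evicted
```

With a capacity of 10 and a threshold of 2, admitting elements with activations 5 and 4 fills most of the store; a third element with activation 3 can only enter after the oldest one is de-allocated. A reader with a larger capacity in this sketch would simply keep more elements active at once.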

Because activation is one of the fundamental concepts in cognitive science, we often find it in other literature.

Complementary learning systems

The hippocampal (HC) system is active in receiving information and temporarily storing it, which means that novel material cannot be truly learned immediately.

The neocortical 
systems will incorporate the new information gradually, as any quick 
absorption could be detrimental to existing structures of knowledge. 

The novel input needs to go through various consolidation stages to gain membership in the neocortical systems. Appropriate external events can serve as good opportunities and will help adjust neocortical connections.

Usually, the incorporation of unfamiliar material is slow, especially for arbitrary and idiosyncratic material. Thus, as memory traces degrade over time, it is possible to lose them before they can be built into the shared structures in the neocortical systems.
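The race between a fading trace and slow incorporation can be pictured with a toy Python simulation. This is purely my own illustrative assumption and not a model taken from the complementary learning systems paper: the function name `consolidate`, the multiplicative decay, and the update rule are all hypothetical.

```python
def consolidate(decay, learning_rate, steps, trace=1.0):
    """Toy race between HC trace decay and slow neocortical incorporation.

    Returns (cortical_strength, remaining_trace) after `steps` replays.
    All quantities are on an arbitrary 0..1 scale (an assumption).
    """
    cortical = 0.0
    for _ in range(steps):
        # The neocortex takes a small step, scaled by how much trace is left.
        cortical += learning_rate * trace * (1.0 - cortical)
        # Meanwhile the hippocampal trace fades a little.
        trace *= 1.0 - decay
    return cortical, trace
```

In this sketch, a trace that decays quickly (say `decay=0.5`) is almost gone before the slow learner has absorbed much of it, while a slowly decaying trace (`decay=0.05`) with the same learning rate ends up far better incorporated.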

The complementary learning systems are more related to information storage and maintenance than information processing.

However, the weight changes of neural connections may also be bound up with “activation”. Without appropriate activation, the connections are not re-constructed or even established.

Although McClelland (McClelland, McNaughton, & O'Reilly, 1995) does not explicitly mention how the incorporation rate relates to working memory capacity, the concept of “learning rate” may be a potential factor affected by the level of activation.

Metaphor

Metaphor is fairly common in linguistic practice: if a person wants to express a certain idea through an utterance, the hearer can often detect the intended meaning rather than the literal meaning of the sentence.

If “S is P” actually refers to “S is R”,
 the comparison theory attempts to address the question of “how can we 
compute the potential value of R”; on the other hand, the interaction 
theory tries to look into the range restriction for R’s value by 
referencing the relationship between S and P.

To make this 
communication possible, the speaker and hearer must have something in 
common such as the principles to translate utterances.

Searle provides eight principles and suggests the methods by which the utterance of the P term calls to mind the meaning of the R term in ways pertaining to metaphor. For example, one principle concerns human sensibility, and it applies to people from multiple cultures, as it is naturally determined.

From the perspective of working memory capacity, the task of locating possible links between S and P, or of narrowing the scope of R values, relies heavily on the activation of each property.

I maintain that the task of interpreting metaphor is less likely to challenge us, in that there are normally three base elements to begin with, unless the range of R is quite large or the “overlapping” attributes of S and P are too many.

In fact, an interesting question here is which principles consume more resources in working memory. The seemingly more complex principle is not necessarily the one that requires a larger “brain”; it also depends on other constructs such as the hearer’s knowledge base related to a specific domain.

Categorical Perception

The interplay between perceptual information and high-level knowledge of a particular object is another universal cognitive process. There are basically four rules that people use to group objects.

While 
prototypes allow us to bring in more attributes to formulate 
semi-stereotypical representations, the exemplar models are stored 
stably in the memory, and they appear in a highly-abstracted form.

The processes of establishing prototypes and exemplars differ in the light of McClelland’s theory of complementary learning systems. Specifically, building a new prototype may take longer than modeling an exemplar, as the former has to undergo a systematic procedure to live within existing structures that are shaped by other surviving prototypes.

Working memory capacity should work well as a predictor of the efficiency of categorizing a perceived object.

Nevertheless, it may not directly determine how soon the new token will reach neocortical systems.

In
 the context of metaphor, in addition to referencing currently stored 
categories for the purpose of producing the metaphorical meanings of the
 utterance, we can also use the outcomes to re-categorize an object.

Cognitive Breakdowns 

Cognitive breakdowns appear in multiple forms: for example, the loss of balance, difficulty focusing, or a lack of reasoning capabilities.

Working memory is one major component affected; for instance, some patients need to constantly re-chunk information in order to continue searching for the answer to a simple inquiry.

If the brain experiences too much cognitive load, certain parts of it will cease functioning. From that perspective, wise use of limited working memory resources is recommended for the subjects concerned.

The Representation of Personality in the Affective Reasoner

The situation is presented in frames, and frame matching asks for sufficient resources from working memory.

The computations on situation frames are supposed to generate emotion. Working memory is crucial for instantiating this intensive process in the left hemisphere (LHS).

The emotion is only available if it exceeds an intensity threshold, which is determined by bindings on the LHS. Working memory contributes to the binding process.

Some Implications for Design

First, designers should realize that users differ in their ability to store and process information; thus designers should not be self-centered as solution providers.

Second, designers must avoid using a solution that is too novel; although it may surprise users with its creativity, it may disrupt existing usage patterns that the users hold for other tasks at that moment.

Third, providing a "buffer zone" for chunking/re-chunking is recommended, as is “Recognize Rather Than Recall”. Methods such as these reduce users’ cognitive load and thus decrease the chances of overloading working memory.

Fourth, metaphor is equally important in that it makes it relatively easy for users to figure out the underlying meaning of design elements such as a button or a gesture.

Furthermore, the call for “universal design” applies not only to people with physical disabilities, but also to patients with brain damage at various levels.

Last but not least, if by definition the title “User Experience” caters to users’ emotional responses, designers are expected to understand a little about the principles of emotion generation.

(© 2012 Miaoqi Zhu)

Tuesday, March 6, 2012

What if?

I just had this "crazy" thought:

What if each neuron represents a planet, and each neural connection represents a galaxy? We may just live in our brains...

One of my friends responded with his understanding. He said, "we may live in other beings' brains." However, if I see "you", you are likely being perceived by me; in that sense, it is not implausible to say that you are just "living" in the world of my brain.

The question may need to be answered by philosophers, and it is difficult to prove or disprove anything by science, although it could be possible someday.

(© 2012 Miaoqi Zhu)

Saturday, March 3, 2012

Complementary Learning Systems, Dreams

The motivation for this post stems from my recent reading of McClelland's paper on complementary learning systems in the “Hippocampus (HC) and Neocortex”. From a certain perspective, this paper is also something of a survey paper. In particular, the authors smartly put relevant theories together to demonstrate their own idea, which is: "the HC provides training trials, allowing the cortical system to select representations for itself through interleaved learning." The paper, of course, starts with two key questions of interest:

Why is the HC system needed? If we have a neocortical system that has more neurons, what do we need the HC for?
Why do the changes in neocortical connections take so long? In other words, can new material be fully absorbed rapidly?

It looks like the HC system is responsible for storing recent memory, and the neocortex is used to hold remote memory. To avoid interference with the knowledge stored in the neocortex, the HC helps accommodate the initial storage from a new learning event; meanwhile, so as not to disturb the existing system of knowledge structures, changes should be made slowly within the neocortex. There are certainly “communications” between the HC and the neocortex; otherwise, no information would be transferred and no changes would be made.

Marr (1971) proposes that the HC system stores experience in the daytime and replays the memories in the HC back to the neocortex at night. Does this imply that brain activity at night is caused by consolidation (or the opposite)? Dreams, for example.

A researcher in this domain indicates that dreams can be emotional as we replay old memories and update them with novel information from recent events. Relating this to personal experience, I agree, but I become more interested in the verb “update” in terms of its functionality. Specifically, why does the brain need to update, and in which sub-systems of our brains?

So a ROUGH idea here is this: if dreams happen partially owing to the connection changes (learning) in the neocortical system driven by the HC’s input, let us contemplate some questions:

Is it because day-time tasks request too many computations, so we had better carry this out at night?
Why should it appear in the form of visual imagery? Does it facilitate the learning process?
Why can we not dream every single night? If dreaming stops, does it mean dreams are not caused by learning in the HC and neocortical systems? Perhaps dreams cause learning, not the opposite.
The memory trace in the HC will decay; if we have an exceptional set of material, will it take priority to be incorporated into the existing knowledge structure during this precious period of dreaming?
What is the implication for HCI? Designing novel interfaces “encouraging” people to dream, so that users can learn them faster? This question may be addressed once we know exactly why people dream.

Reference:

Web article: http://www.livestrong.com/article/78256-parts-brain-produce-dreams/

Marr, D. (1971). Simple memory: A theory for archicortex. Philosophical Transactions of the Royal Society of London, Series B, 262, 23–81.

McClelland, J. L., McNaughton, B. L., & O'Reilly, R. C. (1995). Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory. Psychological Review, 102(3), 419–457.

(© 2012 Miaoqi Zhu)

Monday, February 20, 2012

Activation and direct emotion generation

If, by definition, the title “User Experience” indicates catering to users' emotional responses, “activation” can help in understanding the principles of emotion generation. In his ground-breaking work on affective reasoning (Elliott, 1992), Elliott summarizes emotion generation based on his previous studies. Taking the direct emotions as an illustration: when an agent perceives a potential relevance in a situation, the frames containing its goals, standards and preferences (GSP) are matched against the eliciting situation frame. If a match is established, the situation officially becomes the agent’s concern in the form of construals. Working memory determines the matching success rate in that the agent must retrieve information for the GSP. Bindings take place in the left hemisphere when the situation frame slots are married to the variables in the slots of the construal frame from the previous phase; an emotion eliciting condition relation (EECR) is then created for each construal of that situation. Because there may be many interpretations of a situation, there are multiple EECRs to be confirmed individually and gathered together by the agent involved. In the following step, compound-emotion EECRs are formed from the separation and recombination of the event-based construals and attribution-based construals. The prospect results then work with the domain-independent rules to generate an emotion instance such as hope.

From my understanding, a basic underlying activity is activation: for instance, the matching process depends on working memory, and we covered what activation means for working memory earlier. The same may be true of binding as well, because intensive computations are needed to analyze and integrate the frame slots of two different sources.

For instance, if a scenario tells us that a group of users is more likely to be frustrated under certain circumstances, what can a designer do to put users emotionally at ease? We can manipulate some variables to suppress activation in working memory. Imagine a busy mom standing in front of an airline kiosk with two playful little boys: she may already have developed a prospect-based emotion of fear, as she can construe the situation based on existing frame slots gained from her latest experience in relation to her goals, standards and preferences. In this case, designers can either employ a quick interface walk-through demo as a light tutorial while users are still in waiting areas, or adjust environmental variables such as lighting and private space, as long as those measures help diminish the chance of activation.

Figure 2: The mapping structure from situation to emotion (Elliott, 1992)

I agree that the above theoretical framework is hard to validate, but it is still a good model to look at. As HCI practitioners, we do not have to be "addicted" to a particular theory, or over-criticize it. A well-trained scientific mind helps in understanding design solutions as inputs to the human brain.

Reference:

Elliott, C. D., & Northwestern University. (1992). The affective reasoner: A process model of emotions in a multi-agent system. Evanston, IL: Northwestern University.

(© 2012 Miaoqi Zhu)

Friday, February 10, 2012

When the reality tells me the "feeling" is wrong

I was on the north-bound train, almost at my destination, when a question suddenly came to mind: why did I feel that the train was heading in a different direction than when I boarded?

There was no doubt that the train was heading north; otherwise, I would never get back home. Assuming that the train was right, I began recalling the scene in which I boarded at Adam/Wabash. This does not sound like a very difficult task; however, a few things made it more complicated than expected:

The stop where I departed is underground, so there are few clues for me to get a sense of where the train is going. In other words, the sort of survey knowledge I established before cannot take effect here, at least for me;
The entrance of a particular station: specifically, for stations with a center platform, there are typically two entrances located on two sides. If a passenger going north enters from the north gate, since there are two floors, she or he needs to turn around to catch the northbound train once reaching the platform, and vice versa;
The seats face two directions: some face the direction in which the train is traveling, while the others face the opposite way. That said, there could be another couple of mental rotations to carry out;
The train's doors open on either side, depending on the design of the station platform.

While the first factor is relatively independent, the other three in fact interact. Imagine you are heading north: you first enter the station from the south gate, which means that you don’t have to turn around; fortunately, when the train arrives, after boarding through the right-side door, you decide to use the seats on the right that face the train’s direction. In this case, you may find it less of a struggle to recognize the direction. Nevertheless, if you get on the train through the left door and decide to sit on the left side where the seats face backward, it might add extra work in figuring out where you are going. If you start from an above-ground station, the puzzle could be even easier to resolve, because you have a number of visual clues to reference, such as landmarks.

Having listed the possible sources of noise preventing me from sensing the direction correctly, I continued recalling each scene from entering the station to boarding the train. Since I am more of a visual thinker, I enjoy playing back every single memory frame. The first challenge arose from the internal structure of the station; it is important in that the number of turns, along with their directions, co-determines my actual orientation in the world. I resolved this problem by putting myself in an imaginary blueprint of the station and re-experiencing the journey virtually in my head. The second challenge followed because I had to map the spatial relationship between my seat and the door through which I got in. Please be aware that each car has two doors on one side, which means that if I sit closer to the door where I boarded, it will be easier; but if I choose a seat that is further away, or if I don’t remember which door I entered, it is another story.

Finally, I got the “feeling” corrected by reasoning about the spatial relationships among the objects that I was able to recall. Further, playing back the scenario was helpful in structuring the space I had been in. A big question here is: why did I lose the sense of direction in the first place? While the train was running underground, the feeling was fine; yet once the train came out of the tunnel, I became a little anxious, since my intuition told me that the feeling was incorrect. I was also surprised by how many chunks of information and attentional resources I employed to get that feeling fixed, let alone the intensive computations occurring in my head. Every time I spared a little attention for other activities (e.g. talking to friends), I needed to restart the process, because each sub-process generates so much data that needs to be temporarily stored in working memory for the next thread of computation(s).

(© 2012 Miaoqi Zhu)

Sunday, February 5, 2012

Activation

When I was in the “flow” state of reading Anderson’s paper “ACT: A Simple Theory of Complex Cognition”, I could not help referring back to Just and Carpenter’s seminal work on the capacity theory of comprehension. I wonder whether there is a connection between the two theories, namely the “activation level”.

First of all, a little recap of Anderson’s paper: he tries to understand the basic components and processing principles of our cognition. By studying how people write recursive programs, he claims that there are two elements: productions (procedural knowledge) and long-term chunks (declarative knowledge). He then explains what “Knowledge Acquisition” and “Knowledge Deployment” are.

For “Knowledge Acquisition”, he says that long-term chunks come from encoding the environment, and that to transform those chunks, ACT-R looks for existing chunks to map onto. For “Knowledge Deployment”, the author answers the question of how humans select the most appropriate knowledge for a particular context. Based on a rational analysis that “knowledge is made available according to its odds of being used in a particular context, activation process implicitly performs a Bayesian inference in calculating these odds”, he derives a basic equation, which is:

Activation_Level = Base_level + Contextual_Priming

Anderson further illustrates this equation with three domains: memory, categorization, and problem solving. It seems to me that across those three domains, the way we see “contextual priming” differs slightly. For instance, for memory, it is the association between chunk n and chunk m; for problem solving, it is the effect of the distance to the goal that participants set up.
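One way to read the equation is as a sum of log odds: the base level acts like the prior log odds that a chunk will be needed, and contextual priming adds a log likelihood ratio for each cue in the current context. The short Python sketch below follows that reading. It is an illustration under my own assumptions, not ACT-R code: the function names `activation_level` and `posterior_probability` and the input numbers are hypothetical.

```python
import math


def activation_level(prior_probability, likelihood_ratios):
    """Activation_Level = Base_level + Contextual_Priming, read as log odds.

    prior_probability: assumed chance the chunk is needed, before context.
    likelihood_ratios: one assumed ratio per context cue (>1 favors the chunk).
    """
    base_level = math.log(prior_probability / (1.0 - prior_probability))
    contextual_priming = sum(math.log(ratio) for ratio in likelihood_ratios)
    return base_level + contextual_priming


def posterior_probability(activation):
    """Convert a log-odds activation back into a probability."""
    return 1.0 / (1.0 + math.exp(-activation))
```

For example, a chunk with a low prior of 0.2 that is supported by one cue with a likelihood ratio of 4 ends up with an activation of about zero, that is, roughly even odds of being needed in that context.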

When we go back to Just and Carpenter’s paper: they “redefine” the concept of working memory by presenting a computational theory, which suggests that both storage and processing are fueled by an identical property called “Activation”. More specifically, each element (e.g. a word, a phrase, an object from the real world) carries an associated activation level. During the course of understanding, relevant chunks are activated by either a computation or long-term memory; however, not all of them can enter working memory; only those that meet a certain minimum threshold value obtain permission. As long as the total amount of activation is within the system's limit, we are good to process the information; but if the sum exceeds the limit, we need to de-allocate some old elements. The idea is not hard to grasp if you picture a scenario in which you try to understand a difficult sentence with several embedded clauses. People with a large working memory capacity may understand it more quickly than those with a small capacity.

You may notice that the activation level is a common thread in both works. If Anderson is right, can he help explain where the activation level comes from in Just and Carpenter’s theory? Just and Carpenter assess working memory capacity using the "Reading-Span" task, while Anderson pursues his question by studying people writing recursive programs, and his idea was published a few years later. I just wonder: if we swapped their methodologies, would we still see this common thread?

Reference:

Anderson, J. R. (1996). ACT: A simple theory of complex cognition. American Psychologist, 51, 355-365.

Just, M. A., & Carpenter, P. A. (1992). A capacity theory of comprehension: Individual differences in working memory. Psychological Review, 99(1), 122–149.

(© 2012 Miaoqi Zhu)

Friday, February 3, 2012

Sing a song

I have kept asking myself this question: why does it take just a little effort to remember a whole set of lyrics after I recall the first word or the first few words?

When I was reading the book written by Dr. Lawrence Barsalou, specifically Chapter 6, dedicated to “long-term memory” encoding, I found the above question partially addressed. Admittedly, a person may listen to a song many times, since she or he must very much enjoy it for some reason. In that sense, the amount of processing of the lyrics is increased by presentation duration, the chances to rehearse, and the number of presentations. Those three variables are found to be significant factors determining the quality of information processing.

What appears more interesting to me is the sort of elaboration that contributes to encoding. First, incidental versus intentional learning. Human beings are in fact ready to encode any information without "realizing" it. Although there is a benefit in trying to remember something, because it may result in more rehearsals, it is not fair to say that information cannot be learned well just because people are not induced to learn it. When we are listening to a favorite song, of course, it is fine to try hard to remember the lyrics word by word, but think about it again: how many times have you found yourself performing the song perfectly without ever intending to memorize it?

Second, the depth of information processing. If you are given a stimulus such as a car engine part that you have never seen before, how can you store it? Well, ideally, you may develop a set of characteristics pertaining to that stimulus; as those characteristics become more conceptual, the stimulus will be remembered better. In fact, when you try to recall that stimulus later, the previously generated conceptual information will be activated and retrieved as well. Applying this point to my question, each word in the lyrics relates to another, and they come together to form a meaning that the musicians want to convey. I can still easily recall a song from a decade ago, as it is my favorite one from my favorite show. When I retrieve the lyrics, I can still perceive their meaning for the show. In addition, I wonder whether the rhythm helps extend the depth; although it is a different type of information, the phonological loop could be affected by it, as we sing out each individual word with its sound.

Third, imagery; in other words, you can visualize the information, which may produce more conceptual information to elaborate on. For example, if you are asked to remember “subway airport”, you can picture a scene in which “a CTA blue line subway is heading to the Chicago O’Hare airport.” The original material does not need such rich decoration attached to it, yet we see an improvement in remembering. Again, going back to my question of interest, as I said before, I am able to play back a portion of the show (e.g. the main characters get back together), as the lyrics correspond to the story quite well; meanwhile, some words can be transformed into real objects (e.g. river, horse) from the show.

The reason why my question remains partially unaddressed at this point is this: are the lyrics stored as a whole in one chunk, or divided into several chunks with some special “linkage”? Because when we succeed in recalling or recognizing the first one or two words, the rest usually flows out like a river. Perhaps people will say it is up to the individual, depending on her or his capacity and strategy for encoding information. :)

Reference:

Barsalou, L. W. (1992). Cognitive psychology: An overview for cognitive scientists. Hillsdale, NJ: Lawrence Erlbaum Associates. ISBN 0-89859-966-0

(© 2012 Miaoqi Zhu)