Douban Porridge

武政宝宝 posted on 2009-10-18 18:30:24

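Douban is a bowl of porridge, steaming hot and thick with honeyed dates. Douban is a sea, rising and falling like the East Indies. Douban is a dish, bright in color and light in flavor, a match for Shanghai's Benbang cuisine. Douban is a cat, whimpering like a newborn baby.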

Everyone is welcome to reply in verse :)
Keywords (Tag): Douban

Word Bank of Personality

武政宝宝 posted on 2009-10-12 20:55:32

A project in the planning stage; the completion date has not yet been decided.

Course Project Proposal

Word Bank of Personality: Extracting Meaningful Knowledge describing Personality using WordNet and domain text


Brief Description of the Aim:

Please have a look at this Graph first:

http://www.lexipedia.com/english/man

http://www.lexipedia.com/english/woman

This is just an illustration of what a semantic word network is.

[Aim]
For this course project, we will build on two foundations:

1. a well trained semantic word network, with links between words and topics [Word Bank: WordNet from Princeton]
2. a specific domain of closely-related documents. [Word Domain: describing Personality]

We will use the semantic links and weights of the Word Bank to extract topics from the Word Domain. We hope to find interesting descriptions of people and to establish a complete Word Bank of Personality, containing different levels of description of a person's personality.

[Towards Reorganization of knowledge]


Long before the Internet and the Web, people dreamed of communicating better with one another, of understanding other people’s thinking better, and of understanding the various human creations better. Since the birth of the Internet, our society has moved onto a more uniform basis, and our everyday lives now rest upon these layers. But it was not until the recent development of Web 2.0 that people started to realize that the Web, and the Internet as a whole, is a great assembly of human knowledge. What’s more, people can use this “new” knowledge to understand other knowledge, primarily because it is more “structured” and can be used by computers more easily. First, this kind of “structured knowledge” differs from traditional forms of knowledge, such as a rigidly compiled encyclopedia or the piles of books on one specific domain in a library. It is “structured” either because it follows some kind of format or because it is tagged and marked with some kind of abstraction or digest. Second, these new kinds of knowledge are sparse, decentralized, and very dynamic. Taking blogs, news feeds, stock data, and corporate reports as examples, we find that this new knowledge appears everywhere, is generated all the time, and can cover almost everything. It is the real-time record of the lives of billions of people; it is people’s minds spoken out and people’s wills expressed. It differs from traditional forms in being more collective. Since individuals now write diaries and send messages on the Internet, their intellectual and emotional lives are shifting toward these newly created, man-made layers.

If this newly created information were merely text and words, it would be no different from traditional knowledge printed on paper. And if these newly recorded lives and human activities were merely tables of numerical measurements and records, they would be no different from the traditional “data” processed by computers. The achievement of Web 2.0 is that the newly created information on the Internet is becoming “more structured”, both in the larger organizational sense (inter-linking, promoting, ranking, polling) and in the finer sense of understanding and digestion (tagging, keyword abstraction, commenting). This semi-structured data appears in different forms: in XML or specific XML-based formats, in various bill and transaction formats, in interconnected tagging and sharing, in voluntary collaborative contributions such as wikis, and more. More structured data allows computers to process and understand texts and content better, from both the larger perspective and the finer one.

This movement of the Internet and Web 2.0 towards better organization of knowledge is the newest of the many attempts people have made, ever since the dawn of civilization, to describe the shape of their own knowledge and their own understanding of the world. This kind of “reflection” happens as people and civilizations grow more sophisticated. Historically, however, it was done on an individual basis. With the Internet and the Web, individual bodies of knowledge are combined, merged, interconnected, and activated to form a collective contemporary view of the world. We call this process “Collective Reflection”. Its initial step is the “structuralization of knowledge”, performed by billions of people through the very practice of Web 2.0. This collective assembly of knowledge, if achieved, and if later perceived and used by individuals and organizations, would in return greatly help these subjects, who have always been the source and origin of this collective knowledge. In the 21st century, we already see the Collective Reflection process going on. With the help of computer science, we will finally be able to reorganize our collective human knowledge and experience and record them in a comprehensive form that can be kept, modified, updated, trained, evolved, and, most important of all, made use of.

[Structured or not? Individual human knowledge is a hierarchy of networks both inside and outside]

“Modeling reality and reflecting on it” is the key topic of every discipline, and computer science is no exception. To model and record what individuals and groups of human beings do with their experiences and different kinds of knowledge, we should first understand the very foundation of “knowledge” inside the human brain.

“Word” and “Object” are the two main epistemological terms used to describe how we classify things and form concepts and ideas. A human individual has been establishing mappings between Word and Object ever since he/she first heard a parent utter a sound while pointing at some object. A link is formed in this way, and many links are formed over time: colors, tastes, hurts, pains, and happiness; the different “concrete objects” and the different “abstract objects”. As these links are established, babies grow into children. Later on, individuals form networks of words, networks of objects, and links between words and objects, together with a fuzzy sense of grammar that lies in the deeper layer of these links. At the two ends of these links, we find that words are interchangeable and objects can be replaced, and still they are defined by each other. This is the individual version, or copy, of human knowledge. For culture and civilization to grow, mutual understanding and recognized forms of expressing and exchanging experience and knowledge have to exist. From the individual perspective, once these well-accepted networks and commonly known grammars are established, mutual understanding begins between different people, between groups of people, and between cultures. For example, the meaning of the word “man” may differ from person to person, and differ under different circumstances and in different contexts, but as long as people want to use the word “man” with one another, the different “individual instance versions of the word ‘man’” inside different brains should always be supported by similar links to other words or concepts. If they are not supported by similar ideas and concepts, the two individual versions will not be similar, and misunderstanding happens, which is very common in daily life and occurs in almost every dialogue.
This is because different people have different understandings and usages of the word in different circumstances and contexts. Viewed at the physical layer, it is because different people assign different links to the individual word “man” during the conversation, and these links change with the context.

Words support each other. Only through this “support” can a word be defined and distinguished from other words (word sense disambiguation is a core topic in NLP). Words are supported by other words, and concepts are supported by other concepts. The supporting links between words and objects, and between words and concepts (concepts are in fact words: they are represented by a complex assembly of words in texts, and they are supported by a complex set of links to other concepts and words in the brain), show exactly how the human brain establishes them in their corresponding form: neural networks. Several neuron nodes may function together as one representation of one word (only in the “labeling” sense). Several word-node complexes may function together as one representation of one concept. And a network of these different levels of node sets and node complexes forms a specific understanding or feeling of some specific abstract concept. These complex node sets together form large networks and hierarchies. These networks and hierarchies affect each other: they adjust, change, and modify each other’s links, and they change the weights of their own links all the time. These modifications and changes are performed based on inputs and responses from the outer world. The whole of human experience and knowledge, viewed from the individual perspective, is nothing other than these huge self-adjusting and self-changing networks of links. Objects, words, concepts, and grammars are all inside them, and are all different nodes. Inside the brain, forms change from “linear texts” to “visible diagrams”. Links between established nodes are quite stable at rest; however, they are never stable while being used.
Dynamics happen while perceiving, thinking, and talking. Perceiving: attaching meaning to words and phrases (attaching links to them). Thinking: with an intention and a driving force in mind, processing and making sense of ideas and concepts (relating the incoming nodes to the individual’s existing nodes, which represent his/her own understanding of the concept and his/her intentions). Talking: using shared expressions and the grammar of a language to talk to people (using the special links representing the parts most likely to be similar inside other people’s brains, which means similar languages and similar expressions in similar cultural settings).

Inside the hierarchies and networks, there are parts serving different purposes. The hierarchies of different domain knowledge, together with a basic core set of hierarchies of everyday knowledge and language and the cooperating layers of feeling and emotion, are our INNER representation of our experiences, knowledge, principles, thinking, and consciousness. A child reading an encyclopedia inevitably needs basic language knowledge and a very basic understanding of the world. And for a professional specializing in some specific area, the core set of common knowledge is indispensable.

These inner representations inevitably shape the forms of our OUTER representations of knowledge, and the outer forms are inevitably very similar to our inner organization of knowledge. The “outer forms” of knowledge are the books, newspapers, speeches, and conversations that are, or could be, recorded in physical form. In this sense, TEXT data, namely the different documents, websites, notes, speeches, and more, is really “structured”. It is structured in the form of networks, hierarchies, interlinked objects, and networks of all of these. It is structured in the same way as knowledge is structured inside our brains, and it is structured as a component instance of our COLLECTIVE knowledge, which is, in this sense, also structured. This is also true for other forms of knowledge (data, records, links, and more), whether or not they can be directly processed or understood by computers.

While most individual copies of knowledge differ from each other in every corner of their hierarchies, a sense of similarity can be seen among most of them. This degree of similarity inside our brains comes from the similarity of our education and personal experience. And of course the dissimilarity of these instance versions also arises from different personal experiences and thinking. “Collective human knowledge”, on the other hand, is an abstract notion used to represent all the similar and dissimilar parts of all human knowledge and experience, originating in history and lasting into the future. Traditionally, people viewed the “merged and combined single version of individual knowledge” as the collective human knowledge; later, people came to take the dissimilarities into account. Because everybody’s views and opinions differ, we have communities and societies. With these differences, we have motion and life. After all, every one of us is a constituent of the concept “human”.


 [Implicit or Explicit form of knowledge? The difference is just how they are represented]

By now we have two pairs of dual concepts: Individual and Collective instances of knowledge (one is a changing copy; the other is a changing repertoire pool), and Inner and Outer forms of knowledge (one is a changing network of hierarchies; the other is a somewhat fixed, somewhat linear physical form that can be recorded and interpreted).

Since they in fact affect each other, they are all structured. The word “structured”, however, when used in the computer science sense to describe information, means that a piece of information can be processed and made use of. In this sense, “structured knowledge” is explicitly structured, while “unstructured knowledge” is implicitly structured. The reason we call some materials “unstructured data” is that their structures are not explicitly known to us, or not defined by us in an explicit form. In fact, every piece of information interpreted by a human being is already defined and structured, but only by the neural networks inside our brain, which, from the current point of view of computer science, is still “unstructured”.

To illustrate this better, consider the situation a kindergarten child may face: he/she may have already mastered 2000 words and be the champion of the kindergarten-wide spelling competition, yet still feel puzzled and lost when reading a professional article, or simply a piece of news. For this child, the professional article or the piece of news is simply “unstructured”, since he/she has little idea of many of the words appearing in these materials, and even less idea of the concepts and opinions behind them.

This is true even for us grown-ups. We have to acquire the supporting links for the words we see and the concepts we perceive; otherwise we cannot understand them at all, and can only make inferences based on word formation or on similar expressions we may have heard. The world becomes more and more explicit to us as we grow up, and that is why we see the world in a more structured way than we did in childhood. We do not want to feel lost and confused again.

Let’s consider one more example. There are forms of knowledge such as XML text, transactional data, rules in logic, mechanisms of communicating objects, manufacturing workflows, interlinked websites, interlinked entries in Wikipedia, bibliographic data such as DBLP and Medline, tagged Flickr photos, and more. These are the large networks that contemporary human beings are working on, so as to weave the future web of collective human knowledge.

These larger networks inevitably need the smaller networks inside everybody’s mind in order to be meaningful. They are meaningful only when supported by the IMPLICIT networks of knowledge; otherwise the larger network is just a connection of items or entities. We human beings have been cultivating our own implicit networks since birth; later we can read the complex networks of knowledge created by other people (textbooks, newspapers, and more), and of course reshape these complex networks in our own brains. This is the Learning Process of the human being. We build our ever-growing network of knowledge and experience on our own existing version of the implicit network, taking in inputs from other networks: from other people, books, and movies, from our own lives, and from the outside world. This is the Growth Process of the human being. Should computers also acquire this kind of underlying implicit knowledge before they can really understand what the information is?

We believe the answer is yes. Larger-scale networks are supported by lower, basic-scale networks. Modern search engines are supported by an initial level of semantic understanding of the user’s query. IR applications are supported by a core set of interconnected words and topics. The RDF definitions of ontological applications need to be trained on a specific domain of knowledge before they can actually perform tasks. The collaboration seen in the weaving of Wikipedia is likewise supported by the contributors’ individual implicit networks, their understanding of the world. Without these basic understandings, the higher-level, larger networks simply cannot work.

In this sense, a basic semantic understanding of words is indispensable for the higher-level applications that deal with larger networks. We always need some sort of semantic understanding when dealing with text (explicit unstructured data), XML (explicit semi-structured data), and more. In conclusion, unstructured data becomes structured as long as we know the implicit structures behind it. We call this underlying implicit basic understanding the “core network”. With some sort of core network, we can really start the task of understanding more comprehensive combinations of words and concepts, i.e. large networks. In fact, we are doing nothing other than training, upgrading, and evolving the core networks, by human effort, using computer science.

[WordNet as the underlying core network: our choice]

“WordNet® (http://wordnet.cs.princeton.edu/) is a large lexical database of English, developed under the direction of George A. Miller. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept.” Synsets are interlinked by means of conceptual-semantic and lexical relations. This well-trained network of words, topics, frequencies, co-occurrences, interconnections, and even common usage forms of verbs and adjectives is a very friendly and reliable network of knowledge for larger-scale projects.

The role of humans in intervening in computer systems and training them with human knowledge and experience has long been discussed. Although many people argued that we should develop algorithms, systems, and frameworks that let computers “grow and learn by themselves”, many others found this impractical in the early days of artificial intelligence. Later on, the concepts of “training” and “learning” became dominant in computer science. Most often, people develop a system that allows some of its internal weights to change, and train it in feedback or feed-forward settings, supervised or unsupervised. Sometimes these systems not only change their weights but also gradually learn the rules behind the changes of the weights and parameters. Later, computer systems came to learn patterns that are implicit or explicit in data, time series, images, and videos. The word “pattern” can cover many different levels of formations and structures. Pattern recognition techniques are numerous and highly developed, and have become the core mechanism of many applications nowadays. Evolution is another core idea of many applications; it gives computer systems an inherent unpredictability in place of hard determinism, and mimics competition and natural selection among candidate instances. Recently, ontology has become a hot concept, which primarily means using “meta-data” as a guideline for understanding “ordinary data”. This concept of ontology is the core functioning mechanism of human learning, education, professional activities, and even human thinking itself. Although the area of ontology may not yet have produced as many results as people hoped, just like AI in the early days of its development, people are counting on the future development of these new and existing techniques to finally make use of them.
“Training computers to understand, and even to think” still has a long way to go. Until some stage far in the future, people will have to keep providing computer systems with the experience and understanding they have of the world.

This is exactly the key idea of Web 2.0. Just as a child needs the instruction and demonstration of parents and peers, computers need human beings to tag documents, to contribute to the compiling of Wikipedia, to judge the similarity of objects in image galleries, and to judge the similarity of different words. Web 2.0, as a social phenomenon, shares this same principle in helping computers to understand and think. The “thing” Web 2.0 is learning is no longer mere weights, rules, patterns, or imitations of evolution; the “thing” is the larger and finer constitution of human knowledge. The “thing” is ontology, the “thing” is language, the “thing” is opinion, and the “thing” is social structure. We human beings are actually cultivating our own children, voluntarily and happily.

WordNet is one of these children, or embryos of children. The Princeton group collected the "evocation" data, which records human judgments on how much one synset brings to mind another: “100,000 semantic similarity judgments from at least three human raters” for each judgment. As shown in the following picture, multiple people are hired to provide the weights of the links in their own individual knowledge networks, which together form the WordNet network of words.


As a result, the connections between words are obtained. An “average explicit network” is thus established, and it is there for us to use and to base our own applications upon.
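The idea of an “average explicit network” can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the rater names, the 0 to 1 rating scale, and the function `average_network` are all hypothetical and are not taken from the actual WordNet evocation data or its file format.

```python
from collections import defaultdict
from statistics import mean


def average_network(judgments):
    """Average per-rater link weights into one weight per directed word pair.

    judgments: iterable of (rater, source_word, target_word, weight) tuples,
    where weight says how strongly source_word brings target_word to mind.
    Returns a dict mapping (source_word, target_word) to the mean weight.
    """
    by_pair = defaultdict(list)
    for _rater, source, target, weight in judgments:
        by_pair[(source, target)].append(weight)
    return {pair: mean(weights) for pair, weights in by_pair.items()}


# Hypothetical ratings on an assumed 0..1 scale; not real WordNet data.
ratings = [
    ("rater1", "man", "person", 0.9),
    ("rater2", "man", "person", 0.7),
    ("rater3", "man", "person", 0.8),
    ("rater1", "man", "river", 0.1),
]
network = average_network(ratings)
```

Each rater contributes the weights of his/her own individual network, and the averaged dictionary plays the role of the shared explicit network that later applications can build on.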

A wide variety of research has been done using WordNet, generating a large bibliography:

http://lit.csci.unt.edu/~wordnet/

In keeping with the idea of Web 2.0, WordNet itself is being tuned, refined, retrained, re-formalized, and made use of. “Artificial Artificial Intelligence”? Everything created by humans is just this. WordNet is one of the existing core networks we have, and it can be used for larger understanding tasks.

[Personality articles as the specific implicit larger network: Our Domain]

As stated above, using an existing explicit core set of knowledge, we can learn a larger set of knowledge, turning the larger network from implicit to explicit. This process is also the training and upgrading of the original core network of knowledge.

Now we need a specific domain on which to further train our instance knowledge network, bearing in mind that we already have a well-trained core set of knowledge: WordNet, which provides a basic understanding and the supporting links of the common word-object entities in our everyday dialogues and conversations.

Our project’s topic is

“Word Bank of Personality: Extracting Meaningful Knowledge describing Personality using WordNet and domain text”

For this project, we will take articles on astrology as our main domain. Astrological articles usually describe people's personalities, and they are among the most concentrated sets of material describing personality in all human texts (one can hardly find such a dense collection of personality descriptions in any other major form of text). Leaving aside the question of whether astrology is meaningful, we should be able to extract a large data set of terms and expressions used to describe personality from this very specific source. We will generate a Word Bank of Personality containing interconnected words, expressions, and perhaps usages, focusing only on the description of personality. Astrology is one potential source of text with which to train the network and extract the interesting entities for our specific domain, the Word Bank of Personality.

This Word Bank of Personality, thus gained, would be meaningful in several ways.

(1)    First, with this Word Bank of Personality, we could judge whether a paragraph, a document, or a set of documents discusses topics related to Personality. As topic identification has long been studied in IR, a specific identifier for “Personality” is meaningful. This task needs only a relatively simple version of the Word Bank of Personality.

(2)    Second, we could build on this Word Bank of Personality to generate a Bank of Ontology for judging people. This potential bank of ontology, though perhaps rudimentary at first, should be useful for later tasks that distinguish people's personalities from their conversations and interaction records. This is one of the many functions our own inner individual networks of knowledge (our own neural networks) provide, and it is therefore implied by the quest to found a common knowledge basis for computers and human beings. This task would need a very complicated version of the Bank, accompanied by many other supporting banks providing links to many other parts of human knowledge and experience. Obviously, this task cannot be done now.

(3)    Third, if we cannot tell what a person is like from his or her conversations with other people alone (which would be too hard for computers), we could base our judgment of a person's personality on how other people describe him/her. Descriptions of other people are very common in our everyday conversations and texts. The Word Bank of Personality itself would be able to assess a person's “personality” from these descriptions. This task is somewhat like a classification task, but with multiple classes and labels for one subject. If we cannot determine what the “definition of Personality” is, or if we cannot settle on the categories to use, we could seek help from the different theories of personality and simply follow the categories they define; or we could simply follow the categories of astrology. The latter task, restated, is actually “determining a person's horoscope from a description of the person”. Since this astrological classification of a person is already meaningful to many believers or half-believers, we could regard the task as quite meaningful.

(4)    And finally, training and establishing the Word Bank of Personality is itself a challenging and interesting topic that will involve much work. Through literature reading, concept forming, idea generating, system designing, programming, and testing, our group could learn a lot.
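Use (1) above, the simplest of the four, could be sketched roughly as follows. This is a minimal sketch and not the project's method: the function names, the toy word bank, and the 0.15 threshold are assumptions made for illustration, and a real identifier would use the weighted links of the bank rather than plain word membership.

```python
import re


def personality_score(text, word_bank):
    """Return the fraction of tokens in `text` that appear in `word_bank`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in word_bank)
    return hits / len(tokens)


def is_about_personality(text, word_bank, threshold=0.15):
    """Flag a text as personality-related when its score reaches the threshold.

    The threshold value is an arbitrary assumption for this sketch.
    """
    return personality_score(text, word_bank) >= threshold


# A toy, hand-made word bank; the real bank would be extracted from domain text.
toy_bank = {"stubborn", "generous", "shy", "ambitious", "loyal"}
description = "She is generous and loyal, if a little stubborn."
timetable = "The train leaves the station at nine."
```

With this toy bank, `is_about_personality(description, toy_bank)` flags the first sentence, while the timetable sentence scores zero.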

[Proposed Structure of the Bank]

Nodes, labels, connections, layers of perception, and instance text fragments are the crucial parts of this database. Nodes (N) are the counterparts of the words (w), phrases (p), expressions (e), topics (t), and conceptual notations (cn) inside the framework. Labels are the “actual words” assigned to these Nodes. Connections (C) describe the different sorts of links among Nodes of different levels and sizes; we use Connections (C) to denote the different forms of network. Layers of perception describe what the whole structure looks like: it should be hierarchical, yet also organized in levels. The “instance text fragments” (instance) are the parts of the text that are classified and assigned to a node of some level, and they are the resources to draw on when performing application tasks.

The mechanisms and interconnections among the nodes should be iterative and self-adjusting, just as in other trained networks. We would construct the database in the following steps:

Terminology Set:

·         WordNet, DocumentSet

·         Nodes (N)


·         words (w), phrases (p), expressions (e), topics (t), and conceptual notations (cn)

·         Connections (C)

·         instance text fragments(instance)

·         Network of words C(w)

·         Network of topics C(t)

·         Network of words and topics C(w, t)

·         Network of topics and phrases C(p, t)

·         Network of topics and expressions C(e, t)

·         Network of topics and conceptual notations C(t, cn)

·         Network of conceptual notations C(cn)

·         Network of conceptual notations and expressions C(cn, e)

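One possible way to hold the terminology above in code is sketched below. The class names `Node` and `WordBank`, the undirected weighted links, and the `network` helper are assumptions made for illustration; the proposal itself does not fix a storage format.

```python
from dataclasses import dataclass, field

# Node kinds from the terminology set: word, phrase, expression, topic,
# conceptual notation.
NODE_KINDS = ("w", "p", "e", "t", "cn")


@dataclass
class Node:
    kind: str                                       # one of NODE_KINDS
    label: str                                      # the "actual word" attached to the node
    instances: list = field(default_factory=list)   # instance text fragments

    def __post_init__(self):
        if self.kind not in NODE_KINDS:
            raise ValueError(f"unknown node kind: {self.kind}")


@dataclass
class WordBank:
    nodes: dict = field(default_factory=dict)   # (kind, label) -> Node
    links: dict = field(default_factory=dict)   # frozenset of two keys -> weight

    def add_node(self, kind, label):
        key = (kind, label)
        if key not in self.nodes:
            self.nodes[key] = Node(kind, label)
        return self.nodes[key]

    def connect(self, a, b, weight):
        """Add or update an undirected weighted link between node keys a and b."""
        if a == b:
            raise ValueError("self-links are not supported in this sketch")
        self.add_node(*a)
        self.add_node(*b)
        self.links[frozenset((a, b))] = weight

    def network(self, kind_a, kind_b=None):
        """Return C(kind_a) or C(kind_a, kind_b): links between those kinds."""
        wanted = sorted((kind_a, kind_b or kind_a))
        return {
            pair: weight
            for pair, weight in self.links.items()
            if sorted(kind for kind, _label in pair) == wanted
        }


bank = WordBank()
bank.connect(("w", "stubborn"), ("w", "obstinate"), 0.9)
bank.connect(("w", "stubborn"), ("t", "temperament"), 0.6)
```

Here `bank.network("w")` corresponds to C(w) and `bank.network("w", "t")` to C(w, t); the other networks in the list follow the same pattern.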

Procedure:

1.      Retrieve basic N(w), N(t), C(w), C(t), C(w,t) from WordNet

2.      Re-train C(w) using domain text

3.      Update C(w,t), and C(t) using new C(w)

4.      Extract new topics N(t), using the new C(w), C(t), C(w,t) from step 3

5.      Update C(w), C(t), C(w,t) using new topics t

6.      Using the new C(w), C(t), C(w,t) and the domain text, extract N(p) through semantic approaches, forming the links C(w,p) and C(t,p);

7.      Using the new N(p), C(w,p), C(t,p), C(t,w) and the domain text, extract N(e) through semantic approaches, forming the links C(p,e) and C(p,t);

8.      Iterate through the interconnected networks C(w), C(p), C(e), C(w,p), C(w,e), C(t,p), C(t,e), C(t,w) to find the interesting structures (concentrated, dense, frequent) and generate conceptual notations N(cn). Each cn is represented as a structural aggregation of w, p, e, t nodes. (Concepts are always supported by links and assemblies of other concepts and words.)

9.      Use human knowledge to check the meaningfulness of the cn nodes by looking at their underlying structures, and adjust N(cn) accordingly to be more meaningful.

10.  Refine and update C(cn), C(cn, t), C(cn, p), C(cn, e);

11.  N(w), N(t), N(p), N(e), N(cn), and C(w), C(t), C(p), C(e), C(cn), and C(w, t), C(w, p), C(p, e), C(p, t), C(e, t), C(cn, t), C(cn, p), C(cn, e), C(cn, w) are the results we get as our Word Bank of Personality, together with the instance text fragments.
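Step 2 of the procedure, re-training C(w) on domain text, is not specified in detail. One simple possibility, offered here only as a sketch, is to count sentence-level co-occurrence of known words in the domain text and blend the result with the prior weights. The function names, the normalisation by the largest count, and the blending factor `alpha` are all assumptions, and the prior weights below are made-up numbers rather than values taken from WordNet.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_weights(sentences, vocabulary):
    """Weight each word pair by how many sentences contain both words.

    Pairs are alphabetically sorted tuples; weights are scaled so that the
    most frequent pair gets 1.0.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = set(re.findall(r"[a-z']+", sentence.lower()))
        present = sorted(tokens & vocabulary)
        counts.update(combinations(present, 2))
    top = max(counts.values(), default=0)
    return {pair: n / top for pair, n in counts.items()} if top else {}


def retrain(prior, domain, alpha=0.5):
    """Blend link weights: (1 - alpha) * prior + alpha * domain."""
    pairs = set(prior) | set(domain)
    return {
        pair: (1 - alpha) * prior.get(pair, 0.0) + alpha * domain.get(pair, 0.0)
        for pair in pairs
    }


# Made-up prior C(w) and a tiny stand-in for the astrology articles.
prior_cw = {("loyal", "stubborn"): 0.2, ("generous", "loyal"): 0.4}
domain_text = [
    "Taurus is loyal but stubborn.",
    "A loyal and stubborn friend.",
    "Leo is generous and loyal.",
]
vocabulary = {"loyal", "stubborn", "generous"}
domain_cw = cooccurrence_weights(domain_text, vocabulary)
new_cw = retrain(prior_cw, domain_cw)
```

In this toy run the link between "loyal" and "stubborn" is strengthened because the two words co-occur in two of the three sentences; steps 3 to 5 would then propagate such changes to C(w,t) and C(t).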


[Brief Summary of the project]

For this course project, we will build on two foundations:

1. a well trained semantic word network, with links between words and topics [Word Bank: WordNet from Princeton]

2. a specific domain of closely-related documents. [Word Domain: Personality]

We will use the semantic links and weights of the Word Bank to extract topics from the Word Domain. We hope to find interesting descriptions of people and to establish a complete Word Bank of Personality, containing different levels of description of a person's personality. This word bank could be useful for several applications.

[Postscript]

This project is only one small attempt among many to model human beings’ inner network of knowledge in outer, explicit forms that computers can use. The weaving of our own larger-scale networks of knowledge, which persist across the Internet and in every corner of modern life, can only be achieved through small steps in which computers take human knowledge from “unstructured” into “structured” forms. These small steps generate a partial copy of what we humans have inside our kingdom of knowledge, focusing on specific domains.

Many of the different disciplines in computer science are all together human being’s tools to model his own world of experiences, ideas, wisdoms, logics, and thinking, into the forms that computers could use. With these information and knowledge, computers could sometimes perform better than ourselves. And they will be better and better at the tasks that are traditionally performed by humans. One of these tasks is to understand structured or unstructured data, and extracting meaningful information entities. In fact, the computers are not just mere computers; they are our children to be cultivated, a new species that is created by us, a descendant of human being, and a whole new mechanism through which we human could strengthen and upgrade ourselves.

That’s the meaning.

Links

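武政宝宝 posted on 2009-09-26 11:09:28

A vertical stroke, a horizontal stroke, a glance, a touch of spice, a wall, a flower, a sound, a word, a line, a pellet, a rider, the origin of Ys
Keywords (Tag): link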

Japanese Robots Face Unemployment Amid the Economic Recession

武政宝宝 reposted on 2009-07-13 16:48:14

Japanese Robots Face Unemployment Amid the Economic Recession

http://news.qq.com/a/20090713/000556.htm



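China Daily’s Global Online (环球在线) reports: According to a July 12 report in The New York Times, the global recession means that even the world’s most efficient workers, robots, are having to endure the “pain” of unemployment.

Japan has the world’s largest “army” of robots, but with the Japanese economy in a deep recession and consumer demand falling, these ideal employees, who need no food, sleep, or insurance, have had to stop work; robot manufacturers have been dragged down too and face a crisis.

Yaskawa Electric (安川电器), Japan’s largest robot maker, saw its profit fall by two-thirds, to about US$72 million, in the fiscal year from March 2008 to March 2009.

Yaskawa Electric’s plight is only a snapshot of the whole industry. According to the Japan Robot Association, shipments of industrial robots fell 33% in the fourth quarter of 2008 and 59% in the first quarter of this year.

Researchers at Fuji Keizai (富士经济) expect the industrial robot market to shrink by 40% this year, because bosses trying to protect their human employees from layoffs will first cut plans to buy robot workers.

In 2005 Japan had more than 370,000 industrial robots, an average of 32 robot workers for every 1,000 jobs, and Japan’s industrial robots accounted for 40% of the global total. In 2007 the government even planned to raise the number of robot workers to 1 million by 2025; that ambitious plan now looks very unlikely to be realized.

Beyond robot workers, cute toy robots and household robots are also in trouble. Sales at one company that makes home-guarding robots have fallen by nearly one-third. Less than four years after the launch, the company has decided to stop producing this cute household robot.

Japan’s population is aging severely, with about one-quarter of people aged 65 or over, so the market outlook for household robots was originally very bright. Yet something strange has now happened: one household robot maker has not sold a single unit.

Price is also an important factor in the industry’s decline. Take the much-loved robot dog: at a sky-high price of more than US$2,000 each, it is out of reach for many wage earners.

The owner of a robot store in Japan hit the nail on the head: “Overall, robots are still too expensive and not very practical.” One hopes the robot industry can come through the economic crisis safely, learn its lessons in time, and “fly into the homes of ordinary people.” (环球在线: 于盟)

[Editor in charge: victorbai]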

A Logic Lesson from the Shi Yigong (施一公) Affair

武政宝宝 reposted on 2009-03-30 12:36:30

A Logic Lesson from the Shi Yigong (施一公) Affair

(Reposted from XYS)

Author: yehe

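A: Shi Yigong is dishonest.
B: Other overseas returnees (“sea turtles”) are even more dishonest. Why don’t you say so?

A: Shi Yigong is dishonest.
B: If you’re so capable, go publish more CNS papers than he has.

A: Shi Yigong is dishonest.
B: Shi Yigong is a hardworking, brave, kind, and upright professor.

A: Shi Yigong is dishonest.
B: He’s already a big improvement over the returnees of the year before last.

A: Shi Yigong is dishonest.
B: You’ve done bad things yourself. What right do you have to call Shi Yigong dishonest?

A: Shi Yigong is dishonest.
B: What are your motives and your purpose in saying this?

A: Shi Yigong is dishonest.
B: Get lost and go to Peking University.

A: Shi Yigong is dishonest.
B: How much did Peking University pay you?

A: Shi Yigong is dishonest.
B: However dishonest, he is still a famous professor, and for that alone you can’t call him dishonest.

A: Shi Yigong is dishonest.
B: Fang Zhouzi (方肘子) is honest, sure, but that doesn’t suit our country’s specific conditions.

A: Shi Yigong is dishonest.
B: Nonsense! Shi Yigong is five times as honest as Fang Zhouzi!

A: Shi Yigong is dishonest.
B: Everything takes time. Now is not yet the moment to talk about honesty...

A: Shi Yigong is dishonest.
B: What’s the use of complaining? You’d be better off spending the time on experiments and publishing papers.

A: Shi Yigong is dishonest.
B: Fang Zhouzi has a dark mind; he even grumbles about Shi Yigong coming back to China.

A: Shi Yigong is dishonest.
B: There is no absolute honesty in the world. Americans are honest, so go there.

A: Shi Yigong is dishonest.
B: If not for reform and opening up, you wouldn’t even have SCI papers now, and yet you have the energy to whine here.

A: Shi Yigong is dishonest.
B: Everyone watch out for A; this person’s IP address is overseas.

A: Shi Yigong is dishonest.
B: American internet agent, get lost. You’re not welcome here.

A: Shi Yigong is dishonest.
B: Damn it, I suspect you’re a Falun Gong follower (轮子).

A: Shi Yigong is dishonest.
B: You even call Shi Yigong dishonest. Are you even Chinese!!!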


//Not one person tried to argue why Shi Yigong is or is not honest. Everyone was busy expressing their own leanings.
Keywords (Tag): logic, honesty, 施一公

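Several Technical Directions for Graduate Students

武政宝宝 posted on 2009-01-29 13:49:38

Graduate students inevitably have to write code and build their research projects. The following directions and prospects pull at graduate-level technical people’s choices about the future; they also cloud their view and scatter their energy. Yet this choice greatly affects their future work or research. Here is a brief account of them, in the hope that it helps.

1. Building research projects

For research projects that lean toward theoretical computer science, the algorithmic core is usually built in C++ or Java, with a simple interface added. The heart of such a project is understanding, improving, and creating algorithms. All it requires is a deep understanding of the algorithm and some basic programming skills. For graduate students, this kind of project builds inner strength and lays foundations. In an era that is moving further and further from low-level code, graduate students building such projects should think carefully about data structures, algorithmic efficiency, and robustness, and write concise, efficient code. Those with energy to spare should study in depth how the main data structures in their framework are built underneath.

2. Building research systems

Research projects that emphasize platforms and cross-platform communication are really commercial system projects in prototype form. They require careful thought about database design, the file system, and memory use, about the efficiency of communication between platforms, and about simple user access and routine management. Taking part in such a project requires graduate students to read up patiently and carefully on the broad details of the various platforms and technologies. On that basis they write concise, clear wrapper classes and then connect them to the core algorithm. When a problem cannot be solved, they should consult specialists often and write many small pieces of code as tests. Being able to build such a research system independently marks a graduate student’s technical level as good enough to take charge of a project alone.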



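Graduate students who lean toward a technical future should, while doing the first kind of project well, also keep trying new languages and platforms on their own to enrich their experience. Without a research-system project to support them, they should keep using the Internet, a vast data source, to test their experimental systems and algorithms. The testing options the Internet offers include crawling algorithms, natural language understanding, social networks, data mining, database technology, network technology, and search algorithms. One medium-sized test project does a great deal to raise a graduate student’s technical level:

Fetch the entire content of a medium-sized dynamic website with user management to a local machine, and completely rebuild the website locally.

Such a project touches many aspects that real systems require, and it also tests graduate students’ grasp of algorithms and basic structures. It raises new problems, such as heterogeneous fetching, heterogeneous rebuilding, and real-time updating, as well as old ones, such as user management, content management, and load handling. Graduate students can pick SNS sites or blog sites they like for such tests, moving from simple to complex. At first they might fetch only the social network of users’ friends, then move forward step by step.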
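
As a starting point for the simplest version of this exercise (fetching only the friend network), here is a minimal sketch of a breadth-first crawl. It is an illustration under assumptions, not a finished crawler: `fetch_friends` is a hypothetical callable standing in for whatever code downloads and parses a profile page, and `FAKE_SITE` is a made-up friend table so the example runs without touching any real site.

```python
from collections import deque


def crawl_friend_network(seed_user, fetch_friends, max_users=1000):
    """Breadth-first crawl of a friend graph starting from seed_user.

    fetch_friends is a hypothetical callable: user id -> iterable of
    friend ids. In a real crawl it would download and parse a profile page.
    Returns a dict mapping each visited user to a sorted list of friends.
    """
    graph = {}
    queue = deque([seed_user])
    queued = {seed_user}  # users already scheduled, to avoid revisits
    while queue and len(graph) < max_users:
        user = queue.popleft()
        friends = sorted(set(fetch_friends(user)))
        graph[user] = friends
        for friend in friends:
            if friend not in queued:
                queued.add(friend)
                queue.append(friend)
    return graph


# Offline stand-in for a site: a fixed friend table instead of HTTP requests.
FAKE_SITE = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice"],
    "dave": ["bob"],
}
network = crawl_friend_network("alice", lambda user: FAKE_SITE.get(user, []))
```

The `max_users` cap keeps a first experiment small. A real version would also need polite request pacing and storage for the fetched pages, which is where the load-handling and content-management problems mentioned above come in.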

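Graduate students with heavy research tasks of the first kind must read many papers and build all sorts of experimental algorithms, which demands a great deal of intellectual effort; even so, please do not give up hands-on practice with real systems. Seize every chance to learn, approach large frameworks and systems with humility, and keep trying small experimental systems. Only then can you prevail everywhere and stand on the commanding heights of technology.

Graduate students who usually lean toward system projects should, while consolidating the skills they already have, keep thinking in two directions: algorithm construction and system coordination. They need to write precise, concise class-hierarchy wrappers to serve their projects and to consider efficiency, rather than being satisfied with merely implementing features. Those who mostly write scripts should also rigorously follow the principle of separating presentation from content and build code in layers; at the same time they should give more thought to system load and session scheduling.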