Chinese Is a Unique and Magical Language

Chinese is a unique and magical language. When we apply natural language processing to Chinese data, we encounter difficulties that many other languages do not have, such as word segmentation. There are no spaces between Chinese words. So how can a computer know where the word boundaries fall in a sentence such as the Chinese for "Married and not married youth must practice family planning"?

This is the so-called problem of segmentation ambiguity. Many language models now have fairly elegant methods for solving it. However, in the field of Chinese word segmentation there is still one kind of word that causes confusion: unknown words, such as “给力”.
Over the last decade, work on Chinese word segmentation has concentrated on overcoming this difficulty.
So let’s look at some interesting ways to solve this problem. To extract words from a text, our first question is: what kind of text fragment counts as a word? The first criterion that comes to mind is that the fragment appears often enough. However, high frequency alone is not enough: a frequent fragment may not be a word at all, but a phrase made up of several words. Across the status updates of all users at renren.com, "the movie" appears 389 times while "cinema" appears only 175 times, yet we are more inclined to treat "cinema" as a word, because its components "movie" and "courtyard" are bound together more tightly.
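
The frequency criterion above can be sketched in a few lines of Python. This is only a minimal illustration, not the method used on the renren.com data: the function name `fragment_counts`, the `max_len` parameter, and the toy string are all assumptions made for the example.

```python
from collections import Counter


def fragment_counts(text: str, max_len: int = 2) -> Counter:
    """Count every contiguous fragment of 1..max_len characters in text."""
    counts = Counter()
    for n in range(1, max_len + 1):
        # Slide a window of width n over the text, one character at a time.
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


# Toy, made-up text without spaces (hypothetical data, for illustration only).
toy_text = "abcabcab"
counts = fragment_counts(toy_text, max_len=2)

# List the two-character fragments from most to least frequent.
for fragment, freq in counts.most_common():
    if len(fragment) == 2:
        print(fragment, freq)
```

A ranking like this only says which fragments are common. As the paragraph notes, it cannot tell a real word from a frequent phrase, so a second criterion is needed.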

To show that the internal cohesion of "cinema" is indeed high, we can calculate how likely its parts would be to meet by chance: if "movie" and "courtyard" appeared independently in the text, the probability of the two happening to be spelled together would be much smaller. From 24 million characters of data, we find that the actual probability of "cinema" is more than 300 times the predicted value, which equals the probability of "movie" multiplied by the probability of "courtyard". By the same method, the probability of "the movie" is 8.5 times its predicted value. These results suggest that "cinema" is a meaningful combination of its two components, whereas "the movie" looks more like "the" and "movie" occasionally happening to appear side by side.
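
The comparison between the observed probability and the independence prediction can be sketched as follows. This is a minimal sketch under stated assumptions: the helper names `count_occurrences` and `cohesion_ratio` are hypothetical, each probability is estimated simply as a count divided by the text length, and the toy string stands in for real data (the essay's 24-million-character corpus is not reproduced here).

```python
def count_occurrences(text: str, fragment: str) -> int:
    """Count occurrences of fragment in text, allowing overlaps."""
    return sum(
        1
        for i in range(len(text) - len(fragment) + 1)
        if text.startswith(fragment, i)
    )


def cohesion_ratio(text: str, left: str, right: str) -> float:
    """Observed P(left+right) divided by the prediction P(left) * P(right)."""
    total = len(text)
    p_left = count_occurrences(text, left) / total
    p_right = count_occurrences(text, right) / total
    if p_left == 0 or p_right == 0:
        raise ValueError("both components must occur in the text")
    p_joint = count_occurrences(text, left + right) / total
    return p_joint / (p_left * p_right)


# Toy, made-up text (hypothetical data): "t" is a very common character,
# while "a" and "b" are rarer and always occur together.
toy_text = "tabtcdtabtef"

print(cohesion_ratio(toy_text, "a", "b"))  # tightly bound pair
print(cohesion_ratio(toy_text, "t", "a"))  # common character plus neighbour
```

In this toy text "ab" and "ta" occur equally often, yet "ab" gets the higher ratio because "t" is common everywhere. This mirrors the contrast the essay draws between "cinema" and "the movie".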
