Table of contents
- Fundamentals of natural language processing
- Introduction to natural language processing 1: How it works
- Introduction to natural language processing 2: History
- Introduction to natural language processing 3: What you can do
Natural language processing is one of the technologies used in AI. It is attracting attention not only in business but also as a technology that enriches everyday life. Natural language processing is already used in search engines, AI assistants, and similar services, and further development is expected. This article gives an easy-to-understand introduction to natural language processing.
Fundamentals of natural language processing
What kind of technology does natural language processing refer to? Let’s take a look at the relationship between natural language processing and AI.
What is natural language processing?
According to the document “Introduction to Natural Language Processing: 1. Overview of the current situation and history,” natural language refers to languages such as Japanese, English, and Russian, as opposed to artificial languages such as programming languages. Natural language processing refers to using computers to process the languages that people speak and write every day. Many systems that are closely tied to our lives, such as AI assistants and search engines, rely on natural language processing.
Natural language and artificial language
Natural language is language that has formed through humans communicating with one another. Japanese and English are examples. Natural language contains a great deal of ambiguity, so computers need several processing steps to interpret it. Artificial language, by contrast, refers to designed languages such as programming languages and mathematical notation. These are easy for computers to process, but humans need specialized knowledge to read them.
Connection with AI and machine learning
The terms AI and machine learning come up often these days. How does natural language processing relate to them? AI is short for “artificial intelligence.” In the broadest sense, both natural language processing and machine learning are parts of AI. Machine learning is a technology that analyzes data and derives the rules and characteristics hidden within it. In natural language processing, machine learning is used to analyze the natural language that humans produce.
Introduction to natural language processing 1: How it works
In order to understand natural language processing, you need to know how it works. Natural language processing is performed in the following stages.
- Morphological analysis
- Syntactic analysis
- Semantic analysis
- Context analysis
Let’s look at each in turn.
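Morphological analysis splits a sentence into morphemes and identifies the part of speech of each. A morpheme is the smallest unit of a written natural language that carries meaning and cannot be divided any further. For example, the Japanese phrase “青いバイクに乗った人” (“a person riding a blue bike”) breaks down into six morphemes: “青い” (blue), “バイク” (bike), “に” (a case particle), “乗っ” (a form of the verb “ride”), “た” (an auxiliary verb), and “人” (person). A morphological analyzer produces output like the following.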
青い (blue): adjective, independent, *, *, adjective (auo-dan), basic form, 青い, アオイ, アオイ
バイク (motorcycle): noun, general, *, *, *, *, バイク, バイク, バイク
に: particle, case particle, general, *, *, *, に, ニ, ニ
乗っ (ride): verb, independent, *, *, five-dan (ra row), continuative ta-connection, 乗る, ノッ, ノッ
た: auxiliary verb, *, *, *, special (ta), basic form, た, タ, タ
人 (person): noun, general, *, *, *, *, 人, ヒト, ヒト
*Broken down any further, each piece becomes a “phoneme” with no meaning of its own and is no longer a morpheme.
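As a rough illustration of the idea, the following Python sketch segments the example phrase with a greedy longest-match lookup. It assumes the example corresponds to the Japanese phrase “青いバイクに乗った人”. The dictionary `TOY_DICTIONARY` and the function `segment` are made up for this illustration. Real morphological analyzers rely on large dictionaries and choose among competing segmentations, so this is only a sketch of the concept, not how a production tool works.

```python
# Toy dictionary: surface form -> part of speech (hypothetical, for illustration only).
TOY_DICTIONARY = {
    "青い": "adjective",
    "バイク": "noun",
    "に": "particle",
    "乗っ": "verb",
    "た": "auxiliary verb",
    "人": "noun",
}


def segment(text, dictionary=TOY_DICTIONARY):
    """Split text into (morpheme, part of speech) pairs by greedy longest match."""
    longest = max(len(entry) for entry in dictionary)
    result = []
    position = 0
    while position < len(text):
        # Try the longest candidate first, then shorter ones.
        for length in range(min(longest, len(text) - position), 0, -1):
            candidate = text[position:position + length]
            if candidate in dictionary:
                result.append((candidate, dictionary[candidate]))
                position += length
                break
        else:
            # Unknown character: emit it on its own so the loop always advances.
            result.append((text[position], "unknown"))
            position += 1
    return result


for morpheme, part_of_speech in segment("青いバイクに乗った人"):
    print(morpheme, part_of_speech)
```

Greedy longest match is the simplest possible strategy and breaks down as soon as two dictionary entries compete for the same characters, which is one reason practical analyzers score whole segmentations instead.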
Syntactic analysis is the process of analyzing the relationships between words obtained through morphological analysis. By performing syntactic analysis, you can visualize the relationships between words.
For example, the phrase “a cat that eats big-eyed fish” can be read in more than one way: are the eyes of the fish big, or are the eyes of the cat that eats the fish big? In the original Japanese word order, the modifier “big-eyed” comes first and can attach to either “fish” or “cat.” Because the phrase has multiple interpretations, syntactic analysis can produce multiple parse results. The important point is that a syntactic structure counts as correct as long as it is grammatical, even if common sense says it is impossible. Sentences whose meaning is odd are dealt with at the semantic analysis stage described next.
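The way one phrase yields several parses can be sketched in a few lines of Python. The sketch assumes the example phrase is split into four chunks in Japanese word order (“big-eyed”, “fish”, “eats”, “cat”), that every chunk attaches to a later chunk, and that attachments do not cross. The chunk types and the table `ALLOWED_HEADS` are a hypothetical toy grammar invented for this illustration, not the grammar of any real parser.

```python
from itertools import product

# Chunks in Japanese word order, each with a coarse type (hypothetical labels).
CHUNKS = [
    ("big-eyed", "adnominal"),     # a modifier that attaches to a noun
    ("fish (object)", "object"),   # a noun marked as the object of a verb
    ("eats", "verb"),              # a relative-clause verb that attaches to a noun
    ("cat", "noun"),               # the head of the whole phrase
]

# Toy grammar: which chunk types each chunk type may attach to.
ALLOWED_HEADS = {
    "adnominal": {"object", "noun"},
    "object": {"verb"},
    "verb": {"noun"},
}


def crosses(arc_a, arc_b):
    """Return True if two (dependent, head) arcs cross each other."""
    (i, j), (k, l) = sorted([arc_a, arc_b])
    return i < k < j < l


def enumerate_parses(chunks):
    """List every set of attachments that the toy grammar allows."""
    n = len(chunks)
    options = []
    for i in range(n - 1):  # the last chunk is the root and attaches to nothing
        kind = chunks[i][1]
        options.append(
            [j for j in range(i + 1, n) if chunks[j][1] in ALLOWED_HEADS[kind]]
        )
    parses = []
    for heads in product(*options):
        arcs = list(enumerate(heads))
        if any(crosses(a, b) for a in arcs for b in arcs if a < b):
            continue
        parses.append(arcs)
    return parses


for arcs in enumerate_parses(CHUNKS):
    print([f"{CHUNKS[d][0]} -> {CHUNKS[h][0]}" for d, h in arcs])
```

Under these assumptions the only chunk with a choice is “big-eyed”, which can attach to either “fish” or “cat”, so the sketch lists one parse for each reading.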
Semantic analysis is the process of analyzing the meaning of a sentence whose syntactic structure has been determined. The term is broad and does not refer to one specific procedure, but case analysis and word-sense disambiguation are often cited as examples. The syntactic structures mentioned earlier that are grammatical but impossible by common sense are identified and rejected at this stage.
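One simple way to picture how a grammatical but implausible reading might be rejected is a case-frame check, sketched below in Python. The word classes in `SEMANTIC_CLASS`, the frame in `CASE_FRAMES`, and the function `is_plausible` are all hypothetical and hand-written for this illustration. Real systems draw on much larger lexical resources or learned models.

```python
# Hypothetical semantic classes for a handful of words.
SEMANTIC_CLASS = {
    "cat": "animal",
    "fish": "animal",
    "stone": "object",
    "idea": "abstract",
}

# Hypothetical case frame: which classes each role of a verb accepts.
CASE_FRAMES = {
    "eat": {"agent": {"animal"}, "object": {"animal", "food"}},
}


def is_plausible(verb, agent, obj):
    """Return True if the agent and object both fit the verb's case frame."""
    frame = CASE_FRAMES[verb]
    return (
        SEMANTIC_CLASS.get(agent) in frame["agent"]
        and SEMANTIC_CLASS.get(obj) in frame["object"]
    )


print(is_plausible("eat", "cat", "fish"))    # grammatical and plausible
print(is_plausible("eat", "stone", "idea"))  # grammatical but implausible
```

Both calls describe sentences that a parser would accept. Only the semantic check separates them.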
Context analysis examines text made up of multiple sentences, looking beyond individual words to how the sentences relate to one another. Typical tasks include working out what a pronoun refers to when the referent appears in an earlier sentence, and recovering a subject that has been omitted. Context analysis is considered highly difficult. It builds on the earlier stages: morphological and semantic analysis are applied to each sentence, and the results are then connected across sentences.
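As a very rough sketch of the pronoun task, the Python below picks the most recent earlier mention whose gender and number agree with the pronoun. The mention list `MENTIONS`, the feature table `PRONOUNS`, and the function `resolve` are hypothetical and hand-annotated for this illustration. Agreement plus recency is a naive heuristic and is far from what real systems need to handle.

```python
# Hypothetical, hand-annotated mentions from earlier sentences, in order of appearance.
MENTIONS = [
    {"text": "Hanako", "gender": "female", "number": "singular"},
    {"text": "Taro", "gender": "male", "number": "singular"},
    {"text": "the books", "gender": "neuter", "number": "plural"},
]

# Features that each pronoun requires; None means any gender is accepted.
PRONOUNS = {
    "he": {"gender": "male", "number": "singular"},
    "she": {"gender": "female", "number": "singular"},
    "they": {"gender": None, "number": "plural"},
}


def resolve(pronoun, mentions):
    """Return the most recent earlier mention that agrees with the pronoun, or None."""
    features = PRONOUNS[pronoun.lower()]
    for mention in reversed(mentions):
        same_number = mention["number"] == features["number"]
        same_gender = (
            features["gender"] is None or mention["gender"] == features["gender"]
        )
        if same_number and same_gender:
            return mention["text"]
    return None


for pronoun in ("She", "he", "they"):
    print(pronoun, "->", resolve(pronoun, MENTIONS))
```

The heuristic fails as soon as two candidates share the same features, which hints at why this stage is regarded as difficult.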
Introduction to natural language processing 2: History
Knowing the history of natural language processing also helps in understanding the technology. Its history can be broadly divided into three periods: the dawn period (from 1940), the patience period (from 1960), and the development period (from 1990). Let’s look at each.
Dawn period (1940-)
The earliest stage is the dawn period, from around 1940 to 1960. One of the first electronic computers was unveiled in 1946. Early computers were used for military purposes such as trajectory calculation and code-breaking rather than for the everyday digital technology we see today. Researchers also expected that computers could be used for translation, and a full-scale translation project began in 1952.
Patience period (1960-)
This was followed by the patience period, from 1960 to around 1990, during which researchers struggled to make natural language processing practical. Huge amounts of money were spent on research, but a series of challenges came to light. In 1966, a report on the state and difficulties of machine translation was published (commonly known as the ALPAC report), and research funding largely dried up in its wake.
Development period (1990-)
Finally, the period from 1990 to the present is called the development period. During this time the Internet spread worldwide and digital technology became part of everyday life. From the late 1990s, computing environments suited to natural language processing became available, and the United States resumed funding research, which led to significant progress from 2000 onward. Improvements in computer performance and the use of big data have drawn further attention to the field. In particular, machine translation based on neural networks has reached a practical level of accuracy.
Introduction to natural language processing 3: What you can do
Finally, let’s take a look at what can be achieved with natural language processing. Natural language processing is utilized in the following technologies:
- Search engine
- Machine translation
- AI chatbot
Search engines such as Google and Yahoo! Search make use of natural language processing. When you enter a keyword, a search engine predicts and displays the words that are likely to follow it. Google has also applied “BERT (Bidirectional Encoder Representations from Transformers),” a natural language processing model, to interpret search queries more accurately.
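The idea of predicting the word that follows a keyword can be sketched by counting word pairs in past queries. The Python below is a minimal illustration under that assumption. The query log `QUERY_LOG` and the functions `build_next_word_counts` and `suggest` are hypothetical, and nothing here reflects how Google or any other search engine actually generates suggestions.

```python
from collections import Counter, defaultdict

# Hypothetical query log; a real search engine would draw on far more data.
QUERY_LOG = [
    "natural language processing",
    "natural language processing python",
    "natural language generation",
    "natural gas prices",
    "language learning apps",
]


def build_next_word_counts(queries):
    """Count which word follows each word across all logged queries."""
    following = defaultdict(Counter)
    for query in queries:
        words = query.split()
        for current, next_word in zip(words, words[1:]):
            following[current][next_word] += 1
    return following


def suggest(keyword, following, limit=3):
    """Return up to `limit` words that most often followed `keyword`."""
    counts = following.get(keyword, Counter())
    return [word for word, _ in counts.most_common(limit)]


counts = build_next_word_counts(QUERY_LOG)
print(suggest("natural", counts))
print(suggest("language", counts))
```

In this toy log, “language” follows “natural” three times and “gas” once, so “language” is suggested first.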
Natural language processing is also used in machine translation services such as Google Translate. As the technology has advanced, translations have come to read much more like natural human writing. Earlier machine translation, for example from Japanese to English, often produced output that paid little attention to grammar or meaning. More recently, a highly accurate translation service called “DeepL” has appeared. DeepL can translate document files as they are, so you can translate efficiently without having to rework the content for translation.
AI chatbots are services that combine natural language processing with voice recognition, such as “Siri” on the iPhone and “Alexa” from Amazon. They analyze the words a person speaks and respond with the answer whose meaning is the closest match. For example, an iPhone user can say “Hey Siri” and ask “What are my plans for tomorrow?” and the device will respond with tomorrow’s plans based on the user’s schedule.
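Choosing the answer with the closest meaning can be pictured, very crudely, as picking the stored question that shares the most words with what the user said. The Python sketch below does this with Jaccard similarity. The table `KNOWN_QUESTIONS`, its canned answers, and the functions `jaccard` and `respond` are hypothetical. Word overlap is only a stand-in for meaning, and this is not how Siri or Alexa work.

```python
# Hypothetical question-answer pairs with made-up answers, for illustration only.
KNOWN_QUESTIONS = {
    "what are my plans for tomorrow": "You have a meeting at 10 a.m. tomorrow.",
    "what is the weather today": "It is sunny today.",
    "set an alarm for seven": "Alarm set for 7:00.",
}


def jaccard(text_a, text_b):
    """Return the share of distinct words that two texts have in common."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def respond(utterance):
    """Answer with the reply attached to the most similar known question."""
    best_question = max(KNOWN_QUESTIONS, key=lambda q: jaccard(utterance, q))
    return KNOWN_QUESTIONS[best_question]


print(respond("What plans do I have for tomorrow"))
```

The sketch always returns some answer, even for an unrelated utterance. A more careful version would require a minimum similarity before replying.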
Natural language processing is an important technology that underpins AI assistants and machine translation. Implementing AI in practice requires knowledge of both mathematics and programming languages.